Thinking Out Loud

January 20, 2017

GoldenGate Network Troubleshooting

Filed under: GoldenGate — mdinh @ 2:50 pm

We encounter the following error:

GGS ERROR 150  Oracle GoldenGate Capture for Oracle, pump.prm:  TCP/IP error 9 (Bad file descriptor).

Note: the server collectors at the target are not started, and their ports are not opened, until the pump at the source is started.
Use nc to test whether a port is open.

$ nc -v -z -w 3 target.local 7960; echo $?
nc: connect to target.local port 7960 (tcp) failed: Connection refused
1

$ nc -v -z -w 3 target.local 7970; echo $?
Connection to target.local 7970 port [tcp/*] succeeded!
0
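
Testing one port at a time gets tedious across a whole dynamic port range. A minimal sketch that loops nc over the range; the function name check_ports is made up for illustration, and it assumes an nc build that accepts -z and -w, as the one used above does.

```shell
# Probe every port in a range and report which ones accept a TCP connection.
# Usage: check_ports HOST FIRST_PORT LAST_PORT
check_ports() {
    host=$1
    port=$2
    last=$3
    while [ "$port" -le "$last" ]; do
        # -z: connect without sending data; -w 3: give up after 3 seconds
        if nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then
            echo "$port open"
        else
            echo "$port closed"
        fi
        port=$((port + 1))
    done
}

# Example, using the host and range from this post:
# check_ports target.local 7960 7980
```

A closed port here is not necessarily a fault: as noted above, a collector port only opens once the pump that uses it has been started.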

Not what I prefer, though that does not mean it is wrong: the entries start at 16.

DYNAMICPORTLIST 7960-7980 (20)

In hindsight, I should have started a few more pump extracts
to determine if Entry values cycle back to 0 and
if port assignment will start at 7960 or will fail.

This is what happens when dozens of pump extracts are stopped and started in groups while the manager is RUNNING.

Sending GETPORTINFO request to MANAGER ...

Dynamic Port List

Starting Index 20
Reassign Delay 30 seconds

Entry Port  Error  Process     Assigned             Program
----- ----- ----- ----------   -------------------  -------
  16   7976     0       6260   2017/01/19 12:06:12  Server
  17   7977     0       6261   2017/01/19 12:06:12  Server
  18   7978     0       6262   2017/01/19 12:06:12  Server
  19   7979     0       6263   2017/01/19 12:06:12  Server

Houston, we have a problem.
Look at the Error column: anything other than 0 is not good.
Look at the Assigned dates, too.
These look like orphaned processes, and the manager still thinks their ports are assigned.

GGSCI> SEND MANAGER GETPORTINFO

Sending GETPORTINFO request to MANAGER ...

Dynamic Port List

Starting Index 18
Reassign Delay 30 seconds

Entry Port  Error  Process     Assigned             Program
----- ----- ----- ----------   -------------------  -------
   0   7841    98      31663   2016/12/30 08:03:18  Server
   1   7842    98      31664   2016/12/30 08:03:18  Server
   2   7843    98                
   3   7844    98                
   4   7845    98                
   5   7846    98       1243   2016/12/30 08:14:01  Server
   6   7847    98       4543   2016/12/30 08:34:28  Server
   7   7848    98       4815   2016/12/30 08:35:55  Server
   8   7849    98       5094   2016/12/30 08:37:07  Server
   9   7850    98       5151   2016/12/30 08:37:20  Server
  10   7851    98       5152   2016/12/30 08:37:25  Server
  11   7852    98      26856   2017/01/17 21:57:38  Server
  12   7853    98      32133   2017/01/17 22:30:35  Server
  13   7854    98      16390   2017/01/06 03:56:56  Server
  14   7855    98      32220   2017/01/17 22:30:41  Server
 
  15   7856     0       4774   2017/01/17 22:57:40  Server
  16   7857     0       4777   2017/01/17 22:57:52  Server
  17   7858     0       4779   2017/01/17 22:57:59  Server
  
  18   7859    98      26854   2017/01/17 21:57:38  Server
  19   7860    98      26855   2017/01/17 21:57:38  Server
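
The check applied to the listing above (any non-zero Error is suspect) can be sketched with awk. The function name flag_port_errors is hypothetical, and the column layout is assumed to match the output shown in this post. If the Error column is an operating-system errno, which the post does not confirm, 98 on Linux is EADDRINUSE (address already in use).

```shell
# Print the dynamic-port entries whose Error column is non-zero.
# Reads the text of SEND MANAGER GETPORTINFO on stdin.
flag_port_errors() {
    awk '
        # Data rows start with three numeric fields: Entry, Port, Error.
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && $3 != 0 {
            # Rows with no process assigned have nothing after Error.
            pid = ($4 == "") ? "-" : $4
            printf "entry=%s port=%s error=%s pid=%s\n", $1, $2, $3, pid
        }
    '
}

# Example (assumes ggsci accepts commands on stdin in your install):
# echo "SEND MANAGER GETPORTINFO" | ./ggsci | flag_port_errors
```

Header, separator, and blank lines are skipped because their first field is not a number.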

This is what I like.
Notice that the Assigned timestamps are all the same.
This is because the pumps at the source were started using a wildcard, i.e. start p*
Oracle Support does not recommend this, and YMMV.

GGSCI> !
SEND MANAGER GETPORTINFO

Sending GETPORTINFO request to MANAGER ...

Dynamic Port List

Starting Index 14
Reassign Delay 30 seconds

Entry Port  Error  Process     Assigned             Program
----- ----- ----- ----------   -------------------  -------
   0   7960     0       7744   2017/01/19 12:15:13  Server
   1   7961     0       7745   2017/01/19 12:15:13  Server
   2   7962     0       7746   2017/01/19 12:15:13  Server
   3   7963     0       7747   2017/01/19 12:15:13  Server
   4   7964     0       7748   2017/01/19 12:15:13  Server
   5   7965     0       7749   2017/01/19 12:15:13  Server
   6   7966     0       7750   2017/01/19 12:15:13  Server
   7   7967     0       7751   2017/01/19 12:15:13  Server
   8   7968     0       7752   2017/01/19 12:15:13  Server
   9   7969     0       7753   2017/01/19 12:15:13  Server
  10   7970     0       7754   2017/01/19 12:15:13  Server
  11   7971     0       7755   2017/01/19 12:15:13  Server
  12   7972     0       7756   2017/01/19 12:15:13  Server
  13   7973     0       7757   2017/01/19 12:15:13  Server

GGSCI> sh ps -ef|grep ./server

512       7744  7742  0 12:15 ?        00:00:00 ./server -p 7960 -k -l /ggs1040/ggserr.log
512       7745  7742  0 12:15 ?        00:00:00 ./server -p 7961 -k -l /ggs1040/ggserr.log
512       7746  7742  0 12:15 ?        00:00:00 ./server -p 7962 -k -l /ggs1040/ggserr.log
512       7747  7742  0 12:15 ?        00:00:00 ./server -p 7963 -k -l /ggs1040/ggserr.log
512       7748  7742  0 12:15 ?        00:00:00 ./server -p 7964 -k -l /ggs1040/ggserr.log
512       7749  7742  0 12:15 ?        00:00:00 ./server -p 7965 -k -l /ggs1040/ggserr.log
512       7750  7742  0 12:15 ?        00:00:00 ./server -p 7966 -k -l /ggs1040/ggserr.log
512       7751  7742  0 12:15 ?        00:00:00 ./server -p 7967 -k -l /ggs1040/ggserr.log
512       7752  7742  0 12:15 ?        00:00:00 ./server -p 7968 -k -l /ggs1040/ggserr.log
512       7753  7742  0 12:15 ?        00:00:00 ./server -p 7969 -k -l /ggs1040/ggserr.log
512       7754  7742  0 12:15 ?        00:00:00 ./server -p 7970 -k -l /ggs1040/ggserr.log
512       7755  7742  0 12:15 ?        00:00:00 ./server -p 7971 -k -l /ggs1040/ggserr.log
512       7756  7742  0 12:15 ?        00:00:00 ./server -p 7972 -k -l /ggs1040/ggserr.log
512       7757  7742  0 12:15 ?        00:00:00 ./server -p 7973 -k -l /ggs1040/ggserr.log
512       7759  7741  0 12:16 pts/1    00:00:00 sh -c ps -ef|grep ./server
512       7761  7759  0 12:16 pts/1    00:00:00 grep ./server

GGSCI>
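
In the healthy case above, every Process and Port pair in the manager's list has a matching ./server process. A rough sketch for pulling those pairs out of ps output so the two can be compared; the function name server_pid_ports is made up, and it assumes the collector command line looks like "./server -p PORT ..." as shown here.

```shell
# Print "PID PORT" for every server collector found in `ps -ef` output.
# Reads the ps output on stdin.
server_pid_ports() {
    awk '
        {
            # Find "./server" immediately followed by "-p"; the port is next.
            for (i = 1; i < NF; i++)
                if ($i == "./server" && $(i + 1) == "-p") {
                    print $2, $(i + 2)
                    break
                }
        }
    '
}

# Example:
# ps -ef | server_pid_ports
```

The grep and sh -c lines from the ps pipeline drop out on their own, because in them "./server" is never followed by "-p".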

Good follow-up reading:
OGG GGS Error 150: No Dynamic Ports Available Orphan Ports Server Collector (Doc ID 965356.1)

A SERVER process is an "orphan" if netstat or lsof shows only a "listening" port, with no "ESTABLISHED" connections.
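
That definition can be turned into a quick filter over netstat output. This is only a sketch: the function name find_orphan_ports is hypothetical, and it assumes Linux-style `netstat -an` columns (local address in field 4, state in field 6).

```shell
# List local ports that are LISTENing but have no ESTABLISHED connection.
# Reads `netstat -an` output on stdin.
# Usage: netstat -an | find_orphan_ports FIRST_PORT LAST_PORT
find_orphan_ports() {
    awk -v first="$1" -v last="$2" '
        BEGIN { first += 0; last += 0 }   # force numeric comparison
        $1 ~ /^tcp/ {
            # The local port follows the last colon of the local address.
            n = split($4, part, ":")
            port = part[n] + 0
            if (port < first || port > last) next
            if ($6 == "LISTEN")      listening[port] = 1
            if ($6 == "ESTABLISHED") established[port] = 1
        }
        END {
            for (p = first; p <= last; p++)
                if ((p in listening) && !(p in established))
                    print p
        }
    '
}

# Example, with a dynamic port range like the one in this post:
# netstat -an | find_orphan_ports 7841 7872
```

A port this prints is only a candidate: a collector that has just been started may be listening while it waits for its pump to connect, so compare against the Assigned time before killing anything.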

Updated Jan 25 2017
In TCP networking, what is a FIN_WAIT state?
https://kb.iu.edu/d/ajmi

Sending GETPORTINFO request to MANAGER ...

Dynamic Port List

Starting Index 32
Reassign Delay 30 seconds

Entry Port  Error  Process     Assigned             Program
----- ----- ----- ----------   -------------------  -------
   0   7841    98                
   1   7842    98                
   2   7843    98                
   3   7844    98                
   4   7845    98                
   5   7846     0       1397   2017/01/24 21:03:55  Server
   6   7847    98                
   7   7848     0       1398   2017/01/24 21:03:55  Server
   8   7849    98                
   9   7850     0       1399   2017/01/24 21:03:55  Server
  10   7851     0       1400   2017/01/24 21:03:55  Server
  11   7852     0       1547   2017/01/24 21:04:24  Server
  12   7853    98                
  13   7854    98                
  14   7855    98                
  15   7856    98                
  16   7857     0       1548   2017/01/24 21:04:24  Server
  17   7858     0       1549   2017/01/24 21:04:24  Server
  18   7859    98                
  19   7860    98                
  20   7861    98                
  21   7862    98                
  22   7863    98                
  23   7864    98                
  24   7865     0       1550   2017/01/24 21:04:24  Server
  25   7866     0       1587   2017/01/24 21:04:38  Server
  26   7867     0       1588   2017/01/24 21:04:38  Server
  27   7868     0       1589   2017/01/24 21:04:38  Server
  28   7869     0       1590   2017/01/24 21:04:38  Server
  29   7870     0       1596   2017/01/24 21:05:00  Server
  31   7872     0       6491   2017/01/24 21:31:14  Server
  

$ netstat -an|grep 7864
tcp        0      0 0.0.0.0:7864                0.0.0.0:*                   LISTEN      
tcp        1      0 10.80.25.64:7864            10.80.25.64:42242           CLOSE_WAIT  
tcp        0      0 10.80.25.64:42242           10.80.25.64:7864            FIN_WAIT2 

$ netstat -an|grep 7870
tcp        0      0 0.0.0.0:7870                0.0.0.0:*                   LISTEN      
tcp        0      0 10.80.25.64:7870            10.80.26.36:13810           ESTABLISHED

$ netstat -an|grep 7841
tcp        0      0 0.0.0.0:7841                0.0.0.0:*                   LISTEN      
tcp        0      0 10.80.25.64:39288           10.80.25.64:7841            TIME_WAIT