Author |
Message
|
haba1311 |
Posted: Sun Jan 01, 2006 7:10 am Post subject: AMQ9213: A communications error for TCP/IP occurred |
|
|
Newbie
Joined: 04 Feb 2005 Posts: 8
|
Please can somebody advise on the above error message. The full details are:
WMQ5.3 + FP 8
AIX 5.2.0.0
details in AMQERR01.LOG
01/01/06 03:23:55
AMQ9213: A communications error for TCP/IP occurred.
EXPLANATION:
An unexpected error occurred in communications.
ACTION:
The return code from the TCP/IP(select) [TIMEOUT] 360 seconds call was 0
(X'0'). Record these values and tell the systems administrator.
Also, how do I interpret the TCP return codes?
Thanks |
|
fjb_saper |
Posted: Sun Jan 01, 2006 8:30 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
|
csmith28 |
Posted: Sun Jan 01, 2006 11:41 am Post subject: |
|
|
 Grand Master
Joined: 15 Jul 2003 Posts: 1196 Location: Arizona
|
There are no TCP/IP return codes in that error output, and aside from the 360 second TIMEOUT the error is rather ambiguous. Was it accompanied by another alert like "AMQ9999: Channel ended abnormally." or something else? Is the listener process running?
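For what it's worth, a return value of 0 from select() is not an errno value: it means the timeout expired before any socket became ready. Here is a minimal Python sketch of that behaviour, assuming a local socket pair stands in for the channel connection. The helper name `wait_for_data` and the short timeouts are made up for illustration, and this is not how the MQ channel code itself is written.

```python
import select
import socket


def wait_for_data(sock, timeout_seconds):
    """Return True if sock becomes readable before the timeout, else False.

    select.select() returns three empty lists when the timeout expires,
    which corresponds to the C select() call returning 0.
    """
    readable, _, _ = select.select([sock], [], [], timeout_seconds)
    return bool(readable)


if __name__ == "__main__":
    # Hypothetical stand-in for the two ends of a channel connection.
    receiver_end, sender_end = socket.socketpair()
    try:
        # Nothing has been sent yet, so the wait times out.
        print("data arrived:", wait_for_data(receiver_end, 0.1))
        sender_end.sendall(b"heartbeat")
        # Now there is data to read, so the wait succeeds.
        print("data arrived:", wait_for_data(receiver_end, 0.1))
    finally:
        receiver_end.close()
        sender_end.close()
```

Read that way, the AMQ9213 entry seems to say that the receiving end waited 360 seconds and nothing arrived, rather than that TCP/IP reported a specific failure.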
Here's a list of AIX TCP/IP return codes (the errno values from errno.h):
Code: |
#define EPERM 1 /* Operation not permitted */
#define ENOENT 2 /* No such file or directory */
#define ESRCH 3 /* No such process */
#define EINTR 4 /* interrupted system call */
#define EIO 5 /* I/O error */
#define ENXIO 6 /* No such device or address */
#define E2BIG 7 /* Arg list too long */
#define ENOEXEC 8 /* Exec format error */
#define EBADF 9 /* Bad file descriptor */
#define ECHILD 10 /* No child processes */
#define EAGAIN 11 /* Resource temporarily unavailable */
#define ENOMEM 12 /* Not enough space */
#define EACCES 13 /* Permission denied */
#define EFAULT 14 /* Bad address */
#define ENOTBLK 15 /* Block device required */
#define EBUSY 16 /* Resource busy */
#define EEXIST 17 /* File exists */
#define EXDEV 18 /* Improper link */
#define ENODEV 19 /* No such device */
#define ENOTDIR 20 /* Not a directory */
#define EISDIR 21 /* Is a directory */
#define EINVAL 22 /* Invalid argument */
#define ENFILE 23 /* Too many open files in system */
#define EMFILE 24 /* Too many open files */
#define ENOTTY 25 /* Inappropriate I/O control operation */
#define ETXTBSY 26 /* Text file busy */
#define EFBIG 27 /* File too large */
#define ENOSPC 28 /* No space left on device */
#define ESPIPE 29 /* Invalid seek */
#define EROFS 30 /* Read only file system */
#define EMLINK 31 /* Too many links */
#define EPIPE 32 /* Broken pipe */
#define EDOM 33 /* Domain error within math function */
#define ERANGE 34 /* Result too large */
#define ENOMSG 35 /* No message of desired type */
#define EIDRM 36 /* Identifier removed */
#define ECHRNG 37 /* Channel number out of range */
#define EL2NSYNC 38 /* Level 2 not synchronized */
#define EL3HLT 39 /* Level 3 halted */
#define EL3RST 40 /* Level 3 reset */
#define ELNRNG 41 /* Link number out of range */
#define EUNATCH 42 /* Protocol driver not attached */
#define ENOCSI 43 /* No CSI structure available */
#define EL2HLT 44 /* Level 2 halted */
#define EDEADLK 45 /* Resource deadlock avoided */
#define ENOTREADY 46 /* Device not ready */
#define EWRPROTECT 47 /* Write-protected media */
#define EFORMAT 48 /* Unformatted media */
#define ENOLCK 49 /* No locks available */
#define ENOCONNECT 50 /* no connection */
#define ESTALE 52 /* no filesystem */
#define EDIST 53 /* old, currently unused AIX errno*/
#define EINPROGRESS 55 /* Operation now in progress */
#define EALREADY 56 /* Operation already in progress */
#define ENOTSOCK 57 /* Socket operation on non-socket */
#define EDESTADDRREQ 58 /* Destination address required */
#define EDESTADDREQ EDESTADDRREQ /* Destination address required */
#define EMSGSIZE 59 /* Message too long */
#define EPROTOTYPE 60 /* Protocol wrong type for socket */
#define ENOPROTOOPT 61 /* Protocol not available */
#define EPROTONOSUPPORT 62 /* Protocol not supported */
#define ESOCKTNOSUPPORT 63 /* Socket type not supported */
#define EOPNOTSUPP 64 /* Operation not supported on socket */
#define EPFNOSUPPORT 65 /* Protocol family not supported */
#define EAFNOSUPPORT 66 /* Address family not supported by protocol family */
#define EADDRINUSE 67 /* Address already in use */
#define EADDRNOTAVAIL 68 /* Can't assign requested address */
#define ENETDOWN 69 /* Network is down */
#define ENETUNREACH 70 /* Network is unreachable */
#define ENETRESET 71 /* Network dropped connection on reset */
#define ECONNABORTED 72 /* Software caused connection abort */
#define ECONNRESET 73 /* Connection reset by peer */
#define ENOBUFS 74 /* No buffer space available */
#define EISCONN 75 /* Socket is already connected */
#define ENOTCONN 76 /* Socket is not connected */
#define ESHUTDOWN 77 /* Can't send after socket shutdown */
#define ETIMEDOUT 78 /* Connection timed out */
#define ECONNREFUSED 79 /* Connection refused */
#define EHOSTDOWN 80 /* Host is down */
#define EHOSTUNREACH 81 /* No route to host */
#define EPROCLIM 83 /* Too many processes */
#define EUSERS 84 /* Too many users */
#define ELOOP 85 /* Too many levels of symbolic links */
#define ENAMETOOLONG 86 /* File name too long */
#define ENOTEMPTY 87 /* Directory not empty */
#define EDQUOT 88 /* Disc quota exceeded */
#define ECORRUPT 89 /* Invalid file system control data */
#define EREMOTE 93 /* Item is not local to host */
#define ENOSYS 109 /* Function not implemented POSIX */
#define EMEDIA 110 /* media surface error */
#define ESOFT 111 /* I/O completed, but needs relocation */
#define ENOATTR 112 /* no attribute found */
#define ESAD 113 /* security authentication denied */
#define ENOTRUST 114 /* not a trusted program */
#define ETOOMANYREFS 115 /* Too many references: can't splice */
#define EILSEQ 116 /* Invalid wide character */
#define ECANCELED 117 /* asynchronous i/o cancelled */
#define ENOSR 118 /* temp out of streams resources */
#define ETIME 119 /* I_STR ioctl timed out */
#define EBADMSG 120 /* wrong message type at stream head */
#define EPROTO 121 /* STREAMS protocol error */
#define ENODATA 122 /* no message ready at stream head */
#define ENOSTR 123 /* fd is not a stream */
#define ECLONEME ERESTART /* this is the way we clone a stream ... */
#define ENOTSUP 124 /* POSIX threads unsupported value */
#define EMULTIHOP 125 /* multihop is not allowed */
#define ENOLINK 126 /* the link has been severed */
#define EOVERFLOW 127 /* value too large to be stored in data type */ |
_________________ Yes, I am an agent of Satan but my duties are largely ceremonial. |
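To turn a non-zero return code into a name and description on the machine you are working on, something like the following sketch may help. It uses Python's standard `errno` and `os` modules; `describe_return_code` is a hypothetical helper name. Bear in mind that errno numbers differ between platforms, so the numbers listed above apply to AIX, while the lookup below reflects whatever system runs the script.

```python
import errno
import os


def describe_return_code(code):
    """Map a numeric return code to (symbolic name, description) on this platform."""
    if code == 0:
        # 0 is not an errno value; for select() it means the timeout expired.
        return ("0", "no error reported")
    name = errno.errorcode.get(code, "UNKNOWN")
    return (name, os.strerror(code))


if __name__ == "__main__":
    for code in (0, errno.ETIMEDOUT, errno.ECONNRESET, errno.ECONNREFUSED):
        name, text = describe_return_code(code)
        print(f"{code:>4} {name:<14} {text}")
```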
|
haba1311 |
Posted: Mon Jan 02, 2006 12:52 am Post subject: |
|
|
Newbie
Joined: 04 Feb 2005 Posts: 8
|
You are correct. After the TCP error you get the following:
----- amqccita.c : 3084 -------------------------------------------------------
01/01/06 03:23:55
AMQ9999: Channel program ended abnormally.
EXPLANATION:
Channel program 'PGBCGI01.PDEHAA02' ended abnormally.
ACTION:
Look at previous error messages for channel program 'PGBCGI01.PDEHAA02' in the
error files to determine the cause of the failure.
----- amqrmrsa.c : 467 --------------------------------------------------------
01/01/06 03:35:06
AMQ9002: Channel 'PGBCGI01.PDEHAA02' is starting.
EXPLANATION:
Channel 'PGBCGI01.PDEHAA02' is starting.
ACTION:
None.
-------------------------------------------------------------------------------
01/01/06 04:25:11
AMQ9213: A communications error for TCP/IP occurred.
EXPLANATION:
An unexpected error occurred in communications.
ACTION:
The return code from the TCP/IP(select) [TIMEOUT] 360 seconds call was 0
(X'0'). Record these values and tell the systems administrator.
----- amqccita.c : 3084 -------------------------------------------------------
01/01/06 04:25:11
AMQ9999: Channel program ended abnormally.
EXPLANATION:
Channel program 'PGBCGI01.PDEHAA02' ended abnormally.
ACTION:
Look at previous error messages for channel program 'PGBCGI01.PDEHAA02' in the
error files to determine the cause of the failure.
----- amqrmrsa.c : 467 --------------------------------------------------------
01/01/06 04:25:21
AMQ9002: Channel 'PGBCGI01.PDEHAA02' is starting.
EXPLANATION:
Channel 'PGBCGI01.PDEHAA02' is starting.
ACTION:
None.
After which the channel runs OK.
Then it happens again:
01/01/06 18:40:00
AMQ9213: A communications error for TCP/IP occurred.
EXPLANATION:
An unexpected error occurred in communications.
ACTION:
The return code from the TCP/IP(select) [TIMEOUT] 360 seconds call was 0
(X'0'). Record these values and tell the systems administrator.
----- amqccita.c : 3084 -------------------------------------------------------
01/01/06 18:40:00
AMQ9999: Channel program ended abnormally.
EXPLANATION:
Channel program 'PGBCGI01.PDEHAA02' ended abnormally.
ACTION:
Look at previous error messages for channel program 'PGBCGI01.PDEHAA02' in the
error files to determine the cause of the failure.
----- amqrmrsa.c : 467 --------------------------------------------------------
01/01/06 18:46:35
AMQ9002: Channel 'PGBCGI01.PDEHAA02' is starting.
EXPLANATION:
Channel 'PGBCGI01.PDEHAA02' is starting.
ACTION:
None.
Again the channel runs OK.
The MQ listener process is running fine, hence the subsequent channel start is successful.
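To see how long the channel stayed down after each timeout, the log entries can be paired up mechanically. The sketch below assumes entries look like the excerpts above (a timestamp line followed by an `AMQnnnn:` line) and that dates are in MM/DD/YY order, which the log does not confirm. The function names are made up for illustration.

```python
import re
from datetime import datetime

STAMP = re.compile(r"^(\d{2}/\d{2}/\d{2}) (\d{2}:\d{2}:\d{2})$")
MESSAGE = re.compile(r"^(AMQ\d{4}):")


def parse_entries(log_text):
    """Return a list of (timestamp, message_id) pairs found in log_text."""
    entries = []
    pending_stamp = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        stamp = STAMP.match(line)
        if stamp:
            # Assumption: MM/DD/YY date order.
            pending_stamp = datetime.strptime(
                f"{stamp.group(1)} {stamp.group(2)}", "%m/%d/%y %H:%M:%S"
            )
            continue
        message = MESSAGE.match(line)
        if message and pending_stamp is not None:
            entries.append((pending_stamp, message.group(1)))
            pending_stamp = None
    return entries


def outage_gaps(entries):
    """Pair each AMQ9213 with the next AMQ9002; return (failure time, gap in seconds)."""
    gaps = []
    failed_at = None
    for when, message_id in entries:
        if message_id == "AMQ9213" and failed_at is None:
            failed_at = when
        elif message_id == "AMQ9002" and failed_at is not None:
            gaps.append((failed_at, (when - failed_at).total_seconds()))
            failed_at = None
    return gaps


if __name__ == "__main__":
    sample = """\
01/01/06 03:23:55
AMQ9213: A communications error for TCP/IP occurred.
01/01/06 03:23:55
AMQ9999: Channel program ended abnormally.
01/01/06 03:35:06
AMQ9002: Channel 'PGBCGI01.PDEHAA02' is starting.
"""
    for failed_at, seconds in outage_gaps(parse_entries(sample)):
        print(failed_at, seconds)
```

Applied to the timestamps quoted in this thread (03:23:55 to 03:35:06, 04:25:11 to 04:25:21, 18:40:00 to 18:46:35), that gives gaps of 671, 10 and 395 seconds.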
The symptom of the above is that the WICS MQ adapter, which runs on the server where PGBCGI01 resides, is unable to write to the remote queue on PGBCGI01 that is 'linked' to the local queue on PDEHAA02. This failure to put messages causes the WICS MQ adapter to die with the following errors in the WICS MQ adapter log (repeated 3 times):
[System: ConnectorAgent] [SS: UL_FEU_ECLIPSE_MQSeriesConnector] [Thread: VBJ ThreadPool Worker (#1902439807)] [Type: Error] [MsgID: 27004] [Mesg: Failed to put message (no ID assigned) in queue queue://PGBCGI01/U2378.WICS.ECLIPSE_MESSAGES.XIB?targetClient=1. JMS Provider (IBM) reported the following error: MQJMS2007: failed to send message to MQ queue. LinkedException: MQJMS2007: failed to send message to MQ queue.]
Then you get the following in the WICS MQ adapter log(repeated times):
[System: ConnectorAgent] [SS: UL_FEU_ECLIPSE_MQSeriesConnector] [Thread: VBJ ThreadPool Worker (#1902996863)] [Type: Error] [MsgID: 21010] [Mesg: Failed to complete processing due to an unrecoverable error. See previous error message(s) for details and correct the situation before continuing.]
Then (3 times):
[System: ConnectorAgent] [SS: UL_FEU_ECLIPSE_MQSeriesConnector] [Thread: VBJ ThreadPool Worker (#1902751103)] [Type: Error] [MsgID: 21003] [Mesg: Failed in doVerbFor for business object UL_FEU_ECLIPSE_ASBO_WMQ_Z2MATMAS with verb Create. Explanation provided to InterChange server: javax.jms.JMSException: MQJMS2003: failed to disconnect queue manager.]
Then (3 times):
[System: ConnectorAgent] [SS: UL_FEU_ECLIPSE_MQSeriesConnector] [Thread: VBJ ThreadPool Worker (#1902751103)] [Type: Info] [MsgID: 17066] [Mesg: Application state is disconnected]
Then (3 times):
[System: Server] [Thread: Restart Thread - UL_FEU_ECLIPSE_MQSeriesConnector- retry #1 (#1901096319)] [Mesg: java.lang.NullPointerException
at CxCommon.Messaging.IIOP.IDLControllerProxy.stopSession(IDLControllerProxy.java:735)
at AppSide_Connector.AppEnd.reinit(AppEnd.java:1753)
at AppSide_Connector.AppEnd.access$000(AppEnd.java:46)
at AppSide_Connector.AppEnd$1.run(AppEnd.java:1656)
]
Adapter dies.
As a short-term fix, a monitor job detects that the adapter has died and restarts it successfully. The WICS MQ adapter then successfully puts messages on the remote queue on PGBCGI01, which successfully sends them to PDEHAA02, until the next time there is a TCP error. |
|
csmith28 |
Posted: Mon Jan 02, 2006 7:31 am Post subject: |
|
|
 Grand Master
Joined: 15 Jul 2003 Posts: 1196 Location: Arizona
|
Is 'PGBCGI01.PDEHAA02' a sender channel? If so, I doubt that it is starting successfully. It is more likely RETRYING after it encounters the first TIMEOUT.
Please display the XMITQ and post the results here. From what you have posted, this doesn't appear to be a problem with WICS; it is more likely network latency between PGBCGI01 and PDEHAA02, or for some reason PDEHAA02 is not responding. |
|
haba1311 |
Posted: Mon Jan 02, 2006 8:40 am Post subject: |
|
|
Newbie
Joined: 04 Feb 2005 Posts: 8
|
'PGBCGI01.PDEHAA02' is a sender channel.
The sender side starts successfully and messages are sent, but when you get the TCP error on the receiver you get the following in the sender MQ log:
01/01/06 17:46:35
AMQ9209: Connection to host 'uedp002a (xxxxxxx)' closed.
EXPLANATION:
An error occurred receiving data from 'uedp002a (xxxxxxxx) over TCP/IP.
The connection to the remote host has unexpectedly terminated.
ACTION:
Tell the systems administrator.
NB: uedp002a is where the receiving qmgr resides.
IP addresses have been replaced with xxxxxxx, for obvious reasons.
XMITQ qdepth is zero at the moment.
The XMITQ definition is:
DEFINE QLOCAL ('PDEHAA02') +
* CRDATE (2005-12-14) +
* CRTIME (13.14.4 +
* ALTDATE (2005-12-1 +
* ALTTIME (21.34.47) +
DESCR('WebSphere MQ Default Local Queue') +
PUT(ENABLED) +
DEFPRTY(0) +
DEFPSIST(NO) +
SCOPE(QMGR) +
GET(ENABLED) +
MAXDEPTH(5000) +
MAXMSGL(4194304) +
SHARE +
DEFSOPT(SHARED) +
MSGDLVSQ(PRIORITY) +
HARDENBO +
USAGE(XMITQ) +
TRIGGER +
TRIGTYPE(FIRST) +
TRIGDPTH(1) +
TRIGMPRI(0) +
TRIGDATA('PGBCGI01.PDEHAA02') +
PROCESS(' ') +
INITQ('SYSTEM.CHANNEL.INITQ') +
RETINTVL(999999999) +
BOTHRESH(0) +
BOQNAME(' ') +
QDEPTHHI(80) +
QDEPTHLO(20) +
QDPMAXEV(ENABLED) +
QDPHIEV(DISABLED) +
QDPLOEV(DISABLED) +
QSVCINT(999999999) +
QSVCIEV(NONE) +
DISTL(YES) +
CLUSTER(' ') +
CLUSNL(' ') +
DEFBIND(OPEN) +
REPLACE
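For reference, QDEPTHHI and QDEPTHLO are percentages of MAXDEPTH, so the definition above fills at 5000 messages, with the high and low marks at 4000 and 1000. A small sketch of that arithmetic follows, assuming the definition is available as MQSC text in the form shown. The parsing helper is hypothetical and only handles simple NAME(value) attributes.

```python
import re

ATTRIBUTE = re.compile(r"\b([A-Z]+)\(([^)]*)\)")


def parse_attributes(mqsc_text):
    """Collect NAME(value) attributes from MQSC text, skipping '*' comment lines."""
    attributes = {}
    for line in mqsc_text.splitlines():
        if line.lstrip().startswith("*"):
            continue
        for name, value in ATTRIBUTE.findall(line):
            attributes[name] = value.strip("' ")
    return attributes


def depth_thresholds(attributes):
    """Return (max depth, high mark, low mark) as message counts."""
    max_depth = int(attributes["MAXDEPTH"])
    high = max_depth * int(attributes["QDEPTHHI"]) // 100
    low = max_depth * int(attributes["QDEPTHLO"]) // 100
    return max_depth, high, low


if __name__ == "__main__":
    definition = """\
MAXDEPTH(5000) +
USAGE(XMITQ) +
QDEPTHHI(80) +
QDEPTHLO(20) +
"""
    print(depth_thresholds(parse_attributes(definition)))
```

As far as I know, the depth events (QDPMAXEV, QDPHIEV, QDPLOEV) are only generated if performance events are also enabled on the queue manager, so check that setting before relying on them.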
Also, if the channel between sender and receiver doesn't work and the XMITQ on the sender side fills up, would you expect an error message in the MQ logs when the adapter tries to write to the full XMITQ? |
|
csmith28 |
Posted: Mon Jan 02, 2006 8:56 am Post subject: |
|
|
 Grand Master
Joined: 15 Jul 2003 Posts: 1196 Location: Arizona
|
This is almost certainly network latency of some sort. Something is interrupting this connection after it is established. I doubt it has anything to do with the application.
Quote: |
Also, if the channel between sender and receiver doesn't work and the XMITQ on the sender side fills up, would you expect an error message in the MQ logs when the adapter tries to write to the full XMITQ? |
Yes, if a message is put to a full XMITQ you will get an error in the AMQERR*.LOG stating that the message was placed on the dead letter queue, if one is defined.
Earlier you did say that the adapter was doing a PUT to a QREMOTE that points to an XMITQ, correct? You're not PUTting directly to the XMITQ, right? |
|
haba1311 |
Posted: Tue Jan 03, 2006 4:00 am Post subject: |
|
|
Newbie
Joined: 04 Feb 2005 Posts: 8
|
Yes, you are correct: messages are put on the XMITQ via a remote queue.
I have asked networks/comms to put a sniffer on it, etc. Cheers |
|
fjb_saper |
Posted: Tue Jan 03, 2006 3:47 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
Worst case scenario: a sniffer may be responsible.... _________________ MQ & Broker admin |
|