TCP/IP False Failure |
Author |
Message
|
CMX |
Posted: Fri Nov 05, 2004 5:05 pm Post subject: TCP/IP False Failure |
|
|
Newbie
Joined: 05 Nov 2004 Posts: 4
|
We are running on an AIX RS/6000 system. To keep our connection alive, we are using HBINT as the keepalive mechanism. Once in a while, however, HBINT is unable to locate the remote server and alerts the admin that the channel is disconnected. But when the admin investigates, we can't find any indication of a channel disconnection on either end. It's as if there is a glitch on the connection.
The network team suspects that the keepalive mechanism for our channel is too sensitive. Does anyone know why this happens? |
csmith28 |
Posted: Fri Nov 05, 2004 5:48 pm Post subject: |
|
|
 Grand Master
Joined: 15 Jul 2003 Posts: 1196 Location: Arizona
|
What versions of MQ & AIX are you running?
What exactly indicated that your HBINT was not able to locate the remote server? Were there any entries in errpt, a return code, or an entry in the AMQERR0*.LOG files or any other logs?
Is KeepAlive=YES in your qm.ini or are you just using the HBINT attribute?
Are you getting any AMQ9999 Channel Ended Abnormally entries in your /var/mqm/qmgrs/MQMGRNAME/errors/AMQERR0*.LOG files?
I have had a similar problem in the past. It's as if the server that hosts my queue manager suddenly, and for no apparent reason, takes a "Network Nap" for intervals that range from a few seconds to 30 minutes.
No .FDC files are generated, and errpt -a | more shows no hardware failures regarding the NIC and no software failures regarding inetd or TCP/IP. My /var/mqm/qmgrs/MQMGR/errors/AMQERR01.LOG only shows AMQ9999 channel ended abnormally for my SDR/RCVR channels and a large number of TCP/IP "Connection reset by peer" entries. I once got the Network group to put a sniffer on the MQ server for about three weeks. The problem recurred once while the sniffer was in place, but by the time we got someone from the Network group to join the bridge, service had been restored and the sniffer buffer was already full of normal traffic, which had pushed out any information from the "Network Nap" that the server took.
During these outages I could not ping, traceroute, or telnet to my MQ server.
In every case, things suddenly just started to work again. No one knows why. I know I didn't do anything to restore service, and the Network group claims the same. _________________ Yes, I am an agent of Satan but my duties are largely ceremonial. |
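The log check described in this post (looking for AMQ9999 and "Connection reset by peer" entries in the AMQERR0*.LOG files) could be scripted so that it runs the same way on both ends of the channel. What follows is a minimal sketch, not an IBM-supplied tool: the function name `count_channel_errors` is hypothetical, the two search strings are taken from the post above, and it assumes the logs are plain text that can be matched line by line.

```python
import glob
import os
from collections import Counter

# Search strings taken from the thread; matching is case-insensitive.
PATTERNS = ("AMQ9999", "reset by peer")


def count_channel_errors(error_dir):
    """Count lines in AMQERR0*.LOG files under error_dir that contain each pattern."""
    counts = Counter()
    for path in sorted(glob.glob(os.path.join(error_dir, "AMQERR0*.LOG"))):
        # errors="replace" keeps the scan going if a log holds odd bytes.
        with open(path, errors="replace") as fh:
            for line in fh:
                lowered = line.lower()
                for pattern in PATTERNS:
                    if pattern.lower() in lowered:
                        counts[pattern] += 1
    return counts


if __name__ == "__main__":
    # Path pattern as quoted in the thread; substitute the real queue manager name.
    print(count_channel_errors("/var/mqm/qmgrs/MQMGR/errors"))
```

Comparing the counts from the sending and receiving queue managers for the same time window is one way to see whether both sides logged the drop or only one did.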
CMX |
Posted: Fri Nov 05, 2004 6:06 pm Post subject: |
|
|
Newbie
Joined: 05 Nov 2004 Posts: 4
|
AIX ver. 4.3
MQSeries ver. 5.3
The error was indicated in the error report in the /var/mqm/qmgrs/ directory. The odd part is that it only occurred in a few of our projects.
So how did you resolve the situation when you had the problem? |
CMX |
Posted: Fri Nov 05, 2004 6:10 pm Post subject: |
|
|
Newbie
Joined: 05 Nov 2004 Posts: 4
|
Also, we are only using HBINT to keep the connection alive. Would it be a better alternative to set KeepAlive in the TCP stanza instead? |
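For context on the two settings being compared: HBINT is a channel attribute that drives heartbeat flows at the MQ level, while KeepAlive=YES in the TCP stanza of qm.ini asks the operating system to send TCP keepalive probes on the channel's socket, with the timing governed by OS settings. The sketch below checks whether a qm.ini file already has that setting. It is only an illustration: the function name `tcp_keepalive_enabled` is hypothetical, and the parser assumes the usual stanza layout (a `Name:` header line followed by `Key=Value` lines) rather than handling every qm.ini variation.

```python
def tcp_keepalive_enabled(qm_ini_text):
    """Return True if the TCP stanza contains KeepAlive=YES (case-insensitive)."""
    in_tcp = False
    for raw in qm_ini_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", ";")):
            continue
        # A stanza header looks like "TCP:" with no "=" on the line.
        if line.endswith(":") and "=" not in line:
            in_tcp = line[:-1].strip().upper() == "TCP"
            continue
        if in_tcp and "=" in line:
            key, _, value = line.partition("=")
            if key.strip().upper() == "KEEPALIVE":
                return value.strip().upper() == "YES"
    return False


if __name__ == "__main__":
    sample = "Log:\n   LogPrimaryFiles=3\nTCP:\n   KeepAlive=Yes\n"
    print(tcp_keepalive_enabled(sample))
```

A result of False means either that the stanza is missing or that KeepAlive is not set to YES, so only the channel-level HBINT is in play.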