Author |
Message
|
nosnhoj |
Posted: Fri Jun 12, 2009 7:38 am Post subject: Heartbeat timeout and after effects... |
|
|
Apprentice
Joined: 07 Sep 2005 Posts: 40 Location: Markham On.
|
I have a sender/receiver channel pair connected to a mainframe. Every once in a while we see the HBINT timeout in the logs:
AMQ9213: A communications error for TCP/IP occurred.
EXPLANATION:
An unexpected error occurred in communications.
ACTION:
The return code from the TCP/IP(select) [TIMEOUT] 360 seconds call was 0
Sometimes everything resumes normally, but on occasion our (unix) sender channel will show 'running', yet there will be messages in the xmitq. The only way to get rid of them is to stop the channel, remove the messages, resolve and restart the channel.
The logs on the unix side show nothing except the timeout, then immediately after show the sender starting.
The mainframe side shows channel ended abnormally, then channel adopted and started.
We have added adoptmca to both sides, and have the hbint and discint set.
After all that, I guess my question is: why would our heartbeat time out (we have other connections to the same mainframe that remain up), and what happens after such a timeout and an adoption to cause the xmit queue to fill?
Thanks |
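For anyone who wants to count how often this timeout shows up, here is a minimal Python sketch that pulls the call name, timeout and return code out of an AMQ9213 entry. The regular expression is modelled only on the single log excerpt quoted above; the function name `parse_amq9213` is made up for illustration, and other MQ versions or platforms may word the line differently.

```python
import re

# Modelled on the one line quoted in the post:
#   "The return code from the TCP/IP(select) [TIMEOUT] 360 seconds call was 0"
# Assumption: the "[TIMEOUT] nnn seconds" part is optional in other variants.
RC_LINE = re.compile(
    r"return code from the (?P<proto>\S+?)\s*\((?P<call>\w+)\)"
    r"(?:\s*\[TIMEOUT\]\s*(?P<timeout>\d+)\s*seconds)?"
    r"\s*call was\s*(?P<rc>-?\d+)"
)


def parse_amq9213(text):
    """Return protocol, call, timeout and return code, or None if not found."""
    if "AMQ9213" not in text:
        return None
    match = RC_LINE.search(text)
    if match is None:
        return None
    timeout = match.group("timeout")
    return {
        "protocol": match.group("proto"),
        "call": match.group("call"),
        "timeout_seconds": int(timeout) if timeout is not None else None,
        "return_code": int(match.group("rc")),
    }
```

Fed the log text from the post, this should report the `select` call with a 360 second timeout and return code 0, which makes it easy to line the timeouts up against the mainframe side by timestamp.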
|
fjb_saper |
Posted: Sat Jun 13, 2009 5:40 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
You did not specify the version and fix pack level of the qmgr. I believe this was addressed by some APAR and would be fixed in the latest fix packs. You would have to search the IBM site for this behavior or open a PMR.
Have fun  _________________ MQ & Broker admin |
|
sumit |
Posted: Mon Jun 15, 2009 1:43 am Post subject: Re: Heartbeat timeout and after effects... |
|
|
Partisan
Joined: 19 Jan 2006 Posts: 398
|
nosnhoj wrote: |
The only way to get rid of them is to stop the channel, remove the messages, resolve and restart the channel. |
And you don't need to remove messages from XMITQ to perform these steps. _________________ Regards
Sumit |
|
PeterPotkay |
Posted: Mon Jun 15, 2009 6:54 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
And unless you see messages in the QM error logs that say the channel is in doubt, I'd be surprised if you really need to resolve the channel. _________________ Peter Potkay
Keep Calm and MQ On |
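The "only resolve if it is really in doubt" advice can be turned into a quick check. This is a rough Python sketch, assuming you have captured the text output of a `DISPLAY CHSTATUS` command for the sender channel and that it lists attributes in the usual `KEYWORD(value)` form, including `INDOUBT`. The helper names `parse_chstatus` and `needs_resolve` are hypothetical, and the parsing is deliberately naive.

```python
import re

# MQSC display output lists attributes as KEYWORD(value).
# Assumption: values contain no nested parentheses.
ATTR = re.compile(r"\b([A-Z]+)\(([^)]*)\)")


def parse_chstatus(output):
    """Collect KEYWORD(value) pairs from captured DISPLAY CHSTATUS text."""
    return {key: value.strip() for key, value in ATTR.findall(output)}


def needs_resolve(output):
    """True only when the captured status reports INDOUBT(YES)."""
    return parse_chstatus(output).get("INDOUBT", "").upper() == "YES"
```

If `needs_resolve` comes back false, a plain stop and start of the channel is probably all that is called for, and the messages can stay on the xmitq.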
|
nosnhoj |
Posted: Wed Jun 17, 2009 6:47 am Post subject: |
|
|
Apprentice
Joined: 07 Sep 2005 Posts: 40 Location: Markham On.
|
I know we are a bit behind....
Name: WebSphere MQ
Version: 530.11 CSD11
Mainframe spits this out when the timeout happens:
CSQX208E ???? CSQXRESP Error receiving data, 641
09.39.51 STC00028 +CSQX599E ???? CSQXRESP Channel CHANNEL.01 ended abnormally
09.39.51 STC00028 +CSQX475I ???? CSQXRESP Channel CHANNEL.01 adopted |
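To match these mainframe entries against the unix-side timeouts, a small Python sketch along these lines may help. It is based only on the three lines pasted above: the field names (`time`, `job`, `msgid`, `prefix`, `module`, `text`) are labels chosen here for illustration, and the layout of a real job log may differ.

```python
import re

# Layout assumed from the pasted lines:
#   [hh.mm.ss jobid] [+]CSQXnnnX prefix module free text
LINE = re.compile(
    r"^(?:(?P<time>\d{2}\.\d{2}\.\d{2})\s+(?P<job>\S+)\s+)?"
    r"\+?(?P<msgid>CSQX\d{3}[A-Z])\s+(?P<prefix>\S+)\s+(?P<module>\S+)\s+"
    r"(?P<text>.*)$"
)
CHANNEL = re.compile(r"Channel (\S+)")


def parse_line(line):
    """Split one log line into fields; return None if it does not match."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    channel = CHANNEL.search(record["text"])
    record["channel"] = channel.group(1) if channel else None
    return record


def adopted_after_abend(lines):
    """Channels where CSQX599E (ended abnormally) is followed by CSQX475I (adopted)."""
    ended = set()
    adopted = []
    for line in lines:
        record = parse_line(line)
        if record is None or record["channel"] is None:
            continue
        if record["msgid"] == "CSQX599E":
            ended.add(record["channel"])
        elif record["msgid"] == "CSQX475I" and record["channel"] in ended:
            adopted.append(record["channel"])
            ended.discard(record["channel"])
    return adopted
```

Running `adopted_after_abend` over the job log gives a list of channels that went through the abend-then-adopt sequence, which can then be compared with the times the xmitq started backing up.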
|
sumit |
Posted: Fri Jun 19, 2009 7:37 am Post subject: |
|
|
Partisan
Joined: 19 Jan 2006 Posts: 398
|
nosnhoj wrote: |
..Channel CHANNEL.01 ended abnormally.. |
Any FDCs? _________________ Regards
Sumit |
|