Author |
Message
|
KeeferG |
Posted: Tue Feb 08, 2005 10:09 am Post subject: How long until the MCA detects a TCP error |
|
|
 Master
Joined: 15 Oct 2004 Posts: 215 Location: Basingstoke, UK
|
We are currently running tests to simulate network errors. One of the tests disables the network interface so that the IP address is no longer valid.
After the interface is disabled, dis chs(*) still shows the channels as running. Because we are sending non-persistent messages over fast channels, the messages are simply thrown away.
My questions are:
a) What is the mechanism that informs the MCA of a TCP problem so that it can shut down?
b) How quickly should these errors be detected? We are seeing delays of minutes before the channels realise there is a problem and stop sending messages.
Name: WebSphere MQ
Version: 530.8 CSD08
CMVC level: p530-08-L040921
BuildType: IKAP - (Production)
OS: Solaris
_________________
Keith Guttridge
-----------------
Using MQ since 1995 |
PeterPotkay |
Posted: Tue Feb 08, 2005 1:12 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
KeeferG |
Posted: Wed Feb 09, 2005 2:20 am Post subject: |
|
|
 Master
Joined: 15 Oct 2004 Posts: 215 Location: Basingstoke, UK
|
Heartbeats do not help here, though. They are only sent when there is no traffic. We are sending non-persistent messages constantly, so the heartbeat never fires. Because we send only non-persistent messages, at a rate of 50 to 100 per second, outside of syncpoint with NPMSPEED(FAST), the messages just get fired into nothingness. KeepAlive is already set.
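To make the point about heartbeats concrete, here is a toy model in Python. It is only a sketch of the behaviour described in this post (a heartbeat flows only after a full interval with no traffic), not MQ's actual implementation. The function name and the 60-second window are made up for illustration; the 10-second interval is the HBINT value mentioned later in the thread.

```python
def heartbeats_sent(send_times, hbint, end_time):
    """Count heartbeats in a toy model where one heartbeat flows after
    every `hbint` seconds during which no message was sent.

    send_times: timestamps (seconds) at which messages were sent
    hbint:      heartbeat interval in seconds
    end_time:   end of the observation window in seconds
    """
    count = 0
    last_activity = 0.0
    # Walk through each gap between consecutive sends, plus the final
    # gap up to the end of the window.
    for t in sorted(send_times) + [end_time]:
        idle = t - last_activity
        count += int(idle // hbint)  # heartbeats that fit into the idle gap
        last_activity = t
    return count


# 50 messages per second for 60 seconds: no gap ever reaches 10 seconds.
busy_channel = [i / 50 for i in range(1, 50 * 60 + 1)]
print(heartbeats_sent(busy_channel, hbint=10, end_time=60.0))

# An idle channel over the same window, for comparison.
print(heartbeats_sent([], hbint=10, end_time=60.0))
```

In this model the busy channel never goes idle long enough for a heartbeat to flow, which is why heartbeats alone would not expose the dead link in the scenario above.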
I assume we need to change something in the TCP/IP settings, but I need to know how they interact with MQ. I don't think there is anything I can do in MQ itself to detect this.
_________________
Keith Guttridge
-----------------
Using MQ since 1995 |
PeterPotkay |
Posted: Wed Feb 09, 2005 5:39 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
Hmmmm, I see.
Please post whatever you find. I am interested in the resolution.
_________________
Peter Potkay
Keep Calm and MQ On |
fjb_saper |
Posted: Wed Feb 09, 2005 8:13 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
Maybe a stupid question:
I suppose your channel has a default go-inactive time of at least 1 hour after the last message.
Now, if you are pushing messages through at the rate you indicate, what is the purpose of the "Keep Alive" parameter? Wouldn't it just be counterproductive?
Thanks for helping to educate me... |
Nigelg |
Posted: Thu Feb 10, 2005 1:39 am Post subject: |
|
|
Grand Master
Joined: 02 Aug 2004 Posts: 1046
|
Quote: |
a) What is the mechanism that informs the MCA of a TCP problem so that it can shut down?
b) How quickly should these errors be detected? We are seeing delays of minutes before the channels realise there is a problem and stop sending messages.
|
a) The mechanism is the TCP API: the calls to send() and recv(). WMQ depends on these calls returning an error.
b) That is up to the TCP layer and how quickly it reports the error. |
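A small Python sketch of what "dependent on send() returning the error" looks like in practice. This is a loopback illustration only, not MQ code; the function name and message text are made up. Here the peer closes its socket cleanly, so the local stack learns about the failure almost at once, and even then the first send() after the close still succeeds.

```python
import socket
import time


def sends_until_error(max_sends=50, pause=0.05):
    """Return how many send() calls succeeded after the peer went away."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # any free loopback port
    listener.listen(1)
    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()

    receiver.close()  # the receiving end goes away
    listener.close()

    succeeded = 0
    try:
        for _ in range(max_sends):
            # send() only hands the data to the local TCP stack; it reports
            # an error only once the stack already knows the link is dead.
            sender.send(b"non-persistent message")
            succeeded += 1
            time.sleep(pause)
    except OSError:
        pass  # the TCP layer has finally reported the failure
    finally:
        sender.close()
    return succeeded


print(sends_until_error())
```

With a disabled interface no reset ever comes back, so send() keeps succeeding until the stack's own retransmission timers give up. On Solaris the tunable usually associated with that give-up time is tcp_ip_abort_interval; treat that as something to check rather than a confirmed diagnosis for this setup.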
KeeferG |
Posted: Thu Feb 10, 2005 2:17 am Post subject: |
|
|
 Master
Joined: 15 Oct 2004 Posts: 215 Location: Basingstoke, UK
|
fjb_saper: I agree. I inherited a system with HBINT set to 10 seconds, DISCINT set to 0, KeepAlive=YES and AdoptNewMCA=ALL for a setup with NPMSPEED(FAST) for a constant stream of non-persistent messages. I have so far managed to get almost all of the settings back to default, as they should play no part in this set-up.
Nigelg: Cheers Nigel. That confirms my thinking. One thing, though: if we are constantly sending non-persistent messages down fast channels, how does the sending MCA get notified that the receiving end has gone away? Is some form of handshake still being performed even though the messages are sent outside of syncpoint?
Cheers
_________________
Keith Guttridge
-----------------
Using MQ since 1995 |
clbrasfield |
Posted: Thu Sep 15, 2005 3:24 am Post subject: had same problem - found this resolution |
|
|
Newbie
Joined: 12 Nov 2003 Posts: 3 Location: Atlanta
|
I had this same problem with MQ 5.3.1 on OS/390. The resolution was to use the RCVTIME parameter in the chinit for the queue manager (sending queue manager in this case).
My understanding is that MQ takes this value, adds it to the heartbeat interval value for the channel, and stops the channel (or puts it into the retry cycle) if no response is received from the receiving MCA within this summed time. This check applies even while messages are flowing, so it acts as a heartbeat on a busy channel. In our case we were always transmitting messages.
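The arithmetic described above can be sketched in a few lines of Python. This models only my understanding as stated here (RCVTIME added to the heartbeat interval); the function names and the sample values of 10 and 30 seconds are hypothetical and not taken from any real configuration.

```python
def receive_timeout(hbint, rcvtime):
    """Seconds to wait for a response from the receiving MCA,
    modelled as heartbeat interval plus RCVTIME."""
    return hbint + rcvtime


def channel_should_stop(last_response_at, now, hbint, rcvtime):
    """True when no response has arrived within the summed time, i.e. the
    point at which the channel would be stopped or put into retry."""
    return (now - last_response_at) > receive_timeout(hbint, rcvtime)


# Hypothetical values: HBINT of 10 seconds, RCVTIME of 30 seconds.
print(receive_timeout(10, 30))
print(channel_should_stop(last_response_at=0, now=41, hbint=10, rcvtime=30))
```

This also shows where the delay in discovery comes from: the failure cannot be noticed any sooner than the summed time after the last response.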
By the way, we worked with IBM on this and were informed at one point that this should not be necessary if TCP/IP is at the latest maintenance level. I was told that there is a specific TCP/IP fix for this, but the RCVTIME parm can be used as a workaround. It's not the most elegant solution because of the delay in discovery, but it worked for us.
I know this response is not very timely, but perhaps it will be useful to anyone else that might come across this thread. |