
Problem with TCP keepalive

awatson72
Posted: Mon Jun 05, 2006 10:34 am    Post subject: Problem with TCP keepalive

Acolyte

Joined: 14 Apr 2004
Posts: 69
Location: Freeport, Maine

On an AIX system, I have configured a Queue Manager to use keepalive by setting:

TCP:
KeepAlive=YES

in qm.ini

On a SVRCONN channel that has recently had traffic but now has none, the status shows as "Running", substate "Receiving", and the conname is 10.11.12.13.

The SVRCONN channel has HBINT 10 and KAINT 15.

The AIX system has
tcp_keepidle = 3600

When I run tcpdump on AIX, shouldn't I see some activity every 15 seconds between the AIX server hosting the queue manager and the server hosting the application that connects over the SVRCONN channel (10.11.12.13)?

I see nothing, and it's leading me to believe that I haven't configured keepalive correctly, but I have done exactly as the documentation says.

Any insight appreciated.
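
For reference, the numbers above can be put side by side. AIX expresses tcp_keepidle in half-second units (consistent with the follow-up post, where a value of 20 is described as 10 seconds). The Python sketch below only illustrates that arithmetic; the helper name keepidle_seconds is hypothetical and not part of AIX or MQ.

```python
# Hypothetical helper: convert an AIX tcp_keepidle value to seconds.
# Assumption: the tunable is expressed in half-second units.
HALF_SECONDS_PER_SECOND = 2


def keepidle_seconds(tcp_keepidle: int) -> float:
    """Return the idle time, in seconds, before the first keepalive probe."""
    if tcp_keepidle < 0:
        raise ValueError("tcp_keepidle must be non-negative")
    return tcp_keepidle / HALF_SECONDS_PER_SECOND


if __name__ == "__main__":
    # The two values mentioned in this thread.
    for value in (3600, 20):
        seconds = keepidle_seconds(value)
        print(f"tcp_keepidle={value} -> {seconds:.0f} s ({seconds / 60:.1f} min)")
```

Under that assumption, tcp_keepidle = 3600 means 1800 seconds (30 minutes) of idle time before the first probe, so a capture that watches for a packet every 15 seconds would not be expected to show TCP keepalive traffic.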
_________________
Andrew Watson
L.L. Bean, Inc.

wschutz
Posted: Mon Jun 05, 2006 10:46 am

Jedi Knight

Joined: 02 Jun 2005
Posts: 3316
Location: IBM (retired)

Quote:
You can set the KeepAlive Interval (KAINT) attribute for channels on a per-channel basis. On platforms other than z/OS, you can access and modify the parameter, but it is only stored and forwarded; there is no functional implementation of the parameter. If you need the functionality provided by the KAINT parameter, use the Heartbeat Interval (HBINT) parameter, as described in Heartbeat interval (HBINT).

_________________
-wayne

awatson72
Posted: Mon Jun 05, 2006 12:27 pm

Acolyte

Joined: 14 Apr 2004
Posts: 69
Location: Freeport, Maine

OK, so if HBINT is providing the functionality of KAINT on non-z/OS platforms, I should see a TCP packet every 15 seconds with my configuration, but I don't.

As an experiment, I set tcp_keepidle to 20 (10 seconds) at the AIX level and restarted the app and the channel (to make sure that the new connections would inherit the new setting), but I still see no activity.

I'm going through all this trouble because a new firewall is tearing down MQ connections every hour, and wreaking havoc on applications, (MDB and otherwise). I'm trying to follow the recommendations for solving this by making the firewall aware of the fact that, yes, MQ is still in fact quite dependent on that connection, please don't tear it down.
_________________
Andrew Watson
L.L. Bean, Inc.

mvic
Posted: Mon Jun 05, 2006 12:44 pm

Jedi

Joined: 09 Mar 2004
Posts: 2080

Is MD0C of any help?

http://www.ibm.com/support/docview.wss?rs=171&uid=swg24006699

wschutz
Posted: Mon Jun 05, 2006 2:17 pm

Jedi Knight

Joined: 02 Jun 2005
Posts: 3316
Location: IBM (retired)

Except:
Quote:
On server-connection and client-connection channels, heartbeats flow only when a server MCA is waiting on an MQGET command with the WAIT option which it has issued on behalf of a client application.
So if that mq client isn't in a mqget w/ wait state, hbint's are flowing....

EDIT: I meant to type:
So if that mq client isn't in a mqget w/ wait state, hbint's areN'T flowing....
_________________
-wayne


Last edited by wschutz on Wed Jun 07, 2006 5:51 pm; edited 1 time in total

awatson72
Posted: Tue Jun 06, 2006 4:52 am

Acolyte

Joined: 14 Apr 2004
Posts: 69
Location: Freeport, Maine

I've reviewed the MD0C presentation; in fact, I attended it at last year's T&M conference. It basically says to use keepalives for SVRCONN channels, which is exactly what I'm trying to get working. According to the quote provided by wschutz, keepalive itself is either on or off for the queue manager as a whole, controlled by qm.ini, with no other configuration on distributed platforms (even though the admin interfaces lead you to believe that you can change the interval).

The channel substate is being reported as "Receiving". Should heartbeats be flowing in that case, or is that not the information needed to tell for sure?

Thanks for the guidance.
_________________
Andrew Watson
L.L. Bean, Inc.

jefflowrey
Posted: Tue Jun 06, 2006 5:06 am

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

Are you sure that your tcp_keepidle setting took effect at the AIX level? You might have to restart the network interface?

Are you sure you restarted the queue manager after setting KeepAlive=Yes in qm.ini?

The documentation is quite specific that heartbeats for SVRCONNs only flow when the client app is issuing an MQGET with WAIT. I don't know what the substate "Receiving" indicates. So if the app is not in an MQGET with WAIT most of the time, then your firewall is always going to want to close the connection. AND it might even be right there. If the app is really sitting and doing nothing with MQ for an hour, and not waiting for a message to come from someone else, it should probably be nice and close the connection.
_________________
I am *not* the model of the modern major general.

awatson72
Posted: Tue Jun 06, 2006 6:08 am

Acolyte

Joined: 14 Apr 2004
Posts: 69
Location: Freeport, Maine

I'm a little confused -

wschutz says:
Quote:
So if that mq client isn't in a mqget w/ wait state, hbint's are flowing....


and jefflowrey says:
Quote:
The documentation is quite specific that heartbeats for SVRConns only flow when the client app is issuing an MQGET with WAIT.


Sounds like this is conflicting information.


The application that is being compromised by the firewall timeout interval of one hour is a WebSphere MDB application, the listener port and QCF are configured with default parameters. The WAS resides on the other side of the firewall from the MQ server and queue manager.

Regarding the restarts of the QM and the network interface, these have been done to no avail.

Still stuck.
_________________
Andrew Watson
L.L. Bean, Inc.

PeterPotkay
Posted: Tue Jun 06, 2006 4:07 pm

Poobah

Joined: 15 May 2001
Posts: 7722

Wayne made a typo. HBs only flow for MQClients while the MQClient is in a blocking MQGET with wait.

Are there MDBs in this scenario? Are they doing the gets? If so, they will certainly issue a fresh get with wait many times per hour.


But I was wondering, forget the HBs for a sec. If Keep Alive is in fact turned on, how does the server know if the client socket is still there? There must be some sort of Keep Alive checking going on both ways on the wire. Wouldn't the firewall see THAT as activity? This is really a question for network and firewall experts; it's not an MQ thing at all, but could help your MQ scenario if properly understood. Who knows, maybe a firewall is slick enough to know that keep alive traffic is not "real" traffic, and thus ignores it.
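
For readers unfamiliar with the mechanism being asked about: TCP keepalive is a per-socket option (SO_KEEPALIVE) that asks the operating system's TCP stack to probe an idle connection. The Python sketch below shows the option being turned on for an ordinary socket. It is an illustration only and not how MQ enables it internally; the function name make_keepalive_socket and the 60-second idle value are assumptions, and whether a given firewall counts such probes as activity is not something this code can show.

```python
import socket


def make_keepalive_socket(idle_seconds: int = 60) -> socket.socket:
    """Create a TCP socket with keepalive probing enabled (illustrative only)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the kernel to send keepalive probes when the connection is idle.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # A per-socket idle time is platform specific; TCP_KEEPIDLE is not
    # defined everywhere, so it is applied only where available.
    if hasattr(socket, "TCP_KEEPIDLE"):
        try:
            sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
        except OSError:
            pass  # fall back to the system-wide idle setting
    return sock


if __name__ == "__main__":
    s = make_keepalive_socket()
    enabled = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
    print("SO_KEEPALIVE enabled:", enabled)
    s.close()
```

Where no per-socket idle time is set, the probes follow the system-wide setting, which on AIX is the tcp_keepidle tunable discussed earlier in the thread.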
_________________
Peter Potkay
Keep Calm and MQ On

jefflowrey
Posted: Tue Jun 06, 2006 4:42 pm

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

Here's my understanding - based on my poor memory from the last time we really discussed in depth how MDB listener ports in WAS work.

The listener port is constantly (every few seconds) browsing the queue for messages on one connection. In addition, there are other connections in a pool that will be used for each instance of the MDB when a message arrives (up to a certain number). So if an MDB is configured to only ever have a single instance, there will be two connections being used, three if there are two instances, and so on.

Each of these connections will be a single instance of the SVRCONN/CLNTCONN channel pair.

Now, suppose the queue is empty most of the time (as it should be). The browse thread is going to issue a GET with WAIT, get an *immediate* 2033, and go to sleep again until the next time it needs to check. Likewise all the other threads in the pool are going to remain idle - not in a GET with WAIT at all either.

So no heartbeats are going to flow, and the channel will look "inactive" to the firewall.

You should discuss options with your firewall administrators. Some of it may depend on the capabilities of the firewall in question. I *assume* that all modern firewalls would let you specify a timeout value for connections at the IP address level, and not require only a global value. But I'm not a firewall expert - so I don't know.

If they are able to make a specific change, but are unwilling to do it at the IP address level, and want tighter control - you can use the LOCLADDR parameter on the client side of the channel (the CLNTCONN) to specify what local port number the connection will use.
_________________
I am *not* the model of the modern major general.

PeterPotkay
Posted: Tue Jun 06, 2006 4:46 pm

Poobah

Joined: 15 May 2001
Posts: 7722

jefflowrey wrote:
Now, suppose the queue is empty most of the time (as it should be). The browse thread is going to issue a GET with WAIT, get an *immediate* 2033, and go to sleep again until the next time it needs to check. Likewise all the other threads in the pool are going to remain idle - not in a GET with WAIT at all either.

So no heartbeats are going to flow, and the channel will look "inactive" to the firewall.

I disagree. An MQGET over a SVRCONN channel that returns a 2033 will most definitely generate traffic on the wire. The MQGET request up to the MQ server will be counted as one "message" in the Messages count on the SVRCONN channel, and the result, 2033 or otherwise, will be counted as a second "message" on the channel as it streams back to the client, in this case the MDB.
_________________
Peter Potkay
Keep Calm and MQ On

jefflowrey
Posted: Tue Jun 06, 2006 4:57 pm

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

Yes, traffic on the wire - certainly. The GET statement will flow over to the MCA, and the RC2033 will flow back.

I guess I mean that this may not be big enough or take enough time for the firewall to notice according to its rules of "inactive".

I also reserve judgement on whether or not the firewall might be getting confused between different instances of the SVRCONN either when deciding if they are inactive or in deciding what to shut down.
_________________
I am *not* the model of the modern major general.

awatson72
Posted: Wed Jun 07, 2006 4:39 am

Acolyte

Joined: 14 Apr 2004
Posts: 69
Location: Freeport, Maine

What I’ve noticed is that when the MDB listener starts in WAS, there are two SVRCONN channels started under the definition that the QCF under the MDB is pointed to. When the MDB application is doing no work, which is usually the case, especially in my test environment, one of the channels is in MQGET, and the Messages count increases as the MDB polls. (Reference: http://www.mqseries.net/phpBB2/viewtopic.php?t=29844) The MDB polls every 5000 ms by default. The other channel is in state Receiving.

After an hour of no messages arriving on the MDB queue, we can see the firewall tear down a connection, perhaps the “companion” connection, the one that is NOT doing the MQGET with WAIT every 5000 ms, because the polling one sounds like it should appear active to the firewall. However, MQ and the MDB application are still relying on the “companion” connection, so when a message does arrive beyond the one-hour timeout, the MDB fails. If my assumptions are correct (and I’ll try to do some more verification), the question of why a keepalive/heartbeat isn’t happening for the “companion” channel is still outstanding. IMHO, this channel should be kept alive by heartbeat or keepalive, but even with a sniffer hooked to the server, I still see no traffic that would suggest these are flowing.
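
The hypothesis above can be written down as a small model. The Python sketch below treats the firewall as dropping any connection that sees no packet for the idle timeout, and compares a connection that polls every 5 seconds with a "companion" connection that sends nothing after it opens. The function first_idle_drop and the simple timeout rule are assumptions made for illustration; a real firewall may apply different rules.

```python
from typing import Iterable, Optional


def first_idle_drop(activity_times: Iterable[float], idle_timeout: float,
                    horizon: float) -> Optional[float]:
    """Return the time at which an idle-timeout firewall drops a connection.

    Assumes the connection opens at t=0 and is dropped as soon as
    idle_timeout seconds pass without a packet. Returns None if the
    connection survives until horizon.
    """
    last_seen = 0.0
    for t in sorted(activity_times):
        if t > horizon:
            break
        if t - last_seen >= idle_timeout:
            return last_seen + idle_timeout
        last_seen = t
    if horizon - last_seen >= idle_timeout:
        return last_seen + idle_timeout
    return None


if __name__ == "__main__":
    timeout = 3600.0      # one-hour firewall idle timeout from the thread
    horizon = 2 * 3600.0  # observe for two hours
    # Polling connection: one MQGET round trip every 5000 ms.
    polling = [5.0 * i for i in range(1, int(horizon / 5) + 1)]
    # Companion connection: no traffic at all after it is opened.
    companion: list = []
    print("polling dropped at:", first_idle_drop(polling, timeout, horizon))
    print("companion dropped at:", first_idle_drop(companion, timeout, horizon))
```

In this model the polling connection is never dropped, while the companion connection is dropped at the one-hour mark; any periodic packet on the companion connection at an interval shorter than the timeout would keep it open.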

Thoughts?
_________________
Andrew Watson
L.L. Bean, Inc.

jefflowrey
Posted: Wed Jun 07, 2006 4:45 am

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

Yes. The MDB polls on one connection, and reserves a pool of one or more other connections for passing to an instance of the MDB when a message arrives...

I think if you configure the MDB to retry at least once, then this connection will get reestablished after the firewall kills it.
_________________
I am *not* the model of the modern major general.

awatson72
Posted: Wed Jun 07, 2006 5:06 am

Acolyte

Joined: 14 Apr 2004
Posts: 69
Location: Freeport, Maine

I agree that it would probably work to set a retry > 0 for the MDB, but I'm being told by the developer that doing so could cause significant problems with data in some cases, and besides, it's a little messy to have failures and retries going on in the app all day. My best solution is to make the firewall aware that the connection should not be killed.
_________________
Andrew Watson
L.L. Bean, Inc.