Author |
Message
|
G_H |
Posted: Mon Mar 14, 2011 6:20 am Post subject: AMQ9513; possibly due to network failure? |
|
|
Newbie
Joined: 14 Mar 2011 Posts: 3
|
Greetings MQ experts,
Lately a customer has been having issues with their infrastructure. About once every week, usually after a weekend, MQ connections will fail with the message "AMQ9513: Maximum number of channels reached". I'm trying to determine the exact cause for this. First of all, I'd like to say I've investigated other threads in this forum regarding this error.
The WebSphere MQ server is version 6.0.2.3 running on Windows XP Professional SP2. We're using a single server-connection channel (not the system default). The following make use of the queue manager:
- A systems integration software package.
- A web portal for monitoring the above package's message processing.
- A gateway appliance for exchanging messages over a VAN.
All three are developed by the company I work for so I have access to the source code. They're all making use of the MQ Java API (regular, not JMS). I've checked the integration software's connectors and a disconnect is properly called in every appropriate situation, also on exceptions. The web portal doesn't keep connections open but always disconnects as soon as its work is done (since it's irregular and not very frequent), again also in exception handling. I haven't checked the gateway appliance's code yet but it requires very few connections compared to the integration software.
Regarding the queue manager setup, MaxChannels and MaxActiveChannels have both been set to the same value (300, I believe) which is way higher than the maximum number of connections possibly needed by all the above apps combined. I've also set the TCP keepalive to catch out broken connections.
Now, I have been noticing that there are regularly network issues in our customer's infrastructure. So here's what I was wondering... Is it possible that multiple short connectivity interruptions are leading to the queue manager running out of available channels? One theory I can think of is that a connection is broken, then shortly afterwards restored but an application has already created a new connection (after trying to disconnect the previous one). Despite the TCP keepalive, it might take a short while for the queue manager to detect a connection that no longer has a client end. Suppose this would happen multiple times per minute, the apps might burn through the maximum number of channels before a sufficient amount of stale connections could be closed. Does this sound at all plausible?
I've tried testing by manually killing TCP connections on both the server and client side using the TCPView utility. In both cases the app creates new connections to recover from it. After closing, DISPLAY CHSTATUS(...) for the channel shows that all connections are gone and then new ones appear as the app recovers. Seems to behave rather well.
I'm hardly an MQ expert so apologies if there's dumb parts among my questions. Thanks in advance. |
|
|
 |
PeterPotkay |
Posted: Mon Mar 14, 2011 1:28 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
Quote: |
The WebSphere MQ server is version 6.0.2.3...
|
Old. Time to upgrade.
Quote: |
....running on Windows XP Professional SP2.
|
Really?
Quote: |
I've also set the TCP keepalive to catch out broken connections.
|
First you have to tell the QM to use the O/S values for TCP Keep Alive. Then you have to make sure the QM is restarted, that the O/S does use Keep Alive, and that the O/S value for this parameter has been changed to something substantially less than the default of 2 hours.
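To put rough numbers on why the keepalive interval matters, here is a minimal Java sketch of the arithmetic. It is a back-of-the-envelope model, not a description of how the queue manager actually reaps channels: it assumes every interruption leaks a fixed number of instances and that each stale instance lingers for a fixed detection time. The limit of 300 and the 2-hour default come from this thread; the normal load of 50, the leak rate of 3 per minute and the 5-minute tuned value are made-up illustration values, and the class and method names are hypothetical.

```java
// Rough model only: stale instances held = leak rate x time each one lingers.
class StaleChannelEstimate {

    /** Stale instances held at steady state for a given leak rate and detection time. */
    static long steadyStateStale(double leakedPerMinute, double detectionMinutes) {
        return Math.round(leakedPerMinute * detectionMinutes);
    }

    /** Minutes until the limit is reached if no stale instance is reaped in the meantime. */
    static double minutesToExhaust(int maxChannels, int normalLoad, double leakedPerMinute) {
        return (maxChannels - normalLoad) / leakedPerMinute;
    }

    public static void main(String[] args) {
        int maxChannels = 300;        // MaxChannels value mentioned in the thread
        int normalLoad = 50;          // assumed number of legitimate connections
        double leakedPerMinute = 3.0; // assumed instances orphaned per minute during an outage
        int headroom = maxChannels - normalLoad;

        // Compare the 2-hour O/S default with an assumed tuned value of 5 minutes.
        for (double detectionMinutes : new double[] {120.0, 5.0}) {
            long stale = steadyStateStale(leakedPerMinute, detectionMinutes);
            System.out.printf("detection %.0f min: about %d stale instances, headroom %d%n",
                    detectionMinutes, stale, headroom);
        }
        System.out.printf("limit reached after about %.1f minutes if nothing is reaped%n",
                minutesToExhaust(maxChannels, normalLoad, leakedPerMinute));
    }
}
```

Under these assumed numbers the 2-hour default lets more stale instances pile up than there is headroom, while a keepalive of a few minutes keeps the backlog small. Real detection time also depends on keepalive probe retries, so treat the result as an order-of-magnitude guide.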
Quote: |
Is it possible that multiple short connectivity interruptions are leading to the queue manager running out of available channels? One theory I can think of is that a connection is broken, then shortly afterwards restored but an application has already created a new connection (after trying to disconnect the previous one). Despite the TCP keepalive, it might take a short while for the queue manager to detect a connection that no longer has a client end. Suppose this would happen multiple times per minute, the apps might burn through the maximum number of channels before a sufficient amount of stale connections could be closed. Does this sound at all plausible? |
Very plausible.
_________________
Peter Potkay
Keep Calm and MQ On |
|
|
 |
gbaddeley |
Posted: Mon Mar 14, 2011 2:26 pm Post subject: |
|
|
 Jedi Knight
Joined: 25 Mar 2003 Posts: 2538 Location: Melbourne, Australia
|
Quote: |
We're using a single server-connection channel (not the system default) |
May I suggest that you set up 3 server-connection channels, 1 for each app. It will give you a much better view and understanding of what is going on.
Have you studied the output of DISPLAY CHSTATUS(*) ALL?
It should show the IP addresses and program names of where most of the bogus connections are coming from and the last date/time they were used.
_________________
Glenn |
|
|
 |
G_H |
Posted: Tue Mar 15, 2011 12:38 am Post subject: |
|
|
Newbie
Joined: 14 Mar 2011 Posts: 3
|
@PeterPotkay
Thanks very much for the answer. I'll keep looking in the direction of network issues then. Considering nothing has changed in the code or setup and this has only started happening recently, I figured it had to be something besides MQ or the apps.
Yeah, it's an old version. We're strongly considering installing v7. Since I use that for development and testing it would be preferable anyway.
And regarding the Win XP SP2, well, that's the customer's choice.
@gbaddeley
It would be easier for the overview to have separate channels per app. So far I haven't done that, out of convenience and because we didn't really see a need.
Since there's only one active channel, the output of DISPLAY CHSTATUS(*) ALL is identical to what I've got. I thought the ALL keyword was implicit if you don't define it. Regardless, I've seen the IP addresses, know which one does what and the number of connections each makes. Seemed to correspond to my expectations.
I didn't know you could see the time of last usage. I'll have to try it with ALL and see if there's more info. Cheers! |
|
|
 |
PeterPotkay |
Posted: Tue Mar 15, 2011 6:36 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
By giving each component its own channel name, and limiting each channel to x instances, you can easily see who the offender is, and more importantly when the offender sucks up all their allowable instances, they only prevent themselves from taking more connections. Your other MQ Client apps, your QM to QM channels and your administrative MQ client channels for things like MQ Explorer or MO71 are not impacted.
_________________
Peter Potkay
Keep Calm and MQ On |
|
|
 |
exerk |
Posted: Tue Mar 15, 2011 7:09 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
PeterPotkay wrote: |
By giving each component its own channel name, and limiting each channel to x instances... |
Would G_H not have to migrate to V7.0 for that, or use BlockIP2 if not?
_________________
It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys. |
|
|
 |
PeterPotkay |
Posted: Tue Mar 15, 2011 7:32 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
The ability to limit SVRCONN channel instances is in the MQ 7 base product, yes.
In MQ 6 and earlier, a channel exit does the trick. We use Capitalware's MQAUSX for that since we already own it for locking down channels. I don't have experience with other exits that might be able to do this same thing.
_________________
Peter Potkay
Keep Calm and MQ On |
|
|
 |
G_H |
Posted: Tue Mar 15, 2011 12:16 pm Post subject: |
|
|
Newbie
Joined: 14 Mar 2011 Posts: 3
|
Yeah, I saw in another thread regarding the subject that the per-channel limit was new for 7. I might as well just migrate to that version, since I keep looking for options in MQ Explorer for 6 that aren't there out of habit.
Probably best to first start by assigning separate channels to the apps, so they don't blow each other up when one loses connection. They're running from different hosts and it looks like only some IPs suffer the issue.
Thanks for the advice. On a related note, would there be some convenient method of running a DISPLAY CHSTATUS script and dumping the output somewhere when the number of connections reaches a certain threshold? Kinda like queue depth events but for channels? |
|
|
 |
fjb_saper |
Posted: Tue Mar 15, 2011 1:37 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
G_H wrote: |
Yeah, I saw in another thread regarding the subject that the per-channel limit was new for 7. I might as well just migrate to that version, since I keep looking for options in MQ Explorer for 6 that aren't there out of habit.
Probably best to first start by assigning separate channels to the apps, so they don't blow each other up when one loses connection. They're running from different hosts and it looks like only some IPs suffer the issue.
Thanks for the advice. On a related note, would there be some convenient method of running a DISPLAY CHSTATUS script and dumping the output somewhere when the number of connections reaches a certain threshold? Kinda like queue depth events but for channels? |
Run it at specific intervals. By "grepping" and using wc -l you can count the number of instances of the channel. If that number is over your threshold, send an email... This is all scriptable and runnable through crontab... or Java with PCF messages etc...
_________________
MQ & Broker admin |
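Since the server in this thread runs on Windows XP, where grep and wc may not be at hand, the counting step could also be done in plain Java. The following is a minimal sketch that assumes the text output of DISPLAY CHSTATUS(*) has already been captured from runmqsc into a string, and that each instance shows up with a CHANNEL(name) attribute. The class name, the channel names and the sample text are hypothetical, and the sample only approximates what runmqsc prints. Scheduling the check and sending the email are left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Counts channel instances in captured runmqsc output (format assumed, see lead-in).
class ChannelInstanceCounter {

    // Matches the CHANNEL(name) attribute; CHSTATUS(...) in an echoed command does not match.
    private static final Pattern CHANNEL = Pattern.compile("\\bCHANNEL\\(([^)]+)\\)");

    /** Returns the number of status entries seen per channel name. */
    static Map<String, Integer> countByChannel(String runmqscOutput) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = CHANNEL.matcher(runmqscOutput);
        while (m.find()) {
            counts.merge(m.group(1).trim(), 1, Integer::sum);
        }
        return counts;
    }

    /** True when the named channel has more instances than the threshold allows. */
    static boolean overThreshold(Map<String, Integer> counts, String channel, int threshold) {
        return counts.getOrDefault(channel, 0) > threshold;
    }

    public static void main(String[] args) {
        // Illustrative sample only; real output would come from runmqsc.
        String sample =
              "   CHANNEL(APP.SVRCONN)      CHLTYPE(SVRCONN)\n"
            + "   CONNAME(10.0.0.5)         STATUS(RUNNING)\n"
            + "   CHANNEL(APP.SVRCONN)      CHLTYPE(SVRCONN)\n"
            + "   CONNAME(10.0.0.6)         STATUS(RUNNING)\n"
            + "   CHANNEL(PORTAL.SVRCONN)   CHLTYPE(SVRCONN)\n"
            + "   CONNAME(10.0.0.7)         STATUS(RUNNING)\n";

        Map<String, Integer> counts = countByChannel(sample);
        int threshold = 1; // assumed alert level for the demo
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            String flag = overThreshold(counts, e.getKey(), threshold) ? " OVER THRESHOLD" : "";
            System.out.println(e.getKey() + ": " + e.getValue() + flag);
        }
    }
}
```

When the threshold is crossed, the same program could write the captured output to a timestamped file, which would give the snapshot of DISPLAY CHSTATUS asked about above. The PCF route mentioned in the reply avoids text parsing altogether but needs the MQ client libraries.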
|