MQSeries.net Forum Index » General IBM MQ Support » AMQ9513; possibly due to network failure?

G_H
Posted: Mon Mar 14, 2011 6:20 am    Post subject: AMQ9513; possibly due to network failure?

Newbie

Joined: 14 Mar 2011
Posts: 3

Greetings MQ experts,

Lately a customer has been having issues with their infrastructure. About once every week, usually after a weekend, MQ connections will fail with the message "AMQ9513: Maximum number of channels reached". I'm trying to determine the exact cause for this. First of all, I'd like to say I've investigated other threads in this forum regarding this error.

The WebSphere MQ server is version 6.0.2.3 running on Windows XP Professional SP2. We're using a single server-connection channel (not the system default). The following components make use of the queue manager:
  • A systems integration software package.
  • A web portal for monitoring the above package's message processing.
  • A gateway appliance for exchanging messages over a VAN.


All three are developed by the company I work for so I have access to the source code. They're all making use of the MQ Java API (regular, not JMS). I've checked the integration software's connectors and a disconnect is properly called in every appropriate situation, also on exceptions. The web portal doesn't keep connections open but always disconnects as soon as its work is done (since it's irregular and not very frequent), again also in exception handling. I haven't checked the gateway appliance's code yet but it requires very few connections compared to the integration software.

Regarding the queue manager setup, MaxChannels and MaxActiveChannels have both been set to the same value (300, I believe) which is way higher than the maximum number of connections possibly needed by all the above apps combined. I've also set the TCP keepalive to catch out broken connections.

Now, I have been noticing that there are regularly network issues in our customer's infrastructure. So here's what I was wondering... Is it possible that multiple short connectivity interruptions are leading to the queue manager running out of available channels? One theory I can think of is that a connection is broken, then shortly afterwards restored, but an application has already created a new connection (after trying to disconnect the previous one). Despite the TCP keepalive, it might take a short while for the queue manager to detect a connection that no longer has a client end. If this happened multiple times per minute, the apps might burn through the maximum number of channels before a sufficient number of stale connections could be closed. Does this sound at all plausible?
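
The theory above can be checked with a back-of-the-envelope model. The following is only a sketch: the function name and every number except the limit of 300 are hypothetical, and it assumes the apps reconnect immediately while each orphaned instance stays allocated until the server detects the dead client.

```python
def peak_channels(healthy, orphaned_per_drop, drop_interval_s,
                  detect_delay_s, duration_s):
    """Return the peak number of channel instances held by the queue manager.

    healthy            -- instances the apps hold when everything works
    orphaned_per_drop  -- instances left half-open by each network drop
    drop_interval_s    -- seconds between network drops
    detect_delay_s     -- seconds until the server notices a dead client
    duration_s         -- length of the simulated period in seconds
    """
    expiry_times = []  # when each stale instance is finally released
    peak = healthy
    for now in range(0, duration_s, drop_interval_s):
        # Release stale instances whose detection delay has elapsed.
        expiry_times = [t for t in expiry_times if t > now]
        # A drop orphans some instances; the apps reconnect straight away,
        # so the healthy count stays the same and the stale ones pile up.
        expiry_times.extend([now + detect_delay_s] * orphaned_per_drop)
        peak = max(peak, healthy + len(expiry_times))
    return peak


MAX_CHANNELS = 300  # value quoted in the post

# Hypothetical load: 40 healthy instances, 10 orphaned per drop,
# one drop every 4 minutes, simulated over 48 hours.
slow = peak_channels(40, 10, 240, 7200, 48 * 3600)  # 2-hour detection
fast = peak_channels(40, 10, 240, 120, 48 * 3600)   # 2-minute detection
print(slow, slow >= MAX_CHANNELS)
print(fast, fast >= MAX_CHANNELS)
```

With these made-up numbers the two-hour detection case peaks at 340 instances, which is over the limit, while the two-minute case peaks at 50. The point is only that the detection delay, not the healthy connection count, dominates the outcome.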

I've tried testing by manually killing TCP connections on both the server and client side using the TCPView utility. In both cases the app creates new connections to recover from it. After closing, DISPLAY CHSTATUS(...) for the channel shows that all connections are gone and then new ones appear as the app recovers. Seems to behave rather well.

I'm hardly an MQ expert, so apologies if there are dumb parts among my questions. Thanks in advance.
PeterPotkay
Posted: Mon Mar 14, 2011 1:28 pm

Poobah

Joined: 15 May 2001
Posts: 7722

Quote:

The WebSphere MQ server is version 6.0.2.3...

Old. Time to upgrade.

Quote:

....running on Windows XP Professional SP2.

Really?

Quote:

I've also set the TCP keepalive to catch out broken connections.

First you have to tell the QM to use the O/S values for TCP Keep Alive. Then you have to make sure the QM is restarted, that the O/S actually uses Keep Alive, and that the O/S's value has been changed to something substantially less than this parameter's default of 2 hours.
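
To see why the 2-hour default matters, here is a rough sketch of how long a dead client can go unnoticed. On platforms that use qm.ini, the queue manager side is typically the KeepAlive=YES attribute in the TCP stanza. The defaults in the code (KeepAliveTime of 7,200,000 ms, KeepAliveInterval of 1,000 ms, 5 probes) are assumptions based on the documented Windows XP-era TCP/IP settings; check the actual registry values on the server. The function name is hypothetical.

```python
def keepalive_detection_seconds(keepalive_time_ms=7_200_000,
                                keepalive_interval_ms=1_000,
                                max_probes=5):
    """Approximate worst-case time to detect a dead peer on an idle socket.

    The connection must first sit idle for keepalive_time_ms, then every
    unanswered probe adds keepalive_interval_ms before the stack gives up.
    """
    total_ms = keepalive_time_ms + keepalive_interval_ms * max_probes
    return total_ms / 1000.0


print(keepalive_detection_seconds())                         # O/S defaults
print(keepalive_detection_seconds(keepalive_time_ms=60_000)) # tuned to 1 minute
```

Under these assumptions an untuned server holds each orphaned channel instance for a little over two hours, while a one-minute KeepAliveTime brings that down to roughly 65 seconds.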

Quote:

Is it possible that multiple short connectivity interruptions are leading to the queue manager running out of available channels? One theory I can think of is that a connection is broken, then shortly afterwards restored, but an application has already created a new connection (after trying to disconnect the previous one). Despite the TCP keepalive, it might take a short while for the queue manager to detect a connection that no longer has a client end. If this happened multiple times per minute, the apps might burn through the maximum number of channels before a sufficient number of stale connections could be closed. Does this sound at all plausible?

Very plausible.
_________________
Peter Potkay
Keep Calm and MQ On
gbaddeley
Posted: Mon Mar 14, 2011 2:26 pm

Jedi Knight

Joined: 25 Mar 2003
Posts: 2538
Location: Melbourne, Australia

Quote:
We're using a single server-connection channel (not the system default)


May I suggest that you set up 3 server-connection channels, 1 for each app. It will give you a much better view and understanding of what is going on.

Have you studied the output of DISPLAY CHSTATUS(*) ALL ?

It should show the IP addresses and program names of where most of the bogus connections are coming from and the last date/time they were used.
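
As a rough illustration of reading that output, the sketch below groups channel instances by CONNAME and reports the oldest last-message timestamp per host. The SAMPLE text is made up for the example (channel name, addresses, and dates are hypothetical) and only imitates the general ATTRIBUTE(value) layout of runmqsc output; the attributes actually shown depend on the MQ version.

```python
import re
from collections import defaultdict

# Made-up sample imitating the ATTRIBUTE(value) layout of runmqsc output.
SAMPLE = """\
AMQ8417: Display Channel Status details.
   CHANNEL(APP.SVRCONN)                    CHLTYPE(SVRCONN)
   CONNAME(10.0.0.5)                       CURRENT
   LSTMSGDA(2011-03-14)                    LSTMSGTI(05.58.12)
   STATUS(RUNNING)
AMQ8417: Display Channel Status details.
   CHANNEL(APP.SVRCONN)                    CHLTYPE(SVRCONN)
   CONNAME(10.0.0.5)                       CURRENT
   LSTMSGDA(2011-03-11)                    LSTMSGTI(17.02.40)
   STATUS(RUNNING)
AMQ8417: Display Channel Status details.
   CHANNEL(APP.SVRCONN)                    CHLTYPE(SVRCONN)
   CONNAME(10.0.0.9)                       CURRENT
   LSTMSGDA(2011-03-14)                    LSTMSGTI(06.01.30)
   STATUS(RUNNING)
"""

ATTR = re.compile(r"(\w+)\(([^)]*)\)")


def parse_instances(text):
    """Split the output into one attribute dict per channel instance."""
    blocks = text.split("AMQ8417:")[1:]  # text before the first header is ignored
    return [{k: v.strip() for k, v in ATTR.findall(b)} for b in blocks]


def summarise_by_host(instances):
    """Count instances per CONNAME and keep the oldest last-message stamp."""
    summary = defaultdict(lambda: {"count": 0, "oldest_last_msg": None})
    for inst in instances:
        entry = summary[inst.get("CONNAME", "?")]
        entry["count"] += 1
        # yyyy-mm-dd and hh.mm.ss strings sort correctly as plain text;
        # an empty stamp (no message yet) sorts first and so shows as oldest.
        stamp = (inst.get("LSTMSGDA", ""), inst.get("LSTMSGTI", ""))
        if entry["oldest_last_msg"] is None or stamp < entry["oldest_last_msg"]:
            entry["oldest_last_msg"] = stamp
    return dict(summary)


for host, info in sorted(summarise_by_host(parse_instances(SAMPLE)).items()):
    print(host, info["count"], info["oldest_last_msg"])
```

A host that holds many instances whose last-message time is days old is a likely source of stale connections.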
_________________
Glenn
G_H
Posted: Tue Mar 15, 2011 12:38 am

Newbie

Joined: 14 Mar 2011
Posts: 3

@PeterPotkay
Thanks very much for the answer. I'll keep looking in the direction of network issues then. Considering nothing changed in the code or setup and this has only started happening recently, I figured it had to be something besides MQ or the apps.
Yeah, it's an old version. We're strongly considering installing v7. Since I use that for development and testing it would be preferable anyway.
And regarding the Win XP SP2, well, that's the customer's choice.

@gbaddeley
It would be easier for the overview to have separate channels per app. So far I haven't out of convenience and because we didn't really see a need.
Since there's only one active channel, the output of DISPLAY CHSTATUS(*) ALL is identical to what I've got. I thought the ALL keyword was implicit if you don't define it. Regardless, I've seen the IP addresses, know which one does what and the number of connections each makes. Seemed to correspond to my expectations.
I didn't know you could see the time of last usage. I'll have to try it with ALL and see if there's more info. Cheers!
PeterPotkay
Posted: Tue Mar 15, 2011 6:36 am

Poobah

Joined: 15 May 2001
Posts: 7722

By giving each component its own channel name, and limiting each channel to x instances, you can easily see who the offender is, and more importantly when the offender sucks up all their allowable instances, they only prevent themselves from taking more connections. Your other MQ Client apps, your QM to QM channels and your administrative MQ client channels for things like MQ Explorer or MO71 are not impacted.
_________________
Peter Potkay
Keep Calm and MQ On
exerk
Posted: Tue Mar 15, 2011 7:09 am

Jedi Council

Joined: 02 Nov 2006
Posts: 6339

PeterPotkay wrote:
By giving each component its own channel name, and limiting each channel to x instances...


Would G_H not have to migrate to V7.0 for that, or use BlockIP2 if not?
_________________
It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys.
PeterPotkay
Posted: Tue Mar 15, 2011 7:32 am

Poobah

Joined: 15 May 2001
Posts: 7722

The ability to limit SVRCONN channel instances is in the MQ 7 base product, yes.
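
A minimal sketch of what one channel per component with an instance cap could look like, written as a small generator of MQSC definitions. The channel names and limits are hypothetical, and it assumes a queue manager level that supports the MAXINST attribute on SVRCONN channels; verify that against the documentation for your exact version before relying on it.

```python
def svrconn_definitions(limits):
    """Build MQSC DEFINE commands, one SVRCONN channel per component.

    limits -- mapping of channel name to maximum simultaneous instances
    """
    lines = []
    for name, max_instances in sorted(limits.items()):
        lines.append(
            f"DEFINE CHANNEL({name}) CHLTYPE(SVRCONN) TRPTYPE(TCP) "
            f"MAXINST({max_instances}) REPLACE"
        )
    return "\n".join(lines)


# Hypothetical channel names and caps for the three components in this thread.
print(svrconn_definitions({
    "INTEGRATION.SVRCONN": 100,
    "PORTAL.SVRCONN": 20,
    "GATEWAY.SVRCONN": 20,
}))
```

The generated text can be reviewed and then fed to runmqsc. With caps like these, a component that leaks connections only exhausts its own allowance.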

In MQ 6 and earlier, a channel exit does the trick. We use Capitalware's MQAUSX for that since we already own it for locking down channels. I don't have experience with other exits that might be able to do this same thing.
_________________
Peter Potkay
Keep Calm and MQ On
G_H
Posted: Tue Mar 15, 2011 12:16 pm

Newbie

Joined: 14 Mar 2011
Posts: 3

Yeah, I saw in another thread regarding the subject that the per-channel limit was new for 7. I might as well just migrate to that version, since I keep looking for options in MQ Explorer for 6 that aren't there out of habit.

Probably best to first start by assigning separate channels to the apps, so they don't blow each other up when one loses connection. They're running from different hosts and it looks like only some IPs suffer the issue.

Thanks for the advice. On a related note, would there be some convenient method of running a DISPLAY CHSTATUS script and dumping the output somewhere when the number of connections reaches a certain threshold? Kinda like queue depth events but for channels?
fjb_saper
Posted: Tue Mar 15, 2011 1:37 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

G_H wrote:
Yeah, I saw in another thread regarding the subject that the per-channel limit was new for 7. I might as well just migrate to that version, since I keep looking for options in MQ Explorer for 6 that aren't there out of habit.

Probably best to first start by assigning separate channels to the apps, so they don't blow each other up when one loses connection. They're running from different hosts and it looks like only some IPs suffer the issue.

Thanks for the advice. On a related note, would there be some convenient method of running a DISPLAY CHSTATUS script and dumping the output somewhere when the number of connections reaches a certain threshold? Kinda like queue depth events but for channels?


Run it at specific intervals. By "grepping" and using wc -l you can count the number of instances of the channel. If that number is over your threshold, send an email... This is all scriptable and runnable through crontab... or Java with PCF messages etc...
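
The same idea as a Python sketch, for a Windows server without grep and wc. The function names, queue manager name, channel name, and threshold are all hypothetical. fetch_chstatus assumes runmqsc is on the PATH and reads MQSC commands from standard input; it is defined but not called here, so the counting logic can be tried against saved output first.

```python
import re
import subprocess


def fetch_chstatus(qmgr, channel):
    """Run DISPLAY CHSTATUS through runmqsc and return the raw output.

    Requires an MQ installation; not called in this sketch.
    """
    result = subprocess.run(
        ["runmqsc", qmgr],
        input=f"DISPLAY CHSTATUS({channel}) ALL\n",
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


def count_instances(output, channel):
    """Count status entries for the channel (the grep | wc -l step)."""
    pattern = re.compile(r"\bCHANNEL\(" + re.escape(channel) + r"\)")
    return len(pattern.findall(output))


def over_threshold(output, channel, threshold):
    """True when the instance count has reached the alert threshold."""
    return count_instances(output, channel) >= threshold


# Made-up saved output with two instances of a hypothetical channel.
saved = (
    "     1 : DISPLAY CHSTATUS(APP.SVRCONN) ALL\n"
    "AMQ8417: Display Channel Status details.\n"
    "   CHANNEL(APP.SVRCONN)   CHLTYPE(SVRCONN)\n"
    "AMQ8417: Display Channel Status details.\n"
    "   CHANNEL(APP.SVRCONN)   CHLTYPE(SVRCONN)\n"
)
print(count_instances(saved, "APP.SVRCONN"), over_threshold(saved, "APP.SVRCONN", 2))
```

Scheduled every few minutes (cron, or Task Scheduler on Windows), a wrapper could call fetch_chstatus, write the raw output to a timestamped file whenever over_threshold returns True, and send the email from there.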
_________________
MQ & Broker admin