Qmgr in trouble
garasan
Posted: Tue May 12, 2009 12:52 am    Post subject: Qmgr in trouble

Apprentice

Joined: 22 Jul 2008
Posts: 42
Location: Antwerp, Belgium

Hi,

Yesterday we ran into some problems with one of our qmgrs.
First some info about this qmgr:
Environment: Production
Version: MQ 6.0.2-5
OS: SLES 10

This past weekend we made some changes to two qmgrs.
We upgraded from 6.0.2-4 to 6.0.2-5 and changed LogBufferPages to 4096.
Since then, on this qmgr only, we see an amqzlaa0 process using a lot of CPU. The PID of this process relates to the group PID of amqzxma0. (As far as we can see, it is not an mqconnect from an application server or any other external process.)
The PID is not found in the connection list of the qmgr.

After midnight it caused two PRD offloads to lose their connection to the qmgr.
Has anybody seen this before?
_________________
Regards
Vitor
Posted: Tue May 12, 2009 12:57 am    Post subject: Re: Qmgr in trouble

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

garasan wrote:
we changed LogBufferPages to 4096.


How was this performed? Did you change qm.ini (if it's called that on Solaris) or saveqmgr & recreate queue manager?

Do I read your post correctly in that you performed this procedure on 2 queue managers, but only 1 is displaying this unexpected behaviour?
_________________
Honesty is the best policy.
Insanity is the best defence.
vol
Posted: Tue May 12, 2009 1:13 am

Acolyte

Joined: 01 Feb 2009
Posts: 69

The amqzlaa0 process only uses CPU in response to API requests from apps. Determine which apps are connected to the agent process, and check what they are doing.
garasan
Posted: Tue May 12, 2009 1:21 am

Apprentice

Joined: 22 Jul 2008
Posts: 42
Location: Antwerp, Belgium

Hi Vitor,

This was performed by adjusting the qm.ini.
The qmgr was stopped during the change and started after the change.
(It is running on a Linux box (SLES 10), not Solaris.)

Hi Vol,
That's the problem. I'm not able to determine which app is connected to this particular agent process, as it does not appear in the connection list of this qmgr. (I performed DIS CONN(*) ALL to check all connections.)
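
The PID check described above can be scripted rather than done by eye. The following is a minimal Python sketch, assuming the DIS CONN(*) ALL output has been captured as text and that each record starts with an AMQ8276 header followed by NAME(value) attributes. The sample text, connection ids, PIDs and application names are made up for illustration, and real output carries more attributes.

```python
import re
from collections import defaultdict

# Hypothetical, shortened sample of captured `DIS CONN(*) ALL` output.
# All ids, PIDs and names below are invented for illustration.
SAMPLE = """\
AMQ8276: Display Connection details.
   CONN(4A0814B220002E01)
   TYPE(CONN)
   PID(23114)                              TID(7)
   APPLTAG(offload_app)                    CHANNEL(APP.SVRCONN)
AMQ8276: Display Connection details.
   CONN(4A0814B220003A02)
   TYPE(CONN)
   PID(23999)                              TID(3)
   APPLTAG(other_app)                      CHANNEL( )
"""

# Matches NAME(value) pairs such as PID(23114) or CHANNEL( ).
ATTR = re.compile(r"(\w+)\(([^)]*)\)")


def connections_by_pid(runmqsc_output):
    """Group connection records by their PID attribute."""
    by_pid = defaultdict(list)
    # Everything before the first header is ignored; each remaining
    # chunk is one connection record.
    for record in runmqsc_output.split("AMQ8276:")[1:]:
        attrs = {name: value.strip() for name, value in ATTR.findall(record)}
        by_pid[attrs.get("PID", "")].append(attrs)
    return dict(by_pid)


if __name__ == "__main__":
    conns = connections_by_pid(SAMPLE)
    suspect_pid = "31337"  # hypothetical PID of the busy process
    print(suspect_pid in conns, sorted(conns))
```

If the suspect PID is absent from the resulting dictionary, that only confirms what the manual check already showed. It does not by itself say which application is driving the agent.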
_________________
Regards
Vitor
Posted: Tue May 12, 2009 1:21 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

vol wrote:
The amqzlaa0 process only uses CPU in response to API requests from apps.


Not strictly true IIRC; I think it's tangled up in the logging / queue loading process someplace & I remember problems in early versions of v5.3 in this area.

But I would agree that an app connection is the most likely thing if the poster hadn't already discounted that. Perhaps worth another check there to be sure.

An obvious question I should have asked the first time, as well as "how was this done", was "circular or linear logging"? Another interesting question: does the failing queue manager sit for a while, then go crazy with amqzlaa0 (perhaps when the first app tries to connect?), or does it go mad at start up?

If it blows on first connect, this might explain why there are no apps connected when you investigate; the connecting app having got a 2059 and shut back down.....
_________________
Honesty is the best policy.
Insanity is the best defence.
garasan
Posted: Tue May 12, 2009 1:32 am

Apprentice

Joined: 22 Jul 2008
Posts: 42
Location: Antwerp, Belgium

Hi Vitor,

Double checked to be sure but this PID is not to be found in the connection list. :-(
There are apps connected, but they have a different PID.

The qmgr is configured to use CIRCULAR logging.
The crazy thing is that the qmgr kept on working "normally", although slower, for other apps.
The CPUs of the impacted app servers went crazy and they crashed. (With logging that indicates that the qmgr is not available.)

I didn't restart the qmgr and currently it works, although CPU usage on the qmgr machine is rather high for its current activities.

The qmgr was started on Saturday and the problem started on Sunday at noon.
_________________
Regards
Vitor
Posted: Tue May 12, 2009 1:49 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

garasan wrote:
this was performed by adjusting the qm.ini.
The qmgr was stopped during the change and started after the change.


Well it wouldn't have worked if the queue manager was running...!

I know this unsupported dodge is used to change the number of log files quite successfully; I'm less convinced it's a good way to change any of the other logging parameters. It certainly sounds like the queue manager's losing the plot and hanging up connections (as vol alluded to) under some circumstances. I wonder if some of your apps are using non-persistent messages, and when others try to use persistent messages (which are logged) your trouble starts.

In your place, I'd be inclined to shrug, accept the dodge hasn't worked for some reason and schedule a slot to recreate the queue manager with the right logging parameters. That's probably faster and cleaner than fiddling round trying to fix it. Especially as you've no access to a PMR here.

(Well you have, but the response from first line support will include a phrase to the effect "unsupported change to queue manager configuration").

Bite the bullet, accept you got unlucky, take 20 mins out and recreate the queue manager.
_________________
Honesty is the best policy.
Insanity is the best defence.
garasan
Posted: Tue May 12, 2009 2:02 am

Apprentice

Joined: 22 Jul 2008
Posts: 42
Location: Antwerp, Belgium

O_o, not following here.

Couldn't I just set LogBufferPages back to 0 (initial size) and restart?
I also wasn't aware that this was an unsupported change.

Some extra info: I found a lot of AMQ9209 errors in my logs at the time the disconnects occurred.

And we do indeed use a mix of persistent and non-persistent messages.
_________________
Regards
Vitor
Posted: Tue May 12, 2009 2:08 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

garasan wrote:
Couldn't I just set LogBufferPages back to 0 (initial size) and restart?


Perhaps, but see my comments above about fiddling about. This might fix it, but recreating the queue manager will fix it. Or allow a PMR.

garasan wrote:
I also wasn't aware that this was an unsupported change.


It's a commonly used dodge to change the number of log files, but officially log parameters are fixed at queue manager creation.

garasan wrote:
And we do indeed use a mix of persistent and non-persistent messages.


I think this proves something bad has happened to the queue manager's logging process, and convinces me even more that a clean slate is your best way forward.

Your choice, of course.
_________________
Honesty is the best policy.
Insanity is the best defence.
PeterPotkay
Posted: Tue May 12, 2009 5:10 am

Poobah

Joined: 15 May 2001
Posts: 7722

I don't think there is any reason to recreate this QM. Changing LogBufferPages to 4096 after the QM has been created is not unsupported. The S.A.G. talks about needing to restart the QM after you make this change, implying the QM was already running with a different value for LogBufferPages at one time.

We ran at MQ 6.0.2.5 on SLES 10 on z/Linux for several months with LogBufferPages set to 4096 without problems, until we upgraded to 6.0.2.6 a few weeks ago for unrelated reasons.

Quote:
"The CPUs of the impacted app servers went crazy and they crashed. (With logging that indicates that the qmgr is not available.)"

Seems like an app problem to me. Just because an app gets a 2059 doesn't mean it should go nuts. Who knows what else it's doing. Maybe their code is asking your QM to work overtime.
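
To make the point about 2059 concrete: an application can treat "queue manager not available" as a reason to back off rather than reconnect in a tight loop. The following is a minimal Python sketch of that idea, not the code of any application in this thread. `MQConnectError` and the `connect` callable are hypothetical stand-ins for whatever MQ client library the app uses, and the delays are arbitrary.

```python
import time

MQRC_Q_MGR_NOT_AVAILABLE = 2059  # the reason code discussed in this thread


class MQConnectError(Exception):
    """Hypothetical exception carrying an MQ reason code."""

    def __init__(self, reason):
        super().__init__(f"MQ reason {reason}")
        self.reason = reason


def connect_with_backoff(connect, max_attempts=5, base_delay=1.0,
                         max_delay=60.0, sleep=time.sleep):
    """Call connect(), retrying on 2059 with exponentially growing pauses."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except MQConnectError as err:
            # Retry only "queue manager not available"; anything else, or
            # running out of attempts, is passed back to the caller.
            if err.reason != MQRC_Q_MGR_NOT_AVAILABLE or attempt == max_attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

The `sleep` parameter exists only so the pauses can be replaced in tests. Capping both the delay and the number of attempts keeps a failing app from hammering the queue manager or spinning its own CPU.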
_________________
Peter Potkay
Keep Calm and MQ On
Vitor
Posted: Tue May 12, 2009 5:18 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

PeterPotkay wrote:
The S.A.G. talks about needing to restart the QM after you make this change,


Really? For Unix? Where?
_________________
Honesty is the best policy.
Insanity is the best defence.
PeterPotkay
Posted: Tue May 12, 2009 5:19 am

Poobah

Joined: 15 May 2001
Posts: 7722

http://publib.boulder.ibm.com/infocenter/wmqv6/v6r0/topic/com.ibm.mq.amqzag.doc/fa12640_.htm

Quote:
The value is examined when the queue manager is created or started, and might be increased or decreased at either of these times. However, a change in the value is not effective until the queue manager is restarted.
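
Since the value only takes effect at restart, it can help to confirm what is actually in the Log stanza of qm.ini before restarting. The following is a minimal Python sketch, assuming the usual "Stanza:" header followed by Key=Value lines. The sample stanza is illustrative: only LogBufferPages=4096 and circular logging come from this thread, and the other values and the path are made up.

```python
# Illustrative qm.ini fragment. Only LogBufferPages and LogType reflect
# what was stated in the thread; the rest is assumed for the example.
SAMPLE_QM_INI = """\
Log:
   LogPrimaryFiles=3
   LogSecondaryFiles=2
   LogFilePages=1024
   LogType=CIRCULAR
   LogBufferPages=4096
   LogPath=/var/mqm/log/QM1/
"""


def read_stanza(ini_text, stanza):
    """Return the Key=Value pairs of one stanza as a dict of strings."""
    values = {}
    in_stanza = False
    for line in ini_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if stripped.endswith(":") and "=" not in stripped:
            # A new stanza header; track whether it is the one we want.
            in_stanza = stripped[:-1] == stanza
            continue
        if in_stanza and "=" in stripped:
            key, _, value = stripped.partition("=")
            values[key.strip()] = value.strip()
    return values


if __name__ == "__main__":
    log = read_stanza(SAMPLE_QM_INI, "Log")
    print(log.get("LogBufferPages"), log.get("LogType"))
```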

_________________
Peter Potkay
Keep Calm and MQ On
Vitor
Posted: Tue May 12, 2009 5:28 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

PeterPotkay wrote:
http://publib.boulder.ibm.com/infocenter/wmqv6/v6r0/topic/com.ibm.mq.amqzag.doc/fa12640_.htm

Quote:
The value is examined when the queue manager is created or started, and might be increased or decreased at either of these times. However, a change in the value is not effective until the queue manager is restarted.


You're right - I was thinking of LogFilePages! Doh!

Shame it didn't work for this guy then.....
_________________
Honesty is the best policy.
Insanity is the best defence.
garasan
Posted: Tue May 12, 2009 6:10 am

Apprentice

Joined: 22 Jul 2008
Posts: 42
Location: Antwerp, Belgium

Thanks all for the added info and remarks.
We are also starting to think it is an app problem rather than an MQ problem.
We are trying to find the troublemaker app, but it seems that is the difficult part.
_________________
Regards
fjb_saper
Posted: Tue May 12, 2009 8:34 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

garasan wrote:
Thanks all for the added info and remarks.
We are also starting to think it is an app problem rather than an MQ problem.
We are trying to find the troublemaker app, but it seems that is the difficult part.


Shouldn't be that hard. Close the relevant channel for the connecting app (mode= force) and watch it go nuts...
_________________
MQ & Broker admin