mq series problems on Tandem after a CPU outage
rjl_state
Posted: Mon Nov 11, 2002 3:37 pm    Post subject: mq series problems on Tandem after a CPU outage

Apprentice

Joined: 04 Oct 2002
Posts: 48
Location: Des Moines, IA

We have been using MQ Series v. 5.1 on Tandem for about 9 months without many issues. Then we had an extended CPU outage, and ever since then MQ Series has been acting strange.

At first some channels would not come up running. We had to copy the channel to another name and then delete the old channel (along with changing everything pointing to the channel).

We then had channels that the channel initiator wouldn't start.

We then had a queue server process that was using 20% of a CPU. We altered all of the local queues that were using the queue server one at a time until it stopped being so busy. Then we moved the offending queue back to the original queue server and the process didn't take as much of the CPU.

Today we had sender channels that appeared to be running but wouldn't send messages and receiver channels that appeared to be receiving messages (No errors from sending platforms), but no messages ever arrived on the queue.

Anyone else have experience with losing a CPU in a busy MQ Series environment? IBM doesn't seem to have a clue on how to clean this up.

We stopped and restarted the queue manager while the cpu was down.

Any advice would be welcome.

Thanks.
mqonnet
Posted: Wed Nov 13, 2002 6:12 am    Post subject:

Grand Master

Joined: 18 Feb 2002
Posts: 1114
Location: Boston, Ma, Usa.

rjl, here are the answers that I could think of.

We have been using MQ Series v. 5.1 on Tandem for about 9 months without many issues. Then we had an extended CPU outage, and ever since then MQ Series has been acting strange.
---Any CPU outage has some impact. If you are on the GA code base, you must upgrade to the latest level; CSD01 and some other fixes address CPU failover. But I strongly feel that a recycle of your QM should resolve any issues you have, and that is what you pointed out at the end of this post.

At first some channels would not come up running. We had to copy the channel to another name and then delete the old channel. (along with changing everything pointing to the channel).
---This is weird. You should be able to use the same channels just fine. You did not mention what errors you were getting when you tried to start the channels after the CPU failure. Were any FDCs created, or any errors written to the error logs?

We then had channels that the channel initiator wouldn't start.
---Did you check whether the channel initiator was running in the first place?

We then had a queue server process that was using 20% of a CPU. We altered all of the local queues that were using the queue server one at a time until it stopped being so busy. Then we moved the offending queue back to the original queue server and the process didn't take as much of the CPU.
---Load balancing is critical, and MQ does not have much say in it. It is the user's responsibility to design the queues and QMs so that the load is balanced.

Today we had sender channels that appeared to be running but wouldn't send messages and receiver channels that appeared to be receiving messages (No errors from sending platforms), but no messages ever arrived on the queue.
---Most of the time, seeing the channels in running state does not necessarily mean that they are running. Whatever state the channels were in before a CPU failover, they stay in that state until the HBINT is reached. That is when they try to communicate and re-establish their session. So, to confirm that messages were not flowing, you could physically stop and restart the channels on both ends and try sending messages again. I bet it would work fine.

Anyone else have experience with losing a CPU in a busy MQ Series environment? IBM doesn't seem to have a clue on how to clean this up.
---Well, comments this vague are inappropriate. If IBM has developed this product, they know very well how to resolve the issue. My 2 cents.

We stopped and restarted the queue manager while the cpu was down.
---And I would think everything would have been fine after that.

Any advice would be welcome.
---You are welcome. I hope I answered some of your queries. Hope that helps.

Cheers.
Kumar
_________________
IBM Certified WebSphere MQ V5.3 Developer
IBM Certified WebSphere MQ V5.3 Solution Designer
IBM Certified WebSphere MQ V5.3 System Administrator
LuisFer
Posted: Thu Nov 14, 2002 9:15 am    Post subject: Re: mq series problems on Tandem after a CPU outage

Partisan

Joined: 17 Aug 2002
Posts: 302

We have the MQ servers in Pathway set up with this configuration:
ALTER MQS-CHANINIT00, CPUS (0,1,2,3,......15)
ALTER MQS-ECBOSS(1,2,3,4.....,0)
ALTER MQS-QUEUE00, CPUS (0:2,1:3,2:4,.....).
and so on for all the Pathway servers, with AUTORESTART 10.
We have had some CPU-down events, and the servers restarted OK without recycling the QMGR.
For a QServer process loop, use the --qsoptions L option of the ALTMQFLS control command; it is more efficient.
LuisFer
Posted: Thu Nov 14, 2002 9:48 am    Post subject: Re: mq series problems on Tandem after a CPU outage

Partisan

Joined: 17 Aug 2002
Posts: 302

Sorry, the stop chl(xxxx) mode(force) & start chl(xxx) is required if the MQMCACAL or MQTCPRES process was running in the CPU that went down.
We have a macro for doing this.
Also, if you have a loop in a QServer (MQS-QUEUEXX), it is normally caused by a lock on a queue file (normally when the QServer is resetting the statistics of the queue). This generates an FDC and a message in the MQERRLG1 file. If you move the queue to another QServer with the ALTMQFLS control command, the CPU usage of the QServer returns to normal.
We have had some problems, and this solution works fine.
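
For anyone who wants to script the forced stop and start, here is a minimal sketch that only builds the MQSC command text. It is not the macro mentioned above; the function name and the channel names are made up for illustration, and it assumes the standard MQSC forms STOP CHANNEL(name) MODE(FORCE) and START CHANNEL(name).

```python
def channel_restart_script(channels):
    """Build MQSC text that force-stops and then starts each channel.

    `channels` is a list of channel names (hypothetical examples below).
    Nothing is executed here; the function only returns command text.
    """
    lines = []
    for name in channels:
        # Force the channel down first, since a channel that looks like it
        # is running may have lost its process when the CPU went down.
        lines.append(f"STOP CHANNEL({name}) MODE(FORCE)")
        lines.append(f"START CHANNEL({name})")
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Hypothetical channel names, for illustration only.
    print(channel_restart_script(["TANDEM.TO.HOST", "HOST.TO.TANDEM"]), end="")
```

The generated text would normally be reviewed and then given to runmqsc for the queue manager. The same stop and start should be repeated on the partner queue manager for the other end of each channel.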