
MQSeries.net Forum Index » Clustering » HELP: stuck messages in cluster

deadrow
Posted: Tue Jun 29, 2004 5:16 am

Newbie

Joined: 04 May 2004
Posts: 5

Here is my BIG issue: I administer a cluster with 4 queue managers (QM0 is a front-end queue manager, and QM1, QM2, and QM3 are the back-end ones). QM0 sends messages to the back-end queue managers, where a cluster queue is published.
When all queue managers are up, everything works fine. The problems come when I want to turn off a back-end queue manager.
If I stop the receiver channel of QM1, no problems: the cluster recognises the channel's change of status and delivers the messages only to QM2 and QM3.
If I stop QM1 while the cluster is working, I get a stuck message for QM1. The cluster is not able to recognise that QM1 is going down, and a message is enqueued on its transmission queue. Is there a way to avoid this?
My hypothesis is that the cluster periodically samples its channels' status. Therefore, once a channel comes up, the cluster considers it running until the channel's status is sampled again. Hence, the cluster enqueues a message believing that the channel is running even though the remote queue manager has been stopped. Is this a gigantic bullshit? Is there any parameter I can set to correct this misbehavior? (I tried modifying the heartbeat, without result.)
It is interesting to note that if the channel is inactive and I stop QM1, the cluster doesn't send messages to QM1 (I don't get the misbehavior with the first message). I get the stuck message only when I turn off the queue manager and the channel was running.
mq_guru
Posted: Tue Jun 29, 2004 10:25 am

Novice

Joined: 23 Feb 2004
Posts: 13

Try suspending the queue manager from the cluster before you stop it for maintenance. Once you are done, issue a RESUME QMGR and that should work better. I am just guessing.

-vivek
PeterPotkay
Posted: Tue Jun 29, 2004 1:14 pm

Poobah

Joined: 15 May 2001
Posts: 7716

Quote:

Is this a gigantic bullshit?


Isn't it all?

What exactly do you mean by "turn off the QM"?

The algorithm that determines where to send a message checks several things, and if all are equal and the message is not addressed to a specific QM in the cluster, it will round-robin. So whatever you are doing to "turn off" the QM is not satisfying the algorithm's logic for ranking that destination lower in the round robin. To the algorithm, a retrying channel is not the same as a stopped channel, which is not the same as a running channel.

The algorithm will send the message to a running or inactive channel over a starting channel, a starting channel over a retrying channel, and a retrying channel over a stopped channel.
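
The preference order described above can be modelled with a small Python sketch. This is only an illustration of the ranking plus round robin as stated in this post, not the product's actual workload code; the state names, the `STATE_RANK` table, and the `RoundRobinChooser` class are assumptions made up for the example.

```python
from itertools import count

# Preference order as described in the post (lower rank is preferred).
# Illustrative model only; the real algorithm checks more than channel state.
STATE_RANK = {
    "RUNNING": 0,
    "INACTIVE": 0,
    "STARTING": 1,
    "RETRYING": 2,
    "STOPPED": 3,
}


class RoundRobinChooser:
    """Pick a destination among the channels that share the best rank."""

    def __init__(self):
        self._turn = count()

    def choose(self, channel_states):
        # channel_states maps queue manager name -> channel state string.
        best = min(STATE_RANK[s] for s in channel_states.values())
        candidates = sorted(
            qm for qm, s in channel_states.items() if STATE_RANK[s] == best
        )
        # Rotate through the equally ranked candidates.
        return candidates[next(self._turn) % len(candidates)]


chooser = RoundRobinChooser()
states = {"QM1": "RUNNING", "QM2": "RUNNING", "QM3": "RUNNING"}
all_up = [chooser.choose(states) for _ in range(3)]

# Once the channel to QM1 is seen as retrying, QM1 drops out of the rotation.
states["QM1"] = "RETRYING"
qm1_retrying = [chooser.choose(states) for _ in range(4)]
```

In this model QM1 keeps receiving its share for as long as its channel is still reported as RUNNING, which is the window the original poster is asking about.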

If you want to stop round-robining to this QM, you can issue the SUSPEND command, but note that any messages specifically targeted at this QM will still try to go to it, even though it is suspended.

You could put-inhibit the queue. Not a good solution if you have lots of queues, but it does allow you to pick and choose what comes over. Again, if a message is addressed specifically to this queue/QM, the message will still come over, and will now go to the DLQ.

Or you can manually STOP the CLUSRCVR channel. That will keep the algorithm from sending any messages to this QM unless, again, the message is addressed specifically to that QM.

It is my understanding that the algorithm goes through all its checks with fresh info about the channels and queues for every message, but I am not 100% sure of that.
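
For reference, the three options in this post (SUSPEND, put-inhibit, stopping the CLUSRCVR) map onto MQSC commands roughly as in the Python sketch below, which only builds the command strings to run on the back-end queue manager. The cluster, queue, and channel names are hypothetical placeholders, and the helper `maintenance_commands` is invented for this example; check the exact syntax against the MQSC reference for your release.

```python
def maintenance_commands(option, cluster="CLUSTER1", queue="APP.QUEUE",
                         clusrcvr="TO.QM1"):
    """Return MQSC command strings for one of the three options.

    All object names are hypothetical placeholders, not taken from the thread.
    """
    if option == "suspend":
        before = [f"SUSPEND QMGR CLUSTER({cluster})"]
        after = [f"RESUME QMGR CLUSTER({cluster})"]
    elif option == "put_inhibit":
        # Must be repeated for every cluster queue hosted on this QM.
        before = [f"ALTER QLOCAL({queue}) PUT(DISABLED)"]
        after = [f"ALTER QLOCAL({queue}) PUT(ENABLED)"]
    elif option == "stop_clusrcvr":
        before = [f"STOP CHANNEL({clusrcvr})"]
        after = [f"START CHANNEL({clusrcvr})"]
    else:
        raise ValueError(f"unknown option: {option}")
    return {"before_shutdown": before, "after_restart": after}


plan = maintenance_commands("suspend", cluster="MYCLUS")
for step, commands in plan.items():
    for command in commands:
        print(f"{step}: {command}")
```

The output would be fed to `runmqsc` on the queue manager being taken down, before the shutdown and again after the restart.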
_________________
Peter Potkay
Keep Calm and MQ On
PeterPotkay
Posted: Tue Jun 29, 2004 5:14 pm

Poobah

Joined: 15 May 2001
Posts: 7716

Quote:

I get the stuck message only when I turn off the queue manager and the channel was running.


What is the status of the channel at this point? I assume retrying? If so, the algorithm should not try to send a message to that QM. If it does, I can think of only 3 reasons.

1.) You put a message that was specifically addressed to that queue.

2.) The channel should go into retry almost immediately. For the tiny bit of time in between, the channel is still running, and thus the algorithm thinks the QM is 100%. There is no tuning parameter I know of that can make the sender side realize any faster that the receiver side is down. It's pretty fast as is.

3.) There is a bug in the code.


_________________
Peter Potkay
Keep Calm and MQ On
JasonE
Posted: Wed Jun 30, 2004 1:27 am

Grand Master

Joined: 03 Nov 2003
Posts: 1220
Location: Hursley

What release / fixpack?

Do you have the fix for this APAR applied?
http://www-1.ibm.com/support/search.wss?apar=include&q=IC36185
deadrow
Posted: Wed Jun 30, 2004 6:03 am

Newbie

Joined: 04 May 2004
Posts: 5

When I wrote "turn off" I meant shutting down the queue manager.

I am using WebSphere MQ v5.3 CSD06. IC36185 is fixed in CSD05.

I am not specifically addressing that queue.

The purpose of my test was to simulate a problem on a cluster queue manager and observe how the cluster reacts. I expected that when a queue manager is not reachable (in my test, I shut it down), the cluster would not send to it. Instead, I noticed that my cluster sends a message to that queue manager before noticing that it has been shut down.
I am wondering if there is a parameter that makes the cluster react faster to such an event.
I send a message every 2 seconds. My cluster has 3 back-end queue managers, so each back-end queue manager is addressed every 6 seconds (as a consequence of the round robin).
In my opinion 6 seconds is not a tiny bit.

Thank you to everybody for helping me
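
The timing in this post can be checked with a few lines of Python. The sketch assumes a strict round robin and a sender that notices the failure after some fixed delay; that delay (`detection_delay_s`) and the position of the stopped queue manager in the rotation (`first_turn`) are hypothetical inputs, since the thread gives no measured values for either.

```python
def seconds_between_visits(send_interval_s, backends):
    """Time between two consecutive messages to the same back-end QM."""
    return send_interval_s * backends


def stranded_messages(send_interval_s, backends, detection_delay_s,
                      first_turn=0):
    """Count messages routed to a stopped QM before the sender notices.

    Messages are sent at t = 0, send_interval_s, 2 * send_interval_s, ...
    The QM stops at t = 0 and is still treated as available until
    detection_delay_s. Its turn in the rotation is message index first_turn.
    """
    stranded = 0
    index = 0
    while index * send_interval_s < detection_delay_s:
        if index % backends == first_turn:
            stranded += 1
        index += 1
    return stranded


visit_gap = seconds_between_visits(2, 3)   # the 6 seconds from the post
unlucky = stranded_messages(2, 3, detection_delay_s=1, first_turn=0)
lucky = stranded_messages(2, 3, detection_delay_s=1, first_turn=1)
```

Under these assumptions, whether a message gets stuck depends on whether the stopped queue manager's turn falls inside the detection window, not on the 6-second gap between its turns.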
PeterPotkay
Posted: Wed Jun 30, 2004 7:22 am

Poobah

Joined: 15 May 2001
Posts: 7716

Well, here is what I think, and I would like to see what others say as well.

Assume QM1 is sending to QM2.

If QM2 goes down, there is going to be an unavoidable period of time during which the SNDR channel on QM1 is still running and has not yet recognized that QM2 is going down. During this time, the algorithm will still put messages to that destination. There is no way around this, and no way to make it realize the situation any faster.

If you know QM2 needs to come down, then stop the CLUSRCVR on QM2 first. For scenarios where QM2 crashes unexpectedly, I think this is a vulnerability that cannot be avoided.
_________________
Peter Potkay
Keep Calm and MQ On
JasonE
Posted: Wed Jun 30, 2004 7:32 am

Grand Master

Joined: 03 Nov 2003
Posts: 1220
Location: Hursley

I'd agree, and the question is probably when the channel actually leaves the running state. As the channel goes into a retry state, we should reallocate the messages to other possible destinations (assuming they were not put with bind on open).
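
The reallocation idea can be pictured with a toy Python model: messages waiting for a failed destination are moved to the remaining ones unless they were put with bind on open. This is a sketch of the behaviour as described in this post, not the product's code; the `Message` class and the `reallocate` function are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Message:
    msg_id: int
    destination: str
    bind_on_open: bool


def reallocate(xmit_queue, failed_qm, healthy_qms):
    """Return a new list with unbound messages moved off failed_qm.

    Messages put with bind on open stay where they are and keep waiting
    for the failed queue manager to come back.
    """
    if not healthy_qms:
        return list(xmit_queue)
    result = []
    turn = 0
    for msg in xmit_queue:
        if msg.destination == failed_qm and not msg.bind_on_open:
            # Spread the moved messages round robin over the healthy QMs.
            new_destination = healthy_qms[turn % len(healthy_qms)]
            turn += 1
            result.append(Message(msg.msg_id, new_destination, False))
        else:
            result.append(msg)
    return result


queue = [
    Message(1, "QM1", False),
    Message(2, "QM1", True),   # bound to QM1, must not move
    Message(3, "QM2", False),
    Message(4, "QM1", False),
]
moved = reallocate(queue, failed_qm="QM1", healthy_qms=["QM2", "QM3"])
destinations = [m.destination for m in moved]
```

In this model a message put with bind on open is the one that stays stuck on the transmission queue until the stopped queue manager returns.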