
Cluster receiver priorities: glitch?
Vitor
Posted: Tue Jun 07, 2011 6:53 am    Post subject: Re: Cluster receiver priorities: glitch?

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

svu wrote:
Actually, the best thing would be to find something from IBM where they say "sometimes that algorithm does not work, under the following circumstances:..." Am I asking too much?


Yes you are. The algorithm always works, it just doesn't do what you want it to do or expect it to do.
_________________
Honesty is the best policy.
Insanity is the best defence.
svu
Posted: Tue Jun 07, 2011 6:55 am

Voyager

Joined: 30 Jan 2006
Posts: 99

WMBDEV1 wrote:
What happens to the messages that were on that QM (not new ones which will be routed to the other QM in the cluster)? When would they be able to be processed? Is that HA?
I understand what you mean. Yes, WMQ needs some subsystems (like storage) to be HA as well. But that does not affect my primary issue, right?
Vitor
Posted: Tue Jun 07, 2011 6:56 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

WMBDEV1 wrote:
What happens to the messages that were on that QM (not new ones which will be routed to the other QM in the cluster)? When would they be able to be processed? Is that HA?




This is the classic problem when using a WMQ cluster as an HA solution - the "stuck message" scenario. Again discussed many, many times in here.

And (in summary) not a problem if you have low-priority, low-value messages which can sit on a downed queue manager for however long it takes to be brought back up. Or recreated if the hardware's U/S.
svu
Posted: Tue Jun 07, 2011 6:58 am    Post subject: Re: Cluster receiver priorities: glitch?

Voyager

Joined: 30 Jan 2006
Posts: 99

Vitor wrote:
And both of those statements are the simplistic version. I offer into evidence the number of discussions in this forum surrounding the parameters that influence message distribution and their effect on each other.
It is the simplistic version, assuming that all other parameters (weight and rank, network priority, individual queue priorities, etc.) are essentially equal. Does that make those two statements true or not? If not, what would be the fully and absolutely correct statements?
Vitor
Posted: Tue Jun 07, 2011 7:00 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

svu wrote:
Yes, WMQ needs some subsystems (like storage) to be HA as well.


How will that help if the server crashes with a non-disc issue? Or the application that reads the messages fails on that instance? How many "subsystems" need to be HA before you've built something that's actually better done by an HA solution? Like the multi-instance queue manager feature in WMQv7 which is an HA solution where a cluster is not?
svu
Posted: Tue Jun 07, 2011 7:01 am    Post subject: Re: Cluster receiver priorities: glitch?

Voyager

Joined: 30 Jan 2006
Posts: 99

Vitor wrote:
Yes you are. The algorithm always works, it just doesn't do what you want it to do or expect it to do.

Sorry, you lost me here... If the algorithm always works, the rules #1-#15 would cause the messages to appear on the primary QM (prio 5). So, either the algorithm sometimes does not work (what would be the conditions?) or I am missing something in those rules (or perhaps I am overlooking something in our environment). That was my question - what could be that something "non-evident" that could affect the behavior of the algorithm?
svu
Posted: Tue Jun 07, 2011 7:04 am

Voyager

Joined: 30 Jan 2006
Posts: 99

Vitor wrote:
Like the multi-instance queue manager feature in WMQv7 which is an HA solution where a cluster is not?
OK, let's assume that you have convinced me that the cluster is not a full HA solution. Could we return to the original issue, please? I hope even a non-HA cluster should follow the algorithm...

Last edited by svu on Tue Jun 07, 2011 7:08 am; edited 1 time in total
Vitor
Posted: Tue Jun 07, 2011 7:06 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

svu wrote:
I hope even non-HA cluster should follow the algorithm...


It should, and probably does, but it takes very little to disrupt traffic flow. As other posters have said, you're 99.9% sure nothing happened. That's not certain. One network blip would do it.

From the bottom of the link you posted:

Quote:
The distribution of user messages is not always exact, because administration and maintenance of the cluster causes messages to flow across channels. The result is an uneven distribution of user messages which can take some time to stabilize. Because of the admixture of administration and user messages, place no reliance on the exact distribution of messages during workload balancing


I've bolded the relevant section.
svu
Posted: Tue Jun 07, 2011 7:12 am

Voyager

Joined: 30 Jan 2006
Posts: 99

Vitor wrote:
One network blip would do it.
Does it mean there are no retries in the cluster?
Quote:
From the bottom of the link you posted: I've bolded the relevant section.
Right. That is something reasonably official. Thank you! As usual, the most important details are at the bottom:)
Vitor
Posted: Tue Jun 07, 2011 7:17 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

svu wrote:
Vitor wrote:
One network blip would do it.
Does it mean there are no retries in the cluster?


Retry what? Remember that all the workload balancing is doing is deciding which of the cluster targets the message is to be delivered to. Once that's been decided, the normal channel logic (including retry) comes into play. And why would the balancing ever need to retry?

This is the other classic "stuck message" scenario, where the message is not stuck on the downed queue manager, but stuck in the SCTQ waiting for the designated queue manager to come back.
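Editor's note: the two stages described above (balancing picks a target once, then ordinary channel logic delivers) can be sketched as a toy model. This is a minimal Python illustration only, not WebSphere MQ code: the class `ClusterSender`, its methods, and the queue manager names are hypothetical, and the model deliberately ignores any reallocation of queued messages that the real product may perform under some conditions.

```python
from collections import deque


class ClusterSender:
    """Toy model of a sending queue manager (hypothetical, for illustration)."""

    def __init__(self, targets):
        # One channel per target queue manager; all start reachable.
        self.channel_up = {t: True for t in targets}
        # Stands in for the cluster transmission queue (the SCTQ).
        self.xmitq = deque()
        self.delivered = {t: [] for t in targets}

    def put(self, msg, target):
        # Stage 1: the target has already been chosen by workload balancing.
        # The message is queued for that target and the choice is not revisited.
        self.xmitq.append((target, msg))

    def run_channels(self):
        # Stage 2: channels deliver what they can; the rest keeps waiting.
        waiting = deque()
        while self.xmitq:
            target, msg = self.xmitq.popleft()
            if self.channel_up[target]:
                self.delivered[target].append(msg)
            else:
                waiting.append((target, msg))
        self.xmitq = waiting


sender = ClusterSender(["QM1", "QM2"])
sender.put("order-1", "QM1")        # balancing decided on QM1
sender.channel_up["QM1"] = False    # QM1 becomes unreachable afterwards
sender.run_channels()
print(len(sender.xmitq))            # 1: the message waits in the SCTQ
sender.channel_up["QM1"] = True     # QM1 comes back
sender.run_channels()
print(sender.delivered["QM1"])      # ['order-1']
```

In this model the message never moves to QM2 even though QM2 is reachable, which is the "stuck in the SCTQ" behaviour the post refers to.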


svu wrote:
As usual, the most important details are at the bottom:)


Proving it's important to read all the documentation, not just some of it.
Last edited by Vitor on Tue Jun 07, 2011 7:18 am; edited 1 time in total
svu
Posted: Tue Jun 07, 2011 7:17 am

Voyager

Joined: 30 Jan 2006
Posts: 99

Just for the record, the URL for the v7 algorithm: https://www-304.ibm.com/support/docview.wss?uid=swg21127527
svu
Posted: Tue Jun 07, 2011 7:22 am

Voyager

Joined: 30 Jan 2006
Posts: 99

Vitor wrote:
Proving it's important to read all the documentation, not just some of it.
Sure! Still... my personal curiosity is not 100% satisfied... The cluster was in a stable state, which means no control messages would be sent around... Yes, the disclaimer from IBM formally covers it all, but it would still be interesting to find out what actually happened... Something definitely broke the algorithm.
Vitor
Posted: Tue Jun 07, 2011 7:26 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

svu wrote:
it would be interesting to find out what actually happened...


You have fun with that.

svu wrote:
Something definitely broke the algorithm.


Or caused it to make a decision you were not expecting. That's not broken.

But if you can prove that all the conditions were correctly met & the algorithm did not behave as documented, you're within your rights to raise a PMR.

Or indeed raise a PMR for assistance in discovering why it made the decision it did.
svu
Posted: Tue Jun 07, 2011 7:30 am

Voyager

Joined: 30 Jan 2006
Posts: 99

Vitor wrote:
Or caused it to make a decision you were not expecting.
Right.

Quote:
Or indeed raise a PMR for assistance in discovering why it made the decision it did.
Yes, we always have that option...
PeterPotkay
Posted: Tue Jun 07, 2011 9:03 am

Poobah

Joined: 15 May 2001
Posts: 7722

http://publib.boulder.ibm.com/infocenter/wmqv7/v7r0/topic/com.ibm.mq.csqzah.doc/qc10940_.htm

Look at #8 and #9.

If the channel to your primary QM is in any state other than INACTIVE or RUNNING, even for a brief second, then the algorithm will route the messages to the other QM, because the channel to that QM is probably sitting in an INACTIVE state (#8) and is thus preferred until the primary channel returns to RUNNING or INACTIVE.

A channel that is normally stopping and starting due to the disconnect interval will cycle through the intermediate channel states listed in #9. If a new message arrives at that moment, the cluster will send it to the channel that is in state #8 instead.

You can minimize this by using a very large DISCINT value to ensure the channels basically never go INACTIVE (I still wouldn't use zero for the value). But one little network blip will still leave you vulnerable to this.

This is not a 100% bulletproof solution. You WILL find messages going to the wrong QM sooner or later when you don't want them to.

There have been more in-depth discussions on this exact topic in the past, which came to the same conclusion.
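Editor's note: the preference order described in this post (channels that are INACTIVE or RUNNING are considered first, channels in an intermediate state only if none of those remain, and priority is compared only among the survivors) can be sketched as follows. This is a simplified Python model under stated assumptions, not IBM's implementation: the names `Target` and `choose_target` are hypothetical, the set of intermediate state names is an assumption since the post does not list them, and rank, weight, and every other step of the documented algorithm are left out.

```python
from dataclasses import dataclass

# State tiers as described in the post; the intermediate names are assumed.
PREFERRED = {"INACTIVE", "RUNNING"}
INTERMEDIATE = {"BINDING", "INITIALIZING", "STARTING", "STOPPING"}


@dataclass
class Target:
    qmgr: str           # destination queue manager (hypothetical name)
    channel_state: str  # current state of the channel to that queue manager
    clwlprty: int       # cluster workload priority; higher is preferred


def choose_target(targets):
    """Pick a target by channel-state tier first, then by priority."""
    if not targets:
        raise ValueError("no candidate targets")
    for tier in (PREFERRED, INTERMEDIATE):
        candidates = [t for t in targets if t.channel_state in tier]
        if candidates:
            break
    else:
        # No channel in either tier: fall back to all targets.
        candidates = list(targets)
    return max(candidates, key=lambda t: t.clwlprty)


primary = Target("QM1", "RUNNING", 5)
secondary = Target("QM2", "INACTIVE", 0)
print(choose_target([primary, secondary]).qmgr)  # QM1

# The primary channel is briefly cycling, for example after the
# disconnect interval expires or after a network blip.
primary.channel_state = "STOPPING"
print(choose_target([primary, secondary]).qmgr)  # QM2
```

Because the state filter runs before priority is compared, the priority 5 target loses to the priority 0 target for as long as its channel is outside the preferred tier, which matches the behaviour reported at the start of the thread.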
_________________
Peter Potkay
Keep Calm and MQ On
Page 2 of 3
