(Resolved) Load balance anomaly
bbburson
Posted: Tue Nov 23, 2004 1:46 pm    Post subject: (Resolved) Load balance anomaly

Partisan

Joined: 06 Jan 2004
Posts: 378
Location: Nowhere near a queue manager

I'm new to MQ clustering, so please bear with me. I have searched this forum for an explanation of the behavior I'm seeing, but have not found this exact situation discussed.

The setup:
Four queue managers, two full repositories and two partials:

FR1 (MQver 530.7, Solaris 9)
FR2 (MQver 530.8, Solaris 9)
PR1 (MQver 530.7, HP 11.0)
PR2 (MQver 530.7, Solaris 9)

Cluster queue CL1 defined locally on FR1 and PR1
Cluster queue CL2 defined locally on FR2 and PR2

For each queue manager in turn I connect a client and put a message to each cluster queue; and I do that twice per queue (because the sample program amqsputc seems to be compiled with BIND_ON_OPEN option).

All the messages go where I expect them to with one glaring exception: every time I connect to FR1 and do puts to CL2 ALL the messages end up on FR2 (instead of half-and-half on FR2 and PR2). For some reason full repository FR1 favors the other full repository and never shoots any messages to the partial repository box.

The reverse situation, where I connect to FR2 and put to CL1, works as I would expect; half the messages end up on FR1 and half on PR1.

The cluster channels are all up and running. FR1 knows about both instances of CL2 (using runmqsc DIS QCLUSTER(CL2) command). If I disable puts to CL2 on FR2, the messages then will flow to PR2.

Any ideas why this is not behaving as expected?

Thanks,
PeterPotkay
Posted: Tue Nov 23, 2004 1:55 pm

Poobah

Joined: 15 May 2001
Posts: 7722

Quote:

For each queue manager in turn I connect a client and put a message to each cluster queue; and I do that twice per queue (because the sample program amqsputc seems to be compiled with BIND_ON_OPEN option).

Actually, I bet it is using the queue's default binding option, which by default is bind on open, so change your queues' DEFBIND attribute to NOTFIXED.

As for why the messages all go to FR2 rather than being split between FR2 and PR2: for some reason, the algorithm on FR1 thinks the queue on FR2 is preferable. Maybe PR2 is suspended from the cluster. Note that a suspended queue manager would still receive messages if there were no other choice, but not while another instance is available. Try switching all 4 queues to bind not fixed, and try your test again. Also, try connecting to PR1 and putting to CL2. In that case, is the mix 50/50?
_________________
Peter Potkay
Keep Calm and MQ On
Nigelg
Posted: Wed Nov 24, 2004 12:34 am

Grand Master

Joined: 02 Aug 2004
Posts: 1046

It may be that a lot of msgs have already flowed from FR1 to PR2, and so the chl seq num on the channel is much higher than the seq num on the chl to FR2, so the algorithm chooses FR2 until such time as the seq nums match.
PeterPotkay
Posted: Wed Nov 24, 2004 5:47 am

Poobah

Joined: 15 May 2001
Posts: 7722

Nigelg wrote:
It may be that a lot of msgs have already flowed from FR1 to PR2, and so the chl seq num on the channel is much higher than the seq num on the chl to FR2, so the algorithm chooses FR2 until such time as the seq nums match.


I respectfully disagree. Nigel, you never addressed my test results in this thread.
http://www.mqseries.net/phpBB2/viewtopic.php?t=17607&start=15

If clustering chose a path strictly based on the value of sequence numbers, I could send 1,000,000 messages from FR1 to FR2 only, and then it could take days, months or years before another application sent enough messages from FR1 to PR2 to get its sequence number over 1,000,000, during which time no messages would round robin to FR2. We both know that it can't work like that.

My testing has shown that while one channel's sequence number is *rising* faster than another's, the algorithm counts that against it when choosing a path. Details are in the thread mentioned above.

Again, I hate disagreeing with someone looking at the actual source code, but my testing shows something else, and logic also dictates that clustering should not work this way.
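
To make the thought experiment above concrete, here is a minimal Python sketch of a purely hypothetical "always pick the channel with the lowest sequence number" rule. It is not the real clustering algorithm, which this thread does not establish; the function names and the scaled-down starting values (1,000 instead of 1,000,000) are assumptions for illustration only.

```python
def choose_lowest_sequence(seq):
    """Hypothetical rule: pick the channel whose sequence number is lowest.

    On a tie, min() returns the first channel in dictionary order.
    """
    return min(seq, key=seq.get)


def simulate(start_seq, messages):
    """Send `messages` messages under the hypothetical rule.

    start_seq maps a destination name to its channel's starting sequence
    number. Returns how many messages each destination received.
    """
    seq = dict(start_seq)              # copy so the caller's dict is untouched
    sent = {name: 0 for name in seq}
    for _ in range(messages):
        target = choose_lowest_sequence(seq)
        seq[target] += 1               # each message bumps that channel's number
        sent[target] += 1
    return sent


if __name__ == "__main__":
    # The channel to FR2 has already carried 1,000 messages; PR2's has carried none.
    print(simulate({"FR2": 1000, "PR2": 0}, 1000))
```

Under this assumed rule, the next 1,000 messages all go to PR2 and FR2 receives nothing until the two numbers match, which is the starvation the argument above says cannot be how clustering works.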
_________________
Peter Potkay
Keep Calm and MQ On
bbburson
Posted: Wed Nov 24, 2004 7:05 am    Post subject: further testing reveals...

Partisan

Joined: 06 Jan 2004
Posts: 378
Location: Nowhere near a queue manager

Thanks for the replies! The DEFBIND setting on the queues certainly makes a difference.

Today I changed all four queues to DEFBIND(NOTFIXED) and the messages balanced between FR2 and PR2 as I had expected them to all along. That happens whether I do
Code:
Connect
  put
  put
  put
  put
Disconnect
-or-
Code:
Connect
  put
Disconnect
four times in a row.
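
The difference between the two binding modes in these two test patterns can be sketched with a toy model in Python. Everything here is an assumption for illustration: ToyQueueManager, ToyHandle and run are hypothetical names, and the chooser is a plain round robin rather than the real workload algorithm. The only idea taken from the thread is that bind on open fixes the destination once per open, while not fixed picks one per put.

```python
from collections import Counter
from itertools import cycle


class ToyQueueManager:
    """Hypothetical stand-in for the sending queue manager (plain round robin)."""

    def __init__(self, instances):
        self._chooser = cycle(instances)

    def next_instance(self):
        return next(self._chooser)

    def open(self, bind_on_open):
        return ToyHandle(self, bind_on_open)


class ToyHandle:
    """Hypothetical open handle on a cluster queue."""

    def __init__(self, qmgr, bind_on_open):
        self._qmgr = qmgr
        # Bind on open: the destination is chosen once, when the queue is opened.
        self._fixed = qmgr.next_instance() if bind_on_open else None

    def put(self):
        # Not fixed: a destination is chosen again for every put.
        if self._fixed is not None:
            return self._fixed
        return self._qmgr.next_instance()


def run(bind_on_open, opens, puts_per_open):
    """Count where messages land for a given open/put pattern."""
    qmgr = ToyQueueManager(["FR2", "PR2"])
    tally = Counter()
    for _ in range(opens):
        handle = qmgr.open(bind_on_open)
        for _ in range(puts_per_open):
            tally[handle.put()] += 1
    return dict(tally)


if __name__ == "__main__":
    print(run(bind_on_open=True, opens=1, puts_per_open=4))
    print(run(bind_on_open=False, opens=1, puts_per_open=4))
    print(run(bind_on_open=True, opens=4, puts_per_open=1))
```

In this model, one open with four puts under bind on open sends all four messages to a single instance, while not fixed splits them two and two. Four separate opens balance in either mode, so the toy model does not reproduce the original FR1-to-CL2 behaviour.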

The mystery part is still why my original tests (using Connect/Put/Disconnect four times in a row) failed to load balance only when my client connected to FR1, doing puts to CL2. I used all the combinations of connection points (FR1, FR2, PR1, PR2) to each of the clustered queues (CL1, CL2), and only that one combo failed to load balance (of course connections to the qmgr where the queue is local put all the messages locally, as expected).

Oh, well, I'll make a note to always set my clustered queues to be DEFBIND(NOTFIXED) and maybe I won't have to worry about this particular thing again.

BTW, Peter, I've been following some of the discussions here and I agree with you that channel sequence number does not come into play in this scenario.