MQSeries.net Forum Index » Clustering » Workload balancing not working as expected

RatherBeGolfing
PostPosted: Wed Sep 15, 2004 6:08 am    Post subject: Workload balancing not working as expected Reply with quote

Centurion

Joined: 12 Nov 2002
Posts: 118
Location: Syracuse, NY, USA

I have been playing with this for a couple of days and can't crack the problem. I have 3 QMgrs: QM1 is z/OS (5.3.1), QM2 is Win2000 (5.3 + CSD6) and QM3 is Win2000 (5.3 + CSD6).

I have a cluster queue (QueueA) that is defined on both QM2 and QM3, with the bind option set to Not Fixed.

I have a COBOL program that puts, say, 10 messages to QueueA, specifying MQOO-BIND-NOT-FIXED as an open option. The program issues an MQConn once, then for each message to be put it issues an MQOpen, MQPut and MQClose. After the last message is put, it issues an MQDisc. I would have expected 5 messages to appear on QM2's instance of QueueA and 5 to appear on QM3's instance of QueueA.

What actually happens is that all 10 messages go to QM2. If I inhibit the put attribute of QueueA on QM2, all 10 messages go to QueueA on QM3.

I've searched threads in this site and I've read the appropriate sections of the manuals on MQOpen and Cluster workload management and can't see what I'm doing wrong.

Just for "fun" I even pulled the MQConn and MQDisc into my Open-Put-Close loop - that didn't work either. I haven't tried MQPut1 yet.....

Any advice is wildly appreciated!
_________________
Cheers,
Larry
MQ Certifiable


Last edited by RatherBeGolfing on Wed Sep 15, 2004 10:30 am; edited 1 time in total
jefflowrey
PostPosted: Wed Sep 15, 2004 6:44 am    Post subject: Reply with quote

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

What queue manager is your COBOL program connected to?
_________________
I am *not* the model of the modern major general.
RatherBeGolfing
PostPosted: Wed Sep 15, 2004 6:49 am    Post subject: Reply with quote


The COBOL program is connecting to the z/OS queue manager (QM1). Also, it has been bound with the CSQBSTUB access module.
Back to top
View user's profile Send private message
jefflowrey
PostPosted: Wed Sep 15, 2004 6:52 am    Post subject: Reply with quote


You should see all messages going to the local queue on QM1, then. Local queues are always preferred - unless that's different on z/OS.

Unless you've installed an exit.
RatherBeGolfing
PostPosted: Wed Sep 15, 2004 6:57 am    Post subject: Reply with quote


QueueA does not exist on QM1, only on the Windows2000 queue managers, QM2 and QM3. And, we don't have any exits installed.
jefflowrey
PostPosted: Wed Sep 15, 2004 7:45 am    Post subject: Reply with quote


RatherBeGolfing wrote:
QueueA does not exist on QM1, only on the Windows2000 queue managers, QM2 and QM3. And, we don't have any exits installed.

Then this:
RatherBeGolfing wrote:
I have a cluster Queue (QueueA) that is defined on both QM1 and QM2. Bind option set to Not Fixed.

was a typo.

Are you specifying any OPEN options other than MQOO_OUTPUT and MQOO_BIND_NOT_FIXED?

Are you definitely setting the queue manager name to blanks before each put?
RatherBeGolfing
PostPosted: Wed Sep 15, 2004 8:53 am    Post subject: Reply with quote


Jeff, yes, that was a typo - I should have said I have a cluster queue QueueA defined on QM2 and QM3 - sorry.

I have no other open options besides MQOO-OUTPUT and MQOO-BIND-NOT-FIXED. And I force spaces into MQOD-OBJECTQMGRNAME before each MQOpen statement.

Thanks for hangin' in there with me!
Nigelg
PostPosted: Wed Sep 15, 2004 11:19 pm    Post subject: Reply with quote

Grand Master

Joined: 02 Aug 2004
Posts: 1046

All other things being equal, i.e. the cluster queue not PUT inhibited, the qmgr not suspended, channels RUNNING, and a load of other stuff, the destination qmgr is determined by the msg sequence number on the CLUSSDR channel; the msg is sent to the qmgr with the lowest number. If a lot of msgs have already been sent to QM3 in the scenario above, the WLB will put msgs to QM2 until the msg sequence numbers of the channels to QM2 and QM3 are the same, and then alternate msgs between them.
Try stopping and restarting the channels to QM2 and QM3 from QM1 to reset the msg sequence numbers, and then put msgs as before.
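Nigelg's description can be sketched as a toy simulation (illustrative only, not actual WMQ code - the routing rule and data shapes here are my own): each cluster-sender channel carries a sequence number, and each put goes to the channel with the lowest one, so a channel that is "behind" catches up before the traffic starts alternating.

```python
# Toy model of Nigelg's description: each CLUSSDR channel keeps a message
# sequence number, and a put is routed to the channel with the lowest one.
# Illustrative sketch only, not actual WMQ internals.

def route(seqs, n_msgs):
    """Route n_msgs puts, always choosing the channel with the lowest seq."""
    counts = {qm: 0 for qm in seqs}
    for _ in range(n_msgs):
        dest = min(seqs, key=lambda qm: seqs[qm])  # lowest sequence number wins
        seqs[dest] += 1                            # sending bumps that channel's seq
        counts[dest] += 1
    return counts

# QM2's channel is behind (seq 5) because earlier traffic went to QM3 (seq 9):
# QM2 receives msgs until the numbers are level, then the puts alternate.
print(route({"QM2": 5, "QM3": 9}, 10))  # → {'QM2': 7, 'QM3': 3}
```

Under this model a 10-message test does not split 5/5 unless the two channels start out level, which matches the behavior Nigelg describes.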
PeterPotkay
PostPosted: Thu Sep 16, 2004 9:53 am    Post subject: Reply with quote

Poobah

Joined: 15 May 2001
Posts: 7722

I have never heard of the sequence number ever coming into play for the round robin algorithm, and my testing has shown that is not the case either. It can't be true, since sequence numbers are at the channel level and not at the queue level. What if one app dumped a million messages to QM2 / QM3, and then App2 came along and dumped 1000 messages to QM3 / QM4 (a different queue)? They wouldn't all go to QM4 just because the sequence number for QM3 was higher.

Try the test again, but this time ensure the channels to both QM2 and QM3 are running. If the channel to QM2 starts up before the one to QM3 does, MQ is fast enough to move all 10 messages to QM2 before QM3 ever gets a chance. And/or use more test messages in your test.

Below is the algorithm used (from the cluster manual):
Code:

Algorithm: The steps in choosing a destination for a message:

If a queue name is specified, eliminate queues that are not PUT enabled.
Eliminate remote instances of queues that do not share a cluster with the local queue manager.

Eliminate remote CLUSRCVR channels that are not in the same cluster as the queue.

If a queue manager name is specified, eliminate queue manager aliases that are not PUT enabled.
Eliminate remote CLUSRCVR channels that are not in the same cluster as the local queue manager.

If the result above contains the local instance of the queue, choose it.
If the message is a cluster PCF message, eliminate any queue manager you have already sent a publication or subscription to.
If only remote instances of a queue remain, choose Resumed queue managers over Suspended ones.
If more than one remote instance of a queue remains, include all MQCHS_INACTIVE and MQCHS_RUNNING channels.
If less than one remote instance of a queue remains, include all MQCHS_BINDING, MQCHS_INITIALIZING, MQCHS_STARTING, and MQCHS_STOPPING channels.
If less than one remote instance of a queue remains, include all MQCHS_RETRYING channels.
If less than one remote instance of a queue remains, include all MQCHS_REQUESTING, MQCHS_PAUSED and MQCHS_STOPPED channels.
If more than one remote instance of a queue remains and the message is a cluster PCF message, choose locally defined CLUSSDR channels.
If more than one remote instance of a queue remains to any queue manager, choose channels with the highest NETPRTY to each queue manager.
If more than one remote instance of a queue remains, choose the least recently used channel.

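The elimination-then-LRU shape of that listing can be sketched in a few lines of Python (a rough sketch of a few of the steps only - the function name, data shapes and fields here are invented for illustration, not taken from the manual or from MQ internals):

```python
# Rough sketch of three steps from the published workload-balancing
# algorithm: drop PUT-inhibited instances, prefer a local instance, and
# otherwise pick the least recently used channel. Names are illustrative.

def choose_destination(instances, now):
    """instances: list of dicts with keys qmgr, put_enabled, local, last_used."""
    candidates = [i for i in instances if i["put_enabled"]]  # eliminate PUT-inhibited
    for inst in candidates:
        if inst["local"]:
            return inst["qmgr"]          # a local instance always wins
    # Only remote instances remain: least recently used channel wins.
    best = min(candidates, key=lambda i: i["last_used"])
    best["last_used"] = now              # sending refreshes the channel's timestamp
    return best["qmgr"]

instances = [
    {"qmgr": "QM2", "put_enabled": True, "local": False, "last_used": 10},
    {"qmgr": "QM3", "put_enabled": True, "local": False, "last_used": 7},
]
print(choose_destination(instances, now=11))  # → QM3: channel used least recently
```

Note the state that drives the final choice lives on the channel (`last_used`), not on the queue, which is the point being argued in this thread.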
_________________
Peter Potkay
Keep Calm and MQ On
Nigelg
PostPosted: Fri Sep 17, 2004 12:10 am    Post subject: Reply with quote


Quote:
I have never heard of the sequence number ever coming into play for the round robin algorithm, and my testing has shown that is not the case either. It can't be true, since sequence numbers are at the channel level and not at the queue level. What if one app dumped a million messages to QM2 / QM3, and then App2 came along and dumped 1000 messages to QM3 / QM4 (a different queue)? They wouldn't all go to QM4 just because the sequence number for QM3 was higher.


The default round robin algorithm IS at the channel level, not the queue level. In your example, that is exactly what happens.

Read your own quote...
Quote:
If more than one remote instance of a queue remains, choose the least recently used channel.

It chooses the least recently used channel by comparing the sequence numbers.
PeterPotkay
PostPosted: Fri Sep 17, 2004 4:56 am    Post subject: Reply with quote


So, the big question is, does
Quote:

If more than one remote instance of a queue remains, choose the least recently used channel.

mean that any message for any queue counts, or is the algorithm smart enough to consider only the last channel used for a message that is part of this particular round robin?

For example, AppA puts 1,000,000 messages as fast as it can to QueueA, which is hosted on QM2 and QM3. They round robin equally, for the sake of discussion. Then AppB starts putting messages to QueueB, which is hosted on QM3 and QM4, but it puts those messages once a second. Are you saying that all of AppB's messages will go to QM4, because AppA is using the channel to QM3 so heavily? That doesn't sound right to me.

If this were the case, one big app would disrupt the round-robining for every other app in the cluster, and I just don't think IBM would design it that way (but I could be wrong).

When you read
Quote:

If more than one remote instance of a queue remains, choose the least recently used channel.

you are assuming
Quote:

by comparing the sequence numbers

I assumed it meant comparing which channel AppB last sent a message down, not which channel any app, including the cluster repositories, last sent a message down.

Do you have access to the algorithm code?
hguapluas
PostPosted: Mon Sep 20, 2004 7:08 am    Post subject: Reply with quote

Centurion

Joined: 05 Aug 2004
Posts: 105
Location: San Diego

Per the Clustering book:
"...the algorithm determines which destinations are suitable. Suitability is based on the state of the channel (including any priority you might have assigned to the channel), and also the availability of the queue manager and queue. The algorithm uses a round-robin approach to finalize its choice between the suitable queue managers."

Make sure you defined both QueueA's exactly the same on both 2000 boxes. Any differences, no matter how slight, will alter the round-robin and produce varied results.

The round-robin algorithm works; I have tested this on cluster queues and seen it in action. Also make sure you are not specifying a specific queue manager when sending the message. Just send it to the queue "QueueA" and the cluster will do the rest.
PeterPotkay
PostPosted: Mon Sep 20, 2004 11:04 am    Post subject: Reply with quote


hguapluas wrote:

Make sure you defined both QueueA's exactly the same on both 2000 boxes. Any differences, no matter how slight, will alter the round-robin and produce varied results.


Given 2 queues in a cluster with the same name, the ONLY queue attribute that the algorithm considers (to the best of my knowledge) is whether the queue is PUT_INHIBITED or not. I have never seen any documentation that says the algorithm considers any other queue attribute when it decides where to send messages. If the queues have any other differences (description, max depth, persistence, etc.), it should not matter.
hguapluas
PostPosted: Tue Sep 21, 2004 6:38 am    Post subject: Reply with quote


You're right. That applies more to the MQINQ call than to the workload balancing algorithm.
mdurman
PostPosted: Wed Sep 22, 2004 4:54 pm    Post subject: Reply with quote

Newbie

Joined: 28 Jun 2001
Posts: 3
Location: Whittier, California

Many people believe that cluster workload balancing is on a per-queue basis, but that's not true. It is on a per-channel basis. We've just been through this recently ourselves.

The final check workload balancing makes is which channel, of the channels that can be chosen, has the oldest used date/time, and chooses that one.

So (and this was our scenario), let's assume you have 3 Queue Managers, QM1, QM2 and QM3. You have two clustered queues called QA and QB, defined on QM2 and QM3. You have an app that connects to QM1, opens the clustered queues using BIND_NOT_FIXED, and puts a large (let's say 100 KB) app message to QA followed by a small (let's say 500 byte) log message to QB. It does this 1,000 times.

Assuming there is no other activity on the Queue Managers, all the large messages will end up on one Queue Manager and all the small messages will end up on the other.

What happens is this. The app puts the app message to QA. Workload balancing selects a Queue Manager (let's say QM2). The message is sent there. The app then puts the log message to QB. Workload balancing checks the channel status, and selects QM3, because that has the oldest channel date/time, and the message is sent there.

The app then sends the next app message to QA. Workload balancing checks the channel status and selects QM2, because that has the oldest date/time now, and sends the message there. The app then puts the log message to QB, and QM3 is again selected on date/time.

That ping-pong between channels continues until all 1,000 messages have been put.

The end result appears to be that workload balancing is not functioning, or that you have queues opened in BIND_ON_OPEN mode, but it really is functioning as documented.
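mdurman's ping-pong scenario can be reproduced with a small least-recently-used-channel simulation (illustrative only - the channel names, timestamps and helper function are invented, not WMQ internals): with puts strictly alternating between QA and QB, every QA message lands on one queue manager and every QB message on the other.

```python
# Simulation of mdurman's scenario: QA and QB both hosted on QM2 and QM3,
# the app alternates puts to QA and QB, and the destination for each put
# is the channel with the oldest last-used time. Illustrative sketch only.

def put(last_used, clock):
    """Pick the channel with the oldest last-used time and stamp it."""
    dest = min(last_used, key=lambda qm: last_used[qm])
    last_used[dest] = clock
    return dest

last_used = {"QM2": 0, "QM3": 0}
qa_dests, qb_dests = [], []
clock = 0
for _ in range(1000):
    clock += 1
    qa_dests.append(put(last_used, clock))  # large app message to QA
    clock += 1
    qb_dests.append(put(last_used, clock))  # small log message to QB

# Every QA message lands on one queue manager, every QB message on the other.
print(set(qa_dests), set(qb_dests))  # → {'QM2'} {'QM3'}
```

The first QA put stamps QM2's channel, which makes QM3 the least recently used channel for the QB put that follows, and the pattern locks in for all 1,000 iterations - workload balancing is working exactly as documented, even though each individual queue sees no balancing at all.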