bbburson
Posted: Thu Aug 16, 2007 9:55 am    Post subject: Alternate cluster transmit queue?
Can a cluster connection be set up to use a transmit queue other than SYSTEM.CLUSTER.TRANSMIT.QUEUE? I'm pretty sure the answer is no, but I'm exploring options to keep a poorly managed or badly behaving queue manager from affecting other connections within the cluster. I figure a question to the assembled wisdom is always worth a shot.
jefflowrey
Posted: Thu Aug 16, 2007 10:05 am
Why would you expect a poorly managed or misbehaving queue manager to affect other connections in the cluster?
Clusters try to form fully connected networks.
The only messages that should be affected by an issue with a single queue manager in a cluster are messages going TO or FROM that queue manager.
A build-up of messages on the S.C.T.Q that are all going to one queue manager will not affect messages going to any other queue manager.
Unless the build-up hits the MAXDEPTH of the S.C.T.Q....
Messages on the S.C.T.Q are pulled by each cluster sender channel by CorrelID.
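If you want to keep an eye on that build-up, a couple of runmqsc lines will show it (nothing site-specific assumed here):

Code:
    * Current vs. maximum depth of the cluster transmit queue
    DISPLAY QLOCAL(SYSTEM.CLUSTER.TRANSMIT.QUEUE) CURDEPTH MAXDEPTH
    * Live depth and open handles, per the queue status
    DISPLAY QSTATUS(SYSTEM.CLUSTER.TRANSMIT.QUEUE) TYPE(QUEUE) CURDEPTH IPPROCS OPPROCS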
bbburson
Posted: Thu Aug 16, 2007 10:50 am
Thanks, Jeff. I was thinking I had seen a slowdown in cluster message flow when the S.C.T.Q was backing up, but I'm probably misremembering the particulars. I wondered how the various messages were kept apart and routed to the proper cluster sender channel; reading from the queue by CorrelID makes sense.
As I said initially, I didn't think there was a way to have alternate cluster transmit queues. Your reply lets me know that (a) there is not, and (b) I shouldn't worry about it.
Thanks again.
fjb_saper
Posted: Thu Aug 16, 2007 11:05 am
Bruce,
I have seen a whole cluster slow down because of a misbehaving cluster qmgr. The main reason was that other queues on the misbehaving qmgr did not get their messages delivered in time. The root problem: one of the queues was poorly serviced, or not serviced at all, and hitting its max depth.
The consequent slowdown happens as messages get delivered to the DLQ and back up in the SYSTEM.CLUSTER.TRANSMIT.QUEUE of the sender. By staying in front of other messages, sent to a perfectly serviced queue on the same qmgr, they prevent those from being delivered until they have either expired or the messages in front of them have reached the DLQ.
Best defense: suspend the offending qmgr until the situation has been resolved ((a) increase the max qdepth, (b) make sure the queue is serviced properly). The big assumption here is that the qmgrs in your cluster are interchangeable and do load balancing.
Enjoy
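From memory, the runmqsc steps look something like this (MYCLUSTER and the queue name are invented for illustration):

Code:
    * Take the offending qmgr out of the cluster workload rotation
    SUSPEND QMGR CLUSTER(MYCLUSTER)
    * Give the badly serviced queue more headroom while you fix the consumer
    ALTER QLOCAL(APP.REQUEST.QUEUE) MAXDEPTH(500000)
    * When the queue is being serviced properly again, rejoin the rotation
    RESUME QMGR CLUSTER(MYCLUSTER)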
jefflowrey
Posted: Thu Aug 16, 2007 11:13 am
FJ -
Did that affect messages that were NOT going to the bad qmgr?
Or did it only affect messages going TO the bad qmgr?
fjb_saper
Posted: Thu Aug 16, 2007 11:18 am
jefflowrey wrote:
    FJ -
    Did that affect messages that were NOT going to the bad qmgr?
    Or did it only affect messages going TO the bad qmgr?
It only affected the messages going to the bad qmgr, but it affected the response time in request/reply for services that were not at fault but were co-located on that same qmgr...
Like I said, if you have to wait in line until the preceding messages hit the DLQ on the destination qmgr, you are not moving or getting processed, even if YOUR destination queue is being serviced correctly....
This was mostly caused by batch processes flooding the queue when they weren't expected to run. We tried to alleviate it somewhat by setting message priorities differently, so that online traffic would always take precedence over batch... and by bigger queue depths.
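For what it's worth, we did the priority split at the queue level, something along these lines (queue names invented; this only affects messages put with MQPRI_PRIORITY_AS_Q_DEF):

Code:
    * Online request traffic defaults to a high priority...
    ALTER QLOCAL(ONLINE.REQUEST.QUEUE) DEFPRTY(7)
    * ...batch feeds default to the lowest, so online jumps the line
    ALTER QLOCAL(BATCH.FEED.QUEUE) DEFPRTY(0)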
jefflowrey
Posted: Thu Aug 16, 2007 11:34 am
Yes. Messages going to a particular qmgr on a given S.C.T.Q are serviced in FIFO order, so a "poison" message can therefore hold up the ones behind it.
But that doesn't slow down the CLUSTER.
Just the applications...
And I agree that one does need to suspend/stop the affected qmgr as soon as possible.
This is why things like qdepth monitoring/events on the S.C.T.Q are good things.
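Something like this turns those events on (the 20% threshold is an arbitrary example):

Code:
    * Performance events have to be enabled at the qmgr level first
    ALTER QMGR PERFMEV(ENABLED)
    * Fire a Queue Depth High event when S.C.T.Q passes 20% of MAXDEPTH
    ALTER QLOCAL(SYSTEM.CLUSTER.TRANSMIT.QUEUE) QDEPTHHI(20) QDPHIEV(ENABLED)
    * Your monitor then reads the events off SYSTEM.ADMIN.PERFM.EVENT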
fjb_saper
Posted: Thu Aug 16, 2007 11:59 am
jefflowrey wrote:
    Yes. Messages going to a particular qmgr on a given S.C.T.Q are serviced in FIFO order, so a "poison" message can therefore hold up the ones behind it.
    But that doesn't slow down the CLUSTER.
    Just the applications...
    And I agree that one does need to suspend/stop the affected qmgr as soon as possible.
    This is why things like qdepth monitoring/events on the S.C.T.Q are good things.
I believe it is FIFO within priority. And no, it was not the applications' fault; it was just one application's fault. All the other co-located services were responding fine, within 300ms, but the messages could take minutes to get there because of the full queue.... So in a sense, and because of the way cluster communications work, one full queue will affect the whole cluster...
Load-balancing admins beware: you really don't want a queue hitting max depth. It's a recipe for disaster in response times.
Proof that it slows down the cluster and not the apps: just increase the max qdepth to make the backlog in the SCTQ disappear, and you'll see the effect immediately...
A hint from a different post: there is a parameter on the channel that governs the number of retries before putting the message to the DLQ. Retrying the message multiple times and adding the DLQ header takes substantially more time than delivering it to the intended destination.
See the parms MRRTY and MRTMR.
This said, I still have to add: working as designed
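To see what your receiving ends are currently set to (the channel name pattern is just an example):

Code:
    * MRRTY/MRTMR live on the receiving side (RCVR, RQSTR, CLUSRCVR channels)
    DISPLAY CHANNEL(TO.*) CHLTYPE(CLUSRCVR) MRRTY MRTMR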
bbburson
Posted: Thu Aug 16, 2007 12:39 pm
Thanks, guys, for extending the discussion. Just for completeness I'll point out that the slowdown when a queue is full and messages are going to the DLQ is not specific to WMQ clusters. The same thing happens with SDR/RCVR channel pairs, where messages for well-behaved applications can be delayed because they are stuck in line behind the ones that are being retried and eventually put to the DLQ. In those cases you'll see the channels in PAUSED and RETRYING states (which may also happen with cluster channels; I just can't recall for sure).
...and I agree: working as designed.
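A quick way to spot that condition from runmqsc:

Code:
    * PAUSED = the MCA is waiting out a message-retry interval;
    * RETRYING = the channel itself is down and attempting to reconnect
    DISPLAY CHSTATUS(*) STATUS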
PeterPotkay
Posted: Mon Aug 20, 2007 1:54 pm
Because the CLUSSDRs are pulling from the S.C.T.Q. by CorrelID, a very deep S.C.T.Q. can slow things down even if everything is flowing correctly.
A get by CorrelID (and/or MsgID) will take longer on a queue with 1,000,000 messages than on a queue with 10 messages, regardless of whether the application is a business app or a sending MCA.
Aside from that, if you want to minimize the effect that QueueA filling up on the receiving QM has on messages trying to get to QueueB through QueueZ, set MRRTY and MRTMR to zero. The RCVR channel will immediately dump them to the DLQ when it sees QueueA is still full. Of course, it does still take some time for it to notice that QueueA is full, so there will still be some slowdown versus QueueA just accepting the incoming messages.
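If you go that route, the change is per receiving channel and takes effect the next time the channel starts (channel name invented for illustration):

Code:
    * No message retries: a failed put sends the message straight to the DLQ
    ALTER CHANNEL(TO.QM2) CHLTYPE(CLUSRCVR) MRRTY(0) MRTMR(0)
    * Make sure a dead-letter queue is actually defined first
    DISPLAY QMGR DEADQ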
fjb_saper
Posted: Mon Aug 20, 2007 3:18 pm
fjb_saper wrote:
    A hint from a different post: there is a parameter on the channel that governs the number of retries before putting the message to the DLQ. Retrying the message multiple times and adding the DLQ header takes substantially more time than delivering it to the intended destination.
    See the parms MRRTY and MRTMR.
    This said, I still have to add: working as designed
Peter, I'm just going to add here that the default is to retry 10 times... and with the defaults, a backup of even a few thousand messages will cause a dramatic slowdown...
Don't know what the default on MRTMR is (the time between retry intervals, in ms).