bbburson
Posted: Thu Aug 16, 2007 9:55 am    Post subject: Alternate cluster transmit queue?
Can a cluster connection be set up to use a transmit queue other than SYSTEM.CLUSTER.TRANSMIT.QUEUE? I'm pretty sure the answer is no, but I'm exploring options to keep a poorly managed or badly behaving queue manager from affecting other connections within the cluster. I figure a question to the assembled wisdom is always worth a shot.
jefflowrey
Posted: Thu Aug 16, 2007 10:05 am
Why would you expect a poorly managed or misbehaving queue manager to affect other connections in the cluster?
Clusters try to form fully connected networks.
The only messages that should be affected by an issue with a single queue manager in a cluster are messages going TO or FROM that queue manager.
A build-up of messages on the S.C.T.Q that are all going to one queue manager will not affect messages going to any other queue manager.
Unless the build-up hits the MAXDEPTH of the S.C.T.Q....
Messages on the S.C.T.Q are pulled by each cluster sender channel by CorrelID.
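If you want to keep an eye on that build-up, a couple of runmqsc lines will show it (nothing site-specific assumed here):

Code:
    * Current vs. maximum depth of the cluster transmit queue
    DISPLAY QLOCAL(SYSTEM.CLUSTER.TRANSMIT.QUEUE) CURDEPTH MAXDEPTH
    * Live depth and open handles, per the queue status
    DISPLAY QSTATUS(SYSTEM.CLUSTER.TRANSMIT.QUEUE) TYPE(QUEUE) CURDEPTH IPPROCS OPPROCS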
bbburson
Posted: Thu Aug 16, 2007 10:50 am
Thanks, Jeff. I was thinking I had seen a slowdown in cluster message flow when the S.C.T.Q was backing up, but I'm probably misremembering the particulars. I wondered how the various messages were kept apart and routed to the proper cluster sender channel; reading from the queue by CorrelID makes sense.
As I said initially, I didn't think there was a way to have alternate cluster transmit queues. Your reply lets me know that (a) there is not, and (b) I shouldn't worry about it.
Thanks again.
fjb_saper
Posted: Thu Aug 16, 2007 11:05 am
Bruce,
I have seen a whole cluster slow down because of a misbehaving cluster qmgr. The main reason was that other queues on the misbehaving qmgr did not get their messages delivered in time. The root problem: one of the queues was poorly serviced, or not serviced at all, and hitting its max depth.
The consequent slowdown happens as messages get delivered to the DLQ and back up in the SYSTEM.CLUSTER.TRANSMIT.QUEUE of the sender. By staying in front of other messages, sent to a perfectly serviced queue on the same qmgr, they prevent those from being delivered until they have either expired or the messages in front of them have reached the DLQ.
Best defense: suspend the offending qmgr until the situation has been resolved ((a) increase the max qdepth, (b) make sure the queue is serviced properly). The big assumption here is that the qmgrs in your cluster are interchangeable and do load balancing.
Enjoy
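From memory, the runmqsc steps look something like this (MYCLUSTER and the queue name are invented for illustration):

Code:
    * Take the offending qmgr out of the cluster workload rotation
    SUSPEND QMGR CLUSTER(MYCLUSTER)
    * Give the badly serviced queue more headroom while you fix the consumer
    ALTER QLOCAL(APP.REQUEST.QUEUE) MAXDEPTH(500000)
    * When the queue is being serviced properly again, rejoin the rotation
    RESUME QMGR CLUSTER(MYCLUSTER)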
jefflowrey
Posted: Thu Aug 16, 2007 11:13 am
FJ -
Did that affect messages that were NOT going to the bad qmgr?
Or did it only affect messages going TO the bad qmgr?
fjb_saper
Posted: Thu Aug 16, 2007 11:18 am
jefflowrey wrote:
    FJ -
    Did that affect messages that were NOT going to the bad qmgr?
    Or did it only affect messages going TO the bad qmgr?
It only affected the messages going to the bad qmgr, but it affected the response time in request/reply for services that were not at fault but were co-located on that same qmgr...
Like I said, if you have to wait in line until the preceding messages hit the DLQ on the destination qmgr, you are not moving or getting processed, even if YOUR destination queue is being serviced correctly....
This was mostly caused by batch processes flooding the queue when they weren't expected to run. We tried to alleviate it somewhat by setting message priorities differently, so that online traffic would always take precedence over batch... and by bigger queue depths.
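For what it's worth, we did the priority split at the queue level, something along these lines (queue names invented; this only affects messages put with MQPRI_PRIORITY_AS_Q_DEF):

Code:
    * Online request traffic defaults to a high priority...
    ALTER QLOCAL(ONLINE.REQUEST.QUEUE) DEFPRTY(7)
    * ...batch feeds default to the lowest, so online jumps the line
    ALTER QLOCAL(BATCH.FEED.QUEUE) DEFPRTY(0)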
jefflowrey
Posted: Thu Aug 16, 2007 11:34 am
Yes. Messages going to a particular qmgr on a given S.C.T.Q are serviced in FIFO order, so a "poison" message can therefore hold up the ones behind it.
But that doesn't slow down the CLUSTER.
Just the applications...
And I agree that one does need to suspend/stop the affected qmgr as soon as possible.
This is why things like qdepth monitoring/events on the S.C.T.Q are good things.
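Something like this turns those events on (the 20% threshold is an arbitrary example):

Code:
    * Performance events have to be enabled at the qmgr level first
    ALTER QMGR PERFMEV(ENABLED)
    * Fire a Queue Depth High event when S.C.T.Q passes 20% of MAXDEPTH
    ALTER QLOCAL(SYSTEM.CLUSTER.TRANSMIT.QUEUE) QDEPTHHI(20) QDPHIEV(ENABLED)
    * Your monitor then reads the events off SYSTEM.ADMIN.PERFM.EVENT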
fjb_saper
Posted: Thu Aug 16, 2007 11:59 am
jefflowrey wrote:
    Yes. Messages going to a particular qmgr on a given S.C.T.Q are serviced in FIFO order, so a "poison" message can therefore hold up the ones behind it.
    But that doesn't slow down the CLUSTER.
    Just the applications...
    And I agree that one does need to suspend/stop the affected qmgr as soon as possible.
    This is why things like qdepth monitoring/events on the S.C.T.Q are good things.
I believe it is FIFO within priority. And no, it was not the applications' fault; it was just one application's fault. All the other co-located services were responding fine, within 300ms, but the messages could take minutes to get there because of the full queue.... So in a sense, and because of the way cluster communications work, one full queue will affect the whole cluster...
Load-balancing admins beware: you really don't want a queue hitting max depth. It's a recipe for disaster in response times.
Proof that it slows down the cluster and not the apps: just increase the max qdepth to make the backlog in the SCTQ disappear, and you'll see the effect immediately...
A hint from a different post: there is a parameter on the channel that governs the number of retries before putting the message to the DLQ. Retrying the message multiple times and adding the DLQ header takes substantially more time than delivering it to the intended destination.
See the parms MRRTY and MRTMR.
This said, I still have to add: working as designed
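To see what your receiving ends are currently set to (the channel name pattern is just an example):

Code:
    * MRRTY/MRTMR live on the receiving side (RCVR, RQSTR, CLUSRCVR channels)
    DISPLAY CHANNEL(TO.*) CHLTYPE(CLUSRCVR) MRRTY MRTMR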
bbburson
Posted: Thu Aug 16, 2007 12:39 pm
Thanks, guys, for extending the discussion. Just for completeness I'll point out that the slowdown when a queue is full and messages are going to the DLQ is not specific to WMQ clusters. The same thing happens with SDR/RCVR channel pairs, where messages for well-behaved applications can be delayed because they are stuck in line behind the ones that are being retried and eventually put to the DLQ. In those cases you'll see the channels in PAUSED and RETRYING states (which may also happen with cluster channels; I just can't recall for sure).
...and I agree: working as designed.
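A quick way to spot that condition from runmqsc:

Code:
    * PAUSED = the MCA is waiting out a message-retry interval;
    * RETRYING = the channel itself is down and attempting to reconnect
    DISPLAY CHSTATUS(*) STATUS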
PeterPotkay
Posted: Mon Aug 20, 2007 1:54 pm
Because the CLUSSDRs are pulling from the S.C.T.Q. by CorrelID, a very deep S.C.T.Q. can slow things down even if everything is flowing correctly.
A get by CorrelID (and/or MsgID) will take longer on a queue with 1,000,000 messages than on a queue with 10 messages, regardless of whether the application is a business app or a sending MCA.
Aside from that, if you want to minimize the effect that QueueA filling up on the receiving QM has on messages trying to get to QueueB through QueueZ, set MRRTY and MRTMR to zero. The RCVR channel will immediately dump them to the DLQ when it sees QueueA is still full. Of course, it does still take some time for it to notice that QueueA is full, so there will still be some slowdown versus QueueA just accepting the incoming messages.
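If you go that route, the change is per receiving channel and takes effect the next time the channel starts (channel name invented for illustration):

Code:
    * No message retries: a failed put sends the message straight to the DLQ
    ALTER CHANNEL(TO.QM2) CHLTYPE(CLUSRCVR) MRRTY(0) MRTMR(0)
    * Make sure a dead-letter queue is actually defined first
    DISPLAY QMGR DEADQ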
fjb_saper
Posted: Mon Aug 20, 2007 3:18 pm
fjb_saper wrote:
    A hint from a different post: there is a parameter on the channel that governs the number of retries before putting the message to the DLQ. Retrying the message multiple times and adding the DLQ header takes substantially more time than delivering it to the intended destination.
    See the parms MRRTY and MRTMR.
    This said, I still have to add: working as designed
Peter, I'm just going to add here that the default is to retry 10 times... and with the defaults, a backup of even a few thousand messages will cause a dramatic slowdown...
Don't know what the default on MRTMR is (the time between retry intervals, in ms).