MQSeries.net :: View topic - channel not stopping

channel not stopping

Zappa

Posted: Tue Oct 19, 2010 7:30 am Post subject:

Acolyte

Joined: 06 Oct 2005
Posts: 55
Location: UK

I've been avoiding the HA debate as much as possible and don't want to spark off another one. We do use HACMP for DBs and filesystems, but the problem we would have in using something like IC91 is where to fail it over to, i.e. the NOT SO cheap licensing of a server that does nothing 24/7... I hear them asking "How much? For doing what?"
When and if a server dies then a clustered delivery of data should then route it to whatever is available, as in active/active clustering, not just half of it - surely!
Hot topic I know...

bruce2359

Posted: Tue Oct 19, 2010 7:52 am Post subject:

Poobah

Joined: 05 Jan 2008
Posts: 9489
Location: US: west coast, almost. Otherwise, enroute.

Not so much a hot-topic; but rather, the general misunderstanding of what WMQ clusters offer, and what they do not.

Your OP identifies messages stranded in the SCTQ due to the downstream qmgr (or channel) failing.

One of the other replies to your OP brought up the case of messages that successfully arrive at a cluster destination queue, where the destination qmgr that hosts the cluster queue then fails shortly after the message arrives but before it can be consumed. In this instance, the message is stranded, too; but this time on the failed qmgr.
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.

Vitor

Posted: Tue Oct 19, 2010 7:54 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

Zappa wrote:

we do use HACMP for DBs and filesystems, but the problem we would have in using something like IC91 is where to fail it over to, i.e. the NOT SO cheap licensing of a server that does nothing 24/7...

HACMP supports Active/Active and "where" is answered by "where do you fail the DBs to?". I've used HACMP to support WMQ & WMB in exactly the configuration you're talking about, with the "failover" server doing work when there's no emergency.

Zappa wrote:

When and if a server dies then a clustered delivery of data should then route it to whatever is available, as in active/active clustering, not just half of it - surely!

No, and don't call me Shirley.

There is a world of difference between an HACMP cluster and a WMQ cluster; it's one of the great problems that the word "cluster" is used for so many different architectures in IT. At its simplest, a WMQ cluster is designed to believe a dropped communication link is a transient problem that should be dealt with by retrying a few times. An HACMP cluster believes a loss of communication is a reason to bring up an alternate instance. To tie back to your question, it's 2 different interpretations of the word "available", both valid in their context.

If I was in your position, and had HACMP on site (i.e. already licensed), and had existing DBs and so forth under HACMP (presumably running Active/Active so there's no "wasted" DB server) that I could slot into I'd put the queue managers under HACMP & solve all my problems.

But I'm not in your position. You are.

_________________
Honesty is the best policy.
Insanity is the best defence.

zonko

Posted: Tue Oct 19, 2010 9:25 am Post subject:

Voyager

Joined: 04 Nov 2009
Posts: 78

When a cluster channel stops RUNNING, and goes to RETRYING in this case, any msgs remaining on the cluster xmitq for that channel are read from the queue and put back through the cluster workload balancing mechanism. If there are alternative destinations available in the cluster, for example on cluster qmgrs served by a RUNNING channel rather than a RETRYING one, the msgs are sent to that qmgr. The msgs will only remain on the cluster xmitq if there is no alternative destination, as will be the case for msgs destined for a queue which only has a single instance in the cluster, or if the qmgr was specified when the msg was originally put, or if the msg was put BIND_ON_OPEN.

Obviously, msgs which have already been sent such that the channel is indoubt will not be reallocated. The chance of this happening can be minimised by setting the BATCHHB (batch heartbeat interval) attribute on the channel.
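
To make the reallocation rules above concrete, here is a toy model in Python. It is only a sketch of the behaviour described in this post, not of how the queue manager is implemented: the Message class, the reallocate function, and the queue and qmgr names used with it are all hypothetical, and real workload balancing picks among eligible destinations with its own algorithm rather than simply taking the first one.

```python
from dataclasses import dataclass


@dataclass
class Message:
    queue: str
    # True if the msg was put BIND_ON_OPEN or with an explicit qmgr name.
    bound: bool = False
    # True if the msg was already sent in a batch that was never confirmed.
    in_doubt: bool = False


def reallocate(messages, failed_qmgr, queue_hosts, channel_status):
    """Split msgs queued for failed_qmgr into (moved, stranded).

    queue_hosts maps a queue name to the qmgrs hosting an instance of it.
    channel_status maps a qmgr name to "RUNNING" or "RETRYING".
    """
    moved, stranded = [], []
    for msg in messages:
        # An alternative is another instance of the same queue that is
        # reachable over a channel which is currently RUNNING.
        alternatives = [
            qmgr
            for qmgr in queue_hosts.get(msg.queue, [])
            if qmgr != failed_qmgr and channel_status.get(qmgr) == "RUNNING"
        ]
        if msg.bound or msg.in_doubt or not alternatives:
            stranded.append(msg)
        else:
            # Simplification: take the first alternative instead of balancing.
            moved.append((msg, alternatives[0]))
    return moved, stranded
```

Called with a queue hosted on two qmgrs, one of them behind a RETRYING channel, only the unbound msgs that are not in doubt end up in the moved list. Msgs for a queue with a single instance stay stranded whatever their binding.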

bruce2359

Posted: Tue Oct 19, 2010 9:34 am Post subject:

Poobah

Joined: 05 Jan 2008
Posts: 9489
Location: US: west coast, almost. Otherwise, enroute.

Yes, but ...

From the WMQ v7 Clusters manual:

If a local queue within the cluster becomes unavailable while a message is in transit, the message is forwarded to another instance of the queue (but only if the queue was opened (MQOPEN) with the BIND_NOT_FIXED open option).

[edit]
BIND option can be specified by the application at MQOPEN and/or set at the queue.

From the MQSC manual:
DEFBIND
    Specifies the binding to be used when the application specifies MQOO_BIND_AS_Q_DEF on the MQOPEN call, and the queue is a cluster queue.

    OPEN
        The queue handle is bound to a specific instance of the cluster queue when the queue is opened.

    NOTFIXED
        The queue handle is not bound to any particular instance of the cluster queue. This allows the queue manager to select a specific queue instance when the message is put using MQPUT, and to change that selection subsequently should the need arise.

    The MQPUT1 call always behaves as if NOTFIXED had been specified.
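
The way the application's open option and the queue's DEFBIND combine can be sketched as a small lookup in Python. This is a simplified illustration of the manual text quoted above and nothing more: the function names effective_bind and can_be_rerouted are hypothetical, the strings stand in for the real MQOO_* constants, and it assumes the application did not name a specific qmgr when it opened the queue.

```python
def effective_bind(open_option, queue_defbind):
    """Return "OPEN" or "NOTFIXED" for a handle opened on a cluster queue.

    open_option   -- "BIND_ON_OPEN", "BIND_NOT_FIXED" or "BIND_AS_Q_DEF"
    queue_defbind -- the queue's DEFBIND attribute: "OPEN" or "NOTFIXED"
    """
    if queue_defbind not in ("OPEN", "NOTFIXED"):
        raise ValueError(f"unknown DEFBIND value: {queue_defbind}")
    if open_option == "BIND_AS_Q_DEF":
        # The application defers to whatever the queue definition says.
        return queue_defbind
    if open_option == "BIND_ON_OPEN":
        return "OPEN"
    if open_option == "BIND_NOT_FIXED":
        return "NOTFIXED"
    raise ValueError(f"unknown open option: {open_option}")


def can_be_rerouted(open_option, queue_defbind, used_mqput1=False):
    """True if the qmgr may pick a different queue instance for the msg."""
    if used_mqput1:
        # Per the manual, MQPUT1 always behaves as if NOTFIXED was specified.
        return True
    return effective_bind(open_option, queue_defbind) == "NOTFIXED"
```

The point of the sketch is that DEFBIND(NOTFIXED) on the queue only helps when the application opens with BIND_AS_Q_DEF (or BIND_NOT_FIXED). An application that explicitly asks for BIND_ON_OPEN is bound regardless of the queue definition.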

Vitor

Posted: Tue Oct 19, 2010 10:30 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

zonko wrote:

or if the qmgr was specified when the msg was originally put, or if the msg was put BIND_ON_OPEN.

This all hangs on the OP's assertion that workload balancing happens correctly for all messages when all the queue managers are up.

PeterPotkay

Posted: Tue Oct 19, 2010 12:43 pm Post subject:

Poobah

Joined: 15 May 2001
Posts: 7723

Zappa wrote:

The values are this low because I don't want too many MSGS stuck on the SCTQ as there are other QMGRS in the cluster sharing the same queue names and I want these to pick up the load when one isn't available. If there are other ways of doing this then please let me know.

This is not needed to accomplish your goal, and is hurting you in other ways.

The cluster will reroute the messages if they are not bound to that particular QM even if the channel is just retrying. No need to force it to go into a STOPPED status with very low retry values.

But because you have such low retry values, the channel will go into STOPPED rather quickly and require manual intervention on your part when the potentially brief outage is over instead of auto recovering on its own.
_________________
Peter Potkay
Keep Calm and MQ On

bruce2359

Posted: Tue Oct 19, 2010 12:47 pm Post subject:

Poobah

Joined: 05 Jan 2008
Posts: 9489
Location: US: west coast, almost. Otherwise, enroute.

Please display the attributes of the clustered queue definition. Use DISPLAY QL( ). Then post the results here.

Does the application specify MQOO_BIND_ON_OPEN?
Or MQOO_BIND_NOT_FIXED?
Or MQOO_BIND_AS_Q_DEF?

Zappa

Posted: Wed Oct 20, 2010 1:41 am Post subject:

Acolyte

Joined: 06 Oct 2005
Posts: 55
Location: UK

The queues are DEFBIND(NOTFIXED) and the QMGR is rarely specified on puts.

Unfortunately my predicament is inherited; had this been a greenfield site I'd certainly want HACMP for WMQ/WMB as you all rightly encourage. Our WMB servers do have HACMP and these were once configured to fail over to each other, but quite simply the one server does not have the capacity for both brokers, whereas one on its own can just about handle the total load. All HACMP is currently being used for is the non active/active components, which are less than 2% of the total volume; everything else is load balanced with WMQ clustering.
The application DB is HACMP'd to an idle standby. I had suggested that the brokers could fail over to this also, but my UNIX SA responsible for HACMP didn't like the sound of this too much and discouraged it. I would also have a hard time explaining that we'd need to license many more CPUs' worth of WMQ/WMB for an idle standby; the budget would be blown (plus no bonus for me in the foreseeable future).
At the risk of being ridiculed I'm irresponsibly thinking "with a small config change to our channels I can make this HA'ish (if that's a word)" and this might be welcomed in these hard times.
I do currently need to make the round peg fit the square hole so in my case stopping the channels quickly when one QMGR is downed prevents thousands of stranded messages.
Any advice is very much welcomed to aid my challenges...

Vitor

Posted: Wed Oct 20, 2010 4:14 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

Zappa wrote:

At the risk of being ridiculed I'm irresponsibly thinking "with a small config change to our channels I can make this HA'ish (if that's a word)" and this might be welcomed in these hard times.

As if I'd ridicule anybody.

I think you've articulated the key problem - it's HA'ish. Not HA. You might be able to make this cheap HA solution work, but the results will not be predictable. Unless you really enjoy restarting channels manually, there's still going to be a window where messages will get stuck before the channel finally stops. What happens if one of those messages is business critical? A large, important or time sensitive message? Those same people who welcomed your cost savings in these hard times will be baying for your blood.

Zappa wrote:

I do currently need to make the round peg fit the square hole so in my case stopping the channels quickly when one QMGR is downed prevents thousands of stranded messages.

What you really need is to get management buy-in to the fact that you're pushing a round peg into a square hole, and the solution has weaknesses. Specifically you can prevent thousands of stranded messages but you can't prevent (or guarantee to prevent) stranded messages. Nor can you accurately predict how many or which messages will be stranded. Those who own the content must understand that.

They must also understand that if the queue manager is downed for a significant period (hardware failure rather than comms failure) those messages will sit for the duration.

Vitor

Posted: Wed Oct 20, 2010 4:16 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

Zappa wrote:

The queues are DEFBIND(NOTFIXED) and the QMGR is rarely specified on puts.

Before someone else says it:

DEFBIND(NOTFIXED) is only a default. An application can specify BIND_ON_OPEN if it chooses, in the same way it can specify a queue manager.

If the queue manager is rarely specified, that means it's sometimes specified.

PeterPotkay

Posted: Wed Oct 20, 2010 4:35 am Post subject:

Poobah

Joined: 15 May 2001
Posts: 7723

Zappa wrote:

I do currently need to make the round peg fit the square hole so in my case stopping the channels quickly when one QMGR is downed prevents thousands of stranded messages.
Any advice is very much welcomed to aid my challenges...

See my previous post.

Zappa

Posted: Wed Oct 20, 2010 5:19 am Post subject:

Acolyte

Joined: 06 Oct 2005
Posts: 55
Location: UK

Hope this doesn't come across in the wrong way but you are preaching to the converted. I know what I'm proposing isn't best practice and I do need management buy-in to bolster resources for a proper HA solution, hence me trying to avoid the topic. Like I say I've inherited this config!

I'm not overly sure why you are saying that I'd have to randomly start the stopped channels though; would they not only be stopped if the values were too low? The values I chose were just a test and none of this is in production yet. If the values were high enough to cope with network blips but short enough not to isolate too many msgs then I can't help but think that it's better than what I have now.

Vitor

Posted: Wed Oct 20, 2010 5:25 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

Zappa wrote:

If the values were high enough to cope with network blips but short enough not to isolate too many msgs then I can't help but think that it's better than what I have now.

Well if you can hit that happy medium then well done. I suspect that any values low enough to isolate "not too many msgs" (however you determine that) will need to be so low that the channels will stop more often than you'd like. And at inconvenient times.

I repeat (because I feel it's important) that you need to explain this to the great and the good. They need to understand this, and will potentially have input to that "not too many" number.

PeterPotkay

Posted: Wed Oct 20, 2010 6:27 am Post subject:

Poobah

Joined: 15 May 2001
Posts: 7723

If you want to set the value that tells the QM how long to wait between retries to a smaller value, that's fine.

But don't set your Long Retry Count to a small number. Once that is exhausted, the channel hard stops and then you need to manually start it. There's no need for that. Let it retry for a long time over and over. Let it recover on its own when the underlying problem goes away.

You probably do not want or need the channel to be hard stopped.
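
A rough way to see the trade-off is to add up how long a channel keeps retrying before it gives up and goes STOPPED. The Python sketch below is a back-of-envelope approximation that ignores the time each connection attempt itself takes; the function name is hypothetical, and the "low" values are made up for illustration (the values actually tested are not shown in this part of the thread). The other set uses what I believe are the usual defaults for the SHORTRTY, SHORTTMR, LONGRTY and LONGTMR channel attributes, so check your own channel definitions before relying on them.

```python
def seconds_until_stopped(shortrty, shorttmr, longrty, longtmr):
    """Approximate time a channel spends retrying before it goes STOPPED.

    The channel makes shortrty attempts shorttmr seconds apart, then
    longrty attempts longtmr seconds apart, and stops when both run out.
    """
    return shortrty * shorttmr + longrty * longtmr


# Hypothetical "very low" test values: the channel stops in under a minute
# and then needs a manual start once the outage is over.
low = seconds_until_stopped(shortrty=3, shorttmr=5, longrty=3, longtmr=10)

# Assumed defaults: ten quick retries over ten minutes, then effectively
# unlimited slow retries, so the channel recovers on its own.
default = seconds_until_stopped(
    shortrty=10, shorttmr=60, longrty=999_999_999, longtmr=1200
)

# Shorter intervals with the long retry count left alone: the channel
# notices recovery sooner but still never hard stops in practice.
quick = seconds_until_stopped(
    shortrty=10, shorttmr=5, longrty=999_999_999, longtmr=30
)

print(f"low values:      {low} seconds")
print(f"assumed default: {default / 86_400:,.0f} days")
print(f"short intervals: {quick / 86_400:,.0f} days")
```

Shrinking the timers changes how quickly a recovered qmgr is noticed; shrinking the retry counts is what turns a brief outage into a manual restart.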
