Author |
Message
|
fredmoore |
Posted: Wed Nov 07, 2018 4:31 am Post subject: Is dynamic client conn rebalancing in qmgr group possible? |
|
|
Novice
Joined: 23 Mar 2009 Posts: 24
|
Hi folks,
any help with this is appreciated.
Scenario
1\ Many MQ clients connect to 4 different queue managers, forming a queue manager group, using a client channel definition table (CCDT)
2\ One of the queue managers becomes unavailable, and the clients connected to that queue manager will scatter across the remaining queue managers
3\ The unavailable queue manager comes up again after a very short downtime, and some of the new connections will be directed to it
4\ EVENTUALLY the connections will be balanced across the 4 queue managers
Side information
The client applications are actually instances of an IBM Integration Bus MessageFlow putting messages via an MQOutput node that has a policy pointing to a CCDT.
Problem
Clients will not disconnect as long as they don't hit an error, so it takes a very long time for the connections to be balanced over 4 queue managers after the failure of one of them. During this time the performance on the overloaded queue manager is unacceptable.
Question
Is there a way to configure the system to perform some sort of "dynamic rebalancing" of active connections that is transparent to the application?
Thanks in advance for your help!
Cheers,
F. |
|
|
 |
Vitor |
Posted: Wed Nov 07, 2018 5:12 am Post subject: Re: Is dynamic client conn rebalancing in qmgr group possibl |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
fredmoore wrote: |
Clients will not disconnect as long as they don't hit an error, so it takes a very long time for the connections to be balanced over 4 queue managers after the failure of one of them. During this time the performance on the overloaded queue manager is unacceptable |
I'm surprised that any connections rebalance without disconnecting and reconnecting. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
fredmoore |
Posted: Wed Nov 07, 2018 5:56 am Post subject: |
|
|
Novice
Joined: 23 Mar 2009 Posts: 24
|
Quote: |
I'm surprised that any connections rebalance without disconnecting and reconnecting. |
Let me try to explain it better:
1\ connections are balanced only upon reconnect
2\ because the client logic does not disconnect often enough, it will take a long time for client connections to be evenly distributed again across the 4 queue managers (roughly the time it takes for most of them to disconnect and reconnect once)
Hence my question about dynamic and application transparent rebalancing of "live" connections: is this achievable?
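To put a rough number on "a long time", here is a toy model in Python. It assumes, purely for illustration, that in every interval a fixed fraction of all clients reconnects and that each reconnecting client picks one of the queue managers uniformly at random. Neither assumption is measured from a real system, and the function names are made up for this sketch.

```python
def expected_share_on_recovered(n_qmgrs: int, reconnect_fraction: float, rounds: int) -> float:
    """Expected fraction of all clients on the recovered queue manager.

    Assumption: the recovered queue manager starts with zero connections, and
    in each round `reconnect_fraction` of all clients reconnect and choose
    uniformly among `n_qmgrs` queue managers.
    """
    fair_share = 1.0 / n_qmgrs
    # Fraction of clients that have not reconnected even once after `rounds`.
    never_reconnected = (1.0 - reconnect_fraction) ** rounds
    return fair_share * (1.0 - never_reconnected)


def rounds_until_balanced(n_qmgrs: int, reconnect_fraction: float, target: float = 0.9) -> int:
    """Rounds needed until the recovered queue manager holds `target` of its fair share."""
    rounds = 0
    while expected_share_on_recovered(n_qmgrs, reconnect_fraction, rounds) < target / n_qmgrs:
        rounds += 1
    return rounds


if __name__ == "__main__":
    for fraction in (0.5, 0.1, 0.01):
        print(fraction, rounds_until_balanced(4, fraction))
```

Under these assumptions, if only 1% of the clients reconnect per interval, the recovered queue manager needs 230 intervals to reach 90% of its fair share; at 50% per interval it needs 4.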
TIA,
F. |
|
|
 |
bruce2359 |
Posted: Wed Nov 07, 2018 6:19 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
fredmoore wrote: |
1\ connections are balanced only upon reconnect |
How are you currently implementing connection balancing? What tool?
What asks for balancing? The client app? Something else? _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
fredmoore |
Posted: Wed Nov 07, 2018 6:26 am Post subject: |
|
|
Novice
Joined: 23 Mar 2009 Posts: 24
|
Quote: |
How are you currently implementing connection balancing? What tool? |
No tool other than MQ itself: we have a simple CCDT with four queue managers in it, forming a queue manager group
Quote: |
What asks for balancing? The client app? Something else? |
Balancing (and rebalancing) is required to distribute the load across all available queue managers and as evenly as possible, for performance purposes: it's a busy system. |
|
|
 |
exerk |
Posted: Wed Nov 07, 2018 6:31 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
Set affinities so that when a queue manager does become 'available' again, any clients that disconnect from a non-affinity queue manager will connect to their preference.
For example, if you have 16 clients, set 4 each to have an affinity to one queue manager, and to connect to any available thereafter, or in order of preference to the other three.
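The 16-client example can be sketched in Python as follows. This only works out which preference order each client group would get; it does not write CCDT files, and how an order is expressed in a real CCDT is left open. The queue manager and client names are hypothetical.

```python
from typing import Dict, List


def preference_orders(qmgrs: List[str]) -> List[List[str]]:
    """One rotated preference list per queue manager: [A,B,C,D], [B,C,D,A], ..."""
    return [qmgrs[i:] + qmgrs[:i] for i in range(len(qmgrs))]


def assign_clients(clients: List[str], qmgrs: List[str]) -> Dict[str, List[str]]:
    """Spread clients round-robin over the rotated preference lists."""
    orders = preference_orders(qmgrs)
    return {client: orders[i % len(orders)] for i, client in enumerate(clients)}


if __name__ == "__main__":
    qmgrs = ["QM1", "QM2", "QM3", "QM4"]               # hypothetical names
    clients = [f"client{n:02d}" for n in range(16)]    # hypothetical names
    for client, order in assign_clients(clients, qmgrs).items():
        print(client, "->", order)
```

Each queue manager ends up as first preference for 4 of the 16 clients, and the fallback order differs per group, so a failed queue manager's clients do not all pile onto the same neighbour.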
It does mean having multiple CCDT files spread across 'groups' of clients, but it's about the only way I can think of to do it. _________________ It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys. |
|
|
 |
Vitor |
Posted: Wed Nov 07, 2018 6:49 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
fredmoore wrote: |
Quote: |
How are you currently implementing connection balancing? What tool? |
No tool other than MQ itself: we have a simple CCDT with four queue managers in it, forming a queue manager group
Balancing (and rebalancing) is required to distribute the load across all available queue managers and as evenly as possible, for performance purposes: it's a busy system. |
Nothing (not even an external tool like an F5) will break a connection that an application is maintaining. If the application is using a connection, nothing within the network or MQ will willfully break that, because that's going to cause data loss within the application. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
bruce2359 |
Posted: Wed Nov 07, 2018 7:19 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
I'm puzzled by your initial premise - that a qmgr becomes unavailable.
Does this occur frequently? If so, what are the root causes? Network? Server hardware failure? O/S software failure? MQ internal failure? Is your MQ server software current? Is your MQ client software current?
I've seen far more client-side failures than server-side. Client reconnects can be painfully slow - affecting SLAs. Unless MQ internals have changed, each channel entry in the CCDT is tried alphabetically by channel name, nine times before the next entry is tried. Do you have only one CCDT for all clients? As suggested earlier in this thread, smaller individual CCDTs might improve reconnect. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
fredmoore |
Posted: Wed Nov 07, 2018 7:44 am Post subject: |
|
|
Novice
Joined: 23 Mar 2009 Posts: 24
|
Quote: |
Does this occur frequently? If so, what are the root causes? Network? Server hardware failure? O/S software failure? MQ internal failure? Is your MQ server software current? Is your MQ client software current? |
It does not happen frequently; when it does, it is sometimes simply due to planned hw/sw maintenance, plus a "pinch" of all sorts of unplanned stuff.
The point here was how to exploit available resources evenly, especially when the queue manager comes back again. |
|
|
 |
bruce2359 |
Posted: Wed Nov 07, 2018 8:19 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
fredmoore wrote: |
Quote: |
Does this occur frequently? If so, what are the root causes? Network? Server hardware failure? O/S software failure? MQ internal failure? Is your MQ server software current? Is your MQ client software current? |
It does not happen frequently; when it does, it is sometimes simply due to planned hw/sw maintenance, plus a "pinch" of all sorts of unplanned stuff.
The point here was how to exploit available resources evenly, especially when the queue manager comes back again. |
There is no single "thing" that is going to fix this.
Both client-bindings and server-bindings apps are responsible for behaving, well, responsibly - like connecting or reconnecting, as necessary. Client-bindings apps suffer the slings and arrows of network and small-platform fortunes. Less so with server-bindings apps.
This thread has given you some things to ponder.
Does the SLA tolerate some transaction throughput misses? _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
Vitor |
Posted: Wed Nov 07, 2018 8:50 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
fredmoore wrote: |
The point here was how to exploit available resources evenly, especially when the queue manager comes back again. |
So let me see if I have your requirement straight.
You have 4 queue managers, with a group of applications connected across them. One queue manager becomes unavailable, and the applications in response reconnect to one of the others & restart their work.
When the errant queue manager becomes available again, you're looking for a way to forcibly disconnect the applications (causing whatever they're doing to fail) so they reconnect to their "preferred" queue manager.
Are you looking to do this only to the applications which are on the "wrong" queue manager, or are you happy to abend all the applications and stop all processing until they reconnect? On a "busy system" I'd expect the former. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
bruce2359 |
Posted: Wed Nov 07, 2018 9:25 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
The OP mentioned that apps don't disconnect frequently. So, the app MQCONNects, then waits around for some transaction activity, repeat, repeat - never or seldom disconnecting - like a long-running (redundant) batch job.
Interesting, but bad app design, trying to circumvent (improve on) MQ's plodding CCDT reconnect process should a failure occur. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
Vitor |
Posted: Wed Nov 07, 2018 10:12 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
bruce2359 wrote: |
The OP mentioned that apps don't disconnect frequently. So, the app MQCONNects, then waits around for some transaction activity, repeat, repeat - never or seldom disconnecting - like a long-running (redundant) batch job.
Interesting, but bad app design, trying to circumvent (improve on) MQ's plodding CCDT reconnect process should a failure occur. |
I know, I know - I'm trying to get the OP thinking about what he's asking for and how it can't be fixed at the MQ level
 _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
bruce2359 |
Posted: Wed Nov 07, 2018 1:20 pm Post subject: Re: Is dynamic client conn rebalancing in qmgr group possibl |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
fredmoore wrote: |
Problem
Clients will not disconnect as long as they don't hit an error, |
Please be more precise. By "error" you mean an application logic error? Something else?
fredmoore wrote: |
so it takes a very long time for the connections to be balanced over 4 queue managers after the failure of one of them. During this time the performance on the overloaded queue manager is unacceptable. |
This is different from the app deciding to MQDISC. This is qmgr shutting down.
How long is a "very long time?" And how do you measure how long it takes before the workload appears balanced? How do you know when it is balanced? What tooling do you use? _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
PeterPotkay |
Posted: Wed Nov 07, 2018 5:15 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
Vitor wrote: |
When the errant queue manager becomes available again, you're looking for a way to forcibly disconnect the applications (causing whatever they're doing to fail) so they reconnect to their "preferred" queue manager.
|
This is how DataPower does it with its MQ QM Groups. It is constantly checking for the return to availability of its preferred queue manager and once it sees that, it gracefully ends the connections to the secondary QM and reconnects to the preferred. Works reliably and out of the box. But it's the application (DataPower in this case) that does the rebalancing of connections to the queue managers. _________________ Peter Potkay
Keep Calm and MQ On |
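The fail-back pattern described for DataPower above can be sketched at the application level roughly as follows. This is not DataPower's implementation and it uses no real MQ client API: `connect`, `disconnect` and `is_available` are hypothetical callables the application would have to supply, and the queue manager names are made up. The one design point it tries to show is that the switch happens only when the application itself calls `fail_back_if_possible()`, which it should do between units of work so that nothing in flight is broken.

```python
from typing import Callable, List, Optional


class FailBackClient:
    """Keeps a connection on the most preferred queue manager that is available."""

    def __init__(self, preference: List[str],
                 connect: Callable[[str], object],
                 disconnect: Callable[[object], None],
                 is_available: Callable[[str], bool]) -> None:
        self.preference = preference          # most preferred first
        self._connect = connect               # hypothetical: returns a handle
        self._disconnect = disconnect         # hypothetical: graceful close
        self._is_available = is_available    # hypothetical: health probe
        self.current: Optional[str] = None
        self.handle: Optional[object] = None

    def connect_best(self) -> str:
        """Connect to the first available queue manager in preference order."""
        for name in self.preference:
            if self._is_available(name):
                self.handle = self._connect(name)
                self.current = name
                return name
        raise ConnectionError("no queue manager in the group is available")

    def fail_back_if_possible(self) -> bool:
        """Call between units of work. Returns True if the connection moved."""
        for name in self.preference:
            if name == self.current:
                return False                  # nothing better than where we are
            if self._is_available(name):
                if self.handle is not None:
                    self._disconnect(self.handle)
                self.handle = self._connect(name)
                self.current = name
                return True
        return False
```

A message loop would call `fail_back_if_possible()` after each commit, so only idle connections on a non-preferred queue manager are moved.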
|
|
 |
|