Author |
Message
|
Studv01 |
Posted: Fri Jan 23, 2015 4:52 am Post subject: My client connections are not failing back automatically |
|
|
Apprentice
Joined: 23 Jan 2015 Posts: 27
|
Team
I have MQ version 7.0 running in production, configured on top of a Veritas HA cluster. Our clients use CCDTs for MQ connectivity. Failover works perfectly; my problem is that when the QMGR on node one goes down, all clients connect to the next available QMGR on node two. But when the QMGR on node one is back online, I have to recycle the client apps to restore traffic to the QMGR on node 1. Can anyone throw some light on automatic client connection failback? |
|
Back to top |
|
 |
PeterPotkay |
Posted: Fri Jan 23, 2015 4:56 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
Do you have one queue manager or two?
The same queue manager managed by a Veritas cluster that can only run on one node or the other (it's a resource of the Veritas cluster) is NOT two queue managers.
Please describe your servers and queue manager(s) more precisely. _________________ Peter Potkay
Keep Calm and MQ On |
|
|
 |
Studv01 |
Posted: Fri Jan 23, 2015 5:34 am Post subject: |
|
|
Apprentice
Joined: 23 Jan 2015 Posts: 27
|
Thanks for looking into this.
We have 2 QMGRs configured, one on each node: Qmgr1 on node1 and qmgr2 on node2. Both queue managers were set up identically, except that each runs on its own server. As I mentioned before, our clients use a CCDT for MQ connectivity. |
|
|
 |
Studv01 |
Posted: Fri Jan 23, 2015 5:42 am Post subject: |
|
|
Apprentice
Joined: 23 Jan 2015 Posts: 27
|
I know this is not magic, and that you can't have answers handy without the actual details.
I was looking for a pointer to where IBM actually talks about failback. So far I have seen documents that talk about failover and HA with failover functionality, but nowhere could I find failback techniques. It's such a pain to recycle tens of our applications whenever we have to bring down queue managers for maintenance. |
|
|
 |
mqjeff |
Posted: Fri Jan 23, 2015 5:45 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
So what you are expecting is the following chain of events:
- client connects to server 1 and executes some work
- client notices that server 1 has failed
- client reconnects to server 2
- client does some work
- client notices that server 2 has failed
- client reconnects to server 1
Yes?
It's not failback, then. It's not even failover. It's reconnection. |
|
|
 |
Studv01 |
Posted: Fri Jan 23, 2015 5:52 am Post subject: |
|
|
Apprentice
Joined: 23 Jan 2015 Posts: 27
|
chain of events:
1. client connects to server 1 and executes some work
2. client notices that server 1 has failed
3. client reconnects to server 2
4. client does some work
5. server 1 is back up online
6. client continues to send requests to server 2
After event 5, even new requests from clients still go to qmgr2. The applications don't even check whether qmgr1 is available.
I need to configure MQ, or need some MQ client tuning suggestions, so that client apps connect automatically to qmgr1 when it is back up online. |
|
|
 |
JosephGramig |
Posted: Fri Jan 23, 2015 9:07 am Post subject: |
|
|
 Grand Master
Joined: 09 Feb 2006 Posts: 1244 Location: Gold Coast of Florida, USA
|
Please be clearer. Are you more precisely saying:
- Client connects to Qmgr1 and executes some work
- Client gets a broken connection to Qmgr1
- Client reconnects to Qmgr2 and executes some work
- Qmgr1 becomes available again
- Client never tries to connect to Qmgr1 again
If so, this is most likely expected.
Explain how the app knows how to connect to either Qmgr1 or Qmgr2. CCDT and QmgrGroups? |
|
|
 |
JosephGramig |
Posted: Fri Jan 23, 2015 9:20 am Post subject: |
|
|
 Grand Master
Joined: 09 Feb 2006 Posts: 1244 Location: Gold Coast of Florida, USA
|
Normal HA behavior:
- Node1 has IP1
- Node2 has IP2
- Clustering software will swing VIP1 between Node1 and Node2 depending on the "Active" node
- Qmgr1 can be on Node1 or Node2, and it is critical that you lock the listener to VIP1 (to disallow connections via IP1 or IP2)
- Qmgr1 is active on Node1
- App1 connects to Qmgr1 on VIP1 on Node1
- Node1 fails and App1 will get a broken connection to Qmgr1
- Clustering software will swing VIP1 to Node2 and fire up Qmgr1 on Node2
- App1 reconnects to Qmgr1 on VIP1 on Node2
- FAILBACK
- Node2 fails (or is failed back) and App1 will get a broken connection to Qmgr1
- Clustering software will swing VIP1 to Node1 and fire up Qmgr1 on Node1
- App1 reconnects to Qmgr1 on VIP1 on Node1
So it depends on your reconnect logic and the version of MQ Client and how you configured the MQ Client.
Yes, I left out your other active Qmgr2 because that more than doubles your fun. |
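The single-queue-manager sequence above can be sketched as a toy simulation. This is only an illustration, not real cluster or MQ code: the `HaCluster` class and its `fail_node` and `resolve` methods are hypothetical names, and the model assumes a two-node active/passive cluster whose listener is locked to VIP1.

```python
class HaCluster:
    """Toy model of an active/passive cluster: one queue manager, one virtual IP."""

    def __init__(self, nodes, vip="VIP1", qmgr="Qmgr1"):
        self.nodes = list(nodes)
        self.vip = vip
        self.qmgr = qmgr
        # Node that currently holds the VIP and runs the queue manager.
        self.active = self.nodes[0]

    def fail_node(self, node):
        """If the active node fails, swing the VIP and queue manager to a standby."""
        if node == self.active:
            standby = [n for n in self.nodes if n != node]
            self.active = standby[0]

    def resolve(self, address):
        """Return (queue manager, node) reachable at the address, or None."""
        if address == self.vip:
            return (self.qmgr, self.active)
        # The listener is locked to the VIP, so the node IPs refuse connections.
        return None


cluster = HaCluster(["Node1", "Node2"])
history = [cluster.resolve("VIP1")]      # App1 connects via VIP1
cluster.fail_node("Node1")               # failover
history.append(cluster.resolve("VIP1"))  # App1 reconnects via VIP1
cluster.fail_node("Node2")               # failback
history.append(cluster.resolve("VIP1"))  # App1 reconnects via VIP1
print(history)
```

In this model the app only ever knows VIP1, so both failover and failback look the same to it: a broken connection followed by a reconnect to the same address.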
|
|
 |
Studv01 |
Posted: Fri Jan 23, 2015 9:52 am Post subject: |
|
|
Apprentice
Joined: 23 Jan 2015 Posts: 27
|
Hi Joseph
Thanks for illustrating a failover scenario;
Here is what I am observing with my apps.
. Qmgr1 with VIP1 and listener1 runs on node1, and qmgr2 with VIP2 and listener2 runs on node2.
. In recent days we had a few situations where we had to take Qmgr1 offline on node1 without failing over. Even during a failover, Qmgr1 goes offline on node1 and then comes online on node2 with VIP1 and the corresponding listener.
. While Qmgr1 is offline on node1, or while it is failing over, client apps look up the next available QMGR, which is qmgr2, and connect to qmgr2 running with VIP2 and listener2.
. From that time on, client apps stay connected to qmgr2 on VIP2 and listener2. We observed this pattern even after we brought Qmgr1 back online on node1 or node2.
. We have to recycle the apps to bring traffic back to normal, that is, to Qmgr1.
. After the recycle it does not matter whether Qmgr1 runs on node1 or node2; the apps establish connections back to Qmgr1.
. Yes, we do use queue manager groups. We specify "*Qmgr1" in the connection details so that the app looks up all available QMGRs in the CCDT. |
|
|
 |
Vitor |
Posted: Fri Jan 23, 2015 10:36 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
Studv01 wrote: |
Hi Joseph
Thanks for illustrating a failover scenario;
Here is what I am observing with my apps.
. Qmgr1 with VIP1 and listener1 runs on node1, and qmgr2 with VIP2 and listener2 runs on node2.
. In recent days we had a few situations where we had to take Qmgr1 offline on node1 without failing over. Even during a failover, Qmgr1 goes offline on node1 and then comes online on node2 with VIP1 and the corresponding listener.
. While Qmgr1 is offline on node1, or while it is failing over, client apps look up the next available QMGR, which is qmgr2, and connect to qmgr2 running with VIP2 and listener2.
. From that time on, client apps stay connected to qmgr2 on VIP2 and listener2. We observed this pattern even after we brought Qmgr1 back online on node1 or node2.
. We have to recycle the apps to bring traffic back to normal, that is, to Qmgr1.
. After the recycle it does not matter whether Qmgr1 runs on node1 or node2; the apps establish connections back to Qmgr1.
. Yes, we do use queue manager groups. We specify "*Qmgr1" in the connection details so that the app looks up all available QMGRs in the CCDT. |
Functioning as designed.
There's nothing in this architecture as you've described it that will alert the clients that Qmgr1 is now back on line and they should abandon Qmgr2. Hence they remain connected to QMgr2 because it's all fine.
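A minimal sketch of why the clients stay put, written as a toy simulation rather than real MQ client code. It assumes the client tries its CCDT entries in a fixed order and only consults the list when it has no working connection; the `Client` class and `ensure_connected` method are hypothetical names used for illustration.

```python
class Client:
    """Toy client that walks an ordered CCDT-like list only when it has no connection."""

    def __init__(self, ccdt):
        self.ccdt = list(ccdt)      # ordered candidates, e.g. ["Qmgr1", "Qmgr2"]
        self.connected_to = None

    def ensure_connected(self, available):
        # A healthy connection is never re-evaluated; the list is consulted
        # only when there is no connection or the current one is broken.
        if self.connected_to in available:
            return self.connected_to
        self.connected_to = next((q for q in self.ccdt if q in available), None)
        return self.connected_to


client = Client(["Qmgr1", "Qmgr2"])
timeline = [
    client.ensure_connected({"Qmgr1", "Qmgr2"}),  # both up: picks Qmgr1
    client.ensure_connected({"Qmgr2"}),           # Qmgr1 down: moves to Qmgr2
    client.ensure_connected({"Qmgr1", "Qmgr2"}),  # Qmgr1 back: nothing breaks, so no move
]
print(timeline)
```

The third step is the situation described in this thread: Qmgr1 is available again, but the connection to Qmgr2 is still healthy, so nothing prompts the client to look at the list again.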
If you want them to fail back you need to (as you've discovered) make the clients reestablish a connection. An alternative (and maybe it's better, maybe it's not) is that when you restart Qmgr1, you stop all or some of the SVRCONN channels and make the clients reconnect.
You've got load balancing (between Qmgr1 & QMgr2) as well as fail over. You need to account for that. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
fjb_saper |
Posted: Sat Jan 24, 2015 3:13 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
Do not restart the app. If you want the app to fail back to qmgr1, just make qmgr2 unavailable to the app; an example would be bouncing it, or just stopping the svrconn channels used by the app.
Remember to start those channels after failover of the app!!!
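The procedure can be sketched as a toy simulation, assuming clients reconnect in CCDT order when their connection breaks. Nothing here talks to a real queue manager: `connect` and `fail_back` are hypothetical helpers, and steps 1 and 3 stand in for what an administrator would do on Qmgr2 with the MQSC `STOP CHANNEL` and `START CHANNEL` commands against the SVRCONN channel the apps use.

```python
def connect(ccdt, accepting):
    """Return the first queue manager in CCDT order that accepts connections."""
    return next((q for q in ccdt if q in accepting), None)


def fail_back(clients, ccdt, accepting, preferred="Qmgr1", other="Qmgr2"):
    """Move clients from `other` to `preferred` without restarting them."""
    if preferred not in accepting:
        raise RuntimeError(f"{preferred} is not accepting connections yet")
    # Step 1: stop the server-connection channel on the other queue manager.
    accepting = accepting - {other}
    # Step 2: each affected client sees a broken connection and reconnects in CCDT order.
    for name, qmgr in clients.items():
        if qmgr not in accepting:
            clients[name] = connect(ccdt, accepting)
    # Step 3: start the channel again so the other queue manager is ready for the next failover.
    accepting = accepting | {other}
    return clients, accepting


clients = {"app1": "Qmgr2", "app2": "Qmgr2"}
clients, accepting = fail_back(clients, ["Qmgr1", "Qmgr2"], {"Qmgr1", "Qmgr2"})
print(clients, sorted(accepting))
```

The guard at the top reflects the ordering that matters: Qmgr1 has to be accepting connections before Qmgr2 is made unavailable, otherwise the apps have nowhere to go.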
Have fun  _________________ MQ & Broker admin |
|
|
 |
|