Author |
Message
|
Anant.v |
Posted: Thu May 17, 2018 7:16 am Post subject: How to keep cluster queue sequence number in sync after a DR |
|
|
 Apprentice
Joined: 26 Nov 2014 Posts: 40 Location: Malaysia
|
Hi All,
We are facing many cluster issues after a DR exercise, mainly because the cluster queue sequence number does not remain the same after the failback to PROD from DR. Typically one queue manager ends up with a higher sequence number than the copies held by the other queue managers in the cluster.
Can anyone please suggest if a PROPER SAN replication can fix this? Or is a cluster refresh needed no matter what type of replication is used?
I am trying to figure out whether there is a way to avoid REFRESH CLUSTER after a DR, and whether the sequence numbers can be kept the same by any means.
Thanks to you all for always enlightening me with new ideas! |
|
|
 |
Vitor |
Posted: Thu May 17, 2018 7:23 am Post subject: Re: How to keep cluster queue sequence number in sync after |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
Anant.v wrote: |
Can anyone please suggest if a PROPER SAN replication can fix this |
What kind of improper SAN replication are you currently using?
How are you failing over from Prod to DR and back again? _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
Anant.v |
Posted: Thu May 17, 2018 8:12 am Post subject: |
|
|
 Apprentice
Joined: 26 Nov 2014 Posts: 40 Location: Malaysia
|
The SAN replication was a true synchronization, i.e. it copies all QMGR-related directories so that the state of the cluster is preserved. In short:
1. MQ is replicated to the DR site: synchronous disk array replication of logs and data (including configuration)
2. Stop MQ in PROD
3. Stop disk array replication
4. Start the DR QMGR
The main question I have is: is there a way to preserve the cluster state information? |
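The failover steps above could be written down as an ordered runbook. This is only a minimal sketch: `endmqm -w` and `strmqm` are the standard MQ control commands, but the queue manager name `QM1` and the function name are hypothetical, and the replication step is left as a placeholder because the disk array command depends on the storage vendor.

```python
def failover_runbook(qmgr):
    """Return (where, command) pairs in the order described in the post.

    Step 1 (replication to DR) is assumed to be running already.
    """
    return [
        # Step 2: controlled shutdown of the PROD queue manager, waiting until it has ended.
        ("prod", f"endmqm -w {qmgr}"),
        # Step 3: vendor-specific, so only a placeholder is recorded here.
        ("storage", "# stop disk array replication (vendor-specific, not shown)"),
        # Step 4: start the replicated copy at the DR site.
        ("dr", f"strmqm {qmgr}"),
    ]


steps = failover_runbook("QM1")  # "QM1" is a hypothetical name
for where, command in steps:
    print(f"{where:8} {command}")
```

Failback would run the same sequence in the opposite direction, which is where the replication has to be reversed and caught up before the PROD queue manager is started.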
|
|
 |
Vitor |
Posted: Thu May 17, 2018 8:22 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
Anant.v wrote: |
The main question I have is: is there a way to preserve the cluster state information? |
The way you're doing it. State is persisted by MQ on disc.
You're either missing a directory (unlikely, as it works Prod -> DR) or the replication isn't working DR -> Prod for some reason.
Or some of the disc is not replicating as fast as you think it is. |
|
|
 |
gbaddeley |
Posted: Thu May 17, 2018 4:16 pm Post subject: |
|
|
 Jedi Knight
Joined: 25 Mar 2003 Posts: 2538 Location: Melbourne, Australia
|
What do you mean by "cluster queue sequence number" ? What errors are you experiencing? _________________ Glenn |
|
|
 |
mvic |
Posted: Sun May 20, 2018 6:44 am Post subject: Re: How to keep cluster queue sequence number in sync after |
|
|
 Jedi
Joined: 09 Mar 2004 Posts: 2080
|
Anant.v wrote: |
We are facing many cluster issues after a DR exercise, mainly because the cluster queue sequence number does not remain the same after the failback to PROD from DR. Typically one queue manager ends up with a higher sequence number than the copies held by the other queue managers in the cluster. |
I guess you are looking at amqrfdm output to know this, as sequence numbers are not dumped anywhere else.
But it is normal, when you run REFRESH CLUSTER, for some/all sequence numbers within that qmgr to jump up to the current epoch time.
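To illustrate that point, here is a rough sketch that reads each queue manager's copy of a sequence number as an epoch timestamp and flags the copies that are behind the newest one. It assumes the value is in epoch seconds (the unit is not stated above), and the queue manager names and numbers are invented for the example, not taken from any real dump.

```python
from datetime import datetime, timezone


def describe_sequence_numbers(views):
    """views maps a queue manager name to the sequence number it holds for one cluster object."""
    newest = max(views.values())
    report = {}
    for qmgr, seq in views.items():
        report[qmgr] = {
            "seq": seq,
            # Assumption: the sequence number is epoch seconds.
            "as_utc": datetime.fromtimestamp(seq, tz=timezone.utc).isoformat(),
            "behind_newest": seq < newest,
        }
    return report


# Hypothetical values: QM_REPOS holds a copy that was refreshed later than the others.
views = {"QM_PROD1": 1526540000, "QM_PROD2": 1526540000, "QM_REPOS": 1526800000}
report = describe_sequence_numbers(views)
stale = sorted(q for q, r in report.items() if r["behind_newest"])
print(stale)
```

A copy that decodes to roughly the date and time of the DR exercise would be consistent with a REFRESH CLUSTER having been run at that moment.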
I assume you ran REFRESH CLUSTER in DR during this exercise (this would again be normal).
If you had/have network connectivity from your DR site to your prod site, and the same qmgr names and/or CLUSRCVR channel names in DR and prod, then your prod site will get very confused.
Now you need to run REFRESH CLUSTER on all qmgrs in your prod site which had some "new" version of themselves brought up in DR.
Before doing that, ensure that nothing remains running in your DR site that might continue to try to re-insert itself into your prod cluster. |
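The last two steps could be scripted roughly as follows. This is a sketch under assumptions, not a tested procedure: the cluster name `CLUS1` and the queue manager names are hypothetical, `REPOS(NO)` is chosen here only because it is the MQSC default, and how you discover which DR copies are still running is left to the caller. The function only builds the command lines; it does not execute them.

```python
def refresh_plan(cluster, prod_qmgrs, dr_still_running):
    """Build shell lines that pipe REFRESH CLUSTER into runmqsc for each PROD queue manager.

    Refuses to build a plan while any DR copy of those queue managers is still running.
    """
    running = sorted(q for q in prod_qmgrs if q in dr_still_running)
    if running:
        raise RuntimeError("stop the DR copies first: " + ", ".join(running))
    mqsc = f"REFRESH CLUSTER({cluster}) REPOS(NO)"
    return [f'echo "{mqsc}" | runmqsc {q}' for q in prod_qmgrs]


# Hypothetical names; an empty set means nothing is left running in DR.
plan = refresh_plan("CLUS1", ["QM_PROD1", "QM_PROD2"], dr_still_running=set())
for line in plan:
    print(line)
```

Keeping the DR check in front of the command generation mirrors the ordering in the post: nothing in DR should be able to re-insert itself into the cluster before the refresh is run on PROD.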
|
|
 |
|