Author |
Message
|
smdavies99 |
Posted: Wed Feb 05, 2014 11:50 am Post subject: DR host failover and channel sequence numbers |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
I have a little question that is puzzling me.
Suppose I have TWO DataCentres A and B. There are a number of Client systems all running WMQ Servers with SDR/RCVR channels to the main DC.
Everything is hunky dory and is working away until someone pulls the plug on the DC and it fails over to the DR site.
A reconfig of the network re-routes the channel connections from the remote site to the DR host.
What happens to the channel sequence number? I guess that it will need resetting at both ends (on the SDR Channels) before data can flow again.
If this is the case, is there any way you can automate this?
If anyone has set up such a system, what did you do about the sequence numbers?
Sure, I can create a little script that does it and execute it remotely. This is where *IX wins over Windows: 'rsh' is a lovely tool.
That is fine for just a few remote QMs, but what happens if we get into the 100s?
Sure there is Windows remote shell but frankly it seems to be a bit of a bodge when compared to 'rsh' especially with SSL OOTB. _________________ WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995
Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions. |
|
|
 |
Vitor |
Posted: Wed Feb 05, 2014 12:51 pm Post subject: Re: DR host failover and channel sequence numbers |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
smdavies99 wrote: |
What happens to the channel sequence number? |
Well it depends how you're keeping the DR site current. If the disk hosting the queue managers is replicated on any kind of real time or near real time schedule then the problem sorts itself. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
smdavies99 |
Posted: Wed Feb 05, 2014 1:02 pm Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
There is no disk replication between the sites. They are 1000 miles apart. The DB replication is going to utilise the majority of the network link between the two sites. |
|
|
 |
mqjeff |
Posted: Wed Feb 05, 2014 1:06 pm Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
sender channel starts will reset the seq #s of receiver channels. |
|
|
 |
Vitor |
Posted: Wed Feb 05, 2014 1:07 pm Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
smdavies99 wrote: |
There is no disk replication between the sites. They are 1000 miles apart. |
So surely the contents of the queues are a larger concern; specifically anything persistent sitting in a queue at the primary site waiting to be processed, which won't be replicated. |
|
|
 |
smdavies99 |
Posted: Wed Feb 05, 2014 1:24 pm Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
The important data on the queues is being replicated host to host.
The DR site is more or less passive. No Broker or application is running.
The messages replicated to it have an expiry time set so they are self-cleaning.
I'm more concerned about the Remote sites. 90% of the data flows are from the Centre to the remotes.
It used to be that in order to get a SDR/RCVR channel working against a different QM you had to reset the sequence number.
Is this no longer the case? |
|
|
 |
exerk |
Posted: Wed Feb 05, 2014 2:05 pm Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
PowerShell for Windows is as powerful as rsh on *NIX. Have a word with your Windows Sys Admins to find out just how powerful. _________________ It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys. |
|
|
 |
fjb_saper |
Posted: Wed Feb 05, 2014 2:05 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
smdavies99 wrote: |
It used to be that in order to get a SDR/RCVR channel working against a different QM you had to reset the sequence number.
Is this no longer the case? |
How about to the same qmgr on a different IP? (The exception being the MI qmgr on its defined IPs.)
It used to be that you had to reinitialize the sequence number if the server changed IP... (change in conname)
Have fun  _________________ MQ & Broker admin |
|
|
 |
zpat |
Posted: Wed Feb 05, 2014 2:32 pm Post subject: |
|
|
 Jedi Council
Joined: 19 May 2001 Posts: 5866 Location: UK
|
The sequence numbers have to match up.
During DR when talking to a different copy of the QM, they won't - until reset.
Although if both the QMs are DR copies - they may match up.
We use real-time disk replication these days. But previously I used to reset the channels with MO71 - takes seconds.
You can even select multiple sender channels and perform a single MO71 action against all of them in one go.
Hardly worth automating. _________________ Well, I don't think there is any question about it. It can only be attributable to human error. This sort of thing has cropped up before, and it has always been due to human error. |
|
|
 |
PeterPotkay |
Posted: Wed Feb 05, 2014 4:41 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
MO71 can change 100s of SNDR channels with a couple of clicks, but only if they are all on the same QM.
smdavies99 says he has 100s of remote QMs. That would take a while and a lot of clicking. _________________ Peter Potkay
Keep Calm and MQ On |
|
|
 |
PeterPotkay |
Posted: Wed Feb 05, 2014 4:49 pm Post subject: Re: DR host failover and channel sequence numbers |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
smdavies99 wrote: |
Suppose I have TWO DataCentres A and B. There are a number of Client systems all running WMQ Servers with SDR/RCVR channels to the main DC.
|
These client systems are in other locations and other data centers, not in Data Center A or Data Center B?
smdavies99 wrote: |
A reconfig of the network re-routes the channel connections from the remote site to the DR host.
What happens to the channel sequence number? I guess that it will need resetting at both ends (on the SDR Channels) before data can flow again.
|
What do you mean "network re-routes the channel connections"?
They are changing MQ channel definitions?
They are changing a DNS name that used to resolve to Data Center A and now resolves to B?
Are the QMs in Data Center B always up and running?
Are the QMs in Data Center B configured manually and similarly to their counterparts in Data Center A, but otherwise there is no replication of any kind automatically doing anything between a QM in A and its partner in B?
Does the QM in B have the same name as its counterpart in A?
Does the QM in B have all the same exact MQ objects (including RCVR channels) as its counterpart in A?
Do the client systems in these other locations have connectivity at all times to Data Center B, or only if A goes away and some firewall is opened up? |
|
|
 |
smdavies99 |
Posted: Thu Feb 06, 2014 1:39 am Post subject: Re: DR host failover and channel sequence numbers |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
PeterPotkay wrote: |
These client systems are in other locations and other data centers, not in Data Center A or Data Center B?
|
They are scattered around a continent.
PeterPotkay wrote: |
What do you mean "network re-routes the channel connections"?
They are changing MQ channel definitions?
They are changing a DNS name that used to resolve to Data Center A and now resolves to B?
|
The DNS route is changed from the main to the DR site.
PeterPotkay wrote: |
Are the QMs in Data Center B always up and running?
Are the QMs in Data Center B configured manually and similarly to their counterparts in Data Center A, but otherwise there is no replication of any kind automatically doing anything between a QM in A and its partner in B?
Does the QM in B have the same name as its counterpart in A?
Does the QM in B have all the same exact MQ objects (including RCVR channels) as its counterpart in A?
Do the client systems in these other locations have connectivity at all times to Data Center B, or only if A goes away and some firewall is opened up? |
The QMs (apart from the one that takes the primary messages) are not running in the DR site. They have identical names and channels to the QMs in the primary site. All the objects are the same.
So you have two identical sites (from a WMQ perspective). When we switch over, the IP routing gets switched over and the QMGRs are started up.
Hence my question about resetting of the channel sequence numbers. I don't see any way round having to do this. We can automate it but it will be a real PITA to test. The client is giving us 1 day for DR failover testing (WTF!!!!!). It is going to be a long day, methinks, because there are a whole bunch of less resilient systems to get failing over first.
We don't have to worry about a few messages/updates going astray. There is a mechanism already in place to resend all the last 'n' messages from the main site. There are a few updates coming up from the remote sites but all of these can have their data entered manually into the Application. |
|
|
 |
zpat |
Posted: Thu Feb 06, 2014 3:52 am Post subject: |
|
|
 Jedi Council
Joined: 19 May 2001 Posts: 5866 Location: UK
|
Automating the remote end resets would mean connecting to each QM in turn and issuing stop/reset/start on the channel.
There are SupportPacs for issuing runmqsc commands remotely that a script could use, I guess.
Or maybe send a message to a triggered queue (via remote queues defined on the hub QM), the triggered script would issue the commands.
Then MO71 multi-select could be used to send the messages in a single action.
Last edited by zpat on Thu Feb 06, 2014 4:21 am; edited 1 time in total |
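A minimal sketch of the stop/reset/start sequence described above, written in Python purely for illustration. It only builds the MQSC text per remote QM; how that text is delivered (remote runmqsc, a SupportPac, or a message to a triggered queue) is left open. The queue manager and channel names are hypothetical, and SEQNUM(1) is an assumed target value.

```python
def reset_sequence_commands(channel, seqnum=1):
    """Return the three MQSC commands for one channel: stop, reset, start."""
    return [
        f"STOP CHANNEL({channel})",
        f"RESET CHANNEL({channel}) SEQNUM({seqnum})",
        f"START CHANNEL({channel})",
    ]


def build_plan(channels_by_qmgr, seqnum=1):
    """Map each remote QM name to the MQSC script that should be run on it.

    channels_by_qmgr is a dict such as {"REMOTE1": ["REMOTE1.TO.HUB"]}
    (hypothetical names). QMs are processed in sorted order so the plan
    is reproducible from run to run.
    """
    plan = {}
    for qmgr in sorted(channels_by_qmgr):
        lines = []
        for channel in channels_by_qmgr[qmgr]:
            lines.extend(reset_sequence_commands(channel, seqnum))
        plan[qmgr] = "\n".join(lines) + "\n"
    return plan
```

One caveat with this sketch: a STOP CHANNEL may not have completed by the time the RESET is processed, so a real script would probably need to check the channel status between the two commands.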
|
|
 |
PeterPotkay |
Posted: Thu Feb 06, 2014 4:12 am Post subject: Re: DR host failover and channel sequence numbers |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
smdavies99 wrote: |
Hence my question about resetting of the channel sequence numbers. I don't see any way round having to do this. |
I agree.
You could code some hokey solution for those remote QMs where if your monitoring tool sees:
A. A channel sequence # error
B. The DNS name resolves to the IP address of the DR site
C. This is the first time you have seen a sequence # error to this IP address
Then auto reset the sequence #, set the flag that this has happened, email smdavies99 that it happened.
This way it auto resets the sequence #s the one time you cutover, but otherwise doesn't auto reset sequence #s that would occur for other reasons.
Kinda hokey, might be a bit of set up work, but would save you work the day of the DR. |
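The A/B/C check above could be sketched roughly as follows, in Python for illustration only. How the monitoring tool detects the sequence number error is not covered here and is passed in as a flag. The host name, IP addresses and function names are all hypothetical, and the reset and email steps are left to the caller.

```python
import socket


def should_auto_reset(seq_error_seen, resolved_ip, dr_ips, handled_ips):
    """Return True only when conditions A, B and C all hold."""
    return bool(
        seq_error_seen                      # A: a channel sequence # error was seen
        and resolved_ip in dr_ips           # B: the DNS name now resolves to the DR site
        and resolved_ip not in handled_ips  # C: first such error against this IP
    )


def resolve(hostname):
    """Resolve the hub's DNS name to an IPv4 address."""
    return socket.gethostbyname(hostname)


def evaluate(seq_error_seen, hostname, dr_ips, handled_ips, resolver=resolve):
    """Decide whether to reset now, and remember that the cutover was handled.

    handled_ips is a mutable set acting as the "this has happened" flag.
    A real tool would have to persist it somewhere that survives restarts.
    """
    ip = resolver(hostname)
    if should_auto_reset(seq_error_seen, ip, dr_ips, handled_ips):
        handled_ips.add(ip)  # set the flag so the auto reset fires only once
        return True
    return False
```

When `evaluate` returns True, the caller would issue the channel reset and send the notification email. Any later sequence error against the same IP returns False and is left for a human to investigate.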
|
|
 |
mqjeff |
Posted: Thu Feb 06, 2014 5:00 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
Ok, my memory is not what it was.
It may be that you need to reset the seq # of a sender channel, rather than merely start it, to reset the seq# of the receiver channel.
I wouldn't solve this problem with any kind of fancy monitoring, or any kind of fancy remote scripting solution.
I would solve this problem with a script that runs as part of qmgr startup on the DR systems. The only time you need to reset these channels is if the queue manager is starting up on the DR. It doesn't fundamentally *hurt* things, except for slowing down the startup time, if you reset the sender channels *every* time the DR qmgr starts up.
Even without using PowerShell, you can use runmqsc, find, echo, etc. on Windows to get a list of channels, extract the channel names, and loop over them to issue reset commands.
But I personally would solve this problem with perl....  |
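A rough sketch of that startup script, in Python here rather than perl or batch. It assumes runmqsc is on the PATH, that the script runs on the DR host as a user allowed to administer the queue manager, and that the sender channels are not yet running when it is called. The function names and the choice of SEQNUM(1) are illustrative assumptions, not anything from the posts above.

```python
import re
import subprocess

# Picks the name out of a "CHANNEL(NAME)   CHLTYPE(SDR)" line of runmqsc output.
CHANNEL_RE = re.compile(r"CHANNEL\(([^)]+)\)")


def sender_channels(display_output):
    """Extract sender channel names from DISPLAY CHANNEL(*) CHLTYPE(SDR) output.

    Skips the echoed command itself (name "*") and the SYSTEM.* defaults.
    """
    names = []
    for line in display_output.splitlines():
        match = CHANNEL_RE.search(line)
        if not match:
            continue
        name = match.group(1).strip()
        if "*" in name or name.startswith("SYSTEM."):
            continue
        names.append(name)
    return names


def reset_script(names, seqnum=1):
    """Build one RESET CHANNEL command per channel name."""
    return "".join(f"RESET CHANNEL({name}) SEQNUM({seqnum})\n" for name in names)


def run_mqsc(qmgr, commands):
    """Pipe MQSC text into runmqsc for a local queue manager; return its stdout."""
    result = subprocess.run(
        ["runmqsc", qmgr], input=commands, capture_output=True, text=True
    )
    return result.stdout


def reset_all_senders(qmgr):
    """List the sender channels on qmgr and reset each one's sequence number."""
    listing = run_mqsc(qmgr, "DISPLAY CHANNEL(*) CHLTYPE(SDR)\n")
    names = sender_channels(listing)
    if names:
        run_mqsc(qmgr, reset_script(names))
    return names
```

Calling `reset_all_senders("HUBQM")` (a hypothetical queue manager name) from whatever starts the DR queue manager would reset every non-SYSTEM sender channel on each DR startup, which matches the point above that doing it every time costs startup time but does no fundamental harm.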
|
|
 |
|