SUSPEND RESUME Cluster queue manager with changed IP address
monkeydluffy
Posted: Thu Feb 11, 2016 10:08 pm    Post subject: SUSPEND RESUME Cluster queue manager with changed IP address

Newbie

Joined: 11 Feb 2016
Posts: 9

Hi everybody,

Firstly, I am sorry for the long post. We have been facing an MQ cluster issue, and I thought it would be good to share my experience here and get expert opinion on it.

We have a PROD to DR fail-over test scheduled for next week. We were trying to test the MQ scripts by simulating the fail-over within our UAT and a clone environment.
The setup is as follows:
EAI QMGR: QM1 (Full repository)
Backend Application 2 QMGRs: QM2 (FR) and QM3 (PR)

These three queue managers are in a cluster MYCLUS1.
In the PROD to DR activity, only the App server and MQ server will be cloned to the DR server. EAI will remain in the PROD environment.
So during this simulation, we kept the EAI server in UAT and tried to migrate the App queue managers to a UAT clone server (via a VMware copy of the UAT image).
This requires the cloned app queue managers to be in the cluster in place of the UAT queue managers.

Before testing: QM1 (UAT, FR), QM2 (UAT, FR) and QM3 (UAT, PR)
During testing: QM1 (UAT, FR), QM2 (UAT clone, FR) and QM3 (UAT clone, PR)
As mentioned, the app will be cloned from the PROD to the DR environment; it is a VMware copy of the image.
So QM2 and QM3 will now have a different IP address, but they are still the same queue managers with the same QMIDs.
For this reason, we are suspending the App queue managers and resuming them from the new IP address. We are also changing the IP address of the cluster sender and receiver channels across all 3 queue managers.

Say the IP address of the queue managers:
EAI is in 172.30.9.1
QM2 and QM3 UAT IP address is 172.30.9.2 (port 1415 and 1416 respectively)
UAT clone IP address is 172.30.9.3. Since this is VMware copy, queue manager name and port remain the same.
We followed the steps as detailed in:
WebSphere MQ 7.5.0>WebSphere MQ>Configuring>Configuring a queue manager cluster>Managing WebSphere MQ clusters>Maintaining a queue manager

What we did:
1. SUSPEND QMGR on QM2 UAT. Once MQ error log says successfully processed, we ended the Queue manager.
2. SUSPEND QMGR on QM3 UAT. Once MQ error log says successfully processed, we ended the Queue manager.
3. ALTER EAI CLUSSDR IP address to point to 172.30.9.3 (app UAT clone IP address) replacing the existing 172.30.9.2(app UAT IP address)
4. Start the queue manager QM2 in UAT clone server.
5. Modified the IP address of QM2 CLUSRCVR channel to 172.30.9.3. We did not modify the CLUSSDR IP address to EAI since EAI IP address is unchanged.
6. Start the queue manager QM2 in UAT clone server.
7. Modified the IP address of QM3 CLUSRCVR channel to 172.30.9.3.
8. Modified the IP address of QM3 CLUSSDR channel for QM2 to 172.30.9.3.
9. We did not modify the CLUSSDR IP address to EAI since EAI IP address is unchanged.

What we then saw was that the CLUSSDR channel from EAI to QM2 was retrying, the CLUSSDR from QM2 to QM3 was retrying, the auto cluster sender from QM2 to QM3 was retrying, and the auto cluster sender from EAI to QM3 was retrying.
Upon investigation, we saw that all the IP address changes had taken effect in MQ Explorer, but the error log was still pointing to the previous IP addresses. Even in the cluster section of MQ Explorer we saw the old IP address, while the queue manager section showed the new one.
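For anyone wanting to see the same mismatch from runmqsc rather than MQ Explorer, a rough sketch is below. It compares the locally defined channel with the record the cluster holds for the queue manager. The channel name TO.QM2 is hypothetical (our real channel names are not given in this post); substitute your own CLUSRCVR name.

```mqsc
* On QM2: the locally defined cluster receiver (TO.QM2 is a hypothetical name)
DISPLAY CHANNEL(TO.QM2) CONNAME CLUSTER

* On QM1 (EAI): what the cluster currently records for QM2
DISPLAY CLUSQMGR(QM2) CHANNEL CONNAME STATUS SUSPEND
```

If the two CONNAME values differ, the change made to the local definition has not reached the other cluster members.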
Then we issued SUSPEND QMGR in force mode on the UAT app queue managers (started them, forcibly suspended them, then stopped them), which did not solve anything. Then we issued the REFRESH CLUSTER command on all three queue managers (EAI and clone UAT), and the problem was resolved.
I understand that REFRESH CLUSTER should be issued only in exceptional circumstances. This was UAT so we proceeded anyway, but we cannot do so in the PRODUCTION environment unless advised. During our live PROD to DR fail-over, do we need to follow the REFRESH CLUSTER approach?

I am trying to figure out a cleaner way to do this, keeping the client environment constraints in mind. I will post my findings soon.

I am relatively new to clustering and have been going through the clusters section of the Info Center. I have a query that I could not find an answer to there. Maybe it is mentioned somewhere subtly and I overlooked it.


Last edited by monkeydluffy on Mon Feb 15, 2016 6:22 pm; edited 1 time in total
fjb_saper
Posted: Fri Feb 12, 2016 5:25 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20696
Location: LI,NY

Quote:
Backend Application 2 QMGRs: QM2 (FR) and QM2 (PR)

I do hope this is a typo and should read:
Backend Application 2 QMGRs: QM2 (FR) and QM3 (PR)
monkeydluffy wrote:
What we did:
1. SUSPEND QMGR on QM2 UAT. Once MQ error log says successfully processed, we ended the Queue manager.
2. SUSPEND QMGR on QM3 UAT. Once MQ error log says successfully processed, we ended the Queue manager.
3. ALTER EAI CLUSSDR IP address to point to 172.30.9.3 (app UAT clone IP address) replacing the existing 172.30.9.2(app UAT IP address)
4. Start the queue manager QM2 in UAT clone server.
5. Modified the IP address of QM2 CLUSRCVR channel to 172.30.9.3. We did not modify the CLUSSDR IP address to EAI since EAI IP address is unchanged.
6. Start the queue manager QM2 in UAT clone server.
7. Modified the IP address of QM3 CLUSRCVR channel to 172.30.9.3.
8. Modified the IP address of QM3 CLUSSDR channel for QM2 to 172.30.9.3.
9. We did not modify the CLUSSDR IP address to EAI since EAI IP address is unchanged.

Your procedure is flawed. This is not how you change the address / IP of a cluster channel.
Review the correct procedure in the manuals and implement in UAT. Verify results and implement in PROD.

If I remember correctly, the procedure to change the address of a cluster receiver channel is as follows:

  1. Make sure the cluster sender channel to the full repository works fine.
  2. Suspend the qmgr from the cluster until the change procedure is complete.
  3. Alter the channel: remove the cluster attribute (cluster, clusnl).
  4. Stop the channel, restart the channel.
  5. Alter the channel: update the conname to the correct information.
  6. Stop the channel, start the channel.
  7. Alter the channel: update the cluster information to correctly reintroduce the qmgr to the cluster.
  8. Stop the channel, restart the channel.
  9. Verify on both FRs, using the display clusqmgr command, that the information returned for the channel shows the correct information for the changed channel.
  10. Resume the qmgr in the cluster.
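
As a rough sketch only, the ten steps above could look like this in runmqsc for QM2. The channel names TO.QM1 (the defined CLUSSDR to the full repository) and TO.QM2 (QM2's CLUSRCVR) are hypothetical; the cluster name, address and port are the ones from the first post. Check each command against the manuals for your MQ version before using it.

```mqsc
* Run on QM2. TO.QM1 and TO.QM2 are hypothetical channel names.

* 1. Check the defined cluster sender to the full repository
DISPLAY CHSTATUS(TO.QM1) STATUS

* 2. Suspend the queue manager from the cluster
SUSPEND QMGR CLUSTER(MYCLUS1)

* 3-4. Remove the cluster attribute, then stop and restart the channel
ALTER CHANNEL(TO.QM2) CHLTYPE(CLUSRCVR) CLUSTER(' ')
STOP CHANNEL(TO.QM2)
START CHANNEL(TO.QM2)

* 5-6. Update the conname, then stop and start the channel
ALTER CHANNEL(TO.QM2) CHLTYPE(CLUSRCVR) CONNAME('172.30.9.3(1415)')
STOP CHANNEL(TO.QM2)
START CHANNEL(TO.QM2)

* 7-8. Reintroduce the cluster attribute, then stop and restart the channel
ALTER CHANNEL(TO.QM2) CHLTYPE(CLUSRCVR) CLUSTER(MYCLUS1)
STOP CHANNEL(TO.QM2)
START CHANNEL(TO.QM2)

* 9. Run this one on each FR: check the conname recorded for QM2
DISPLAY CLUSQMGR(QM2) CHANNEL CONNAME STATUS

* 10. Resume the queue manager in the cluster
RESUME QMGR CLUSTER(MYCLUS1)
```

A STOP CHANNEL against a channel that is not active simply reports that and can be ignored here.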



_________________
MQ & Broker admin
monkeydluffy
Posted: Fri Feb 12, 2016 8:22 am

Newbie

Joined: 11 Feb 2016
Posts: 9

Thanks fjb_saper, I will update my findings as per the procedure you have highlighted.
PeterPotkay
Posted: Fri Feb 12, 2016 8:27 am

Poobah

Joined: 15 May 2001
Posts: 7717

Are you using IP addresses or DNS names in your channel definitions?
_________________
Peter Potkay
Keep Calm and MQ On
monkeydluffy
Posted: Sat Feb 13, 2016 8:20 pm

Newbie

Joined: 11 Feb 2016
Posts: 9

Hi Peter,

We are using IP addresses for the cluster channels. We proposed using DNS to the customer when setting up the cluster, but they had some reservations about doing so. Does using DNS simplify these switch-over activities?
fjb_saper
Posted: Sat Feb 13, 2016 9:10 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20696
Location: LI,NY

monkeydluffy wrote:
Hi Peter,

We are using IP addresses for the cluster channels. We proposed using DNS to the customer when setting up the cluster, but they had some reservations about doing so. Does using DNS simplify these switch-over activities?

It does, greatly, as you would not change any cluster definitions. You would just switch the DNS mapping of the host name to the new IP.
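
A minimal sketch of what that looks like in the channel definition, assuming a hypothetical host name qm2.example.com and a hypothetical channel name TO.QM2 (the port is the one quoted for QM2 earlier in the thread):

```mqsc
* One-time change on QM2: advertise a host name instead of an IP address.
* qm2.example.com and TO.QM2 are hypothetical names.
ALTER CHANNEL(TO.QM2) CHLTYPE(CLUSRCVR) CONNAME('qm2.example.com(1415)')

* On a full repository: confirm the host name is what the cluster now records
DISPLAY CLUSQMGR(QM2) CONNAME
```

After that, a move to DR is a DNS change only. A channel that is already running keeps its existing connection, so it still has to be stopped and started (or drop and retry) to connect to the new address.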
_________________
MQ & Broker admin
monkeydluffy
Posted: Sun Feb 14, 2016 8:57 am

Newbie

Joined: 11 Feb 2016
Posts: 9

Hi fjb_saper,

Thanks for your earlier suggestions. I replicated the failover locally using VMware images to understand the behaviour.

To begin with, I have 3 VMware images:
VM1: QM1 (FR), IP address x (say)
VM2: QM2 (FR) and QM3 (PR), IP address y (say)
VM3 (a clone of VM2): QM2 and QM3, IP address z (say)

The VM1 and VM2 queue managers are in a cluster to begin with. My goal is to fail the cluster over from VM2 to VM3 with VM1 left intact.

The steps I followed:

1. SUSPEND QM3(PR) in VM2
2. ALTER CLUSSDR REPOS('') for QM3 in VM2
3. ALTER CLUSRCVR REPOS('') for QM3 in VM2
4. STOP CLUSSDR for QM3( to QM2) in VM2
5. STOP CLUSRCVR for QM3 in VM2. Did not stop. Had to use FORCE.
6. SUSPEND QM2(FR) in VM2
7. ALTER CLUSSDR REPOS('') for QM2 in VM2
8. ALTER CLUSRCVR REPOS('') for QM2 in VM2
9. STOP CLUSSDR for QM2( to QM1) in VM2
10. STOP CLUSRCVR for QM2 in VM2. Did not stop. Had to use FORCE.
12. Change IP address of CLUSSDR in QM1 pointing to QM2. Ip address of VM3 is now given. This channel is stopped after change.
13. In VM3, ALTER CLUSRCVR REPOS('CLUSNAME') for QM2
14. In VM3, ALTER CLUSSDR REPOS('CLUSNAME') for QM2
15. In VM3, ALTER QMGR REPOS ('CLUSNAME') for QM2
16. In VM3, START CLUSSDR and CLUSRCVR for QM2

Now we see that the QM1 CLUSSDR channel to QM2 is retrying.

15. In VM3, RESUME QMGR REPOS ('CLUSNAME') for QM2
This resume did not help; the channel was still retrying.

I had to issue REFRESH CLUSTER on QM1 and QM2 separately to get this sorted.

17. In VM3, ALTER CLUSRCVR REPOS('CLUSNAME') for QM3
18. In VM3, ALTER CLUSSDR REPOS('CLUSNAME') for QM3
19. In VM3, START CLUSSDR and CLUSRCVR for QM2

This time I did not have to issue REFRESH CLUSTER on QM3. All resolution and auto channels were properly created, thereby replicating a successful failover.
(I am not sure whether this can be considered successful or not.)

I then tried switching back from VM3 to VM2, just to ensure that the same set of steps works. However, this time the QM1 to QM3 auto sender channel was not created. I did a REFRESH CLUSTER on QM3 and it got created.

I can say I am only a little more confident than I was yesterday, but it would still be good to know why REFRESH CLUSTER is needed at all in some cases.
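
For anyone following along, the steps above are written in shorthand. In actual MQSC, a channel's cluster membership is set with the CLUSTER (or CLUSNL) attribute on the channel, REPOS is an attribute of the queue manager, and SUSPEND / RESUME take a CLUSTER parameter. A sketch of the command forms, assuming a hypothetical channel name TO.QM3 and the placeholder cluster name CLUSNAME used above:

```mqsc
* Channel membership: CLUSTER on the channel definition
* (TO.QM3 is a hypothetical CLUSRCVR name)
ALTER CHANNEL(TO.QM3) CHLTYPE(CLUSRCVR) CLUSTER(' ')
ALTER CHANNEL(TO.QM3) CHLTYPE(CLUSRCVR) CLUSTER(CLUSNAME)

* Full repository role: REPOS on the queue manager (FR only)
ALTER QMGR REPOS(CLUSNAME)

* Suspend and resume name the cluster
SUSPEND QMGR CLUSTER(CLUSNAME)
RESUME QMGR CLUSTER(CLUSNAME)
```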
fjb_saper
Posted: Mon Feb 15, 2016 4:36 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20696
Location: LI,NY

Ok. So first change the defined cluster sender and make sure it communicates correctly.

Then start the 10 step procedure for the cluster receiver.
Same thing for the FR.

For a retrying channel on a PR / FR, try the following:
Stop the channel. Wait for it to be in stopped status. Start the channel.
If this is not successful, use stop force. If that fails, use stop terminate.
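
In runmqsc that sequence is roughly the following, with TO.QM2 as a hypothetical channel name. Treat it as a sketch and escalate only as far as needed.

```mqsc
* TO.QM2 is a hypothetical channel name; run on the sending queue manager
STOP CHANNEL(TO.QM2)
* Repeat until STATUS shows STOPPED
DISPLAY CHSTATUS(TO.QM2) STATUS
START CHANNEL(TO.QM2)

* Only if the plain stop does not reach STOPPED
STOP CHANNEL(TO.QM2) MODE(FORCE)

* Last resort
STOP CHANNEL(TO.QM2) MODE(TERMINATE)
```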

Hope this helps some.
_________________
MQ & Broker admin
monkeydluffy
Posted: Mon Feb 15, 2016 5:47 pm

Newbie

Joined: 11 Feb 2016
Posts: 9

Hi fjb_saper,

Thanks for the suggestion.
However, I cannot get the CLUSSDR from the PR to the FR working during the switch-over, since this PR is directly connected to the FR whose IP address is also changing. So until this FR has also moved, the CLUSSDR of the moved PR cannot run.

So, as per the steps I did before:

1. We suspend the PR first in PROD, as you kindly suggested: alter the CLUSSDR and CLUSRCVR channels of this PR to take it out of the cluster, stop the CLUSSDR from this PR to the FR, stop the auto CLUSSDR on both FRs to this PR, then stop the CLUSRCVR on this PR. There is now no need to use FORCE since the CLUSSDRs are already stopped.

2. With the PR suspended in PROD but not yet resumed in DR, the same steps are done for the FR that moves from PROD to DR.

3. PROD is cloned to DR.

3. Now the FR is resumed in DR. The channels (stopped at this point, since this is a clone of PROD, where they were stopped) are brought back into the cluster, the conname is changed and they are started.

4. REFRESH CLUSTER is issued on the moved FR. The manual cluster sender from the other FR (the EAI one, which remains in PROD) to this FR was retrying, but it now gets the new IP address and is running.

5. Now the PR is resumed in DR. The channels (stopped at this point, since this is a clone of PROD, where they were stopped) are brought back into the cluster, the conname is changed and they are started.

6. REFRESH CLUSTER is issued on the moved PR. The auto cluster senders from the FRs to this PR get created and run.

I hope I am not straying from the approach you suggested; I had to modify it a little because of the switch-over procedure.

Another thing: the manual specifies that a SUSPEND, RESUME or REFRESH CLUSTER is complete once the SYSTEM.CLUSTER.COMMAND.QUEUE reaches a consistent state. As soon as I execute the command, the error log says it was processed successfully. Is this an indicator of command completion? Or do I need to monitor this queue for 5-10 minutes by enabling queue monitoring, and also watch for OPPROCS of this queue to reach 0, meaning no further internal messages are arriving on it?
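
For reference, this is the kind of check I mean, as a sketch. The MONQ setting is only needed for the last put/get times, and MEDIUM is just an example value I picked.

```mqsc
* Depth and open handles on the cluster command queue
DISPLAY QSTATUS(SYSTEM.CLUSTER.COMMAND.QUEUE) TYPE(QUEUE) CURDEPTH IPPROCS OPPROCS

* Optional: enable queue monitoring so last put/get times are reported
ALTER QLOCAL(SYSTEM.CLUSTER.COMMAND.QUEUE) MONQ(MEDIUM)
DISPLAY QSTATUS(SYSTEM.CLUSTER.COMMAND.QUEUE) TYPE(QUEUE) LPUTDATE LPUTTIME LGETDATE LGETTIME
```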

Many thanks again for the steps you have provided. They were of great help.
fjb_saper
Posted: Mon Feb 15, 2016 9:05 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20696
Location: LI,NY

You can't do that. If the PR you move is attached to the FR you move, you have to fix the moved FR first.

First fix the FR's defined cluster sender. Then fix the cluster receiver as per procedure. You can then fix the attached PR...

Note that you should not need to issue the Refresh Cluster command.

Point 4: Don't issue the refresh cluster. If the channel is in retry, do a stop, wait for it to be in stopped status and start it again (at the sender location). And you'll have to do this using runmqsc because most likely you're going to hit an autodefined channel there.
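
A sketch of how you might locate and bounce such a channel in runmqsc on the sending queue manager. TO.QM3 is a hypothetical channel name; take the real one from the DISPLAY CLUSQMGR output.

```mqsc
* On the sender side (e.g. QM1): channel name and definition type for QM3.
* A DEFTYPE of CLUSSDRA indicates an auto-defined cluster sender.
DISPLAY CLUSQMGR(QM3) CHANNEL DEFTYPE CONNAME STATUS

* List every channel currently in retry
DISPLAY CHSTATUS(*) WHERE(STATUS EQ RETRYING)

* Stop, wait for STOPPED, then start (TO.QM3 is hypothetical)
STOP CHANNEL(TO.QM3)
DISPLAY CHSTATUS(TO.QM3) STATUS
START CHANNEL(TO.QM3)
```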


_________________
MQ & Broker admin
monkeydluffy
Posted: Tue Feb 16, 2016 7:17 am

Newbie

Joined: 11 Feb 2016
Posts: 9

Hi fjb_saper,

Thanks for the reply.

Yes I am fixing the FR before the PR.

As per the steps I mentioned in my previous post, steps 3 and 4 are about fixing the FR first; steps 5 and 6 afterwards fix the PR.

So while bringing down PROD, we will do the PR first, then the FR.
Restoration on the DR server is FR first, then PR.

Thanks once again for your help.