MQSeries.net Forum Index: General IBM MQ Support
MQ 9.1 RDQM DR failover query (page 2 of 2)
mgsantos
Posted: Mon Feb 25, 2019 4:18 am

Newbie

Joined: 22 Feb 2019
Posts: 5

@hughson my test simulated the queue manager processes dying unexpectedly. Below is the status from the primary machine before and after killing them, as well as the status from the secondary.

Note: I changed the server names and queue manager name before posting this.
Server: Primary | Status: Before killing processes

Code:
$ dspmq -o dr -o status -m MQ
QMNAME(MQ)                                             STATUS(Running) DRROLE(Primary)
$ rdqmstatus -m MQ
Queue manager status:                   Running
CPU:                                    0.01%
Memory:                                 106MB
Queue manager file system:              3880MB used, 9.8GB allocated [39%]
DR role:                                Primary
DR status:                              Normal
DR type:                                Synchronous
DR port:                                1482
DR local IP address:                    10.201.64.36
DR remote IP address:                   10.200.128.13
Command '/opt/mqm.91/bin/rdqmstatus' run with sudo.
$ /usr/sbin/drbdadm status
mq role:Primary
  disk:UpToDate
  mqserver2 role:Secondary
    peer-disk:UpToDate


Server: Primary | Status: After killing processes

Code:
$ ps -ef | grep "/opt/mqm.91/bin/" | grep -v "grep" | awk '{print $2}'| xargs kill -9
$ ps -ef | grep amq
mqm        3212 117723  0 11:49 pts/3    00:00:00 grep --color=auto amq
$ dspmq -o dr -o status -m MQ
QMNAME(MQ)                                             STATUS(Ended unexpectedly) DRROLE(Primary)
$ rdqmstatus -m MQ
Queue manager status:                   Ended unexpectedly
Queue manager file system:              3880MB used, 9.8GB allocated [39%]
DR role:                                Primary
DR status:                              Normal
DR type:                                Synchronous
DR port:                                1482
DR local IP address:                    10.201.64.36
DR remote IP address:                   10.200.128.13
Command '/opt/mqm.91/bin/rdqmstatus' run with sudo.
$ /usr/sbin/drbdadm status
mq role:Primary
  disk:UpToDate
  mqserver2 role:Secondary
    peer-disk:UpToDate


Server: Secondary | Status: Before/After killing processes on primary (same values)


Code:
$ dspmq -o dr -o status -m MQ
QMNAME(MQ)                                             STATUS(Ended immediately) DRROLE(Secondary)
$ rdqmstatus -m MQ
Queue manager status:                   Ended immediately
DR role:                                Secondary
DR status:                              Normal
DR type:                                Synchronous
DR port:                                1482
DR local IP address:                    10.200.128.13
DR remote IP address:                   10.201.64.36
Command '/opt/mqm.91/bin/rdqmstatus' run with sudo.
$ /usr/sbin/drbdadm status
mq role:Secondary
  disk:UpToDate
  mqserver1 role:Primary
    peer-disk:UpToDate

$ rdqmdr -m MQ -p
AMQ3763E: Queue manager 'MQ' is already the DR primary on the remote node.
AMQ3769E: Failed to make queue manager 'MQ' the DR primary on this node.
Command '/opt/mqm.91/bin/rdqmdr' run with sudo.
fjb_saper
Posted: Mon Feb 25, 2019 5:51 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20056
Location: LI,NY

Does that mean that in order for DR to become primary you HAVE to switch prod to secondary first? That would mean you didn't lose all control of prod...
_________________
MQ & Broker admin
exerk
Posted: Mon Feb 25, 2019 6:03 am

Jedi Council

Joined: 02 Nov 2006
Posts: 6069

The KC clearly states "...Following the loss of the primary queue manager at the main site, you make the secondary queue manager at the recovery site into the primary and start it...".

There is nothing I can see in the KC that implies, or states, that take-over is automatic, so are you killing the primary, ensuring all replication processes are also 'dead', and setting the secondary to primary?
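
Because the takeover is a manual step, it can help to confirm what the recovery node currently reports before attempting the promotion. The following is only a minimal sketch: `dr_field` is a hypothetical helper (not an MQ command), and it assumes the `Label:   value` layout of the `rdqmstatus` output posted earlier in this thread. The sample text is copied from that output so the snippet runs without MQ installed.

```shell
#!/bin/sh
# dr_field (hypothetical helper): print the value of one "Label:   value"
# line from rdqmstatus-style text read on stdin.
dr_field() {
    awk -F': +' -v key="$1" '$1 == key { print $2 }'
}

# Sample lines copied from the secondary node's output in this thread.
sample='Queue manager status:                   Ended immediately
DR role:                                Secondary
DR status:                              Normal'

role=$(printf '%s\n' "$sample" | dr_field "DR role")
status=$(printf '%s\n' "$sample" | dr_field "DR status")
echo "role=$role status=$status"
```

On a real node the same helper could presumably be fed live output, for example `rdqmstatus -m MQ | dr_field "DR role"`, subject to the same layout assumption.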
_________________
It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys.

mgsantos
Posted: Mon Feb 25, 2019 7:04 am

Newbie

Joined: 22 Feb 2019
Posts: 5

@exerk the exercise simulates losing the queue manager for whatever reason while the operating system is up and running fine. I did not kill the replication processes; I can still see the [drbd*] processes running. Are those the ones you mean?

My next plan is to test with a server shutdown (once I can get someone to do that for me) and see if I can change the secondary queue manager into the primary.
fjb_saper
Posted: Mon Feb 25, 2019 7:32 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20056
Location: LI,NY

I guess he left the replication processes up and that's why he could not switch the secondary to primary...
_________________
MQ & Broker admin
exerk
Posted: Mon Feb 25, 2019 8:53 am

Jedi Council

Joined: 02 Nov 2006
Posts: 6069

mgsantos wrote:
...I did not kill replication processes, I see [drbd*] processes running...

With that running, you will not be able to...

mgsantos wrote:
...see if I can change secondary queue manager into primary...

Please reread the last paragraph of my previous post.
_________________
It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys.

mgsantos
Posted: Mon Feb 25, 2019 11:29 am

Newbie

Joined: 22 Feb 2019
Posts: 5

I am covering the following test scenarios:
1. Planned outage: stop the qmgr on the primary, switch it to secondary, switch the secondary to primary, start it.
2. Unplanned outage:
a. The MQ server is broken for whatever reason, but the server and operating system are up.
b. The server and OS are down.

I have done 1 and am now testing 2a. I know takeover is not automatic, and I understand that the problem now is the replication processes (the drbd* ones). However, I have no idea how to stop them manually. Any thoughts? Probably something with the drbdadm command.
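
For reference, the planned-outage sequence in scenario 1 can be sketched as a dry run. This is a sketch only: `planned_switch_plan` is a hypothetical helper name, it prints the commands rather than running them, and the node labels are just annotations. It assumes `rdqmdr -m <qmgr> -s` makes the queue manager the DR secondary, mirroring the `rdqmdr -m MQ -p` form used earlier in the thread, with `endmqm` and `strmqm` for stop and start.

```shell
#!/bin/sh
# planned_switch_plan (hypothetical helper): print, in order, the commands
# for the planned switchover described in scenario 1. Nothing is executed.
planned_switch_plan() {
    qm="$1"
    echo "primary: endmqm $qm"           # stop the qmgr on the primary
    echo "primary: rdqmdr -m $qm -s"     # switch the old primary to secondary
    echo "secondary: rdqmdr -m $qm -p"   # switch the old secondary to primary
    echo "secondary: strmqm $qm"         # start the qmgr on the new primary
}

planned_switch_plan MQ
```

Printing the plan first makes it easy to review the order of the steps on each node before running anything by hand.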
mgsantos
Posted: Mon Feb 25, 2019 11:37 am

Newbie

Joined: 22 Feb 2019
Posts: 5

Replying to myself:

Well, I guess from an MQ admin perspective I just need to run the rdqmdr commands even if the qmgr is down; there is no need to know anything about the drbdadm stuff... I will validate whether I can do the switch on the second box while the server is down.

Thanks for the help so far.
hughson
Posted: Tue Feb 26, 2019 1:05 am

Grand Master

Joined: 09 May 2013
Posts: 1169
Location: Bay of Plenty, New Zealand

I would suggest that if you are arbitrarily killing processes in the hope of simulating a queue manager failure, that you simply have not killed the correct ones. If you are testing server failure, why not just take out the server, instead of only certain processes? That would be a more valid test in my view.

Cheers,
Morag
_________________
Morag Hughson @MoragHughson
IBM MQ Technical Education Specialist
Get your IBM MQ training here!
MQGem Software
bruce2359
Posted: Tue Feb 26, 2019 4:50 am

Poobah

Joined: 05 Jan 2008
Posts: 8431
Location: US: west coast, almost. Otherwise, enroute.

hughson wrote:
I would suggest that if you are arbitrarily killing processes in the hope of simulating a queue manager failure, that you simply have not killed the correct ones. If you are testing server failure, why not just take out the server, instead of only certain processes? That would be a more valid test in my view.

Cheers,
Morag


Do you do a similar 'kill a random o/s process' to further test your DR strategy?

Much simpler test: reach around the back of the server and pull the power cord.
_________________
There are two types of people in this world:
1) Those that can extrapolate from incomplete data