mqdev
Posted: Tue Jun 04, 2019 12:42 pm Post subject: RDQM - floating IP - how to?
Centurion
Joined: 21 Jan 2003 Posts: 136
Hello,
I have 3 nodes in my drbd cluster with IPs as follows:
Node1 - inet 172.24.102.138 netmask 255.255.252.0
Node2 - inet 172.24.101.29 netmask 255.255.252.0
Node3 - inet 172.18.178.195 netmask 255.255.252.0
Based on the above IPs and netmasks, the Floating IP for an RDQM in this cluster would need to fall in the following ranges:
For node1 and node2: 172.24.100.1 (low) -- 172.24.103.254 (high)
For node3: 172.18.176.1 (low) -- 172.18.179.254 (high)
There is no overlap between these ranges - does that mean I cannot assign a Floating IP to an RDQM implemented on this drbd cluster?
Here is what I tried:
On node1 (IP = 172.24.102.138), I tried assigning Floating IP 172.18.179.200, which failed (Floating IP address '172.18.179.200' not in interface 'eth0' subnet.)
Next I tried assigning 172.24.103.200, and it worked (because 172.24.102.138 and 172.24.103.200 are in the same subnet). However, the RDQM is now dysfunctional - it cannot fail over to the 3rd node (172.18.178.195). In fact, it is not failing over at all...
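The subnet test that rdqmint applies can be reproduced outside MQ. A minimal sketch using Python's stdlib ipaddress module (the addresses are the ones from this post; the check shown is simply "is the candidate IP inside the interface's subnet", which is what the error message describes):

```python
import ipaddress

# Interface addresses as configured on the nodes (IP plus netmask).
node1 = ipaddress.ip_interface("172.24.102.138/255.255.252.0")  # net 172.24.100.0/22
node3 = ipaddress.ip_interface("172.18.178.195/255.255.252.0")  # net 172.18.176.0/22

def in_subnet(iface: ipaddress.IPv4Interface, candidate: str) -> bool:
    """True if the candidate floating IP falls inside the interface's subnet."""
    return ipaddress.ip_address(candidate) in iface.network

print(in_subnet(node1, "172.18.179.200"))  # False - rejected by rdqmint on node1
print(in_subnet(node1, "172.24.103.200"))  # True - accepted on node1
print(in_subnet(node3, "172.24.103.200"))  # False - but invalid from node3's side
```

This mirrors the no-overlap situation described above: no single address passes the test on all three nodes.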
Questions:
Do the drbd cluster node IPs need to be arranged in a certain way (so that the Floating IP fits the IP and netmask of all 3 nodes)?
I had success assigning a Floating IP (node IPs 172.24.101.27 / 172.24.103.199 / 172.24.103.79, Floating IP '172.24.103.1') when all nodes were in the same subnet and/or had overlapping IP ranges (as per IP & netmask)...
However, if the drbd nodes are not in the same subnet, it looks like a Floating IP is a moot point... thoughts?
hughson
Posted: Tue Jun 04, 2019 9:55 pm
Padawan
Joined: 09 May 2013 Posts: 1959 Location: Bay of Plenty, New Zealand
In order to use a floating IP address with an RDQM queue manager, all nodes must be in the same subnet.
In addition, all nodes must have the same name for the network interface, in your example 'eth0'.
If you don't intend to use a floating IP address, then it is OK for the nodes of an RDQM queue manager to be in different subnets.
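The same-subnet requirement above can be checked mechanically before creating the queue manager. A sketch (Python stdlib; the three addresses are taken from the original post):

```python
import ipaddress

# Interface address (IP/netmask) of each node in the DRBD/Pacemaker group.
nodes = [
    "172.24.102.138/255.255.252.0",
    "172.24.101.29/255.255.252.0",
    "172.18.178.195/255.255.252.0",
]

# If every interface resolves to the same network, a floating IP is possible.
networks = {ipaddress.ip_interface(n).network for n in nodes}
print(sorted(str(n) for n in networks))  # ['172.18.176.0/22', '172.24.100.0/22']
print(len(networks) == 1)  # False - a floating IP cannot work for this trio
```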
Cheers,
Morag
_________________
Morag Hughson @MoragHughson
IBM MQ Technical Education Specialist
Get your IBM MQ training here!
MQGem Software
mqdev
Posted: Wed Jun 05, 2019 6:14 am Post subject: Thanks for the prompt and clear response - much appreciated!
Centurion
Joined: 21 Jan 2003 Posts: 136
Thanks for the prompt and clear response - much appreciated!
mqdev
Posted: Wed Jun 05, 2019 6:22 am
Centurion
Joined: 21 Jan 2003 Posts: 136
Morag - can you please give some troubleshooting steps for RDQM?
I did assign a Floating IP (172.24.103.200) to the QM, which was Primary on 172.24.102.138 - the rdqmint command itself completed successfully (no errors thrown). However, the QM is hosed from that point onwards - it won't fail over (I tried failing over to the node 172.18.178.195 and it did not work). It won't fail over to the other node, 172.24.101.29, either.
The failover command (rdqmadm -p -m <RDQM Name> -n <node name>) completes successfully (i.e. no errors thrown - it gives a message that the given node is set as Primary for the QM). However, the failover NEVER happens (I have noticed that, in general, it takes a few seconds for the failover to occur - but in this case, it just doesn't happen!). Nothing suspicious in /var/log/messages either... how can I troubleshoot this further to understand why the failover is not happening?
Thanks in advance for your time!
-mqdev
fjb_saper
Posted: Wed Jun 05, 2019 7:46 am
Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
As Morag said:
Do not assign a floating IP if:
- the network connection on the 3 servers is not assigned to the same interface (e.g. eth0)
- the 3 servers are not in the same subnet (look at the Boson subnet calculator for help), even if it is a vlan...
Hope this helps
_________________
MQ & Broker admin
mqdev
Posted: Wed Jun 05, 2019 10:28 am
Centurion
Joined: 21 Jan 2003 Posts: 136
I am looking for troubleshooting steps... my RDQM is currently hosed from the attempt to add the Floating IP, and I would like to resurrect it, if possible.
Going forward, yes - we will ensure the drbd nodes are in the same subnet so that we can use a Floating IP.
Hope that explains why I am asking the question...
hughson
Posted: Wed Jun 05, 2019 7:47 pm
Padawan
Joined: 09 May 2013 Posts: 1959 Location: Bay of Plenty, New Zealand
Have you removed the floating IP address already?
Code:
rdqmint -m <RDQM Name> -d
Cheers,
Morag
mqdev
Posted: Thu Jun 06, 2019 5:04 am
Centurion
Joined: 21 Jan 2003 Posts: 136
Yes (I removed the Floating IP from the RDQM) - still no joy.
The RDQM stays hosed... the RDQM itself is of no concern (this is a Dev env). The bigger payoff for us is learning how to troubleshoot this situation. Any information in this direction would be highly useful...
john_colgrave
Posted: Fri Jun 07, 2019 6:01 am Post subject: Troubleshooting RDQM
Newbie
Joined: 02 Jun 2014 Posts: 7
If you suspect a problem with any of the resources managed by Pacemaker (and for RDQM, failover is managed by Pacemaker), the first thing to do is to run "crm status" and study the output. If you post that output here we can take it from there.
mqdev
Posted: Mon Jun 10, 2019 12:35 pm
Centurion
Joined: 21 Jan 2003 Posts: 136
John/Morag: Please see below the output of the "crm status" command (I have masked the domain name as "bbbbbbbb.com" in the output below).
RDQM1 is the QM where I attempted to attach the Floating IP and then backed it out. I am now not able to move RDQM1 around using the command
rdqmadm -p -m RDQM1 -n lnc3234.bbbbbbbb.com
This command completes without errors, but RDQM1 is not failing over...
==========================================
root@lnc3234 ~
# crm status
Stack: corosync
Current DC: lncb90c.bbbbbbbbb.com (version 1.1.15.linbit-2.0+20160622+e174ec8.el7-e174ec8) - partition with quorum
Last updated: Mon Jun 10 16:29:27 2019 Last change: Mon Jun 10 15:56:15 2019 by root via crm_attribute on lnc3234.bbbbbbbbb.com
3 nodes and 18 resources configured
Online: [ lnc3234.bbbbbbbbb.com lnc3235.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]
Full list of resources:
Master/Slave Set: ms_drbd_rdqm1 [p_drbd_rdqm1]
Masters: [ lncb90c.bbbbbbbbb.com ]
Slaves: [ lnc3234.bbbbbbbbb.com lnc3235.bbbbbbbbb.com ]
p_fs_rdqm1 (ocf::heartbeat:Filesystem): Started lncb90c.bbbbbbbbb.com
p_rdqmx_rdqm1 (ocf::ibm:rdqmx): Started lncb90c.bbbbbbbbb.com
rdqm1 (ocf::ibm:rdqm): Started lncb90c.bbbbbbbbb.com
Master/Slave Set: ms_drbd_rdqm2 [p_drbd_rdqm2]
Masters: [ lnc3235.bbbbbbbbb.com ]
Slaves: [ lnc3234.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]
p_fs_rdqm2 (ocf::heartbeat:Filesystem): Started lnc3235.bbbbbbbbb.com
p_rdqmx_rdqm2 (ocf::ibm:rdqmx): Started lnc3235.bbbbbbbbb.com
rdqm2 (ocf::ibm:rdqm): Started lnc3235.bbbbbbbbb.com
Master/Slave Set: ms_drbd_qm0_ad_us_lnc3234 [p_drbd_qm0_ad_us_lnc3234]
Masters: [ lnc3234.bbbbbbbbb.com ]
Slaves: [ lnc3235.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]
p_fs_qm0_ad_us_lnc3234 (ocf::heartbeat:Filesystem): Started lnc3234.bbbbbbbbb.com
p_rdqmx_qm0_ad_us_lnc3234 (ocf::ibm:rdqmx): Started lnc3234.bbbbbbbbb.com
qm0_ad_us_lnc3234 (ocf::ibm:rdqm): Started lnc3234.bbbbbbbbb.com
Failed Actions:
* p_drbd_rdqm1_monitor_20000 on lnc3234.bbbbbbbbb.com 'master' (8): call=59, status=complete, exitreason='none',
last-rc-change='Tue Jun 4 16:07:27 2019', queued=0ms, exec=0ms
root@lnc3234 ~
#
==============================================================================
root@lnc3235 ~
# crm status
Stack: corosync
Current DC: lncb90c.bbbbbbbbb.com (version 1.1.15.linbit-2.0+20160622+e174ec8.el7-e174ec8) - partition with quorum
Last updated: Mon Jun 10 16:28:55 2019 Last change: Mon Jun 10 15:56:15 2019 by root via crm_attribute on lnc3234.bbbbbbbbb.com
3 nodes and 18 resources configured
Online: [ lnc3234.bbbbbbbbb.com lnc3235.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]
Full list of resources:
Master/Slave Set: ms_drbd_rdqm1 [p_drbd_rdqm1]
Masters: [ lncb90c.bbbbbbbbb.com ]
Slaves: [ lnc3234.bbbbbbbbb.com lnc3235.bbbbbbbbb.com ]
p_fs_rdqm1 (ocf::heartbeat:Filesystem): Started lncb90c.bbbbbbbbb.com
p_rdqmx_rdqm1 (ocf::ibm:rdqmx): Started lncb90c.bbbbbbbbb.com
rdqm1 (ocf::ibm:rdqm): Started lncb90c.bbbbbbbbb.com
Master/Slave Set: ms_drbd_rdqm2 [p_drbd_rdqm2]
Masters: [ lnc3235.bbbbbbbbb.com ]
Slaves: [ lnc3234.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]
p_fs_rdqm2 (ocf::heartbeat:Filesystem): Started lnc3235.bbbbbbbbb.com
p_rdqmx_rdqm2 (ocf::ibm:rdqmx): Started lnc3235.bbbbbbbbb.com
rdqm2 (ocf::ibm:rdqm): Started lnc3235.bbbbbbbbb.com
Master/Slave Set: ms_drbd_qm0_ad_us_lnc3234 [p_drbd_qm0_ad_us_lnc3234]
Masters: [ lnc3234.bbbbbbbbb.com ]
Slaves: [ lnc3235.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]
p_fs_qm0_ad_us_lnc3234 (ocf::heartbeat:Filesystem): Started lnc3234.bbbbbbbbb.com
p_rdqmx_qm0_ad_us_lnc3234 (ocf::ibm:rdqmx): Started lnc3234.bbbbbbbbb.com
qm0_ad_us_lnc3234 (ocf::ibm:rdqm): Started lnc3234.bbbbbbbbb.com
Failed Actions:
* p_drbd_rdqm1_monitor_20000 on lnc3234.bbbbbbbbb.com 'master' (8): call=59, status=complete, exitreason='none',
last-rc-change='Tue Jun 4 16:07:27 2019', queued=0ms, exec=0ms
root@lnc3235 ~
==============================================================================
[root@lncb90c ~]# crm status
Stack: corosync
Current DC: lncb90c.bbbbbbbbb.com (version 1.1.15.linbit-2.0+20160622+e174ec8.el7-e174ec8) - partition with quorum
Last updated: Mon Jun 10 16:28:34 2019 Last change: Mon Jun 10 15:56:15 2019 by root via crm_attribute on lnc3234.bbbbbbbbb.com
3 nodes and 18 resources configured
Online: [ lnc3234.bbbbbbbbb.com lnc3235.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]
Full list of resources:
Master/Slave Set: ms_drbd_rdqm1 [p_drbd_rdqm1]
Masters: [ lncb90c.bbbbbbbbb.com ]
Slaves: [ lnc3234.bbbbbbbbb.com lnc3235.bbbbbbbbb.com ]
p_fs_rdqm1 (ocf::heartbeat:Filesystem): Started lncb90c.bbbbbbbbb.com
p_rdqmx_rdqm1 (ocf::ibm:rdqmx): Started lncb90c.bbbbbbbbb.com
rdqm1 (ocf::ibm:rdqm): Started lncb90c.bbbbbbbbb.com
Master/Slave Set: ms_drbd_rdqm2 [p_drbd_rdqm2]
Masters: [ lnc3235.bbbbbbbbb.com ]
Slaves: [ lnc3234.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]
p_fs_rdqm2 (ocf::heartbeat:Filesystem): Started lnc3235.bbbbbbbbb.com
p_rdqmx_rdqm2 (ocf::ibm:rdqmx): Started lnc3235.bbbbbbbbb.com
rdqm2 (ocf::ibm:rdqm): Started lnc3235.bbbbbbbbb.com
Master/Slave Set: ms_drbd_qm0_ad_us_lnc3234 [p_drbd_qm0_ad_us_lnc3234]
Masters: [ lnc3234.bbbbbbbbb.com ]
Slaves: [ lnc3235.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]
p_fs_qm0_ad_us_lnc3234 (ocf::heartbeat:Filesystem): Started lnc3234.bbbbbbbbb.com
p_rdqmx_qm0_ad_us_lnc3234 (ocf::ibm:rdqmx): Started lnc3234.bbbbbbbbb.com
qm0_ad_us_lnc3234 (ocf::ibm:rdqm): Started lnc3234.bbbbbbbbb.com
Failed Actions:
* p_drbd_rdqm1_monitor_20000 on lnc3234.bbbbbbbbb.com 'master' (8): call=59, status=complete, exitreason='none',
last-rc-change='Tue Jun 4 16:07:27 2019', queued=0ms, exec=0ms
[root@lncb90c ~]#
=========================================================
Last edited by mqdev on Mon Jun 10, 2019 12:44 pm; edited 1 time in total
mqdev
Posted: Mon Jun 10, 2019 12:40 pm
Centurion
Joined: 21 Jan 2003 Posts: 136
The "( 8 ) :" in the output above is actually written without spaces....
Also, we are at MQ v9.1.2.0 on these nodes.
Last edited by mqdev on Mon Jun 10, 2019 12:44 pm; edited 1 time in total
hughson
Posted: Mon Jun 10, 2019 12:42 pm
Padawan
Joined: 09 May 2013 Posts: 1959 Location: Bay of Plenty, New Zealand
mqdev wrote:
in the above is actually "( 8 ) :" without spaces....
Edit your post and check "Disable Smilies in this post" which is just below the edit box.
mqdev
Posted: Mon Jun 10, 2019 12:45 pm
Centurion
Joined: 21 Jan 2003 Posts: 136
hughson wrote:
mqdev wrote:
in the above is actually "( 8 ) :" without spaces....
Edit your post and check "Disable Smilies in this post" which is just below the edit box.
Done... thank you.
mqdev
Posted: Mon Jun 10, 2019 12:51 pm
Centurion
Joined: 21 Jan 2003 Posts: 136
@John, @Morag,
The "crm status" command does indicate a failure with RDQM1 (which is the problematic RDQM). How can I find out exactly what the problem is?
The "last-rc-change='Tue Jun 4 16:07:27 2019'" is the time I attempted to add the Floating IP. It appears my action somehow hosed Pacemaker.
Secondly, why does rdqmadm end normally when failover is not achieved? This matters because we are scripting automation around these commands - if a given command runs successfully but in reality fails to achieve the intended result, our monitoring will go haywire...
root@lnc3235 ~
# dspmq
QMNAME(RDQM1) STATUS(Running elsewhere)
QMNAME(RDQM2) STATUS(Running)
QMNAME(QM0.AD.US.LNC3234) STATUS(Running elsewhere)
root@lnc3235 ~
root@lnc3235 ~
# rdqmadm -p -m RDQM1 -n lnc3235.bbbbbbbbb.com
The preferred replicated data node has been set to 'lnc3235.bbbbbbbbb.com' for
queue manager 'RDQM1'.
root@lnc3235 ~
# echo $?
0
root@lnc3235 ~
#
hughson
Posted: Mon Jun 10, 2019 1:22 pm
Padawan
Joined: 09 May 2013 Posts: 1959 Location: Bay of Plenty, New Zealand
mqdev wrote:
why is the rdqmadm ending normally when failover is not being achieved?
I'll let John, as the RDQM Architect in IBM Hursley, answer the crm status output question. I just wanted to add something about your rdqmadm command.
The rdqmadm command you are issuing sets a preferred node for the queue manager. The resulting move of the queue manager to that node is asynchronous to the command. I suspect that if you display the RDQM (rdqmstatus) you will see that the preferred node has been successfully set, which is why the command completed successfully.
In other words, the command is not "move my queue manager"; the command is "set the preference" - the queue manager will move to the preferred node... if it can. Does that make sense?
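Since the move is asynchronous, automation that wraps rdqmadm should poll for the outcome rather than trust the exit code. A sketch of that idea in Python (the helper names are hypothetical, and the "Node" and "Queue manager status" field names assumed in the parsed output are assumptions about the rdqmstatus format - verify against the actual output of your MQ version):

```python
import subprocess
import time

def qmgr_running_on(status_text: str, node: str) -> bool:
    """Parse 'key: value' lines from rdqmstatus-style output and report
    whether the queue manager is Running on the given node.
    (The field names are assumptions - adjust to your version's output.)"""
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return (fields.get("Node") == node
            and fields.get("Queue manager status") == "Running")

def wait_for_failover(qmgr: str, node: str,
                      timeout: float = 120, interval: float = 5) -> bool:
    """Poll rdqmstatus until the queue manager reports Running on the
    target node, or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        out = subprocess.run(["rdqmstatus", "-m", qmgr],
                             capture_output=True, text=True).stdout
        if qmgr_running_on(out, node):
            return True
        time.sleep(interval)
    return False
```

Monitoring would then alert on wait_for_failover(...) returning False rather than on the rdqmadm exit code.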
Cheers,
Morag