RDQM - floating IP - how to?
mqdev
Posted: Tue Jun 04, 2019 12:42 pm    Post subject: RDQM - floating IP - how to?

Centurion

Joined: 21 Jan 2003
Posts: 131

Hello,
I have 3 nodes in my drbd cluster with IPs as follows:

Node1 - inet 172.24.102.138 netmask 255.255.252.0
Node2 - inet 172.24.101.29 netmask 255.255.252.0
Node3 - inet 172.18.178.195 netmask 255.255.252.0

Based on the above netmask and IPs, the floating IP for an RDQM in this cluster needs to be in one of the following ranges:

For node1 and node2 : 172.24.100.1 (low) -- 172.24.103.254 (high)
For node3 : 172.18.176.1 (low) -- 172.18.179.254 (high).

The two ranges do not overlap - does this mean I cannot assign a floating IP to an RDQM implemented on this DRBD cluster?

Here is what I tried:
On node1 (IP = 172.24.102.138), I tried assigning floating IP 172.18.179.200, which failed (Floating IP address '172.18.179.200' not in interface 'eth0' subnet.)

Next I tried assigning 172.24.103.200, which worked (because 172.24.102.138 and 172.24.103.200 are in the same subnet). However, the RDQM is now dysfunctional - it cannot fail over to the 3rd node (172.18.178.195). In fact, it is not failing over at all...

Questions:
Do the DRBD cluster node IPs need to be set up in a certain way (so that the floating IP matches the IP and netmask of all 3 nodes)?

I had success assigning a floating IP (node IPs - 172.24.101.27/172.24.103.199/172.24.103.79, floating IP - '172.24.103.1') when all nodes were in the same subnet, i.e. had overlapping IP ranges (as per IP & netmask)...

However, if the DRBD nodes are not in the same subnet, it looks like a floating IP is a moot point... thoughts?
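
The range arithmetic in this post can be checked with Python's standard ipaddress module. This is only a sketch: the node addresses and netmask are the ones quoted above, while the helper names (subnet_of, usable_range, nodes_accepting) are made up for illustration and have nothing to do with how rdqmint validates an address internally.

```python
import ipaddress

# Node addresses and netmask copied from the post above.
NODES = {
    "Node1": "172.24.102.138/255.255.252.0",
    "Node2": "172.24.101.29/255.255.252.0",
    "Node3": "172.18.178.195/255.255.252.0",
}


def subnet_of(addr_with_mask):
    """Return the network that an 'address/netmask' string belongs to."""
    return ipaddress.ip_interface(addr_with_mask).network


def usable_range(network):
    """Return (lowest, highest) usable host address of an IPv4 network."""
    return network.network_address + 1, network.broadcast_address - 1


def nodes_accepting(floating_ip, nodes=NODES):
    """Return the names of the nodes whose subnet contains floating_ip."""
    ip = ipaddress.ip_address(floating_ip)
    return [name for name, addr in nodes.items() if ip in subnet_of(addr)]


if __name__ == "__main__":
    for name, addr in NODES.items():
        net = subnet_of(addr)
        low, high = usable_range(net)
        print(f"{name}: subnet {net}, usable {low} - {high}")
    # The two candidate floating IPs tried in the post.
    for candidate in ("172.24.103.200", "172.18.179.200"):
        print(candidate, "is inside the subnet of:", nodes_accepting(candidate))
```

A candidate floating IP is only usable on every node if nodes_accepting returns all three node names; with the addresses above, no single address can satisfy that.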

hughson
Posted: Tue Jun 04, 2019 9:55 pm

Grand Master

Joined: 09 May 2013
Posts: 1185
Location: Bay of Plenty, New Zealand

In order to use a floating IP address with an RDQM queue manager, all nodes must be in the same subnet.

In addition, all nodes must have the same name for the network interface, in your example 'eth0'.

If you don't intend to use a floating IP address, then it is OK for the nodes of an RDQM queue manager to be in different subnets.

Cheers,
Morag
_________________
Morag Hughson @MoragHughson
IBM MQ Technical Education Specialist
Get your IBM MQ training here!
MQGem Software

mqdev
Posted: Wed Jun 05, 2019 6:14 am

Thanks for the prompt and clear response - much appreciated!

mqdev
Posted: Wed Jun 05, 2019 6:22 am

Morag - can you please give some troubleshooting steps for RDQM?

I did assign a floating IP (172.24.103.200) to the QM, which was Primary on 172.24.102.138 - the rdqmint command itself completed successfully (no errors thrown). However, the QM has been hosed from that point onwards - it won't fail over. I tried failing over to node 172.18.178.195 and it did not work, but it won't fail over to the other node, 172.24.101.29, either.

The failover command ( rdqmadm -p -m <RDQM Name> -n <node name> ) completes successfully (i.e. no errors thrown - it gives a msg that the given node is set as Primary for the QM). However, the failover NEVER happens (I have noticed that, in general, it takes a few seconds for the failover to occur - but in this case, it just doesn't happen!). There is nothing suspicious in /var/log/messages either... how can I troubleshoot this further to understand why the failover is not happening?

Thanks in advance for your time!
-mqdev

fjb_saper
Posted: Wed Jun 05, 2019 7:46 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20072
Location: LI,NY

As Morag said, do not assign a floating IP if:
  • the network connections on all 3 servers are not assigned to the same interface name (e.g. eth0)
  • all 3 servers are not in the same subnet (look at the Boson subnet calculator for help), even if it is a VLAN...


Hope this helps

_________________
MQ & Broker admin

mqdev
Posted: Wed Jun 05, 2019 10:28 am

I am looking for troubleshooting steps... as my RDQM is currently hosed due to trying to add the floating IP... I would like to resurrect it, if possible.

Going forward, yes - we will ensure the DRBD nodes are in the same subnet so that we can use a floating IP.

Hope that explains why I am asking the question...

hughson
Posted: Wed Jun 05, 2019 7:47 pm

Have you removed the floating IP address already?

Code:
rdqmint -m <RDQM Name> -d


Cheers,
Morag

mqdev
Posted: Thu Jun 06, 2019 5:04 am

Yes, I removed the floating IP from the RDQM - still no joy.
The RDQM stays hosed... The RDQM itself is no concern (this is a Dev env). The bigger payoff for us is to learn how to troubleshoot this situation. Any information in this direction would be highly useful...

john_colgrave
Posted: Fri Jun 07, 2019 6:01 am    Post subject: Troubleshooting RDQM

Newbie

Joined: 02 Jun 2014
Posts: 5

If you suspect a problem with any of the resources managed by Pacemaker (and for RDQM, failover is managed by Pacemaker), the first thing to do is to run "crm status" and study the output. If you post that output here we can take it from there.
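
For anyone wanting to script that first look, here is a rough Python sketch that pulls the entries out of the "Failed Actions:" section of saved "crm status" text. It assumes the line layout printed by the Pacemaker 1.1.15 build whose output appears later in this thread; other Pacemaker versions may format the section differently, so treat the regular expression as an assumption. The function name failed_actions and the SAMPLE text are illustrative only.

```python
import re

# One entry in the "Failed Actions:" section, in the layout seen in this thread:
# * <resource>_<op>_<interval> on <node> '<result>' (<rc>): call=..., status=...
FAILED_ACTION = re.compile(
    r"^\*\s+(?P<action>\S+)\s+on\s+(?P<node>\S+)\s+'(?P<result>[^']*)'\s+\((?P<rc>\d+)\)"
)

# Shortened sample in the same layout as the output posted in this thread.
SAMPLE = """\
Online: [ lnc3234.bbbbbbbbb.com lnc3235.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]

Failed Actions:
* p_drbd_rdqm1_monitor_20000 on lnc3234.bbbbbbbbb.com 'master' (8): call=59, status=complete, exitreason='none',
last-rc-change='Tue Jun 4 16:07:27 2019', queued=0ms, exec=0ms
"""


def failed_actions(crm_status_text):
    """Return one dict per entry listed under 'Failed Actions:'."""
    found = []
    in_section = False
    for line in crm_status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Failed Actions:"):
            in_section = True
            continue
        if in_section:
            match = FAILED_ACTION.match(stripped)
            if match:
                entry = match.groupdict()
                entry["rc"] = int(entry["rc"])  # numeric return code of the action
                found.append(entry)
    return found


if __name__ == "__main__":
    for entry in failed_actions(SAMPLE):
        print(entry)
```

An empty list means the section was absent or had no entries in the expected layout; it does not by itself prove the cluster is healthy.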

mqdev
Posted: Mon Jun 10, 2019 12:35 pm

John/Morag: Please see below the output of the "crm status" command (I have masked the domain name as "bbbbbbbbb.com" in the output below).


RDQM1 is the QM where I attempted to attach the floating IP and then backed out. I am now not able to "move" RDQM1 around using the command

rdqmadm -p -m RDQM1 -n lnc3234.bbbbbbbbb.com

This command completes without errors, but RDQM1 is not failing over...

==========================================


root@lnc3234 ~
# crm status
Stack: corosync
Current DC: lncb90c.bbbbbbbbb.com (version 1.1.15.linbit-2.0+20160622+e174ec8.el7-e174ec8) - partition with quorum
Last updated: Mon Jun 10 16:29:27 2019 Last change: Mon Jun 10 15:56:15 2019 by root via crm_attribute on lnc3234.bbbbbbbbb.com

3 nodes and 18 resources configured

Online: [ lnc3234.bbbbbbbbb.com lnc3235.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]

Full list of resources:

Master/Slave Set: ms_drbd_rdqm1 [p_drbd_rdqm1]
Masters: [ lncb90c.bbbbbbbbb.com ]
Slaves: [ lnc3234.bbbbbbbbb.com lnc3235.bbbbbbbbb.com ]
p_fs_rdqm1 (ocf::heartbeat:Filesystem): Started lncb90c.bbbbbbbbb.com
p_rdqmx_rdqm1 (ocf::ibm:rdqmx): Started lncb90c.bbbbbbbbb.com
rdqm1 (ocf::ibm:rdqm): Started lncb90c.bbbbbbbbb.com
Master/Slave Set: ms_drbd_rdqm2 [p_drbd_rdqm2]
Masters: [ lnc3235.bbbbbbbbb.com ]
Slaves: [ lnc3234.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]
p_fs_rdqm2 (ocf::heartbeat:Filesystem): Started lnc3235.bbbbbbbbb.com
p_rdqmx_rdqm2 (ocf::ibm:rdqmx): Started lnc3235.bbbbbbbbb.com
rdqm2 (ocf::ibm:rdqm): Started lnc3235.bbbbbbbbb.com
Master/Slave Set: ms_drbd_qm0_ad_us_lnc3234 [p_drbd_qm0_ad_us_lnc3234]
Masters: [ lnc3234.bbbbbbbbb.com ]
Slaves: [ lnc3235.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]
p_fs_qm0_ad_us_lnc3234 (ocf::heartbeat:Filesystem): Started lnc3234.bbbbbbbbb.com
p_rdqmx_qm0_ad_us_lnc3234 (ocf::ibm:rdqmx): Started lnc3234.bbbbbbbbb.com
qm0_ad_us_lnc3234 (ocf::ibm:rdqm): Started lnc3234.bbbbbbbbb.com

Failed Actions:
* p_drbd_rdqm1_monitor_20000 on lnc3234.bbbbbbbbb.com 'master' (8): call=59, status=complete, exitreason='none',
last-rc-change='Tue Jun 4 16:07:27 2019', queued=0ms, exec=0ms


root@lnc3234 ~
#


==============================================================================

root@lnc3235 ~
# crm status
Stack: corosync
Current DC: lncb90c.bbbbbbbbb.com (version 1.1.15.linbit-2.0+20160622+e174ec8.el7-e174ec8) - partition with quorum
Last updated: Mon Jun 10 16:28:55 2019 Last change: Mon Jun 10 15:56:15 2019 by root via crm_attribute on lnc3234.bbbbbbbbb.com

3 nodes and 18 resources configured

Online: [ lnc3234.bbbbbbbbb.com lnc3235.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]

Full list of resources:

Master/Slave Set: ms_drbd_rdqm1 [p_drbd_rdqm1]
Masters: [ lncb90c.bbbbbbbbb.com ]
Slaves: [ lnc3234.bbbbbbbbb.com lnc3235.bbbbbbbbb.com ]
p_fs_rdqm1 (ocf::heartbeat:Filesystem): Started lncb90c.bbbbbbbbb.com
p_rdqmx_rdqm1 (ocf::ibm:rdqmx): Started lncb90c.bbbbbbbbb.com
rdqm1 (ocf::ibm:rdqm): Started lncb90c.bbbbbbbbb.com
Master/Slave Set: ms_drbd_rdqm2 [p_drbd_rdqm2]
Masters: [ lnc3235.bbbbbbbbb.com ]
Slaves: [ lnc3234.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]
p_fs_rdqm2 (ocf::heartbeat:Filesystem): Started lnc3235.bbbbbbbbb.com
p_rdqmx_rdqm2 (ocf::ibm:rdqmx): Started lnc3235.bbbbbbbbb.com
rdqm2 (ocf::ibm:rdqm): Started lnc3235.bbbbbbbbb.com
Master/Slave Set: ms_drbd_qm0_ad_us_lnc3234 [p_drbd_qm0_ad_us_lnc3234]
Masters: [ lnc3234.bbbbbbbbb.com ]
Slaves: [ lnc3235.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]
p_fs_qm0_ad_us_lnc3234 (ocf::heartbeat:Filesystem): Started lnc3234.bbbbbbbbb.com
p_rdqmx_qm0_ad_us_lnc3234 (ocf::ibm:rdqmx): Started lnc3234.bbbbbbbbb.com
qm0_ad_us_lnc3234 (ocf::ibm:rdqm): Started lnc3234.bbbbbbbbb.com

Failed Actions:
* p_drbd_rdqm1_monitor_20000 on lnc3234.bbbbbbbbb.com 'master' (8): call=59, status=complete, exitreason='none',
last-rc-change='Tue Jun 4 16:07:27 2019', queued=0ms, exec=0ms


root@lnc3235 ~

==============================================================================

[root@lncb90c ~]# crm status
Stack: corosync
Current DC: lncb90c.bbbbbbbbb.com (version 1.1.15.linbit-2.0+20160622+e174ec8.el7-e174ec8) - partition with quorum
Last updated: Mon Jun 10 16:28:34 2019 Last change: Mon Jun 10 15:56:15 2019 by root via crm_attribute on lnc3234.bbbbbbbbb.com

3 nodes and 18 resources configured

Online: [ lnc3234.bbbbbbbbb.com lnc3235.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]

Full list of resources:

Master/Slave Set: ms_drbd_rdqm1 [p_drbd_rdqm1]
Masters: [ lncb90c.bbbbbbbbb.com ]
Slaves: [ lnc3234.bbbbbbbbb.com lnc3235.bbbbbbbbb.com ]
p_fs_rdqm1 (ocf::heartbeat:Filesystem): Started lncb90c.bbbbbbbbb.com
p_rdqmx_rdqm1 (ocf::ibm:rdqmx): Started lncb90c.bbbbbbbbb.com
rdqm1 (ocf::ibm:rdqm): Started lncb90c.bbbbbbbbb.com
Master/Slave Set: ms_drbd_rdqm2 [p_drbd_rdqm2]
Masters: [ lnc3235.bbbbbbbbb.com ]
Slaves: [ lnc3234.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]
p_fs_rdqm2 (ocf::heartbeat:Filesystem): Started lnc3235.bbbbbbbbb.com
p_rdqmx_rdqm2 (ocf::ibm:rdqmx): Started lnc3235.bbbbbbbbb.com
rdqm2 (ocf::ibm:rdqm): Started lnc3235.bbbbbbbbb.com
Master/Slave Set: ms_drbd_qm0_ad_us_lnc3234 [p_drbd_qm0_ad_us_lnc3234]
Masters: [ lnc3234.bbbbbbbbb.com ]
Slaves: [ lnc3235.bbbbbbbbb.com lncb90c.bbbbbbbbb.com ]
p_fs_qm0_ad_us_lnc3234 (ocf::heartbeat:Filesystem): Started lnc3234.bbbbbbbbb.com
p_rdqmx_qm0_ad_us_lnc3234 (ocf::ibm:rdqmx): Started lnc3234.bbbbbbbbb.com
qm0_ad_us_lnc3234 (ocf::ibm:rdqm): Started lnc3234.bbbbbbbbb.com

Failed Actions:
* p_drbd_rdqm1_monitor_20000 on lnc3234.bbbbbbbbb.com 'master' (8): call=59, status=complete, exitreason='none',
last-rc-change='Tue Jun 4 16:07:27 2019', queued=0ms, exec=0ms

[root@lncb90c ~]#

=========================================================


Last edited by mqdev on Mon Jun 10, 2019 12:44 pm; edited 1 time in total

mqdev
Posted: Mon Jun 10, 2019 12:40 pm

The smiley in the above is actually "( 8 ) :" without spaces....

Also, we are at MQ v9.1.2.0 on these nodes.


Last edited by mqdev on Mon Jun 10, 2019 12:44 pm; edited 1 time in total

hughson
Posted: Mon Jun 10, 2019 12:42 pm

mqdev wrote:
The smiley in the above is actually "( 8 ) :" without spaces....

Edit your post and check "Disable Smilies in this post" which is just below the edit box.

mqdev
Posted: Mon Jun 10, 2019 12:45 pm

hughson wrote:
mqdev wrote:
The smiley in the above is actually "( 8 ) :" without spaces....

Edit your post and check "Disable Smilies in this post" which is just below the edit box.


Done..thank you.

mqdev
Posted: Mon Jun 10, 2019 12:51 pm

@John, @Morag,

The "crm status" command does indicate a failure with RDQM1 (which is the problematic RDQM). How can I find out exactly what the problem is?

The "last-rc-change='Tue Jun 4 16:07:27 2019'" - this is the time I attempted to add the floating IP. Somehow my action hosed Pacemaker, it appears.

Secondly, why is the rdqmadm ending normally when failover is not being achieved? This is important because we are scripting automation around these commands - if a given command runs successfully but in reality fails to achieve the intended result, our monitoring will go haywire....

root@lnc3235 ~
# dspmq
QMNAME(RDQM1) STATUS(Running elsewhere)
QMNAME(RDQM2) STATUS(Running)
QMNAME(QM0.AD.US.LNC3234) STATUS(Running elsewhere)

root@lnc3235 ~


root@lnc3235 ~
# rdqmadm -p -m RDQM1 -n lnc3235.bbbbbbbbb.com
The preferred replicated data node has been set to 'lnc3235.bbbbbbbbb.com' for
queue manager 'RDQM1'.

root@lnc3235 ~
# echo $?
0

root@lnc3235 ~
#

hughson
Posted: Mon Jun 10, 2019 1:22 pm

mqdev wrote:
why is the rdqmadm ending normally when failover is not being achieved?

I'll let John, as the RDQM Architect in IBM Hursley, answer the crm status output question. I just wanted to add something about your rdqmadm command.

The rdqmadm command you are issuing is to set a preferred node for the queue manager. The fact that this can cause the queue manager to move to that node is asynchronous to the command. I suspect if you display the RDQM (rdqmstatus) you will see that the preferred node has been successfully set. This will be why the command completed successfully.

In other words, the command is not "move my Qmgr"; the command is "set the preference" - the queue manager will move to the preferred node ... if it can. Does that make sense?

Cheers,
Morag
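
Since the exit code of rdqmadm -p only confirms that the preference was stored, automation that needs to know whether the queue manager actually moved has to check separately. One possible approach, sketched below in Python, is to poll dspmq on the target node until the queue manager reports STATUS(Running) or a timeout expires. The timeout and interval values, and the function names parse_dspmq, run_dspmq and wait_until_running, are assumptions for illustration; the status strings are the ones shown in the dspmq output earlier in this thread.

```python
import re
import subprocess
import time

# Matches lines such as: QMNAME(RDQM1) STATUS(Running elsewhere)
STATUS_LINE = re.compile(r"QMNAME\((?P<name>[^)]+)\)\s+STATUS\((?P<status>[^)]+)\)")


def parse_dspmq(output):
    """Map queue manager name -> status string from dspmq output."""
    return {m.group("name"): m.group("status") for m in STATUS_LINE.finditer(output)}


def run_dspmq():
    """Run dspmq on the local node and return its standard output."""
    return subprocess.run(
        ["dspmq"], capture_output=True, text=True, check=True
    ).stdout


def wait_until_running(qmgr, timeout=120.0, interval=5.0,
                       get_output=run_dspmq, sleep=time.sleep):
    """Poll until qmgr shows STATUS(Running) on this node; False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if parse_dspmq(get_output()).get(qmgr) == "Running":
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Intended to run on the node just set as preferred; "RDQM1" is the
    # queue manager name used in this thread.
    moved = wait_until_running("RDQM1")
    print("RDQM1 running here" if moved else "RDQM1 did not move within the timeout")
```

The get_output and sleep parameters exist only so the loop can be exercised without a real MQ installation; a monitoring script would call wait_until_running with just the queue manager name and alert when it returns False.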