
MQSeries.net Forum Index > IBM MQ Installation/Configuration Support

RDQM - floating IP - how to? (page 2 of 2)
john_colgrave
Posted: Tue Jun 11, 2019 12:35 am

Newbie

Joined: 02 Jun 2014
Posts: 5

The Failed Action is almost certainly the reason why the queue manager RDQM1 will not run on its preferred location.

To see what caused the Failed Action, you will need to look at the syslog file for the date of the Failed Action, June 4th, on the node that is the preferred location. The syslog file is /var/log/messages by default; it may have been rotated, so you may need to look at one of the rotated files.

To remove the Failed Action, which should allow the queue manager to move, issue the command:
crm resource cleanup p_drbd_rdqm1
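
As a rough sketch of the log search described above, the following Python filters syslog lines for entries that mention one resource together with a failure hint. The function names and the keyword list are illustrative assumptions, not anything shipped with RDQM or Pacemaker, and the exact message wording may differ between Pacemaker versions.

```python
import re
from typing import Iterable, List

# Assumed keywords, based on typical Pacemaker syslog messages;
# adjust them to match what your own syslog actually contains.
FAILURE_HINTS = re.compile(r"fail-count-|last-failure-|failed op", re.IGNORECASE)


def find_failure_lines(lines: Iterable[str], resource: str) -> List[str]:
    """Return lines that mention the resource and contain a failure hint."""
    return [
        line.rstrip("\n")
        for line in lines
        if resource in line and FAILURE_HINTS.search(line)
    ]


def scan_file(path: str, resource: str) -> List[str]:
    """Scan one uncompressed syslog file; rotated copies are separate files."""
    with open(path, errors="replace") as log:
        return find_failure_lines(log, resource)
```

For example, scan_file("/var/log/messages", "p_drbd_rdqm1") would be run on the preferred node, and again for each rotated file that covers the date in question.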
mqdev
Posted: Tue Jun 11, 2019 6:51 am

Centurion

Joined: 21 Jan 2003
Posts: 131

Hi Morag,
What you said makes sense.
In that case, is there a command to "definitely" fail over the QM to a desired node (the command should fail if the QM could not be moved)? I guess I am asking for a synchronous "move QM" rather than the asynchronous "move QM" that the above rdqmadm command provides.

Thanks
-mqdev
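
One way to approximate a synchronous move is to issue the rdqmadm command and then poll rdqmstatus until the queue manager reports that it is running on the local node, giving up after a timeout. The Python below is only a sketch under assumptions: the helper names are hypothetical, and the "HA current location" label with the value "This node" is taken from memory of the rdqmstatus output, so check it against the output on your own system.

```python
import subprocess
import time
from typing import Callable, Dict


def parse_status(text: str) -> Dict[str, str]:
    """Turn 'Label:   value' lines into a dict keyed by label."""
    fields = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        if sep:
            fields[label.strip()] = value.strip()
    return fields


def run_rdqmstatus(qmgr: str) -> str:
    """Run rdqmstatus for one queue manager and return its text output."""
    result = subprocess.run(
        ["rdqmstatus", "-m", qmgr], capture_output=True, text=True, check=True
    )
    return result.stdout


def wait_until_local(
    qmgr: str,
    timeout_s: float = 60.0,
    poll_s: float = 2.0,
    fetch: Callable[[str], str] = run_rdqmstatus,
) -> bool:
    """Poll until the queue manager reports this node as its location."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = parse_status(fetch(qmgr))
        # Assumed field label and value; verify against real output.
        if status.get("HA current location") == "This node":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

Run on the desired node after the move request, wait_until_local("RDQM1") returning False would be the "command failed" signal asked for here. The fetch parameter exists so the loop can be exercised without a real RDQM installation.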
mqdev
Posted: Tue Jun 11, 2019 7:28 am

Centurion

Joined: 21 Jan 2003
Posts: 131

john_colgrave wrote:
The Failed Action is almost certainly the reason why the queue manager RDQM1 will not run on its preferred location.

To see what caused the Failed Action, you will need to look at the syslog file for the date of the Failed Action, June 4th, on the node that is the preferred location. The syslog file is /var/log/messages by default; it may have been rotated, so you may need to look at one of the rotated files.

To remove the Failed Action, which should allow the queue manager to move, issue the command:
crm resource cleanup p_drbd_rdqm1


John
Thanks for your time!

I have executed the above crm cleanup command. Now my crm status command does not return any failure.

However, RDQM1 is still not failing over.

Below is the log from /var/log/messages for each node around the time Jun 4 16:07:27:

Code:

lnc3234: /var/log/messages @ Jun 4 16:07:10 - 16:07:30

Jun  4 16:07:02 lnc3234 systemd: Started Session 13 of user mqm.
Jun  4 16:07:10 lnc3234 crmd[6097]:  notice: High CPU load detected: 3.310000
Jun  4 16:07:25 lnc3234 kernel: drbd rdqm1: Preparing cluster-wide state change 69798672 (0->-1 7683/4609)
Jun  4 16:07:25 lnc3234 kernel: drbd rdqm1: State change 69798672: primary_nodes=1, weak_nodes=FFFFFFFFFFFFFFF8
Jun  4 16:07:25 lnc3234 kernel: drbd rdqm1: Committing cluster-wide state change 69798672 (27ms)
Jun  4 16:07:25 lnc3234 kernel: drbd rdqm1: role( Secondary -> Primary )
Jun  4 16:07:25 lnc3234 kernel: drbd rdqm1/0 drbd100: disk( Outdated -> UpToDate )
Jun  4 16:07:25 lnc3234 kernel: drbd rdqm1: Forced to consider local data as UpToDate!
Jun  4 16:07:25 lnc3234 kernel: drbd rdqm1/0 drbd100: new current UUID: 8C60231B32B5B33F weak: FFFFFFFFFFFFFFFE
Jun  4 16:07:25 lnc3234 kernel: drbd rdqm1/0 drbd100 lnc3235.bbbbbbbbb.com: pdsk( Outdated -> UpToDate )
Jun  4 16:07:25 lnc3234 kernel: drbd rdqm1/0 drbd100 lncb90c.bbbbbbbbb.com: drbd_sync_handshake:
Jun  4 16:07:25 lnc3234 kernel: drbd rdqm1/0 drbd100 lncb90c.bbbbbbbbb.com: self 8C60231B32B5B33F:ABF4045FB4D8A774:ABF4045FB4D8A774:4FCE2B40507568BA bits:1768203 flags:20
Jun  4 16:07:25 lnc3234 kernel: drbd rdqm1/0 drbd100 lncb90c.bbbbbbbbb.com: peer ABF4045FB4D8A774:0000000000000000:ABF4045FB4D8A774:4FCE2B40507568BA bits:1768203 flags:100
Jun  4 16:07:25 lnc3234 kernel: drbd rdqm1/0 drbd100 lncb90c.bbbbbbbbb.com: uuid_compare()=2 by rule 70
Jun  4 16:07:25 lnc3234 kernel: drbd rdqm1/0 drbd100 lncb90c.bbbbbbbbb.com: pdsk( Outdated -> Consistent ) repl( Established -> WFBitMapS )
Jun  4 16:07:25 lnc3234 kernel: drbd rdqm1/0 drbd100 lncb90c.bbbbbbbbb.com: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 28(1), total 28; compression: 100.0%
Jun  4 16:07:27 lnc3234 attrd[6095]:  notice: Sending flush op to all hosts for: master-p_drbd_rdqm1 (10000)
Jun  4 16:07:27 lnc3234 attrd[6095]:  notice: Sent update 12: master-p_drbd_rdqm1=10000
Jun  4 16:07:27 lnc3234 attrd[6095]:  notice: Update relayed from lnc3235.bbbbbbbbb.com
Jun  4 16:07:27 lnc3234 attrd[6095]:  notice: Sending flush op to all hosts for: fail-count-p_drbd_rdqm1 (1)
Jun  4 16:07:27 lnc3234 attrd[6095]:  notice: Sent update 14: fail-count-p_drbd_rdqm1=1
Jun  4 16:07:27 lnc3234 attrd[6095]:  notice: Update relayed from lnc3235.bbbbbbbbb.com
Jun  4 16:07:27 lnc3234 attrd[6095]:  notice: Sending flush op to all hosts for: last-failure-p_drbd_rdqm1 (1559678848)
Jun  4 16:07:27 lnc3234 attrd[6095]:  notice: Sent update 16: last-failure-p_drbd_rdqm1=1559678848
Jun  4 16:07:28 lnc3234 crmd[6097]:  notice: Result of notify operation for p_drbd_rdqm1 on lnc3234.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1: role( Primary -> Secondary )
Jun  4 16:07:28 lnc3234 crmd[6097]:  notice: Result of demote operation for p_drbd_rdqm1 on lnc3234.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:28 lnc3234 crmd[6097]:  notice: Result of notify operation for p_drbd_rdqm1 on lnc3234.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:28 lnc3234 crmd[6097]:  notice: Result of notify operation for p_drbd_rdqm1 on lnc3234.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1: Preparing cluster-wide state change 1062701648 (0->1 496/16)
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1: State change 1062701648: primary_nodes=0, weak_nodes=0
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1: Committing cluster-wide state change 1062701648 (27ms)
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1 lnc3235.bbbbbbbbb.com: conn( Connected -> Disconnecting ) peer( Secondary -> Unknown )
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1/0 drbd100 lnc3235.bbbbbbbbb.com: pdsk( UpToDate -> DUnknown ) repl( Established -> Off )
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1 lnc3235.bbbbbbbbb.com: ack_receiver terminated
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1 lnc3235.bbbbbbbbb.com: Terminating ack_recv thread
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1 lnc3235.bbbbbbbbb.com: Connection closed
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1 lnc3235.bbbbbbbbb.com: conn( Disconnecting -> StandAlone )
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1 lnc3235.bbbbbbbbb.com: Terminating receiver thread
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1 lnc3235.bbbbbbbbb.com: Terminating sender thread
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1: Preparing cluster-wide state change 776029956 (0->2 496/16)
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1: State change 776029956: primary_nodes=0, weak_nodes=0
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1 lncb90c.bbbbbbbbb.com: Cluster is now split
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1: Committing cluster-wide state change 776029956 (51ms)
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1 lncb90c.bbbbbbbbb.com: conn( Connected -> Disconnecting ) peer( Secondary -> Unknown )
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1/0 drbd100: quorum( yes -> no )
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1/0 drbd100 lncb90c.bbbbbbbbb.com: pdsk( Consistent -> DUnknown ) repl( WFBitMapS -> Off )
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1 lncb90c.bbbbbbbbb.com: ack_receiver terminated
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1 lncb90c.bbbbbbbbb.com: Terminating ack_recv thread
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1 lncb90c.bbbbbbbbb.com: Connection closed
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1 lncb90c.bbbbbbbbb.com: conn( Disconnecting -> StandAlone )
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1 lncb90c.bbbbbbbbb.com: Terminating receiver thread
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1 lncb90c.bbbbbbbbb.com: Terminating sender thread
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1/0 drbd100: disk( UpToDate -> Detaching )
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1/0 drbd100: disk( Detaching -> Diskless )
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1/0 drbd100: drbd_bm_resize called with capacity == 0
Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1: Terminating worker thread
Jun  4 16:07:28 lnc3234 attrd[6095]:  notice: Sending flush op to all hosts for: master-p_drbd_rdqm1 (<null>)
Jun  4 16:07:28 lnc3234 attrd[6095]:  notice: Sent delete 22: node=1, attr=master-p_drbd_rdqm1, id=<n/a>, set=(null), section=status
Jun  4 16:07:28 lnc3234 crmd[6097]:  notice: Result of stop operation for p_drbd_rdqm1 on lnc3234.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1: Starting worker thread (from drbdsetup [16078])
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1/0 drbd100: disk( Diskless -> Attaching )
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1/0 drbd100: Maximum number of peer devices = 2
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1: Method to ensure write ordering: flush
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1/0 drbd100: WRITE_SAME disabled by config
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1/0 drbd100: Adjusting my ra_pages to backing device's (32 -> 1024)
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1/0 drbd100: drbd_bm_resize called with capacity == 25164216
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1/0 drbd100: resync bitmap: bits=3145527 words=98298 pages=192
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1/0 drbd100: size = 12 GB (12582108 KB)
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1/0 drbd100: size = 12 GB (12582108 KB)
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1/0 drbd100: recounting of set bits took additional 0ms
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1/0 drbd100: disk( Attaching -> UpToDate )
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1/0 drbd100: attached to current UUID: 8C60231B32B5B33E
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1 lnc3235.bbbbbbbbb.com: Starting sender thread (from drbdsetup [16104])
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1 lncb90c.bbbbbbbbb.com: Starting sender thread (from drbdsetup [16106])
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1 lnc3235.bbbbbbbbb.com: conn( StandAlone -> Unconnected )
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1 lncb90c.bbbbbbbbb.com: conn( StandAlone -> Unconnected )
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1 lnc3235.bbbbbbbbb.com: Starting receiver thread (from drbd_w_rdqm1 [16079])
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1 lncb90c.bbbbbbbbb.com: Starting receiver thread (from drbd_w_rdqm1 [16079])
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1 lnc3235.bbbbbbbbb.com: conn( Unconnected -> Connecting )
Jun  4 16:07:29 lnc3234 kernel: drbd rdqm1 lncb90c.bbbbbbbbb.com: conn( Unconnected -> Connecting )
Jun  4 16:07:29 lnc3234 crmd[6097]:  notice: Result of start operation for p_drbd_rdqm1 on lnc3234.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:29 lnc3234 crmd[6097]:  notice: Result of notify operation for p_drbd_rdqm1 on lnc3234.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:29 lnc3234 crmd[6097]:  notice: Result of notify operation for p_drbd_rdqm1 on lnc3234.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1 lnc3235.bbbbbbbbb.com: Handshake to peer 1 successful: Agreed network protocol version 114
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1 lnc3235.bbbbbbbbb.com: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1 lnc3235.bbbbbbbbb.com: Starting ack_recv thread (from drbd_r_rdqm1 [16113])
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1 lncb90c.bbbbbbbbb.com: Handshake to peer 2 successful: Agreed network protocol version 114
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1 lncb90c.bbbbbbbbb.com: Starting ack_recv thread (from drbd_r_rdqm1 [16115])
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1: Preparing cluster-wide state change 4160361946 (0->1 499/146)
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1: State change 4160361946: primary_nodes=4, weak_nodes=FFFFFFFFFFFFFFF8
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1: Committing cluster-wide state change 4160361946 (24ms)
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1 lnc3235.bbbbbbbbb.com: conn( Connecting -> Connected ) peer( Unknown -> Secondary )
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1/0 drbd100: disk( UpToDate -> Outdated )
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1/0 drbd100: WRITE_SAME disabled by config
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1/0 drbd100 lnc3235.bbbbbbbbb.com: drbd_sync_handshake:
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1/0 drbd100 lnc3235.bbbbbbbbb.com: self 8C60231B32B5B33E:ABF4045FB4D8A774:ABF4045FB4D8A774:4FCE2B40507568BA bits:0 flags:20
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1/0 drbd100 lnc3235.bbbbbbbbb.com: peer 62DD2B4FFF7F9416:ABF4045FB4D8A774:ABF4045FB4D8A774:4FCE2B40507568BA bits:0 flags:100
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1/0 drbd100 lnc3235.bbbbbbbbb.com: uuid_compare()=100 by rule 90
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1/0 drbd100: Ignore Split-Brain, for now, at least one side unstable
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1/0 drbd100: quorum( no -> yes )
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1/0 drbd100 lnc3235.bbbbbbbbb.com: pdsk( DUnknown -> UpToDate ) repl( Off -> Established )
Jun  4 16:07:30 lnc3234 kernel: drbd rdqm1: Preparing cluster-wide state change 973289025 (0->2 499/146)

=========================================================================================================================
lnc3235:

Jun  4 16:07:01 lnc3235 systemd: Started Session 12 of user mqmadm.
Jun  4 16:07:01 lnc3235 systemd: Started Session 13 of user mqmadm.
Jun  4 16:07:06 lnc3235 kernel: EXT4-fs (drbd101): error count since last fsck: 1
Jun  4 16:07:06 lnc3235 kernel: EXT4-fs (drbd101): initial error at time 1559102516: ext4_find_entry:1312: inode 14
Jun  4 16:07:06 lnc3235 kernel: EXT4-fs (drbd101): last error at time 1559102516: ext4_find_entry:1312: inode 14
Jun  4 16:07:26 lnc3235 kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: Preparing remote state change 69798672
Jun  4 16:07:26 lnc3235 kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: Committing remote state change 69798672 (primary_nodes=1)
Jun  4 16:07:26 lnc3235 kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: peer( Secondary -> Primary )
Jun  4 16:07:26 lnc3235 kernel: drbd rdqm1/0 drbd100: disk( Outdated -> UpToDate )
Jun  4 16:07:26 lnc3235 kernel: drbd rdqm1/0 drbd100 lnc3234.bbbbbbbbb.com: pdsk( Outdated -> UpToDate )
Jun  4 16:07:26 lnc3235 kernel: drbd rdqm1: State change failed: Refusing to be Outdated while Connected
Jun  4 16:07:26 lnc3235 kernel: drbd rdqm1/0 drbd100: Failed: disk( UpToDate -> Outdated )
Jun  4 16:07:26 lnc3235 kernel: drbd rdqm1: State change failed: Refusing to be Outdated while Connected
Jun  4 16:07:26 lnc3235 kernel: drbd rdqm1/0 drbd100: Failed: disk( UpToDate -> Outdated )
Jun  4 16:07:26 lnc3235 kernel: drbd rdqm1/0 drbd100 lncb90c.bbbbbbbbb.com: pdsk( Outdated -> UpToDate )
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: State transition S_IDLE -> S_POLICY_ENGINE
Jun  4 16:07:28 lnc3235 pengine[5961]: warning: Processing failed op monitor for p_drbd_rdqm1:2 on lnc3234.bbbbbbbbb.com: master (8)
Jun  4 16:07:28 lnc3235 pengine[5961]:  notice: Recover p_drbd_rdqm1:2#011(Master lnc3234.bbbbbbbbb.com)
Jun  4 16:07:28 lnc3235 pengine[5961]:  notice: Start   p_fs_rdqm1#011(lnc3234.bbbbbbbbb.com)
Jun  4 16:07:28 lnc3235 pengine[5961]:  notice: Start   p_rdqmx_rdqm1#011(lnc3234.bbbbbbbbb.com)
Jun  4 16:07:28 lnc3235 pengine[5961]:  notice: Start   rdqm1#011(lnc3234.bbbbbbbbb.com)
Jun  4 16:07:28 lnc3235 pengine[5961]:  notice: Calculated transition 5, saving inputs in /var/lib/pacemaker/pengine/pe-input-670.bz2
Jun  4 16:07:28 lnc3235 pengine[5961]: warning: Processing failed op monitor for p_drbd_rdqm1:2 on lnc3234.bbbbbbbbb.com: master (8)
Jun  4 16:07:28 lnc3235 pengine[5961]:  notice: Recover p_drbd_rdqm1:2#011(Master lnc3234.bbbbbbbbb.com)
Jun  4 16:07:28 lnc3235 pengine[5961]:  notice: Start   p_fs_rdqm1#011(lnc3234.bbbbbbbbb.com)
Jun  4 16:07:28 lnc3235 pengine[5961]:  notice: Start   p_rdqmx_rdqm1#011(lnc3234.bbbbbbbbb.com)
Jun  4 16:07:28 lnc3235 pengine[5961]:  notice: Start   rdqm1#011(lnc3234.bbbbbbbbb.com)
Jun  4 16:07:28 lnc3235 pengine[5961]:  notice: Calculated transition 6, saving inputs in /var/lib/pacemaker/pengine/pe-input-671.bz2
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: Processing graph 6 (ref=pe_calc-dc-1559678848-147) derived from /var/lib/pacemaker/pengine/pe-input-671.bz2
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: Initiating notify operation p_drbd_rdqm1_pre_notify_demote_0 on lncb90c.bbbbbbbbb.com
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: Initiating notify operation p_drbd_rdqm1_pre_notify_demote_0 locally on lnc3235.bbbbbbbbb.com
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: Result of notify operation for p_drbd_rdqm1 on lnc3235.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: Initiating notify operation p_drbd_rdqm1_pre_notify_demote_0 on lnc3234.bbbbbbbbb.com
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: Initiating demote operation p_drbd_rdqm1_demote_0 on lnc3234.bbbbbbbbb.com
Jun  4 16:07:28 lnc3235 kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: peer( Primary -> Secondary )
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: Initiating notify operation p_drbd_rdqm1_post_notify_demote_0 on lncb90c.bbbbbbbbb.com
Jun  4 16:07:28 lnc3235 kernel: drbd rdqm1/0 drbd100: No resync, but 1269 bits in bitmap!
Jun  4 16:07:28 lnc3235 kernel: drbd rdqm1/0 drbd100: No resync, but 1269 bits in bitmap!
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: Transition aborted by status-3-master-p_drbd_rdqm1 doing create master-p_drbd_rdqm1=10000: Transient attribute change
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: Initiating notify operation p_drbd_rdqm1_post_notify_demote_0 locally on lnc3235.bbbbbbbbb.com
Jun  4 16:07:28 lnc3235 attrd[5960]:  notice: Sending flush op to all hosts for: master-p_drbd_rdqm1 (10000)
Jun  4 16:07:28 lnc3235 attrd[5960]:  notice: Sent update 87: master-p_drbd_rdqm1=10000
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: Result of notify operation for p_drbd_rdqm1 on lnc3235.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: Initiating notify operation p_drbd_rdqm1_post_notify_demote_0 on lnc3234.bbbbbbbbb.com
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: Transition 6 (Complete=14, Pending=0, Fired=0, Skipped=3, Incomplete=42, Source=/var/lib/pacemaker/pengine/pe-input-671.bz2): Stopped
Jun  4 16:07:28 lnc3235 pengine[5961]: warning: Processing failed op monitor for p_drbd_rdqm1:2 on lnc3234.bbbbbbbbb.com: master (8)
Jun  4 16:07:28 lnc3235 pengine[5961]:  notice: Recover p_drbd_rdqm1:2#011(Slave lnc3234.bbbbbbbbb.com)
Jun  4 16:07:28 lnc3235 pengine[5961]:  notice: Promote p_drbd_rdqm1:2#011(Slave -> Master lnc3234.bbbbbbbbb.com)
Jun  4 16:07:28 lnc3235 pengine[5961]:  notice: Start   p_fs_rdqm1#011(lnc3234.bbbbbbbbb.com)
Jun  4 16:07:28 lnc3235 pengine[5961]:  notice: Start   p_rdqmx_rdqm1#011(lnc3234.bbbbbbbbb.com)
Jun  4 16:07:28 lnc3235 pengine[5961]:  notice: Start   rdqm1#011(lnc3234.bbbbbbbbb.com)
Jun  4 16:07:28 lnc3235 pengine[5961]:  notice: Calculated transition 7, saving inputs in /var/lib/pacemaker/pengine/pe-input-672.bz2
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: Processing graph 7 (ref=pe_calc-dc-1559678848-157) derived from /var/lib/pacemaker/pengine/pe-input-672.bz2
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: Initiating notify operation p_drbd_rdqm1_pre_notify_stop_0 on lncb90c.bbbbbbbbb.com
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: Initiating notify operation p_drbd_rdqm1_pre_notify_stop_0 locally on lnc3235.bbbbbbbbb.com
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: Result of notify operation for p_drbd_rdqm1 on lnc3235.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:28 lnc3235 crmd[5962]:  notice: Initiating notify operation p_drbd_rdqm1_pre_notify_stop_0 on lnc3234.bbbbbbbbb.com

Jun  4 16:07:29 lnc3235 crmd[5962]:  notice: Initiating stop operation p_drbd_rdqm1_stop_0 on lnc3234.bbbbbbbbb.com
Jun  4 16:07:29 lnc3235 kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: Preparing remote state change 1062701648
Jun  4 16:07:29 lnc3235 kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: Committing remote state change 1062701648 (primary_nodes=0)
Jun  4 16:07:29 lnc3235 kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: conn( Connected -> TearDown ) peer( Secondary -> Unknown )
Jun  4 16:07:29 lnc3235 kernel: drbd rdqm1/0 drbd100 lnc3234.bbbbbbbbb.com: pdsk( UpToDate -> DUnknown ) repl( Established -> Off )
Jun  4 16:07:29 lnc3235 kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: ack_receiver terminated
Jun  4 16:07:29 lnc3235 kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: Terminating ack_recv thread
Jun  4 16:07:29 lnc3235 kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: Connection closed
Jun  4 16:07:29 lnc3235 kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: conn( TearDown -> Unconnected )
Jun  4 16:07:29 lnc3235 kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: Restarting receiver thread
Jun  4 16:07:29 lnc3235 kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: conn( Unconnected -> Connecting )
Jun  4 16:07:29 lnc3235 kernel: drbd rdqm1 lncb90c.bbbbbbbbb.com: Preparing remote state change 776029956
Jun  4 16:07:29 lnc3235 kernel: drbd rdqm1 lncb90c.bbbbbbbbb.com: Committing remote state change 776029956 (primary_nodes=0)
Jun  4 16:07:29 lnc3235 crmd[5962]: warning: No reason to expect node 1 to be down
Jun  4 16:07:29 lnc3235 crmd[5962]:  notice: Transition aborted by deletion of nvpair[@id='status-1-master-p_drbd_rdqm1']: Transient attribute change

Jun  4 16:07:29 lnc3235 crmd[5962]:  notice: Transition aborted by deletion of nvpair[@id='status-1-master-p_drbd_rdqm1']: Transient attribute change
Jun  4 16:07:29 lnc3235 crmd[5962]:  notice: Initiating notify operation p_drbd_rdqm1_post_notify_stop_0 on lncb90c.bbbbbbbbb.com
Jun  4 16:07:29 lnc3235 crmd[5962]:  notice: Initiating notify operation p_drbd_rdqm1_post_notify_stop_0 locally on lnc3235.bbbbbbbbb.com
Jun  4 16:07:29 lnc3235 attrd[5960]:  notice: Sending flush op to all hosts for: master-p_drbd_rdqm1 (1000)
Jun  4 16:07:29 lnc3235 attrd[5960]:  notice: Sent update 93: master-p_drbd_rdqm1=1000
Jun  4 16:07:29 lnc3235 crmd[5962]:  notice: Result of notify operation for p_drbd_rdqm1 on lnc3235.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:29 lnc3235 crmd[5962]:  notice: Transition 7 (Complete=14, Pending=0, Fired=0, Skipped=2, Incomplete=29, Source=/var/lib/pacemaker/pengine/pe-input-672.bz2): Stopped
Jun  4 16:07:29 lnc3235 pengine[5961]: warning: Processing failed op monitor for p_drbd_rdqm1:2 on lnc3234.bbbbbbbbb.com: master (8)
Jun  4 16:07:29 lnc3235 pengine[5961]:  notice: Promote p_drbd_rdqm1:1#011(Slave -> Master lnc3235.bbbbbbbbb.com)
Jun  4 16:07:29 lnc3235 pengine[5961]:  notice: Start   p_drbd_rdqm1:2#011(lnc3234.bbbbbbbbb.com)
Jun  4 16:07:29 lnc3235 pengine[5961]:  notice: Start   p_fs_rdqm1#011(lnc3235.bbbbbbbbb.com)



==================================================================================================
lncb90c:

Jun  4 16:06:11 lncb90c crmd[9841]:  notice: Result of notify operation for p_drbd_qm0_ad_us_lnc3234 on lncb90c.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:26 lncb90c kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: Preparing remote state change 69798672
Jun  4 16:07:26 lncb90c kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: Committing remote state change 69798672 (primary_nodes=1)
Jun  4 16:07:26 lncb90c kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: peer( Secondary -> Primary )
Jun  4 16:07:26 lncb90c kernel: drbd rdqm1/0 drbd100: disk( Outdated -> UpToDate )
Jun  4 16:07:26 lncb90c kernel: drbd rdqm1/0 drbd100 lnc3234.bbbbbbbbb.com: pdsk( Outdated -> UpToDate )
Jun  4 16:07:26 lncb90c kernel: drbd rdqm1/0 drbd100 lnc3235.bbbbbbbbb.com: pdsk( Outdated -> UpToDate )
Jun  4 16:07:26 lncb90c kernel: drbd rdqm1: State change failed: Refusing to be Outdated while Connected
Jun  4 16:07:26 lncb90c kernel: drbd rdqm1/0 drbd100: Failed: disk( UpToDate -> Outdated )
Jun  4 16:07:26 lncb90c kernel: drbd rdqm1: State change failed: Refusing to be Outdated while Connected
Jun  4 16:07:26 lncb90c kernel: drbd rdqm1/0 drbd100: Failed: disk( UpToDate -> Outdated )
Jun  4 16:07:26 lncb90c kernel: drbd rdqm1: State change failed: Refusing to be Outdated while Connected
Jun  4 16:07:26 lncb90c kernel: drbd rdqm1/0 drbd100: Failed: disk( UpToDate -> Outdated )
Jun  4 16:07:26 lncb90c kernel: drbd rdqm1/0 drbd100 lnc3234.bbbbbbbbb.com: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 28(1), total 28; compression: 100.0%
Jun  4 16:07:26 lncb90c kernel: drbd rdqm1/0 drbd100 lnc3234.bbbbbbbbb.com: unexpected repl_state (Established) in receive_bitmap

Jun  4 16:07:28 lncb90c crmd[9841]:  notice: Result of notify operation for p_drbd_rdqm1 on lncb90c.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:28 lncb90c kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: peer( Primary -> Secondary )
Jun  4 16:07:28 lncb90c attrd[9839]:  notice: Sending flush op to all hosts for: master-p_drbd_rdqm1 (10000)
Jun  4 16:07:28 lncb90c attrd[9839]:  notice: Sent update 47: master-p_drbd_rdqm1=10000
Jun  4 16:07:28 lncb90c crmd[9841]:  notice: Result of notify operation for p_drbd_rdqm1 on lncb90c.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:28 lncb90c crmd[9841]:  notice: Result of notify operation for p_drbd_rdqm1 on lncb90c.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:29 lncb90c kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: Preparing remote state change 1062701648
Jun  4 16:07:29 lncb90c kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: Committing remote state change 1062701648 (primary_nodes=0)
Jun  4 16:07:29 lncb90c kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: Preparing remote state change 776029956
Jun  4 16:07:29 lncb90c kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: Committing remote state change 776029956 (primary_nodes=0)
Jun  4 16:07:29 lncb90c kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: conn( Connected -> TearDown ) peer( Secondary -> Unknown )
Jun  4 16:07:29 lncb90c kernel: drbd rdqm1/0 drbd100 lnc3234.bbbbbbbbb.com: pdsk( UpToDate -> DUnknown ) repl( Established -> Off )
Jun  4 16:07:29 lncb90c kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: ack_receiver terminated
Jun  4 16:07:29 lncb90c kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: Terminating ack_recv thread
Jun  4 16:07:29 lncb90c kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: Connection closed
Jun  4 16:07:29 lncb90c kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: conn( TearDown -> Unconnected )
Jun  4 16:07:29 lncb90c kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: Restarting receiver thread
Jun  4 16:07:29 lncb90c kernel: drbd rdqm1 lnc3234.bbbbbbbbb.com: conn( Unconnected -> Connecting )
Jun  4 16:07:29 lncb90c attrd[9839]:  notice: Sending flush op to all hosts for: master-p_drbd_rdqm1 (1000)
Jun  4 16:07:29 lncb90c attrd[9839]:  notice: Sent update 53: master-p_drbd_rdqm1=1000
Jun  4 16:07:29 lncb90c crmd[9841]:  notice: Result of notify operation for p_drbd_rdqm1 on lncb90c.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:29 lncb90c crmd[9841]:  notice: Result of notify operation for p_drbd_rdqm1 on lncb90c.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:30 lncb90c crmd[9841]:  notice: Result of notify operation for p_drbd_rdqm1 on lncb90c.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:30 lncb90c crmd[9841]:  notice: Result of notify operation for p_drbd_rdqm1 on lncb90c.bbbbbbbbb.com: 0 (ok)
Jun  4 16:07:30 lncb90c kernel: drbd rdqm1: Preparing cluster-wide state change 500683656 (2->-1 3/1)
Jun  4 16:07:30 lncb90c kernel: drbd rdqm1: State change 500683656: primary_nodes=4, weak_nodes=FFFFFFFFFFFFFFF9
Jun  4 16:07:30 lncb90c kernel: drbd rdqm1: Committing cluster-wide state change 500683656 (23ms)
Jun  4 16:07:30 lncb90c kernel: drbd rdqm1: role( Secondary -> Primary )
Jun  4 16:07:30 lncb90c crmd[9841]:  notice: Result of promote operation for p_drbd_rdqm1 on lncb90c.bbbbbbbbb.com: 0 (ok)
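
A side note on the lnc3234 log above: the last-failure-p_drbd_rdqm1 value (1559678848) looks like a Unix epoch timestamp. The minimal Python sketch below, which assumes that interpretation, converts it to UTC so it can be lined up with the syslog times. The helper name is made up for illustration.

```python
from datetime import datetime, timezone


def epoch_to_utc(epoch_seconds: int) -> datetime:
    """Convert a Unix epoch value (seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


# Value copied from the attrd "last-failure-p_drbd_rdqm1" log entry.
print(epoch_to_utc(1559678848).isoformat())
# 2019-06-04T20:07:28+00:00
```

If that reading is right, 20:07:28 UTC corresponds to the 16:07:28 syslog entries on servers that log in a UTC-4 time zone, which is itself an assumption about these hosts.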
john_colgrave
Posted: Mon Jun 17, 2019 10:29 pm

Newbie

Joined: 02 Jun 2014
Posts: 5

It looks like there was a problem with your data replication interface:
Jun 4 16:07:28 lnc3234 kernel: drbd rdqm1/0 drbd100: quorum( yes -> no )

If a node loses DRBD quorum then any RDQM it is running is immediately stopped and it will not be able to restart until quorum is restored.

Quorum is restored two seconds later:
Jun 4 16:07:30 lnc3234 kernel: drbd rdqm1/0 drbd100: quorum( no -> yes )

You do not show much of the log after this so it is not clear what happened then.
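
To check the rest of the log for further quorum changes, something like the Python sketch below could pull out every DRBD quorum transition with its timestamp and host. The regular expression is an assumption fitted to the two kernel lines quoted above, and the function name is illustrative only.

```python
import re
from typing import Iterable, List, Tuple

# Fitted to lines such as:
#   Jun  4 16:07:28 lnc3234 kernel: drbd rdqm1/0 drbd100: quorum( yes -> no )
QUORUM_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+kernel:.*"
    r"quorum\(\s*(?P<old>\w+)\s*->\s*(?P<new>\w+)\s*\)"
)


def quorum_changes(lines: Iterable[str]) -> List[Tuple[str, str, str, str]]:
    """Return (timestamp, host, old_state, new_state) for each quorum change."""
    changes = []
    for line in lines:
        match = QUORUM_RE.search(line)
        if match:
            changes.append(
                (match["stamp"], match["host"], match["old"], match["new"])
            )
    return changes
```

Feeding it the lines of /var/log/messages from each node would list every loss and recovery of quorum, which makes it easier to see whether the June 4th event was a one-off.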
raj.gowd
Posted: Sun Jul 28, 2019 1:13 pm

Newbie

Joined: 03 Jul 2014
Posts: 3

Hi,
Can someone provide me with a step-by-step procedure for setting up RDQM?

Thanks,
Raj
hughson
Posted: Sun Jul 28, 2019 3:01 pm

Grand Master

Joined: 09 May 2013
Posts: 1256
Location: Bay of Plenty, New Zealand

raj.gowd wrote:
Can someone provide me with a step-by-step procedure for setting up RDQM?

My "What's New in IBM MQ V9.x.x" training course contains exactly that. Let me know if you are interested in taking it - contact details and full description of courses in brochure found here.

Cheers,
Morag
_________________
Morag Hughson @MoragHughson
IBM MQ Technical Education Specialist
Get your IBM MQ training here!
MQGem Software
raj.gowd
Posted: Sun Jul 28, 2019 3:52 pm

Newbie

Joined: 03 Jul 2014
Posts: 3

Hi,
I have knowledge of MQ and do not need any training. If you have any document or procedure for a complete RDQM setup for sale, please let me know.
hughson
Posted: Sun Jul 28, 2019 7:11 pm

Grand Master

Joined: 09 May 2013
Posts: 1256
Location: Bay of Plenty, New Zealand

raj.gowd wrote:
I have knowledge of MQ and do not need any training. If you have any document or procedure for a complete RDQM setup for sale, please let me know.

Probably not an appropriate conversation for this forum. Please contact me offline using details from the link I gave in my previous reply.
_________________
Morag Hughson @MoragHughson
IBM MQ Technical Education Specialist
Get your IBM MQ training here!
MQGem Software
exerk
Posted: Sun Jul 28, 2019 10:56 pm

Jedi Council

Joined: 02 Nov 2006
Posts: 6106

raj.gowd wrote:
Hi,
I have knowledge of MQ and do not need any training. If you have any document or procedure for a complete RDQM setup for sale, please let me know.

We're not here to do your work for you. All the necessary steps are in the Knowledge Centre and manuals relevant to the OS, from which you can then construct a site guide of how-to.

Or contact Morag off-line...
_________________
It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys.

HubertKleinmanns
Posted: Wed Aug 28, 2019 12:20 am

Yatiri

Joined: 24 Feb 2004
Posts: 698
Location: Germany

raj.gowd wrote:
Hi,
I have knowledge of MQ and do not need any training. If you have any document or procedure for a complete RDQM setup for sale, please let me know.


Are you sure? I've been working as an IBM MQ trainer for nearly 15 years - and I still sometimes need some training.
_________________
Regards
Hubert
exerk
Posted: Wed Aug 28, 2019 12:26 am

Jedi Council

Joined: 02 Nov 2006
Posts: 6106

HubertKleinmanns wrote:
raj.gowd wrote:
Hi,
I have knowledge of MQ and do not need any training. If you have any document or procedure for a complete RDQM setup for sale, please let me know.


Are you sure? I've been working as an IBM MQ trainer for nearly 15 years - and I still sometimes need some training.

, because every day is a school day...
_________________
It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys.

PeterPotkay
Posted: Thu Aug 29, 2019 4:12 pm

Poobah

Joined: 15 May 2001
Posts: 7558

exerk wrote:
HubertKleinmanns wrote:
raj.gowd wrote:
Hi,
I have knowledge of MQ and do not need any training. If you have any document or procedure for a complete RDQM setup for sale, please let me know.


Are you sure? I've been working as an IBM MQ trainer for nearly 15 years - and I still sometimes need some training.

, because every day is a school day...


The more you know, the more you realize you don't know.
_________________
Peter Potkay
Keep Calm and MQ On
bruce2359
Posted: Thu Aug 29, 2019 4:43 pm

Poobah

Joined: 05 Jan 2008
Posts: 8501
Location: US: west coast, almost. Otherwise, enroute.

My goal is to learn one new thing each and every day.
_________________
There are two types of people in this world:
1) Those that can extrapolate from incomplete data