Trainee
Posted: Tue May 27, 2008 9:50 am Post subject: MQ High Availability
 Centurion
Joined: 27 Oct 2006 Posts: 124
Hi,
I am going through a document on high availability with WebSphere MQ and came across the standby machine / shared disk concept for providing HA to WMQ, where the queue manager data and logs are placed on a shared disk that is accessible from both the primary server and the standby server in case the primary server fails.
If that alone is enough to provide HA for WMQ, why do we need to install HA cluster software at all? What additional advantages does an HA cluster give in an active/standby scenario?
And in the first case, with no HA cluster, is admin intervention needed? Can anybody please explain?
fjb_saper
Posted: Tue May 27, 2008 7:47 pm Post subject: Re: MQ High Availability
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
Trainee wrote:
Hi,
I am going through a document on high availability with WebSphere MQ and came across the standby machine / shared disk concept for providing HA to WMQ, where the queue manager data and logs are placed on a shared disk that is accessible from both the primary server and the standby server in case the primary server fails.
If that alone is enough to provide HA for WMQ, why do we need to install HA cluster software at all? What additional advantages does an HA cluster give in an active/standby scenario?
And in the first case, with no HA cluster, is admin intervention needed? Can anybody please explain?
Only one queue manager at a time can access its file system. The HA cluster provides for automatic shutdown of the failed node and startup of the standby node...
Enjoy  _________________ MQ & Broker admin
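That automation usually boils down to the HA cluster calling a small start/stop script when it moves the MQ resource group between nodes. A minimal sketch in POSIX sh (the queue manager name QM1 and the script shape are assumptions for illustration; strmqm and endmqm are the standard MQ control commands):

```shell
#!/bin/sh
# Sketch of the kind of start/stop wrapper an HA cluster (HACMP, VCS,
# Linux-HA, ...) could call during failover. QM1 is a hypothetical name.
QMGR="${QMGR:-QM1}"

mq_ha_control() {
  case "$1" in
    start)
      # Runs on the standby node, after the shared disk has been mounted
      # and the service IP address has been taken over.
      su - mqm -c "strmqm $QMGR"
      ;;
    stop)
      # Runs on the failing node; -i ends the queue manager immediately
      # rather than waiting for applications to disconnect.
      su - mqm -c "endmqm -i $QMGR"
      ;;
    *)
      echo "Usage: mq_ha_control {start|stop}" >&2
      return 1
      ;;
  esac
}
```

The cluster software only invokes `start` on the surviving node once the shared disk and service address have moved over, which is what enforces the one-queue-manager-at-a-time rule.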
KramJ
Posted: Wed May 28, 2008 5:55 am
Voyager
Joined: 09 Jan 2006 Posts: 80 Location: Atlanta
Also, the standby server takes over the production node's service address on failover, so you don't have to modify DNS or your remote queue managers' channel definitions with a different hostname/IP address.
ipmqadm
Posted: Thu May 29, 2008 4:46 am
Acolyte
Joined: 18 Apr 2007 Posts: 68
MQ clustering does NOT provide true HA failover. The 'shared disk' you mentioned is provided by services such as Veritas.
In MQ clustering, if a queue manager has 100,000 messages on a queue and the server where the queue manager resides drops for any reason, those 100,000 messages are stranded on that queue manager's queue until the server comes back online. That is why MQ clustering doesn't provide true failover.
Veritas HA clustering, for instance, uses a 'shared disk' or SAN disk to separate the data from the queue manager. Therefore, if the queue manager is unavailable, the next queue manager in the cluster can resume processing the messages off the queue.
The 'shared disk' design of the HA service allows a true failover scenario, whereas MQ clustering is really intended to assist in workflow processing and/or load balancing between queue managers...
manicminer
Posted: Thu May 29, 2008 5:00 am
 Disciple
Joined: 11 Jul 2007 Posts: 177
ipmqadm wrote:
The 'shared disk' design of the HA service allows a true failover scenario, whereas MQ clustering is really intended to assist in workflow processing and/or load balancing between queue managers...
Assuming that your reason for failure isn't the 'shared disk' itself failing (yes, I know this should be a SAN with RAID etc., but it is still a valid point).
It is also worth pointing out that in a messaging infrastructure you should aim to keep queue depths small for best performance. When using an MQ clustering solution this limits the impact to only a small number of messages being "stuck" during recovery; the rest of the system can continue working.
There are pros and cons to most solutions; it all depends on your requirements. If a few messages can be delayed, then an MQ clustering design might be an appropriate solution.
Trainee
Posted: Wed Jun 04, 2008 7:19 am
 Centurion
Joined: 27 Oct 2006 Posts: 124
Hi saper,
If there is no HA software, there is no automatic failover of the node and we have to do it manually. Is that correct?
Thank you
Trainee
exerk
Posted: Thu Jun 05, 2008 5:49 am
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
ipmqadm wrote:
MQ clustering does NOT provide true HA failover. The 'shared disk' you mentioned is provided by services such as Veritas.
In MQ clustering, if a queue manager has 100,000 messages on a queue and the server where the queue manager resides drops for any reason, those 100,000 messages are stranded on that queue manager's queue until the server comes back online. That is why MQ clustering doesn't provide true failover.
Veritas HA clustering, for instance, uses a 'shared disk' or SAN disk to separate the data from the queue manager. Therefore, if the queue manager is unavailable, the next queue manager in the cluster can resume processing the messages off the queue.
The 'shared disk' design of the HA service allows a true failover scenario, whereas MQ clustering is really intended to assist in workflow processing and/or load balancing between queue managers...
I think there may be a little confusion here with shared queues (z/OS-only, at least currently!). Outside of shared queues, the queue file will belong to the 'failed' queue manager and be inaccessible to any other queue manager.
Or did you mean when the queue manager is restarted on the standby node? _________________ It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys.
KramJ
Posted: Fri Jun 06, 2008 7:23 am
Voyager
Joined: 09 Jan 2006 Posts: 80 Location: Atlanta
I've read the original post a number of times and he isn't asking about an MQ cluster; he's asking about a high-availability cluster, such as HACMP on AIX.
The advantage of using HACMP is that failover is automatic. Your queues, logs and error logs are on shared disk that is automatically moved from the primary node to the secondary node when the primary fails. The secondary node takes over the cluster's service address so that clients and other queue managers can reconnect. HACMP then starts the queue manager for you, along with whatever other application servers you've configured to start when that node takes over.
If you don't use HACMP, what happens if your queue manager server crashes at 3:00 am? First of all, hopefully you're monitoring the queue manager so that you get paged when it goes down. Once you get dialed in you have to import and vary on your volume group, mount your file systems, and start the queue manager. And you have to call someone who has access to repoint the queue manager's DNS alias to the secondary server's IP address.
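Those manual recovery steps, sketched as an AIX command sequence (the volume group name datavg, hdisk number, mount point and queue manager name QM1 are all hypothetical, purely for illustration):

```shell
# Hypothetical manual takeover on the standby node (AIX).
importvg -y datavg hdisk2      # import the shared volume group on this node
varyonvg datavg                # vary the volume group online
mount /var/mqm                 # mount the MQ data/log file systems
su - mqm -c "strmqm QM1"       # start the queue manager here
# ...then have the queue manager's DNS alias repointed at this node's IP.
```

Every one of those steps is something an HA cluster does for you automatically, at 3:00 am, without a phone call.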
Trainee
Posted: Mon Jun 09, 2008 10:21 am
 Centurion
Joined: 27 Oct 2006 Posts: 124
KramJ,
So clear. Understood completely, thanks.
What if the shared data/logs are corrupted and the queue manager failed on one node? I guess we would not be able to start it on another node until the data/logs are repaired.
Thanks
Trainee
KramJ
Posted: Tue Jun 10, 2008 6:26 am
Voyager
Joined: 09 Jan 2006 Posts: 80 Location: Atlanta
Quote:
What if the shared data/logs are corrupted and the queue manager failed on one node? I guess we would not be able to start it on another node until the data/logs are repaired.
Yes, that is correct, because you'll be accessing the same shared data, just from a different node.
mquser925
Posted: Fri Aug 15, 2008 5:21 am
Acolyte
Joined: 22 Apr 2008 Posts: 61
If you have your logs for persistent messages on a shared disk, what happens if the mount point to that shared disk goes down? Do you no longer have persistent messages?
bruce2359
Posted: Fri Aug 15, 2008 6:06 am
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
Quote:
You no longer have persistent messages?
Worse than that: without logs you have no working queue manager, and no capability to restart one. Add disk mirroring. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
zpat
Posted: Fri Aug 15, 2008 6:14 am
 Jedi Council
Joined: 19 May 2001 Posts: 5866 Location: UK
WMQ is only as robust as the underlying disk subsystem it relies upon.
That's why I personally feel happiest about MQ on z/OS, then AIX (or comparable UNIX journalled file systems), and last of all Windows NTFS.
bruce2359
Posted: Fri Aug 15, 2008 6:21 am
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
If I may add to your post:
WMQ is only as robust as the underlying hardware platform and operating system that it relies upon. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
skoesters
Posted: Mon Aug 18, 2008 2:17 am
Acolyte
Joined: 08 Jun 2008 Posts: 73
Hi,
I built an active/active cluster with DRBD and Linux-HA, and so far it is working fine.
Each server has a primary DRBD device with the queue manager and log directories on it.
When one server fails, the DRBD device moves to the other server and Heartbeat restarts the failed queue managers on the new server.
Regards
Sebastian
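For reference, a Heartbeat v1-style haresources entry for that kind of setup might look like the sketch below. This is illustrative only: the node name, DRBD resource name (r0), device, mount point, file system type and the name of the queue manager start/stop script are all assumptions.

```
# /etc/ha.d/haresources (Heartbeat v1 style) -- illustrative only.
# nodeA normally owns resource r0: on failover the peer takes over the
# DRBD device, mounts it where MQ expects its data/logs, then runs the
# queue manager start/stop script.
nodeA drbddisk::r0 Filesystem::/dev/drbd0::/var/mqm::ext3 mq-qmgr
```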