MQSeries.net :: View topic - multi instance broker setup with shared drive for DR setup

multi instance broker setup with shared drive for DR setup

chris boehnke

Posted: Tue Jan 04, 2011 7:57 pm Post subject: multi instance broker setup with shared drive for DR setup

Partisan

Joined: 25 Jul 2006
Posts: 369

Hi Guys,
We are on Solaris 10, MQ 7.0.1.1 and Broker 7.0.0.1.

We are planning to run MQ/Broker in two datacenters to handle hardware failover. Traffic will be handled in the primary datacenter and should fail over to the other datacenter (the DR site).

I know that we can use a multi-instance broker since we are on version 7 of MQ/Broker, but I wanted to know how to use the shared filesystem (NFS mount). Where should the shared filesystem be placed, and how will MQ/Broker access it if the MQ/Broker instances reside in separate datacenters?

Any other thoughts/ suggestions are appreciated.

Thanks.

fjb_saper

Posted: Tue Jan 04, 2011 8:37 pm Post subject:

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20763
Location: LI,NY

You might want to upgrade to MQ 7.0.1.3. I believe that's what is being delivered as base 7.0.1 these days...
As for the rest, I thought the manuals gave you a step by step.
What did you try?

_________________
MQ & Broker admin

mqjeff

Posted: Wed Jan 05, 2011 3:03 am Post subject:

Grand Master

Joined: 25 Jun 2008
Posts: 17447

Yes, the Broker manual gives you a step by step for everything except the intimate details of configuring the multi-instance queue manager.

The MQ manuals give you a step by step for that.

You need a supported network file system, as here.

The important bit is that the file share must be available on both instances at all times.
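
For orientation, here is a rough sketch of the order of operations the manuals walk through, written as a small Python helper that only builds the command lines (it runs nothing). The queue manager name, broker name, shared mount point and the `/var/mqm` prefix are assumptions for illustration, and the command flags are as I recall them from the V7 documentation, so check each one against the manuals for your fix pack before using it.

```python
from typing import Dict, List


def multi_instance_setup_plan(qmgr: str, broker: str, shared_root: str) -> Dict[str, List[str]]:
    """Return the commands to run on each node, in order. Nothing is executed."""
    qm_data = f"{shared_root}/qmgrs"     # assumed layout on the shared mount
    qm_logs = f"{shared_root}/logs"
    broker_work = f"{shared_root}/broker"
    return {
        "node1": [
            f"amqmfsck {qm_data}",                          # basic check of the shared file system
            f"crtmqm -md {qm_data} -ld {qm_logs} {qmgr}",   # data and logs on the share
            f"dspmqinf -o command {qmgr}",                  # prints the addmqinf line for node2
            f"strmqm -x {qmgr}",                            # first instance started becomes active
            f"mqsicreatebroker {broker} -q {qmgr} -e {broker_work}",
            f"mqsistart {broker}",
        ],
        "node2": [
            # Prefix=/var/mqm is an assumption; use whatever dspmqinf printed on node1.
            f"addmqinf -s QueueManager -v Name={qmgr} -v Directory={qmgr} "
            f"-v Prefix=/var/mqm -v DataPath={qm_data}/{qmgr}",
            f"strmqm -x {qmgr}",                            # second instance waits in standby
            f"mqsiaddbrokerinstance {broker} -e {broker_work}",
            f"mqsistart {broker}",
        ],
    }


if __name__ == "__main__":
    plan = multi_instance_setup_plan("QM1", "BRK1", "/shared/mq")
    for node, commands in plan.items():
        print(node)
        for command in commands:
            print("   ", command)
```

Both nodes must have the same share mounted at the same path for this to make sense, which is the point made above about the file share being available to both instances at all times.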

lancelotlinc

Posted: Wed Jan 05, 2011 7:22 am Post subject:

Jedi Knight

Joined: 22 Mar 2010
Posts: 4941
Location: Bloomington, IL USA

One thing you might consider is to have two independent Brokers, each connected through an MQ cluster, where MQ transactions that originate in one datacenter stay in that datacenter by way of the channel sender priority.

You'll get better performance this way vs. forcing the Brokers to share a filesystem.

Email or PM me if you'd like further details.
_________________
http://leanpub.com/IIB_Tips_and_Tricks
Save $20: Coupon Code: MQSERIES_READER

mqjeff

Posted: Wed Jan 05, 2011 8:04 am Post subject:

Grand Master

Joined: 25 Jun 2008
Posts: 17447

lancelotlinc wrote:

That's not as cut and dried as you say it is. It really depends on how the incoming work is distributed between the two datacenters.

And, for example, it may make sense to have two multi-instance brokers, each with a primary in one data-center and the secondary in the other datacenter. Ye Olde Active/Passive....

lancelotlinc

Posted: Wed Jan 05, 2011 8:15 am Post subject:

Jedi Knight

Joined: 22 Mar 2010
Posts: 4941
Location: Bloomington, IL USA

It depends on the business requirements. My suggestion is intended to point out that active/passive multi-instance broker DR configurations are inherently inefficient and not the best bang-for-the-buck.

It's not possible to design a full DR plan here, since no detailed business requirements have been posted. However, there is a wide range of options that provide greater efficiency and better utilization of resources, where all parts of the system are tested and ready for operation on-the-fly.

Ye olde active/passive always comes with the risk that, when the ka-ka hits the rotating air foil, the passive is not ready or fails to kick in.

When you have two independent brokers, both operating, and one fails, the second can pick up the slack, and you know the second broker will work because it's doing nothing different than it has been all along.

bsiggers

Posted: Wed Jan 05, 2011 8:56 am Post subject:

Acolyte

Joined: 09 Dec 2010
Posts: 53
Location: Vancouver, BC

Going back to the original question, you'll be able to find a lot of great information here for setting this up on V7, along with the user manuals.

http://www.ibm.com/developerworks/websphere/library/techarticles/1011_gupta/1011_gupta.html

Be aware that, as lancelotlinc says, there are other good alternatives, such as an active-active cluster across both sites, that don't involve complexities like shared filesystems. From a licensing perspective, you're not saving any money by going active-passive, as the brokers/MQ servers on the DR side would be considered 'up' and therefore require a full license, though your IBM rep or somebody here on the forum may be able to weigh in. All of this depends on your business requirements and what your brokers are actually doing.

Hope this helps.

Vitor

Posted: Wed Jan 05, 2011 9:06 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

bsiggers wrote:

from a licensing perspective, you're not saving any money by going active-passive, as the brokers/mq servers on the DR side would be considered 'up' and thereby require a full license - but your IBM rep, or somebody here on the forum may be able to weigh in.

It's not impossible that software on a passive DR site would be covered by the active license. But it depends on the exact license deal that a given site has negotiated.
_________________
Honesty is the best policy.
Insanity is the best defence.

lancelotlinc

Posted: Wed Jan 05, 2011 9:13 am Post subject:

Jedi Knight

Joined: 22 Mar 2010
Posts: 4941
Location: Bloomington, IL USA

To give you some background so you can gauge where I am coming from, much of my primary responsibility for IT systems has been in the Disaster Recovery area since the 1980s.

Beginning in 2003, I started working with IBM WebSphere Message Broker for a major insurance company. One of my first tasks was to develop proof-of-concepts on many different DR configurations for WebSphere Message Broker.

Of the configurations that I set up and tested, active-passive was the worst at accomplishing the mission. Of every ten attempts at a fail-over, nine were not successful for one reason or another. Either the QMGR wouldn't start up or, if the QMGR did start, the Broker runtime would not start for various reasons. Human intervention was required more than 90 percent of the time to get the failover to take off.

In theory, active-passive seems nice, easy and convenient. In practice, it is the worst configuration based on successful failover attempts (less than ten percent success rate).

Today, at a major bank, we have four geographically dispersed sites running an active-active-active-active configuration. The ESB is front-ended by hardened XI50 appliances at the DMZ. I find this configuration to be the most robust and successful for DR purposes.

No matter what configuration you choose, be sure to test nightly. Not weekly, monthly, or worse, annually. Test nightly. There is nothing worse than finding out that your DR config didn't work when you really needed it to, and the questions you ask to find out why only make people's faces turn red. Bugs, cobwebs, and spiders crawl into the nooks and crannies every day.

By test, I mean send probes through every few minutes and check return codes in an automated fashion.
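
A minimal sketch of what such an automated probe run could look like, assuming each probe is a function that returns 0 on success and a non-zero return code on failure. The `tcp_probe` helper only checks that a listener accepts connections; a realistic probe would push a test message through a flow and check the reply. All names here are hypothetical and not part of any MQ or Broker API.

```python
import socket
from typing import Callable, Dict

# A probe returns 0 on success and a non-zero return code on failure.
Probe = Callable[[], int]


def tcp_probe(host: str, port: int, timeout: float = 5.0) -> Probe:
    """Build a probe that only checks that a TCP listener accepts connections."""
    def probe() -> int:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return 0
        except OSError:
            return 1
    return probe


def run_probes(probes: Dict[str, Probe]) -> Dict[str, int]:
    """Run every probe once and return {name: return_code} for the failures."""
    failures: Dict[str, int] = {}
    for name, probe in probes.items():
        try:
            rc = probe()
        except Exception:
            rc = -1  # a probe that blows up counts as a failure
        if rc != 0:
            failures[name] = rc
    return failures
```

Run something like `run_probes({"dc1-listener": tcp_probe("dc1-mq.example.com", 1414)})` from a scheduler every few minutes and raise an alert whenever the returned dictionary is not empty; the host name and port are placeholders.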

Last edited by lancelotlinc on Wed Jan 05, 2011 9:16 am; edited 2 times in total

Vitor

Posted: Wed Jan 05, 2011 9:13 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

It's also worth being a bit cautious with terms here. A DR site (as described by the OP) is an off-site, emergency facility for hardware failure. This should be kept distinct from an HA solution which caters more to the transient software or network issue.

Both can be modelled as active/passive or active/active with pros & cons for each in each situation.

It all depends on requirements, attitude to risk, time to recover, budget, etc., etc., etc.

Vitor

Posted: Wed Jan 05, 2011 9:32 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

lancelotlinc wrote:

Of the configurations that I set up and tested, active-passive was the worst at accomplishing the mission. Of every ten attempts at a fail-over, nine were not successful for one reason or another. Either the QMGR wouldn't start up or, if the QMGR did start, the Broker runtime would not start for various reasons. Human intervention was required more than 90 percent of the time to get the failover to take off.

That's ghastly. I've never seen that level of unreliability, and I've been doing this longer than you. I can probably match you horror story for horror story in setting up DR (getting the network people to do it right would try the patience of a saint) but once you've finally achieved that you should be away.

What sort of problems did you experience?

lancelotlinc wrote:

Today, at a major bank, we have four geographically dispersed sites running an active-active-active-active configuration. The ESB is front-ended by hardened XI50 appliances at the DMZ. I find this configuration to be the most robust and successful for DR purposes.

We've spoken before about the happy situation you find yourself in. A lot of places don't have the money for 4 active licenses (or 4 sites), many that do don't have the political will to spend it.

Don't give me the standard talk about how much cheaper a robust DR solution is than the losses incurred during a major failure. I've delivered it so many times I can sing it (and have been tempted to do so a number of times as a punitive measure!). It's all about attitude to risk; some management has a near-suicidal attitude to it.

lancelotlinc wrote:

No matter what configuration you choose, be sure to test nightly. Not weekly, monthly, or worse, annually. Test nightly. There is nothing worse than finding out that your DR config didn't work when you really needed it to, and the questions you ask to find out why only make people's faces turn red. Bugs, cobwebs, and spiders crawl into the nooks and crannies every day.

By test, I mean send probes through every few minutes and check return codes in an automated fashion.

Again, getting the network to switch over to DR for a 6 monthly test hurts real people. And you can't automate the hardware & software coming up on a nightly basis, especially if there's fear that some of the production data running during the overnight schedule will escape into the DR site.

I'd say 75% of the sites I've worked on do an annual or 6 monthly DR test. I'd say 95% of all DR tests don't go smoothly for exactly the reasons you outline, and all of them are hailed as "successful" because the systems finally came up. Attitude to risk you see.

I include in these major banks, investment houses & share trading organizations who you'd think would know better & where there is no question at all they have the money to do it properly if they wanted to spend it.

Illustrative war story - major investment house doing annual test of the "dark" DR site (they had an active/passive for everyday disasters like floods & fires and a dark bunker for real emergencies). On the day of the test, which had been the subject of planning meetings for weeks previously, it took 3 hours to get network to the site, 4 hours for the DBAs to realize they didn't have all the tapes on site they needed to restore all the databases we needed, 12 hours to find & transport the missing tapes and 3 hours for the systems people to figure out why none of us could log on to anything.

In the end, it took a little over a day to get the system going to a point where it was available to end-users. We were congratulated by management for "getting the system up so much faster than last year".

lancelotlinc

Posted: Wed Jan 05, 2011 9:44 am Post subject:

Jedi Knight

Joined: 22 Mar 2010
Posts: 4941
Location: Bloomington, IL USA

One particular issue that seemed to plague my active-passive proof-of-concept test was that the active QMGR writes a file to the network shared disk space to indicate it has ownership. On a fail-over this file was supposed to go away, allowing the passive QMGR to take over access to the shared network files. In practice, the file sometimes did not go away, and the passive QMGR then could not start successfully. If I remember right, this was MQ ver 6 something, and this active-passive DR bug/issue may have been resolved by now.
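
To illustrate the general failure mode (not how any particular MQ version implements ownership), compare a plain marker file, which stays behind if its owner dies without cleaning up, with an operating system file lock, which is dropped when the holding process closes the file or exits. The sketch below uses an advisory `flock` on a hypothetical lock file; the function names and path are assumptions for illustration, and lock behaviour over a network file system depends on the NFS version and configuration.

```python
import fcntl
import os
from typing import Optional


def try_take_ownership(lock_path: str) -> Optional[int]:
    """Return a file descriptor holding an exclusive lock, or None if it is already held."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Non-blocking: fail immediately if another instance holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd


def release_ownership(fd: int) -> None:
    """Give up ownership; this also happens implicitly if the owning process dies."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

With this approach the lock file itself can remain on disk forever; what matters is whether a live process holds the lock, so a standby is not blocked by a leftover file alone.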

The reason I like active-active is that you don't have to do anything to get the second system to perform, as it is already live and running.

Vitor

Posted: Wed Jan 05, 2011 9:46 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

lancelotlinc wrote:

The reason I like active-active is that you don't have to do anything to get the second system to perform, as it is already live and running.

It's true, if the client has the money (and the will) to pay for 2 identical geographical sites it's nice for you.

Vitor

Posted: Wed Jan 05, 2011 9:56 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

lancelotlinc wrote:

Accepting we're sliding off topic here;

This does actually ring a faint bell. I know on HP-UX and Solaris years back I had problems where /var/mqm switched from the active to the passive machine with all the file handles intact so when the passive queue manager tried to start it couldn't get access to the logs from the OS.

I'm not aware of anything in any version of WMQv6 like a file that maintained which queue manager owned what. Pre-v7 MI you brought up the "same" queue manager; or I always have.

What you're describing sounds more like the "baton" that machines in a ServiceGuard or a Tru64 cluster use between themselves to describe who's active.


lancelotlinc

Posted: Wed Jan 05, 2011 10:03 am Post subject:

Jedi Knight

Joined: 22 Mar 2010
Posts: 4941
Location: Bloomington, IL USA

On the off chance that the active-passive failover worked, there was a gap of 90 seconds before business could be resumed. That was another reason to go active-active, because there is no change to what the external customer sees in response time (except for a few more milliseconds in latency.)
