Author |
Message
|
klamerus |
Posted: Wed Jul 28, 2004 5:10 pm Post subject: Need Failover |
|
|
 Disciple
Joined: 05 Jul 2004 Posts: 199 Location: Detroit, MI
|
We have a pair of Windows NT servers with MQ 5.2 in an MQ cluster with cluster managers on each. We're moving to Windows 2000 with MQ 5.3 also in a cluster. Each of these servers is roughly identical.
Each of our servers had 6 queues. Each of these is read by a program that performs a work step and puts the results in the next queue.
We have a queue in front of both of these which distributes work across them in a roughly equitable manner.
The downside is that if one of the servers dies, all of the messages that have been queued to that server aren't acted upon until it's brought back up.
Is it possible for each of these servers to have backup queues for the other that are kept nearly in sync, so that if either of them dies, we have a copy of its work on the one that's still up that we can perform work against?
We would have a second set of programs for these backup queues that wouldn't be running until/unless the primary server were found not to be running.
I'm looking for something that I can only call a "shadow" queue.
We don't have any contact admin or SAN or any other form of shared storage available for use. |
|
|
 |
jefflowrey |
Posted: Wed Jul 28, 2004 5:28 pm Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
What's that you say? MQ Clustering isn't a failover solution? Who knew?
You could try to cobble something together using the mirrorq exit.
Or you could ask management to provide a reasonable estimate of the cost of the delayed work (the messages that are sitting on the downed server), and then show them how a shared disk will repay its cost in a relatively small period of time. If it doesn't, then they will have to be willing to accept the cost of the delayed work. _________________ I am *not* the model of the modern major general. |
|
|
 |
klamerus |
Posted: Wed Jul 28, 2004 5:39 pm Post subject: |
|
|
 Disciple
Joined: 05 Jul 2004 Posts: 199 Location: Detroit, MI
|
Perhaps. But I can see where this might be a desirable state of affairs where SANs weren't an option. Like across a WAN.
Furthermore, SANs and contact admin do both fail, and any "shared" storage is still a single point of failure. Not often, but occasionally.
Is it not possible to have the messages going into a queue automatically added to a separate queue and messages leaving a queue automatically removed from a separate queue? |
|
|
 |
sebastianhirt |
Posted: Wed Jul 28, 2004 10:49 pm Post subject: |
|
|
Yatiri
Joined: 07 Jun 2004 Posts: 620 Location: Germany
|
klamerus wrote: |
Furthermore, SANs and contact admin do both fail, and any "shared" storage is still a single point of failure. Not often, but occasionally.
|
Only my 2 cents, and maybe I am a bit pessimistic...
If you want to have no SPOF you will need to spend a lot of money to create redundancy...
I would probably try to go for the SAN/contact admin, and prepare a bullet-proof disaster recovery procedure to minimize the downtime if something really does happen.
Maybe I am wrong, but I think the chance that a SAN/contact admin goes down is way smaller than the chance that one of the NT boxes dies.
A SAN is always an option, even if you are distributed all over the world. The question is only whether you want to pay for the connection between your locations, and there I agree, it does not always make sense from a cost point of view...
Make sure that your management is aware of the costs if something happens. Give them a worst-case scenario... Calculate what the measures to minimize the risk would cost. This may help them to make a good decision.
Jeff,
I am interested in that mirrorq exit... Could you please give me a hint as to which part of the MQ documentation covers that topic?
THX
Seb |
|
|
 |
jefflowrey |
Posted: Thu Jul 29, 2004 3:19 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
mirrorq isn't in the documentation.
It's a sample exit, provided in the repository here among other places. _________________ I am *not* the model of the modern major general. |
|
|
 |
bower5932 |
Posted: Thu Jul 29, 2004 5:49 am Post subject: |
|
|
 Jedi Knight
Joined: 27 Aug 2001 Posts: 3023 Location: Dallas, TX, USA
|
You can also find mirrorq out at:
http://www.developer.ibm.com/tech/sampmq.html
There has been a minor update to the Linux and Windows versions. Still trying to get the update for the others out.
(Sorry we haven't been faster Charlie.) |
|
|
 |
clindsey |
Posted: Thu Jul 29, 2004 7:27 am Post subject: |
|
|
Knight
Joined: 12 Jul 2002 Posts: 586 Location: Dallas, Tx
|
Thanks Ron!
I just checked the packages and the updates are not in there yet. I think your buddy in Austin still has to publish them.
Charlie |
|
|
 |
RogerLacroix |
Posted: Thu Jul 29, 2004 8:06 am Post subject: |
|
|
 Jedi Knight
Joined: 15 May 2001 Posts: 3264 Location: London, ON Canada
|
The subject of fail-over almost always comes up when I first go to a client. In my opinion, this is the last big piece that is still truly not available for WMQ.
For most clients, I am able to convince them to use an Active / Passive setup with Veritas controlling the fail-over (see SupportPac MC6A). This works and works well.
The problem comes from some managers. They get bothered by the 'Passive' box. They don't like the idea of money sitting idle (Passive box not doing any work).
Using an exit to copy messages (i.e. from MQPUT) to another 'fail-safe' queue manager is very straightforward. Even deleting messages from the 'fail-safe' queue manager when an MQGET occurs is easy.
The hard part comes when you use transactions, local or global. What if the same transaction is backed out 10 times, or 1 time, or 100 times? What if the queue manager went down and the only way to get it to start is to use the command-line utilities to resolve or back out the transaction(s)? How is the API Exit going to help you here? Now your 'real' queue manager will be out of sync with your 'fail-safe' queue manager.
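To make the out-of-sync problem concrete, here is a minimal sketch in plain Python. It does not use the MQ API or the real mirrorq exit; `Queue` and `NaiveMirror` are hypothetical stand-ins, and the assumption is a mirror that copies on put and deletes on get but never hears about a backout.

```python
from collections import deque


class Queue:
    """Toy in-memory queue with one local unit of work (not real MQ)."""

    def __init__(self):
        self.messages = deque()
        self.in_flight = []  # messages read under syncpoint, not yet committed

    def put(self, msg):
        self.messages.append(msg)

    def get_under_syncpoint(self):
        msg = self.messages.popleft()
        self.in_flight.append(msg)
        return msg

    def commit(self):
        self.in_flight.clear()

    def backout(self):
        # Backed-out messages return to the head of the queue in original order.
        while self.in_flight:
            self.messages.appendleft(self.in_flight.pop())


class NaiveMirror:
    """Hypothetical exit: copies on put, deletes on get, ignores backout."""

    def __init__(self):
        self.primary = Queue()
        self.shadow = deque()  # stands in for the 'fail-safe' queue

    def put(self, msg):
        self.primary.put(msg)
        self.shadow.append(msg)

    def get(self):
        msg = self.primary.get_under_syncpoint()
        self.shadow.remove(msg)  # removal is mirrored immediately
        return msg


mirror = NaiveMirror()
for doc in ("doc-1", "doc-2", "doc-3"):
    mirror.put(doc)

mirror.get()              # a worker reads doc-1 under syncpoint
mirror.primary.backout()  # the worker fails, so doc-1 returns to the primary

primary_contents = list(mirror.primary.messages)
shadow_contents = list(mirror.shadow)
print("primary:", primary_contents)
print("shadow: ", shadow_contents)
```

After the backout the primary holds all three documents again, while the shadow has already lost doc-1. If the primary server died at this point, the shadow copy would silently be missing that piece of work.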
Of course, there is real-time sync-ing but if your applications do 25, 50, 100 or higher transactions per second then your sync program will bring the 'real' queue manager to a halt just trying to keep up.
Also, what about applications creating permanent dynamic queues? What if the application deletes the queue when it is done? What if it doesn't? What if the MQ Admin is very active? Adds 25 queues in the morning then deletes 10 queues in the afternoon? (Sounds weird but I have seen it in a PROD box!!!)
This is a non-trivial subject. Be very careful with the solution you come up with and with what the 'expectations' of your management are. If you have a disaster and your solution does not work 'as expected' and data is lost, duplicated, or just unavailable, it will be your butt that gets fired. Sometimes it is better to go with a proven solution that works (i.e. Veritas with Active / Passive) than to promise the moon.
Regards,
Roger Lacroix
Capitalware Inc. _________________ Capitalware: Transforming tomorrow into today. |
|
|
 |
klamerus |
Posted: Thu Jul 29, 2004 6:15 pm Post subject: Active/Passive |
|
|
 Disciple
Joined: 05 Jul 2004 Posts: 199 Location: Detroit, MI
|
Well, luckily for us the transaction volumes aren't severe. What we're passing around starts as data, but quickly turns into PDF documents. Essentially, this is a document processing center. The agents will generate documents from data, print them, fax them, e-mail them, and perform other document processing functions as part of a workflow.
It can handle hundreds of thousands of documents in a month, but generally, it only does 5000-10000 in any day (and that's across two servers). Most of these are < 20 K in size. Active-Passive would work, but each would need to be active for itself and passive for the other.
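As a rough back-of-the-envelope check on those figures, the sketch below converts the daily volume into a message rate and compares it with the 25 transactions per second mentioned earlier in the thread. The 8-hour working day and the idea that every document passes through all 6 queues are assumptions for illustration, not figures from the post.

```python
SECONDS_PER_DAY = 24 * 60 * 60

docs_per_day_peak = 10_000   # upper daily figure quoted above
doc_size_kb = 20             # upper size bound quoted above ("< 20 K")
queue_hops_per_doc = 6       # assumption: each document visits all 6 queues
working_hours = 8            # assumption: all work lands in an 8-hour day
heavy_rate_per_sec = 25      # lowest "heavy" rate mentioned earlier in the thread

# Average rate if the load were spread evenly over 24 hours.
avg_docs_per_sec = docs_per_day_peak / SECONDS_PER_DAY

# Busier estimate: same load squeezed into the working day, counted per queue hop.
busy_msgs_per_sec = (
    docs_per_day_peak * queue_hops_per_doc / (working_hours * 60 * 60)
)

# Total payload per day, counting each document once.
daily_volume_mb = docs_per_day_peak * doc_size_kb / 1024

print(f"average documents/sec over 24h: {avg_docs_per_sec:.3f}")
print(f"messages/sec in a busy 8h day:  {busy_msgs_per_sec:.2f}")
print(f"payload per day (MB):           {daily_volume_mb:.1f}")
print(f"headroom vs {heavy_rate_per_sec}/sec: {heavy_rate_per_sec / busy_msgs_per_sec:.0f}x")
```

Under these assumptions the load works out to roughly two messages per second at the busy estimate, an order of magnitude below the rates at which a synchronous mirror was described as a bottleneck. It says nothing about bursts, which would need real measurements.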
That way the servers are both actually working full-time. So far as disaster recovery goes, we've got that. It's at another site (actually at IBM), but nobody wants to go there except as a last resort.
We will eventually get SANs, but probably not until mid-to-late next year. The thing is that I actually want both of the systems to be doing work all the time. And they both are, so I was hoping there was a way to criss-cross this active-passive thing.
No big deal. It was just a thought. I'm sure we can accomplish this in some manner as described in the previous message. |
|
|
 |
jefflowrey |
Posted: Fri Jul 30, 2004 3:41 am Post subject: Re: Active/Passive |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
klamerus wrote: |
The thing is that I actually want both of the systems to be doing work all the time. And they both are, so I was hoping there was a way to criss-cross this active-passive thing. |
So, more of an Active-Active failover scenario... _________________ I am *not* the model of the modern major general. |
|
|
 |
|