MQSeries.net :: View topic - Recovering a queue manager using Asynchronously copied Logs


markt

Posted: Fri Jun 24, 2011 4:40 am Post subject:

Knight

Joined: 14 May 2002
Posts: 508

Quote:

It's only valid to copy MQ logs, synchronously or asynchronously, when the queue manager is stopped.

The logs are inconsistent otherwise.

Only up to a point. The logs are NEVER inconsistent, because the question then is "what might they be inconsistent with?", to which the answer is the qfiles/pagesets. Logs are never inconsistent with themselves (unless the media is corrupted in some way or files are deleted, in which case you've got different things to worry about).

So you should not expect to copy a complete qmgr tree of logs and data and have it work correctly unless the qmgr is stopped (or you replicate BOTH filesystems in a consistent manner).

What you CAN do (as other people in this chain have implied) is replicate the log files either sync or async and use that to rebuild the contents of a qmgr. The replicated logs may be out of date compared with the original when you use async replication, but they are at least consistent. So you may get lost/duplicated messages, but the rebuilt qmgr will start and restore to the state as indicated by those logs.

Even async replication guarantees (provided it's properly configured) that log records are written in the same sequence as originally written. So an interruption to the replicator is identical to an interruption to the logger with a kill -9, although it might be behind in time. So the qmgr WILL recover from such an interruption, back out any in-flight UOWs whose completion had not yet made it across to the log replica, and so on.
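To make the point above concrete, here is a minimal toy sketch in Python. It is not how the queue manager actually formats or replays its logs; the record layout (`LogRecord`, with `uow`, `op` and `msg` fields) and the `recover` function are hypothetical names invented for illustration. The only assumption carried over from the discussion is that an async replica holds an ordered prefix of the primary's log records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical, simplified log record (not the real MQ log format)."""
    uow: int       # unit-of-work identifier
    op: str        # "put", "get" or "commit"
    msg: str = ""  # message payload for put/get records


def recover(log):
    """Replay a log: apply committed UOWs in order, back out the rest."""
    committed = {r.uow for r in log if r.op == "commit"}
    queue = []
    for r in log:
        if r.uow not in committed:
            continue  # in-flight UOW: its commit never reached this log
        if r.op == "put":
            queue.append(r.msg)
        elif r.op == "get":
            queue.remove(r.msg)
    backed_out = sorted({r.uow for r in log} - committed)
    return queue, backed_out


primary_log = [
    LogRecord(1, "put", "A"),
    LogRecord(1, "commit"),
    LogRecord(2, "put", "B"),
    LogRecord(2, "commit"),
    LogRecord(3, "get", "A"),
    LogRecord(3, "commit"),
]

# An async replica that is behind holds only the first records, in order.
replica_log = primary_log[:3]

print(recover(primary_log))  # (['B'], [])
print(recover(replica_log))  # (['A'], [2])
```

In this toy run the replica recovers cleanly, but UOW 2 is backed out, so message B is lost, and message A is back on the queue even though the primary had already delivered it. That is the "lost/duplicated messages" trade-off described above: the rebuilt state is consistent with the replicated log, just not necessarily with the last state of the primary.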

But you then need to think about a bunch of other stuff, such as what happens when you restore service to the prime systems, how networks are managed etc.

PeterPotkay

Posted: Fri Jun 24, 2011 9:06 am Post subject:

Poobah

Joined: 15 May 2001
Posts: 7722

fjb_saper wrote:

Well we need to make a few assumptions here...
So in case 1 the assumption is that the message is on the queue and the logger got interrupted...
In case 2 we see the same interruption of the logger but the message never hit the queue (as the remote center only gets the logs shipped)... So if the message was in a unit of work the qmgr does not know how to roll it back?

I'm thinking you would have your /var/mqm/log and /var/mqm disk groups in the same SRDF/A "consistency group" (probably not the right term) so that while they may be behind seconds or minutes, at least they are in sync with each other.

Thinking this through, and with markt's latest post, this sort of puts a whole new perspective on DR for MQ. If your goal is to just have a functional QM post DR, and a few missing or duplicate messages are not that important in the grand scheme of a DR (which they shouldn't be), maybe SRDF/A for MQ is viable? I still don't think it's perfect - if someone screws up you have 2 identical QMs online on the network at the same time. But it does seem to be a viable option to consider.
_________________
Peter Potkay
Keep Calm and MQ On

gbaddeley

Posted: Mon Jun 27, 2011 4:40 pm Post subject:

Jedi Knight

Joined: 25 Mar 2003
Posts: 2538
Location: Melbourne, Australia

Quote:

a few missing or duplicate messages are not that important in the grand scheme of a DR (which they shouldn't be)

It depends on how the applications are designed to handle these scenarios. A possible MQ DR strategy for non-replicated disks is to hand over the Queue Manager on the DR system with empty application queues.
_________________
Glenn

bruce2359

Posted: Mon Jun 27, 2011 4:51 pm Post subject:

Poobah

Joined: 05 Jan 2008
Posts: 9475
Location: US: west coast, almost. Otherwise, enroute.

Quote:

a few missing or duplicate messages are not that important in the grand scheme of a DR (which they shouldn't be)

No, it is possible that a single message might be worth a billion dollars.

The decision as to risk is for management to make, not for a sysadmin.

If loss of a single message puts the organization at critical risk, then management needs to contemplate (and fund) z/OS GDPS (Geographically Dispersed Parallel Sysplex).
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.

PeterPotkay

Posted: Tue Jun 28, 2011 9:28 am Post subject:

Poobah

Joined: 15 May 2001
Posts: 7722

And the only record of that billion dollars is in the one MQ message?

MQ is not a database. I would hope that critical data of that nature has other checks and balances in place.

I didn't say that a few missing or duplicate messages are never a problem. Ideally they are not, and that allows for considering SRDF/A replication of the queue manager data, since it appears the DR QM can function correctly based on the info in this thread.

Quote:

A possible MQ DR strategy for non-replicated disks is to hand over the Queue Manager on the DR system with empty application queues.

At least everyone's expectations can be set ahead of time and no one is (should be) surprised when they get a perfectly functional but empty set of queues, ready to start processing new business.

It's all about requirements though. If the only record of the data is in that MQ message, and you just can't lose it, even in a disaster, you have to design for it. A "simple" SRDF/A replication of /var/mqm and /var/mqm/log won't be enough in that case.
_________________
Peter Potkay
Keep Calm and MQ On

bruce2359

Posted: Thu Jun 30, 2011 3:23 pm Post subject:

Poobah

Joined: 05 Jan 2008
Posts: 9475
Location: US: west coast, almost. Otherwise, enroute.

PeterPotkay wrote:

And the only record of that billion dollars is in the one MQ message?

MQ is not a database. I would hope that critical data of that nature has other checks and balances in place.

MQ is more akin to a pipe through which messages flow.

Search here for well-behaved applications.
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
