Recovering a queue manager using Asynchronously copied Logs |
Author |
Message
|
pdminc |
Posted: Tue Jun 21, 2011 4:26 am Post subject: Recovering a queue manager using Asynchronously copied Logs |
|
|
Newbie
Joined: 26 Mar 2002 Posts: 4
|
We often use SRDF to synchronously replicate the queue and log files, but SRDF has a limited distance over which it can effectively be used to create such a "DR" environment. Has anyone successfully used an asynchronous copy (SRDF/A) of these files, or of just the log files, to recover a queue manager?
Clearly there may be both duplicate and missing messages, which the applications would need to deal with, but many apps can in fact cope with such a limitation.
Thanks ! |
|
|
 |
mqjeff |
Posted: Tue Jun 21, 2011 4:30 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
It's only valid to copy MQ logs, synchronously or asynchronously, when the queue manager is stopped.
The logs are inconsistent otherwise. |
|
|
 |
pdminc |
Posted: Tue Jun 21, 2011 6:46 am Post subject: |
|
|
Newbie
Joined: 26 Mar 2002 Posts: 4
|
mqjeff... Thanks for the reply.
We've been using synchronous replication for more than 10 years across hundreds of queue managers, we test it often, and in all this time we've never had an issue. Why do you think the logs would be inconsistent? |
|
|
 |
mqjeff |
Posted: Tue Jun 21, 2011 6:49 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
pdminc wrote: |
mqjeff... Thanks for the reply.
We've been using synchronous replication for more than 10 years across hundreds of queue managers, we test it often, and in all this time we've never had an issue. Why do you think the logs would be inconsistent? |
Because it is documented that the logs are only in a consistent state when the queue manager is stopped. |
|
|
 |
pdminc |
Posted: Tue Jun 21, 2011 7:14 am Post subject: |
|
|
Newbie
Joined: 26 Mar 2002 Posts: 4
|
Well, just because a log is not "consistent" does not mean a queue manager won't start, nor does it mean that it would have lost or mangled any messages.
Clearly, systems fail, and when reviving them, the logs are used to restart the queue manager. For example, if you kill the logger and then the rest of WMQ, WMQ will in fact come back up, recovering from the log and/or the queue files.
There is a well documented procedure to recreate the logs from the queue files. |
|
|
 |
bruce2359 |
Posted: Tue Jun 21, 2011 7:48 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
I doubt that auditors and IT management would agree with your definition of a recovered qmgr, namely: a qmgr with old and/or inconsistent data.
And what is the well documented procedure to recreate the logs from the queue files?
Using such a procedure presumes that you have also made backups of your queue files. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
mvic |
Posted: Tue Jun 21, 2011 1:18 pm Post subject: |
|
|
 Jedi
Joined: 09 Mar 2004 Posts: 2080
|
pdminc wrote: |
Well, just because a log is not "consistent" does not mean a queue manager won't start |
It does usually mean that. Though it depends critically on what exactly is "inconsistent" in the log.
Quote: |
nor does it mean that it would have lost or mangled any messages. |
It can easily mean that. For example, the queue manager is in the middle of writing a 100 MB message to the logs, and you decide at the same time to copy away the log file that is being written to. It's bound to lead to problems trying to restore the message from that log file at some later time.
Quote: |
Clearly, systems fail, and when reviving them, the logs are used to restart the queue manager. For example, if you kill the logger and then the rest of WMQ, WMQ will in fact come back up, recovering from the log and/or the queue files. |
This is a very particular example. By forcing the logger to end you cause it to cease to write to the log files. It does not write corrupt information, it just stops writing anything because the OS has ended the process. FWIW there's a manual page somewhere advising that if one day you really have a need to end a queue manager by external means then on Windows and Unix platforms the process containing the logger is the first process to end: see pages under http://publib.boulder.ibm.com/infocenter/wmqv7/v7r0/topic/com.ibm.mq.amqzag.doc/fa22280_.htm It is preferable to use endmqm, though, as this is cleaner.
Quote: |
There is a well documented procedure to recreate the logs from the queue files. |
The other way round. There's a procedure to recreate queues from logs, though only when using linear logs. That procedure is likely to give you corrupt queue files if you copied away log files while data for that queue file was still being written to them. |
|
|
 |
bruce2359 |
Posted: Tue Jun 21, 2011 1:37 pm Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
pdminc: you need to read the WMQ System Administration manual to understand what logs are, and what WMQ uses the logs for. You need to read and understand the sections of the manual that deal with managing the logs. You need to read and understand the sections on queue manager recovery and restart.
It is clear from your post that you do not have this basic understanding. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
PeterPotkay |
Posted: Thu Jun 23, 2011 4:32 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
Research "Backup Queue Managers" in the WMQ Info Center. It's sort of like log shipping for MQ. That might be the closest official solution to what you are attempting to do. _________________ Peter Potkay
Keep Calm and MQ On |
|
|
 |
PeterPotkay |
Posted: Thu Jun 23, 2011 4:36 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
mvic wrote: |
Quote: |
Clearly, systems fail, and when reviving them, the logs are used to restart the queue manager. For example, if you kill the logger and then the rest of WMQ, WMQ will in fact come back up, recovering from the log and/or the queue files. |
This is a very particular example. By forcing the logger to end you cause it to cease to write to the log files. It does not write corrupt information, it just stops writing anything because the OS has ended the process. |
Playing devil's advocate here.....
Case 1
So if you kill the logger or it dies, it stopped writing to the local log in mid stream. You now have a log file with an abrupt end of data stream in it. "Good data, good data, good data, good d|"
The QM can recover from that, I think we all agree.
Case 2
How is that different from SRDF/A'ing the active log to a remote data center? At any point in time, the log file in the remote data center is incomplete, it has "an abrupt end of data stream" in it.
"Good data, good data, good data, good d|"
Why can the QM recover from this in Case 1 but not Case 2? _________________ Peter Potkay
Keep Calm and MQ On |
|
|
 |
mvic |
Posted: Thu Jun 23, 2011 4:48 pm Post subject: |
|
|
 Jedi
Joined: 09 Mar 2004 Posts: 2080
|
PeterPotkay wrote: |
Why can the QM recover from this in Case 1 but not Case 2? |
Ah, but does it have an abrupt end to the data stream, in Case 2? I don't know the answer, but I suspect it does not have an abrupt end, in general.
I don't think we can know much about what data is copied, in the case when a copy operation runs while write operations are still taking place.
It must theoretically be possible to make it work, if copier and writer can somehow co-operate. Or maybe if the OS via whatever means holds the writer out while the copier does its work. But I do not know of any OS settings to enforce that. |
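The difference between an abrupt end and a copy taken while writes are in flight can be shown with a toy model. This is only a sketch: the record layout (length, payload, CRC32) and the names `encode` and `scan` are invented for illustration and are not the WMQ log format, and nothing in this thread establishes which of the two shapes an SRDF/A copy actually produces.

```python
import struct
import zlib

HEADER = struct.Struct(">I")   # hypothetical: 4-byte big-endian payload length
TRAILER = struct.Struct(">I")  # hypothetical: 4-byte CRC32 of the payload


def encode(payload: bytes) -> bytes:
    """Frame one record as length + payload + CRC32 (toy format)."""
    return HEADER.pack(len(payload)) + payload + TRAILER.pack(zlib.crc32(payload))


def scan(log: bytes):
    """Return (valid_records, status); status is 'clean', 'truncated_tail' or 'corrupt'."""
    records = []
    pos = 0
    while pos < len(log):
        if pos + HEADER.size > len(log):
            return records, "truncated_tail"      # not even a full header left
        (length,) = HEADER.unpack_from(log, pos)
        end = pos + HEADER.size + length + TRAILER.size
        if end > len(log):
            return records, "truncated_tail"      # record cut off by end of file
        payload = log[pos + HEADER.size:pos + HEADER.size + length]
        (crc,) = TRAILER.unpack_from(log, end - TRAILER.size)
        if crc != zlib.crc32(payload):
            return records, "corrupt"             # damage in the middle of the log
        records.append(payload)
        pos = end
    return records, "clean"


full = b"".join(encode(p) for p in (b"put msg-1", b"put msg-2", b"put msg-3"))

# Case 1 shape: the writer simply stopped, so the copy is a prefix of the log.
prefix_copy = full[:40]

# Other shape: a stale region in the middle while later bytes did arrive.
holed_copy = full[:22] + b"\x00" * 4 + full[26:]

print(scan(prefix_copy))
print(scan(holed_copy))
```

In this model the prefix copy loses only the record that was being written, while the copy with a stale region loses everything from the damaged record onward, including a later record that arrived intact. Which situation applies depends on whether the replication preserves write order, and that is the open question here.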
|
|
 |
bruce2359 |
Posted: Thu Jun 23, 2011 4:59 pm Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
Since we're talking logs, we're talking persistent messages. And, if restart log replay encounters a UofW that did not complete, the UofW will be rolled back. Thus, message loss.
Best case would be for one persistent message per UofW. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
PeterPotkay |
Posted: Thu Jun 23, 2011 5:00 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
mvic wrote: |
PeterPotkay wrote: |
Why can the QM recover from this in Case 1 but not Case 2? |
Ah, but does it have an abrupt end to the data stream, in Case 2? I don't know the answer, but I suspect it does not have an abrupt end, in general.
|
Assume a disaster that blows up your primary site and all of a sudden your SRDF/A got no mo' data to replicate.
"Good data, good data, good data, good d|"
Missing messages? Probably. Duplicate messages? Likely.
But not able to at least start up and be ready for new work? _________________ Peter Potkay
Keep Calm and MQ On |
|
|
 |
fjb_saper |
Posted: Thu Jun 23, 2011 8:19 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
PeterPotkay wrote: |
Case 1
So if you kill the logger or it dies, it stopped writing to the local log in mid stream. You now have a log file with an abrupt end of data stream in it. "Good data, good data, good data, good d|"
The QM can recover from that, I think we all agree.
Case 2
How is that different from SRDF/A'ing the active log to a remote data center? At any point in time, the log file in the remote data center is incomplete, it has "an abrupt end of data stream" in it.
"Good data, good data, good data, good d|"
Why can the QM recover from this in Case 1 but not Case 2? |
Well we need to make a few assumptions here...
So in case 1 the assumption is that the message is on the queue and the logger got interrupted...
In case 2 we see the same interruption of the logger but the message never hit the queue (as the remote center only gets the logs shipped)... So if the message was in a unit of work the qmgr does not know how to roll it back?
That could mean that you have to specifically run with linear logs and recover the object from the media logs... Not a pretty sight when the logs have been tampered with...  _________________ MQ & Broker admin |
|
|
 |
bruce2359 |
Posted: Fri Jun 24, 2011 3:56 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
fjb_saper wrote: |
So if the message was in a unit of work the qmgr does not know how to roll it back? |
Do you mean xa-compliant UofWs here? Those will remain in-doubt until the other participant(s) come to life, and UofWs in-doubt are resolved (negotiated or manually).
Log replay starts at the last checkpoint (in this case not a shutdown checkpoint), replays the log forward (repeating puts, gets, commits, etc.), encounters the last log entry (a put in a UofW, for example), BUT finds no commit. Then, all work done in UofWs, but not committed, is rolled back (undone). _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
Last edited by bruce2359 on Fri Jun 24, 2011 4:43 am; edited 1 time in total |
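The replay sequence described above (start from the checkpoint, replay forward, roll back any UofW with no commit) can be sketched roughly as follows. This is a minimal model under stated assumptions, not how the queue manager is implemented: `LogRecord`, `replay`, the single in-memory queue and the unique message names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    op: str        # "put", "get" or "commit" (hypothetical record types)
    uow: int       # hypothetical unit-of-work id
    msg: str = ""  # message name for put/get; unused for commit


def replay(checkpoint_queue, records):
    """Replay records on top of the checkpoint state.

    Returns (queue, rolled_back_uows). Assumes message names are unique.
    """
    queue = list(checkpoint_queue)
    open_uows = set()  # units of work seen so far without a commit
    for r in records:
        if r.op == "put":
            queue.append(r.msg)
            open_uows.add(r.uow)
        elif r.op == "get":
            queue.remove(r.msg)
            open_uows.add(r.uow)
        elif r.op == "commit":
            open_uows.discard(r.uow)
    # End of log reached: undo, newest first, all work that never committed.
    uncommitted = [r for r in records if r.op != "commit" and r.uow in open_uows]
    for r in reversed(uncommitted):
        if r.op == "put":
            queue.remove(r.msg)      # an uncommitted put disappears
        else:
            queue.insert(0, r.msg)   # an uncommitted get is put back
    return queue, sorted(open_uows)


log = [
    LogRecord("put", 1, "m1"),
    LogRecord("commit", 1),
    LogRecord("put", 2, "m2"),
    LogRecord("put", 2, "m3"),  # log ends here: no commit for UofW 2
]
print(replay(["m0"], log))  # (['m0', 'm1'], [2])
```

Both messages put under UofW 2 are lost because its commit never reached the log, which is why one persistent message per UofW limits the loss to the message in flight.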
|
|
 |
|
|
 |
Page 1 of 2 |