using backup queue manager |
Author |
Message
|
jcv |
Posted: Tue Mar 10, 2009 10:26 am Post subject: using backup queue manager |
|
|
 Chevalier
Joined: 07 May 2007 Posts: 411 Location: Zagreb
|
Hello,
Quote: |
To ensure that a backup queue manager remains an effective method for disaster recovery it must be updated regularly. Regular updating lessens the discrepancy between the backup queue manager log, and the current queue manager log. |
How did you determine a reasonable update interval, and what is it?
I suppose some of the factors are the time needed to transfer log extents to the remote backup qmgr, the time needed to apply them there with strmqm -r, and the wish to avoid the overhead of forcing the active queue manager to switch to the next log extent, but I'm not sure. |
|
|
 |
JosephGramig |
Posted: Tue Mar 10, 2009 12:54 pm Post subject: |
|
|
 Grand Master
Joined: 09 Feb 2006 Posts: 1244 Location: Gold Coast of Florida, USA
|
I could be wrong, but I think you must play all the logs. That has nothing to do with when you took the image (which you should do often).
The question is: how many files do you want to play when you want to use the DR QMGR?
Only use files that are finished. |
|
|
 |
jcv |
Posted: Tue Mar 10, 2009 3:30 pm Post subject: |
|
|
 Chevalier
Joined: 07 May 2007 Posts: 411 Location: Zagreb
|
I have to copy and replay:
Quote: |
...all the log extents since the last update, and up to (but not including) the current extent |
The current active log extent is obtained with: DIS QMSTATUS CURRLOG
Exactly which log extents are those "since the last update, and up to the current extent" is still a bit blurry to me, but I hope it will become clear eventually. Probably, if the last extent I copied during the previous update was, let's say, S0000023.LOG and the current one is now S0000025.LOG, then I only have to copy and replay S0000024.LOG. Hence I expect to transfer and replay only a few log extents each time, and I wonder which operation is the main bottleneck in this procedure, and therefore how often it can be performed.
I think I can always ensure there will be new finished files to use by issuing: RESET QMGR TYPE(ADVANCELOG) |
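The extent selection described in this post can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes extents are named S followed by a seven-digit number and .LOG, as in the example above, and that the current extent name has already been read from DIS QMSTATUS CURRLOG.

```python
import re
from typing import List

# Assumed naming pattern, taken from the example names in the post (S0000023.LOG).
_EXTENT = re.compile(r"^S(\d{7})\.LOG$")


def extent_number(name: str) -> int:
    """Return the numeric part of a log extent file name."""
    match = _EXTENT.match(name)
    if match is None:
        raise ValueError(f"not a log extent name: {name!r}")
    return int(match.group(1))


def extents_to_ship(last_shipped: str, current: str) -> List[str]:
    """List the extents after the last shipped one, up to but not including the current one."""
    first = extent_number(last_shipped) + 1
    stop = extent_number(current)  # the current extent is still being written, so exclude it
    return [f"S{n:07d}.LOG" for n in range(first, stop)]


print(extents_to_ship("S0000023.LOG", "S0000025.LOG"))  # ['S0000024.LOG']
```

If the last shipped extent is the one directly before the current extent, the list is empty, which is the case where forcing a log advance first would be needed.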
|
|
 |
jcv |
Posted: Mon Mar 16, 2009 3:48 pm Post subject: |
|
|
 Chevalier
Joined: 07 May 2007 Posts: 411 Location: Zagreb
|
There is, of course, no doubt about what I marked as "probably" above; it is actually pretty clear. I've checked it and it works.
Has anyone implemented a qmgr point-in-time recovery procedure based on the "Using a backup queue manager" procedure? Experienced users in this field who are willing to share their experience are welcome, and their input will be appreciated. |
|
|
 |
PeterPotkay |
Posted: Mon Mar 16, 2009 4:22 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
I have not used backup QMs. As I understand it, it's sort of like log shipping. The problem as I see it is that the active QM only ships a log when it fills one and cuts the next one. If your QM has very large linear log files, and/or is not very busy, it can be quite a long time before the next log is shipped, meaning the backup QM can get hours or days or weeks behind.
I suppose you could script issuing RESET QMGR TYPE(ADVANCELOG) on a more frequent basis, but you will never get around the fact that your backup QM will always be behind, introducing missing messages or duplicate messages when DR strikes. _________________ Peter Potkay
Keep Calm and MQ On |
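Scripting the log advance mentioned above might look roughly like the following Python sketch. It pipes the MQSC command to runmqsc on standard input; the function name advance_log, the injectable run parameter, and the queue manager name QM1 in the usage comment are assumptions made for illustration, and the scheduling (cron or similar) is left out.

```python
import subprocess
from typing import Callable

# MQSC command quoted in the thread; runmqsc reads commands from standard input.
MQSC_ADVANCE = "RESET QMGR TYPE(ADVANCELOG)\n"


def advance_log(qmgr: str, run: Callable[..., subprocess.CompletedProcess] = subprocess.run) -> bool:
    """Ask the active queue manager to switch to a new log extent.

    Returns True when runmqsc exits with code 0. The run parameter is a
    hypothetical hook so the function can be exercised without MQ installed.
    """
    result = run(["runmqsc", qmgr], input=MQSC_ADVANCE, capture_output=True, text=True)
    return result.returncode == 0


# Usage on a host with MQ installed (QM1 is a placeholder name):
#     if not advance_log("QM1"):
#         raise SystemExit("log advance failed")
```

Running this more often shortens the gap Peter describes but does not remove it; the backup is still only as current as the last extent that was shipped and replayed.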
|
|
 |
jcv |
Posted: Tue Mar 17, 2009 2:59 pm Post subject: |
|
|
 Chevalier
Joined: 07 May 2007 Posts: 411 Location: Zagreb
|
Peter, thanks for the update.
I have opened a PMR on the same subject to collect information in parallel with this forum discussion. The point of the topic is to determine how suitable this feature is for DR with minimum loss of data. IBM emphasizes that alternative solutions exist, namely synchronous or asynchronous data replication at the disk level, which are more effective at minimizing data loss in case of disaster.
Using this feature would probably involve these steps:
- determine the optimal log extent size according to the intensity of logging and the frequency of taking log extent copies
- use logger event messages during busy periods
- use RESET QMGR TYPE(ADVANCELOG) during quiet periods (when no logger event message has arrived within a certain time period)
- fast and reliable transfer of log extents to the backup site
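The split between busy and quiet periods in the steps above comes down to a simple timing rule. A minimal sketch, assuming the time of the last logger event message is tracked somewhere; the function name should_force_advance and the 15-minute limit are hypothetical values chosen only for illustration:

```python
from datetime import datetime, timedelta


def should_force_advance(last_logger_event: datetime, now: datetime, quiet_limit: timedelta) -> bool:
    """Return True when no logger event has been seen for at least quiet_limit.

    In busy periods logger events arrive on their own and nothing needs forcing;
    in quiet periods this signals that RESET QMGR TYPE(ADVANCELOG) should be issued.
    """
    return now - last_logger_event >= quiet_limit


last_event = datetime(2009, 3, 17, 12, 0)
limit = timedelta(minutes=15)  # assumed threshold, not a recommended value
print(should_force_advance(last_event, datetime(2009, 3, 17, 12, 20), limit))  # True
print(should_force_advance(last_event, datetime(2009, 3, 17, 12, 5), limit))   # False
```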
Applying the log extents at the backup site was described as a separate problem, although I had implied it was part of the same one, suggesting that log extents be replayed at the backup site as they arrive and that the results be reported back to the active site synchronously.
It was emphasized that IBM intentionally does not offer an automated procedure based on this feature, thereby leaving the customer flexibility in defining the final solution. I knew from the start that this kind of flexibility would be the point that bothers me the most, because it means the liberty to make wrong decisions. Besides that, it raises an organizational question that needs clarifying, because management thinks of qmgr point-in-time recovery as solely an MQ admin task, while data replication at the disk level probably is not. |
|
|
 |
jcv |
Posted: Tue Mar 17, 2009 5:31 pm Post subject: |
|
|
 Chevalier
Joined: 07 May 2007 Posts: 411 Location: Zagreb
|
What happens at the backup site is definitely not a separate problem. If something goes wrong and new log extents cannot be replayed successfully, there is a chance that the automated procedure has to be stopped, the active qmgr stopped too, and a full backup taken again and replayed before the automated procedure can be resumed. I guess.
The more I think of it, the less it seems like a trivial assignment. |
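The failure path guessed at in this post could be made explicit in an automated procedure along these lines. This is a rough Python sketch only: the ReplayOutcome enum and the replay_on_backup function are hypothetical names, and treating any non-zero exit code from strmqm -r as "stop and take a fresh full backup" is an assumption, not documented behaviour.

```python
import subprocess
from enum import Enum
from typing import Callable


class ReplayOutcome(Enum):
    OK = "ok"                      # extents applied, shipping can continue
    NEEDS_RESEED = "needs_reseed"  # replay failed, halt and take a new full backup


def replay_on_backup(backup_qmgr: str,
                     run: Callable[..., subprocess.CompletedProcess] = subprocess.run) -> ReplayOutcome:
    """Replay the copied log extents on the backup queue manager with strmqm -r.

    Assumption: a non-zero exit code means the replay did not complete, so the
    caller should stop the automated procedure rather than ship further extents.
    """
    result = run(["strmqm", "-r", backup_qmgr], capture_output=True, text=True)
    if result.returncode == 0:
        return ReplayOutcome.OK
    return ReplayOutcome.NEEDS_RESEED
```

Returning an explicit outcome rather than raising keeps the decision about stopping the active side with the caller, which is where the organizational question raised earlier would have to be answered.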
|
|
 |
bruce2359 |
Posted: Tue Mar 17, 2009 6:05 pm Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
Quote: |
The more I think of it, the less it seems like a trivial assignment. |
Who or what gave you the idea that DR is in any way trivial?!
At my shop I managed (over an extended time period) to get the DR name changed to Business Continuation. This name properly conveys the true importance of the DR/BC mission. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
jcv |
Posted: Thu Mar 19, 2009 12:58 am Post subject: |
|
|
 Chevalier
Joined: 07 May 2007 Posts: 411 Location: Zagreb
|
Mentioning (non)triviality applies to the observed procedure, not to DR in general. I don't recall saying that DR was unimportant or trivial.
You said nothing about how, or whether, you use backup qmgrs. |
|
|
 |
bruce2359 |
Posted: Thu Mar 19, 2009 5:39 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
Quote: |
Mentioning (non)triviality applies to the observed procedure ... |
To my way of thinking, non-triviality applies more to the scope of DR/BC. |
|
|
 |
fjb_saper |
Posted: Thu Mar 19, 2009 9:02 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
jcv wrote: |
There is, of course, no doubt about what I marked as "probably" above; it is actually pretty clear. I've checked it and it works.
Has anyone implemented a qmgr point-in-time recovery procedure based on the "Using a backup queue manager" procedure? Experienced users in this field who are willing to share their experience are welcome, and their input will be appreciated. |
Just a reminder when talking about point-in-time recovery for MQ.
THE ONLY POINT IN TIME YOU CAN RECOVER TO IS THE ONE IMMEDIATELY BEFORE THE CRASH. MQ is not a DB where you can, out of the box, get point-in-time recovery where you control the point in time (like 1 hour before the crash...).
This being said, enjoy. _________________ MQ & Broker admin |
|
|
 |
jcv |
Posted: Fri Mar 20, 2009 12:45 am Post subject: |
|
|
 Chevalier
Joined: 07 May 2007 Posts: 411 Location: Zagreb
|
If I can get out of the box recovery to the point in time immediately before the crash without using backup queue managers, then all my effort to learn about them would be unnecessary, because I would be quite satisfied with that, provided you are talking about some service-level (MQ-level) method of recovery. |
|
|
 |
PeterPotkay |
Posted: Fri Mar 20, 2009 4:28 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
Point-in-time recovery immediately prior to the crash? You are only going to get that with synchronous replication, i.e. an H.A. solution.
True DR implies datacenters so far apart that you must use asynchronous replication. As soon as you start talking about async replication, point-in-time recovery gets real hairy real fast. You will have missing messages. You will have duplicate messages. And you will have apps and executives asking you to explain it, since you told them you were replicating "the MQ". Most apps are not designed to deal with duplicate messages. Most people are not designed to deal with missing messages after being told you are replicating.
MQ is not a database. When your primary site has blown up and all heck is breaking loose, an MQ DR plan that gives the apps all the QMs, queues and channels in the DR site, minus any messages that may have been in queues, will be a good DR plan. The ability to fire up and start processing new business is usually the #1 priority, not dealing with some convoluted, expensive DR setup that tries to replicate all message traffic and will never get it 100% right. Since the apps would have to be prepared to deal with at least one message possibly missing, they can deal with multiple messages missing. This is not as big a deal as you think: most MQ messages should be zooming through in memory, so there is nothing to replicate there. And most queues should be at zero most of the time anyway.
MQ is not a database. Apps that send critical data in MQ messages that cannot be lost no matter what are going to have that data somewhere else as well, probably a database, which is designed to minimize data loss even in a disaster.
Having an MQ infrastructure in the DR site ready to work, with empty queues, is probably good enough for almost all scenarios. It's simple, and everyone understands that any and all messages that may be sitting in queues at the primary site will be gone. They also know they don't have to deal with duplicate messages, and that they have an MQ infrastructure that is ready to roll as soon as they get connected to the DR site.
MQ is not a database. As soon as you start talking about async replication, you will have missing messages, you will have duplicate messages. |
|
|
 |
bruce2359 |
Posted: Fri Mar 20, 2009 6:30 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
Quote: |
If I can get out of the box recovery ... |
If your application demands this scale of continuous availability, take a look at z/OS GDPS (Geographically Dispersed Parallel Sysplex). It is as out-of-the-box as you can get.
http://www-03.ibm.com/systems/z/advantages/gdps/index.html |
|
|
 |
|
|
 |
|