Backup for catastrophic recovery |
Author |
Message
|
Boomn4x4 |
Posted: Mon Apr 09, 2012 7:04 am Post subject: Backup for catastrophic recovery |
|
|
Disciple
Joined: 28 Nov 2011 Posts: 172
|
I'm looking to get some clarification on the MQ documentation for backup / recovery.
I'm working on getting a backup qmgr that will reside on a backup server to take over in the event the hardware the primary qmgr is running on fails. The documentation is a little gray (at least from my perspective) as to what, exactly, needs to be done for backup. I'm looking for clarification.
One significant requirement for the backup procedure is that the active qmgr is not stopped.
Below is how I assume the backup procedure to work:
1. Primary system running DEV_BACKUP_QM with linear logging on
2. Inactive qmgr on a second backup server, created identically as the primary
3. Copied /var/mqm/log/active/DEV_BACKUP_QM and /var/mqm/qmgrs/ preserving permissions and ownership to the backup server
4. "start" the backup qmgr for replay, strmqm -r DEV_BACKUP_QM
The above procedure completes the "configuration and setup" of the backup qmgr. To continue, there needs to be regular updates?
5. Advance the qmgr logging on the primary qmgr using RESET QMGR TYPE(ADVANCELOG) ... this can be done while the qmgr is running.
6. Learn the current log number, DIS QMSTATUS CURRLOG.
7. Copy all of the log files in /var/mqm/log/active/DEV_BACKUP_QM (beginning with "S") with numbers up to, but not including, the log number given in step 6 above over to the backup qmgr
8. "start" the backup qmgr for replay, strmqm -r DEV_BACKUP_QM
Steps 1-4 only need to be performed initially (or if there is a change to the configuration of the qmgr). Steps 5-8 should be performed 'regularly'.
Is my understanding correct? |
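The selection rule in step 7 (ship every extent numbered below the CURRLOG value from step 6) can be sketched roughly as follows. This is only an illustration, not part of MQ: the function names are hypothetical, the extent naming pattern (S followed by seven digits and .LOG) is an assumption about how the linear log files are named, and the sketch neither runs any MQ command nor copies any file. In practice the file list would come from a listing of the primary's active log directory and the CURRLOG value from the DIS QMSTATUS CURRLOG output.

```python
import re

# Assumption: linear-log extents are named S<7 digits>.LOG, e.g. S0000012.LOG.
EXTENT_RE = re.compile(r"^S(\d{7})\.LOG$")


def extent_number(name):
    """Return the numeric part of an extent file name, or None if it is not an extent."""
    match = EXTENT_RE.match(name)
    return int(match.group(1)) if match else None


def extents_to_ship(file_names, currlog):
    """Pick the extents numbered below CURRLOG (step 7), in ascending order."""
    current = extent_number(currlog)
    if current is None:
        raise ValueError(f"unexpected CURRLOG value: {currlog!r}")
    selected = []
    for name in file_names:
        number = extent_number(name)
        # Skip anything that is not an extent, and the extent still being written.
        if number is not None and number < current:
            selected.append((number, name))
    return [name for _, name in sorted(selected)]


# Hypothetical directory listing and CURRLOG value, for illustration only.
listing = ["S0000003.LOG", "notes.txt", "S0000005.LOG", "S0000004.LOG"]
print(extents_to_ship(listing, "S0000005.LOG"))
```

With the hypothetical listing above, only S0000003.LOG and S0000004.LOG are selected; S0000005.LOG is left alone because the primary is still writing to it.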
|
mqjeff |
Posted: Mon Apr 09, 2012 7:09 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
You could consider using an MI qmgr as well.
It depends on how much you need to rely on disk being available or copyable. |
|
Boomn4x4 |
Posted: Mon Apr 09, 2012 7:30 am Post subject: |
|
|
Disciple
Joined: 28 Nov 2011 Posts: 172
|
mqjeff wrote: |
You could consider using an MI qmgr as well.
It depends on how much you need to rely on disk being available or copyable. |
That wasn't something I had considered, though now that I'm looking at it (briefly) I'm not sure that would suit our environment. This seems to be a high availability configuration, much greater than the availability we have now. Since there are other applications running on these servers that have a much lower level of recoverability, upon failure of the primary, there would still be a significant recovery process that would prevent us from appreciating that level of high availability offered with the MI qmgr. |
|
bruce2359 |
Posted: Mon Apr 09, 2012 7:55 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
Boomn4x4 wrote: |
Since there are other applications running on these servers that have a much lower level of recoverability, upon failure of the primary, there would still be a significant recovery process that would prevent us from appreciating that level of high availability offered with the MI qmgr. |
In most shops there are applications in need of very quick recovery, and other apps not requiring it. Consider moving those apps requiring quick recovery to an MI environment. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
mqjeff |
Posted: Mon Apr 09, 2012 9:36 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
It's really a question of how available the 'same' disk space is at the recovery site.
If the 'same' disk *is* available, regardless of whether or not you have the secondary instance running, then you don't have to worry about any of the steps you've listed to perform a copy of qmgr data while the qmgr is running.
If the 'same' disk is *not* available generally, then you need to take steps to copy data. But you might still consider using disk-level replication rather than qmgr-level replication. |
|
bruce2359 |
Posted: Mon Apr 09, 2012 9:57 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
If the app in question is mission-critical, you need to consider all forms of redundancy - mirrored disk, RAID, MI, ... , data replication to hot-site... |
|
Boomn4x4 |
Posted: Mon Apr 09, 2012 10:03 am Post subject: |
|
|
Disciple
Joined: 28 Nov 2011 Posts: 172
|
How much time are we talking about? And what, exactly, are we talking about getting backed up?
My assumption is that physical message data should not be a concern because if the applications are developed properly, then no messages should actually be sitting in queues. MQ is only supposed to be utilized as a message transport, not a database, even a short term / temporary one.
In the testing I have done thus far on backup and recovery, it takes only a few seconds to take a backup and copy it to another server, and only a few seconds to do the reverse: restore it and bring the new queue manager back online to process messages. |
|
Vitor |
Posted: Mon Apr 09, 2012 10:11 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
Boomn4x4 wrote: |
My assumption is that physical message data should not be a concern because if the applications are developed properly, then no messages should actually be sitting in queues. MQ is only supposed to be utilized as a message transport, not a database, even a short term / temporary one. |
That's a big assumption. Even if we accept applications are developed embracing the fact that WMQ is not DB2's cheaper brother, a properly developed application might still use WMQ as a short term / temporary store (or queueing mechanism) to buffer an application that periodically produces bursts of messages faster than they can be processed. Or an application which queues messages for processing by an application that runs once an hour on the hour for business reasons.
If these are use cases your site doesn't have, it becomes a smaller assumption. Replaced by the larger assumption that if this ever changes & those use cases are introduced, someone will rethink the recovery strategy. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
bruce2359 |
Posted: Mon Apr 09, 2012 10:16 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
What needs to be backed up depends on your definition of "an outage."
An outage may range from severe to trivial.
Consider: loss of a physical site, loss of a server, loss of a qmgr, to loss of a queue, loss of a message.
Time to recover... depends on what is lost.
What are you trying to recover from? |
|
Boomn4x4 |
Posted: Mon Apr 09, 2012 10:33 am Post subject: |
|
|
Disciple
Joined: 28 Nov 2011 Posts: 172
|
bruce2359 wrote: |
What are you trying to recover from? |
A loss of hardware. If the server that MQ is running on explodes, we need to get a backup queue manager back up and running. But, since our application resides on the same server as the queue manager, in the event of a hardware failure, there is no benefit to getting MQ restored instantly because it will still take time for the applications to recover. |
|
bruce2359 |
Posted: Mon Apr 09, 2012 10:41 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
Boomn4x4 wrote: |
bruce2359 wrote: |
What are you trying to recover from? |
A loss of hardware. If the server that MQ is running on explodes, we need to get a backup queue manager back up and running. But, since our application resides on the same server as the queue manager, in the event of a hardware failure, there is no benefit to getting MQ restored instantly because it will still take time for the applications to recover. |
Will the application take hours, days, to recover? Why do you see no benefit in getting a qmgr up and running sooner than the apps that make use of it? Color me confused.
Loss of a server is a specific type of outage. MI can address this. |
|
Vitor |
Posted: Mon Apr 09, 2012 10:47 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
Boomn4x4 wrote: |
since our application resides on the same server as the queue manager, |
Again, this is a design assumption. You need to be sure that no-one in future will switch the application to client so that 1-n instances of the application on 1-n dedicated servers can bang away at the same queue manager to increase throughput. Or that a new application isn't sited on its own server because it's a memory hog. |
|
Boomn4x4 |
Posted: Mon Apr 09, 2012 11:21 am Post subject: |
|
|
Disciple
Joined: 28 Nov 2011 Posts: 172
|
bruce2359 wrote: |
Boomn4x4 wrote: |
bruce2359 wrote: |
What are you trying to recover from? |
A loss of hardware. If the server that MQ is running on explodes, we need to get a backup queue manager back up and running. But, since our application resides on the same server as the queue manager, in the event of a hardware failure, there is no benefit to getting MQ restored instantly because it will still take time for the applications to recover. |
Will the application take hours, days, to recover? Why do you see no benefit in getting a qmgr up and running sooner than the apps that make use of it? Color me confused.
Loss of a server is a specific type of outage. MI can address this. |
The application takes about 5 minutes to recover. I don't see it as a benefit because it won't be doing anything, just sitting idle. Consider, MQ and the application will be residing on the same hardware. If I lose the server, I lose them both, and both will need to be restored. So what's the difference if MQ comes up at the same time, or sooner?
I should also mention that I will be running several thousand queue managers, getting close to 8,000 (not including the almost 8,000 backups). Currently, there is no networked storage device available that MI queue managers can write to. That would require the purchase of another server. If we were to choose to go that route, it will take a pretty strong argument that it is necessary to make that investment. |
|
Vitor |
Posted: Mon Apr 09, 2012 11:49 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
Boomn4x4 wrote: |
I should also mention that I will be running several thousand queue managers, getting close to 8,000 (not including the almost 8,000 backups). |
This adds weight to my earlier point. While the application currently resides on the same server as the queue manager, there's a very plausible chance that 3 months / 6 months / 2 years down the road someone's going to ask "why are we paying license fees for 8,000 queue managers? Can't we just have 1,000 really powerful queue managers and have the applications client onto them?".
This simplistically assumes 8,000 queue managers on 8,000 servers but you see my point. As I said above, if you're confident your WMQ topology will not change without the recovery strategy being rethought, you're good to go. |
|
Vitor |
Posted: Mon Apr 09, 2012 11:50 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
Boomn4x4 wrote: |
That would require the purchase of another server. |
Or a large, inexpensive SAN device. |
|