MQ Disaster-Recovery setup question |
Author |
Message
|
sunny_30 |
Posted: Sun Jul 12, 2009 1:55 pm Post subject: MQ Disaster-Recovery setup question |
|
|
 Master
Joined: 03 Oct 2005 Posts: 258
|
We have HACMP set up on our production MQ (V6.0) running on AIX (V5.3), with separate MQ installs on both the active and passive nodes. All MQ file systems except the install (/usr/mqm) switch to the passive node on failover.
We are planning to set up a disaster-recovery instance at a remote physical location, which will also have a separate MQ install. MQ will be at the same level on all three systems: active, failover & DR.
Circular logging is used for MQ, and message loss/duplication upon DR is acceptable. The idea is to have the data synced to the remote instance once every 12 hours.
The data is mirrored to DR for the following MQ file systems while MQ is running in PROD:
/var/mqm
/var/mqm/errors
/var/mqm/log &
/var/mqm/qmgrs/
My question is:
With the above setup, will there be a problem with MQ-active logs trying to bring up MQ on the DR-site? If yes, is it going to sometimes (or) always?
What if the active node in HACMP is suddenly shut down (say, a power failure)? Will the active logs cause the same problem when bringing up MQ on the passive node?
For HACMP the same log file system, /var/mqm/log, is shared, whereas for DR this file system is copied.
I have read in the forum that copying the active logs while MQ is running will cause problems when bringing up MQ from the copied logs.
I don't have a clear answer. Can you please confirm?
Thanks in advance. |
|
exerk |
Posted: Sun Jul 12, 2009 2:18 pm Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
Questions...
1. Is the fail-over automatic, or manual?
2. Why are you copying all the file systems when only the queue manager data and logs are required?
3. How far away is your DR site?
sunny_30 wrote: |
My question is:
With the above setup, will there be a problem with MQ-active logs trying to bring up MQ on the DR-site? If yes, is it going to sometimes (or) always? |
Is it going to sometimes, or always what? Not come up, or come up? Why should there be a problem, apart from the fact that you're going to be 12 hours behind?
sunny_30 wrote: |
My question is:
What if the active-node in HACMP is suddenly shutdown (say power failure). Will it be the same problem bringing up MQ on passive node? |
If it was a problem, what would be the point of the second node? The queue manager will recover itself.
sunny_30 wrote: |
My question is:
For HACMP the same log-filesystem: /var/mqm/log is used, whereas for DR this file-system is copied. |
And? It's a copy (albeit a 12 hour old copy) you'll be using, but it will still be the 'same' file system.
sunny_30 wrote: |
...I have read in the forum that copying the active logs while MQ is running will give problems bringing up MQ using copied logs... |
Any copy of an 'in-flight' file means at best you get a 'fuzzy' back up, although I think that's pretty irrelevant in this scenario if you are prepared for a 12 hour split on active/DR. Will the 12 hour sync be a 'fuzzy' backup, or do you intend stopping the queue manager and copying?
And I do hope you're using the MC91 SupportPac to make your life easier. _________________ It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys. |
|
sunny_30 |
Posted: Sun Jul 12, 2009 7:10 pm Post subject: |
|
|
 Master
Joined: 03 Oct 2005 Posts: 258
|
Thank you for your response.
In essence, my question is this:
While the QM on, say, system 'A' is processing messages & is in the middle of a UOW, if I copy its active logs to the same-named QM on a different system, say 'B', and wait for a day or so, does the QM on system B come up with no errors when I run strmqm?
Please note that along with the logs, the qmgr data directory is also copied.
I agree the other file systems don't need to be copied to DR.
In our scenario, failover is automatic & the DR site is hundreds of miles away. |
|
vol |
Posted: Sun Jul 12, 2009 10:36 pm Post subject: |
|
|
Acolyte
Joined: 01 Feb 2009 Posts: 69
|
The answer to your question is NO. The qmgr data and logs must not be copied when the qmgr is running. |
|
Vitor |
Posted: Mon Jul 13, 2009 12:36 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
sunny_30 wrote: |
While the QM on, say, system 'A' is processing messages & is in the middle of a UOW, if I copy its active logs to the same-named QM on a different system, say 'B', and wait for a day or so, does the QM on system B come up with no errors when I run strmqm? |
No. The queue manager on B will not start correctly, or if it does it'll be a total fluke, and you'll not be able to guarantee consistency to any auditor who asks. Especially if, as you say, there was an open UOW. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
sunny_30 |
Posted: Mon Jul 13, 2009 2:56 am Post subject: |
|
|
 Master
Joined: 03 Oct 2005 Posts: 258
|
Thank you.
As there is no guarantee that the QM will come up cleanly on DR (with the setup I described above), I think it's better to have just a separate install and save a copy of the MQ objects & authorizations using the saveqmgr SupportPac.
For our requirement, all messages are non-persistent & the logging used is circular. We are OK with message loss/duplication upon DR. I don't think there is any need, for our setup, to copy the data/qmgr directories (/var/mqm & /var/mqm/qmgrs).
So, each time on DR, the steps would be:
1) create a QM (same name) with the same logging (same number and size of circular logs)
2) create all MQ objects from the backed-up saveqmgr file
3) run setmqaut (from the saved authorizations) using the saved amqoamd/saveqmgr output
Please correct me if I'm wrong about the DR setup.
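The three steps above can be sketched as a dry-run plan. This is only an illustration under stated assumptions: the helper name `dr_rebuild_plan`, the file names, and the log sizes are hypothetical, and it assumes the saved authorizations exist as a script of setmqaut commands. The `crtmqm` options used are `-lc` (circular logging), `-lp`/`-ls` (primary/secondary log files) and `-lf` (log file pages). Nothing is executed; the function only returns the command lines so they can be reviewed first.

```python
import shlex


def dr_rebuild_plan(qmgr, primary_logs, secondary_logs, log_file_pages,
                    mqsc_file, authority_script):
    """Return the DR rebuild steps as shell command strings (dry run only)."""
    qm = shlex.quote(qmgr)
    return [
        # 1) Same name, circular logging, same number and size of logs as prod.
        f"crtmqm -lc -lp {int(primary_logs)} -ls {int(secondary_logs)} "
        f"-lf {int(log_file_pages)} {qm}",
        # The queue manager must be running before objects can be defined.
        f"strmqm {qm}",
        # 2) Replay the object definitions saved by saveqmgr (an MQSC file).
        f"runmqsc {qm} < {shlex.quote(mqsc_file)}",
        # 3) Reapply authorities. ASSUMPTION: the saved output is a shell
        #    script containing one setmqaut command per line.
        f"sh {shlex.quote(authority_script)}",
    ]


if __name__ == "__main__":
    # All values below are made up for illustration.
    for command in dr_rebuild_plan("QM1", 3, 2, 1024, "QM1.mqsc", "QM1.aut.sh"):
        print(command)
```

Keeping the plan as data makes it easy to store next to the saved definitions and to check that the log settings still match production before a DR test.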
I have a question about HACMP failover:
Here all file systems are shared (switched over to the other system on failover) except the install, as we have separate installs:
Code: |
No. The queue manager on B will not start correctly, or if it does it'll be a total fluke, and you'll not be able to guarantee consistency to any auditor who asks. Especially if, as you say, there was an open UOW. |
What happens if the QM is suddenly shut down in the middle of a UOW? Can the same QM come up cleanly on the same system / on failover?
I ask this question because for failover the same file system (/var/mqm/log) is used & remember the QM is shut down abruptly (say, a power failure), so the active logs might still be in a 'fuzzy' state. |
|
exerk |
Posted: Mon Jul 13, 2009 3:03 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
sunny_30 wrote: |
...So, each time on DR, the steps would be:
1) create a QM (same name) with the same logging (same number and size of circular logs)
2) create all MQ objects from the backed-up saveqmgr file
3) run setmqaut (from the saved authorizations) using the saved amqoamd/saveqmgr output |
Considerations
1. IP Address changes, if any?
2. Channel sync number resets.
sunny_30 wrote: |
...What happens if the QM is suddenly shutdown in the middle of UOW?... |
The same as it would if the queue manager was stand-alone on a single physical server, it will roll back any in-doubts, and recover. |
|
sunny_30 |
Posted: Mon Jul 13, 2009 3:27 am Post subject: |
|
|
 Master
Joined: 03 Oct 2005 Posts: 258
|
Code: |
Considerations
1. IP Address changes, if any?
2. Channel sync number resets. |
I agree.
Yes, the IP address will change for DR,
but the alias will remain the same as prod on DR.
For DR, I think I will also need to back up the qm.ini & mqs.ini files.
Code: |
The same as it would if the queue manager was stand-alone on a single physical server, it will roll back any in-doubts, and recover. |
If the active logs and /var/mqm/qmgrs are backed up to DR,
then when trying to bring up MQ on the DR system, why can't the DR also behave "as if" it is on the same physical system
(i.e. rolling back any in-doubts etc. & MQ coming up cleanly)?
What's different on DR? Will it be missing some other file? |
|
exerk |
Posted: Mon Jul 13, 2009 3:41 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
sunny_30 wrote: |
...yes IP address will change for DR, but alias will remain the same as prod on DR... |
By this, I take it you mean the underlying IP Address of the DNS entry will change...
sunny_30 wrote: |
For DR, I think I will also need to back up qm.ini & mqs.ini files |
Why? You've already stated that the DR queue manager will be a 'stand-alone' queue manager, in which case do you really want to over-write its qm.ini file, bearing in mind the log path will be different to an HA queue manager?
exerk wrote: |
The same as it would if the queue manager was stand-alone on a single physical server, it will roll back any in-doubts, and recover. |
Your question was in the context of HACMP, i.e. High Availability, not DR, but I stress again: look at the use of a back-up queue manager, and look at hours-of-service to take down your queue manager to back it up and copy the necessary files.
Also, research some more and then come back and ask questions; it strikes me that your understanding of both HA and DR are lacking. |
|
Vitor |
Posted: Mon Jul 13, 2009 3:58 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
sunny_30 wrote: |
I ask this question because for failover the same file system (/var/mqm/log) is used & remember the QM is shut down abruptly (say, a power failure), so the active logs might still be in a 'fuzzy' state. |
There is a difference between a SAN sitting between 2 boxes in an HACMP failover situation, and a copy made while the files are in use. In an HACMP situation, the queue manager will start because it's the "same" queue manager restarting. You need to understand this distinction.
You'll still find any in-flight UOW will be rolled back. |
|
sunny_30 |
Posted: Mon Jul 13, 2009 7:53 pm Post subject: |
|
|
 Master
Joined: 03 Oct 2005 Posts: 258
|
Exerk & Vitor: Thank you for your responses
Exerk, you say:
Code: |
You've already stated that the DR queue manager will be a 'stand-alone' queue manager, in which case do you really want to over-write its qm.ini file, bearing in mind the log path will be different to an HA queue manager? |
On the DR system, if a new QM is created with the same name as in prod, why would the log path in qm.ini change? I don't get how it would be different from prod.
Wouldn't the log path be the same on both systems:
LogPath=/var/mqm/log/QM-NAME/
Am I missing something here?
If on prod the qm.ini carries a parameter "MaxChannels" that is set to a non-default value, say 300, backing this up to DR would ensure we have the same config set for the DR QM also.
As it is stated (and I also agree) that in a DR situation copying over the active logs would cause a problem bringing up the QM cleanly, I decided my plan will be to build the new QM in DR "each time", exactly as set up in PROD.
However, my question still remains:
If the QM is abruptly shut down in the middle of a UOW/transaction,
why do the active logs not cause any problem starting MQ on failover,
but if the same active logs are copied (while the UOW is not complete) over to the DR site and I then try to bring up the QM there, it won't start?
Vitor, you say:
Code: |
There is a difference between a SAN sitting between 2 boxes in an HACMP failover situation, and a copy made while the files are in use. In an HACMP situation, the queue manager will start because it's the "same" queue manager restarting. You need to understand this distinction. |
Can you please explain the distinction to me?
The QM on the DR site, where I copy the active logs to, will have the same name.
I will also copy the file 'amqhlctl.lfh' from '/var/mqm/log/QMGR'.
How does MQ know (upon executing strmqm) whether the active logs being used were copied from a different QM or belong to the QM on the same system where they were created?
Is it this file, 'amqhlctl.lfh', that will be inconsistent when the active logs are copied to DR?
Exerk, you say:
Code: |
it strikes me that your understanding of both HA and DR are lacking |
I really appreciate your valuable time & effort in answering my questions, but please note that obviously I wouldn't be asking questions in the forum if I understood everything.
I'm just trying to understand what causes the difference in MQ's behavior in the two scenarios below:
1) Trying to bring up MQ on DR after the active logs are copied in the middle of a transaction
2) Trying to bring up MQ on the same system/failover after the QM shuts off abruptly in the middle of a UOW
Thanks again |
|
exerk |
Posted: Mon Jul 13, 2009 11:27 pm Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
sunny_30 wrote: |
...Exerk, you say:
Code: |
You've already stated that the DR queue manager will be a 'stand-alone' queue manager, in which case do you really want to over-write its qm.ini file, bearing in mind the log path will be different to an HA queue manager? |
On the DR system, if a new QM is created with the same name as in prod, why would the log path in qm.ini change? I don't get how it would be different from prod. Wouldn't the log path be the same on both systems:
LogPath=/var/mqm/log/QM-NAME/
Am I missing something here? |
Which is why I asked the question (still unanswered) of whether you are using the MC91 SupportPac. I could give you the answer, but instead I suggest you go and look in the qm.ini file of an HA queue manager, and one that is 'stand-alone' on a server, compare them, then ask your question again. Of course, it's always possible that the HA install done on your site has been done differently to how another site would do it.
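The comparison suggested above can be done by eye, or with a small script. The following is a minimal sketch, assuming qm.ini uses the usual layout of `Stanza:` lines followed by `Key=Value` lines; the function names are hypothetical, and the sample paths and values are invented for illustration rather than taken from any real install. Repeated stanzas of the same name are collapsed, which is good enough for spotting a differing `LogPath`.

```python
def parse_qm_ini(text):
    """Map (stanza, key) -> value for a qm.ini-style text."""
    settings = {}
    stanza = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        if line.endswith(":") and "=" not in line:
            stanza = line[:-1]  # start of a new stanza, e.g. "Log:"
        elif "=" in line:
            key, value = line.split("=", 1)
            settings[(stanza, key.strip())] = value.strip()
    return settings


def diff_qm_ini(left_text, right_text):
    """Return {(stanza, key): (left_value, right_value)} for differing keys."""
    left, right = parse_qm_ini(left_text), parse_qm_ini(right_text)
    return {
        key: (left.get(key), right.get(key))
        for key in sorted(set(left) | set(right))
        if left.get(key) != right.get(key)
    }


if __name__ == "__main__":
    # Invented sample contents; real files will differ.
    ha_ini = "Log:\n   LogPath=/MQHA/QM1/log/QM1/\nChannels:\n   MaxChannels=300\n"
    standalone_ini = "Log:\n   LogPath=/var/mqm/log/QM1/\nChannels:\n   MaxChannels=300\n"
    for (stanza, key), (ha, alone) in diff_qm_ini(ha_ini, standalone_ini).items():
        print(f"{stanza}/{key}: HA={ha} stand-alone={alone}")
```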
sunny_30 wrote: |
...If on prod the qm.ini carries a parameter "MaxChannels" that is set to a non-default value, say 300, backing this up to DR would ensure we have the same config set for the DR QM also... |
Not necessary, because of course you will have built the DR queue manager to the exact same specification as the Production queue manager, especially as you then state...
sunny_30 wrote: |
...As it is stated (and I also agree) that in a DR situation copying over the active logs would cause a problem bringing up the QM cleanly, I decided my plan will be to build the new QM in DR "each time", exactly as set up in PROD... |
sunny_30 wrote: |
...However, my question still remains:
If the QM is abruptly shut down in the middle of a UOW/transaction,
why do the active logs not cause any problem starting MQ on failover,
but if the same active logs are copied (while the UOW is not complete) over to the DR site and I then try to bring up the QM there, it won't start... |
Because in the case of HA, the logs will be consistent with the queue manager, i.e. when the queue manager restarts, it will recover its state at the moment of failure: in-flight UOWs will be rolled back, and queues will only contain committed messages, except for any that may be in-flight due to a channel recovering. 12-hour-old in-flight copied logs and in-flight copied queue files, blown over another queue manager, will not be consistent.
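To make "in-flight UOWs will be rolled back, and queues will only contain committed messages" concrete, here is a toy model of restart recovery from a consistent log. It is a sketch only: the record layout and the `recover` function are invented for illustration and bear no relation to WMQ's actual log format or recovery code.

```python
def recover(log_records):
    """Rebuild queue contents from a log, keeping only committed work.

    Each record is either ("put", uow_id, queue_name, message)
    or ("commit", uow_id). Returns (queues, rolled_back_uow_ids).
    """
    committed = {record[1] for record in log_records if record[0] == "commit"}
    queues = {}
    rolled_back = set()
    for record in log_records:
        if record[0] != "put":
            continue
        _, uow_id, queue_name, message = record
        if uow_id in committed:
            queues.setdefault(queue_name, []).append(message)
        else:
            # No commit record was logged before the failure: roll back.
            rolled_back.add(uow_id)
    return queues, rolled_back


if __name__ == "__main__":
    # UOW 1 committed before the power failure; UOW 2 was still in flight.
    log = [("put", 1, "Q1", "a"), ("put", 2, "Q1", "b"), ("commit", 1)]
    print(recover(log))
```

This only works because the log and the queue data describe the same moment in time, which is what a shared disc gives you on failover and what a copy taken from a running queue manager does not.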
I'll leave my illustrious master to answer the questions you have directed at him, but answer your last point...
sunny_30 wrote: |
Exerk, you say:
Code: |
it strikes me that your understanding of both HA and DR are lacking |
I really appreciate your valuable time & effort in answering my questions, but please note that obviously I wouldn't be asking questions in the forum if I understood everything.
I'm just trying to understand what causes the difference in MQ's behavior in the two scenarios below:
1) Trying to bring up MQ on DR after the active logs are copied in the middle of a transaction
2) Trying to bring up MQ on the same system/failover after the QM shuts off abruptly in the middle of a UOW
Thanks again |
Whilst I appreciate that you are asking for clarification, a lot of the questions you are asking could be answered by yourself with just a little bit of research, such as can be found HERE. |
|
Vitor |
Posted: Tue Jul 14, 2009 1:04 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
sunny_30 wrote: |
Vitor, you say:
Code: |
There is a difference between a SAN sitting between 2 boxes in an HACMP failover situation, and a copy made while the files are in use. In an HACMP situation, the queue manager will start because it's the "same" queue manager restarting. You need to understand this distinction. |
Can you please explain the distinction to me?
The QM on the DR site, where I copy the active logs to, will have the same name.
I will also copy the file 'amqhlctl.lfh' from '/var/mqm/log/QMGR'.
How does MQ know (upon executing strmqm) whether the active logs being used were copied from a different QM or belong to the QM on the same system where they were created?
Is it this file, 'amqhlctl.lfh', that will be inconsistent when the active logs are copied to DR? |
If the file is on a SAN, it's the actual file at the point in time it failed, and the SAN software running the disc will clear up buffers and so forth. If you make a copy while the file is in use, you'll get whatever the OS thinks the file looks like. If you copy any file while it's being written to, it will not be a full, complete or predictable copy because of how the OS runs the file system. This has nothing to do with WMQ or queue manager logs; it's simple file handling, and it's why people use shared disc to ensure consistency.
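A small simulation shows why a copy taken while files are in use is unpredictable. It is a deliberately simplified sketch: two in-memory lists stand in for a log file and a queue file, and the function names are hypothetical. It models only the gap between the moments the copier reads each file, not torn writes inside a single file.

```python
def copy_while_writing(updates, copy_data_after, copy_log_after):
    """Apply updates to a 'log' and a 'data' file while a copier runs.

    The copier snapshots the data file after copy_data_after updates and
    the log file after copy_log_after updates. Returns (data_copy, log_copy).
    """
    data_file, log_file = [], []
    data_copy, log_copy = None, None
    for done in range(len(updates) + 1):
        # 'done' updates have been applied so far; the copier may run now.
        if done == copy_data_after:
            data_copy = list(data_file)
        if done == copy_log_after:
            log_copy = list(log_file)
        if done < len(updates):
            log_file.append(updates[done])   # the log is written first
            data_file.append(updates[done])  # then the queue file
    return data_copy, log_copy


def is_consistent(data_copy, log_copy):
    """The copies agree only if they describe the same point in time."""
    return data_copy == log_copy


if __name__ == "__main__":
    messages = ["m1", "m2", "m3"]
    # Copier reads the data file early and the log file late: a fuzzy copy.
    print(is_consistent(*copy_while_writing(messages, 1, 3)))
    # Both files read at the same quiet moment (e.g. queue manager stopped).
    print(is_consistent(*copy_while_writing(messages, 3, 3)))
```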
I agree with my apprentice, that you're asking a lot of questions that have easily found answers. You also seem to be asking some general questions about DR that you can find elsewhere, and should certainly understand before trying to implement this.
You also seem to be obsessed with this question of in-flight transactions to the exclusion of other issues, many of which are more significant and potentially solve this issue.
I answer your question with another question: if you want to know what happens to in-flight transactions when you copy the in-use logs, why not just try it? Easy enough to set up a test rig. |
|
bruce2359 |
Posted: Tue Jul 14, 2009 6:18 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9470 Location: US: west coast, almost. Otherwise, enroute.
|
Quote: |
...obsessed with this question of in-flight transactions... |
I suppose a discussion of the definition of what DR means, and what DR does not mean, is in order. Outages span from the simple loss of an application file, to something catastrophic.
The usual definition of DR means that your primary (and hot-failover) IT site is dead. All is lost.
If this is the case, then there is no chance that in-flight transactions can or will be recovered. Why? Because backups are taken at some point in time, and shipped to the DR location.
It seems to me that you are mixing HA with DR. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
exerk |
Posted: Tue Jul 14, 2009 6:24 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
bruce2359 wrote: |
...It seems to me that you are mixing HA with DR... |
Does that equal HARD? |
|