robertblue
Posted: Fri Mar 14, 2008 12:11 pm Post subject: Backup Queue Manager on Windows 2003
Newbie
Joined: 03 Mar 2008 Posts: 9
Has anyone actually gotten a backup queue manager to work in the Windows environment?
SAFraser
Posted: Fri Mar 14, 2008 12:54 pm Post subject:
Shaman
Joined: 22 Oct 2003 Posts: 742 Location: Austin, Texas, USA
I don't understand your question. What do you mean by "a backup queue manager"?
robertblue
Posted: Fri Mar 14, 2008 12:59 pm Post subject: Backup Queue Manager on Windows 2003
Newbie
Joined: 03 Mar 2008 Posts: 9
According to the System Admin Guide, WebSphere MQ 6.0 provides support for a backup queue manager for an existing queue manager. Following the instructions in the System Admin Guide, we have been unable to get this to work.
We were just wondering if anyone has gotten it to work, especially in the Windows environment.
SAFraser
Posted: Fri Mar 14, 2008 2:05 pm Post subject:
Shaman
Joined: 22 Oct 2003 Posts: 742 Location: Austin, Texas, USA
I have not tried this; we use a slightly clumsier cold spare strategy for redundancy.
I am curious, though: where does it break? Just wondering.
robertblue
Posted: Fri Mar 14, 2008 2:15 pm Post subject: Backup Queue Manager on Windows 2003
Newbie
Joined: 03 Mar 2008 Posts: 9
The documentation says that after creating the backup queue manager, you take copies of all the existing queue manager's data and log file directories, including all subdirectories, and overwrite the backup queue manager's data and log file directories.
Then it says to run the following command, "strmqm -r BackupQMName", to flag the queue manager as a backup queue manager.
We do this, and then even if we immediately run "strmqm -s BackupQMName" to activate the queue manager, without doing any additional log updates, we cannot start the backup queue manager.
We get the following error:
WebSphere MQ queue manager 'FGAS.QMGR.PRD' starting.
AMQ6109: An internal WebSphere MQ error has occurred.
exitvalue = 71
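For readers following along, here is a minimal sketch of that sequence as a Windows batch script run on the standby machine. The share path, the use of robocopy, and the install directory are assumptions for illustration only; the strmqm flags are the ones from the documentation, using -a for activation as confirmed later in this thread.
Code:
@echo off
REM Hypothetical sketch, run on the standby (backup) machine. The queue manager
REM name, paths, and use of robocopy are examples only, not taken from this thread.

set QM=FGAS.QMGR.PRD
set MQ_DATA=C:\Program Files\IBM\WebSphere MQ
set PROD_SHARE=\\PRODHOST\MQBACKUP

REM 1. Create the backup queue manager with the same name, logging type and
REM    log sizes as the existing production queue manager (linear logging).
crtmqm -ll %QM%

REM 2. Copy the production queue manager's data and log directories (including
REM    all subdirectories) over the backup queue manager's directories.
robocopy "%PROD_SHARE%\qmgrs\FGAS!QMGR!PRD" "%MQ_DATA%\qmgrs\FGAS!QMGR!PRD" /MIR
robocopy "%PROD_SHARE%\log\FGAS!QMGR!PRD"   "%MQ_DATA%\log\FGAS!QMGR!PRD"   /MIR

REM 3. Flag the queue manager as a backup and replay the copied log extents.
REM    Repeat steps 2-3 as new log extents are shipped from production.
strmqm -r %QM%

REM 4. At failover time only: activate the backup, then start it normally.
strmqm -a %QM%
strmqm %QM%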
robertblue
Posted: Fri Mar 14, 2008 2:16 pm Post subject: Backup Queue Manager on Windows 2003
Newbie
Joined: 03 Mar 2008 Posts: 9
Sorry.
That should have been strmqm -q <queuemanager> to activate the backup queue manager.
SAFraser
Posted: Fri Mar 14, 2008 2:25 pm Post subject:
Shaman
Joined: 22 Oct 2003 Posts: 742 Location: Austin, Texas, USA
I think it is actually "-a" -- but perhaps that is just a typo in your post?
Quote:
Execute the following control command to activate the backup queue manager:
strmqm -a BackupQMName
Was an FDC file generated when you got the AMQ6109 error?
robertblue
Posted: Fri Mar 14, 2008 2:28 pm Post subject:
Newbie
Joined: 03 Mar 2008 Posts: 9
robertblue
Posted: Fri Mar 14, 2008 2:30 pm Post subject: Backup and Restore WebSphere MQ 6.0 on Windows
Newbie
Joined: 03 Mar 2008 Posts: 9
It has been a long day.
"-a" is correct.
SAFraser
Posted: Fri Mar 14, 2008 4:36 pm Post subject:
Shaman
Joined: 22 Oct 2003 Posts: 742 Location: Austin, Texas, USA
Post the header from the FDC, if you don't mind.
And take the weekend off!
robertblue
Posted: Fri Mar 14, 2008 6:44 pm Post subject: Backup Queue Manager on Windows 2003
Newbie
Joined: 03 Mar 2008 Posts: 9
Code:
+-----------------------------------------------------------------------------+
| |
| WebSphere MQ First Failure Symptom Report |
| ========================================= |
| |
| Date/Time :- Fri March 14 17:14:47 Central Daylight Time 2008 |
| Host Name :- LBVMQT1 (Windows Server 2003, Build 3790: Service Pack |
| 2) |
| PIDS :- 5724H7200 |
| LVLS :- 6.0.2.2 |
| Product Long Name :- WebSphere MQ for Windows |
| Vendor :- IBM |
| Probe Id :- XC130031 |
| Application Name :- MQM |
| Component :- xehExceptionHandler |
| SCCS Info :- lib/cs/pc/winnt/amqxerrn.c, 1.36.1.2 |
| Line Number :- 916 |
| Build Date :- Aug 1 2007 |
| CMVC level :- p600-202-070801 |
| Build Type :- IKAP - (Production) |
| UserID :- MUSR_MQADMIN |
| Process Name :- C:\Program Files\IBM\WebSphere MQ\bin\amqzxma0.exe |
| Addressing mode :- 32-bit |
| Process :- 00001120 |
| Thread :- 00000001 |
| QueueManager :- FGAS!QMGR!PRD |
| ConnId(1) IPCC :- 2 |
| ConnId(2) QM :- 2 |
| ConnId(3) QM-P :- 2 |
| ConnId(4) App :- 2 |
| Major Errorcode :- xecF_E_UNEXPECTED_SYSTEM_RC |
| Minor Errorcode :- OK |
| Probe Type :- MSGAMQ6119 |
| Probe Severity :- 2 |
| Probe Description :- AMQ6119: An internal WebSphere MQ error has occurred |
| (Access Violation at address 00000000 when reading) |
| FDCSequenceNumber :- 0 |
| Comment1 :- Access Violation at address 00000000 when reading |
| |
+-----------------------------------------------------------------------------+
robertblue
Posted: Fri Mar 14, 2008 6:51 pm Post subject: Backup Queue Manager on Windows 2003
Newbie
Joined: 03 Mar 2008 Posts: 9
Thanks for your input.
You mentioned that you use a slightly clumsier cold spare strategy for redundancy.
We are looking at using MSCS for a local high availability solution, but we were looking at the backup queue manager as part of a remote disaster recovery site solution.
Could you describe what you mean by the cold spare strategy?
SAFraser
Posted: Fri Mar 14, 2008 7:43 pm Post subject:
Shaman
Joined: 22 Oct 2003 Posts: 742 Location: Austin, Texas, USA
I googled the "probe ID" in your FDC header and found this:
http://www-1.ibm.com/support/docview.wss?&apar=only&uid=swg1IC45816
You might google some more and get additional ideas.
We do not have an HA setup for our production (because our applications are not cluster aware and have to be restarted in the event of a failover anyway). With a couple of exceptions, our applications do not store any data on MQ queues; when all is working properly, the queue depths are zero. The more critical applications track what they have sent and can resend if needed.
For the two reasons above, our failover strategy does not address any data recovery. For many sites, this would not be appropriate, but it works for us.
I have several "production" queue managers built out on different machines. In our case, these qmgrs are not named the same. Naming your cold spare is a matter of your business requirements. In our case, having the cold spares named differently works because 1) our WMB brokers need unique qmgrs and 2) our client applications must restart after failover anyway (because they are the dumbest applications in the world), so it's no big deal for them to reconfigure their connection properties. Connections from other MQ servers are easily handled with a queue manager alias.
Now you can also name your cold spares the exact same name as your active qmgr (as long as WMB is not involved in the same domain), but you must be sure to take precautions not to allow it to start! Many of my cohorts on this forum will disagree with duplicate qmgr names, but if it makes the failover smoother and you are careful (!!), it can work.
It just depends on the way your site uses MQ. If I were you, I wouldn't give up on the backup queue manager strategy yet, if it addresses your business needs -- it's worth raising a PMR.
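As an aside, a minimal sketch of the queue manager alias idea mentioned above. All names here (SENDING.QMGR, PROD.QMGR, SPARE.QMGR, the transmission queue) are invented for illustration, and the channel and transmission queue to the spare are assumed to already exist.
Code:
REM Hypothetical example: on a queue manager that still addresses 'PROD.QMGR'
REM after failover, define a queue manager alias resolving it to the cold spare.
runmqsc SENDING.QMGR
DEFINE QREMOTE('PROD.QMGR') RNAME(' ') RQMNAME('SPARE.QMGR') XMITQ('SPARE.QMGR') REPLACE
END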
PeterPotkay
Posted: Sat Mar 15, 2008 12:22 pm Post subject:
Poobah
Joined: 15 May 2001 Posts: 7722
Keeping in mind that I have never used backup QMs, just read about them:
We decided against using them for DR. It's basically shipping the linear logs from Data Center 1 (DC1) to DC2. I have to think it only does this whenever it cuts a new linear log. Nowadays log files can/should be really big. On anything other than a really busy system it can be quite a while before a new linear log is created, so your RPO (Recovery Point Objective) is going to be quite long.
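To make the RPO point concrete: with linear logging, a completed extent only becomes available to ship once the queue manager fills it and switches to the next one, and the extent size is fixed when the queue manager is created. A hedged illustration (the queue manager name and sizes are made up):
Code:
REM Hypothetical example: create a queue manager with linear logging and large
REM log extents. Each extent is LogFilePages x 4KB, so -lf 16384 means roughly
REM 64MB of log must fill before a new extent is cut and can be shipped.
crtmqm -ll -lf 16384 -lp 10 -ls 5 EXAMPLE.QMGR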
If an app can handle one or more missing messages and/or one or more duplicate messages because of this long RPO (an MQPUT gets shipped from DC1 to DC2 but the MQGET doesn't), then the app can already tolerate missing and/or duplicate messages. Why go nuts trying to replicate some of the messages? MQ is not a database. So we took the stance of using hardware clustering for all QMs in DC1 to provide H.A. There is no persistent committed message loss short of a data center or SAN loss in DC1. In DC2 we created a duplicate of the MQ farm in DC1. Server names and QM names are similar but different between DC1 and DC2, but all the queue names and SVRCONN channel names and WMB flows are the same. There is no data replication between DC1 and DC2 as far as MQ is concerned. MQ is not a database.
All the apps that connect in as MQ clients (99%) are using a VIP that takes them to the clusters in DC1 until there is a disaster, at which point we repoint the VIPs to DC2. Since the MQ client apps do not use the QM name on their MQCONN(X)s, and all the object names are the same, and they are using VIPs, the apps don't have to make any changes in a DR. They just reconnect and off they go. For the few apps that have local QMs, they are using multiple QMs in MQ clusters, so it doesn't matter if the QMs in DC1 go away because the QMs in DC2 survive.
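A minimal sketch of that client-side pattern; the channel name, VIP hostname, port, and queue name are invented for illustration:
Code:
REM Hypothetical client setup: the connection points at a VIP / DNS alias rather
REM than a physical host, and no queue manager name is specified on connect.
SET MQSERVER=APP.SVRCONN/TCP/mq-vip.example.com(1414)

REM The sample client put program connects to whichever queue manager answers at
REM the VIP, so after the VIP is repointed to DC2 it reconnects with no changes.
amqsputc APP.REQUEST.QUEUE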
Remember this for MQ DR for 99.9% of scenarios:
MQ is not a database.
Any async data replication scheme will introduce the possibility of one or more messages being lost.
Any async data replication scheme will introduce the possibility of one or more messages being duplicated.
Thus, if the apps can tolerate the loss/duplication of one message, they can tolerate the loss of multiple messages if you don't replicate.
If your data centers are close enough for synchronous data replication, use stretch clusters and never worry about committed persistent MQ data loss or duplication.
It's more important to come up with a design that will get your apps up and running in DR quickly 100% of the time than to chase some crazy, inconsistent solution that takes hours to enable in the hope that there will be no message loss.
MQ apps that absolutely can't lose messages in a DR probably have a whole bunch of other non-MQ checks and balances anyway to handle potential message loss.
This worked for us. You may have a situation where this design does not work.
_________________
Peter Potkay
Keep Calm and MQ On