Problem with a cloned (DR) broker
Cliff
Posted: Fri May 16, 2008 4:38 am

Environment: Solaris 5.9, WMQ 6.0.2.1, WBIMB 5.0.0 CSD 9

My client has implemented a ‘temporary’ DR solution in which it’s accepted that failover will be a manual process. The principle is to clone the configmgr and broker boxes, keeping one pair live and the other as a warm standby. In the event of a disaster we simply change a few DNS entries, repoint a few channels and, hey presto, we’re back in business on the other pair of machines: same qmgr names, broker names, UUIDs, etc. This has been done successfully once, including the cloning of the boxes.

This setup obviously means that changes have to be applied to both the live and standby pairs, and unsurprisingly this hasn’t happened. So the first step of the DR failover exercise we’re in the middle of is to realign the two brokers and configmgrs. And it’s not working.

The standby configmgr database was restored from a DB2 backup of live with no apparent problems. The standby broker’s Oracle database was restored from the overnight live backup. A change freeze has ensured there have been no configuration changes for several days. No maintenance has been applied to either broker since the original cloning exercise. No passwords have been changed as far as I know.

So, you’d think that bringing up the standby broker with the same executables and the same underpinning tables would be just fine, but we see errors in the logs. These show that the broker is talking to its tables OK (ODBC trace has proved that) but not providing all the data required in the SQL calls, for example:

May 16 12:03:12 srim6165 MQSIv500[17731]: [ID 702911 user.error] (UKBRKR1P_B.LZ_BENCHMARKS)[33]BIP2121E: The thread bootstrap code caught an unhandled exception on thread '33'. : UKBRKR1P_B.d86d9a5c-0e01-0000-0080-d51319f22e3a: /build/S500_P/src/CommonServices/sparc_solaris_2/ImbOsThread.cpp: 237: ImbOsThread::threadBootStrap: :
May 16 12:03:12 srim6165 MQSIv500[17731]: [ID 702911 user.error] (UKBRKR1P_B.LZ_BENCHMARKS)[33]BIP2230E: Error detected whilst processing a message in node 'PubSubControlMsgFlow.ControlNode'. : UKBRKR1P_B.d86d9a5c-0e01-0000-0080-d51319f22e3a: /build/S500_P/src/DataFlowEngine/JavaNodeLibrary/ImbPubSubControlNode.cpp: 592: ImbPubSubControlNode::evaluate: ComIbmPSControlNode: ControlNode
May 16 12:03:12 srim6165 MQSIv500[17731]: [ID 702911 user.info] (UKBRKR1P_B.LZ_BENCHMARKS)[33]BIP7019E: Problem accessing the broker database for publish/subscribe function. : UKBRKR1P_B.d86d9a5c-0e01-0000-0080-d51319f22e3a: /build/S500_P/src/DataFlowEngine/JavaNodeLibrary/ImbPubSubJNIBrokerDatabase.cpp: 1390: Java_com_ibm_broker_server_BrokerDatabase_addSubscription: :
May 16 12:03:12 srim6165 MQSIv500[17731]: [ID 702911 user.error] (UKBRKR1P_B.LZ_BENCHMARKS)[33]BIP2371E: Database statement 'INSERT INTO WBIM6165.BSUBSCRIPTIONS ( BrokerUUID , ClientId , SubscriptionId , Topic , SubPoint , Filter , Expiration , Creation , Options , SubInfo ) VALUES( ? , ? , ? , ? , ? , ? , ? , ? , ? , ?)' could not be executed. : UKBRKR1P_B.d86d9a5c-0e01-0000-0080-d51319f22e3a: /build/S500_P/src/DataFlowEngine/ImbOdbc.cpp: 1593: ImbOdbcStatement::execDirect: :
May 16 12:03:12 srim6165 MQSIv500[17731]: [ID 702911 user.error] (UKBRKR1P_B.LZ_BENCHMARKS)[33]BIP2321E: Database error: ODBC return code '-1'. : UKBRKR1P_B.d86d9a5c-0e01-0000-0080-d51319f22e3a: /build/S500_P/src/DataFlowEngine/ImbOdbc.cpp: 224: ImbOdbcHandle::checkRcInner: :
May 16 12:03:12 srim6165 MQSIv500[17731]: [ID 702911 user.error] (UKBRKR1P_B.LZ_BENCHMARKS)[33]BIP2322E: Database error: SQL State 'HY000'; Native Error Code '-999'; Error Text '[DataDirect][ODBC Oracle driver][Oracle] Error in parameter 4.'. : UKBRKR1P_B.d86d9a5c-0e01-0000-0080-d51319f22e3a: /build/S500_P/src/DataFlowEngine/ImbOdbc.cpp: 377: ImbOdbcHandle::checkRcInner: :
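
For anyone reading along, the 'Error in parameter N' text can be tied back to a column name by counting the ? markers in the statement reported by BIP2371E. The following is only a rough sketch in Python. It assumes that the driver numbers parameters from 1 in the order the markers appear (the usual ODBC convention, but not confirmed for this DataDirect message), and the function names are made up for illustration.

```python
import re


def placeholder_columns(sql):
    """Return the column bound to each '?' marker, in left-to-right order."""
    sql = sql.strip()
    insert = re.match(
        r"INSERT\s+INTO\s+\S+\s*\((?P<cols>[^)]*)\)\s*VALUES\s*\((?P<vals>[^)]*)\)",
        sql,
        re.IGNORECASE,
    )
    if insert:
        cols = [c.strip() for c in insert.group("cols").split(",")]
        vals = [v.strip() for v in insert.group("vals").split(",")]
        # Keep only the columns whose value is a parameter marker.
        return [c for c, v in zip(cols, vals) if v == "?"]
    # UPDATE / WHERE style: every "<column> = ?" pair, left to right.
    return re.findall(r"(\w+)\s*=\s*\?", sql)


def failing_column(sql, param_number):
    """Map a 1-based parameter number (assumed numbering) to its column."""
    return placeholder_columns(sql)[param_number - 1]


# Statements copied from the BIP2371E messages in this post.
INSERT_SQL = (
    "INSERT INTO WBIM6165.BSUBSCRIPTIONS ( BrokerUUID , ClientId , "
    "SubscriptionId , Topic , SubPoint , Filter , Expiration , Creation , "
    "Options , SubInfo ) VALUES( ? , ? , ? , ? , ? , ? , ? , ? , ? , ?)"
)
UPDATE_SQL = (
    "UPDATE WBIM6165.BROKERRESOURCES SET ResourceData = ? WHERE BrokerUUID = ? "
    "AND ExecGroupUUID = ? AND ResourceType = ? AND ResourceName = ?"
)

if __name__ == "__main__":
    print(failing_column(INSERT_SQL, 4))
    print(failing_column(UPDATE_SQL, 1))
```

Under that numbering assumption, parameter 4 of the INSERT corresponds to the Topic column, which at least narrows down which column to look at on the standby database.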

An attempt to stop a message flow gives a broadly similar error:

May 16 12:18:47 srim6165 MQSIv500[17924]: [ID 702911 user.error] (UKBRKR1P_B.EQ_EXECUTIONS)[42]BIP4041E: Execution group 'EQ_EXECUTIONS' received an invalid configuration message. See the following messages for details of the error. : UKBRKR1P_B.84c0c27c-1201-0000-0080-d51319f22e3a: /build/S500_P/src/DataFlowEngine/ImbConfigurationNode.cpp: 364: ImbConfigurationNode::evaluate: ComIbmConfigurationNode: ConfigurationNode
May 16 12:18:47 srim6165 MQSIv500[17924]: [ID 702911 user.error] (UKBRKR1P_B.EQ_EXECUTIONS)[42]BIP2224E: A database operation failed after repeated attempts: diagnostic information 'ConfigurationMessageFlow.ConfigurationNode'. : UKBRKR1P_B.84c0c27c-1201-0000-0080-d51319f22e3a: /build/S500_P/src/DataFlowEngine/ImbConfigurationNode.cpp: 324: ImbConfigurationNode::evaluate: ComIbmConfigurationNode: ConfigurationNode
May 16 12:18:47 srim6165 MQSIv500[17924]: [ID 702911 user.error] (UKBRKR1P_B.EQ_EXECUTIONS)[42]BIP2371E: Database statement 'UPDATE WBIM6165.BROKERRESOURCES SET ResourceData = ? WHERE BrokerUUID = ? AND ExecGroupUUID = ? AND ResourceType = ? AND ResourceName = ?' could not be executed. : UKBRKR1P_B.84c0c27c-1201-0000-0080-d51319f22e3a: /build/S500_P/src/DataFlowEngine/ImbOdbc.cpp: 1593: ImbOdbcStatement::execDirect: :
May 16 12:18:47 srim6165 MQSIv500[17924]: [ID 702911 user.error] (UKBRKR1P_B.EQ_EXECUTIONS)[42]BIP2321E: Database error: ODBC return code '-1'. : UKBRKR1P_B.84c0c27c-1201-0000-0080-d51319f22e3a: /build/S500_P/src/DataFlowEngine/ImbOdbc.cpp: 224: ImbOdbcHandle::checkRcInner: :
May 16 12:18:47 srim6165 MQSIv500[17924]: [ID 702911 user.error] (UKBRKR1P_B.EQ_EXECUTIONS)[42]BIP2322E: Database error: SQL State 'HY000'; Native Error Code '-999'; Error Text '[DataDirect][ODBC Oracle driver][Oracle] Error in parameter 1.'. : UKBRKR1P_B.84c0c27c-1201-0000-0080-d51319f22e3a: /build/S500_P/src/DataFlowEngine/ImbOdbc.cpp: 377: ImbOdbcHandle::checkRcInner: :
May 16 12:18:47 srim6165 MQSIv500[17727]: [ID 702911 user.info] (UKBRKR1P_B)[1]BIP8099I: Retrieved message for Execution Group - EQ_EXECUTIONS : UKBRKR1P_B.agent: /build/S500_P/src/AdminAgent/ImbAdminAgent.cpp: 2163: ImbAdminAgent::receiveEGResponse: :
May 16 12:18:47 srim6165 MQSIv500[17727]: [ID 702911 user.info] (UKBRKR1P_B)[1]BIP8099I: Deploy Message Response sent to - ConfigMgr : UKBRKR1P_B.agent: /build/S500_P/src/AdminAgent/ImbAdminAgent.cpp: 2559: ImbAdminAgent::receiveEGResponse: :
May 16 12:18:47 srim6165 MQSIv500[17727]: [ID 702911 user.error] (UKBRKR1P_B)[1]BIP2048E: An Exception was caught while issuing database SQL command UPDATE WBIM6165.BROKERAAEG SET DynamicState=1, DynamicSync=3, ExecGroupLabel = ? WHERE ExecGroupUUID = ? AND BrokerUUID = ?. : UKBRKR1P_B.agent: /build/S500_P/src/AdminAgent/ImbAdminStore.cpp: 1543: ImbAdminStore::prepareCollectiveEGForCM: :
May 16 12:18:47 srim6165 MQSIv500[17727]: [ID 702911 user.error] (UKBRKR1P_B)[1]BIP2371E: Database statement 'UPDATE WBIM6165.BROKERAAEG SET DynamicState=1, DynamicSync=3, ExecGroupLabel = ? WHERE ExecGroupUUID = ? AND BrokerUUID = ?' could not be executed. : UKBRKR1P_B.agent: /build/S500_P/src/DataFlowEngine/ImbOdbc.cpp: 1593: ImbOdbcStatement::execDirect: :
May 16 12:18:47 srim6165 MQSIv500[17727]: [ID 702911 user.error] (UKBRKR1P_B)[1]BIP2321E: Database error: ODBC return code '-1'. : UKBRKR1P_B.agent: /build/S500_P/src/DataFlowEngine/ImbOdbc.cpp: 224: ImbOdbcHandle::checkRcInner: :
May 16 12:18:47 srim6165 MQSIv500[17727]: [ID 702911 user.error] (UKBRKR1P_B)[1]BIP2322E: Database error: SQL State 'HY000'; Native Error Code '1403'; Error Text '[DataDirect][ODBC Oracle driver][Oracle]ORA-01403: no data found Error in parameter 1.'. : UKBRKR1P_B.agent: /build/S500_P/src/DataFlowEngine/ImbOdbc.cpp: 377: ImbOdbcHandle::checkRcInner: :
May 16 12:18:47 srim6165 MQSIv500[17727]: [ID 702911 user.info] (UKBRKR1P_B)[1]BIP8099I: Retrieved message for Execution Group - EQ_EXECUTIONS : UKBRKR1P_B.agent: /build/S500_P/src/AdminAgent/ImbAdminAgent.cpp: 2163: ImbAdminAgent::receiveEGResponse: :
May 16 12:18:47 srim6165 MQSIv500[17727]: [ID 702911 user.info] (UKBRKR1P_B)[1]BIP8099I: Deploy Message Response sent to - ConfigMgr : UKBRKR1P_B.agent: /build/S500_P/src/AdminAgent/ImbAdminAgent.cpp: 2559: ImbAdminAgent::receiveEGResponse: :
May 16 12:18:48 srim6165 MQSIv500[17727]: [ID 702911 user.error] (UKBRKR1P_B)[1]BIP2048E: An Exception was caught while issuing database SQL command UPDATE WBIM6165.BROKERAAEG SET DynamicState=1, DynamicSync=3, ExecGroupLabel = ? WHERE ExecGroupUUID = ? AND BrokerUUID = ?. : UKBRKR1P_B.agent: /build/S500_P/src/AdminAgent/ImbAdminStore.cpp: 1543: ImbAdminStore::prepareCollectiveEGForCM: :
May 16 12:18:48 srim6165 MQSIv500[17727]: [ID 702911 user.error] (UKBRKR1P_B)[1]BIP2371E: Database statement 'UPDATE WBIM6165.BROKERAAEG SET DynamicState=1, DynamicSync=3, ExecGroupLabel = ? WHERE ExecGroupUUID = ? AND BrokerUUID = ?' could not be executed. : UKBRKR1P_B.agent: /build/S500_P/src/DataFlowEngine/ImbOdbc.cpp: 1593: ImbOdbcStatement::execDirect: :
May 16 12:18:48 srim6165 MQSIv500[17727]: [ID 702911 user.error] (UKBRKR1P_B)[1]BIP2321E: Database error: ODBC return code '-1'. : UKBRKR1P_B.agent: /build/S500_P/src/DataFlowEngine/ImbOdbc.cpp: 224: ImbOdbcHandle::checkRcInner: :
May 16 12:18:48 srim6165 MQSIv500[17727]: [ID 702911 user.error] (UKBRKR1P_B)[1]BIP2322E: Database error: SQL State 'HY000'; Native Error Code '1403'; Error Text '[DataDirect][ODBC Oracle driver][Oracle]ORA-01403: no data found Error in parameter 1.'. : UKBRKR1P_B.agent: /build/S500_P/src/DataFlowEngine/ImbOdbc.cpp: 377: ImbOdbcHandle::checkRcInner: :
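
With this much repetition in the syslog it can help to boil the entries down to which component raised which database error. Below is a minimal sketch in Python that pulls the component, the BIP code and, for BIP2322E, the native error code out of lines shaped like the ones above. The regular expressions are inferred from these excerpts only and the function names are hypothetical, so treat it as a starting point rather than a general parser for broker logs.

```python
import re
from collections import Counter

# "(BROKER.EXECGROUP)[thread]BIPnnnnX: text" as it appears in the excerpts.
LINE_RE = re.compile(
    r"\((?P<component>[^)]+)\)\[(?P<thread>\d+)\](?P<code>BIP\d{4}[A-Z]):\s*(?P<text>.*)"
)
# Detail carried by BIP2322E messages.
ODBC_RE = re.compile(
    r"SQL State '(?P<state>[^']*)'; "
    r"Native Error Code '(?P<native>[^']*)'; "
    r"Error Text '(?P<msg>.*?)'\."
)


def parse_line(line):
    """Return a dict of fields for one syslog line, or None if it does not match."""
    m = LINE_RE.search(line)
    if not m:
        return None
    entry = m.groupdict()
    odbc = ODBC_RE.search(entry["text"])
    if odbc:
        entry.update(odbc.groupdict())
    return entry


def summarise_odbc_errors(lines):
    """Count BIP2322E entries per (component, native error code)."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["code"] == "BIP2322E" and "native" in entry:
            counts[(entry["component"], entry["native"])] += 1
    return counts


# Two lines from this post, with the trailing source-file detail trimmed.
SAMPLE = [
    "May 16 12:03:12 srim6165 MQSIv500[17731]: [ID 702911 user.error] "
    "(UKBRKR1P_B.LZ_BENCHMARKS)[33]BIP2322E: Database error: SQL State 'HY000'; "
    "Native Error Code '-999'; Error Text '[DataDirect][ODBC Oracle driver]"
    "[Oracle] Error in parameter 4.'. :",
    "May 16 12:18:47 srim6165 MQSIv500[17727]: [ID 702911 user.error] "
    "(UKBRKR1P_B)[1]BIP2322E: Database error: SQL State 'HY000'; "
    "Native Error Code '1403'; Error Text '[DataDirect][ODBC Oracle driver]"
    "[Oracle]ORA-01403: no data found Error in parameter 1.'. :",
]

if __name__ == "__main__":
    for key, count in summarise_odbc_errors(SAMPLE).items():
        print(key, count)
```

Run over the full log, a summary like this makes it easier to see that the execution groups report native code -999 while the broker agent reports 1403 (ORA-01403).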

I’ve highlighted a few bits but you’ll get the drift. We’ve tried restoring the broker database from a different Oracle backup and we get the same result. I’m now at a loss – we could just blow the lot away and start again but there are various reasons why that is the option of last resort.

Has anybody out there encountered anything like this? All constructive suggestions will be most warmly welcomed. I have a feeling it’s going to be a long weekend …
Many thanks –
Cliff