
Execution Group Crashes - How to determine Root Cause?
jwells34
Posted: Mon Feb 11, 2008 8:45 am

Newbie

Joined: 01 Nov 2007
Posts: 6

Hi,

Here's the Environment setup:
AIX 5.3.4
2 x CPU 600 MHz
12 GB RAM

Software
WBIMB v5.0 FP09
WMQ Server v6.0.2.2
DB2 v8.1.1.136

We have an Execution Group that crashes almost daily. The first time a message comes in, it fails; when we reprocess the message, it works.

So, once a message arrives at the MQInput node, the Execution Group crashes and restarts itself (at least that is what appears to have happened). The message it was trying to process immediately goes to the failure terminal, and we get this in the trace file from the message flow in that Execution Group:

__________________________________________________ 2008-02-10 06:37:46.214421 --------------------------------------------------------------------------- EXCEPTION LIST (
(0x01000000):RecoverableException = (
(0x03000000):File = '/build/S500_P/src/DataFlowEngine/ImbMqInputNode.cpp'
(0x03000000):Line = 3328
(0x03000000):Function = 'ImbMqInputNode::eligibleForBackout'
(0x03000000):Type = 'ComIbmMQInputNode'
(0x03000000):Name = 'PHD_PROCESS_TANKS_IN#FCMComposite_1_1'
(0x03000000):Label = 'PHD_PROCESS_TANKS_IN.PHD.PROCESS.TANKS.IN'
(0x03000000):Text = 'Dequeued failed message. Propagating a message to the failure terminal'
(0x03000000):Catalog = 'BIPv500'
(0x03000000):Severity = 3
(0x03000000):Number = 2652
)


I also found the broker log file:

abend record for pid 44126 tid 2075 time in seconds since 01/01/1970: 1202553444
File: /build/S500_P/src/CommonServices/Unix/ImbAbend.cpp
Line: 515
Function: signal received
---- Inserts ----
4
@(#) 1.33.2.5 CommonServices/Unix/ImbAbend.cpp, CommonServices, S500, S500-CSD08D1 06/04/06 16:54:05 [5/2/06 20:00:12]
999383040
-----------------
----------------------------- Stack dump for current thread ( 2075)
(0xdf164918+0x000007a0) ttcdrv [/var/mqsi/lib/libclntsh.a(shr.o)]
(0xdf0eb930+0x00000050) nioqwa [/var/mqsi/lib/libclntsh.a(shr.o)]
(0xded4e474+0x00000640) upirtrc [/var/mqsi/lib/libclntsh.a(shr.o)]
(0xdee54a60+0x0000006c) kpurcsc [/var/mqsi/lib/libclntsh.a(shr.o)]
(0xdee6f4cc+0x0000053c) kpuexecv8 [/var/mqsi/lib/libclntsh.a(shr.o)]
(0xdee705cc+0x00000ff8) kpuexec [/var/mqsi/lib/libclntsh.a(shr.o)]
(0xdee972e8+0x0000001c) OCIStmtExecute [/var/mqsi/lib/libclntsh.a(shr.o)]
(0xde861c3c+0x00000000) <no name available> [/usr/opt/wmqi/merant/lib/UKor818.so]
(0xde9d8068+0x00000000) <no name available> [/usr/lib/libUKbas18.a(UKbas18.so)]
(0xde9d6cf0+0x00000000) <no name available> [/usr/lib/libUKbas18.a(UKbas18.so)]
(0xde84299c+0x00000000) <no name available> [/usr/opt/wmqi/merant/lib/UKor818.so]
(0xdb69c7ac+0x00000000) <no name available> [/usr/opt/mqsi/merant/lib/libodbc.a(odbc.so)]
(0xda3281b8+0x00000460) execute__16ImbOdbcStatementFv [/usr/opt/mqsi/lib/libMessageServices.a(libMessageServices.a.so)]
(0xdc02c284+0x0000104c) executeStmt__17SqlExternalDbStmtCFR14SqlEvalEnvironRC13ImbStringBaseXTwTQ2_3std11char_traitsXTw_TcSP37_Q2_3std6_PtritXTP17SqlExpressionNodeTlTPCP17SqlExpressionNodeTRCP17SqlExpressionNodeTPP17SqlExpressionNodeTRP17SqlExpressionNode_T3RC18SqlGeneralLocationi [/usr/opt/mqsi/lib/libImbRdl.a(libImbRdl.a.so)]
(0xdc07dec8+0x0000036c) evaluate__17SqlPassthruFnCallCFR9SqlResult [/usr/opt/mqsi/lib/libImbRdl.a(libImbRdl.a.so)]
(0xdc26d224+0x00000084) assignToMessage__13SqlAssignmentCFR14SqlEvalEnviron [/usr/opt/mqsi/lib/libImbRdl.a(libImbRdl.a.so)]
(0xdc2697b8+0x00000b54) execute__13SqlAssignmentCFR18SqlStatementResult [/usr/opt/mqsi/lib/libImbRdl.a(libImbRdl.a.so)]
(0xdc051b1c+0x000000d8) execute__17SqlStatementGroupCFR18SqlStatementResult [/usr/opt/mqsi/lib/libImbRdl.a(libImbRdl.a.so)]
(0xdc274c70+0x000001b0) execute__15SqlCompoundStmtCFR18SqlStatementResult [/usr/opt/mqsi/lib/libImbRdl.a(libImbRdl.a.so)]
(0xdbf01164+0x00000168) execute__10SqlRoutineCFR18SqlStatementResult [/usr/opt/mqsi/lib/libImbRdl.a(libImbRdl.a.so)]
(0xdc3cb278+0x000000ec) execute__9SqlModuleCFR18SqlStatementResult [/usr/opt/mqsi/lib/libImbRdl.a(libImbRdl.a.so)]
(0xdc2860cc+0x0000002c) execute__9SqlSchemaCFR18SqlStatementResult [/usr/opt/mqsi/lib/libImbRdl.a(libImbRdl.a.so)]
(0xdc3d2e8c+0x0000054c) evaluate__19SqlComputeInterfaceFRC18ImbMessageAssemblyR18ImbMessageAssembly [/usr/opt/mqsi/lib/libImbRdl.a(libImbRdl.a.so)]
(0xdcdfe080+0x0000040c) evaluate__14ImbComputeNodeFRC18ImbMessageAssemblyPC19ImbDataFlowTerminal [/usr/opt/mqsi/lil/imbdfsql.lil]
(0xdb88dba0+0x000001d8) evaluate__19ImbDataFlowTerminalFRC18ImbMessageAssembly [/usr/opt/mqsi/lib/libDataFlowDLL.a(libDataFlowDLL.a.so)]
(0xdba48afc+0x00000354) propagate__19ImbDataFlowTerminalFRC18ImbMessageAssembly [/usr/opt/mqsi/lib/libDataFlowDLL.a(libDataFlowDLL.a.so)]
(0xdce57c98+0x00000284) propagate__14ImbComputeNodeFRC18ImbMessageAssemblyR18ImbMessageAssembly [/usr/opt/mqsi/lil/imbdfsql.lil]
(0xdcdfe080+0x00000c4c) evaluate__14ImbComputeNodeFRC18ImbMessageAssemblyPC19ImbDataFlowTerminal [/usr/opt/mqsi/lil/imbdfsql.lil]
(0xdb88dba0+0x000001d8) evaluate__19ImbDataFlowTerminalFRC18ImbMessageAssembly [/usr/opt/mqsi/lib/libDataFlowDLL.a(libDataFlowDLL.a.so)]
(0xdba48afc+0x00000354) propagate__19ImbDataFlowTerminalFRC18ImbMessageAssembly [/usr/opt/mqsi/lib/libDataFlowDLL.a(libDataFlowDLL.a.so)]
(0xdce57c98+0x00000284) propagate__14ImbComputeNodeFRC18ImbMessageAssemblyR18ImbMessageAssembly [/usr/opt/mqsi/lil/imbdfsql.lil]
(0xdcdfe080+0x00000c4c) evaluate__14ImbComputeNodeFRC18ImbMessageAssemblyPC19ImbDataFlowTerminal [/usr/opt/mqsi/lil/imbdfsql.lil]
(0xdb88dba0+0x000001d8) evaluate__19ImbDataFlowTerminalFRC18ImbMessageAssembly [/usr/opt/mqsi/lib/libDataFlowDLL.a(libDataFlowDLL.a.so)]
(0xdba48afc+0x00000354) propagate__19ImbDataFlowTerminalFRC18ImbMessageAssembly [/usr/opt/mqsi/lib/libDataFlowDLL.a(libDataFlowDLL.a.so)]
(0xdc691e48+0x00000108) evaluate__15ImbTryCatchNodeFRC18ImbMessageAssemblyPC19ImbDataFlowTerminal [/usr/opt/mqsi/lil/imbdfbas.lil]
(0xdb88dba0+0x000001d8) evaluate__19ImbDataFlowTerminalFRC18ImbMessageAssembly [/usr/opt/mqsi/lib/libDataFlowDLL.a(libDataFlowDLL.a.so)]
(0xdba48afc+0x00000354) propagate__19ImbDataFlowTerminalFRC18ImbMessageAssembly [/usr/opt/mqsi/lib/libDataFlowDLL.a(libDataFlowDLL.a.so)]
(0xdc749f08+0x000055e0) readQueue__14ImbMqInputNodeFP11ImbOsThread [/usr/opt/mqsi/lib/libMQLibrary.a(libMQLibrary.a.so)]
(0xdc756180+0x00000048) run__Q2_14ImbMqInputNode10ParametersFP11ImbOsThread [/usr/opt/mqsi/lib/libMQLibrary.a(libMQLibrary.a.so)]
(0xd6fa3108+0x00000070) run__27ImbThreadPoolThreadFunctionFP11ImbOsThread [/usr/opt/mqsi/lib/libCommonServices.a(libCommonServices.a.so)]
(0xd6f942e8+0x00000054) threadRun__11ImbOsThreadFv [/usr/opt/mqsi/lib/libCommonServices.a(libCommonServices.a.so)]
(0xd6f93e8c+0x00000064) threadBootStrap__11ImbOsThreadFPv [/usr/opt/mqsi/lib/libCommonServices.a(libCommonServices.a.so)]
(0xd004c528+0x0000011c) _pthread_body [/usr/lib/libpthreads.a(shr_xpg5.o)]
(0x00000000) <invalid code address>
----------------------------------------------------------------------
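
The abend record gives its time only as "seconds since 01/01/1970", which makes it awkward to line up with the human-readable timestamps in the trace file. A minimal sketch of the conversion, assuming the value is a standard Unix epoch in UTC (the function name `abend_time_utc` is made up for this example):

```python
from datetime import datetime, timezone


def abend_time_utc(epoch_seconds: int) -> datetime:
    """Convert an abend record's 'seconds since 01/01/1970' to an aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


if __name__ == "__main__":
    # Value copied from the abend record above.
    print(abend_time_utc(1202553444).isoformat())  # 2008-02-09T10:37:24+00:00
```

The abend record therefore resolves to 2008-02-09 10:37:24 UTC, while the trace entry is stamped 2008-02-10 06:37:46 in an unstated time zone. Since the crash happens almost daily, it is worth checking whether the two records belong to the same incident before drawing conclusions.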


I was wondering if anyone else has run into this issue or has some ideas to help debug the situation.


I appreciate your time.

Thanks
jefflowrey
Posted: Mon Feb 11, 2008 9:09 am

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

Either your SQL statement has a problem, or your database connection was dropped overnight.
_________________
I am *not* the model of the modern major general.
Ian
Posted: Mon Feb 11, 2008 9:38 am

Disciple

Joined: 22 Nov 2002
Posts: 152
Location: London, UK

You have indicated that your (broker) environment uses DB2.

Quote:

WBIMB v5.0 FP09
WMQ Server v6.0.2.2
DB2 v8.1.1.136


However, the call stack shows that the failure is in an Oracle client library.

Oracle Client ---> /var/mqsi/lib/libclntsh.a
DataDirect (Merant) Oracle Driver used by Message Broker ---> /usr/opt/wmqi/merant/lib/UKor818.so

Quote:

(0xdf164918+0x000007a0) ttcdrv [/var/mqsi/lib/libclntsh.a(shr.o)]
(0xdf0eb930+0x00000050) nioqwa [/var/mqsi/lib/libclntsh.a(shr.o)]
(0xded4e474+0x00000640) upirtrc [/var/mqsi/lib/libclntsh.a(shr.o)]
(0xdee54a60+0x0000006c) kpurcsc [/var/mqsi/lib/libclntsh.a(shr.o)]
(0xdee6f4cc+0x0000053c) kpuexecv8 [/var/mqsi/lib/libclntsh.a(shr.o)]
(0xdee705cc+0x00000ff8) kpuexec [/var/mqsi/lib/libclntsh.a(shr.o)]
(0xdee972e8+0x0000001c) OCIStmtExecute [/var/mqsi/lib/libclntsh.a(shr.o)]
(0xde861c3c+0x00000000) <no name available> [/usr/opt/wmqi/merant/lib/UKor818.so]
(0xde9d8068+0x00000000) <no name available> [/usr/lib/libUKbas18.a(UKbas18.so)]
(0xde9d6cf0+0x00000000) <no name available> [/usr/lib/libUKbas18.a(UKbas18.so)]
(0xde84299c+0x00000000) <no name available> [/usr/opt/wmqi/merant/lib/UKor818.so]
(0xdb69c7ac+0x00000000) <no name available> [/usr/opt/mqsi/merant/lib/libodbc.a(odbc.so)]
(0xda3281b8+0x00000460) execute__16ImbOdbcStatementFv [/usr/opt/mqsi/lib/libMessageServices.a(libMessageServices.a.so)]


I would suggest you first sort out whether you are expecting access to DB2 or Oracle. If it is the latter then you should search for known Oracle problems relating to the 'ttcdrv' method in the Oracle Client library.
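
The reading of the call stack above (top frames in the Oracle client, then the DataDirect driver, then the broker's ODBC layer) can be reproduced mechanically. The following is a rough sketch, assuming each frame has the `(address+offset) symbol [library]` layout seen in this dump; the helper names `parse_frames` and `libraries_in_order` are invented for illustration and are not part of any broker tooling:

```python
import re
from itertools import groupby

# One frame looks like: (0xdf164918+0x000007a0) ttcdrv [/var/mqsi/lib/libclntsh.a(shr.o)]
FRAME_RE = re.compile(r"^\(?(0x[0-9a-f]+)\+(0x[0-9a-f]+)\)\s+(.+?)\s+\[(.+)\]$")


def parse_frames(dump: str):
    """Return (symbol, library) pairs, innermost frame first; non-frame lines are skipped."""
    frames = []
    for line in dump.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((match.group(3), match.group(4)))
    return frames


def libraries_in_order(frames):
    """Collapse consecutive frames from the same library into (library, frame_count)."""
    return [(lib, len(list(group))) for lib, group in groupby(frames, key=lambda f: f[1])]


# A few frames copied from the dump in the first post.
SAMPLE = """\
(0xdf164918+0x000007a0) ttcdrv [/var/mqsi/lib/libclntsh.a(shr.o)]
(0xdf0eb930+0x00000050) nioqwa [/var/mqsi/lib/libclntsh.a(shr.o)]
(0xdee972e8+0x0000001c) OCIStmtExecute [/var/mqsi/lib/libclntsh.a(shr.o)]
(0xde861c3c+0x00000000) <no name available> [/usr/opt/wmqi/merant/lib/UKor818.so]
(0xdb69c7ac+0x00000000) <no name available> [/usr/opt/mqsi/merant/lib/libodbc.a(odbc.so)]
(0xda3281b8+0x00000460) execute__16ImbOdbcStatementFv [/usr/opt/mqsi/lib/libMessageServices.a(libMessageServices.a.so)]
"""

if __name__ == "__main__":
    for library, count in libraries_in_order(parse_frames(SAMPLE)):
        print(f"{count:3d} frame(s) in {library}")
```

Reading the output top to bottom shows which component the failing thread was executing in when the signal arrived, which is usually the first place to look for known defects.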
_________________
Regards, Ian
jwells34
Posted: Mon Feb 11, 2008 10:09 am

Newbie

Joined: 01 Nov 2007
Posts: 6

Hi,

Thanks for the information so far. Greatly appreciated!!!

I should have mentioned that the Message Flow performs an Oracle database lookup, not DB2.


So, if the Oracle connection is getting dropped and causing the Execution Group to crash, is there any way around this situation?

Josh
PeterPotkay
Posted: Mon Feb 11, 2008 11:13 am

Poobah

Joined: 15 May 2001
Posts: 7722

FP10 for WBIMB 5.0 has some fixes for this type of problem. We had the same issue with an EG crashing when an application DB2 database on the mainframe went down and came back up. FP10 fixed that, although we didn't upgrade to FP10 for other reasons (a lot of our flows needed code changes).

We put the problem flow in its own EG to isolate the effect of the EG restarting. It's rare enough that we decided to live with it until we upgrade to WMB 6.0. Plus, the EG "only" crashes once; when the message is retried, it works.
_________________
Peter Potkay
Keep Calm and MQ On
jwells34
Posted: Tue Feb 19, 2008 6:27 am

Newbie

Joined: 01 Nov 2007
Posts: 6

Do you know which APAR fixed your issue?
PeterPotkay
Posted: Tue Feb 19, 2008 9:56 am

Poobah

Joined: 15 May 2001
Posts: 7722

I never dug into the Fix Pack readme to see which APAR it was. IBM Support said CSD10 had lots of connection fixes. We applied it and the problem went away.
_________________
Peter Potkay
Keep Calm and MQ On