Constant Core Dumping |
Author |
Message
|
klink |
Posted: Fri Aug 01, 2003 5:23 am Post subject: Constant Core Dumping |
|
|
Newbie
Joined: 01 Aug 2003 Posts: 4
|
We are running WMQI 2.1 CSD03, MQ 5.3 fp3, AIX 5.1, DB2 7.2 fp9 client.
We have the configmgr running on win2k and the brokers running on AIX. The backend DB2 server is a remote server (to the AIX and NT Boxes) running on AIX as well.
I constantly (every day) have core dumps in every environment we are running. Currently we have Dev, Test, and Stage .... we go live in 3 weeks. Most of the time I get complete core dumps, but sometimes I don't. The DBX output is as follows. Obviously this is from an incomplete dump:
Several lines of
warning: Unable to access address 0x3000288c from core
warning: Unable to access address 0x3000288c from core
Followed by:
[using memory image in core]
warning: Unable to access address 0xf00cd304 from core
pthdb_session.c, 552: 1 PTHDB_CALLBACK (callback failed)
pthreaded.c, 1778: PTHDB_CALLBACK (callback failed)
IOT/Abort trap in pth_signal.pthread_kill [/usr/lib/libpthreads.a] at 0xd005a8f0
0xd005a8f0 (pthread_kill+0xa8) 80410014 lwz r2,0x14(r1)
(dbx)
I have my ulimits set to unlimited and I have plenty of file system space to hold the core. There is an associated wmqi abend file in the /var/mqsi/errors directory for each core. The contents of all are as follows:
abend record for pid 28984 tid 2057 time in seconds since 01/01/1970: 1059664705
File: /build/S210_P/src/CommonServices/Unix/ImbAbend.cpp
Line: 417
Function: signal received
---- Inserts ----
13
@(#) 1.28.1.4 CommonServices/Unix/ImbAbend.cpp, CommonServices, S210, S210-L2061
8 02/06/16 16:00:54 [6/18/02 14:41:19]
975066880
-----------------
----------------------------- Stack dump for current thread ( 2057)
(0xd0192798+0x000001ac) sqlcctcpsend__FP15sqlcc_comhandleP10sqlcc_cond [/home/db2caed/sqllib/lib/libdtcp.a]
(0xd28a7e1c+0x0000044c) sqlccsend [/home/db2caed/sqllib/lib/db2.o.db2caef]
(0xd28ddb10+0x000000ec) sqle_db2ra_sendbuffer__FP17sqle_db2ra_commonP5sqlca [/home/db2caed/sqllib/lib/db2.o.db2caef]
(0xd2944e90+0x00002a20) sqle_db2ra_ar_sendrecv__FP10sqle_db2raP17sqle_ar_interfaceUli [/home/db2caed/sqllib/lib/db2.o.db2caef]
(0xd2940cbc+0x0000008c) sqle_db2ra_ar_prepexec__FP10sqle_db2raP17sqle_ar_interfaceUl [/home/db2caed/sqllib/lib/db2.o.db2caef]
(0xd2937ef0+0x00005e10) sqle_db2ra_ar_driver__FP10sqle_db2raP17sqle_ar_interface [/home/db2caed/sqllib/lib/db2.o.db2caef]
(0xd28fbb04+0x000018d0) sqle_database_SQL__FP10sqle_db2raP10sqler_glob [/home/db
This is only part of the abend file. I will post more if necessary. We have DB2 set up with the Merant driver. I checked with Merant and there is no "updated" driver.
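When several abend files pile up in /var/mqsi/errors, it can help to turn the header line of each one into a readable timestamp so it can be lined up against network or DB2 server events. The following is only a minimal sketch: the function name `parse_abend_header` and the regular expression are made up for illustration, and it assumes the "seconds since 01/01/1970" value is an ordinary Unix epoch time in UTC.

```python
import re
from datetime import datetime, timezone

# Pattern for the first line of the abend file as quoted in this post.
# The layout is taken from the excerpt above; other WMQI levels may differ.
ABEND_HEADER = re.compile(
    r"abend record for pid (\d+) tid (\d+) "
    r"time in seconds since 01/01/1970: (\d+)"
)


def parse_abend_header(line):
    """Return pid, tid and the abend time (UTC) from an abend header line."""
    match = ABEND_HEADER.search(line)
    if match is None:
        raise ValueError("not an abend header line")
    pid, tid, seconds = (int(group) for group in match.groups())
    return {
        "pid": pid,
        "tid": tid,
        # Assumption: the value is standard Unix epoch seconds.
        "time_utc": datetime.fromtimestamp(seconds, tz=timezone.utc),
    }


header = ("abend record for pid 28984 tid 2057 "
          "time in seconds since 01/01/1970: 1059664705")
record = parse_abend_header(header)
print(record["pid"], record["tid"], record["time_utc"].isoformat())
```

Running this over each abend file gives a list of times to compare against any scheduled database restarts or network outages.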
The core dump doesn't seem to interfere or otherwise affect the environment. No other errors in the system logs.
Weird thing is, I have the same issue with AIX 4.3.3, wmqi 2.1 csd04, mq 5.2 fp6, db2 7.2 fp7.
If anyone can help at all, it would be GREATLY appreciated!!! |
|
Craig B |
Posted: Fri Aug 01, 2003 5:36 am Post subject: |
|
|
Partisan
Joined: 18 Jun 2003 Posts: 316 Location: UK
|
The abend is occurring in sqlccsend, which is a call made by the DB2 client when talking to a remote database server (or, on AIX, when TCP/IP loopback has been configured to overcome the limit of 10 shared memory connections). The execution group's DataFlowEngine process has terminated because it detected a SIGSEGV from the DB2 client, which encountered a problem that it could not deal with.
Typically this type of error occurs because there has been some disruption in the TCP/IP stack that the DB2 client is using to communicate with the DB2 server. This disruption could leave orphaned sockets: the DB2 client believes it still has a connection and attempts to use it, and this causes the problems that eventually filter back to the execution group, causing it to terminate.
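The general mechanism of sending on a connection whose other end has gone away can be reproduced in a few lines. This is only an illustrative sketch and not what the DB2 client does internally: it uses a local socket pair rather than a real TCP connection to a database, and the function name `send_after_peer_close` is hypothetical. Python ignores SIGPIPE by default, so the failure shows up as an exception; a native process that leaves the signal at its default disposition would be terminated instead.

```python
import socket


def send_after_peer_close():
    """Send on a socket after the other end has closed; return the error name."""
    client, server = socket.socketpair()
    try:
        # Simulate the remote end of the connection disappearing.
        server.close()
        try:
            # The client still holds its handle and tries to use it.
            client.sendall(b"SELECT 1")
        except OSError as exc:
            # Typically BrokenPipeError; the exact subclass can vary by platform.
            return type(exc).__name__
        return "no error"
    finally:
        client.close()


print(send_after_peer_close())
```

With a real TCP connection the first send after the peer disappears may appear to succeed, and the error only surfaces on a later send, which is one reason such failures can look intermittent.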
Do you terminate your connections, or the server database, in any way during the day? Do you have any network connectivity problems that could cause the TCP/IP disruption?
Regards
Craig |
|