Constant Core Dumping
klink
Posted: Fri Aug 01, 2003 5:23 am    Post subject: Constant Core Dumping

Newbie

Joined: 01 Aug 2003
Posts: 4

We are running WMQI 2.1 CSD03, MQ 5.3 fp3, AIX 5.1, DB2 7.2 fp9 client.

We have the configmgr running on win2k and the brokers running on AIX. The backend DB2 server is a remote server (to the AIX and NT Boxes) running on AIX as well.

I constantly (every day) have core dumps in every environment we are running. Currently we have Dev, Test, and Stage .... we go live in 3 weeks. Most of the time I get complete core dumps, but sometimes I don't. The DBX output is as follows. Obviously this is from an incomplete dump:

Several lines of
warning: Unable to access address 0x3000288c from core
warning: Unable to access address 0x3000288c from core

Followed by:

[using memory image in core]
warning: Unable to access address 0xf00cd304 from core
pthdb_session.c, 552: 1 PTHDB_CALLBACK (callback failed)
pthreaded.c, 1778: PTHDB_CALLBACK (callback failed)

IOT/Abort trap in pth_signal.pthread_kill [/usr/lib/libpthreads.a] at 0xd005a8f0
0xd005a8f0 (pthread_kill+0xa8) 80410014 lwz r2,0x14(r1)
(dbx)

I have my ulimits set to unlimited and I have plenty of file system space to hold the core. There is an associated wmqi abend file in the /var/mqsi/errors directory for each core. The contents of all are as follows:

abend record for pid 28984 tid 2057 time in seconds since 01/01/1970: 1059664705
File: /build/S210_P/src/CommonServices/Unix/ImbAbend.cpp
Line: 417
Function: signal received
---- Inserts ----
13
@(#) 1.28.1.4 CommonServices/Unix/ImbAbend.cpp, CommonServices, S210, S210-L2061
8 02/06/16 16:00:54 [6/18/02 14:41:19]
975066880
-----------------
----------------------------- Stack dump for current thread ( 2057)
(0xd0192798+0x000001ac) sqlcctcpsend__FP15sqlcc_comhandleP10sqlcc_cond [/home/db2caed/sqllib/lib/libdtcp.a]
(0xd28a7e1c+0x0000044c) sqlccsend [/home/db2caed/sqllib/lib/db2.o.db2caef]
(0xd28ddb10+0x000000ec) sqle_db2ra_sendbuffer__FP17sqle_db2ra_commonP5sqlca [/home/db2caed/sqllib/lib/db2.o.db2caef]
(0xd2944e90+0x00002a20) sqle_db2ra_ar_sendrecv__FP10sqle_db2raP17sqle_ar_interfaceUli [/home/db2caed/sqllib/lib/db2.o.db2caef]
(0xd2940cbc+0x0000008c) sqle_db2ra_ar_prepexec__FP10sqle_db2raP17sqle_ar_interfaceUl [/home/db2caed/sqllib/lib/db2.o.db2caef]
(0xd2937ef0+0x00005e10) sqle_db2ra_ar_driver__FP10sqle_db2raP17sqle_ar_interface [/home/db2caed/sqllib/lib/db2.o.db2caef]
(0xd28fbb04+0x000018d0) sqle_database_SQL__FP10sqle_db2raP10sqler_glob [/home/db

This is only part of the abend file. I will post more if necessary. We have DB2 set up with the Merant driver. I checked with Merant and there is no "updated" driver.
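For anyone reading the abend record header above, here is a minimal sketch of how its two numeric fields might be decoded. The timestamp is stated in the record as seconds since 01/01/1970. Treating the first insert (13) as a signal number is an assumption, not something the abend file states, and the helper names `abend_time` and `signal_name` are made up for this illustration. Signal numbers are platform specific, so the name printed depends on the machine the script runs on.

```python
from datetime import datetime, timezone
import signal


def abend_time(epoch_seconds: int) -> datetime:
    """Convert the abend record's epoch seconds to a UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def signal_name(number: int) -> str:
    """Map a signal number to its name on the local platform.

    Assumption: the first abend insert is a signal number.
    """
    try:
        return signal.Signals(number).name
    except ValueError:
        return f"unknown signal {number}"


if __name__ == "__main__":
    # Values copied from the abend record quoted above.
    print(abend_time(1059664705).isoformat())  # 2003-07-31T15:18:25+00:00
    print(signal_name(13))
```

On most Unix systems signal 13 is SIGPIPE, which would fit a failure inside a TCP send, but that reading rests on the assumption above.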

The core dump doesn't seem to interfere or otherwise affect the environment. No other errors in the system logs.

Weird thing is, I have the same issue with AIX 4.3.3, wmqi 2.1 csd04, mq 5.2 fp6, db2 7.2 fp7.

If anyone can help at all, it would be GREATLY appreciated!!!
Craig B
Posted: Fri Aug 01, 2003 5:36 am

Partisan

Joined: 18 Jun 2003
Posts: 316
Location: UK

The abend is occurring in sqlccsend, which is a call made by the DB2 client when talking to a remote database server (or, on AIX, when TCP/IP loopback has been configured to overcome the limit of 10 shared memory connections). The execution group's DataFlowEngine process has terminated because it detected a SIGSEGV from the DB2 client, which had encountered a problem it could not deal with.

Typically this type of error occurs because there has been some disruption in the TCP/IP stack that the DB2 client uses to communicate with the DB2 server. This disruption could leave orphaned sockets: the DB2 client believes it still has a connection and attempts to use it, and the resulting failure eventually filters back to the execution group, causing it to terminate.
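To illustrate the orphaned-socket idea in general terms, here is a small sketch of what happens when one side keeps sending after its peer has gone away. It is not the DB2 client's code: it uses a local socket pair as a stand-in for the client/server connection, and the function name `send_on_orphaned_socket` and the payload are invented for the example.

```python
import errno
import socket


def send_on_orphaned_socket() -> int:
    """Return the errno raised when sending after the peer has closed.

    Returns 0 if the send unexpectedly succeeds.
    """
    # A connected local pair stands in for the client/server link.
    client, server = socket.socketpair()
    try:
        server.close()  # the peer disappears; the client is not told
        try:
            client.sendall(b"SELECT 1")  # the client still thinks it is connected
        except OSError as exc:
            return exc.errno
        return 0
    finally:
        client.close()


if __name__ == "__main__":
    code = send_on_orphaned_socket()
    print(errno.errorcode.get(code, code))
```

Python ignores SIGPIPE at startup, so the failure shows up as an exception. A C program that has not ignored or handled SIGPIPE would instead be terminated by the signal on such a send.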

Do you terminate your connections, or the database server, in any way during the day? Do you have any network connectivity problems that could cause the TCP/IP disruption?
_________________
Regards
Craig