Author |
Message
|
vivekkooks |
Posted: Tue Aug 02, 2005 7:54 am Post subject: Error in multithreaded mq app |
|
|
 Voyager
Joined: 11 Jun 2003 Posts: 91
|
Hello,
I am using MQ Series 5.3 with ptf9 on Linux ES 3.0 Update 4.
The app is multithreaded (LinuxThreads), with each thread opening a separate MQ connection for its operations.
All the _r.so libraries are linked when building the app. The queue manager connection is opened with MQCNO_HANDLE_SHARE_BLOCK, and the common subscribe queue is opened with (MQOO_INPUT_SHARED + MQOO_FAIL_IF_QUIESCING + MQOO_INQUIRE).
However, my threaded app sometimes hangs, and a gdb backtrace of the hung thread shows the following:
#0 0xb6128810 in do_sigsuspend (set=0xbf7fe634) at ../sysdeps/unix/sysv/linux/sigsuspend.c:50
#1 0xb61288ba in *__GI___sigsuspend (set=0xbf7fe634) at ../sysdeps/unix/sysv/linux/sigsuspend.c:87
#2 0xb72a8a55 in __pthread_wait_for_restart_signal (self=0xbf7ffbe0) at pthread.c:1141
#3 0xb72aa444 in __pthread_alt_lock (lock=0xb62175d0, self=0xbf7ffbe0) at restart.h:36
#4 0xb72a7642 in *__GI___pthread_mutex_lock (mutex=0xb62175c0) at mutex.c:123
#5 0xb616d445 in __libc_free (mem=0x80955f8) at malloc.c:3356
#6 0xb612a704 in exit () from /usr/local/MPInsight/lib/sys/lib/libc.so.6
#7 0xb5f5aef2 in xehInterpretSavedSigaction () from /opt/mqm/lib/libmqmcs_r.so
#8 0xb5f5b5f3 in xehExceptionHandler () from /opt/mqm/lib/libmqmcs_r.so
#9 0xb72abc72 in __pthread_sighandler_rt (signo=11, si=0xbf7fe8b0, uc=0xbf7fe930) at sighandler.c:62
#10 <signal handler called>
#11 malloc_consolidate (av=0xb62175c0) at malloc.c:4337
#12 0xb616dfde in _int_malloc (av=0xb62175c0, bytes=3022005808) at malloc.c:3790
#13 0xb616d276 in __libc_malloc (bytes=2048) at malloc.c:3292
#14 0xb62cb3d2 in operator new (sz=2048) at new_op.cc:48
#15 0xb62cb4eb in operator new[] (sz=2048) at new_opv.cc:36
#16 0xb60eab4b in ImqCac::resizeBuffer () from /opt/mqm/lib/libimqb23gl_r.so
#17 0xb75c7c6f in ImqQue::get () from /opt/mqm/lib/libimqs23gl_r.so
#18 0x0806914c in CQueue::GetMessage ()
#23 0xb72a69a5 in pthread_start_thread (arg=0xbf7ffbe0) at manager.c:300
#24 0xb61cbe87 in __clone () from /usr/local/MPInsight/lib/sys/lib/libc.so.6
I checked the MQ errors in /var/mqm, and these are the FDC headers I could find:
+-----------------------------------------------------------------------------+
| |
| WebSphere MQ First Failure Symptom Report |
| ========================================= |
| |
| Date/Time :- Sunday November 20 05:55:46 UTC 2005 |
| Host Name :- clusternode95.ud2.cybage.com (Linux 2.4.21-27.ELsmp) |
| PIDS :- 5724B4104 |
| LVLS :- 530 |
| Product Long Name :- WebSphere MQ for Linux for Intel |
| Vendor :- IBM |
| Probe Id :- XC271004 |
| Application Name :- MQM |
| Component :- xehStopAsySignalMonitor |
| Build Date :- Oct 12 2002 |
| CMVC level :- p000-L021011 |
| Build Type :- IKAP - (Production) |
| UserID :- 00000000 (root) |
| Program Name :- dspmqver |
| Thread-Process :- 00014101 |
| Thread :- 00000001 |
| Major Errorcode :- xecF_E_UNEXPECTED_RC |
| Minor Errorcode :- xecL_W_TIMEOUT |
| Probe Type :- MSGAMQ6118 |
| Probe Severity :- 2 |
| Probe Description :- AMQ6118: An internal WebSphere MQ error has occurred |
| (10806020) |
| FDCSequenceNumber :- 0 |
| Arith1 :- 276848672 10806020 |
| |
+-----------------------------------------------------------------------------+
MQM Function Stack
xcsTerminate
TermPrivateServices
xehTerminateAsySignalHandling
xehStopAsySignalMonitor
xcsFFST
Appreciate your help in this matter. |
|
|
 |
malammik |
Posted: Tue Aug 02, 2005 8:41 am Post subject: |
|
|
 Partisan
Joined: 27 Jan 2005 Posts: 397 Location: Philadelphia, PA
|
|
|
 |
wschutz |
Posted: Tue Aug 02, 2005 8:49 am Post subject: |
|
|
 Jedi Knight
Joined: 02 Jun 2005 Posts: 3316 Location: IBM (retired)
|
Code: |
I am using multithreads(linuxthreads) with each thread opening separate MQ connection for its operations.
All the _r.so are linked while building the app and the Q manager is opened with MQCNO_HANDLE_SHARE_BLOCK |
Why are you using SHARE_BLOCK if each thread has its own connection handle? Can you try it with MQCNO_HANDLE_SHARE_NONE? _________________ -wayne |
|
|
 |
vivekkooks |
Posted: Wed Aug 03, 2005 7:50 am Post subject: |
|
|
 Voyager
Joined: 11 Jun 2003 Posts: 91
|
Does MQ use signal handlers?
If yes, this could be a problem, because signal handling in LinuxThreads does not fully conform to the POSIX standard.
#0 0xb6128810 in do_sigsuspend (set=0xbeffe514) at ../sysdeps/unix/sysv/linux/sigsuspend.c:50
#1 0xb61288ba in *__GI___sigsuspend (set=0xbeffe514) at ../sysdeps/unix/sysv/linux/sigsuspend.c:87
#2 0xb72a8a55 in __pthread_wait_for_restart_signal (self=0xbefffbe0) at pthread.c:1141
#3 0xb72aa444 in __pthread_alt_lock (lock=0xb62175d0, self=0xbefffbe0) at restart.h:36
#4 0xb72a7642 in *__GI___pthread_mutex_lock (mutex=0xb62175c0) at mutex.c:123
#5 0xb616d445 in __libc_free (mem=0x80965f8) at malloc.c:3356
#6 0xb612a704 in exit () from /usr/local/MPInsight/lib/sys/lib/libc.so.6
#7 0xb5f5aef2 in xehInterpretSavedSigaction () from /opt/mqm/lib/libmqmcs_r.so
#8 0xb5f5b5f3 in xehExceptionHandler () from /opt/mqm/lib/libmqmcs_r.so
#9 0xb72abc72 in __pthread_sighandler_rt (signo=11, si=0xbeffe790, uc=0xbeffe810) at sighandler.c:62
#10 <signal handler called>
#11 0xb74834d9 in xercesc_2_2::BaseRefVectorOf<xercesc_2_2::ENameMap>::setElementAt () from /usr/local/MPInsight/lib/sys/lib/libxerces-c.so.22
#12 0xb748290f in xercesc_2_2::XMLTransService::initTransService () from /usr/local/MPInsight/lib/sys/lib/libxerces-c.so.22
#13 0xb744e149 in xercesc_2_2::XMLPlatformUtils::Initialize () from /usr/local/MPInsight/lib/sys/lib/libxerces-c.so.22
#14 0x08074a08 in XmlManagerImpl::Init ()
#15 0x08074894 in XmlManagerImpl::XmlManagerImpl ()
#16 0x0806ae74 in LogMessage::LogMessage ()
#22 0xb72a69a5 in pthread_start_thread (arg=0xbefffbe0) at manager.c:300
#23 0xb61cbe87 in __clone () from /usr/local/MPInsight/lib/sys/lib/libc.so.6 |
|
|
 |
jefflowrey |
Posted: Wed Aug 03, 2005 8:01 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
Are you positive you are using the correct threading model?
Is LD_ASSUME_KERNEL in effect? _________________ I am *not* the model of the modern major general. |
|
|
 |
vivekkooks |
Posted: Wed Aug 03, 2005 9:12 pm Post subject: |
|
|
 Voyager
Joined: 11 Jun 2003 Posts: 91
|
Yes,
I tried with LD_ASSUME_KERNEL=2.4.19 as well as LD_ASSUME_KERNEL=2.2.5.
The application hangs in both cases. |
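One way to confirm which threading implementation a process actually ends up with is to ask glibc through confstr(). The traces in this thread reference pthread.c, manager.c and restart.h, and load libc.so.6 from /usr/local/MPInsight/lib/sys/lib rather than a system directory, so it may be worth checking from inside the application itself rather than relying on the environment variable alone. The sketch below assumes a glibc that defines _CS_GNU_LIBPTHREAD_VERSION. threading_library is a hypothetical helper name.

```c
#include <stddef.h>
#include <unistd.h>

/* Copy the threading library name reported by the C library into buf.
   Returns 0 on success, -1 if the value is unavailable or does not fit. */
int threading_library(char *buf, size_t len)
{
#ifdef _CS_GNU_LIBPTHREAD_VERSION
    /* confstr() returns the size needed, including the terminating NUL */
    size_t needed = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len);
    if (needed == 0 || needed > len)
        return -1;
    return 0;
#else
    (void)buf;
    (void)len;
    return -1; /* this C library does not expose the value */
#endif
}
```

The resulting string typically begins with either NPTL or linuxthreads, followed by a version. From a shell, `getconf GNU_LIBPTHREAD_VERSION` reports the same value for the system libc, which may differ from a privately bundled one.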
|
|
 |
vivekkooks |
Posted: Wed Aug 03, 2005 9:32 pm Post subject: |
|
|
 Voyager
Joined: 11 Jun 2003 Posts: 91
|
This is how another thread hangs:
#0 0xb6128810 in do_sigsuspend (set=0xbf5ff85c) at ../sysdeps/unix/sysv/linux/sigsuspend.c:50
#1 0xb61288ba in *__GI___sigsuspend (set=0xbf5ff85c) at ../sysdeps/unix/sysv/linux/sigsuspend.c:87
#2 0xb72a8a55 in __pthread_wait_for_restart_signal (self=0xbf5ffbe0) at pthread.c:1141
#3 0xb72aa444 in __pthread_alt_lock (lock=0xb62175d0, self=0xbf5ffbe0) at restart.h:36
#4 0xb72a7642 in *__GI___pthread_mutex_lock (mutex=0xb62175c0) at mutex.c:123
#5 0xb616d445 in __libc_free (mem=0x80cae20) at malloc.c:3356
#6 0xb5f92ea0 in destroy_thread () from /opt/mqm/lib/libmqmcs_r.so
#7 0xb72a96f8 in __pthread_destroy_specifics () at specific.c:192
#8 0xb72a60de in __pthread_do_exit (retval=0xfffffffc, currentframe=0xbf5ffbd4 "") at join.c:43
#9 0xb72a69ae in pthread_start_thread (arg=0xbf5ffbe0) at manager.c:303
#10 0xb61cbe87 in __clone () from /usr/local/MPInsight/lib/sys/lib/libc.so.6
All of the traces above show code from /opt/mqm/lib/libmqmcs_r.so (destroy_thread () in this one) blocked in free() on the same mutex, so that library may be involved in the problem. However, no FDC is generated. |
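For context on frames #6 to #8 of the trace above: __pthread_destroy_specifics is the step at thread exit where destructors registered with pthread_key_create() are run, and destroy_thread () appears to be such a destructor registered by the MQ library. Because it calls free(), it needs the malloc lock like any other caller, and will wait if another thread is stuck holding it. The C sketch below shows the mechanism only. demo_key, demo_destructor, worker and run_tsd_demo are hypothetical names, and nothing here reflects how MQ implements destroy_thread.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t demo_key;
static int destructor_calls; /* written by the exiting thread, read after join */

/* Runs automatically at thread exit for each non-NULL value stored under demo_key. */
static void demo_destructor(void *value)
{
    free(value); /* takes the malloc lock, like the free() in the trace */
    destructor_calls++;
}

static void *worker(void *arg)
{
    void *p = malloc(64);
    (void)arg;
    if (p != NULL && pthread_setspecific(demo_key, p) != 0)
        free(p); /* could not store the value, so release it here */
    return NULL; /* the destructor fires after this return */
}

/* Returns the number of destructor calls observed, or -1 on setup failure. */
int run_tsd_demo(void)
{
    pthread_t t;
    destructor_calls = 0;
    if (pthread_key_create(&demo_key, demo_destructor) != 0)
        return -1;
    if (pthread_create(&t, NULL, worker, NULL) != 0) {
        pthread_key_delete(demo_key);
        return -1;
    }
    pthread_join(t, NULL); /* destructors have finished once join returns */
    pthread_key_delete(demo_key);
    return destructor_calls;
}
```

If this reading is right, the hang in this thread would be a consequence of the lock being held elsewhere rather than a separate fault, which would also fit with no FDC being written for it.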
|
|
 |
|