| Author | Message | 
		
		  | vivekkooks | 
			  
				|  Posted: Tue Aug 02, 2005 7:54 am    Post subject: Error in multithreaded mq app |   |  | 
		
		  |  Voyager
 
 
 Joined: 11 Jun 2003, Posts: 91
 
 
 | 
			  
				| Hello, I am using MQ Series 5.3 with ptf9 on Linux ES 3.0 Update 4.
 
 My application is multithreaded (LinuxThreads), and each thread opens its own MQ connection for its operations.
 All the _r.so libraries are linked when building the app. The queue manager connection is opened with MQCNO_HANDLE_SHARE_BLOCK, and the common subscribe queue is opened with (MQOO_INPUT_SHARED + MQOO_FAIL_IF_QUIESCING + MQOO_INQUIRE).
 
 However, the app sometimes hangs, and gdb shows the following backtrace:
 
 #0  0xb6128810 in do_sigsuspend (set=0xbf7fe634) at ../sysdeps/unix/sysv/linux/sigsuspend.c:50
 #1  0xb61288ba in *__GI___sigsuspend (set=0xbf7fe634) at ../sysdeps/unix/sysv/linux/sigsuspend.c:87
 #2  0xb72a8a55 in __pthread_wait_for_restart_signal (self=0xbf7ffbe0) at pthread.c:1141
 #3  0xb72aa444 in __pthread_alt_lock (lock=0xb62175d0, self=0xbf7ffbe0) at restart.h:36
 #4  0xb72a7642 in *__GI___pthread_mutex_lock (mutex=0xb62175c0) at mutex.c:123
 #5  0xb616d445 in __libc_free (mem=0x80955f8) at malloc.c:3356
 #6  0xb612a704 in exit () from /usr/local/MPInsight/lib/sys/lib/libc.so.6
 #7  0xb5f5aef2 in xehInterpretSavedSigaction () from /opt/mqm/lib/libmqmcs_r.so
 #8  0xb5f5b5f3 in xehExceptionHandler () from /opt/mqm/lib/libmqmcs_r.so
 #9  0xb72abc72 in __pthread_sighandler_rt (signo=11, si=0xbf7fe8b0, uc=0xbf7fe930) at sighandler.c:62
 #10 <signal handler called>
 #11 malloc_consolidate (av=0xb62175c0) at malloc.c:4337
 #12 0xb616dfde in _int_malloc (av=0xb62175c0, bytes=3022005808) at malloc.c:3790
 #13 0xb616d276 in __libc_malloc (bytes=2048) at malloc.c:3292
 #14 0xb62cb3d2 in operator new (sz=2048) at new_op.cc:48
 #15 0xb62cb4eb in operator new[] (sz=2048) at new_opv.cc:36
 #16 0xb60eab4b in ImqCac::resizeBuffer () from /opt/mqm/lib/libimqb23gl_r.so
 #17 0xb75c7c6f in ImqQue::get () from /opt/mqm/lib/libimqs23gl_r.so
 #18 0x0806914c in CQueue::GetMessage ()
 #23 0xb72a69a5 in pthread_start_thread (arg=0xbf7ffbe0) at manager.c:300
 #24 0xb61cbe87 in __clone () from /usr/local/MPInsight/lib/sys/lib/libc.so.6
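 
 A possible reading of this trace, offered as a sketch rather than a diagnosis: frame #11 (malloc_consolidate, av=0xb62175c0) and frame #4 (pthread_mutex_lock, mutex=0xb62175c0) show the same address, which suggests the thread took a SIGSEGV while inside the allocator and that the handler then called exit(), which is not async-signal-safe, and tried to take a lock the thread already held. The small C model below illustrates only that mechanism. The names arena_locked, on_signal and demo_handler_inside_critical_section are hypothetical, a flag stands in for the allocator lock, and SIGUSR1 stands in for the SIGSEGV.
 
```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical stand-in for the allocator lock: 1 while "held". */
static volatile sig_atomic_t arena_locked = 0;
/* Set by the handler if it ran while the lock was held. */
static volatile sig_atomic_t would_deadlock = 0;

static void on_signal(int signo)
{
    (void)signo;
    /* A handler runs on the interrupted thread. If that thread holds the
       lock, a blocking acquire here could never succeed, because the
       owner cannot resume until the handler returns. We only record the
       condition instead of blocking. */
    if (arena_locked)
        would_deadlock = 1;
}

/* Returns 1 if the handler observed the lock held by its own thread. */
int demo_handler_inside_critical_section(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);
    (void)rc;

    would_deadlock = 0;
    arena_locked = 1;   /* enter the critical section */
    raise(SIGUSR1);     /* stand-in for the fault; handler runs before raise() returns */
    arena_locked = 0;   /* leave the critical section */
    return would_deadlock;
}
```
 
 Calling demo_handler_inside_critical_section() reports that the handler fired while the flag was set. A handler that did a blocking lock at that point would hang in the way frames #0 to #6 show, which is why handlers are usually limited to async-signal-safe calls such as write() and _exit().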
 
 
 I checked the MQ errors in /var/mqm and these are the FDC headers I can find
 
 +-----------------------------------------------------------------------------+
 |                                                                             |
 | WebSphere MQ First Failure Symptom Report                                   |
 | =========================================                                   |
 |                                                                             |
 | Date/Time         :- Sunday November 20 05:55:46 UTC 2005                   |
 | Host Name         :- clusternode95.ud2.cybage.com (Linux 2.4.21-27.ELsmp)   |
 | PIDS              :- 5724B4104                                              |
 | LVLS              :- 530                                                    |
 | Product Long Name :- WebSphere MQ for Linux for Intel                       |
 | Vendor            :- IBM                                                    |
 | Probe Id          :- XC271004                                               |
 | Application Name  :- MQM                                                    |
 | Component         :- xehStopAsySignalMonitor                                |
 | Build Date        :- Oct 12 2002                                            |
 | CMVC level        :- p000-L021011                                           |
 | Build Type        :- IKAP - (Production)                                    |
 | UserID            :- 00000000 (root)                                        |
 | Program Name      :- dspmqver                                               |
 | Thread-Process    :- 00014101                                               |
 | Thread            :- 00000001                                               |
 | Major Errorcode   :- xecF_E_UNEXPECTED_RC                                   |
 | Minor Errorcode   :- xecL_W_TIMEOUT                                         |
 | Probe Type        :- MSGAMQ6118                                             |
 | Probe Severity    :- 2                                                      |
 | Probe Description :- AMQ6118: An internal WebSphere MQ error has occurred   |
 |   (10806020)                                                                |
 | FDCSequenceNumber :- 0                                                      |
 | Arith1            :- 276848672 10806020                                     |
 |                                                                             |
 +-----------------------------------------------------------------------------+
 
 MQM Function Stack
 xcsTerminate
 TermPrivateServices
 xehTerminateAsySignalHandling
 xehStopAsySignalMonitor
 xcsFFST
 
 
 Appreciate your help in this matter.
 |  | 
		
		
		  | malammik | 
			  
				|  Posted: Tue Aug 02, 2005 8:41 am    Post subject: |   |  | 
		
		  |  Partisan
 
 
 Joined: 27 Jan 2005, Posts: 397
 Location: Philadelphia, PA
 
 |  | 
		
		
		  | wschutz | 
			  
				|  Posted: Tue Aug 02, 2005 8:49 am    Post subject: |   |  | 
		
		  |  Jedi Knight
 
 
 Joined: 02 Jun 2005, Posts: 3316
 Location: IBM (retired)
 
 | 
			  
				| 
  Why are you using SHARE_BLOCK if each thread has its own connection handle? Can you try it with MQCNO_HANDLE_SHARE_NONE?
	| Code: |  
	| I am using multithreads(linuxthreads) with each thread opening separate MQ connection for its operations. All the _r.so are linked while building the app and the Q manager is opened with MQCNO_HANDLE_SHARE_BLOCK
 |  _________________
 -wayne
 |  | 
		
		
		  | vivekkooks | 
			  
				|  Posted: Wed Aug 03, 2005 7:50 am    Post subject: |   |  | 
		
		  |  Voyager
 
 
 Joined: 11 Jun 2003, Posts: 91
 
 
 | 
			  
				| Does MQ use signal handlers? If yes, this could be a problem, because signal handling under LinuxThreads does not fully conform to the POSIX standard. Here is another trace from the hung app:
 
 #0  0xb6128810 in do_sigsuspend (set=0xbeffe514) at ../sysdeps/unix/sysv/linux/sigsuspend.c:50
 #1  0xb61288ba in *__GI___sigsuspend (set=0xbeffe514) at ../sysdeps/unix/sysv/linux/sigsuspend.c:87
 #2  0xb72a8a55 in __pthread_wait_for_restart_signal (self=0xbefffbe0) at pthread.c:1141
 #3  0xb72aa444 in __pthread_alt_lock (lock=0xb62175d0, self=0xbefffbe0) at restart.h:36
 #4  0xb72a7642 in *__GI___pthread_mutex_lock (mutex=0xb62175c0) at mutex.c:123
 #5  0xb616d445 in __libc_free (mem=0x80965f8) at malloc.c:3356
 #6  0xb612a704 in exit () from /usr/local/MPInsight/lib/sys/lib/libc.so.6
 #7  0xb5f5aef2 in xehInterpretSavedSigaction () from /opt/mqm/lib/libmqmcs_r.so
 #8  0xb5f5b5f3 in xehExceptionHandler () from /opt/mqm/lib/libmqmcs_r.so
 #9  0xb72abc72 in __pthread_sighandler_rt (signo=11, si=0xbeffe790, uc=0xbeffe810) at sighandler.c:62
 #10 <signal handler called>
 #11 0xb74834d9 in xercesc_2_2::BaseRefVectorOf<xercesc_2_2::ENameMap>::setElementAt () from /usr/local/MPInsight/lib/sys/lib/libxerces-c.so.22
 #12 0xb748290f in xercesc_2_2::XMLTransService::initTransService () from /usr/local/MPInsight/lib/sys/lib/libxerces-c.so.22
 #13 0xb744e149 in xercesc_2_2::XMLPlatformUtils::Initialize () from /usr/local/MPInsight/lib/sys/lib/libxerces-c.so.22
 #14 0x08074a08 in XmlManagerImpl::Init ()
 #15 0x08074894 in XmlManagerImpl::XmlManagerImpl ()
 #16 0x0806ae74 in LogMessage::LogMessage ()
 
 #22 0xb72a69a5 in pthread_start_thread (arg=0xbefffbe0) at manager.c:300
 #23 0xb61cbe87 in __clone () from /usr/local/MPInsight/lib/sys/lib/libc.so.6
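 
 Frames #7 to #9 of this trace show xehExceptionHandler from libmqmcs_r.so being entered through __pthread_sighandler_rt with signo=11, which suggests that a SIGSEGV handler belonging to the MQ library is installed in the process. One way to check this from inside the application is to read the current disposition with sigaction() before and after connecting. The sketch below assumes nothing about MQ itself. The helper names has_custom_handler and install_noop_handler are hypothetical, and the second one exists only so the check can be exercised.
 
```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Returns 1 if a handler function is installed for signo,
   0 if the disposition is SIG_DFL or SIG_IGN, -1 on error. */
int has_custom_handler(int signo)
{
    struct sigaction cur;
    /* A NULL new action means "read only": nothing is changed. */
    if (sigaction(signo, NULL, &cur) != 0)
        return -1;
    /* SA_SIGINFO means a three-argument handler is installed. */
    if (cur.sa_flags & SA_SIGINFO)
        return 1;
    return cur.sa_handler != SIG_DFL && cur.sa_handler != SIG_IGN;
}

static void noop_handler(int signo)
{
    (void)signo;
}

/* Installs a do-nothing handler, only to demonstrate the check.
   Returns 0 on success, -1 on error. */
int install_noop_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = noop_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```
 
 In the application, has_custom_handler(SIGSEGV) could be called once at startup and once after the first MQ connection is made. A change from 0 to 1 would indicate that something loaded in between installed a handler.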
 |  | 
		
		
		  | jefflowrey | 
			  
				|  Posted: Wed Aug 03, 2005 8:01 am    Post subject: |   |  | 
		
		  | Grand Poobah
 
 
 Joined: 16 Oct 2002, Posts: 19981
 
 
 | 
			  
				| Are you positive you are using the correct threading model? 
 Is LD_ASSUME_KERNEL in effect?
 _________________
 I am *not* the model of the modern major general.
 |  | 
		
		
		  | vivekkooks | 
			  
				|  Posted: Wed Aug 03, 2005 9:12 pm    Post subject: |   |  | 
		
		  |  Voyager
 
 
 Joined: 11 Jun 2003, Posts: 91
 
 
 | 
			  
				| Yes, I tried with LD_ASSUME_KERNEL=2.4.19 as well as LD_ASSUME_KERNEL=2.2.5.
 
 The application hangs in both cases.
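 
 To confirm which threading library the process actually ends up with under a given LD_ASSUME_KERNEL setting, glibc can be asked directly through confstr(). This is a minimal sketch assuming a glibc system. The function name threading_library is hypothetical, and on a C library that does not support _CS_GNU_LIBPTHREAD_VERSION it simply reports failure.
 
```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <unistd.h>

/* Writes the threading library identification reported by the C library
   into buf. Returns 0 on success, -1 if the value is unavailable or does
   not fit in len bytes. */
int threading_library(char *buf, size_t len)
{
    if (buf == NULL || len == 0)
        return -1;
    buf[0] = '\0';
#ifdef _CS_GNU_LIBPTHREAD_VERSION
    /* confstr() returns the size needed including the terminating NUL,
       or 0 if the name is not supported. */
    size_t needed = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len);
    if (needed == 0 || needed > len) {
        buf[0] = '\0';
        return -1;
    }
    return 0;
#else
    return -1;
#endif
}
```
 
 The resulting string names the implementation and its version, so it distinguishes LinuxThreads from NPTL. The same value is available from the shell with getconf GNU_LIBPTHREAD_VERSION, run with the same LD_ASSUME_KERNEL and library path as the application.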
 |  | 
		
		
		  | vivekkooks | 
			  
				|  Posted: Wed Aug 03, 2005 9:32 pm    Post subject: |   |  | 
		
		  |  Voyager
 
 
 Joined: 11 Jun 2003, Posts: 91
 
 
 | 
			  
				| This is how another thread hangs: 
 #0  0xb6128810 in do_sigsuspend (set=0xbf5ff85c) at ../sysdeps/unix/sysv/linux/sigsuspend.c:50
 #1  0xb61288ba in *__GI___sigsuspend (set=0xbf5ff85c) at ../sysdeps/unix/sysv/linux/sigsuspend.c:87
 #2  0xb72a8a55 in __pthread_wait_for_restart_signal (self=0xbf5ffbe0) at pthread.c:1141
 #3  0xb72aa444 in __pthread_alt_lock (lock=0xb62175d0, self=0xbf5ffbe0) at restart.h:36
 #4  0xb72a7642 in *__GI___pthread_mutex_lock (mutex=0xb62175c0) at mutex.c:123
 #5  0xb616d445 in __libc_free (mem=0x80cae20) at malloc.c:3356
 #6  0xb5f92ea0 in destroy_thread () from /opt/mqm/lib/libmqmcs_r.so
 #7  0xb72a96f8 in __pthread_destroy_specifics () at specific.c:192
 #8  0xb72a60de in __pthread_do_exit (retval=0xfffffffc, currentframe=0xbf5ffbd4 "") at join.c:43
 #9  0xb72a69ae in pthread_start_thread (arg=0xbf5ffbe0) at manager.c:303
 #10 0xb61cbe87 in __clone () from /usr/local/MPInsight/lib/sys/lib/libc.so.6
 
 
 Both kinds of trace show code from /opt/mqm/lib/libmqmcs_r.so on the stack (destroy_thread () in this one), so that library may be involved in the problem. However, no FDC is generated for these hangs.
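 
 In this trace destroy_thread () is called from __pthread_destroy_specifics during __pthread_do_exit, which is the path on which pthread key destructors run when a thread exits. So destroy_thread looks like a thread-specific-data destructor that frees per-thread memory, and here it is waiting on the same mutex (0xb62175c0) as the other hung threads. The sketch below shows only the general mechanism, not MQ's code. All names in it (per_thread_key, destroy_thread_data, worker, run_destructor_demo) are hypothetical.
 
```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t per_thread_key;
/* Written by the worker thread, read only after pthread_join(). */
static int destructor_calls = 0;

/* Runs automatically at thread exit for each non-NULL key value. */
static void destroy_thread_data(void *p)
{
    free(p);   /* may need the allocator lock, as in frame #5 */
    destructor_calls++;
}

static void *worker(void *arg)
{
    (void)arg;
    void *data = malloc(64);
    if (data != NULL && pthread_setspecific(per_thread_key, data) != 0)
        free(data);   /* value not stored, so no destructor will run */
    return NULL;      /* thread exit triggers the key destructor */
}

/* Returns the number of destructor calls observed, or -1 on error. */
int run_destructor_demo(void)
{
    pthread_t t;
    destructor_calls = 0;
    if (pthread_key_create(&per_thread_key, destroy_thread_data) != 0)
        return -1;
    if (pthread_create(&t, NULL, worker, NULL) != 0) {
        pthread_key_delete(per_thread_key);
        return -1;
    }
    pthread_join(t, NULL);
    pthread_key_delete(per_thread_key);
    return destructor_calls;
}
```
 
 Because the destructor calls free(), a thread that exits while another thread holds the allocator lock and never releases it will block in the destructor, which would be consistent with this thread being a victim of the hang rather than its cause.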
 |  | 
		