MQSeries.net :: View topic - lingering process with hints of mq problems

ranstice · Posted: Wed Sep 25, 2002 10:21 pm

Hi,

We have had a production problem for about 2 months now with a C process that lingers. In the middle of processing it sleeps - but never in the same place. Note that this process runs in parallel with 25 other processes - so we believe this has something to do with timing or maybe a conflicted MQ thread(??) - but we can't pinpoint it.

Today we used the modular debugger (mdb) to get a stack trace of the sleeping process. This is the output:

ffbee7f8 libc.so.1`_lwp_sema_wait+8(354e68, fefbe000, 0, 354da8, 24d54, 0)
ffbee858 libthread.so.1`_swtch+0x424(354da8, 354da8, fefbe000, 5, 1000, ff3c5f64)
ffbee8b8 libthread.so.1`_mutex_adaptive_lock+0x160(fefc98ec, 4c00, 1000, fffeffff, 1, 4d58)
ffbee918 libthread.so.1`_cmutex_lock+0x70(fe7be5f8, fefbe000, 0, fe74291c, 0, 0)
ffbee978 libc.so.1`free+0x18(38f128, 0, 0, 0, 0, 0)
ffbee9d8 libclntsh.so.8.0`epc_exit_handler+0x17c(fee5aa8c, 0, ff250ee8, fe71bc0c, 0, 0)
ffbeea38 libc.so.1`_exithandle+0x8c(fe7bc5d0, 3, fefbe000, 1, 2ccd8, fe7ba000)
ffbeea98 libc.so.1`exit+0x24(a, ff1f3dec, ff1fe5ec, 1, 2cc88, ff3c5f64)
ffbeeaf8 libmqmcs.so`xehInterpretSavedSigaction+0x49c(ff216668, ff216b28, ff216bd8, ffbeee30, ff1f3dec, a)
ffbeeb98 libmqmcs.so`xehExceptionHandler+0x138(a, ffbef0e8, ffbeee30, ff1f3dec, a, 1)
ffbeec98 libthread.so.1`__sighndlr+0xc(a, ffbef0e8, ffbeee30, ff11c874, 354e4c, 354e3c)
ffbeecf8 libthread.so.1`sigacthandler+0x708(a, 354da8, 0, 0, 0, fefbe000)
ffbef168 libc.so.1`_smalloc+0x8c(10, fe7c07e0, 4, 10, 0, 0)
ffbef1c8 libc.so.1`malloc+0x20(c, 43, 43000000, fe78e734, 20, 0)
ffbef228 program_initialization+0x1c4(ffbef390, ffbef390, fe7bdc68, fe7bdc68, 0, a7c60)
ffbef2a0 main2+0x360(f, 82ff0, 3e5338, 486600, 458408, 0)
ffbef3c0 invokeCACSApi+0x2e4(a427c, 1, 123000, fe78e734, 0, a3fc0)
ffbef4a8 processAccount+0x130(a41fc, 1, ffffffff, fe7bdc68, 58, 5
ffbef520 coreProcessingLoop+0x324(a41fc, 1, 354c00, 64, 0, 0)
ffbef590 main+0x7c(1, ffbef664, ffbef66c, a8da4, 0, 0)
ffbef600 _start+0x5c(0, 0, 0, 0, 0, 0)

We end up in _lwp_sema_wait when we run truss on the PID. All our MQ calls occur before we call invokeCACSApi (including a commit).

What is this xehExceptionHandler? Is this mq?

Should our process have a function that is invoked when an exception like this happens? If so, how would that function be called?
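To make the question concrete, here is a minimal sketch of the kind of function I mean. It is only an illustration under assumptions: it uses plain POSIX sigaction, the names fatal_signal_handler and install_fatal_handler are hypothetical and not from our code, and it does not take into account whatever handlers MQ installs itself. The handler sticks to async-signal-safe calls (write and _exit), so unlike the exit() in the trace above it would not run any exit handlers.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical handler: uses only async-signal-safe calls.
 * _exit() terminates immediately and does NOT run atexit/exit handlers. */
static void fatal_signal_handler(int signo)
{
    static const char msg[] = "fatal signal caught, terminating\n";
    ssize_t written = write(STDERR_FILENO, msg, sizeof msg - 1);

    (void)written; /* nothing useful can be done if the write fails */
    (void)signo;
    _exit(1);
}

/* Hypothetical helper: registers the handler for one signal.
 * Returns 0 on success, -1 on failure (same as sigaction). */
static int install_fatal_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = fatal_signal_handler;
    sigemptyset(&sa.sa_mask); /* block no additional signals in the handler */
    sa.sa_flags = 0;

    return sigaction(signo, &sa, NULL);
}
```

It would be registered once at startup, for example install_fatal_handler(SIGBUS) early in main. SIGBUS is chosen here only because the first argument to the handler frames in the trace is 0xa, which I take to be SIGBUS assuming Solaris signal numbering. Whether registering something like this would clash with the xehExceptionHandler in the trace is part of what I am asking.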

ANY assistance is greatly appreciated.
Thanks in advance,
Robin