Author |
Message
|
starship |
Posted: Mon Jul 31, 2006 12:09 am Post subject: Queue Manager Crashing on Solaris |
|
|
Apprentice
Joined: 07 Dec 2005 Posts: 33 Location: INDIA
|
Hi Guys,
We are facing an unusual problem:
We have a queue manager which crashes after being online for a while. The unusual thing we noticed is that the time stamps in the queue manager logs are wrong:
1. The error logs show dates such as 02/07/75 when the actual date is 31/07/06.
2. Sometimes the error logs show the correct time, especially when the queue manager is restarted or shuts down.
3. The FDCs produced also show the wrong time stamp.
The system time is correct, and we cannot find any problem with the system clock.
Can you please help us diagnose the problem?
Regards |
|
|
 |
mvic |
Posted: Mon Jul 31, 2006 12:57 am Post subject: Re: Queue Manager Crashing on Solaris |
|
|
 Jedi
Joined: 09 Mar 2004 Posts: 2080
|
starship wrote: |
Can you please help us diagnose the problem. |
Since you are receiving FDC files, it would be good to raise this with IBM Support to receive answers as soon as possible.
At the same time, you could post more of the details here if you wish. For example, the header portion of the FDC file(s). But these files are intended principally for use by your IBM Support representatives. |
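For anyone who wants to pull just the header portion out of an FDC file before posting it, here is a minimal sketch. It assumes the header rows have the `| Key :- Value |` shape shown in the FDC pasted in this thread; the function name `parse_fdc_header` and the `SAMPLE` constant are made up for illustration and are not part of any IBM tooling.

```python
import re

# Matches header rows such as "| Probe Id :- XC034040 |".
FIELD_RE = re.compile(r"^\|\s*(?P<key>[^|]+?)\s*:-\s*(?P<value>.*?)\s*\|?\s*$")

# A shortened copy of the header posted in this thread.
SAMPLE = """\
+-----------------------------------------------------------------------------+
| |
| WebSphere MQ First Failure Symptom Report |
| ========================================= |
| |
| Date/Time :- Friday July 05 10:51:40 EST 1968 |
| Probe Id :- XC034040 |
| Component :- xcsWaitEventSem |
| Program Name :- amqcrsta_nd |
| Probe Description :- AMQ6090: WebSphere MQ was unable to display an error |
| message 16. |
| FDCSequenceNumber :- 0 |
| |
+-----------------------------------------------------------------------------+
"""


def parse_fdc_header(text):
    """Return the 'Key :- Value' pairs of an FDC header as a dict."""
    fields = {}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue  # border rows and anything outside the box
        match = FIELD_RE.match(line)
        if match:
            last_key = match.group("key")
            fields[last_key] = match.group("value")
            continue
        # A row without ":-" after a field is treated as a wrapped value.
        inner = line.strip("|").strip()
        if inner and last_key is not None:
            fields[last_key] += " " + inner
    return fields


if __name__ == "__main__":
    header = parse_fdc_header(SAMPLE)
    for key in ("Date/Time", "Probe Id", "Component", "Program Name"):
        print(f"{key}: {header[key]}")
```

The title and the blank rows are skipped because they appear before any field has been seen; wrapped values such as the Probe Description are joined onto the previous field.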
|
|
 |
obriencm |
Posted: Mon Jul 31, 2006 12:57 am Post subject: |
|
|
Acolyte
Joined: 31 Jan 2002 Posts: 64 Location: Ireland
|
What information is being given in the error logs and in the FDC files? |
|
|
 |
starship |
Posted: Mon Jul 31, 2006 1:09 am Post subject: MQ Server on Solaris crashing |
|
|
Apprentice
Joined: 07 Dec 2005 Posts: 33 Location: INDIA
|
The FDCs look like this:
+-----------------------------------------------------------------------------+
| |
| WebSphere MQ First Failure Symptom Report |
| ========================================= |
| |
| Date/Time :- Friday July 05 10:51:40 EST 1968 |
| Host Name :- au1086s (SunOS 5.7) |
| PIDS :- 5724B4103 |
| LVLS :- 530.10 CSD10 |
| Product Long Name :- WebSphere MQ for Sun Solaris |
| Vendor :- IBM |
| Probe Id :- XC034040 |
| Application Name :- MQM |
| Component :- xcsWaitEventSem |
| Build Date :- May 13 2005 |
| CMVC level :- p530-10-L050504 |
| Build Type :- IKAP - (Production) |
| UserID :- 00002200 (mqm) |
| Program Name :- amqcrsta_nd |
| Process :- 00026012 |
| Thread :- 00000001 |
| QueueManager :- AU30KD1!MQ |
| Major Errorcode :- Unknown(16) |
| Minor Errorcode :- OK |
| Probe Type :- MSGAMQ0016 |
| Probe Severity :- 4 |
| Probe Description :- AMQ6090: WebSphere MQ was unable to display an error |
| message 16. |
| FDCSequenceNumber :- 0 |
| |
+-----------------------------------------------------------------------------+
MQM Function Stack
ccxTcpResponder
ccxResponder
rrxResponder
rriMQIServer
MQGET
zstMQGET
ziiMQGET
ziiSendReceiveAgent
zcpReceiveOnPipe
xcsWaitEventSem
xcsFFST
MQM Trace History
-----{ xcsConvertString
-----} xcsConvertString rc=OK
----} rriConvert rc=OK
----{ MQGET
-----{ zstMQGET
------{ zstVerifyPCD
------} zstVerifyPCD rc=OK
------{ ziiMQGET
-------{ ziiCreateIPCCMessage
--------{ zcpCreateMessage
--------} zcpCreateMessage rc=OK
-------} ziiCreateIPCCMessage rc=OK
-------{ ziiSendReceiveAgent
--------{ zcpSendOnPipe
---------{ xcsResetEventSem
---------} xcsResetEventSem rc=OK
---------{ xcsPostEventSem
---------} xcsPostEventSem rc=OK
--------} zcpSendOnPipe rc=OK
--------{ zcpReceiveOnPipe
---------{ xcsWaitEventSem
---------} xcsWaitEventSem rc=OK
--------} zcpReceiveOnPipe rc=OK
-------} ziiSendReceiveAgent rc=arcE_NO_MSG_AVAILABLE
-------{ zcpDeleteMessage
-------} zcpDeleteMessage rc=OK
------} ziiMQGET rc=OK
-----} zstMQGET rc=arcE_NO_MSG_AVAILABLE
----} MQGET rc=arcE_NO_MSG_AVAILABLE
----{ rriConvert
-----{ xcsConvertString
-----} xcsConvertString rc=OK
----} rriConvert rc=OK
----{ ccxSend
-----{ cciTcpSend
------{ send
------} send rc=Unknown(1BC)
-----} cciTcpSend rc=OK
----} ccxSend rc=OK
----{ ccxFreeMem
----} ccxFreeMem rc=OK
----{ ccxReceive
-----{ cciTcpReceive
------{ ccxAllocMem
------} ccxAllocMem rc=OK
------{ recv
------} recv rc=Unknown(FFFF)
------{ xcsWaitFd
-------{ poll
-------} poll rc=Unknown(1)
------} xcsWaitFd rc=Unknown(1)
------{ recv
------} recv rc=Unknown(1BC)
-----} cciTcpReceive rc=OK
----} ccxReceive rc=OK
----{ rriConvert
-----{ xcsConvertString
-----} xcsConvertString rc=OK
----} rriConvert rc=OK
----{ MQGET
-----{ zstMQGET
------{ zstVerifyPCD
------} zstVerifyPCD rc=OK
------{ ziiMQGET
-------{ ziiCreateIPCCMessage
--------{ zcpCreateMessage
--------} zcpCreateMessage rc=OK
-------} ziiCreateIPCCMessage rc=OK
-------{ ziiSendReceiveAgent
--------{ zcpSendOnPipe
---------{ xcsResetEventSem
---------} xcsResetEventSem rc=OK
---------{ xcsPostEventSem
---------} xcsPostEventSem rc=OK
--------} zcpSendOnPipe rc=OK
--------{ zcpReceiveOnPipe
---------{ xcsWaitEventSem
---------} xcsWaitEventSem rc=OK
--------} zcpReceiveOnPipe rc=OK
-------} ziiSendReceiveAgent rc=arcE_NO_MSG_AVAILABLE
-------{ zcpDeleteMessage
-------} zcpDeleteMessage rc=OK
------} ziiMQGET rc=OK
-----} zstMQGET rc=arcE_NO_MSG_AVAILABLE
----} MQGET rc=arcE_NO_MSG_AVAILABLE
----{ rriConvert
-----{ xcsConvertString
-----} xcsConvertString rc=OK
----} rriConvert rc=OK
----{ ccxSend
-----{ cciTcpSend
------{ send
------} send rc=Unknown(1BC)
-----} cciTcpSend rc=OK
----} ccxSend rc=OK
----{ ccxFreeMem
----} ccxFreeMem rc=OK
----{ ccxReceive
-----{ cciTcpReceive
------{ ccxAllocMem
------} ccxAllocMem rc=OK
------{ recv
------} recv rc=Unknown(FFFF)
------{ xcsWaitFd
-------{ poll
-------} poll rc=Unknown(1)
------} xcsWaitFd rc=Unknown(1)
------{ recv
------} recv rc=Unknown(1BC)
-----} cciTcpReceive rc=OK
----} ccxReceive rc=OK
----{ rriConvert
-----{ xcsConvertString
-----} xcsConvertString rc=OK
----} rriConvert rc=OK
----{ MQGET
-----{ zstMQGET
------{ zstVerifyPCD
------} zstVerifyPCD rc=OK
------{ ziiMQGET
-------{ ziiCreateIPCCMessage
--------{ zcpCreateMessage
--------} zcpCreateMessage rc=OK
-------} ziiCreateIPCCMessage rc=OK
-------{ ziiSendReceiveAgent
--------{ zcpSendOnPipe
---------{ xcsResetEventSem
---------} xcsResetEventSem rc=OK
---------{ xcsPostEventSem
---------} xcsPostEventSem rc=OK
--------} zcpSendOnPipe rc=OK
--------{ zcpReceiveOnPipe
---------{ xcsWaitEventSem
---------} xcsWaitEventSem rc=OK
--------} zcpReceiveOnPipe rc=OK
-------} ziiSendReceiveAgent rc=arcE_NO_MSG_AVAILABLE
-------{ zcpDeleteMessage
-------} zcpDeleteMessage rc=OK
------} ziiMQGET rc=OK
-----} zstMQGET rc=arcE_NO_MSG_AVAILABLE
----} MQGET rc=arcE_NO_MSG_AVAILABLE
----{ rriConvert
-----{ xcsConvertString
-----} xcsConvertString rc=OK
----} rriConvert rc=OK
----{ ccxSend
-----{ cciTcpSend
------{ send
------} send rc=Unknown(1BC)
-----} cciTcpSend rc=OK
----} ccxSend rc=OK
----{ ccxFreeMem
----} ccxFreeMem rc=OK
----{ ccxReceive
-----{ cciTcpReceive
------{ ccxAllocMem
------} ccxAllocMem rc=OK
------{ recv
------} recv rc=Unknown(FFFF)
------{ xcsWaitFd
-------{ poll
-------} poll rc=Unknown(1)
------} xcsWaitFd rc=Unknown(1)
------{ recv
------} recv rc=Unknown(1BC)
-----} cciTcpReceive rc=OK
----} ccxReceive rc=OK
----{ rriConvert
-----{ xcsConvertString
-----} xcsConvertString rc=OK
----} rriConvert rc=OK
----{ MQGET
-----{ zstMQGET
------{ zstVerifyPCD
------} zstVerifyPCD rc=OK
------{ ziiMQGET
-------{ ziiCreateIPCCMessage
--------{ zcpCreateMessage
--------} zcpCreateMessage rc=OK
-------} ziiCreateIPCCMessage rc=OK
-------{ ziiSendReceiveAgent
--------{ zcpSendOnPipe
---------{ xcsResetEventSem
---------} xcsResetEventSem rc=OK
---------{ xcsPostEventSem
---------} xcsPostEventSem rc=OK
--------} zcpSendOnPipe rc=OK
--------{ zcpReceiveOnPipe
---------{ xcsWaitEventSem
---------} xcsWaitEventSem rc=OK
--------} zcpReceiveOnPipe rc=OK
-------} ziiSendReceiveAgent rc=arcE_NO_MSG_AVAILABLE
-------{ zcpDeleteMessage
-------} zcpDeleteMessage rc=OK
------} ziiMQGET rc=OK
-----} zstMQGET rc=arcE_NO_MSG_AVAILABLE
----} MQGET rc=arcE_NO_MSG_AVAILABLE
----{ rriConvert
-----{ xcsConvertString
-----} xcsConvertString rc=OK
----} rriConvert rc=OK
----{ ccxSend
-----{ cciTcpSend
------{ send
------} send rc=Unknown(1BC)
-----} cciTcpSend rc=OK
----} ccxSend rc=OK
----{ ccxFreeMem
----} ccxFreeMem rc=OK
----{ ccxReceive
-----{ cciTcpReceive
------{ ccxAllocMem
------} ccxAllocMem rc=OK
------{ recv
------} recv rc=Unknown(FFFF)
------{ xcsWaitFd
-------{ poll
-------} poll rc=Unknown(1)
------} xcsWaitFd rc=Unknown(1)
------{ recv
------} recv rc=Unknown(1BC)
-----} cciTcpReceive rc=OK
----} ccxReceive rc=OK
----{ rriConvert
-----{ xcsConvertString
-----} xcsConvertString rc=OK
----} rriConvert rc=OK
----{ MQGET
-----{ zstMQGET
------{ zstVerifyPCD
------} zstVerifyPCD rc=OK
------{ ziiMQGET
-------{ ziiCreateIPCCMessage
--------{ zcpCreateMessage
--------} zcpCreateMessage rc=OK
-------} ziiCreateIPCCMessage rc=OK
-------{ ziiSendReceiveAgent
--------{ zcpSendOnPipe
---------{ xcsResetEventSem
---------} xcsResetEventSem rc=OK
---------{ xcsPostEventSem
---------} xcsPostEventSem rc=OK
--------} zcpSendOnPipe rc=OK
--------{ zcpReceiveOnPipe
---------{ xcsWaitEventSem
----------{ xcsBuildDumpPtr
-----------{ xcsGetMem
-----------} xcsGetMem rc=OK
----------} xcsBuildDumpPtr rc=OK
----------{ xcsFFST |
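The trace history above is long, so a small script can help find which calls were still open when the FDC was cut. The sketch below assumes only the line shapes visible in this trace (`---{ name` on entry, `---} name rc=...` on exit); `open_calls` and `TRACE_TAIL` are hypothetical names, and this is not an IBM-supplied tool.

```python
import re

# "-----{ name" marks an entry, "-----} name rc=..." marks an exit.
TRACE_RE = re.compile(r"^(?P<dashes>-+)(?P<kind>[{}])\s+(?P<name>\S+)")

# The last lines of the trace history posted in this thread.
TRACE_TAIL = """\
-------{ ziiSendReceiveAgent
--------{ zcpSendOnPipe
---------{ xcsResetEventSem
---------} xcsResetEventSem rc=OK
---------{ xcsPostEventSem
---------} xcsPostEventSem rc=OK
--------} zcpSendOnPipe rc=OK
--------{ zcpReceiveOnPipe
---------{ xcsWaitEventSem
----------{ xcsBuildDumpPtr
-----------{ xcsGetMem
-----------} xcsGetMem rc=OK
----------} xcsBuildDumpPtr rc=OK
----------{ xcsFFST
""".splitlines()


def open_calls(lines):
    """Return the functions that were entered but never exited, outermost first."""
    stack = []
    for line in lines:
        match = TRACE_RE.match(line.strip())
        if not match:
            continue
        name = match.group("name")
        if match.group("kind") == "{":
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
        # An exit with no matching entry is ignored: the history can begin
        # in the middle of a call, as it does in the FDC above.
    return stack


if __name__ == "__main__":
    print(" > ".join(open_calls(TRACE_TAIL)))
```

Run on the tail of this trace, the open calls are ziiSendReceiveAgent, zcpReceiveOnPipe, xcsWaitEventSem and xcsFFST, which lines up with the bottom of the MQM Function Stack in the same FDC.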
|
|
 |
starship |
Posted: Mon Jul 31, 2006 1:16 am Post subject: |
|
|
Apprentice
Joined: 07 Dec 2005 Posts: 33 Location: INDIA
|
The error log contains entries like:
-------------------------------------------------------------------------------
21/07/68 11:22:00 AM
AMQ6184: An internal WebSphere MQ error has occurred on queue manager
AU30KD1!MQ.
EXPLANATION:
An error has been detected, and the WebSphere MQ error recording routine has
been called. The failing process is process 13065.
ACTION:
Use the standard facilities supplied with your system to record the problem
identifier, and to save the generated output files. Contact your IBM support
center. Do not discard these files until the problem has been resolved.
----- amqxfdcx.c : 722 --------------------------------------------------------
07/21/68 11:22:00
AMQ6090: WebSphere MQ was unable to display an error message 16.
EXPLANATION:
MQ has attempted to display the message associated with return code hexadecimal
'16'. The return code indicates that there is no message text associated with
the message. Associated with the request are inserts 0 : 0 : : : .
ACTION:
Use the standard facilities supplied with your system to record the problem
identifier, and to save the generated output files. Contact your IBM support
center. Do not discard these files until the problem has been resolved. |
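To see how many entries carry a bad date, and when the bad dates start, a quick scan of the error log for time stamps with an unexpected year may help. This is a rough sketch: it assumes every stamp starts a line in one of the two shapes shown above (the day and month swap places, but the two-digit year is the third field in both). `suspect_stamps` is a made-up name, and the last sample line is invented purely as a correctly dated control.

```python
import re

# Two digits, slash, two digits, slash, two-digit year, then hh:mm:ss.
STAMP_RE = re.compile(r"^(\d{2})/(\d{2})/(\d{2}) \d{2}:\d{2}:\d{2}")

SAMPLE_LINES = [
    "21/07/68 11:22:00 AM",
    "AMQ6184: An internal WebSphere MQ error has occurred on queue manager",
    "07/21/68 11:22:00",
    "31/07/06 12:09:00",  # invented control line with the expected year
]


def suspect_stamps(lines, expected_years=("06",)):
    """Return (line_number, text) for stamps whose year is not expected."""
    suspects = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        match = STAMP_RE.match(text)
        if match and match.group(3) not in expected_years:
            suspects.append((number, text))
    return suspects


if __name__ == "__main__":
    for number, text in suspect_stamps(SAMPLE_LINES):
        print(f"line {number}: {text}")
```

Comparing the year as text avoids guessing the century: many date parsers would read a two-digit 68 as 2068 rather than 1968.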
|
|
 |
obriencm |
Posted: Mon Jul 31, 2006 1:28 am Post subject: |
|
|
Acolyte
Joined: 31 Jan 2002 Posts: 64 Location: Ireland
|
|
|
 |
xxx |
Posted: Mon Jul 31, 2006 9:51 am Post subject: |
|
|
Centurion
Joined: 13 Oct 2003 Posts: 137
|
|
|
 |
starship |
Posted: Mon Jul 31, 2006 6:03 pm Post subject: |
|
|
Apprentice
Joined: 07 Dec 2005 Posts: 33 Location: INDIA
|
Quote: |
I think the solution was the following:
Set in /etc/system :
set rlim_fd_cur=1024 |
That value is already set to 1024. In the meantime, we have seen the queue manager come down 5 times in 2 weeks. |
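The `rlim_fd_cur` setting in `/etc/system` only gives the default soft limit on open file descriptors; what a running process actually gets can differ. A small sketch for checking the limit in effect, using Python's standard `resource` module. It reports the limit of the process running the script, not of the MQ processes themselves, so run it as the same user and from the same environment that starts the queue manager. `fd_limits` and `soft_limit_ok` are illustrative names.

```python
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def soft_limit_ok(soft, wanted=1024):
    """True when the soft limit is unlimited or at least `wanted`."""
    if soft == resource.RLIM_INFINITY:
        return True
    return soft >= wanted


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"soft={soft} hard={hard} ok={soft_limit_ok(soft)}")
```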
|
|
 |
starship |
Posted: Wed Aug 09, 2006 2:34 am Post subject: |
|
|
Apprentice
Joined: 07 Dec 2005 Posts: 33 Location: INDIA
|
Thanks, guys, for your input. We were able to resolve the problem: it turned out to be a fault in the UNIX hardware, which was reporting the wrong time. The component has been replaced, and the queue manager is now working fine.
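For anyone hitting something similar, one cheap way to look for a misbehaving time source is to compare wall-clock time against a monotonic clock over short intervals and log any large disagreement. The sketch below is only an idea under stated assumptions: the names `drift_seconds` and `watch_clock` and the 2-second tolerance are arbitrary, and if the faulty component feeds both clocks the check may not catch it.

```python
import time


def drift_seconds(wall_start, mono_start, wall_end, mono_end):
    """Wall-clock elapsed time minus monotonic elapsed time, in seconds."""
    return (wall_end - wall_start) - (mono_end - mono_start)


def watch_clock(interval=1.0, samples=5, tolerance=2.0):
    """Sample both clocks and return (wall_time, drift) for each jump seen."""
    jumps = []
    wall_prev, mono_prev = time.time(), time.monotonic()
    for _ in range(samples):
        time.sleep(interval)
        wall_now, mono_now = time.time(), time.monotonic()
        drift = drift_seconds(wall_prev, mono_prev, wall_now, mono_now)
        if abs(drift) > tolerance:
            jumps.append((wall_now, drift))
        wall_prev, mono_prev = wall_now, mono_now
    return jumps


if __name__ == "__main__":
    for wall, drift in watch_clock():
        print(f"wall clock jumped by {drift:+.1f}s at {time.ctime(wall)}")
```

A healthy system should report no jumps; a wall clock that leaps backwards by decades, as the log dates in this thread suggest, would show up as a very large negative drift.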
Thanks |
|
|
 |
obriencm |
Posted: Wed Aug 09, 2006 2:37 am Post subject: |
|
|
Acolyte
Joined: 31 Jan 2002 Posts: 64 Location: Ireland
|
Great to hear it is working correctly again. How long has it been since you last had a failure? |
|
|
 |
starship |
Posted: Wed Aug 09, 2006 2:55 am Post subject: |
|
|
Apprentice
Joined: 07 Dec 2005 Posts: 33 Location: INDIA
|
Quote: |
How long now since you have had a failure |
It went on for about 3 weeks before we were able to find the solution. |
|
|
 |
obriencm |
Posted: Wed Aug 09, 2006 2:58 am Post subject: |
|
|
Acolyte
Joined: 31 Jan 2002 Posts: 64 Location: Ireland
|
How long has it now been since you identified the problem with the hardware? When did your Qmgrs last crash? |
|
|
 |
|