Analysing FDCs on Unix - it's not always MQ that's wrong
dgolding (Yatiri; joined 16 May 2001; posts: 668; location: Switzerland)
Posted: Fri Jul 05, 2002 1:41 am    Post subject: Analysing FDCs on Unix - it's not always MQ that's wrong

Many thanks to Justin Fries for his permission to republish this (from Vienna MQ List April 2000):
 
 
Hi, all.

     If you haven't seen an FDC with a probeid like XC130001 or XC130003,
you soon will.  These FDCs are created on UNIX systems when an application
does something foolish like trying to access memory outside its address
space (a segmentation violation), or trying to access non word-aligned
memory (a bus violation).  UNIX sends signals to misbehaved processes which
try to do these things--SIGSEGV and SIGBUS in the two examples I gave.  If
the application is a queue manager process or a connected user application,
the MQSeries signal handler will write out one of these XC13000x FDCs to
help us debug the problem.
 
 
     If an XC13000x FDC is created by a core queue manager
program--amqcrsta, amqzxma0, runmqsc, or any other program written entirely
by IBM--then the problem is definitely ours.  In that case, we'll need to
examine the FDC and perhaps gather other information to diagnose the
problem.  If an XC13000x FDC is created by a user application which is
merely connected to the queue manager, there is a possibility that the
failure is in the customer's code and not ours.  For example, a user thread
may cause a segmentation violation by doing something foolish (dereferencing
a NULL pointer, say), and this will cause UNIX to send signal 11
(SIGSEGV) to the process.  Because MQSeries has its own handler installed
for SIGSEGV, we print out an FDC for the problem.  In this case, we're
really reporting someone else's error, but it almost looks as if we are the
culprits.
 
 
     The MQSeries function which handles these synchronous signals is
called xehExceptionHandler, and it can create six different FDCs with
probes XC130001 through XC130006.  Here is your guide to interpreting these
FDCs.  To make it a bit easier to find your answer, blue suggests a problem
with our code is possible while red suggests a problem with user code:
 
 
XC130001
     When any thread does an MQCONN, a control block called pCtl is
     allocated for it to hold information about the state of that thread.
     This FDC is created when that pCtl control block does not exist.  If a
     thread has no pCtl control block, it is most likely not connected to
     MQSeries, so the problem is likely to be in user code.  If the FDC has
     no trace in it, you can be almost certain the problem is in the
     customer application.

XC130002
     There are times when MQSeries needs to do dangerous things, like
     calling user exits.  Because we can't really trust exits to be
     perfectly written, we set up a recovery address where we can go if the
     exit raises an exception.  This FDC indicates that something--most
     likely a user exit--raised an exception.  MQSeries will jump to the
     recovery address and continue to process, but this FDC suggests a
     problem in the exit.

XC130003
     This is the most common probe our exception handler generates.  It
     indicates that a thread connected to MQSeries ran into unexpected
     trouble.  However, it is possible that the thread which created the
     FDC might have been running customer code at the time.  If the
     function stack shows only xcsFFST, and if the last two entries in the
     trace show a return from an MQI call followed by an entry into
     xcsFFST, then the problem is again likely to be in the customer's
     code.  If the process is one of the core queue manager processes, or
     if the function stack shows other MQSeries functions, then this FDC
     suggests a problem within MQSeries.

XC130004
     As I said before, MQSeries does need to do dangerous things from time
     to time.  For example, MQPUT needs to make sure that the pointer to
     the message data passed in by the customer application is valid.  The
     only way to do this is to try to access the pointer, even though that
     may cause a segmentation violation.  MQSeries handles this by setting
     up a recovery address to which it can jump when a segmentation
     violation occurs, then accessing the pointer.  If the pointer is in
     fact bad, then MQSeries cuts this FDC and recovers automatically.
     This FDC usually means that the customer application passed in a bad
     pointer, although there may be some cases where MQSeries itself is at
     fault.

XC130005
     I have never seen this probe and never expect to.  It means that this
     thread is expecting a possible exception, but unlike the XC130002 or
     XC130004 FDCs, there is no recovery area set up.  MQSeries is very
     careful about setting up recovery areas when needed, so this FDC
     indicates a definite problem in our code.

XC130006
     This FDC also indicates that a thread connected to MQSeries ran into
     unexpected trouble.  However, the thread is marked as being unprepared
     to handle exceptions, which means that it is probably not in MQSeries
     code.  As with XC130003, if the function stack shows only xcsFFST, the
     problem is likely to be with customer code.  If the function stack
     includes other functions, this could be a more serious problem within
     MQSeries.
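The guide above boils down to a small lookup table. Here is a minimal Python sketch of it for anyone who wants to triage a directory of FDCs quickly. The names `PROBE_GUIDE` and `first_suspect` are made up for this example, and the hint texts are paraphrases of the guide, not IBM wording.

```python
# Likely first suspect for each xehExceptionHandler probe, transcribed
# from the guide above. The hint wording is a paraphrase (an assumption
# of this sketch), not text taken from MQSeries itself.
PROBE_GUIDE = {
    "XC130001": ("user code", "no pCtl control block, thread probably not connected"),
    "XC130002": ("user exit", "exception raised while a recovery address was set"),
    "XC130003": ("either", "check the function stack and the end of the trace"),
    "XC130004": ("user code", "bad pointer, usually passed in by the application"),
    "XC130005": ("MQSeries code", "exception expected but no recovery area set up"),
    "XC130006": ("either", "thread not prepared for exceptions, check the function stack"),
}


def first_suspect(probe_id):
    """Return a one-line hint for a probe id such as 'XC130003'."""
    key = probe_id.strip().upper()
    if key not in PROBE_GUIDE:
        return f"{key}: not one of the six xehExceptionHandler probes"
    suspect, hint = PROBE_GUIDE[key]
    return f"{key}: suspect {suspect} ({hint})"


print(first_suspect("xc130005"))
```

The table only gives a starting point. For XC130003 and XC130006 the function stack still has to be read by hand.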
 
 
 
 
 
     Here's an example of a problem in customer code.  This FDC was created
because a thread with no pCtl (a thread not currently connected to
MQSeries) raised an exception:
 
 
        +-----------------------------------------------------------------------------+
        |                                                                             |
        | MQSeries First Failure Symptom Report                                       |
        | =====================================                                       |
        |                                                                             |
        | Date/Time         :- Tuesday February 02 18:05:16 EST 1999                  |
        | Host Name         :- calvin                                                 |
        | PIDS              :- 5765B73                                                |
        | LVLS              :- 500                                                    |
        | Product Long Name :- MQSeries for AIX                                       |
        | Vendor            :- IBM                                                    |
        | Probe Id          :- XC130001                                               |
        | Application Name  :- MQM                                                    |
        | Component         :- xehExceptionHandler                                    |
        | Build Date        :- Nov 23 1998                                            |
        | UserID            :- 00000010 (UNKNOWN)                                     |
        | Process           :- 00100836                                               |
        | Major Errorcode   :- xecSTOP                                                |
        | Minor Errorcode   :- OK                                                     |
        | Probe Type        :- HALT6109                                               |
        | Probe Severity    :- 1                                                      |
        | Probe Description :- AMQ6109: An internal MQSeries error has occurred.      |
        |                                                                             |
        +-----------------------------------------------------------------------------+
 
 
 
     Here's an FDC which generally indicates a problem in our code.  But if
you look at the function stack and read my description earlier, you can in
fact see that the problem is most likely with the customer's code.  Even
better, the Arith1 field in this FDC prints out the particular signal which
was raised for this exception.  In this case (and this is the most common
case), the signal was number 11, or SIGSEGV.  This means that the problem
was a segmentation violation (the thread trying to access memory which is
outside its address space):
 
 
        +-----------------------------------------------------------------------------+
        |                                                                             |
        | MQSeries First Failure Symptom Report                                       |
        | =====================================                                       |
        |                                                                             |
        | Date/Time         :- Tuesday February 02 16:58:44 EST 1999                  |
        | Host Name         :- calvin                                                 |
        | PIDS              :- 5765B73                                                |
        | LVLS              :- 500                                                    |
        | Product Long Name :- MQSeries for AIX                                       |
        | Vendor            :- IBM                                                    |
        | Probe Id          :- XC130003                                               |
        | Application Name  :- MQM                                                    |
        | Component         :- xehExceptionHandler                                    |
        | Build Date        :- Nov 23 1998                                            |
        | UserID            :- 00000010 (hobbes)                                      |
        | Process           :- 00109802                                               |
        | Thread            :- 00000001                                               |
        | QueueManager      :- SUSIE                                                  |
        | Major Errorcode   :- xecSTOP                                                |
        | Minor Errorcode   :- OK                                                     |
        | Probe Type        :- HALT6109                                               |
        | Probe Severity    :- 1                                                      |
        | Probe Description :- AMQ6109: An internal MQSeries error has occurred.      |
        | Arith1            :- 11 b                                                   |
        |                                                                             |
        +-----------------------------------------------------------------------------+
 
 
        MQM Function Stack
        xcsFFST

        MQM Trace History
         [Lots of stuff removed here]

         <-- zcpDeletePacket rc=OK
         <-- ziiMQPUT rc=OK
         <-- zstMQPUT rc=OK
         <-- MQPUT rc=OK
         --> xcsFFST
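The two checks used on this FDC (decode Arith1, then look at the function stack and the tail of the trace) can be sketched in Python. This is only an illustration under stated assumptions: the function names are invented here, the Arith1 layout (decimal value followed by the same value in hex) is inferred from the FDCs quoted in this thread, an "MQI call" is taken to be any function whose name starts with `MQ`, and the signal name comes from the machine running the script, whose numbering may differ from the machine that cut the FDC.

```python
import signal


def describe_arith1(field):
    """Decode an Arith1 value such as '11 b' (decimal, then hex)."""
    decimal_text, hex_text = field.split()
    number = int(decimal_text)
    if number != int(hex_text, 16):
        raise ValueError(f"decimal and hex parts disagree: {field!r}")
    try:
        # Signal numbering is platform-specific; this uses the local table.
        name = signal.Signals(number).name
    except ValueError:
        name = "unknown signal"
    return f"signal {number} ({name})"


def looks_like_user_code(function_stack, trace_history):
    """Apply the XC130003 rule of thumb from the guide above.

    True when the function stack holds only xcsFFST and the last two
    trace entries are a return from an MQI call followed by an entry
    into xcsFFST.
    """
    entries = [line.split() for line in trace_history if line.strip()]
    if function_stack != ["xcsFFST"] or len(entries) < 2:
        return False
    returned, entered = entries[-2], entries[-1]
    return (
        len(returned) >= 2
        and returned[0] == "<--"
        and returned[1].startswith("MQ")  # assumption: MQI verbs start with MQ
        and entered[:2] == ["-->", "xcsFFST"]
    )


trace_tail = ["<-- zstMQPUT rc=OK", "<-- MQPUT rc=OK", "--> xcsFFST"]
print(describe_arith1("11 b"))
print(looks_like_user_code(["xcsFFST"], trace_tail))
```

A `True` here only says the pattern matches the rule of thumb. It does not prove the application is at fault.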
 
 
 
     Please realise that these exception handler FDCs are, and always will
be, common.  They are created whenever an exception is raised in any process
connected to MQSeries, so there is no fix for XC130001, and CSD04 does not
fix XC130003.  It is possible for an APAR or a CSD to correct a specific
problem which can cause an XC130003, but to say that these FDCs are
completely fixed by some patch or efix is incorrect.
 
 
     In order to solve these problems, I suggest exporting
AMQ_ABORT_ON_EXCEPTION=TRUE.  If this variable is set in the environment in
which the application is running, the MQSeries exception handler will abort
the process after cutting the FDC.  This will produce a core file which we
can investigate to determine conclusively who is at fault and what needs to
be done to solve the problem.  If the failing program is an MQSeries
program we can examine the core file, but if it is a customer program they
will need to use dbx, gdb, or their favourite debugger to dump the stacks
for each thread in the process.  We can examine cores from customer
applications only when we have their compiled application as well as all
libraries on which it depends.  If their application is linked with
MQSeries, Tuxedo, Oracle, CORBA, and half a dozen other things, forget it.
If it's a straight MQSeries application we may be able to work on it
provided that we have the binary program, the core file, and a system at
the same OS and maintenance level.  If not, things get very tricky very
quickly.
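If the application is started from a wrapper script rather than an interactive shell, the variable can be set for that one process only. A minimal Python sketch follows. Only the variable name AMQ_ABORT_ON_EXCEPTION and the value TRUE come from the text above. The helper names and the application path in the usage note are hypothetical.

```python
import os
import subprocess


def abort_on_exception_env(base_env=None):
    """Return a copy of the environment with AMQ_ABORT_ON_EXCEPTION=TRUE."""
    env = dict(os.environ if base_env is None else base_env)
    env["AMQ_ABORT_ON_EXCEPTION"] = "TRUE"
    return env


def run_with_abort_on_exception(command):
    """Run `command` (a list of arguments) with the variable set.

    The caller's own environment is left untouched, so other programs
    started from the same shell are not affected.
    """
    return subprocess.run(command, env=abort_on_exception_env(), check=False)
```

For example, `run_with_abort_on_exception(["/path/to/your_mq_app"])`, where the path is a placeholder for the failing application. Whether a core file is actually written also depends on the operating system's core dump settings, which this sketch does not touch.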
 
 
     I hope this helps.

     Cheers,

     Justin T. Fries
     MQSeries Support
     Raleigh, North Carolina
     (919) 254-1422  TL 444-1422
     Email: justinf@us.ibm.com
			   
			 
kavithadhevi (Master; joined 14 May 2002; posts: 201; location: USA)
Posted: Tue Jul 09, 2002 5:15 am

Hi,

Many thanks to Justin Fries for an excellent piece of information for all MQSeries specialists.

This information is really very helpful for MQSeries support staff and problem solvers like me. It is an excellent source for analysing FDC files. Thanks again for posting it.

-- Kavitha
			   
			 
cvshiva (Apprentice; joined 04 Mar 2002; posts: 35; location: Chennai)
Posted: Thu Dec 12, 2002 12:43 am    Post subject: FDC Files Generated for MQ Agent Process Terminated

We encountered the following problem in our production environment: the queue manager hung. None of the control commands or runmqsc commands responded, and we couldn't end the queue manager either. Ultimately we had to reboot the server. It is an AIX server used by many applications that rely on MQ as their messaging layer. Getting agreement to reboot such a large application box is a lot of trouble, because of the impact on production data, payment cutoffs and everything else that is part and parcel of any banking system.

Coming to the problem, this is what I could see in the /var/mqm/errors/AMQERR01.LOG file:
 
 
-------------------------------------------------------------------------------
12/12/02  03:38:48
AMQ5009: MQSeries agent process 260894 has terminated unexpectedly.

EXPLANATION:
MQSeries has detected that an agent process has terminated unexpectedly. The
queue manager connection(s) that this process is responsible for will be
broken.
ACTION:
Use any previous FFSTs to determine the reason for the failure. Try to
eliminate the following reasons before contacting your IBM support center.
1) A user has inadvertently terminated the process.
2) The system is low on resources.  Some operating systems terminate processes
  to free resources.  If your system is low on resources, it is possible that
  the operating system has terminated the process so that a new process can be
  created.
-------------------------------------------------------------------------------
 
 
The related FDC is AMQ19888.0.FDC, where 19888 is the PID of amqzxma0 (the MQ execution controller) and the agent process ID is reported as 260894 (amqzlaa0, the queue manager agent process).

I am not sure what could have caused the queue manager to hang, and this is the second time it has happened within a month.

Just 5 days ago I saw a similar entry in /var/mqm/errors/AMQERR01.LOG, but that time the queue manager did not hang and everything was fine.

Following is an extract from the AMQ19888.0.FDC file; please see whether you can help me in any way:
 
 
+-----------------------------------------------------------------------------+
|                                                                             |
| MQSeries First Failure Symptom Report                                       |
| =====================================                                       |
|                                                                             |
| Date/Time         :- Thursday December 12 03:38:48 TAIST 2002               |
| Host Name         :- hk17 (AIX 4.3)                                         |
| PIDS              :- 5765B73                                                |
| LVLS              :- 510                                                    |
| Product Long Name :- MQSeries for AIX                                       |
| Vendor            :- IBM                                                    |
| Probe Id          :- ZX005015                                               |
| Application Name  :- MQM                                                    |
| Component         :- zxcProcessChildren                                     |
| Build Date        :- Dec 20 2000                                            |
| UserID            :- 00000007 (mqm)                                         |
| Program Name      :- amqzxma0_nd                                            |
| Process           :- 00019888                                               |
| Thread            :- 00000001                                               |
| QueueManager      :- HK17                                                   |
| Major Errorcode   :- zrcX_AGENT_DEAD                                        |
| Minor Errorcode   :- OK                                                     |
| Probe Type        :- MSGAMQ5009                                             |
| Probe Severity    :- 2                                                      |
| Probe Description :- AMQ5009: MQSeries agent process 260894 has terminated  |
|   unexpectedly.                                                             |
| Arith1            :- 260894 3fb1e                                           |
|                                                                             |
+-----------------------------------------------------------------------------+
 
 
MQM Function Stack
 
zxcProcessChildren
 
xcsFFST
 
 
MQM Trace History
 
                     --> xstFreeChunk
 
                      --> xstInsertChunk
 
                      <-- xstInsertChunk rc=OK
 
                     <-- xstFreeChunk rc=OK
 
                     --> xcsReleaseMutexSem
 
                      --> xllSemRel
 
                      <-- xllSemRel rc=OK
 
                     <-- xcsReleaseMutexSem rc=OK
 
                    <-- xstFreeMemBlock rc=OK
 
                   <-- xcsFreeMemBlock rc=OK
 
                  <-- zxcDeleteAgentConnData rc=OK
 
                  --> xcsReleaseMutexSem
 
                   --> xllSemRel
 
                   <-- xllSemRel rc=OK
 
                  <-- xcsReleaseMutexSem rc=OK
 
                  --> xcsFreeMem
 
                  <-- xcsFreeMem rc=OK
 
                  --> xcsRequestMutexSem
 
                   --> xllSemGetVal
 
                   <-- xllSemGetVal rc=OK
 
                   --> xllSemReq
 
                   <-- xllSemReq rc=OK
 
                  <-- xcsRequestMutexSem rc=OK
 
                  --> xcsCheckProcess
 
                  <-- xcsCheckProcess rc=OK
 
                  --> zxcDeleteAgentConnData
 
                   --> zcpDeleteClientLink
 
                    --> xcsRequestMutexSem
 
                     --> xllSemGetVal
 
                     <-- xllSemGetVal rc=OK
 
                     --> xllSemReq
 
                     <-- xllSemReq rc=OK
 
                    <-- xcsRequestMutexSem rc=OK
 
                    --> xcsReleaseMutexSem
 
                     --> xllSemRel
 
                     <-- xllSemRel rc=OK
 
                    <-- xcsReleaseMutexSem rc=OK
 
                    --> zcqDeleteLink
 
                     --> xcsFreeMemBlock
 
                      --> xstFreeMemBlock
 
                       --> xcsRequestMutexSem
 
                        --> xllSemGetVal
 
                        <-- xllSemGetVal rc=OK
 
                        --> xllSemReq
 
                        <-- xllSemReq rc=OK
 
                       <-- xcsRequestMutexSem rc=OK
 
                       --> xcsQueryMutexSem
 
                       <-- xcsQueryMutexSem rc=OK
 
                       --> xcsRequestMutexSem
 
                        --> xllSpinLockRequest
 
                        <-- xllSpinLockRequest rc=OK
 
                       <-- xcsRequestMutexSem rc=OK
 
                       --> xclDeleteMutexMem
 
                        --> xllCSCloseMutex
 
                         --> xllSpinLockRequest
 
                         <-- xllSpinLockRequest rc=OK
 
                         --> xllSpinLockRelease
 
                         <-- xllSpinLockRelease rc=OK
 
                         --> xcsFreeQuickCell
 
                          --> xllSpinLockRequest
 
                          <-- xllSpinLockRequest rc=OK
 
                          --> xstFreeCell
 
                          <-- xstFreeCell rc=OK
 
                          --> xllSpinLockRelease
 
                          <-- xllSpinLockRelease rc=OK
 
                         <-- xcsFreeQuickCell rc=OK
 
                        <-- xllCSCloseMutex rc=OK
 
                       <-- xclDeleteMutexMem rc=OK
 
                       --> xstFreeChunk
 
                        --> xstDeleteChunk
 
                        <-- xstDeleteChunk rc=OK
 
                        --> xstInsertChunk
 
                        <-- xstInsertChunk rc=OK
 
                       <-- xstFreeChunk rc=OK
 
                       --> xcsReleaseMutexSem
 
                        --> xllSemRel
 
                        <-- xllSemRel rc=OK
 
                       <-- xcsReleaseMutexSem rc=OK
 
                      <-- xstFreeMemBlock rc=OK
 
                     <-- xcsFreeMemBlock rc=OK
 
                     --> xcsCloseEventSem
 
                      --> xllSpinLockRequest
 
                      <-- xllSpinLockRequest rc=OK
 
                      --> xllSpinLockRelease
 
                      <-- xllSpinLockRelease rc=OK
 
                      --> xllFreeSem
 
                       --> xllSemReq
 
                       <-- xllSemReq rc=OK
 
                       --> xllSemRel
 
                       <-- xllSemRel rc=OK
 
                      <-- xllFreeSem rc=OK
 
                      --> xcsFreeQuickCell
 
                       --> xllSpinLockRequest
 
                       <-- xllSpinLockRequest rc=OK
 
                       --> xstFreeCell
 
                       <-- xstFreeCell rc=OK
 
                       --> xllSpinLockRelease
 
                       <-- xllSpinLockRelease rc=OK
 
                      <-- xcsFreeQuickCell rc=OK
 
                     <-- xcsCloseEventSem rc=OK
 
                     --> xcsFreeMemBlock
 
                      --> xstFreeMemBlock
 
                       --> xcsRequestMutexSem
 
                        --> xllSemGetVal
 
                        <-- xllSemGetVal rc=OK
 
                        --> xllSemReq
 
                        <-- xllSemReq rc=OK
 
                       <-- xcsRequestMutexSem rc=OK
 
                       --> xstFreeChunk
 
                        --> xstDeleteChunk
 
                        <-- xstDeleteChunk rc=OK
 
                        --> xstInsertChunk
 
                        <-- xstInsertChunk rc=OK
 
                       <-- xstFreeChunk rc=OK
 
                       --> xcsReleaseMutexSem
 
                        --> xllSemRel
 
                        <-- xllSemRel rc=OK
 
                       <-- xcsReleaseMutexSem rc=OK
 
                      <-- xstFreeMemBlock rc=OK
 
                     <-- xcsFreeMemBlock rc=OK
 
                    <-- zcqDeleteLink rc=OK
 
                   <-- zcpDeleteClientLink rc=OK
 
                   --> xcsFreeMemBlock
 
                    --> xstFreeMemBlock
 
                     --> xcsRequestMutexSem
 
                      --> xllSemGetVal
 
                      <-- xllSemGetVal rc=OK
 
                      --> xllSemReq
 
                      <-- xllSemReq rc=OK
 
                     <-- xcsRequestMutexSem rc=OK
 
                     --> xstFreeChunk
 
                      --> xstDeleteChunk
 
                      <-- xstDeleteChunk rc=OK
 
                      --> xstInsertChunk
 
                      <-- xstInsertChunk rc=OK
 
                     <-- xstFreeChunk rc=OK
 
                     --> xcsReleaseMutexSem
 
                      --> xllSemRel
 
                      <-- xllSemRel rc=OK
 
                     <-- xcsReleaseMutexSem rc=OK
 
                    <-- xstFreeMemBlock rc=OK
 
                   <-- xcsFreeMemBlock rc=OK
 
                  <-- zxcDeleteAgentConnData rc=OK
 
                  --> xcsReleaseMutexSem
 
                   --> xllSemRel
 
                   <-- xllSemRel rc=OK
 
                  <-- xcsReleaseMutexSem rc=OK
 
                  --> xcsFreeMem
 
                  <-- xcsFreeMem rc=OK
 
                  --> xcsRequestMutexSem
 
                   --> xllSemGetVal
 
                   <-- xllSemGetVal rc=OK
 
                   --> xllSemReq
 
                   <-- xllSemReq rc=OK
 
                  <-- xcsRequestMutexSem rc=OK
 
                  --> xcsCheckProcess
 
                  <-- xcsCheckProcess rc=OK
 
                  --> xcsReleaseMutexSem
 
                   --> xllSemRel
 
                   <-- xllSemRel rc=OK
 
                  <-- xcsReleaseMutexSem rc=OK
 
                  --> xcsRequestMutexSem
 
                   --> xllSemGetVal
 
                   <-- xllSemGetVal rc=OK
 
                   --> xllSemReq
 
                   <-- xllSemReq rc=OK
 
                  <-- xcsRequestMutexSem rc=OK
 
                  --> xcsCheckProcess
 
                  <-- xcsCheckProcess rc=OK
 
                  --> xcsReleaseMutexSem
 
                   --> xllSemRel
 
                   <-- xllSemRel rc=OK
 
                  <-- xcsReleaseMutexSem rc=OK
 
                  --> xcsRequestMutexSem
 
                   --> xllSemGetVal
 
                   <-- xllSemGetVal rc=OK
 
                   --> xllSemReq
 
                   <-- xllSemReq rc=OK
 
                  <-- xcsRequestMutexSem rc=OK
 
                  --> xcsCheckProcess
 
                  <-- xcsCheckProcess rc=OK
 
                  --> xcsReleaseMutexSem
 
                   --> xllSemRel
 
                   <-- xllSemRel rc=OK
 
                  <-- xcsReleaseMutexSem rc=OK
 
                  --> xcsRequestMutexSem
 
                   --> xllSemGetVal
 
                   <-- xllSemGetVal rc=OK
 
                   --> xllSemReq
 
                   <-- xllSemReq rc=OK
 
                  <-- xcsRequestMutexSem rc=OK
 
                  --> xcsCheckProcess
 
                  <-- xcsCheckProcess rc=xecP_E_INVALID_PID
 
                  --> xcsBuildDumpPtr
 
                   --> xcsGetMem
 
                   <-- xcsGetMem rc=OK
 
                  <-- xcsBuildDumpPtr rc=OK
 
                  --> xcsBuildDumpPtr
 
                  <-- xcsBuildDumpPtr rc=OK
 
                  --> xcsFFST
 
 
ECAnchor
 
3001fe20                                 5A584541                ZXEA
 
3001fe30   3000004C  300A18B4  300A18B4  00000007    0..L0..´0..´....
 
3001fe40   00000000  2006DE08  80000050  80000050    .... .Þ....P...P
 
3001fe50   30007090  00000000  30007090  3002161C    0.p.....0.p.0...
 
3001fe60   00004DB0  00000000  0002FDEF  000000AC    ..M°......ýï...¬
 
3001fe70   00000000  00000000  000000A7  00000000    ...........§....
 
3001fe80   00000000  0001DF20  00000000  484B3137    ......ß ....HK17
 
3001fe90   00636B00  63726964  5F6C6F63  6B00636F    .ck.crid_lock.co
 
3001fea0   6E666967  5F6C6F63  6B00696E  70757464    nfig_lock.inputd
 
3001feb0   645F6C6F  636B0073  75737065  2F766172    d_lock.suspe/var
 
3001fec0   2F6D716D  00000000  00000000  00000000    /mqm............
 
3001fed0   00000000  00000000  00000000  00000000    ................
 
3001fee0 to 300202a0 suppressed, lines same as above
 
300202b0   00000000  00000000  00000000  5A435048    ............ZCPH
 
300202c0   80007094  80007108  00000000  00000000    ..p...q.........
 
300202d0   00000000  5A435048  800FC5B0  800FC624    ....ZCPH..Å°..Æ$
 
300202e0   801007C4  30000FAC  300203D8  200C9578    ...Ä0..¬0..Ø ..x
 
300202f0   00000000  30000FF4  30020404  30032774    ....0..ô0...0.'t
 
30020300   00000000  00000000  30007090  00000000    ........0.p.....
 
30020310   00000000  00000000  00000000  00000001    ................
 
30020320   00000000  30020430  3000103C  00000000    ....0..00..<....
 
30020330   00000000  00000000  00000005  00028CCE    ...............Î
 
30020340   00000000  0001603A  00000000  00025248    ......`:......RH
 
30020350   00000000  00041168  00000000  0003FB1E    .......h......û.
 
30020360   0000001E  00036F20  00000000  00030832    ......o .......2
 
30020370   00000000  000495E2  00000000  00043616    .......â......6.
 
30020380   00000000  0003FAA6  00000000  30007120    ......ú¦....0.q 
 
30020390   00000003  000052AC  00000000  00000000    ......R¬........
 
300203a0   00000000  00000001  00000009  00000000    ................
 
 
 
AgentAnchor
 
300257b0             5A584142  00000000  3000004C        ZXAB....0..L
 
300257c0   00004DB0  0003FB1E  5A435048  80007094    ..M°..û.ZCPH..p.
 
300257d0   808530D4  80852F8C  00000037  00000003    ..0Ô../....7....
 
300257e0   00000002  30007090  3001F474  30007090    ....0.p.0.ôt0.p.
 
300257f0   3003F4C4  30007090  300257B4  30007090    0.ôÄ0.p.0.W´0.p.
 
30025800   3002A76C  30024A08  00000001  00000000    0.§l0.J.........
 
30025810   3083A6AC  00000000  00000000  00000001    0.¦¬............
 
30025820   00000000  00000000  00000000  00000000    ................
 
30025830   00000001                                  ....
 
 
 
I have also escalated this problem to IBM, but as you know, it's tough to push IBM to give us details quickly, so please let me know if you can help.
 
 
Thanks & Regards _________________ Ramnath Shiva
 
IBM Certified SOA Specialist
 
IBM Certified MQSeries Specialist
 
Standard Scope International Pvt Ltd , Chennai
		
| dgolding |
Posted: Thu Dec 12, 2002 7:31 am
Yatiri
Joined: 16 May 2001 Posts: 668 Location: Switzerland
			  
				Hi,
 
 
The FDC you posted is not similar to the ones mentioned in the first article. Yours appears to be a genuine MQ problem, in which case we have to leave it to IBM.   
 
 
One thing, you state you had to reboot the machine to restart the queue manager. Did you use:
 
 
ipcs|grep mqm and then use ipcrm
 
 
to remove all shared memory, semaphores and the like?
 
Rebooting seems to be a bit drastic!
 
 
There is a new (undocumented) command called amqiclen which does this, but I've never got it to work. If you have only one queue manager on the box, the manual ipcrm is safe.
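
The manual cleanup described above can be previewed before anything is removed. What follows is only a rough sketch, not an IBM-supplied tool: the function name mq_ipcrm_plan is made up for illustration, and it assumes the AIX/Solaris-style ipcs layout (type letter q, m or s in column 1, ID in column 2, owner in column 5) and a POSIX awk (on Solaris that may mean nawk or /usr/xpg4/bin/awk). It only prints ipcrm commands for objects owned by the given user; it does not run them.

```shell
# Hypothetical helper: print (do not run) one ipcrm command per IPC object
# owned by a given user. Reads `ipcs` output on stdin.
# Assumption: field 1 is the type letter (q, m, s), field 2 the ID,
# field 5 the owner, as in the AIX/Solaris ipcs layout.
mq_ipcrm_plan() {
    owner="${1:-mqm}"
    awk -v owner="$owner" '
        ($1 == "q" || $1 == "m" || $1 == "s") && $5 == owner {
            # The type letter doubles as the ipcrm flag: -q, -m or -s.
            printf "ipcrm -%s %s\n", $1, $2
        }'
}

# Example (review the list first):
#   ipcs | mq_ipcrm_plan mqm
```

If the printed list looks right, the queue manager is stopped, and there is only one queue manager on the box, piping the output to sh would perform the removal.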
 
 
HTH
		
| bower5932 |
Posted: Thu Dec 12, 2002 9:39 am
Jedi Knight
Joined: 27 Aug 2001 Posts: 3023 Location: Dallas, TX, USA
			  
				If you've escalated to IBM, I would expect them to get back to you.
 
 
In the meantime, this FDC is usually associated with resource problems.  dgolding's comment about cleaning up shared memory/semaphores is probably the way to go.
 
 
As an aside, if I read your FDC correctly, you are running MQSeries 5.1 (LVLS 510).  If this is the case, this product has been out of support for a while and I would guess that the first thing IBM would tell you to do is to upgrade.
		
| cvshiva |
Posted: Thu Dec 12, 2002 7:17 pm
Apprentice
Joined: 04 Mar 2002 Posts: 35 Location: Chennai
			  
Hi dgolding & bower,

Thanks for your postings. Yes, I understand that it was a genuine MQ problem.

I tried cleaning up the semaphores and shared memory using ipcrm, but still couldn't get the queue manager to respond to any commands and finally had to reboot.

I haven't heard of "amqiclen" before, so I should try it. Is there any help available on it? Please let me know more!

Bower, you are right, we are using MQSeries 5.1.

I know that MQSeries 5.1 has been out of support for about a year, and we plan to upgrade in the first quarter of next year.

You know that an upgrade can't be done easily in a busy production environment like this. We have to go through many procedures such as SIT, UAT and OAT before we can do this in production, and above all, this year's budget isn't sufficient to upgrade the 12 MQSeries servers we have. That is why we have to do it early next year and survive with whatever we have on hand. We somehow have to seek support from IBM, as we have been renewing the service contract with them.

I will wait for IBM to get back, but before that, if there is anything you can suggest as a preventive measure (like increasing the semaphore and shared memory settings for MQ), it will be of great help!

Also, let me know if there is any tool I can use to get some useful info from the FDC files.
 
 
 
Thanks & Regards, _________________ Ramnath Shiva
 
IBM Certified SOA Specialist
 
IBM Certified MQSeries Specialist
 
Standard Scope International Pvt Ltd , Chennai
		
| dgolding |
Posted: Thu Dec 12, 2002 11:38 pm
Yatiri
Joined: 16 May 2001 Posts: 668 Location: Switzerland
			  
				Hi Ramnath,
 
 
I'm afraid "amqiclen" is 5.2 and above only, another good reason to upgrade. It is undocumented and so probably unsupported.
 
 
As I'm not an AIX expert I would hesitate to recommend anything, but increasing the amount of shared memory available and the number of semaphores sounds useful.
 
 
...and as for analysis tools, I don't know of any. I wrote a homegrown one here to classify the FDCs by Probe ID, user ID and the like, but it never took off. It's not really portable either, as it ties in with our alarm reporting mechanism.
 
 
Your best  bet is a quick shell script that greps through the FDC files and does a simple report.
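
A quick report along those lines might look like the sketch below. It is a guess at what such a script could do, not the homegrown tool mentioned above: the function name fdc_summary is hypothetical, it assumes the FDC files sit in /var/mqm/errors with a .FDC suffix, and it relies on the "Probe Id" and "Program Name" header lines looking like the ones in the FFST reports in this thread. It counts reports per Probe Id and program, most frequent first.

```shell
# Hypothetical helper: count FFST reports per "Probe Id / Program Name" pair.
# Assumption: FDC files live in the given directory (default /var/mqm/errors)
# and each report header contains "| Probe Id ... :- value |" followed later
# by "| Program Name ... :- value |".
fdc_summary() {
    dir="${1:-/var/mqm/errors}"
    cat "$dir"/*.FDC 2>/dev/null |
    awk '
        # Return the text after ":- " without the trailing padding and "|".
        function value(line) {
            sub(/^[^:]*:- */, "", line)
            sub(/ *\|? *$/, "", line)
            return line
        }
        /^\| Probe Id +:-/     { probe = value($0) }
        /^\| Program Name +:-/ { count[probe " " value($0)]++ }
        END { for (key in count) print count[key], key }
    ' | sort -rn
}

# Example:
#   fdc_summary /var/mqm/errors
```

Each output line is a count followed by the Probe Id and program name, which is usually enough to see whether one application or one probe accounts for most of the FDCs.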
 
 
Probably quicker in the long run to go to 5.2 or above!
 
 
HTH
 
 
regards
 
 
Don
		
| cvshiva |
Posted: Fri Dec 13, 2002 12:50 am
Apprentice
Joined: 04 Mar 2002 Posts: 35 Location: Chennai
			  
				Hi Don,
 
 
Thanks for the reply!

I tried searching for "amqiclen" and couldn't find it, so I guessed it must be a feature of later versions.

The shell script idea is fine and I will write one, but I was looking for a tool to make some sense of the binary characters in the FDC file.

We are waiting to hear from IBM before we decide whether to increase the semaphore and shared memory settings.

I am now trying to use this problem as a driving force for upgrading to MQSeries 5.3, so things should happen quickly!
 
 
Thanks  & Regards _________________ Ramnath Shiva
 
IBM Certified SOA Specialist
 
IBM Certified MQSeries Specialist
 
Standard Scope International Pvt Ltd , Chennai
		
| kkolla |
Posted: Tue Aug 05, 2003 10:18 am    Post subject: FDC FFST
Newbie
Joined: 05 Aug 2003 Posts: 1
			  
+-----------------------------------------------------------------------------+
|                                                                             |
| MQSeries First Failure Symptom Report                                       |
| =====================================                                       |
|                                                                             |
| Date/Time         :- Friday July 25 18:07:33 EDT 2003                       |
| Host Name         :- dpa201 (SunOS 5.8)                                     |
| PIDS              :- 5765B75                                                |
| LVLS              :- 520                                                    |
| Product Long Name :- MQSeries for Sun Solaris 2 (Sparc)                     |
| Vendor            :- IBM                                                    |
| Probe Id          :- HL142100                                               |
| Application Name  :- MQM                                                    |
| Component         :- mqloopenp                                              |
| Build Date        :- Nov  7 2000                                            |
| CMVC level        :- p000-L001106                                           |
| Build Type        :- IKAP - (Production)                                    |
| UserID            :- 00022044 (mqm)                                         |
| Program Name      :- amqhasmx                                               |
| Process           :- 00001395                                               |
| Thread            :- 00000001                                               |
| QueueManager      :- PDIFPM01                                               |
| Major Errorcode   :- Unknown(19)                                            |
| Minor Errorcode   :- OK                                                     |
| Probe Type        :- MSGAMQ0019                                             |
| Probe Severity    :- 4                                                      |
| Probe Description :- AMQ6090: MQSeries was unable to display an error       |
|   message 19.                                                               |
|                                                                             |
+-----------------------------------------------------------------------------+
 
 
 
What does the above error mean? Thanks in advance!
		
| bduncan |
Posted: Tue Aug 05, 2003 3:26 pm
Padawan
Joined: 11 Apr 2001 Posts: 1554 Location: Silicon Valley
			  
				kkolla,
 
You should probably post your question as a separate thread. Also, some context would be useful. What was your queue manager doing when this report was generated? _________________ Brandon Duncan
 
IBM Certified MQSeries Specialist
 
MQSeries.net forum moderator
		
| EddieA |
Posted: Tue Aug 05, 2003 3:42 pm
Jedi
Joined: 28 Jun 2001 Posts: 2453 Location: Los Angeles
			  
				This is an 'informational' error and can safely be ignored.
 
 
It's fixed with MQSeries V5.2 CSD02.
 
 
Cheers, _________________ Eddie Atherton
 
IBM Certified Solution Developer - WebSphere Message Broker V6.1
 
IBM Certified Solution Developer - WebSphere Message Broker V7.0
		
| cvshiva |
Posted: Tue Aug 05, 2003 8:00 pm
Apprentice
Joined: 04 Mar 2002 Posts: 35 Location: Chennai
			  
				Hi,
 
 
It looks like you are using MQSeries Version 5.1. I am not sure whether the latest patches are still available on the IBM website.

If your queue manager is running fine and you don't see any other errors close to the timestamp of this error, you can ignore it.

If you have any issues and suspect this to be the cause, search for the latest maintenance patches for MQSeries 5.1, apply them if you haven't already, and also try escalating the problem to IBM.

IBM will provide support only on a BEST EFFORT basis, meaning they will provide you with a remedy if they already have one; if not, they will surely ask you to upgrade to 5.2/5.3.
 
 
Regards _________________ Ramnath Shiva
 
IBM Certified SOA Specialist
 
IBM Certified MQSeries Specialist
 
Standard Scope International Pvt Ltd , Chennai