Analysing FDCs on Unix - it's not always MQ that's wrong
dgolding |
Posted: Fri Jul 05, 2002 1:41 am Post subject: Analysing FDCs on Unix - it's not always MQ that's wrong |
|
|
 Yatiri
Joined: 16 May 2001 Posts: 668 Location: Switzerland
|
Many thanks to Justin Fries for his permission to republish this (from Vienna MQ List April 2000):
Hi, all.
If you haven't seen an FDC with a probeid like XC130001 or XC130003,
you soon will. These FDCs are created on UNIX systems when an application
does something foolish like trying to access memory outside its address
space (a segmentation violation), or trying to access non word-aligned
memory (a bus violation). UNIX sends signals to misbehaved processes which
try to do these things--SIGSEGV and SIGBUS in the two examples I gave. If
the application is a queue manager process or a connected user application,
the MQSeries signal handler will write out one of these XC13000x FDCs to
help us debug the problem.
If an XC13000x FDC is created by a core queue manager
program--amqcrsta, amqzxma0, runmqsc, or any other program written entirely
by IBM--then the problem is definitely ours. In that case, we'll need to
examine the FDC and perhaps gather other information to diagnose the
problem. If an XC13000x FDC is created by a user application which is
merely connected to the queue manager, there is a possibility that the
failure is in the customer's code and not ours. For example, a user thread
may cause a segmentation violation by doing something foolish (access NULL
pointers, for example), and this will cause UNIX to send signal 11
(SIGSEGV) to the process. Because MQSeries has its own handler installed
for SIGSEGV, we print out an FDC for the problem. In this case, we're
really reporting someone else's error, but it almost looks as if we are the
culprits.
The MQSeries function which handles these synchronous signals is
called xehExceptionHandler, and it can create six different FDCs with
probes XC130001 through XC130006. Here is your guide to interpreting these
FDCs. To make it a bit easier to find your answer, blue suggests a problem
with our code is possible while red suggests a problem with user code:
XC130001:
When any thread does an MQCONN, a control block called pCtl is
allocated for it to hold information about the state of that thread.
This FDC is created when that pCtl control block does not exist. If a
thread has no pCtl control block, it is most likely not connected to
MQSeries, so the problem is likely to be in user code. If the FDC has
no trace in it, you can be almost certain the problem is in the
customer application.

XC130002:
There are times when MQSeries needs to do dangerous things, like
calling user exits. Because we can't really trust exits to be
perfectly written, we set up a recovery address where we can go if the
exit raises an exception. This FDC indicates that something--most
likely a user exit--raised an exception. MQSeries will jump to the
recovery address and continue to process, but this FDC suggests a
problem in the exit.

XC130003:
This is the most common probe our exception handler generates. It
indicates that a thread connected to MQSeries ran into unexpected
trouble. However, it is possible that the thread which created the
FDC might have been running customer code at the time. If the
function stack shows only xcsFFST, and if the last two entries in the
trace show a return from an MQI call followed by an entry into xcsFFST,
then the problem is again likely to be in the customer's code. If
the process is one of the core queue manager processes, or if the
function stack shows other MQSeries functions, then this FDC suggests
a problem within MQSeries.

XC130004:
As I said before, MQSeries does need to do dangerous things from time
to time. For example, MQPUT needs to make sure that the pointer to
the message data passed in by the customer application is valid. The
only way to do this is to try to access the pointer, even though that
may cause a segmentation violation. MQSeries handles this by setting
up a recovery address to which it can jump when a segmentation
violation occurs, then accessing the pointer. If the pointer is in
fact bad, then MQSeries cuts this FDC and recovers automatically.
This FDC usually means that the customer application passed in a bad
pointer, although there may be some cases where MQSeries itself is at
fault.

XC130005:
I have never seen this probe and never expect to. It means that this
thread is expecting a possible exception, but unlike the XC130002 or
XC130004 FDCs, there is no recovery area set up. MQSeries is very
careful about setting up recovery areas when needed, so this FDC
indicates a definite problem in our code.

XC130006:
This FDC also indicates that a thread connected to MQSeries ran into
unexpected trouble. However, the thread is marked as being unprepared
to handle exceptions, which means that it is probably not in MQSeries
code. As with XC130003, if the function stack shows only xcsFFST, the
problem is likely to be with customer code. If the function stack
includes other functions, this could be a more serious problem within
MQSeries.
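The six probe descriptions above can be condensed into a small lookup for triage scripts. This is only a sketch: the verdict strings are a paraphrase of the guide, and the names `XC13_PROBES` and `likely_culprit` are made up for illustration, not anything shipped with MQSeries.

```python
# Paraphrased summary of the six xehExceptionHandler probes described above.
# The verdicts are a reading of the guide, not an official IBM classification.
XC13_PROBES = {
    "XC130001": ("user code", "thread has no pCtl, so it is most likely not connected"),
    "XC130002": ("a user exit", "exception raised while a recovery address was set"),
    "XC130003": ("either side", "check the function stack and the last trace entries"),
    "XC130004": ("user code", "usually a bad pointer passed in; MQSeries recovers"),
    "XC130005": ("MQSeries", "exception expected but no recovery area was set up"),
    "XC130006": ("either side", "thread unprepared for exceptions; check the function stack"),
}


def likely_culprit(probe_id):
    """Return a one-line hint for an XC13000x probe id."""
    key = probe_id.strip().upper()
    entry = XC13_PROBES.get(key)
    if entry is None:
        return f"{key}: not an XC13000x probe, this guide does not apply"
    where, why = entry
    return f"{key}: likely {where} ({why})"


print(likely_culprit("XC130001"))
```

A hint of "either side" means the probe id alone is not enough and the function stack has to be read as well.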
Here's an example of a problem in customer code. This FDC was created
because a thread with no pCtl (a thread not currently connected to
MQSeries) raised an exception:
+-----------------------------------------------------------------------------+
| |
| MQSeries First Failure Symptom Report |
| ===================================== |
| |
| Date/Time :- Tuesday February 02 18:05:16 EST 1999 |
| Host Name :- calvin |
| PIDS :- 5765B73 |
| LVLS :- 500 |
| Product Long Name :- MQSeries for AIX |
| Vendor :- IBM |
| Probe Id :- XC130001 |
| Application Name :- MQM |
| Component :- xehExceptionHandler |
| Build Date :- Nov 23 1998 |
| UserID :- 00000010 (UNKNOWN) |
| Process :- 00100836 |
| Major Errorcode :- xecSTOP |
| Minor Errorcode :- OK |
| Probe Type :- HALT6109 |
| Probe Severity :- 1 |
| Probe Description :- AMQ6109: An internal MQSeries error has occurred. |
| |
+-----------------------------------------------------------------------------+
Here's an FDC which generally indicates a problem in our code. But if
you look at the function stack and read my description earlier, you can in
fact see that the problem is most likely with the customer's code. Even
better, the Arith1 field in this FDC prints out the particular signal which
was raised for this exception. In this case (and this is the most common
case), the signal was number 11, or SIGSEGV. This means that the problem
was a segmentation violation (the thread trying to access memory which is
outside its address space):
+-----------------------------------------------------------------------------+
| |
| MQSeries First Failure Symptom Report |
| ===================================== |
| |
| Date/Time :- Tuesday February 02 16:58:44 EST 1999 |
| Host Name :- calvin |
| PIDS :- 5765B73 |
| LVLS :- 500 |
| Product Long Name :- MQSeries for AIX |
| Vendor :- IBM |
| Probe Id :- XC130003 |
| Application Name :- MQM |
| Component :- xehExceptionHandler |
| Build Date :- Nov 23 1998 |
| UserID :- 00000010 (hobbes) |
| Process :- 00109802 |
| Thread :- 00000001 |
| QueueManager :- SUSIE |
| Major Errorcode :- xecSTOP |
| Minor Errorcode :- OK |
| Probe Type :- HALT6109 |
| Probe Severity :- 1 |
| Probe Description :- AMQ6109: An internal MQSeries error has occurred. |
| Arith1 :- 11 b |
| |
+-----------------------------------------------------------------------------+
MQM Function Stack
xcsFFST
MQM Trace History
[Lots of stuff removed here]
<-- zcpDeletePacket rc=OK
<-- ziiMQPUT rc=OK
<-- zstMQPUT rc=OK
<-- MQPUT rc=OK
--> xcsFFST
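The two checks described for XC130003 (a function stack holding only xcsFFST, and a trace that ends with a return from an MQI call followed by an entry into xcsFFST) are mechanical enough to script. The sketch below is a rough illustration in Python. The function names are invented, the rule that an MQI call is a name starting with `MQ` is an assumption, and the signal lookup uses the numbering of whatever machine runs the script, which can differ from the machine that wrote the FDC for signals other than the classic ones.

```python
import re
import signal


def stack_is_only_ffst(function_stack):
    """True when the MQM Function Stack lists xcsFFST and nothing else."""
    names = [line.strip() for line in function_stack if line.strip()]
    return names == ["xcsFFST"]


def trace_ends_after_mqi_return(trace_history):
    """True when the last two trace entries are a return from an MQI call
    (for example '<-- MQPUT rc=OK') followed by an entry into xcsFFST."""
    entries = [line.strip() for line in trace_history if line.strip()]
    if len(entries) < 2:
        return False
    # Assumption: MQI entry points are the names that start with "MQ".
    returned_from_mqi = re.match(r"<--\s+MQ[A-Z0-9]+\b", entries[-2])
    entered_ffst = re.match(r"-->\s+xcsFFST\b", entries[-1])
    return bool(returned_from_mqi and entered_ffst)


def signal_name(arith1):
    """Translate the Arith1 signal number using the local platform's table."""
    try:
        return signal.Signals(arith1).name
    except ValueError:
        return f"signal {arith1}"


# Values copied from the XC130003 example above.
stack = ["xcsFFST"]
trace = ["<-- ziiMQPUT rc=OK", "<-- zstMQPUT rc=OK", "<-- MQPUT rc=OK", "--> xcsFFST"]
if stack_is_only_ffst(stack) and trace_ends_after_mqi_return(trace):
    print("Exception raised after MQPUT returned to the application:", signal_name(11))
```

Both conditions hold for the example, which is why that FDC points at the customer's code rather than at MQSeries.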
Please realise that these exception handler FDCs are and ever will be
common. They are created whenever an exception is raised in any process
connected to MQSeries, so there is no fix for XC130001, and CSD04 does not
fix XC130003. It is possible for an APAR or a CSD to correct a specific
problem which can cause an XC130003, but to say that these FDCs are
completely fixed by some patch or efix is incorrect.
In order to solve these problems, I suggest exporting
AMQ_ABORT_ON_EXCEPTION=TRUE. If this variable is set in the environment in
which the application is running, the MQSeries exception handler will abort
the process after cutting the FDC. This will produce a core file which we
can investigate to determine conclusively who is at fault and what needs to
be done to solve the problem. If the failing program is an MQSeries
program we can examine the core file, but if it is a customer program they
will need to use dbx, gdb, or their favourite debugger to dump the stacks
for each thread in the process. We can examine cores from customer
applications only when we have their compiled application as well as all
libraries on which it depends. If their application is linked with
MQSeries, Tuxedo, Oracle, CORBA, and half a dozen other things, forget it.
If it's a straight MQSeries application we may be able to work on it
provided that we have the binary program, the core file, and a system at
the same OS and maintenance level. If not, things get very tricky very
quickly.
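For anyone who starts applications from a wrapper rather than an interactive shell, the same setting can be applied from Python. This is a sketch only: `./myapp` is a hypothetical program name, and raising the core-file limit in the child is an added assumption (an abort leaves no core file if that limit is zero), not something the text above prescribes.

```python
import os
import resource
import subprocess


def abort_on_exception_env(base=None):
    """Copy an environment and set the variable named in the text above."""
    env = dict(os.environ if base is None else base)
    env["AMQ_ABORT_ON_EXCEPTION"] = "TRUE"
    return env


def _allow_core_files():
    # Runs in the child just before exec: raise the soft core-size limit to
    # the hard limit so that an abort can actually leave a core file behind.
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))


def run_with_core_on_fdc(command):
    """Run `command`, e.g. ["./myapp"] (hypothetical), and return its exit status."""
    completed = subprocess.run(
        command,
        env=abort_on_exception_env(),
        preexec_fn=_allow_core_files,
    )
    return completed.returncode
```

A negative return code from `run_with_core_on_fdc` means the process was killed by a signal, which is what an abort after the FDC would look like.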
I hope this helps.
Cheers,
Justin T. Fries
MQSeries Support
Raleigh, North Carolina
(919) 254-1422 TL 444-1422
Email: justinf@us.ibm.com |
kavithadhevi |
Posted: Tue Jul 09, 2002 5:15 am Post subject: |
|
|
 Master
Joined: 14 May 2002 Posts: 201 Location: USA
|
Hi,
Many Thanks to Justin Fries for an excellent piece of information for all MQSeries Specialists.
This information is really very helpful for MQSeries support people and problem solvers like me. It is an excellent source for analysing FDC files, and thanks again for posting it.
-- Kavitha  |
cvshiva |
Posted: Thu Dec 12, 2002 12:43 am Post subject: FDC Files Generated for MQ Agent Process Terminated |
|
|
 Apprentice
Joined: 04 Mar 2002 Posts: 35 Location: Chennai
|
We encountered the following problem in our production environment, where the queue manager hung. None of the control commands or runmqsc commands responded, and we couldn't end the queue manager either. Ultimately we had to reboot the server. Our server is an AIX box used by many applications which use MQ as a messaging layer. It takes a lot of trouble to convince people before we can reboot such a big application box, as there is a lot of impact on production data, payment cutoffs and other things which are part and parcel of any banking system.
Coming to the problem, this is what I could see in the /var/mqm/errors/AMQERR01.LOG file:
-------------------------------------------------------------------------------
12/12/02 03:38:48
AMQ5009: MQSeries agent process 260894 has terminated unexpectedly.
EXPLANATION:
MQSeries has detected that an agent process has terminated unexpectedly. The
queue manager connection(s) that this process is responsible for will be
broken.
ACTION:
Use any previous FFSTs to determine the reason for the failure. Try to
eliminate the following reasons before contacting your IBM support center.
1) A user has inadvertently terminated the process.
2) The system is low on resources. Some operating systems terminate processes
to free resources. If your system is low on resources, it is possible that
the operating system has terminated the process so that a new process can be
created.
-------------------------------------------------------------------------------
The related FDC is AMQ19888.0.FDC, where 19888 is the PID of amqzxma0 (the MQ execution controller) and the agent process ID was supposed to be 260894 (amqzlaa0 -> qmgr agent process).
I am not sure what could have caused the queue manager to hang; this is the second time it has happened within a month.
Just 5 days back I could see a similar entry in /var/mqm/errors/AMQERR01.LOG, but the queue manager did not hang and everything was fine.
Following is an extract from the AMQ19888.0.FDC file; please see whether you can help me in any way:
+-----------------------------------------------------------------------------+
| |
| MQSeries First Failure Symptom Report |
| ===================================== |
| |
| Date/Time :- Thursday December 12 03:38:48 TAIST 2002 |
| Host Name :- hk17 (AIX 4.3) |
| PIDS :- 5765B73 |
| LVLS :- 510 |
| Product Long Name :- MQSeries for AIX |
| Vendor :- IBM |
| Probe Id :- ZX005015 |
| Application Name :- MQM |
| Component :- zxcProcessChildren |
| Build Date :- Dec 20 2000 |
| UserID :- 00000007 (mqm) |
| Program Name :- amqzxma0_nd |
| Process :- 00019888 |
| Thread :- 00000001 |
| QueueManager :- HK17 |
| Major Errorcode :- zrcX_AGENT_DEAD |
| Minor Errorcode :- OK |
| Probe Type :- MSGAMQ5009 |
| Probe Severity :- 2 |
| Probe Description :- AMQ5009: MQSeries agent process 260894 has terminated |
| unexpectedly. |
| Arith1 :- 260894 3fb1e |
| |
+-----------------------------------------------------------------------------+
MQM Function Stack
zxcProcessChildren
xcsFFST
MQM Trace History
--> xstFreeChunk
--> xstInsertChunk
<-- xstInsertChunk rc=OK
<-- xstFreeChunk rc=OK
--> xcsReleaseMutexSem
--> xllSemRel
<-- xllSemRel rc=OK
<-- xcsReleaseMutexSem rc=OK
<-- xstFreeMemBlock rc=OK
<-- xcsFreeMemBlock rc=OK
<-- zxcDeleteAgentConnData rc=OK
--> xcsReleaseMutexSem
--> xllSemRel
<-- xllSemRel rc=OK
<-- xcsReleaseMutexSem rc=OK
--> xcsFreeMem
<-- xcsFreeMem rc=OK
--> xcsRequestMutexSem
--> xllSemGetVal
<-- xllSemGetVal rc=OK
--> xllSemReq
<-- xllSemReq rc=OK
<-- xcsRequestMutexSem rc=OK
--> xcsCheckProcess
<-- xcsCheckProcess rc=OK
--> zxcDeleteAgentConnData
--> zcpDeleteClientLink
--> xcsRequestMutexSem
--> xllSemGetVal
<-- xllSemGetVal rc=OK
--> xllSemReq
<-- xllSemReq rc=OK
<-- xcsRequestMutexSem rc=OK
--> xcsReleaseMutexSem
--> xllSemRel
<-- xllSemRel rc=OK
<-- xcsReleaseMutexSem rc=OK
--> zcqDeleteLink
--> xcsFreeMemBlock
--> xstFreeMemBlock
--> xcsRequestMutexSem
--> xllSemGetVal
<-- xllSemGetVal rc=OK
--> xllSemReq
<-- xllSemReq rc=OK
<-- xcsRequestMutexSem rc=OK
--> xcsQueryMutexSem
<-- xcsQueryMutexSem rc=OK
--> xcsRequestMutexSem
--> xllSpinLockRequest
<-- xllSpinLockRequest rc=OK
<-- xcsRequestMutexSem rc=OK
--> xclDeleteMutexMem
--> xllCSCloseMutex
--> xllSpinLockRequest
<-- xllSpinLockRequest rc=OK
--> xllSpinLockRelease
<-- xllSpinLockRelease rc=OK
--> xcsFreeQuickCell
--> xllSpinLockRequest
<-- xllSpinLockRequest rc=OK
--> xstFreeCell
<-- xstFreeCell rc=OK
--> xllSpinLockRelease
<-- xllSpinLockRelease rc=OK
<-- xcsFreeQuickCell rc=OK
<-- xllCSCloseMutex rc=OK
<-- xclDeleteMutexMem rc=OK
--> xstFreeChunk
--> xstDeleteChunk
<-- xstDeleteChunk rc=OK
--> xstInsertChunk
<-- xstInsertChunk rc=OK
<-- xstFreeChunk rc=OK
--> xcsReleaseMutexSem
--> xllSemRel
<-- xllSemRel rc=OK
<-- xcsReleaseMutexSem rc=OK
<-- xstFreeMemBlock rc=OK
<-- xcsFreeMemBlock rc=OK
--> xcsCloseEventSem
--> xllSpinLockRequest
<-- xllSpinLockRequest rc=OK
--> xllSpinLockRelease
<-- xllSpinLockRelease rc=OK
--> xllFreeSem
--> xllSemReq
<-- xllSemReq rc=OK
--> xllSemRel
<-- xllSemRel rc=OK
<-- xllFreeSem rc=OK
--> xcsFreeQuickCell
--> xllSpinLockRequest
<-- xllSpinLockRequest rc=OK
--> xstFreeCell
<-- xstFreeCell rc=OK
--> xllSpinLockRelease
<-- xllSpinLockRelease rc=OK
<-- xcsFreeQuickCell rc=OK
<-- xcsCloseEventSem rc=OK
--> xcsFreeMemBlock
--> xstFreeMemBlock
--> xcsRequestMutexSem
--> xllSemGetVal
<-- xllSemGetVal rc=OK
--> xllSemReq
<-- xllSemReq rc=OK
<-- xcsRequestMutexSem rc=OK
--> xstFreeChunk
--> xstDeleteChunk
<-- xstDeleteChunk rc=OK
--> xstInsertChunk
<-- xstInsertChunk rc=OK
<-- xstFreeChunk rc=OK
--> xcsReleaseMutexSem
--> xllSemRel
<-- xllSemRel rc=OK
<-- xcsReleaseMutexSem rc=OK
<-- xstFreeMemBlock rc=OK
<-- xcsFreeMemBlock rc=OK
<-- zcqDeleteLink rc=OK
<-- zcpDeleteClientLink rc=OK
--> xcsFreeMemBlock
--> xstFreeMemBlock
--> xcsRequestMutexSem
--> xllSemGetVal
<-- xllSemGetVal rc=OK
--> xllSemReq
<-- xllSemReq rc=OK
<-- xcsRequestMutexSem rc=OK
--> xstFreeChunk
--> xstDeleteChunk
<-- xstDeleteChunk rc=OK
--> xstInsertChunk
<-- xstInsertChunk rc=OK
<-- xstFreeChunk rc=OK
--> xcsReleaseMutexSem
--> xllSemRel
<-- xllSemRel rc=OK
<-- xcsReleaseMutexSem rc=OK
<-- xstFreeMemBlock rc=OK
<-- xcsFreeMemBlock rc=OK
<-- zxcDeleteAgentConnData rc=OK
--> xcsReleaseMutexSem
--> xllSemRel
<-- xllSemRel rc=OK
<-- xcsReleaseMutexSem rc=OK
--> xcsFreeMem
<-- xcsFreeMem rc=OK
--> xcsRequestMutexSem
--> xllSemGetVal
<-- xllSemGetVal rc=OK
--> xllSemReq
<-- xllSemReq rc=OK
<-- xcsRequestMutexSem rc=OK
--> xcsCheckProcess
<-- xcsCheckProcess rc=OK
--> xcsReleaseMutexSem
--> xllSemRel
<-- xllSemRel rc=OK
<-- xcsReleaseMutexSem rc=OK
--> xcsRequestMutexSem
--> xllSemGetVal
<-- xllSemGetVal rc=OK
--> xllSemReq
<-- xllSemReq rc=OK
<-- xcsRequestMutexSem rc=OK
--> xcsCheckProcess
<-- xcsCheckProcess rc=OK
--> xcsReleaseMutexSem
--> xllSemRel
<-- xllSemRel rc=OK
<-- xcsReleaseMutexSem rc=OK
--> xcsRequestMutexSem
--> xllSemGetVal
<-- xllSemGetVal rc=OK
--> xllSemReq
<-- xllSemReq rc=OK
<-- xcsRequestMutexSem rc=OK
--> xcsCheckProcess
<-- xcsCheckProcess rc=OK
--> xcsReleaseMutexSem
--> xllSemRel
<-- xllSemRel rc=OK
<-- xcsReleaseMutexSem rc=OK
--> xcsRequestMutexSem
--> xllSemGetVal
<-- xllSemGetVal rc=OK
--> xllSemReq
<-- xllSemReq rc=OK
<-- xcsRequestMutexSem rc=OK
--> xcsCheckProcess
<-- xcsCheckProcess rc=xecP_E_INVALID_PID
--> xcsBuildDumpPtr
--> xcsGetMem
<-- xcsGetMem rc=OK
<-- xcsBuildDumpPtr rc=OK
--> xcsBuildDumpPtr
<-- xcsBuildDumpPtr rc=OK
--> xcsFFST
ECAnchor
3001fe20 5A584541 ZXEA
3001fe30 3000004C 300A18B4 300A18B4 00000007 0..L0..´0..´....
3001fe40 00000000 2006DE08 80000050 80000050 .... .Þ....P...P
3001fe50 30007090 00000000 30007090 3002161C 0.p.....0.p.0...
3001fe60 00004DB0 00000000 0002FDEF 000000AC ..M°......ýï...¬
3001fe70 00000000 00000000 000000A7 00000000 ...........§....
3001fe80 00000000 0001DF20 00000000 484B3137 ......ß ....HK17
3001fe90 00636B00 63726964 5F6C6F63 6B00636F .ck.crid_lock.co
3001fea0 6E666967 5F6C6F63 6B00696E 70757464 nfig_lock.inputd
3001feb0 645F6C6F 636B0073 75737065 2F766172 d_lock.suspe/var
3001fec0 2F6D716D 00000000 00000000 00000000 /mqm............
3001fed0 00000000 00000000 00000000 00000000 ................
3001fee0 to 300202a0 suppressed, lines same as above
300202b0 00000000 00000000 00000000 5A435048 ............ZCPH
300202c0 80007094 80007108 00000000 00000000 ..p...q.........
300202d0 00000000 5A435048 800FC5B0 800FC624 ....ZCPH..Å°..Æ$
300202e0 801007C4 30000FAC 300203D8 200C9578 ...Ä0..¬0..Ø ..x
300202f0 00000000 30000FF4 30020404 30032774 ....0..ô0...0.'t
30020300 00000000 00000000 30007090 00000000 ........0.p.....
30020310 00000000 00000000 00000000 00000001 ................
30020320 00000000 30020430 3000103C 00000000 ....0..00..<....
30020330 00000000 00000000 00000005 00028CCE ...............Î
30020340 00000000 0001603A 00000000 00025248 ......`:......RH
30020350 00000000 00041168 00000000 0003FB1E .......h......û.
30020360 0000001E 00036F20 00000000 00030832 ......o .......2
30020370 00000000 000495E2 00000000 00043616 .......â......6.
30020380 00000000 0003FAA6 00000000 30007120 ......ú¦....0.q
30020390 00000003 000052AC 00000000 00000000 ......R¬........
300203a0 00000000 00000001 00000009 00000000 ................
AgentAnchor
300257b0 5A584142 00000000 3000004C ZXAB....0..L
300257c0 00004DB0 0003FB1E 5A435048 80007094 ..M°..û.ZCPH..p.
300257d0 808530D4 80852F8C 00000037 00000003 ..0Ô../....7....
300257e0 00000002 30007090 3001F474 30007090 ....0.p.0.ôt0.p.
300257f0 3003F4C4 30007090 300257B4 30007090 0.ôÄ0.p.0.W´0.p.
30025800 3002A76C 30024A08 00000001 00000000 0.§l0.J.........
30025810 3083A6AC 00000000 00000000 00000001 0.¦¬............
30025820 00000000 00000000 00000000 00000000 ................
30025830 00000001 ....
I have also escalated this problem to IBM, but you know it's tough to push IBM to give us the details faster, so do let me know if you can help.
Thanks & Regards
_________________
Ramnath Shiva
IBM Certified SOA Specialist
IBM Certified MQSeries Specialist
Standard Scope International Pvt Ltd , Chennai |
dgolding |
Posted: Thu Dec 12, 2002 7:31 am Post subject: |
|
|
 Yatiri
Joined: 16 May 2001 Posts: 668 Location: Switzerland
|
Hi,
The FDC you posted is not similar to the ones mentioned in the first article. Yours appears to be a genuine MQ problem, in which case we have to leave it to IBM.
One thing, you state you had to reboot the machine to restart the queue manager. Did you use:
ipcs|grep mqm and then use ipcrm
to remove all shared memory, semaphores and the like?
Rebooting seems to be a bit drastic!
There is a new (undocumented) command called amqiclen which does this, but I've never got it to work. If you have only one queue manager on the box, the manual ipcrm is safe.
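As a rough illustration of the manual cleanup, the sketch below turns `ipcs` output into a list of `ipcrm` commands for everything owned by mqm, without running any of them. It assumes the System V column layout (type letter, ID, key, mode, owner, group) and the sample output is invented; `ipcs` on other platforms may print different columns, so check the output on your own box first. The caveat above still applies: this is only safe when there is a single queue manager on the machine.

```python
def mqm_ipc_ids(ipcs_output, owner="mqm"):
    """Return (type, id) pairs for IPC objects owned by `owner`.

    Assumes each data line looks like: T ID KEY MODE OWNER GROUP,
    where T is q (message queue), m (shared memory) or s (semaphore).
    """
    found = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        if (len(fields) >= 5 and fields[0] in ("q", "m", "s")
                and fields[1].isdigit() and fields[4] == owner):
            found.append((fields[0], fields[1]))
    return found


def ipcrm_commands(ipcs_output, owner="mqm"):
    """Build the ipcrm command lines; nothing is executed here."""
    return [f"ipcrm -{kind} {ident}" for kind, ident in mqm_ipc_ids(ipcs_output, owner)]


# Invented sample in the assumed layout, for illustration only.
SAMPLE_IPCS = """\
T        ID     KEY        MODE       OWNER    GROUP
Message Queues:
q         0 0x4107001c -Rrw-rw----     root   printq
Shared Memory:
m   1048577 0x00011a2b --rw-rw----      mqm      mqm
Semaphores:
s    262148 0x00011a2c --ra-ra----      mqm      mqm
"""

for command in ipcrm_commands(SAMPLE_IPCS):
    print(command)
```

Printing the commands instead of running them leaves a chance to review the list before anything is removed.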
HTH |
bower5932 |
Posted: Thu Dec 12, 2002 9:39 am Post subject: |
|
|
 Jedi Knight
Joined: 27 Aug 2001 Posts: 3023 Location: Dallas, TX, USA
|
If you've escalated to IBM, I would expect them to get back to you.
In the meantime, this FDC is usually associated with resource problems. dgolding's comment about cleaning up shared memory/semaphores is probably the way to go.
As an aside, if I read your FDC correctly, you are running MQSeries 5.1 (LVLS 510). If this is the case, this product has been out of support for a while and I would guess that the first thing IBM would tell you to do is to upgrade. |
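Reading the level off an FDC header can be scripted in a couple of lines. The sketch assumes, based only on the 510 = 5.1 reading above, that LVLS is three decimal digits for version, release and modification; `lvls_to_version` is an invented helper name.

```python
def lvls_to_version(lvls):
    """Split an LVLS value such as '510' into (version, release, modification).

    Assumes exactly three decimal digits; anything else is rejected.
    """
    digits = lvls.strip()
    if len(digits) != 3 or not digits.isdigit():
        raise ValueError(f"unexpected LVLS value: {lvls!r}")
    return tuple(int(d) for d in digits)


version, release, _modification = lvls_to_version("510")
print(f"MQSeries {version}.{release}")
```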
cvshiva |
Posted: Thu Dec 12, 2002 7:17 pm Post subject: |
|
|
 Apprentice
Joined: 04 Mar 2002 Posts: 35 Location: Chennai
|
Hi dgolding & bower ,
Thanks for your postings. Yes, I understand that it was a genuine MQ Problem.
I tried cleaning the semaphores and shared memory using ipcrm, but still couldn't get the queue manager to respond to any of the commands, and finally had to reboot.
I haven't heard of "amqiclen" before and should try it. Is there any help I could get on this? Please let me know more!
Bower, you are right, we are using MQSeries 5.1.
I know that MQSeries 5.1 has already been out of support for about a year, and we plan to upgrade in the first quarter of next year.
You know that an upgrade can't be done so easily in a busy production environment like this. We have to go through many procedures such as SIT, UAT and OAT before we can do this in production, and above all this year's budget isn't sufficient to upgrade the 12 MQSeries servers we have. This is why we have to do it early next year and survive with whatever we have on hand. We somehow have to seek IBM's support, as we have been renewing the service contract with them.
I will wait for IBM to get back, but before that, if there is anything you can suggest as a preventive measure (like increasing the semaphore and shared memory settings for MQ), it will be of great help!
Also, let me know if there is any tool I can use to get some useful information from the FDC files.
Thanks & Regards,
_________________
Ramnath Shiva
IBM Certified SOA Specialist
IBM Certified MQSeries Specialist
Standard Scope International Pvt Ltd , Chennai |
dgolding |
Posted: Thu Dec 12, 2002 11:38 pm Post subject: |
|
|
 Yatiri
Joined: 16 May 2001 Posts: 668 Location: Switzerland
|
Hi Ramnath,
I'm afraid "amqiclen" is 5.2 and above only, another good reason to upgrade. It is undocumented and so probably unsupported.
As I'm not an AIX expert I would hesitate to recommend anything - increasing the amount of shared memory available and the number of semaphores sounds useful.
...and as for analysis tools, I don't know of any. I wrote a homegrown one here to classify the FDCs by Probe ID, user ID and the like, but it never took off. It's not really portable either, as it ties in with our alarm reporting mechanism.
Your best bet is a quick shell script that greps through the FDC files and does a simple report.
Probably quicker in the long run to go to 5.2 or above!
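A minimal sketch of such a report, written in Python rather than shell: it reads the header box of each First Failure Symptom Report and counts FDCs by Probe Id, Program Name and UserID. The function names and the choice of grouping fields are assumptions for illustration, and the sample text is a cut-down version of headers quoted earlier in this thread.

```python
import re
from collections import Counter

# Matches header lines such as "| Probe Id :- XC130003 |".
FIELD = re.compile(r"^\|\s*(?P<name>[A-Za-z/ ]+?)\s*:-\s*(?P<value>.*?)\s*\|?\s*$")


def parse_fdc_headers(text):
    """Yield one dict of header fields per symptom report found in `text`."""
    current = None
    for line in text.splitlines():
        if "First Failure Symptom Report" in line:
            if current:
                yield current
            current = {}
            continue
        match = FIELD.match(line)
        if match and current is not None:
            current[match["name"]] = match["value"]
    if current:
        yield current


def probe_report(texts):
    """Count reports by (Probe Id, Program Name, UserID)."""
    counts = Counter()
    for text in texts:
        for fdc in parse_fdc_headers(text):
            key = (fdc.get("Probe Id", "?"), fdc.get("Program Name", "?"),
                   fdc.get("UserID", "?"))
            counts[key] += 1
    return counts


SAMPLE = """\
| MQSeries First Failure Symptom Report |
| Probe Id :- XC130003 |
| UserID :- 00000010 (hobbes) |
| MQSeries First Failure Symptom Report |
| Probe Id :- ZX005015 |
| UserID :- 00000007 (mqm) |
| Program Name :- amqzxma0_nd |
"""

for (probe, program, user), count in sorted(probe_report([SAMPLE]).items()):
    print(count, probe, program, user)
```

To run it against real files, pass the contents of the `*.FDC` files in /var/mqm/errors instead of `SAMPLE`. Fields missing from a header, such as Program Name in the older FDCs, show up as `?`.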
HTH
regards
Don |
cvshiva |
Posted: Fri Dec 13, 2002 12:50 am Post subject: |
|
|
 Apprentice
Joined: 04 Mar 2002 Posts: 35 Location: Chennai
|
Hi Don,
Thanks for the reply.. !!
I tried searching for "amqiclen" and couldn't find it, so I guessed it must be a feature of higher versions.
The shell script idea is fine and I will write it, but I was looking for a tool to make some meaning out of the binary characters in the FDC file.
We are waiting to hear from IBM before we decide to increase the semaphore and shared memory settings.
I am now trying to use this problem as a driving force for upgrading to MQSeries 5.3, so things will happen quickly!
Thanks & Regards
_________________
Ramnath Shiva
IBM Certified SOA Specialist
IBM Certified MQSeries Specialist
Standard Scope International Pvt Ltd , Chennai |
kkolla |
Posted: Tue Aug 05, 2003 10:18 am Post subject: FDC FFST |
|
|
Newbie
Joined: 05 Aug 2003 Posts: 1
|
+-----------------------------------------------------------------------------+
| |
| MQSeries First Failure Symptom Report |
| ===================================== |
| |
| Date/Time :- Friday July 25 18:07:33 EDT 2003 |
| Host Name :- dpa201 (SunOS 5.8) |
| PIDS :- 5765B75 |
| LVLS :- 520 |
| Product Long Name :- MQSeries for Sun Solaris 2 (Sparc) |
| Vendor :- IBM |
| Probe Id :- HL142100 |
| Application Name :- MQM |
| Component :- mqloopenp |
| Build Date :- Nov 7 2000 |
| CMVC level :- p000-L001106 |
| Build Type :- IKAP - (Production) |
| UserID :- 00022044 (mqm) |
| Program Name :- amqhasmx |
| Process :- 00001395 |
| Thread :- 00000001 |
| QueueManager :- PDIFPM01 |
| Major Errorcode :- Unknown(19) |
| Minor Errorcode :- OK |
| Probe Type :- MSGAMQ0019 |
| Probe Severity :- 4 |
| Probe Description :- AMQ6090: MQSeries was unable to display an error |
| message 19. |
| |
+-----------------------------------------------------------------------------+
What does the above error mean ?? Thanks in advance! |
bduncan |
Posted: Tue Aug 05, 2003 3:26 pm Post subject: |
|
|
Padawan
Joined: 11 Apr 2001 Posts: 1554 Location: Silicon Valley
|
kkolla,
You should probably post your question as a separate thread. Also, some context would be useful. What was your queue manager doing when this report was generated?
_________________
Brandon Duncan
IBM Certified MQSeries Specialist
MQSeries.net forum moderator |
EddieA |
Posted: Tue Aug 05, 2003 3:42 pm Post subject: |
|
|
 Jedi
Joined: 28 Jun 2001 Posts: 2453 Location: Los Angeles
|
This is an 'informational' error and can safely be ignored.
It's fixed in MQSeries V5.2 CSD02.
Cheers, _________________ Eddie Atherton
IBM Certified Solution Developer - WebSphere Message Broker V6.1
IBM Certified Solution Developer - WebSphere Message Broker V7.0 |
cvshiva |
Posted: Tue Aug 05, 2003 8:00 pm Post subject: |
|
|
 Apprentice
Joined: 04 Mar 2002 Posts: 35 Location: Chennai
|
Hi,
It looks like you are using MQSeries Version 5.1. I am not sure whether the latest patches are still available on the IBM website!
If your queue manager is running fine and you don't see any other errors close to the timestamp of this error, you can ignore this!
If you have any issues and suspect this to be the cause, search for the latest maintenance patches for MQSeries 5.1, apply them if you haven't already, and also try escalating the problem to IBM.
IBM will provide support only on a BEST EFFORT basis, meaning they will provide you with a remedy if they already have one; if not, they will surely ask you to upgrade to 5.2 / 5.3.
Regards
_________________
Ramnath Shiva
IBM Certified SOA Specialist
IBM Certified MQSeries Specialist
Standard Scope International Pvt Ltd , Chennai |