Author |
Message
|
jeevan |
Posted: Tue Nov 17, 2009 6:07 pm Post subject: AMQ6150 WebSphere MQ semaphore is busy. |
|
|
Grand Master
Joined: 12 Nov 2005 Posts: 1432
|
We had this problem on one of our gateway queue managers. According to my friend who handled the incident, some of the store registers were disconnected (a script raised the alert). When he tried to suspend the queue manager for maintenance, the commands were not processed. He then shut down the queue manager, cleaned up the semaphores, and restarted it. That resolved the problem.
My manager has asked me to come up with actions to avoid this problem in future.
The box is AIX with 6 GB of memory. What could have happened so that the cluster repository manager could not get access to a semaphore? Could a semaphore be overwritten by another process, so that the first process loses track of it and errors out?
Many think it is a memory problem because the error log has the semaphore busy error. I am not convinced, but I do not have an alternative explanation.
Any clue would be appreciated.
Last edited by jeevan on Tue Aug 10, 2010 12:35 pm; edited 2 times in total |
|
|
 |
fjb_saper |
Posted: Tue Nov 17, 2009 9:43 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
If it truly was a memory problem, there should also have been an FDC cut.
Check it out  _________________ MQ & Broker admin |
|
|
 |
jeevan |
Posted: Wed Nov 18, 2009 6:12 am Post subject: |
|
|
Grand Master
Joined: 12 Nov 2005 Posts: 1432
|
fjb_saper wrote: |
If it truly was a memory problem, there should also have been an FDC cut.
Check it out  |
There were FDC files generated.
+-----------------------------------------------------------------------------+
| |
| WebSphere MQ First Failure Symptom Report |
| ========================================= |
| |
| Date/Time :- Sunday November 15 23:29:22 EST 2009 |
| Host Name :- mqprd1 (AIX 5.3) |
| PIDS :- 5724H7201 |
| LVLS :- 6.0.2.7 |
| Product Long Name :- WebSphere MQ for AIX |
| Vendor :- IBM |
| Probe Id :- XC308010 |
| Application Name :- MQM |
| Component :- xlsReleaseMutex |
| SCCS Info :- lib/cs/unix/rs_aix32/amqxlfsx.c, 1.75.1.6 |
| Line Number :- 2060 |
| Build Date :- Jun 17 2009 |
| CMVC level :- p600-207-090617 |
| Build Type :- IKAP - (Production) |
| UserID :- 00002000 (mqm) |
| Program Name :- amqrrmfa |
| Addressing mode :- 64-bit |
| Process :- 467024 |
| Thread :- 1 |
| QueueManager :- MQPRD1_QM |
| ConnId(1) IPCC :- 13 |
| ConnId(2) QM :- 13 |
| Last HQC :- 1.0.0-454296 |
| Last HSHMEMB :- 1.2.2-3207696 |
| Major Errorcode :- xecL_W_LONG_LOCK_WAIT |
| Minor Errorcode :- OK |
| Probe Type :- MSGAMQ6150 |
| Probe Severity :- 3 |
| Probe Description :- AMQ6150: WebSphere MQ semaphore is busy. |
| FDCSequenceNumber :- 0 |
| |
+-----------------------------------------------------------------------------+
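The header fields above follow a regular `Key :- Value` layout, so they can be pulled out mechanically when there are many FDC files to compare. The following is a minimal sketch in Python, assuming only the layout shown in this post; the function names `parse_fdc_header` and `summarize` are made up for illustration and are not part of any MQ tooling.

```python
import re
from collections import Counter

# Matches header lines such as "| Probe Id :- XC308010 |".
FIELD_RE = re.compile(r"^\|\s*(?P<key>[^|]+?)\s*:-\s*(?P<value>.*?)\s*\|?\s*$")


def parse_fdc_header(text):
    """Return the 'Key :- Value' fields of one FDC header as a dict."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            fields[match.group("key")] = match.group("value")
    return fields


def summarize(headers):
    """Count parsed headers by (Probe Id, Component, Major Errorcode)."""
    return Counter(
        (h.get("Probe Id"), h.get("Component"), h.get("Major Errorcode"))
        for h in headers
    )


# A few lines copied from the header above.
SAMPLE = """\
| Probe Id :- XC308010 |
| Component :- xlsReleaseMutex |
| Program Name :- amqrrmfa |
| Major Errorcode :- xecL_W_LONG_LOCK_WAIT |
| Probe Description :- AMQ6150: WebSphere MQ semaphore is busy. |
"""

header = parse_fdc_header(SAMPLE)
print(header["Probe Id"], header["Program Name"])
```

Grouping by probe ID, component and major error code makes it easier to see whether all the FDCs come from the same place or from several.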
We have Nagios monitoring MQ and the OS. When the queue manager was stopped, 3.5 GB of memory was released, so I am still not convinced that the problem was due to physical memory.
Thanks |
|
|
 |
chinni |
Posted: Wed Nov 18, 2009 6:42 am Post subject: |
|
|
Newbie
Joined: 07 Apr 2006 Posts: 7 Location: United States Of America
|
We had the same kind of problem a couple of months back, and exactly the same FDCs were created as the ones you have now.
We used to run a linear log cleanup script with an rcdmqimg job every 30 min, and we got the same kind of FDC files when the queue depth was high.
Based on our MQ semaphore busy alerts and .FDC files, IBM recommended that we follow one of the options below to resolve the issue:
1. Avoid large queues. MQ is primarily a transport, not a database. Using MQ to store and randomly access large quantities of data is not generally recommended.
2. Avoid RCDMQIMG for large queues. It is not recommended to take frequent media images of very large objects. MQ V6 offers more informational messages regarding objects dependencies upon specific log files in an attempt to allow you to make more informed decisions about when you should take media images of different objects.
3. Tune your system. Some short term relief might be obtained by increasing the value of LogBufferPages to the maximum value (4096), which should allow the log records for the rcdmqimg to be written more quickly, however in the longer term you need to look at whether you are using MQ appropriately and your media image policy.
I hope this info will help. If not just ignore  _________________ chinni |
|
|
 |
jeevan |
Posted: Wed Nov 18, 2009 7:21 am Post subject: |
|
|
Grand Master
Joined: 12 Nov 2005 Posts: 1432
|
chinni wrote: |
We had the same kind of problem a couple of months back, and exactly the same FDCs were created as the ones you have now.
We used to run a linear log cleanup script with an rcdmqimg job every 30 min, and we got the same kind of FDC files when the queue depth was high.
Based on our MQ semaphore busy alerts and .FDC files, IBM recommended that we follow one of the options below to resolve the issue:
1. Avoid large queues. MQ is primarily a transport, not a database. Using MQ to store and randomly access large quantities of data is not generally recommended.
2. Avoid RCDMQIMG for large queues. It is not recommended to take frequent media images of very large objects. MQ V6 offers more informational messages regarding objects dependencies upon specific log files in an attempt to allow you to make more informed decisions about when you should take media images of different objects.
3. Tune your system. Some short term relief might be obtained by increasing the value of LogBufferPages to the maximum value (4096), which should allow the log records for the rcdmqimg to be written more quickly, however in the longer term you need to look at whether you are using MQ appropriately and your media image policy.
I hope this info will help. If not just ignore  |
Thanks. This is definitely good information. But we are not using linear logging, and we are not running RCDMQIMG to take media images.
What other scenario could leave the semaphore busy? Could memory used by an MQ process have been taken over by another process?
I am not an OS (AIX) expert, but I can certainly ask my folks.
thanks |
|
|
 |
gbaddeley |
Posted: Wed Nov 18, 2009 2:33 pm Post subject: |
|
|
 Jedi Knight
Joined: 25 Mar 2003 Posts: 2538 Location: Melbourne, Australia
|
jeevan wrote: |
There were FDC files generated.
+-----------------------------------------------------------------------------+
| |
| WebSphere MQ First Failure Symptom Report |
| ========================================= |
| |
| Date/Time :- Sunday November 15 23:29:22 EST 2009 |
| Host Name :- mqprd1 (AIX 5.3) |
| PIDS :- 5724H7201 |
| LVLS :- 6.0.2.7 |
| Product Long Name :- WebSphere MQ for AIX |
| Vendor :- IBM |
| Probe Id :- XC308010 |
| Application Name :- MQM |
| Component :- xlsReleaseMutex |
| SCCS Info :- lib/cs/unix/rs_aix32/amqxlfsx.c, 1.75.1.6 |
| Line Number :- 2060 |
| Build Date :- Jun 17 2009 |
| CMVC level :- p600-207-090617 |
| Build Type :- IKAP - (Production) |
| UserID :- 00002000 (mqm) |
| Program Name :- amqrrmfa |
| Addressing mode :- 64-bit |
| Process :- 467024 |
| Thread :- 1 |
| QueueManager :- MQPRD1_QM |
| ConnId(1) IPCC :- 13 |
| ConnId(2) QM :- 13 |
| Last HQC :- 1.0.0-454296 |
| Last HSHMEMB :- 1.2.2-3207696 |
| Major Errorcode :- xecL_W_LONG_LOCK_WAIT |
| Minor Errorcode :- OK |
| Probe Type :- MSGAMQ6150 |
| Probe Severity :- 3 |
| Probe Description :- AMQ6150: WebSphere MQ semaphore is busy. |
| FDCSequenceNumber :- 0 |
| |
+-----------------------------------------------------------------------------+
We have Nagios monitoring MQ and the OS. When the queue manager was stopped, 3.5 GB of memory was released, so I am still not convinced that the problem was due to physical memory.
Thanks |
Semaphores are lock flags or mutexes. They don't use much memory. It looks like something set a semaphore but failed to release it. The rest of the FDC might provide some more information about the contended resource. A Google search for ProbeId XC308010 indicates it could just be a very busy system (100% CPU). _________________ Glenn |
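To illustrate the "set but not released" idea in general terms, here is a small Python analogy, not a description of how MQ implements its mutexes. One thread holds a lock while another waits for it with a time limit and reports "busy" when the wait runs out. The names `try_lock` and `demo` and the 0.1 second wait are assumptions chosen for the example.

```python
import threading


def try_lock(lock, wait_seconds):
    """Try to take the lock; report 'busy' if it is not free in time."""
    if lock.acquire(timeout=wait_seconds):
        lock.release()
        return "acquired"
    return "busy"


def demo():
    lock = threading.Lock()
    holding = threading.Event()
    may_release = threading.Event()

    def holder():
        with lock:
            holding.set()       # tell the main thread the lock is taken
            may_release.wait()  # keep holding until told to let go

    worker = threading.Thread(target=holder)
    worker.start()
    holding.wait()

    while_held = try_lock(lock, 0.1)  # the holder still owns the lock
    may_release.set()
    worker.join()
    after_release = try_lock(lock, 0.1)
    return while_held, after_release


print(demo())  # ('busy', 'acquired')
```

The waiter needs no extra memory to hit this condition; it only needs the holder to be slow or stuck, which fits the point that a long lock wait does not by itself indicate a memory shortage.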
|
|
 |
servi |
Posted: Fri Jan 15, 2010 4:02 am Post subject: |
|
|
 Novice
Joined: 19 Mar 2008 Posts: 22 Location: Madrid, España
|
On Wednesday at 10:30 we had the same problem on our HP-UX gateway queue manager:
1. First we detected that MQ channels could not open connections. We saw that the queue manager had reached the maximum connections limit and the system had lots of connections in CLOSE_WAIT state (>1500). The queue manager connection limit is 8000 connections, and it had only 1400 active connections.
2. We tried to stop the queue manager (endmqm), but it did not respond in time.
3. Finally we killed all the MQ processes and deleted the semaphores. The queue manager started at 12:45 without any problem.
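For step 1, the CLOSE_WAIT count can be tracked over time rather than spotted by eye. This is a rough Python sketch that tallies TCP states from `netstat -an`-style text. It assumes the protocol is the first column and the state is the last one, which should be checked against the actual output on the platform; the sample addresses and ports are made up.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per state from `netstat -an`-style text."""
    states = Counter()
    for line in netstat_output.splitlines():
        parts = line.split()
        # Assumed layout: protocol first, connection state last.
        if len(parts) >= 6 and parts[0].lower().startswith("tcp"):
            states[parts[-1]] += 1
    return states


# Hypothetical sample; addresses and ports are invented for the example.
SAMPLE = """\
tcp        0      0  10.0.0.1.1414   10.0.0.9.40112   ESTABLISHED
tcp        0      0  10.0.0.1.1414   10.0.0.9.40113   CLOSE_WAIT
tcp        0      0  10.0.0.1.1414   10.0.0.9.40114   CLOSE_WAIT
udp        0      0  *.514           *.*
"""

states = count_tcp_states(SAMPLE)
print(states["CLOSE_WAIT"], states["ESTABLISHED"])  # 2 1
```

Feeding the function real `netstat -an` output at intervals would show whether CLOSE_WAIT connections build up gradually before the queue manager stops responding.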
We are now investigating the cause of the problem, and we can see the same FDC error: "AMQ6150: WebSphere MQ semaphore is busy".
Here is the FDC Header:
+-----------------------------------------------------------------------------+
| |
| WebSphere MQ First Failure Symptom Report |
| ========================================= |
| |
| Date/Time :- Wednesday January 13 10:53:29 MET 2010 |
| Host Name :- ues90034 (HP-UX B.11.11) |
| PIDS :- 5724H7202 |
| LVLS :- 6.0.2.2 |
| Product Long Name :- WebSphere MQ for HP-UX (PA-RISC platform) |
| Vendor :- IBM |
| Probe Id :- XC308034 |
| Application Name :- MQM |
| Component :- xlsReleaseMutex |
| SCCS Info :- lib/cs/unix/hp700_ux90/amqxlfsx.c, 1.51.1.2 |
| Line Number :- 1688 |
| Build Date :- Aug 1 2007 |
| CMVC level :- p600-202-070801 |
| Build Type :- IKAP - (Production) |
| UserID :- 00002001 (mqm) |
| Program Name :- amqzlaa0_nd |
| Addressing mode :- 64-bit |
| Process :- 9992 |
| Thread :- 4 |
| QueueManager :- MPSDSP01 |
| ConnId(1) IPCC :- 75153270 |
| ConnId(2) QM :- 6279114 |
| Last HQC :- 2.5.19-284872 |
| Last HSHMEMB :- 1.2.2-32295976 |
| Major Errorcode :- xecL_W_LONG_LOCK_WAIT |
| Minor Errorcode :- OK |
| Probe Type :- MSGAMQ6150 |
| Probe Severity :- 3 |
| Probe Description :- AMQ6150: WebSphere MQ semaphore is busy. |
| FDCSequenceNumber :- 0 |
| |
+-----------------------------------------------------------------------------+
Platform: HP-UX (PA-RISC)
Log:
LogPrimaryFiles=250
LogSecondaryFiles=2
LogFilePages=1024
LogType=CIRCULAR
LogBufferPages=0
LogPath=/MQHA/MPSDSP01/log/MPSDSP01/
LogWriteIntegrity=TripleWrite
Channels:
MaxChannels=8000
MaxActiveChannels=8000
PipeLineLength=2
TCP:
KeepAlive=Yes
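The settings above can be checked programmatically, for example against the LogBufferPages maximum of 4096 quoted earlier in this thread. The Python sketch below parses the stanza layout exactly as posted here; `parse_stanzas` is a hypothetical helper, and the parsing rules (a stanza name ending in a colon, then `Key=Value` lines) are an assumption based on this post only.

```python
def parse_stanzas(text):
    """Parse 'Stanza:' headers followed by 'Key=Value' lines into nested dicts."""
    stanzas = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith(":") and "=" not in line:
            current = stanzas.setdefault(line[:-1], {})
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return stanzas


# Values copied from the post above.
POSTED = """\
Log:
LogPrimaryFiles=250
LogSecondaryFiles=2
LogFilePages=1024
LogType=CIRCULAR
LogBufferPages=0
LogPath=/MQHA/MPSDSP01/log/MPSDSP01/
LogWriteIntegrity=TripleWrite
Channels:
MaxChannels=8000
MaxActiveChannels=8000
PipeLineLength=2
TCP:
KeepAlive=Yes
"""

MAX_LOG_BUFFER_PAGES = 4096  # maximum quoted earlier in this thread

config = parse_stanzas(POSTED)
pages = int(config["Log"]["LogBufferPages"])
print(config["Log"]["LogType"], pages, pages < MAX_LOG_BUFFER_PAGES)
```

The sketch only reports that the posted value is below the quoted maximum; how the queue manager treats a value of 0 is not covered here.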
Could this be the same problem? Any idea what might have happened? |
|
|
 |
mvic |
Posted: Fri Jan 15, 2010 8:25 am Post subject: |
|
|
 Jedi
Joined: 09 Mar 2004 Posts: 2080
|
servi wrote: |
Any idea what might have happened? |
I suggest opening a PMR against MQ and sending the full set of files from /var/mqm/errors (in a zipfile etc.) to IBM. |
|
|
 |
George Carey |
Posted: Fri Feb 12, 2010 1:27 pm Post subject: same problem in Linux mqv7.0.0.2 |
|
|
Knight
Joined: 29 Jan 2007 Posts: 500 Location: DC
|
What was IBM's resolution to this semaphore busy issue ..
having the same problem ... would seriously like/need to know.
Is there a PMR post 7.0.0.2 for it ???
(minimal activity on system,low queue depths, circular logs being used)
Such problems go back to 2005 if not before, and PMRs were generated for them. Hope it is not a recurring bug !!
GTC _________________ "Truth is ... grasping the virtually unconditioned",
Bernard F. Lonergan S.J.
(from book titled "Insight" subtitled "A Study of Human Understanding") |
|
|
 |
George Carey |
Posted: Fri Jun 25, 2010 11:07 am Post subject: again |
|
|
Knight
Joined: 29 Jan 2007 Posts: 500 Location: DC
|
Just bumped into this semaphore busy FDC error again ... on another Linux server with MQv7.0.0.2 ... don't know if I ever got the final resolution from IBM or anywhere on this ... at least I don't remember it if I did ...
Did anyone else posting on this item ever get a final root-cause resolution to this semaphore busy problem, a PMR number or the like?
I have a fix for a semaphore leak that I have not applied yet but that problem presents a different set of symptoms than this.
TIA |
|
|
 |
jeevan |
Posted: Fri Jun 25, 2010 12:51 pm Post subject: Re: again |
|
|
Grand Master
Joined: 12 Nov 2005 Posts: 1432
|
George Carey wrote: |
Just bumped into this semaphore busy FDC error again ... on another Linux server with MQv7.0.0.2 ... don't know if I ever got the final resolution from IBM or anywhere on this ... at least I don't remember it if I did ...
Did anyone else posting on this item ever get a final root-cause resolution to this semaphore busy problem, a PMR number or the like?
I have a fix for a semaphore leak that I have not applied yet but that problem presents a different set of symptoms than this.
TIA |
We have not had that problem again. We upgraded to 7.0.1.1 and also applied a few ifixes. That fixed the problem, though I do not remember exactly which fix addressed this particular problem.
Can you please paste the probe ID from the FDC file?
I would suggest upgrading to 7.0.1.2 if possible. This version includes the major ifixes that had to be applied on top of 7.0.1.1.
thanks |
|
|
 |
George Carey |
Posted: Fri Jun 25, 2010 2:24 pm Post subject: probe id |
|
|
Knight
Joined: 29 Jan 2007 Posts: 500 Location: DC
|
Probe ID XC307070
Probe Type MSGAMQ6150
component xlsRequestMutex |
|
|
 |
mvic |
Posted: Fri Jun 25, 2010 2:42 pm Post subject: Re: probe id |
|
|
 Jedi
Joined: 09 Mar 2004 Posts: 2080
|
George Carey wrote: |
Probe ID XC307070
Probe Type MSGAMQ6150
component xlsRequestMutex |
Do you have the text xecL_W_LONG_LOCK_WAIT in there?
These can appear if you have very deep queues.
If not, open a PMR.
You'll probably be advised to get off of 7.0.0.2 because it's so old, and still very soon after the beginning of a major release. |
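Checking for that text across many FDC files can be scripted. The following Python sketch lists the `*.FDC` files in a directory that contain `xecL_W_LONG_LOCK_WAIT`. The helper name `fdcs_with_marker` and the file names in the demo are hypothetical; the demo runs against a temporary directory so that it is self-contained.

```python
from pathlib import Path
import tempfile

MARKER = "xecL_W_LONG_LOCK_WAIT"


def fdcs_with_marker(directory, marker=MARKER):
    """Return names of *.FDC files in `directory` whose text contains `marker`."""
    hits = []
    for path in sorted(Path(directory).glob("*.FDC")):
        text = path.read_text(errors="replace")
        if marker in text:
            hits.append(path.name)
    return hits


def demo():
    # Self-contained demo in a temporary directory; file names are made up.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "AMQ00001.0.FDC").write_text(
            "| Major Errorcode :- xecL_W_LONG_LOCK_WAIT |\n"
        )
        Path(tmp, "AMQ00002.0.FDC").write_text("| Major Errorcode :- OK |\n")
        return fdcs_with_marker(tmp)


print(demo())  # ['AMQ00001.0.FDC']
```

On a real system the function would be pointed at the errors directory mentioned earlier in the thread (/var/mqm/errors), assuming the FDC files there are readable by the user running the script.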
|
|
 |
jeevan |
Posted: Fri Jun 25, 2010 4:09 pm Post subject: Re: probe id |
|
|
Grand Master
Joined: 12 Nov 2005 Posts: 1432
|
George Carey wrote: |
Probe ID XC307070
Probe Type MSGAMQ6150
component xlsRequestMutex |
Are you using linear logging? Do you get this problem when you are trying to create media images (RCDMQIMG) of a deep queue?
Do you also have an FDC with probe id: xc3070810?
What is the value of LogBufferPages? Increasing it may also help if this is due to RCDMQIMGing of a large queue. |
|
|
 |
George Carey |
Posted: Mon Jun 28, 2010 11:34 am Post subject: no linear logs, etc. |
|
|
Knight
Joined: 29 Jan 2007 Posts: 500 Location: DC
|
Quote: |
What was IBM's resolution to this semaphore busy issue ..
having the same problem ... would seriously like/need to know.
Is there a PMR post 7.0.0.2 for it ???
(minimal activity on system,low queue depths, circular logs being used)
Such problems go back to 2005 if not before and PMR's were generated for it. Hope it is not a recurring bug !!
GTC |
Repeating my previous post ... I would prefer to do a spot fix. The last upgrade I did, to the latest version 7.0.1, had a major impact (MQ problems where there were none before). I thought I would move to the latest version of MQ since we had to move to a new platform (Sun to Linux), but where MQ was seen as rock solid before, it has lost some of that lustre since the move to MQ v7.x, and a spot fix can have less deployment impact than a full re-install. While going to the latest version, MQ v7.0.1.2 (or 3), would likely fix the problem, I would like to associate cause and effect with the fix used, if at all possible.
For example, another PMR with a semaphore leak had a patch that just replaces the amqzlaa0 module ... a lot easier and less impact to deploy on a running production environment.
TIA |
|
|
 |
|