George Carey |
Posted: Fri Jun 04, 2010 4:07 pm Post subject: MQ Internal Error on semget |
|
|
Knight
Joined: 29 Jan 2007 Posts: 500 Location: DC
|
Let me try this (lengthy question) on this forum as there were no real stabs on the list forum.
-------------------------------------------------
Quote: |
Trying to understand root cause of kernel resource error. Or actually
perhaps a likely kernel resource usage scenario.
Problem scenario: I have a Java application (not app-server hosted) running
on the same box as an MQ server. The box was Solaris, and no environment
interaction issues were seen with kernel resource usage. A move to Red Hat
Linux on Dell servers was then done.
Now the following type of error occurs infrequently:
"AMQ6119: An internal WebSphere MQ error has occurred ('28 No space left
on device' from semget.)" from the amqzxma0 process, and MQ crashes.
The kernel semaphore settings are set to the standard values from the Quick
Beginnings manual for Linux. The settings were not a problem on
Solaris (9/10?) but appear to be an issue with Red Hat Linux 5.1.
Questions:
1.) Assuming the problem is not MQ's semaphore settings but the semaphore
usage of the other application sharing the environment, why is MQ the
application that always crashes with a semget error and never the other
application (which is a resource hog)?
It is hard to argue that it's the other application's problem when MQ crashes
and the other application does not. I am looking for a plausible semaphore
consumption scenario that would always cause MQ to fail. Perhaps the
other app grabs a large number of semaphores up front (static allocation) and
doesn't let go of them, while MQ allocates more dynamically and doesn't know
it will hit a wall until it hits it? |
_________________ "Truth is ... grasping the virtually unconditioned",
Bernard F. Lonergan S.J.
(from book titled "Insight" subtitled "A Study of Human Understanding") |
|
 |
mqjeff |
Posted: Fri Jun 04, 2010 4:22 pm Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
...
And what were the results of the PMR you opened?
.... |
|
 |
mvic |
Posted: Sat Jun 05, 2010 2:41 pm Post subject: Re: MQ Internal Error on semget |
|
|
 Jedi
Joined: 09 Mar 2004 Posts: 2080
|
George Carey wrote: |
Trying to understand root cause of kernel resource error. Or actually perhaps a likely kernel resource usage scenario. |
1. monitor usage of these items over time using ipcs -a
2. increase kernel numbers that are short. Maybe MQ needed more than the "basic" setup given in the QB manual? Don't know, though. |
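A minimal sketch of step 1, assuming the Linux `ipcs -s` layout in which each semaphore set is one row whose key column starts with `0x`. The function names `count_sem_sets` and `log_sem_sample` are made up for illustration and are not part of MQ.

```shell
# Count semaphore sets in `ipcs -s` output read from stdin.
# Assumption: each data row starts with a hex key such as 0x00000000.
count_sem_sets() {
  awk '$1 ~ /^0x/ { n++ } END { print n + 0 }'
}

# Print one timestamped sample; meant to be run periodically (e.g. from cron).
log_sem_sample() {
  echo "$(date '+%Y-%m-%d %H:%M:%S') sem_sets=$(ipcs -s | count_sem_sets)"
}
```

Appending the output of `log_sem_sample` to a file every few minutes should give a trend that can be compared against the configured limits.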
|
 |
George Carey |
Posted: Mon Jun 07, 2010 6:52 am Post subject: Response to response |
|
|
Knight
Joined: 29 Jan 2007 Posts: 500 Location: DC
|
Yes, the semaphore kernel parameters more than likely need to be raised, but by how much? The other issue is that the cohabiting application is a hog, and its maintenance team has no sense of whether, or how much, it uses kernel semaphores.
My issue is that trying to say it is the other app when MQ is the app that crashes is a difficult argument. I am trying to get a better sense of how semaphore usage by MQ or apps in general may indicate that MQ crashing is a red herring and the other app is the proximate cause/culprit for lack of setting appropriate combined semaphore usage values for MQ and the other app.
In answer to Jeff's PMR question: I am awaiting approval/funding for a new Secure Support Offering from IBM to be put in place (when is out of my control), and in the interim I am somewhat hamstrung by the current support setup.
So I need to figure things out as best I can in this less-than-optimal support environment. |
|
 |
mvic |
Posted: Mon Jun 07, 2010 7:10 am Post subject: Re: Response to response |
|
|
 Jedi
Joined: 09 Mar 2004 Posts: 2080
|
George Carey wrote: |
My issue is that trying to say it is the other app when MQ is the app that crashes is a difficult argument. I am trying to get a better sense of how semaphore usage by MQ or apps in general may indicate that MQ crashing is a red herring and the other app is the proximate cause/culprit for lack of setting appropriate combined semaphore usage values for MQ and the other app. |
ipcs -a should show the id of the original creator of the segment. Should typically be "mqm" for MQ-owned stuff. |
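A small sketch for grouping semaphore sets by owner, assuming the Linux `ipcs -s` column order (key, semid, owner, perms, nsems). As far as I know, the default Linux listing shows the current owner, and `ipcs -s -c` adds the creator columns. The function name `sem_sets_by_owner` is hypothetical.

```shell
# Tally semaphore sets and total semaphores per owner from `ipcs -s` output on stdin.
# Assumed columns: key semid owner perms nsems.
sem_sets_by_owner() {
  awk '$1 ~ /^0x/ { sets[$3]++; sems[$3] += $5 }
       END { for (o in sets) print o, sets[o], sems[o] }' | sort
}

# Usage on a live system: ipcs -s | sem_sets_by_owner
```

Each output line is `owner sets semaphores`, so a large count under a user other than "mqm" would point away from the queue manager.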
|
 |
mqjeff |
Posted: Mon Jun 07, 2010 7:23 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
Without taking the time to accumulate ipcs -a results, you can always take a WAG and double the kernel parameters for the qmgr and see what happens. |
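A rough sketch of the doubling approach, assuming Linux, where `/proc/sys/kernel/sem` holds four values in the order SEMMSL SEMMNS SEMOPM SEMMNI. Per the semget man page, ENOSPC ("No space left on device") means a new set would exceed SEMMNI or SEMMNS, so those two limits are the ones relevant to this error. The function name `double_sem_params` is illustrative only.

```shell
# Print a doubled kernel.sem line from the four values read on stdin.
# Assumed order (as in /proc/sys/kernel/sem): SEMMSL SEMMNS SEMOPM SEMMNI.
double_sem_params() {
  read -r semmsl semmns semopm semmni
  echo "kernel.sem = $((semmsl * 2)) $((semmns * 2)) $((semopm * 2)) $((semmni * 2))"
}

# Usage on Linux: double_sem_params < /proc/sys/kernel/sem
```

The printed line could then be placed in /etc/sysctl.conf and loaded with `sysctl -p`. Whether doubling is enough is still a guess until usage has been measured.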
|
 |
George Carey |
Posted: Mon Jun 07, 2010 8:10 am Post subject: ClientIdle etc |
|
|
Knight
Joined: 29 Jan 2007 Posts: 500 Location: DC
|
Yes, I was thinking that if the other app doesn't crash with the default sem settings, just doubling the sem settings should assure MQ gets what it needs.
But looking at the ipcs -a command output I am seeing only root and mqm as owners. I believe an application does not have to use kernel semaphores for resource locking.
But what if an improperly coded MQ client app is causing the exhaustion of the kernel semaphores?
Can a misbehaving MQ client cause the MQ server to outstrip its semaphore allocation? Say it doesn't disconnect properly from a queue or the server, thus leaving semaphores allocated? Again, I am not sure where in the MQI API calls, or in which ones, the kernel semaphores are allocated and deallocated, so I am unsure on this in general.
So increasing the semaphore allocation may just be a delaying action. |
|
 |
exerk |
Posted: Mon Jun 07, 2010 9:57 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
George,
Have you run THIS? _________________ It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys. |
|
 |
George Carey |
Posted: Tue Jun 08, 2010 7:44 am Post subject: good info |
|
|
Knight
Joined: 29 Jan 2007 Posts: 500 Location: DC
|
Thanks good info ... it will be of value in general.
GTC |
|
 |
George Carey |
Posted: Wed Jun 09, 2010 7:18 am Post subject: semmni |
|
|
Knight
Joined: 29 Jan 2007 Posts: 500 Location: DC
|
Great little tool!!
IBM should make it part of the product release or at minimum a supportpac.
If it were just comparing sysctl.conf values to the Quick Start Guide recommendations it would not be of much value ... but it also appears to compare the semaphores currently in use against the recommendations and gives an OK, WARN, or Fail value for each ... very nice.
I will need to read the script in detail to see how it determines semaphores in use.
I have a Linux server with two MQ QMGRs on it ... running the 'mqconfig' script yesterday showed semmni usage at '78% WARN', and today it shows '82% WARN'.
The problem client process has been separated onto its own server and is connecting to these QMGRs as a remote MQ client ... what would be a likely scenario that would cause these semmni (semaphore sets) to not be released and just increase? The client code is only doing simple gets via Java/JMS.
TIA |
|
 |
George Carey |
Posted: Wed Jun 09, 2010 12:09 pm Post subject: not as sexy as I thought |
|
|
Knight
Joined: 29 Jan 2007 Posts: 500 Location: DC
|
Well, looking at the mqconfig script ... it is not as sexy as I thought, and in fact I think I saw it some time ago and dismissed it as just comparing sysctl.conf values to the IBM Quick Start Guide.
But it does a bit more ... to get the semmni count it just uses ipcs -s piped to wc -l with a little formatting. I thought there was some system call to get the value!
Bottom line: not sexy but still useful, and the doco along with it is just as useful! So maybe it is not needed as a SupportPac.
But a section in the Quick Start Guide or the like on the same subject matter would be useful. |
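The counting approach described in this post can be sketched roughly as follows. This is not the mqconfig source, only an illustration of the idea; it filters on the hex key column instead of using `wc -l` so that header lines are not counted. The function name `semmni_usage_pct` is hypothetical, and the Linux `ipcs -s` layout is assumed.

```shell
# Report semaphore sets in use against a SEMMNI limit.
# $1 = SEMMNI limit; `ipcs -s` output is read from stdin.
semmni_usage_pct() {
  awk -v limit="$1" '$1 ~ /^0x/ { used++ }
       END { printf "%d of %d sets in use (%d%%)\n", used, limit, used * 100 / limit }'
}

# Usage on Linux (SEMMNI is the 4th value in /proc/sys/kernel/sem):
#   ipcs -s | semmni_usage_pct "$(awk '{ print $4 }' /proc/sys/kernel/sem)"
```

The percentage is truncated to a whole number, and a limit of zero is not handled.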
|
 |
fjb_saper |
Posted: Wed Jun 09, 2010 9:45 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
GeorgeCarey wrote: |
The problem client process has been separated onto its own server and is connecting to these QMGRs as a remote MQ client ... what would be a likely scenario that would cause these semmni (semaphore sets) to not be released and just increase? The client code is only doing simple gets via Java/JMS. |
You need to make sure that the apps are closing their channels correctly.
Alternatively, check whether using a ClientIdle stanza would help. _________________ MQ & Broker admin |
|
 |
George Carey |
Posted: Fri Jun 11, 2010 9:01 am Post subject: app or not app |
|
|
Knight
Joined: 29 Jan 2007 Posts: 500 Location: DC
|
The app can be taken down entirely and the semaphores are still allocated ... the client channel is not even connected. Not until the QMGR is bounced do the semaphores get released back to the system.
IBM support said there was a semaphore leak in versions 7 < 7.0.0.2 but was fixed in 7.0.0.2 ! Maybe not ! |
|
 |
Vitor |
Posted: Fri Jun 11, 2010 9:58 am Post subject: Re: app or not app |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
George Carey wrote: |
IBM support said there was a semaphore leak in versions 7 < 7.0.0.2 but was fixed in 7.0.0.2 ! Maybe not ! |
This will be an interesting PMR.  _________________ Honesty is the best policy.
Insanity is the best defence. |
|
 |
fjb_saper |
Posted: Fri Jun 11, 2010 11:38 am Post subject: Re: app or not app |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
George Carey wrote: |
IBM support said there was a semaphore leak in versions 7 < 7.0.0.2 but was fixed in 7.0.0.2 ! Maybe not ! |
Are you sure that wasn't meant to be
Quote: |
was fixed in 7.0.1.2 |
|