vsathyan (Centurion; Joined: 10 Mar 2014; Posts: 121)
Posted: Tue Jun 02, 2015 7:33 pm
Post subject: shmmni 100% used - Linux
Hi all,
We are facing a strange issue in production, with no changes to the infrastructure, which was deployed almost 4 months ago.
The System V shared memory identifier limit (shmmni) reaches 100% usage and queue manager performance degrades; eventually the queue manager stops accepting new connections, and existing connections fail with MQRC 2009/2059.
There are only 3 server-connection channels on this queue manager, with a total of 34 connections across the three. The server-connection channels have DISCINT = 0 and SHARECNV = 0. Does this create a problem?
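For reference, those channel settings can be verified in runmqsc along these lines (the channel and queue manager names below are placeholders, not our real ones):
----------------------------------------------------------
# show the disconnect interval and shared-conversation setting for a channel
echo "DISPLAY CHANNEL('APP.SVRCONN') DISCINT SHARECNV" | runmqsc PROD.QM1
----------------------------------------------------------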
There are other queue managers in the network which are far more heavily loaded than this one, yet they use only around 45 out of 6400 shmmni sets.
The queue manager is running with only 16 processes under the 'mqm' user account (per ps -ef | grep mqm).
The operating system is Oracle Enterprise Linux 6.5, running WebSphere MQ 7.5.0.2.
As a temporary fix we increased shmmni to 8192, but we have to identify and apply a permanent fix for this issue.
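For anyone hitting the same limit, the usual sysctl route for raising shmmni looks roughly like this (a sketch; 8192 is simply the value we chose, and the commands need root):
----------------------------------------------------------
# check the current limit
sysctl kernel.shmmni

# raise it on the running system (takes effect immediately, lost on reboot)
sysctl -w kernel.shmmni=8192

# make the change persistent across reboots
echo "kernel.shmmni = 8192" >> /etc/sysctl.conf
sysctl -p
----------------------------------------------------------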
Below are the command outputs
----------------------------------------------------------
/opt/mqm/bin/mqconfig
System V Shared Memory
shmmax 68719476736 bytes IBM>=268435456 PASS
shmmni 6400 of 8192 sets (78%) IBM>=4096 WARN
shmall 417061616 of 4294967296 pages (9%) IBM>=2097152 PASS
[mqm@server ~]$ free
total used free shared buffers cached
Mem: 16330176 16168032 162144 10594096 103840 14658264
-/+ buffers/cache: 1405928 14924248
Swap: 2097144 0 2097144
-------------------------------------------------------------------
Also, in the MQ error logs we observed errors logged with MQRC 2071 (MQRC_STORAGE_NOT_AVAILABLE). When we checked the NFS mount, the usage is only around 6%:
nfsserver:/mq_prodnfs/mq_prodnfs
50G 2.7G 47G 6% /mqdata
Out of 50 GB, only 2.7 GB is used and 47 GB is free.
We googled MQRC 2071; as for the causes indicated in a couple of links, the app is hosted on Windows and it is not posting blank messages either.
The setup had been running fine for nearly 4 months, and we suddenly started facing this issue last Friday.
Your inputs are much appreciated. Thanks in advance for your advice.
exerk (Jedi Council; Joined: 02 Nov 2006; Posts: 6339)
Posted: Wed Jun 03, 2015 1:13 am
As a starting point, I suggest checking all Change Management records applied on that date to see whether any were applied specifically to your server, and if so, investigate that change.
_________________
It's puzzling, I don't think I've ever seen anything quite like this before... and it's hard to soar like an eagle when you're surrounded by turkeys.
fjb_saper (Grand High Poobah; Joined: 18 Nov 2003; Posts: 20756; Location: LI, NY)
Posted: Wed Jun 03, 2015 2:52 am
Also, for "storage not available": check not only the mqdata file system but also the mqlogs file system.
_________________
MQ & Broker admin
tczielke (Guardian; Joined: 08 Jul 2010; Posts: 941; Location: Illinois, USA)
Posted: Wed Jun 03, 2015 4:54 am
When you get to the shmmni 100% full condition, have you confirmed that it is MQ taking up the shared memory segments? Have you checked with something like ipcs -m?
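For example, something along these lines (a minimal sketch; data rows in the default ipcs output begin with the 0x key, and the owner is the third column):
----------------------------------------------------------
# count System V shared memory segments owned by mqm
ipcs -m | grep -c mqm

# segment counts per owner
ipcs -m | awk '/^0x/ {print $3}' | sort | uniq -c
----------------------------------------------------------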
Also, you probably want to look into getting onto the latest fix pack, which is 7.5.0.5.
_________________
Working with MQ since 2010.
vsathyan (Centurion; Joined: 10 Mar 2014; Posts: 121)
Posted: Wed Jun 03, 2015 7:59 am
@exerk,
There were no changes applied on that date. A couple of months ago there was a forced Linux patching, but that should not affect the queue manager after 2 months.
@fjb_saper,
Unfortunately, data and logs are on the same share/mount (we are in the process of moving them to a different mount). There is space available and the usage is only 6%.
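For reference, both file systems can be checked in one go; the sketch below assumes our /mqdata mount plus the default /var/mqm/log path, so adjust to your layout:
----------------------------------------------------------
# check free space on the data and log file systems together
df -h /mqdata /var/mqm/log
----------------------------------------------------------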
@tczielke,
ipcs -m listed active MQ shared memory segments; none were marked for destruction.
On another note, we have identified a damaged queue object used by the Nastel monitoring agent. The Nastel process may have kept trying to access this object, eventually creating the problem.
For now we have stopped the Nastel agent and restarted the queue manager; since then, shmmni usage has held steady at around 34 out of 8192 sets for the past 10 hours. We are continuing to monitor it.
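In case it helps anyone watching for a recurrence, a crude tracking loop along these lines would do (the log path and interval are arbitrary choices):
----------------------------------------------------------
# append a timestamped count of mqm-owned segments every 5 minutes
while true; do
    echo "$(date '+%F %T') $(ipcs -m | grep -c mqm)" >> /tmp/shmmni_usage.log
    sleep 300
done
----------------------------------------------------------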
Will update you once we have more information.
Thanks, all, for your valuable time and inputs.
Cheers!
rammer (Partisan; Joined: 02 May 2002; Posts: 359; Location: England)
Posted: Wed Jun 03, 2015 2:20 pm
vsathyan wrote:
> On another note, we have identified a damaged queue object used by the Nastel monitoring agent. The Nastel process may have kept trying to access this object, eventually creating the problem.
> For now we have stopped the Nastel agent and restarted the queue manager; since then, shmmni usage has held steady at around 34 out of 8192 sets for the past 10 hours. We are continuing to monitor it.

Sounds like a good spot.
vsathyan (Centurion; Joined: 10 Mar 2014; Posts: 121)
Posted: Tue Jul 14, 2015 8:17 am
Update:
MQ 7.5.0.2 has a memory leak problem, confirmed by IBM and fixed in 7.5.0.5.
Before applying the maintenance pack we reproduced the issue on 7.5.0.2; after applying 7.5.0.5 we tried to reproduce it using the same steps, and the memory leak did not occur.
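For anyone applying the same maintenance, the installed level can be confirmed afterwards with the standard dspmqver command:
----------------------------------------------------------
# display the installed MQ version; expect 7.5.0.5 after the fix pack
dspmqver
----------------------------------------------------------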
Hope this helps someone who is still using MQ 7.5. :p
_________________
Custom WebSphere MQ Tools Development, C# & Java
WebSphere MQ Solution Architect Since 2011
WebSphere MQ Admin Since 2004
tczielke (Guardian; Joined: 08 Jul 2010; Posts: 941; Location: Illinois, USA)
Posted: Tue Jul 14, 2015 8:23 am
Thanks for sharing this. Do you have the APAR number that corrects this issue in 7.5.0.5?
_________________
Working with MQ since 2010.
vsathyan (Centurion; Joined: 10 Mar 2014; Posts: 121)
Posted: Tue Jul 14, 2015 9:47 am
tczielke (Guardian; Joined: 08 Jul 2010; Posts: 941; Location: Illinois, USA)
Posted: Tue Jul 14, 2015 9:54 am
Thanks. It looks like that APAR was corrected in 7.5.0.3, too.
_________________
Working with MQ since 2010.
vsathyan (Centurion; Joined: 10 Mar 2014; Posts: 121)
Posted: Tue Jul 14, 2015 9:58 am
Yeah, it was corrected in 7.5.0.3.
7.5.0.5 has a bunch of fixes applied. When we tested it in our environment there were no side effects, and it should sustain us for a year or so.
Hence we deployed 7.5.0.5 in our prod environment. The environment is very stable now.
Thanks & Regards,
vsathyan
_________________
Custom WebSphere MQ Tools Development, C# & Java
WebSphere MQ Solution Architect Since 2011
WebSphere MQ Admin Since 2004