Lack of available virtualized memory space alerts
nelson
Posted: Wed Oct 19, 2016 6:16 am    Post subject: Lack of available virtualized memory space alerts

Partisan

Joined: 02 Oct 2012
Posts: 313

Hi all,

We are working on MQ7.5.0.4:

A lot of these FDCs were generated before an MQ crash:


Code:
+-----------------------------------------------------------------------------+
|                                                                             |
| WebSphere MQ First Failure Symptom Report                                   |
| =========================================                                   |
|                                                                             |
| Date/Time         :- Fri October 14 2016 14:24:33 AST                       |
| UTC Time          :- 1476469473.952225                                      |
| UTC Time Offset   :- -240 (AST)                                             |
| Host Name         :- myhost                                                 |
| Operating System  :- AIX 7.1                                                |
| PIDS              :- 5724H7221                                              |
| LVLS              :- 7.5.0.4                                                |
| Product Long Name :- WebSphere MQ for AIX                                   |
| Vendor            :- IBM                                                    |
| Installation Path :- /usr/mqm                                               |
| Installation Name :- Installation1    (1)                                   |
| License Type      :- Production                                             |
| Probe Id          :- XC267011                                               |
| Application Name  :- MQM                                                    |
| Component         :- xehAsySignalMonitor                                    |
| SCCS Info         :- /build/slot1/p750_P/src/lib/cs/unix/amqxerrx.c,        |
| Line Number       :- 2912                                                   |
| Build Date        :- Aug  7 2014                                            |
| Build Level       :- p750-004-140807                                        |
| Build Type        :- IKAP - (Production)                                    |
| Effective UserID  :- 12 (mqm)                                               |
| Real UserID       :- 203 (mqsi)                                             |
| Program Name      :- amqzlaa0                                               |
| Addressing mode   :- 64-bit                                                 |
| LANG              :- en_US                                                  |
| Process           :- 2097488                                                |
| Thread            :- 2                                                      |
| UserApp           :- FALSE                                                  |
| Last HQC          :- 0.0.0-0                                                |
| Last HSHMEMB      :- 0.0.0-0                                                |
| Major Errorcode   :- xecE_W_UNEXPECTED_ASYNC_SIGNAL                         |
| Minor Errorcode   :- OK                                                     |
| Probe Type        :- MSGAMQ6209                                             |
| Probe Severity    :- 3                                                      |
| Probe Description :- AMQ6209: An unexpected asynchronous signal (33 :       |
|   SIGDANGER) has been received and ignored.                                 |
| FDCSequenceNumber :- 0                                                      |
| Arith1            :- 33 (0x21)                                              |
| Arith2            :- 2097488 (0x200150)                                     |
| Comment1          :- SIGDANGER                                              |
|                                                                             |
+-----------------------------------------------------------------------------+


And here is IBM's technote related to the issue:

Quote:
Problem(Abstract)
To alert you to a lack of available virtualized memory space, WebSphere MQ generates many FDC files with the Probe ID XC267011, showing the signal SIGDANGER.


http://www-01.ibm.com/support/docview.wss?uid=swg21685969
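
To check how many FDCs actually match the technote, the header fields can be pulled out of each report and compared. A rough Java sketch, assuming the `Key :- Value` layout shown in the FDC above; the class and method names are hypothetical and not part of any MQ tooling:

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FdcHeader {
    /** Collects the "Key :- Value" pairs from the boxed FDC header text. */
    static Map<String, String> parse(String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : text.split("\n")) {
            String body = line.trim();
            // Drop the box borders ("|") on either side of the line.
            if (body.startsWith("|")) {
                body = body.substring(1);
            }
            if (body.endsWith("|")) {
                body = body.substring(0, body.length() - 1);
            }
            int sep = body.indexOf(":-");
            if (sep < 0) {
                continue; // border, title, or continuation line
            }
            fields.put(body.substring(0, sep).trim(), body.substring(sep + 2).trim());
        }
        return fields;
    }

    /** True when the header matches the technote: probe XC267011 reporting SIGDANGER. */
    static boolean isSigdangerAlert(Map<String, String> fields) {
        return "XC267011".equals(fields.get("Probe Id"))
                && "SIGDANGER".equals(fields.get("Comment1"));
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "| Probe Id          :- XC267011        |",
                "| Program Name      :- amqzlaa0        |",
                "| Comment1          :- SIGDANGER       |");
        Map<String, String> fields = parse(sample);
        System.out.println(fields.get("Program Name") + " -> " + isSigdangerAlert(fields));
    }
}
```

Only the first line of a wrapped value (such as Probe Description) is captured, which is enough for matching on Probe Id and Comment1.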

Is there any possibility that the problem was not related to virtual memory space? Something like a high CPU usage for example or other failing resource?

Have any of you faced the issue?

Thanks in advance.
mqjeff
Posted: Wed Oct 19, 2016 6:20 am

Grand Master

Joined: 25 Jun 2008
Posts: 17447

Have you already eliminated the possibility that it was out of memory?
_________________
chmod -R ugo-wx /
Vitor
Posted: Wed Oct 19, 2016 6:21 am    Post subject: Re: Lack of available virtualized memory space alerts

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

nelson wrote:
Is there any possibility that the problem was not related to virtual memory space? Something like a high CPU usage for example or other failing resource?


So why don't you think the problem is what IBM says it is?
_________________
Honesty is the best policy.
Insanity is the best defence.
nelson
Posted: Wed Oct 19, 2016 6:35 am

Partisan

Joined: 02 Oct 2012
Posts: 313

mqjeff wrote:
Have you already eliminated the possibility that it was out of memory?


In fact... but unfortunately we don't have a history of virtual memory and other resource consumption, nor any earlier error that indicates high virtual memory usage. For now, after the crash, everything is working fine and we have no option but to keep monitoring the processes...

Other than a memory leak in an IIB flow, for example, what other "silent" problem at the MQ level could lead to high virtual memory usage?

The AIX specialists confirmed to us that there is enough VM space: the maximum recommended for the current physical memory.

Thanks in advance for your comments.
mqjeff
Posted: Wed Oct 19, 2016 6:38 am

Grand Master

Joined: 25 Jun 2008
Posts: 17447

Do you see any indications of any crashes or leaks in a message flow?

Has the AIX team confirmed that the memory of the virtual machine is not shared between other virtual machines?
_________________
chmod -R ugo-wx /
Vitor
Posted: Wed Oct 19, 2016 7:08 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

nelson wrote:
The AIX specialists confirmed to us that there is enough VM space: the maximum recommended for the current physical memory.


Even at the actual time MQ was reporting SIGDANGER?
_________________
Honesty is the best policy.
Insanity is the best defence.
nelson
Posted: Wed Oct 19, 2016 8:57 am

Partisan

Joined: 02 Oct 2012
Posts: 313

mqjeff wrote:
Do you see any indications of any crashes or leaks in a message flow?

Has the AIX team confirmed that the memory of the virtual machine is not shared between other virtual machines?


Six minutes after the alerts an abend/dump was generated:

Code:
+-----------------------------------------------------------------------------+
|                                                                             |
|                                                                             |
| First Failure Symptom Report                                                |
|   ========================                                                  |
|                                                                             |
| Proc start time (GMT) :- Fri Oct 14 14:30:44 2016                           |
|                                                                             |
|   Product Details                                                           |
|   +++++++++++++++                                                           |
|                                                                             |
| Vendor                :- IBM                                                |
| Product Name          :- IBM Integration Bus                                |
| Program ID            :- 5724-J05                                           |
| Version               :- 9002                                               |
|                                                                             |
|   OS Information                                                            |
|   ++++++++++++++                                                            |
|                                                                             |
| Operating System      :- AIX                                                |
| Version               :- 7                                                  |
| Release               :- 1                                                  |
| Node Name             :- aixcl001                                           |
| Machine ID            :- 00F944014C00                                       |
|                                                                             |
|   Environment                                                               |
|   +++++++++++                                                               |
|                                                                             |
| Installation Path     :- /opt/IBM/mqsi/9.0.0.2                              |
| Service User ID       :- mqsi                                               |
| Work Path             :- /var/mqsi                                          |
| Executable Name       :- DataFlowEngine                                     |
| Process ID            :- 5505436                                            |
|                                                                             |
|   Deployment                                                                |
|   ++++++++++                                                                |
|                                                                             |
| Component Name        :- BROKER                                             |
| Component UUID        :- 1896f272-c833-11e4-b480-ac11699c0000               |
| Queue Manager         :- QMGR                                               |
| Execution Group       :- execution.group                                    |
| EG UUID               :- a1b60d35-4c01-0000-0080-d3d407c2eed9               |
| User trace            :- 0                                                  |
| Service trace         :- 0                                                  |
| Trace size            :- 0                                                  |
|                                                                             |
|   Build Information                                                         |
|   +++++++++++++++++                                                         |
|                                                                             |
| Backing build         :-                                                    |
| Sandbox               :- /build/slot1/S900_P                                |
| CMVC Level            :- S900-FP02                                          |
| Build type            :- Production                                         |
| Build context         :- rios_aix_4                                         |
| 64 Bit Build          :- yes                                                |
|                                                                             |
|   Failure Location                                                          |
|   ++++++++++++++++                                                          |
|                                                                             |
| Time of Report (GMT)  :- secs since 1/1/1970: 1476469855                    |
| Message Flow          :-                                                    |
|  failingFlow                                           |
| Thread ID             :- 0x0000000000004143                                 |
|                                                                             |
+-----------------------------------------------------------------------------+
                                                                               
abend record for pid 5505436 tid 16707 time in seconds since 01/01/1970: 1476469855
File: /build/slot1/S900_P/src/CommonServices/Unix/ImbAbend.cpp
Line: 1131
Function: signal received
---- Inserts ----
11
@(#) MQMBID sn=S900-FP02 su=_0DRGUPfNEeO0_-a3FU977g pn=CommonServices/Unix/ImbAbend.cpp]
498721728
-----------------
----------------------------- Stack dump for current thread (   16707)
(0x1db9e440+??????] <no name available> []
(0x07459580+0x0000013c) getNextBuffer__17ImbXMLNSCServicesFPUiPPcT1 [/opt/IBM/mqsi/9.0.0.2/lib/libGenXmlParser4.a(libGenXmlParser4.a.so)]
(0x085fb380+0x0000004c) getNextBuffer__9@15@xlxpcFPiPUiPPcPv [/opt/IBM/mqsi/9.0.0.2/xlxpc/lib/libBIPNVBRT11.0.a]
(0x085ee760+0x00000098) @17@GNBDataSource_load [/opt/IBM/mqsi/9.0.0.2/xlxpc/lib/libBIPNVBRT11.0.a]
(0x08607a60+0x00000250) parseSetup__Q2_5xlxpc11XLXPCParserFb [/opt/IBM/mqsi/9.0.0.2/xlxpc/lib/libBIPNVBRT11.0.a]
(0x0744c000+0x00000188) setupXLXPParser__19ImbXMLNSCDocHandlerFv [/opt/IBM/mqsi/9.0.0.2/lib/libGenXmlParser4.a(libGenXmlParser4.a.so)]
(0x07470880+0x0000008c) parseAll__19ImbXMLNSCDocHandlerFv [/opt/IBM/mqsi/9.0.0.2/lib/libGenXmlParser4.a(libGenXmlParser4.a.so)]
(0x07474b00+0x000005e4) parseLastChild__15ImbXMLNSCParserFP16ImbSyntaxElement [/opt/IBM/mqsi/9.0.0.2/lib/libGenXmlParser4.a(libGenXmlParser4.a.so)]
(0x066d2700+0x000004dc) createLastChild__16ImbSyntaxElementFRC10ImbWstringPC20ImbDefaultPropertiesRC17ImbMessageOptionsRC21ImbBufferedStringBaseXTUcTQ2_3std11char_traitsXTUc_TUsSP24SP128_iT5N31 [/opt/IBM/mqsi/9.0.0.2/lib/libMessageServices.a(libMessageServices.a.so)]
(0x08133780+0x00000254) Java_com_ibm_broker_plugin_MbElement__1createElementAsLastChildFromBitstream [/opt/IBM/mqsi/9.0.0.2/lib/libimbjplg.a]
(0x0af81068+??????] <no name available> [/opt/IBM/mqsi/9.0.0.2/jre17/lib/ppc64/compressedrefs/libj9vm26.so]
(0x0af10ba0+0x00000000) <no name available> [/opt/IBM/mqsi/9.0.0.2/jre17/lib/ppc64/compressedrefs/libj9vm26.so]
(0x081d9800+0x00000048) CallVoidMethod__7JNIEnv_FP8_jobjectP10_jmethodIDe [/opt/IBM/mqsi/9.0.0.2/lib/libimbjplg.a]
(0x0821a700+0x000002e8) evaluate__10ImbJniNodeFRC18ImbMessageAssemblyPC19ImbDataFlowTerminal [/opt/IBM/mqsi/9.0.0.2/lib/libimbjplg.a]
(0x06709d80+0x0000021c) evaluate__19ImbDataFlowTerminalFRC18ImbMessageAssembly [/opt/IBM/mqsi/9.0.0.2/lib/libMessageServices.a(libMessageServices.a.so)]
(0x06708e80+0x0000010c) propagateInner__19ImbDataFlowTerminalFRC18ImbMessageAssemblyP19ImbDataFlowTerminal [/opt/IBM/mqsi/9.0.0.2/lib/libMessageServices.a(libMessageServices.a.so)]
(0x06c3cb00+0x0000016c) propagate__19ImbDataFlowTerminalFRC18ImbMessageAssembly [/opt/IBM/mqsi/9.0.0.2/lib/libMessageServices.a(libMessageServices.a.so)]
(0x0829db80+0x000000d0) Java_com_ibm_broker_plugin_MbOutputTerminal__1propagate [/opt/IBM/mqsi/9.0.0.2/lib/libimbjplg.a]
(0x0af8102c+??????] <no name available> [/opt/IBM/mqsi/9.0.0.2/jre17/lib/ppc64/compressedrefs/libj9vm26.so]
(0x0af0b280+0x00000000) <no name available> [/opt/IBM/mqsi/9.0.0.2/jre17/lib/ppc64/compressedrefs/libj9vm26.so]
(0x0af2c500+0x00000000) <no name available> [/opt/IBM/mqsi/9.0.0.2/jre17/lib/ppc64/compressedrefs/libj9vm26.so]
(0x0afd79a0+0x00000000) <no name available> [/opt/IBM/mqsi/9.0.0.2/jre17/lib/ppc64/compressedrefs/libj9prt26.so]
(0x0af2c560+0x00000000) <no name available> [/opt/IBM/mqsi/9.0.0.2/jre17/lib/ppc64/compressedrefs/libj9vm26.so]
(0x0af0b9c0+0x00000000) <no name available> [/opt/IBM/mqsi/9.0.0.2/jre17/lib/ppc64/compressedrefs/libj9vm26.so]
(0x0af10ba0+0x00000000) <no name available> [/opt/IBM/mqsi/9.0.0.2/jre17/lib/ppc64/compressedrefs/libj9vm26.so]
(0x081d9800+0x00000048) CallVoidMethod__7JNIEnv_FP8_jobjectP10_jmethodIDe [/opt/IBM/mqsi/9.0.0.2/lib/libimbjplg.a]
(0x0821a700+0x000002e8) evaluate__10ImbJniNodeFRC18ImbMessageAssemblyPC19ImbDataFlowTerminal [/opt/IBM/mqsi/9.0.0.2/lib/libimbjplg.a]
(0x06709d80+0x0000021c) evaluate__19ImbDataFlowTerminalFRC18ImbMessageAssembly [/opt/IBM/mqsi/9.0.0.2/lib/libMessageServices.a(libMessageServices.a.so)]
(0x06708e80+0x0000010c) propagateInner__19ImbDataFlowTerminalFRC18ImbMessageAssemblyP19ImbDataFlowTerminal [/opt/IBM/mqsi/9.0.0.2/lib/libMessageServices.a(libMessageServices.a.so)]
(0x06c3cb00+0x0000016c) propagate__19ImbDataFlowTerminalFRC18ImbMessageAssembly [/opt/IBM/mqsi/9.0.0.2/lib/libMessageServices.a(libMessageServices.a.so)]
(0x11f64a80+0x00000714) evaluate__14ImbComputeNodeFRC18ImbMessageAssemblyPC19ImbDataFlowTerminal [/opt/IBM/mqsi/9.0.0.2/lil/imbdfsql.lil]
(0x06709d80+0x0000021c) evaluate__19ImbDataFlowTerminalFRC18ImbMessageAssembly [/opt/IBM/mqsi/9.0.0.2/lib/libMessageServices.a(libMessageServices.a.so)]
(0x06708e80+0x0000010c) propagateInner__19ImbDataFlowTerminalFRC18ImbMessageAssemblyP19ImbDataFlowTerminal [/opt/IBM/mqsi/9.0.0.2/lib/libMessageServices.a(libMessageServices.a.so)]
(0x06c3cb00+0x0000016c) propagate__19ImbDataFlowTerminalFRC18ImbMessageAssembly [/opt/IBM/mqsi/9.0.0.2/lib/libMessageServices.a(libMessageServices.a.so)]
(0x0a7b4500+0x0000292c) run__18ImbCommonInputNodeFP11ImbOsThread [/opt/IBM/mqsi/9.0.0.2/lib/libMQLibrary.a(libMQLibrary.a.so)]
(0x0a7dcd80+0x00000044) run__Q2_18ImbCommonInputNode10ParametersFP11ImbOsThread [/opt/IBM/mqsi/9.0.0.2/lib/libMQLibrary.a(libMQLibrary.a.so)]
(0x05eb1500+0x0000008c) run__27ImbThreadPoolThreadFunctionFP11ImbOsThread [/opt/IBM/mqsi/9.0.0.2/lib/libCommonServices.a(libCommonServices.a.so)]
(0x05eaaa00+0x000000a8) threadRun__11ImbOsThreadFv [/opt/IBM/mqsi/9.0.0.2/lib/libCommonServices.a(libCommonServices.a.so)]
(0x05ea9e00+0x000000e0) threadBootStrap__11ImbOsThreadFPv [/opt/IBM/mqsi/9.0.0.2/lib/libCommonServices.a(libCommonServices.a.so)]
(0x00545d20+0x000000f4) _pthread_body [/usr/lib/libpthreads.a(shr_xpg5_64.o)]
(0x00000000) <invalid code address>
----------------------------------------------------------------------
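
The "six minutes" can be checked from the two epoch timestamps in the reports: the UTC Time in the MQ FDC quoted earlier and the Time of Report in the IIB abend. A minimal Java sketch; the class and method names are only illustrative, and the FDC used is just the one sample posted, not necessarily the first or last of the alerts:

```java
import java.time.Duration;
import java.time.Instant;

final class ReportGap {
    /** Elapsed time between two reports given as seconds since 1/1/1970. */
    static Duration between(long firstEpochSeconds, long secondEpochSeconds) {
        return Duration.between(Instant.ofEpochSecond(firstEpochSeconds),
                                Instant.ofEpochSecond(secondEpochSeconds));
    }

    public static void main(String[] args) {
        long mqFdc = 1476469473L;    // "UTC Time" from the MQ FDC, fraction dropped
        long iibAbend = 1476469855L; // "Time of Report (GMT)" from the IIB abend
        Duration gap = between(mqFdc, iibAbend);
        // Prints: 6 min 22 s
        System.out.println(gap.toMinutes() + " min " + (gap.getSeconds() % 60) + " s");
    }
}
```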


Could this lead to a global virtual memory crash? Does this abend indicate a memory leak? Could this abend be a result of low paging space?

Thanks in advance for your comments.
Vitor
Posted: Wed Oct 19, 2016 9:24 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

nelson wrote:
Could this lead to a global virtual memory crash? Does this abend indicate a memory leak? Could this abend be a result of low paging space?


So when you said:

nelson wrote:
We are working on MQ7.5.0.4


you didn't think it was important to add:

"we're also using a version of IIB that's not up to date with maintenance that's running Lord knows what user code that could easily have a leak in it"



It's unclear if the IIB dump is a problem of its own or a result of the MQ crash bringing the broker down in an untidy heap, but I would strongly suspect the reason MQ is reporting a shortage of memory is because the user code inside the IIB EG is hogging it, and that's the first place I'd start looking.

While planning to bring IIB up to date with maintenance.
_________________
Honesty is the best policy.
Insanity is the best defence.
mqjeff
Posted: Wed Oct 19, 2016 9:37 am

Grand Master

Joined: 25 Jun 2008
Posts: 17447

Okay. Three things.
  1. If the virtual machine is set up to share physical memory with other VMs on the same server, then you are subject to out of memory errors at any time.
  2. As my esteemed colleague said, upgrade to a supported version and fixpack.
  3. As my esteemed colleague says, make sure your flows aren't running in loops and hogging memory.

_________________
chmod -R ugo-wx /
gbaddeley
Posted: Wed Oct 19, 2016 4:19 pm

Jedi

Joined: 25 Mar 2003
Posts: 2492
Location: Melbourne, Australia

If memory and paging space are low, the OS will send SIGDANGER to processes, hoping that they will terminate and hence free up some memory.

http://www.ibm.com/support/knowledgecenter/SSPHQG_7.2.0/com.ibm.powerha.insgd/ha_install_error_sigdanger.htm
_________________
Glenn
nelson
Posted: Thu Nov 03, 2016 5:26 am

Partisan

Joined: 02 Oct 2012
Posts: 313

Hi all, thanks for your comments.

We got this from IBM, related to a Java class that had class-level variables accidentally shared across multiple instances (Java variables must be declared within the evaluate() method). We are not sure, but this could have led to a memory crash.

Quote:
Accidentally sharing memory allocated to one thread's processing across
multiple threads will be catastrophic to the DataFlowEngine process.
An attempt to do this for a flow with additional instances will cause
one thread to delete allocated memory from under another thread. This
leads to the type of abends that have been observed.
If both of the threads attempt to delete the same underlying memory
blocks then the second delete will randomly erase part of the heap as
the malloc blocks would not be meaningful on the same pointer the second
time around.

When dealing with MbMessageAssembly, MbMessage and MbElement objects
these are not self-contained java objects. Each has an underlying C++
pointer to the real message/tree object on the heap. So this is not just
a case of simply dereferencing a simple java object. One thread
overwriting the class level object will change all the underlying
C++ pointers the first thread was using.

With these points in mind it is a strong possibility that this same
condition has led to the "traceassert" javacore that the customer has
observed. At the very least with this erroneous java there is a large
unknown in the system.

In terms of the corruption of the heap due to double deletion this would only affect the DataFlowEngine process to which these java classes were deployed and had additional instances running against them.

In the past we have seen cases where objects have accidentally been shared across threads and this has led to orphaned pointers to allocated memory and as such memory leaks could occur. If the DataFlowEngine leaked a significant amount of memory then this would cause an exhaustion in the system. However, this would just be conjecture.


Kind regards.
fjb_saper
Posted: Thu Nov 03, 2016 6:05 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20696
Location: LI,NY

nelson wrote:
Hi all, thanks for your comments.

We got this from IBM, related to a Java class that had class-level variables accidentally shared across multiple instances (Java variables must be declared within the evaluate() method). We are not sure, but this could have led to a memory crash.

Kind regards.

There is nothing accidental about that. The scope of a class-level variable is clear. What you must understand is that the JCN class is not instantiated once per message, but acts more like a singleton. Each message is then a thread executing the evaluate method... This has implications for scope and data integrity when you look at variable declarations. You also need to be concerned with the thread safety of your code.
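
As an illustration only, here is the same scoping problem in plain Java with no IIB classes. `Node`, `UnsafeNode`, and `SafeNode` are made-up stand-ins for a Java compute node, and the `whileProcessing` hook is just a device to force a second thread through the same object in the middle of evaluate():

```java
final class SharedFieldDemo {
    /** Hypothetical stand-in for a node: one object, evaluate() called by many threads. */
    interface Node {
        String evaluate(String message, Runnable whileProcessing);
    }

    /** Keeps per-message state in a field, so every thread shares it. */
    static final class UnsafeNode implements Node {
        private String current; // class-level: one copy for all threads

        public String evaluate(String message, Runnable whileProcessing) {
            current = message;
            whileProcessing.run(); // another thread may overwrite 'current' here
            return "processed " + current;
        }
    }

    /** Keeps per-message state in a local variable, so each call has its own copy. */
    static final class SafeNode implements Node {
        public String evaluate(String message, Runnable whileProcessing) {
            String current = message; // local: private to this call
            whileProcessing.run();
            return "processed " + current;
        }
    }

    /** Processes message "A" while a second thread pushes message "B" through the same node. */
    static String processAWhileBArrives(Node node) {
        return node.evaluate("A", () -> {
            Thread other = new Thread(() -> node.evaluate("B", () -> { }));
            other.start();
            try {
                other.join(); // join makes the other thread's write visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    public static void main(String[] args) {
        System.out.println("unsafe: " + processAWhileBArrives(new UnsafeNode())); // processed B
        System.out.println("safe:   " + processAWhileBArrives(new SafeNode()));   // processed A
    }
}
```

In a real flow with additional instances the interleaving is random rather than forced, and per the IBM note quoted above the shared objects wrap native pointers, so the outcome can be an abend rather than just the wrong string.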
_________________
MQ & Broker admin