Author |
Message
|
roshan.171188 |
Posted: Fri Oct 11, 2013 9:10 am Post subject: MQ hung for too many open files (segfault occurred?) |
|
|
Apprentice
Joined: 07 Jun 2012 Posts: 35
|
Hi,
One of our MQ systems went into a hung state, and while looking for the root cause we noticed that the file descriptor limit had been reached on the system.
At the same time, errors were written to the MQ logs and FDC files, and a "segfault for amqrmppa" message was generated in the system logs.
Both the MQ logs and the FDC files point to the open files limit being reached, but what I am concerned about is the following:
> An xcsConnectSimplePipe error occurred in the FDC just before the segfault appeared in the system logs.
Can the segmentation fault occur because the open files limit was reached?
Also, I cannot find anything related to the ConnectSimplePipe error on Google or IBM's site.
Can anyone help please? |
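As a general illustration of what reaching the open files limit looks like from inside a process (not a statement about MQ internals, which this thread does not document), the Python sketch below lowers its own soft RLIMIT_NOFILE and opens /dev/null until the operating system refuses with EMFILE ("Too many open files"). The function name open_until_exhausted and the limit of 64 are made up for the example. In C the same condition is a -1 return from open() or socket(); whether mishandling of such a return is what led to the amqrmppa segfault is only a hypothesis.

```python
import errno
import os
import resource


def open_until_exhausted(soft_limit=64):
    """Lower the soft open-files limit, open descriptors until the OS
    refuses, restore the limit, and return the errno that was raised."""
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        soft_limit = min(soft_limit, hard)  # soft may not exceed hard
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard))
    handles = []
    try:
        while True:
            # Each successful call consumes one descriptor.
            handles.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        return exc.errno
    finally:
        # Clean up and put the original limit back.
        for fd in handles:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))


if __name__ == "__main__":
    print(errno.errorcode[open_until_exhausted()])
```

The point of the sketch is only that descriptor exhaustion surfaces as an ordinary error the caller has to handle; it does not by itself crash a process.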
|
|
 |
roshan.171188 |
Posted: Fri Oct 11, 2013 9:12 am Post subject: |
|
|
Apprentice
Joined: 07 Jun 2012 Posts: 35
|
I have a theory that an MQ channel tried to open one of the sockets (security exit) and failed because of the open files limit, causing a segfault.
Am I guessing right? If so, is there somewhere I can check to confirm it? |
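One place to look on Linux, offered as a sketch rather than an IBM-documented procedure, is /proc/<pid>/fd, which lists every descriptor a running process currently holds. The Python below groups those descriptors by kind (socket, pipe, regular file) so a pile-up of sockets or pipes in amqrmppa would stand out. The helper names are hypothetical, and reading another user's /proc/<pid>/fd normally requires running as that user or as root.

```python
import os
from collections import Counter


def fd_kind(target):
    """Classify a /proc/<pid>/fd symlink target."""
    if target.startswith("socket:"):
        return "socket"
    if target.startswith("pipe:"):
        return "pipe"
    if target.startswith("anon_inode:"):
        return "anon_inode"
    return "file"


def summarize_fds(targets):
    """Count descriptors per kind from a list of symlink targets."""
    return Counter(fd_kind(t) for t in targets)


def read_fd_targets(pid):
    """Return the symlink targets of all descriptors open in process pid."""
    fd_dir = f"/proc/{pid}/fd"
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets


if __name__ == "__main__" and os.path.isdir("/proc/self/fd"):
    pid = os.getpid()  # assumption: replace with the amqrmppa PID
    print(summarize_fds(read_fd_targets(pid)))
```

Comparing the total against the limit in force for that process shows how close it is to exhaustion.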
|
|
 |
JosephGramig |
Posted: Fri Oct 11, 2013 12:15 pm Post subject: |
|
|
 Grand Master
Joined: 09 Feb 2006 Posts: 1244 Location: Gold Coast of Florida, USA
|
With Problem Determination (PD), start with FDCs. Get the ProbeID and search the IBM site for hits. Read those and see if the answer presents itself. Next, ensure no file systems are full. Then check the Qmgr error logs.
This could be a tuning issue, but it is hard to say without more homework on your side. |
|
|
 |
fjb_saper |
Posted: Fri Oct 11, 2013 1:34 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
If running on Unix or Linux, did you set the (u)limits as suggested in the platform-specific installation manual?  _________________ MQ & Broker admin |
|
|
 |
bruce2359 |
Posted: Fri Oct 11, 2013 2:28 pm Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
|
|
 |
roshan.171188 |
Posted: Sat Oct 12, 2013 7:01 pm Post subject: |
|
|
Apprentice
Joined: 07 Jun 2012 Posts: 35
|
Guys, yes, the kernel settings were configured per IBM's earlier requirement (open files = 32768), which needs to be changed to 524288 per the new requirement.
But I have a different concern.
The syslog shows a segfault in the amqrmppa process multiple times, and this HAS to be linked to the open files limit being reached, but I cannot find a relation between the two other than the coincidences (same time, and the same process producing the errors in the FDC, the MQ logs and the syslog segfault).
It would also be helpful if someone could tell me how the segfault occurred without a core file being generated. Does an MQ process not create core dumps when it segfaults? |
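One general Linux reason for a segfault leaving no core file is a core size limit of 0 on the process; whether that applies to this system is not established in the thread. The Python sketch below reads RLIMIT_CORE for the process it runs in, so it only reflects the environment of whoever runs it. For an amqrmppa that is already running, the equivalent value is the "Max core file size" row of /proc/<pid>/limits. The function name core_files_allowed is hypothetical.

```python
import resource


def core_files_allowed(soft_limit):
    """Return True when a soft RLIMIT_CORE value permits writing a core file."""
    # A soft limit of 0 bytes suppresses core files entirely;
    # resource.RLIM_INFINITY means the size is not capped.
    return soft_limit != 0


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    print("soft:", soft, "hard:", hard,
          "core files allowed:", core_files_allowed(soft))
```

A non-zero limit is necessary but not sufficient: the kernel's core_pattern setting and write permission on the target directory also decide whether a core file appears.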
|
|
 |
bruce2359 |
Posted: Sun Oct 13, 2013 5:15 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
What version/release/mod of WMQ?
What o/s? _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
roshan.171188 |
Posted: Sun Oct 13, 2013 5:52 am Post subject: |
|
|
Apprentice
Joined: 07 Jun 2012 Posts: 35
|
version - 7.0.1.6
Linux 64 bit. |
|
|
 |
bruce2359 |
Posted: Sun Oct 13, 2013 6:18 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
roshan.171188 wrote: |
version - 7.0.1.6
Linux 64 bit. |
Which Linux? What version/release/mod? |
|
|
 |
bruce2359 |
Posted: Sun Oct 13, 2013 6:22 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
roshan.171188 wrote: |
Guys, yes, the kernel settings were configured per IBM's earlier requirement (open files = 32768), which needs to be changed to 524288 per the new requirement. |
Whose new requirement? Don't say IBM; rather, cite a document.
Have you made the change? Does WMQ still fail? |
|
|
 |
JosephGramig |
Posted: Mon Oct 14, 2013 11:42 am Post subject: |
|
|
 Grand Master
Joined: 09 Feb 2006 Posts: 1244 Location: Gold Coast of Florida, USA
|
Did you search for the mqconfig script?
Run it as root and make corrections until everything gets a PASS.
Run it as mqm and make corrections until everything gets a PASS. |
|
|
 |
shashivarungupta |
Posted: Wed Oct 16, 2013 6:26 am Post subject: |
|
|
 Grand Master
Joined: 24 Feb 2009 Posts: 1343 Location: Floating in space on a round rock.
|
What is the significance of segments, sockets and ConnectSimplePipe, and in which order does MQ use them internally at the kernel level? _________________ *Life will beat you down, you need to decide to fight back or leave it. |
|
|
 |
roshan.171188 |
Posted: Wed Oct 16, 2013 6:27 am Post subject: |
|
|
Apprentice
Joined: 07 Jun 2012 Posts: 35
|
Yes! I ran the mqconfig script, which shows that open files for mqm is 2048; however, after logging in as mqm and running "ulimit -n", it shows 10240. What's wrong?
Also, I agree that file descriptors are the problem; however, I still do not understand why the segmentation fault occurred. Should it not be handled by MQ internally rather than throwing a system fault? |
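A possible explanation for the 2048 versus 10240 mismatch, offered as an assumption rather than a diagnosis: "ulimit -n" reports the soft limit of the shell it is typed into, while a process that is already running keeps whatever limits it inherited when it started. On Linux the limits actually in force for a running process are listed in /proc/<pid>/limits. The Python sketch below extracts the "Max open files" row from that file; the function names are hypothetical.

```python
import os


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of
    /proc/<pid>/limits text. Values are returned as strings."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row layout: Max open files <soft> <hard> files
            soft, hard = line.split()[3:5]
            return soft, hard
    raise LookupError("no 'Max open files' row found")


def max_open_files(pid):
    """Read the open-files limits in force for a running process."""
    with open(f"/proc/{pid}/limits") as handle:
        return parse_max_open_files(handle.read())


if __name__ == "__main__" and os.path.exists("/proc/self/limits"):
    pid = os.getpid()  # assumption: replace with the amqrmppa PID
    print(max_open_files(pid))
```

If the value for the running queue manager processes differs from what a fresh mqm login reports, the queue manager was probably started from an environment with different limits and would need a restart to pick up new ones.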
|
|
 |
bruce2359 |
Posted: Wed Oct 16, 2013 6:45 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
roshan.171188 wrote: |
... however, I still do not understand why the segmentation fault occurred. Should it not be handled by MQ internally rather than throwing a system fault? |
I would not expect WMQ internal code to be able to take corrective action based on a hardware error. |
|
|
 |
shashivarungupta |
Posted: Wed Oct 16, 2013 7:04 am Post subject: |
|
|
 Grand Master
Joined: 24 Feb 2009 Posts: 1343 Location: Floating in space on a round rock.
|
bruce2359 wrote: |
roshan.171188 wrote: |
... however, I still do not understand why the segmentation fault occurred. Should it not be handled by MQ internally rather than throwing a system fault? |
I would not expect WMQ internal code to be able to take corrective action based on a hardware error. |
Hardware error!
I agree that MQ won't take corrective action on kernel parameters; rather, MQ reacts to the values set in the kernel (as happened in this case).
Nor would the kernel take corrective action on a hardware error or issue.
The kernel would simply deal with the hardware based on its internal functions and values. |
|
|
 |
|