MQSeries.net :: View topic - < Solved (sort of) >2059 and lots of FDC's

< Solved (sort of) >2059 and lots of FDC's

nosnhoj

Posted: Wed Sep 07, 2005 4:47 pm Post subject: < Solved (sort of) >2059 and lots of FDC's

Apprentice

Joined: 07 Sep 2005
Posts: 40
Location: Markham On.

Strange problem... The client works fine for a while, with many connects and disconnects, then it gets a 2059 error, and then all hell breaks loose. The entries netstat shows for the port jump like crazy, and FDC errors are generated every minute or so (usually all referring to channel terminated... RM something in the probe, but according to the IBM site it is 'normal').

Network people say they see the server end the connection (FIN) but the queue manager and listener always remain up and available - MQJexplorer stays connected until we run out of channels.

The only change I can think of is that we recently switched from inetd to runmqlsr (MQ 5.3 CSD6 on HP-UX 11). It had worked fine for a year before this.

Any ideas? Should we switch back to inetd? Why the 2059s?

Last edited by nosnhoj on Thu Sep 08, 2005 9:40 am; edited 1 time in total

hopsala

Posted: Wed Sep 07, 2005 5:08 pm Post subject:

Guardian

Joined: 24 Sep 2004
Posts: 960

First of all, there is no need to switch back to inetd; runmqlsr should work just fine. If this is a production system, however, I might switch back to buy time to investigate without users shouting over my shoulder.

Concerning the actual error, you haven't given us enough info to help you. Please post the AMQERR files from /errors/ and /qmgrs/errors, and the relevant FDC sections. (Before posting, do a little research: try to see what the first, original error is, not the other error msgs it spawned. Mind times and dates in doing so.)
Off the top of my head, I'd suggest restarting the server and installing CSD11 (I remember there were some fixes concerning runmqlsr on Unix platforms).

Btw, you stated this prob simply "happens" - does it ever stop "happening"? I.e., after a while of receiving those 2059s, do the client channels return to work and everything goes back to normal?

nosnhoj

Posted: Thu Sep 08, 2005 5:51 am Post subject:

Apprentice

Joined: 07 Sep 2005
Posts: 40
Location: Markham On.

Seeing this a lot:

09/08/05 07:33:38
AMQ9604: Channel 'prod1.prod2' terminated unexpectedly

EXPLANATION:
The process or thread executing channel 'prod1.prod2' is no longer running.
The check process system call returned 545284357 for process 24520.
ACTION:
No immediate action is required because the channel entry has been removed from
the list of running channels. Inform the system administrator who should
examine the operating system procedures to determine why the channel process
has terminated.

And the only way to 'get the system back' is to kill all processes (endmqm does not work).

jefflowrey

Posted: Thu Sep 08, 2005 6:01 am Post subject:

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

Stop all processes.

Clear all FDCs.

I think you can clear AMQERR* files as well. Try leaving them there, but empty, first.

Restart the qmgr.

Make note of the first time the problem shows up, and look at the *first* entry in the log and the chronologically first FDC.
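For anyone following along, here is a minimal sketch of one way to list FDC files oldest first. It assumes the files end in `.FDC` and sit in a single errors directory; the function name is hypothetical. Modification time is only a proxy, since an FDC file can be appended to, so the date and time in the header of each record inside the file remain the real authority.

```python
from pathlib import Path


def fdc_files_oldest_first(errors_dir):
    """Return the *.FDC files in errors_dir sorted by modification time, oldest first."""
    files = [p for p in Path(errors_dir).glob("*.FDC") if p.is_file()]
    # st_mtime is the last write to the file, not necessarily the first failure in it.
    return sorted(files, key=lambda p: p.stat().st_mtime)
```

Calling it with the errors directory (commonly `/var/mqm/errors` on Unix installs, though that path is an assumption about this system) gives the order in which to start reading.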
_________________
I am *not* the model of the modern major general.

Nigelg

Posted: Thu Sep 08, 2005 6:29 am Post subject:

Grand Master

Joined: 02 Aug 2004
Posts: 1046

The AMQ9604 error log msgs and the RM487001 FDCs are reporting the same thing. The cause is that some process, typically runmqsc or amqpcsea, reading the internal channel status table has found that the process ID for a channel which has a status of RUNNING is no longer present. The FDC is then produced and the msg written to the error logs.
The root cause is that the process running these channels (amqrmppa) has crashed on one of its threads, bringing down the rest of the channels running on the other threads.
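To illustrate the kind of check being described (this is not WMQ's internal code, just a sketch of the general POSIX technique), a process can test whether another PID still exists by sending it signal 0. The `status_table` mapping and both function names are hypothetical.

```python
import os


def process_exists(pid):
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing; it only performs the existence check
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stale_channels(status_table):
    """Given {channel_name: pid} for channels marked RUNNING, return names whose process is gone."""
    return [name for name, pid in status_table.items() if not process_exists(pid)]
```

A channel that shows up in the result of `stale_channels` corresponds to the situation AMQ9604 reports: the table says RUNNING, but the process is no longer there.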

Are there any FDCs other than RM487001, e.g. XC130003, from a channel process?
Is it possible that some action external to WMQ is being done to kill the channel processes?
_________________
MQSeries.net helps those who help themselves..

nosnhoj

Posted: Thu Sep 08, 2005 7:00 am Post subject:

Apprentice

Joined: 07 Sep 2005
Posts: 40
Location: Markham On.

nosnhoj

Posted: Thu Sep 08, 2005 7:43 am Post subject:

Apprentice

Joined: 07 Sep 2005
Posts: 40
Location: Markham On.

Just switched back to inetd.conf instead of runmqlsr and all seems to be ok - been 7 minutes without a failure so far!!!!

Nigelg

Posted: Thu Sep 08, 2005 7:48 am Post subject:

Grand Master

Joined: 02 Aug 2004
Posts: 1046

This is in the first FDC...

Quote:

Comment1 :- '24 - Too many open files' from socket.

Looks like some system parameter is too low, perhaps number of open files allowed per process?

This affects runmqlsr/amqrmppa because lots of channels run as threads in the same process, but has no effect on inetd because each channel runs in a separate process.
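A rough back-of-the-envelope sketch of that difference, assuming each channel needs one socket descriptor and each process has some fixed number of other descriptors open (logs, shared files and so on). The function names and all numbers are hypothetical, not measured values from this system.

```python
def max_channels_threaded(fd_limit, overhead_fds):
    """Channels one multi-threaded process can hold if each channel needs one socket."""
    return max(fd_limit - overhead_fds, 0)


def fds_per_process_inetd(overhead_fds):
    """Descriptors used by each per-channel process: its overhead plus one socket."""
    return overhead_fds + 1


# Hypothetical figures: a per-process limit of 64 and 10 descriptors of overhead.
print(max_channels_threaded(64, 10))  # channels before socket() fails with errno 24
print(fds_per_process_inetd(10))      # far below the limit, however many channels run
```

Under those assumptions the threaded process runs out of descriptors once enough channels share it, while each inetd-spawned process stays well under the same limit regardless of the total channel count.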
_________________
MQSeries.net helps those who help themselves..

nosnhoj

Posted: Thu Sep 08, 2005 9:41 am Post subject:

Apprentice

Joined: 07 Sep 2005
Posts: 40
Location: Markham On.

Looks like it was runmqlsr... switching back to inetd seems to have solved it. Now just to figure out what happened and why.... will update this if anyone wants to know.

Thanks for the help!

PeterPotkay

Posted: Thu Sep 08, 2005 3:04 pm Post subject:

Poobah

Joined: 15 May 2001
Posts: 7723

jefflowrey wrote:

I think you can clear AMQERR* files as well. Try leaving them there, but empty, first.

Yes, you can do this. If MQ doesn't find an AMQERR* file, it creates it.
_________________
Peter Potkay
Keep Calm and MQ On

Cliff

Posted: Fri Sep 09, 2005 5:01 am Post subject:

Centurion

Joined: 27 Jun 2001
Posts: 145
Location: Wiltshire

Assuming HP-UX works like Solaris, your problem could be the soft limit for file descriptors being reached. Runmqlsr is a fully multi-threaded program. From the Solaris Quick Beginnings:

When running a multi-threaded process, you might reach the soft limit for file descriptors. This gives you the WebSphere MQ reason code MQRC_UNEXPECTED_ERROR (2195) and, if there are enough file descriptors, a WebSphere MQ FFST(TM) file.

So it's probably worth checking the equivalent value on HP-UX. Just a shot in the dark .....
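One way to look at the soft and hard descriptor limits a process actually gets is `ulimit -n` in the shell that starts the listener. A small sketch using Python's standard `resource` module is below, assuming Python is available on the box (an assumption for HP-UX). It reports the limits of the process that runs it, which a listener started from the same environment would normally inherit. The helper names are hypothetical.

```python
import resource


def fd_limits():
    """Return (soft, hard) limits on open file descriptors for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


soft, hard = fd_limits()
print(f"soft limit: {describe(soft)}, hard limit: {describe(hard)}")
```

If the soft limit is low compared with the number of client channels expected in one process, that would be consistent with the 'Too many open files' comment in the FDC.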

Good luck -
Cliff
