< Solved (sort of) >2059 and lots of FDC's
nosnhoj
Posted: Wed Sep 07, 2005 4:47 pm    Post subject: < Solved (sort of) >2059 and lots of FDC's

Apprentice

Joined: 07 Sep 2005
Posts: 40
Location: Markham On.

Strange problem... The client works fine for a while, with many connects and disconnects. Then they get a 2059 error and all hell breaks loose: the processes in netstat for the port jump like crazy, and an FDC is generated every minute or so (usually all referring to a terminated channel, with RM something in the probe, but according to the IBM site it is 'normal').

Network people say they see the server end the connection (FIN) but the queue manager and listener always remain up and available - MQJexplorer stays connected until we run out of channels.

The only change I can think of is that we recently switched from inetd to runmqlsr (MQ 5.3 CSD6 on HP-UX 11). This had worked fine for a year before that.

Any ideas? Should we switch back to inetd? Why the 2059s?


Last edited by nosnhoj on Thu Sep 08, 2005 9:40 am; edited 1 time in total
hopsala
Posted: Wed Sep 07, 2005 5:08 pm

Guardian

Joined: 24 Sep 2004
Posts: 960

First of all, there is no need to switch back to inetd; runmqlsr should work just fine. If this is a production system, however, I might switch back anyway, to buy time to investigate without users shouting over my shoulder.

Concerning the actual error, you haven't given us enough information to help you. Please post the /errors/ and /qmgrs/errors AMQERR files and the relevant FDC sections. (Before posting, do a little research: try to find the first, original error rather than the errors it spawned, and mind the times and dates in doing so.)
Off the top of my head, I'd suggest restarting the server and installing CSD11 (I remember there were some fixes concerning runmqlsr on Unix platforms).

Btw, you said this problem simply "happens". Does it ever stop "happening"? I.e., after a while of receiving those 2059s, do the client channels return to work and does everything go back to normal?
nosnhoj
Posted: Thu Sep 08, 2005 5:51 am

Apprentice

Joined: 07 Sep 2005
Posts: 40
Location: Markham On.

Seeing this a lot:

09/08/05 07:33:38
AMQ9604: Channel 'prod1.prod2' terminated unexpectedly

EXPLANATION:
The process or thread executing channel 'prod1.prod2' is no longer running.
The check process system call returned 545284357 for process 24520.
ACTION:
No immediate action is required because the channel entry has been removed from
the list of running channels. Inform the system administrator who should
examine the operating system procedures to determine why the channel process
has terminated.

And the only way to 'get the system back' is to kill all the processes (endmqm does not work).
jefflowrey
Posted: Thu Sep 08, 2005 6:01 am

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

Stop all processes.

Clear all FDCs.

I think you can clear the AMQERR* files as well. Try leaving them there, but empty, first.

Restart the qmgr.

Make note of the first time the problem shows up, and look at the *first* entry in the log and the chronologically first FDC.
_________________
I am *not* the model of the modern major general.
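
For anyone who ends up with a pile of FDCs and wants the chronologically first one, here is a minimal sketch in Python. It assumes the header layout shown in the FDCs posted in this thread ("| Key :- value |" lines and a Date/Time field like "Thursday September 08 10:52:51 EDT 2005"). The function names are made up for illustration, and the time zone name is simply ignored, so it only makes sense for FDCs from one host.

```python
import re
from datetime import datetime

# Matches header lines such as "| Probe Id :- XY076002 |".
FIELD = re.compile(r"^\|\s*(?P<key>[^|]+?)\s*:-\s*(?P<value>.*?)\s*\|?\s*$")


def parse_fdc_header(text):
    """Return the 'Key :- value' fields of one FDC header as a dict."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if match:
            # Keep the first occurrence of each key.
            fields.setdefault(match.group("key"), match.group("value"))
    return fields


def fdc_time(fields):
    """Parse the Date/Time field, ignoring the weekday and the zone name."""
    _weekday, month, day, clock, _zone, year = fields["Date/Time"].split()
    return datetime.strptime(f"{month} {day} {clock} {year}", "%B %d %H:%M:%S %Y")


def earliest_fdc(headers):
    """Pick the chronologically first FDC from a list of header texts."""
    return min((parse_fdc_header(h) for h in headers), key=fdc_time)
```

Feed it the header text of each FDC and look at the "Probe Id" and "Comment1" fields of the result before reading anything generated later.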
Nigelg
Posted: Thu Sep 08, 2005 6:29 am

Grand Master

Joined: 02 Aug 2004
Posts: 1046

The AMQ9604 error log msgs and the RM487001 FDCs are reporting the same thing. The cause is that some process, typically runmqsc or amqpcsea, reading the internal channel status table has found that the process ID for a channel which has a status of RUNNING is no longer present. The FDC is then produced and the msg written to the error logs.
The root cause is that the process running these channels (amqrmppa) has crashed on one of its threads, bringing down the rest of the channels running on the other threads.

Are there any FDCs other than RM487001, e.g. XC130003, from a channel process?
Is it possible that some action external to WMQ is being done to kill the channel processes?
_________________
MQSeries.net helps those who help themselves..
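
To illustrate the kind of check described above (a status entry says RUNNING but the owning process is gone), here is a rough sketch in Python. This is not how WMQ implements it; the status table layout and the function names are assumptions for illustration only, and the signal 0 trick is POSIX only.

```python
import os


def process_exists(pid):
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only performs error checking; nothing is sent
    except ProcessLookupError:
        return False     # ESRCH: no such process
    except PermissionError:
        return True      # EPERM: the process exists but belongs to another user
    return True


def stale_channels(status_table):
    """Return names of channels marked RUNNING whose owning PID is gone.

    status_table is a hypothetical list of (channel_name, status, pid) tuples.
    """
    return [name for name, status, pid in status_table
            if status == "RUNNING" and not process_exists(pid)]
```

Every channel running as a thread of the same process shares one PID, so a single crashed process would show up as many stale entries at once.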
nosnhoj
Posted: Thu Sep 08, 2005 7:00 am

Apprentice

Joined: 07 Sep 2005
Posts: 40
Location: Markham On.

Here is the first FDC generated:

| WebSphere MQ First Failure Symptom Report |
| ========================================= |
| |
| Date/Time :- Thursday September 08 10:52:51 EDT 2005 |
| Host Name :- dc2c5s (HP-UX B.11.11) |
| PIDS :- 5724B4102 |
| LVLS :- 530.6 CSD06 |
| Product Long Name :- WebSphere MQ for HP-UX |
| Vendor :- IBM |
| Probe Id :- XY076002 |
| Application Name :- MQM |
| Component :- xllRecoverSocketEvent |
| Build Date :- Feb 11 2004 |
| CMVC level :- p530-06-L040211 |
| Build Type :- IKAP - (Production) |
| UserID :- 00000201 (mqm) |
| Program Name :- amqrmppa_nd |
| Process :- 00028464 |
| Thread :- 00000031 |
| QueueManager :- QMGR!MW2 |
| Major Errorcode :- xecF_E_UNEXPECTED_SYSTEM_RC |
| Minor Errorcode :- OK |
| Probe Type :- MSGAMQ6119 |
| Probe Severity :- 2 |
| Probe Description :- WebSphere MQ was unable to open a message catalog to |
| display an error message for message id hexadecimal %6, with inserts %1, |
| %2, %3, %4, and %5. |
| FDCSequenceNumber :- 0 |
| Arith1 :- 24 18 |
| Comment1 :- '24 - Too many open files' from socket. |
| |
| |
+-----------------------------------------------------------------------------+


This is the second:

+-----------------------------------------------------------------------------+
| |
| WebSphere MQ First Failure Symptom Report |
| ========================================= |
| |
| Date/Time :- Thursday September 08 10:55:37 EDT 2005 |
| Host Name :- dc2c5s (HP-UX B.11.11) |
| PIDS :- 5724B4102 |
| LVLS :- 530.6 CSD06 |
| Product Long Name :- WebSphere MQ for HP-UX |
| Vendor :- IBM |
| Probe Id :- XC015001 |
| Application Name :- MQM |
| Component :- xcsFreeQuickCell |
| Build Date :- Feb 11 2004 |
| CMVC level :- p530-06-L040211 |
| Build Type :- IKAP - (Production) |
| UserID :- 00000201 (mqm) |
| Program Name :- amqrmppa_nd |
| Process :- 00007731 |
| Thread :- 00000010 |
| QueueManager :- QMGR!MW2 |
| Major Errorcode :- xecS_E_BLOCK_ALREADY_FREE |
| Minor Errorcode :- OK |
| Probe Type :- INCORROUT |
| Probe Severity :- 2 |
| Probe Description :- AMQ6125: An internal WebSphere MQ error has occurred. |
| FDCSequenceNumber :- 0 |
| |
+-----------------------------------------------------------------------------+


Then they just continue... The error in the qmgr error log appears to be a 2059. The listeners are running, and I can connect remotely to the queue manager. They get 2059s, but we still get some connections. Very strange.
nosnhoj
Posted: Thu Sep 08, 2005 7:43 am

Apprentice

Joined: 07 Sep 2005
Posts: 40
Location: Markham On.

Just switched back to inetd.conf instead of runmqlsr and all seems to be ok - been 7 minutes without a failure so far!!!!
Nigelg
Posted: Thu Sep 08, 2005 7:48 am

Grand Master

Joined: 02 Aug 2004
Posts: 1046

This is in the first FDC...

Quote:
Comment1 :- '24 - Too many open files' from socket.


Looks like some system parameter is too low, perhaps number of open files allowed per process?

This affects runmqlsr/amqrmppa because lots of channels run as threads in the same process, but it has no effect on inetd because each channel runs in a separate process.
_________________
MQSeries.net helps those who help themselves..
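
The per-process nature of the limit is easy to reproduce outside MQ. The sketch below, in Python, lowers the soft descriptor limit of its own process and opens sockets until the OS refuses; the error is EMFILE, the same errno 24 "Too many open files" quoted in the FDC. The function name and the limit of 64 are arbitrary choices for illustration, and it assumes a Unix system whose hard limit is at least 64.

```python
import errno
import resource
import socket


def sockets_until_emfile(soft_limit=64):
    """Lower this process's soft descriptor limit, then open sockets until
    the OS refuses. Returns (sockets_opened, error_number)."""
    old_soft, old_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, old_hard))
    opened = []
    try:
        while True:
            opened.append(socket.socket())  # each socket uses one descriptor
    except OSError as exc:
        return len(opened), exc.errno
    finally:
        for sock in opened:
            sock.close()
        # Restore the original limits so the caller is unaffected.
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, old_hard))


if __name__ == "__main__":
    count, err = sockets_until_emfile()
    print(count, errno.errorcode[err])
```

All threads in a process draw on the same descriptor table, so hundreds of channel threads can exhaust a limit that a one-channel-per-process setup never comes near.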
nosnhoj
Posted: Thu Sep 08, 2005 9:41 am

Apprentice

Joined: 07 Sep 2005
Posts: 40
Location: Markham On.

Looks like it was runmqlsr... switching back to inetd seems to have solved it. Now just to figure out what happened and why.... will update this if anyone wants to know.

Thanks for the help!
PeterPotkay
Posted: Thu Sep 08, 2005 3:04 pm

Poobah

Joined: 15 May 2001
Posts: 7722

jefflowrey wrote:

I think you can clear AMQERR* files as well. Try leaving them there, but empty first.


Yes, you can do this. If MQ doesn't find an AMQERR* file, it creates it.
_________________
Peter Potkay
Keep Calm and MQ On
Cliff
Posted: Fri Sep 09, 2005 5:01 am

Centurion

Joined: 27 Jun 2001
Posts: 145
Location: Wiltshire

Assuming HP-UX works like Solaris, your problem could be the soft limit for file descriptors being reached. Runmqlsr is a fully multi-threaded program. From the Solaris Quick Beginnings:

When running a multi-threaded process, you might reach the soft limit for file descriptors. This gives you the WebSphere MQ reason code MQRC_UNEXPECTED_ERROR (2195) and, if there are enough file descriptors, a WebSphere MQ FFST(TM) file.

So it's probably worth checking the equivalent value on HP-UX. Just a shot in the dark .....

Good luck -
Cliff
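
As a rough sketch of what "checking the soft limit" means on a Unix system, the Python below reads the soft and hard open-file limits of its own process and raises the soft one as far as the hard limit allows. It only affects the process that runs it and its children, so it is not a fix for a running listener; the function name is made up, and where the limits come from on HP-UX is not covered here.

```python
import resource


def raise_nofile_soft_limit(wanted):
    """Raise the soft open-file limit toward `wanted`, capped by the hard
    limit. Returns the (soft, hard) pair in effect afterwards."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)  # an unprivileged process cannot exceed hard
    if soft == resource.RLIM_INFINITY or wanted <= soft:
        return soft, hard           # already high enough; never lower it
    resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print("before:", resource.getrlimit(resource.RLIMIT_NOFILE))
    print("after: ", raise_nofile_soft_limit(4096))
```

If the soft value printed is in the low hundreds or less, a multi-threaded listener carrying many client channels could plausibly hit it.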