AMQRMPPA in AIX
GFORCE
Posted: Fri Feb 15, 2008 9:34 am    Post subject: AMQRMPPA in AIX

Voyager

Joined: 16 Jun 2003
Posts: 78
Location: WISCONSIN

We recycle our test AIX box weekly, and every time I have to log on and kill the AMQRMPPA process to recycle MQ. Is there any way around this besides killing the process?
_________________
THANKS
PeterPotkay
Posted: Fri Feb 15, 2008 10:13 am

Poobah

Joined: 15 May 2001
Posts: 7716

I have the same problem on my Linux x86 32-bit QMs. It happened at MQ 6.0.1.0, 6.0.2.0 and 6.0.2.1. I'm going to 6.0.2.3 soon, hoping the problem goes away. It's annoying. It doesn't happen every time. Sometimes if I wait 5-10 minutes the processes eventually stop, but usually when you are restarting a QM you don't have time to sit there and wait who knows how long.

Turning trace on makes the problem go away.
_________________
Peter Potkay
Keep Calm and MQ On
PeterPotkay
Posted: Mon Feb 25, 2008 6:05 pm

Poobah

Joined: 15 May 2001
Posts: 7716

Anyone else got this problem? Our MQ shutdown scripts run endmqlsr first, then endmqm -i. Yet if the QM has more than a few running client channels (i.e. there is more than one amqrmppa process running), more often than not we have to kill those amqrmppa processes. Even when they do go down on their own it takes 10-15 minutes. As I said before, running trace seems to make the problem go away. I just upgraded to 6.0.2.3 and the problem is still there.


_________________
Peter Potkay
Keep Calm and MQ On
GFORCE
Posted: Mon Mar 03, 2008 8:36 am    Post subject: AMQRMPPA in AIX

Voyager

Joined: 16 Jun 2003
Posts: 78
Location: WISCONSIN

I set the TCP keepalive parameter in qm.ini and it appears to work also. I am still trying several options and I will try the trace option as you stated, but I have to go through our change control with every change.
_________________
THANKS
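
The post above does not show the setting itself. In qm.ini, TCP keepalive is normally switched on with KeepAlive=Yes in the TCP stanza, which asks the operating system to probe idle channel sockets so that dead client connections get noticed. As a rough illustration of the underlying socket option only (this is not MQ's own code, and enable_keepalive is a made-up helper name), a minimal Python sketch:

```python
import socket


def enable_keepalive(sock):
    """Turn on SO_KEEPALIVE for a socket and report whether it took effect.

    Hypothetical helper for illustration; the probe interval itself is an
    operating-system tunable and is not set here.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Read the option back: any non-zero value means keepalive is enabled.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("keepalive enabled:", enable_keepalive(s))
    s.close()
```

How quickly a dead peer is detected still depends on the OS keepalive timers, so this may only shorten the wait rather than remove it.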
PeterPotkay
Posted: Tue Mar 11, 2008 9:28 am

Poobah

Joined: 15 May 2001
Posts: 7716

IBM identified a problem and will be creating an interim fix. Tracing made the problem go away each time, so they gave us a command to run every few seconds while the QM was taking forever to come down; it produced an FDC each time. That finally highlighted the problem.

Quote:

Based on the supplied data, we have narrowed down the cause of the
endmqm delay to two parts of the code which stops channels. We are doing
some further testing in order to better understand the delay and hope to
produce an interim fix later this week. I expect to report back with
more information tomorrow.
************
Update on 11th March:
Further to yesterday's update, the FDCs supplied again showed that it
was the ending of channels which delayed the endmqm process. In
particular, channel process 16634 kept running for a very long time
after the queue manager was asked to end. The endmqm process was waiting
for channel process 16634 to end before it could finish.
.
When we look at what the FDCs showed for 16634, it seems that there were
a number of channel threads (e.g. threads 82, 85, 92 and 94) still
active inside it. Since these threads had not finished processing, the
process had not ended.
.
The FFST shows what these last threads had been doing as the queue
manager ended. We see that most threads had noticed that the queue
manager had ended, but then carried on regardless. For example, here is
an excerpt from thread 94's history:
.
----} zstMQGET rc=lrcE_Q_MGR_STOPPING
---} MQGET rc=lrcE_Q_MGR_STOPPING
...
---{ MQCLOSE
----{ zstMQCLOSE
-----{ zstVerifyPCD
-----} zstVerifyPCD rc=OK
-----{ zutCallApiExitsBeforeClose
------{ APIExit
-------{ MQGET
--------{ zstMQGET
---------{ zstVerifyPCD
---------} zstVerifyPCD rc=OK
---------{ ziiBreakConnection
---------} ziiBreakConnection rc=OK
--------} zstMQGET rc=lrcE_CONNECTION_BROKEN
-------} MQGET rc=lrcE_CONNECTION_BROKEN
------} APIExit rc=OK
-----} zutCallApiExitsBeforeClose rc=OK
-----{ zutCallApiExitsAfterClose
------{ APIExit
------} APIExit rc=lrcE_CONNECTION_BROKEN
-----} zutCallApiExitsAfterClose rc=OK
-----{ ziiBreakConnection
-----} ziiBreakConnection rc=OK
----} zstMQCLOSE rc=lrcE_CONNECTION_BROKEN
---} MQCLOSE rc=lrcE_CONNECTION_BROKEN
...
---{ ccxReceive
----{ cciTcpReceive
-----{ ccxAllocMem
-----} ccxAllocMem rc=OK
-----{ recv
-----} recv rc=Unknown(FFFF)
-----{ xcsWaitFd
------{ poll
------} poll rc=Unknown(1)
-----} xcsWaitFd rc=Unknown(1)
-----{ recv
-----} recv rc=Unknown(FFFF)
-----{ xcsWaitFd
------{ poll
------} poll rc=Unknown(1)
.
Despite knowing that the queue manager is ending and that its own
connection to the queue manager has been broken, the thread continued to
run and poll its network socket for more MQI calls from the client.
However, even if such a call arrived there would be nothing useful that
the channel could do with it because its connection has gone. So the
thread should really have ended at that point. It is only after multiple
failed poll() calls that the channel threads finally time out and end,
which allows endmqm processing to complete.
.
We should point out that client applications should specify the
appropriate FAIL_IF_QUIESCING option on all of their MQI calls in order
to speed up endmqm processing. The trace supplied on 3rd March shows
some clients which are not using the "fail if quiescing" option.
However, I believe that endmqm -i should still end the queue manager
within a reasonable time regardless of the MQI options. For this reason,
I think the queue manager should try harder to end client channels than
it currently does.
.
Based on the sequence of events in the FFSTs, it is clear that all of
the threads which failed to end had received MQRC_Q_MGR_STOPPING and
MQRC_CONNECTION_BROKEN as early as 08:12:39. Had they detected this fact
they would have ended much sooner, instead of hanging around until
18:18:55 when endmqm finally finished.
.
We are building a test fix which adds extra checking to the server
(SVRCONN) end of the channel in order to better handle shutdown in cases
where MQI calls report that the queue manager is ending. I will also
include additional FFST diagnostics in the code so as to produce better
SIGUSR2 FDC files in cases of future delays.

_________________
Peter Potkay
Keep Calm and MQ On
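
The behaviour IBM describes, a channel thread that keeps polling its socket after it has already seen MQRC_Q_MGR_STOPPING or MQRC_CONNECTION_BROKEN, can be mimicked with a toy loop. The Python sketch below is only an illustration of the idea behind the fix, not IBM's code: serve_client, handle_request, and the stop_on_fatal switch are hypothetical names, and the idle-poll limit stands in for the long timeout seen in the FDCs.

```python
import select
import socket

Q_MGR_STOPPING = "MQRC_Q_MGR_STOPPING"
CONNECTION_BROKEN = "MQRC_CONNECTION_BROKEN"
FATAL_REASONS = {Q_MGR_STOPPING, CONNECTION_BROKEN}


def serve_client(sock, handle_request, poll_timeout=0.05,
                 max_idle_polls=5, stop_on_fatal=True):
    """Toy channel-thread loop; returns how many polls it made before ending.

    With stop_on_fatal=False the loop imitates the reported bug: it keeps
    polling until it has timed out max_idle_polls times in a row.
    """
    polls = 0
    idle = 0
    while idle < max_idle_polls:
        polls += 1
        readable, _, _ = select.select([sock], [], [], poll_timeout)
        if not readable:
            idle += 1          # nothing arrived, count one failed poll
            continue
        idle = 0
        data = sock.recv(4096)
        if not data:
            break              # client closed its end
        reason = handle_request(data)
        if stop_on_fatal and reason in FATAL_REASONS:
            break              # connection is useless now, end at once
    return polls


if __name__ == "__main__":
    server, client = socket.socketpair()
    client.sendall(b"MQGET")
    print(serve_client(server, lambda d: Q_MGR_STOPPING))
    client.sendall(b"MQGET")
    print(serve_client(server, lambda d: Q_MGR_STOPPING, stop_on_fatal=False))
    server.close()
    client.close()
```

With the check in place the loop ends on the first fatal return code; without it, the thread lingers for every remaining poll timeout, which is the delay endmqm was waiting on.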
jefflowrey
Posted: Tue Mar 11, 2008 9:34 am

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

That seems to point the finger in two places: a) client apps that don't use FAIL_IF_QUIESCING, and b) the channel, which should kill itself once it has reported that the queue manager is quiescing at least once.

So I'd a) wait for the fix, and b) fatten your trout for those app teams that aren't using FAIL_IF_QUIESCING.
_________________
I am *not* the model of the modern major general.
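
For anyone auditing application code, "using FAIL_IF_QUIESCING" means OR-ing the flag into the options of each MQI call, for example MQGMO_FAIL_IF_QUIESCING on MQGET. The Python sketch below shows only the bit arithmetic. The numeric values are meant to mirror IBM's cmqc.h header and should be treated as assumptions to verify against your own installation; with_fail_if_quiescing and honours_quiesce are made-up helper names.

```python
# Option values as defined in IBM's cmqc.h (assumption: verify locally).
MQGMO_WAIT = 0x00000001
MQGMO_FAIL_IF_QUIESCING = 0x00002000
MQPMO_FAIL_IF_QUIESCING = 0x00002000


def with_fail_if_quiescing(options, flag=MQGMO_FAIL_IF_QUIESCING):
    """Return the option word with the fail-if-quiescing bit switched on."""
    return options | flag


def honours_quiesce(options, flag=MQGMO_FAIL_IF_QUIESCING):
    """True if the option word already carries the fail-if-quiescing bit."""
    return (options & flag) != 0


if __name__ == "__main__":
    plain = MQGMO_WAIT
    fixed = with_fail_if_quiescing(plain)
    print(hex(plain), honours_quiesce(plain))
    print(hex(fixed), honours_quiesce(fixed))
```

A get with MQGMO_WAIT alone can sit in its wait while the queue manager quiesces; adding the flag lets the call return early with a quiescing reason code instead.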
PeterPotkay
Posted: Tue Mar 11, 2008 9:43 am

Poobah

Joined: 15 May 2001
Posts: 7716

Even with a Monty Python sized trout you'll never be able to guarantee that every app uses FAIL_IF_QUIESCING. Even if they say they use it. Even if you see some code that uses it, it's not proof that that's what's running in PROD. That's why we rely on endmqm -i. I'm glad IBM found the problem. Waiting 10 minutes for the QM to come down is an eternity in the middle of the night with the change window's end time approaching.
_________________
Peter Potkay
Keep Calm and MQ On
Toronto_MQ
Posted: Wed Mar 12, 2008 7:46 am

Master

Joined: 10 Jul 2002
Posts: 263
Location: read my name

I'm glad you've gotten somewhere with this. We have the same problem (on Solaris) and our PMRs got us nowhere. We have taken to issuing the endmqm -i, waiting a minute, then a -p, another minute, then we start killing the amqrmppa processes. Nice to see a fix may eventually come around.

I agree that in an ideal world every app would code FAIL_IF_QUIESCING. And we always stress this as rule #1. But I think we all know we don't live in an ideal world. If I have to listen to "this is vendor code, we can't change that" one more time...
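
The escalation described above (endmqm -i, wait, endmqm -p, wait, then kill what is left) could be scripted roughly as follows. This is a sketch under stated assumptions, not a tested production script: stop_queue_manager, is_running and kill_leftovers are hypothetical names, the caller must supply the status check (for instance by parsing dspmq output) and the kill step, and endmqm is launched in the background with subprocess.Popen because endmqm -i and -p normally do not return until the queue manager has ended.

```python
import subprocess
import time


def stop_queue_manager(qmgr, is_running, run=subprocess.Popen,
                       kill_leftovers=lambda: None, wait=60,
                       sleep=time.sleep):
    """Escalate from endmqm -i to endmqm -p to killing leftover processes.

    is_running(qmgr) -> bool and kill_leftovers() are supplied by the
    caller; both are placeholders, not MQ commands. Returns the list of
    steps that were actually needed.
    """
    steps = []
    for flag in ("-i", "-p"):
        steps.append("endmqm " + flag)
        run(["endmqm", flag, qmgr])   # started in the background by default
        sleep(wait)                   # give the queue manager time to end
        if not is_running(qmgr):
            return steps
    # Last resort: the caller decides how to find and kill amqrmppa.
    steps.append("kill amqrmppa")
    kill_leftovers()
    return steps


if __name__ == "__main__":
    # Dry run with fakes so nothing is executed on this machine.
    issued = []
    print(stop_queue_manager("QM1", is_running=lambda q: True,
                             run=issued.append, wait=0,
                             sleep=lambda s: None))
    print(issued)
```

Passing run, sleep and the two callbacks as parameters keeps the escalation logic testable without a queue manager and leaves the site-specific parts, such as how amqrmppa processes are identified, to the caller.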
GFORCE
Posted: Tue Mar 18, 2008 5:04 am

Voyager

Joined: 16 Jun 2003
Posts: 78
Location: WISCONSIN

I am glad this resulted in a fix from IBM. I hope the PTF is available soon.

Thanks for your help...this forum is great!!!!
_________________
THANKS
PeterPotkay
Posted: Thu Mar 20, 2008 11:56 am

Poobah

Joined: 15 May 2001
Posts: 7716

Contact IBM Support if you need the interim fix for this. It's called IZ18142. It's past the cutoff for being included in 6.0.2.4. The earliest it would be in is 6.0.2.5.

I only tested the fix for Linux. I informed them that Solaris and AIX appear to have the same bug based on this thread.
_________________
Peter Potkay
Keep Calm and MQ On
Copyright © MQSeries.net. All rights reserved.