Author |
Message
|
mqdev |
Posted: Wed Jan 27, 2016 12:31 pm Post subject: what is the easiest way to get rid of messages on SCTQ? |
|
|
Centurion
Joined: 21 Jan 2003 Posts: 136
|
Hello,
We have a number of defunct QMs in a cluster. When those MQ servers were decommissioned, the QMs were not removed from the cluster, so the FR keeps sending cluster updates to them. These msgs pile up on SYSTEM.CLUSTER.TRANSMIT.QUEUE (SCTQ). The CLUSSDR channels to the defunct QMs start up, go into RETRYING status, and stay there indefinitely. We propose the following to clean up this mess:
1. Stop the CLUSSDR chls (this will release any handles the QM processes have on the messages in the SCTQ)
2. Delete the messages from SCTQ (only those msgs whose CORREL-ID = one of the CLUSSDR chls to the defunct QM)
3. Run reset cluster action(forceremove) on the FR to remove the defunct QMs from the FR
I am doubtful about step 2 above (we should never touch msgs on the SCTQ), but if we do not remove these msgs, they keep starting the CLUSSDR chls, so it seems necessary.
Any thoughts on the above procedure, or a better alternative for clearing this mess?
Thanks
-mqdev |
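As a rough illustration of steps 1 and 3 above, the MQSC for a batch of defunct queue managers could be generated by a small script and reviewed before being fed to runmqsc on a full repository. This is only a sketch: the cluster name `MYCLUS`, the queue manager names, and the `TO.<qmgr>` channel naming convention are hypothetical placeholders, and step 2 (removing the messages) is not covered here. `STOP CHANNEL` and `RESET CLUSTER ... ACTION(FORCEREMOVE)` are standard MQSC commands.

```python
def build_cleanup_mqsc(cluster, defunct_qmgrs, channel_for=None):
    """Return MQSC text: stop each CLUSSDR first, then force-remove each qmgr.

    channel_for maps a qmgr name to its cluster-sender channel name. The
    default (TO.<qmgr>) is a hypothetical naming convention; use the real
    channel names shown by DISPLAY CLUSQMGR in your cluster.
    """
    if channel_for is None:
        channel_for = lambda qm: "TO." + qm

    # Step 1: stop the cluster-sender channels so they stop retrying.
    stop_cmds = [f"STOP CHANNEL({channel_for(qm)})" for qm in defunct_qmgrs]

    # Step 3: remove each defunct qmgr (and its cluster queues) from the FR.
    reset_cmds = [
        f"RESET CLUSTER({cluster}) ACTION(FORCEREMOVE) QMNAME({qm}) QUEUES(YES)"
        for qm in defunct_qmgrs
    ]
    return "\n".join(stop_cmds + reset_cmds) + "\n"


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(build_cleanup_mqsc("MYCLUS", ["DEADQM1", "DEADQM2"]), end="")
```

Generating the commands rather than typing them keeps a list of 20+ queue managers consistent and gives the change board one reviewable script. If two queue managers ever shared a name, `QMID(...)` would be needed in place of `QMNAME(...)`.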
|
|
 |
bruce2359 |
Posted: Wed Jan 27, 2016 1:32 pm Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
I'd start by taking the CLUSRCVR channel(s) for the defunct qmgr out of the cluster, wait a few minutes, then delete it/them. No new CLUSSDRA channels to the defunct qmgr will be created. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
mqdev |
Posted: Wed Jan 27, 2016 2:15 pm Post subject: |
|
|
Centurion
Joined: 21 Jan 2003 Posts: 136
|
Thanks Bruce, for your quick response. However, we still need to deal with the messages on the SCTQ. Is there any better way than deleting them directly?
Thanks
-mqdev |
|
|
 |
fjb_saper |
Posted: Thu Jan 28, 2016 5:38 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
Worth a try:
Create a "dummy" qmgr with the same name as the defunct qmgr. Join it to the cluster giving it the correct clusrcvr channel name.
Have it consume all the updates.
Reset the cluster removing the qmid from the defunct qmgr.
Make sure all the FR's are accessible...
Remove the "dummy" qmgr from the cluster in an orderly fashion.
Hope it helps  _________________ MQ & Broker admin |
|
|
 |
mqjeff |
Posted: Thu Jan 28, 2016 6:46 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
fjb_saper wrote: |
Worth a try:
Create a "dummy" qmgr with the same name as the defunct qmgr. Join it to the cluster giving it the correct clusrcvr channel name. |
It will have a different QMID so that may affect whether messages get delivered to it or not.
Depending on how the messages were sent, it's possible that simply creating a queue with the right name on another qmgr and share it in the cluster will cause the messages to get delivered. BIND_ON_OPEN will obviously prevent this. _________________ chmod -R ugo-wx / |
|
|
 |
mqdev |
Posted: Thu Jan 28, 2016 7:27 am Post subject: |
|
|
Centurion
Joined: 21 Jan 2003 Posts: 136
|
Jeff,
These are the cluster updates that the FR is sending to the PRs, so they are targeted at SYSTEM.CLUSTER.REPOSITORY.QUEUE or some such. These Qs are available on pretty much every QM. Since the msgs have not drained so far, I guess the routing takes the combination of [QM name + Q name] into account, so this may not work.
As to the other approach of adding dummy QMs with the same names: this is Prod, which requires elaborate change documentation and approvals for even a simple action like a listener restart. So adding 20+ dummy QMs and deleting them later would just not fly with the Change Approval Board.
Thanks for the ideas folks - much appreciated!
-mqdev |
|
|
 |
mqjeff |
Posted: Thu Jan 28, 2016 8:22 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
Are these messages stuck on a single qmgr?
Is it an FR or a PR?
If it's an FR, you might be able to demote it to a PR and then promote it to an FR again.
A reset cluster command might also help.
You should also, hopefully obviously, recreate the issue in a non-prod environment and make sure to test all your changes. _________________ chmod -R ugo-wx / |
|
|
 |
umatharani |
Posted: Sat Jan 30, 2016 8:59 pm Post subject: |
|
|
Apprentice
Joined: 23 Oct 2008 Posts: 39
|
Hi,
Try the following procedure.
On the Full repository queue managers:
1. If the MQ version is 7.0.1, go to step 4.
2. Add the following to qm.ini (MQ >= 7.1):
TuningParameters:
TolerateRepositoryFailure=TRUE
More details on the tuning parameter : http://www-01.ibm.com/support/knowledgecenter/SSFKSJ_7.5.0/com.ibm.mq.mig.doc/q001120_copy.htm?lang=en
3. Restart the qmgr; otherwise killing amqrrmfa will terminate the qmgr (MQ >= 7.1).
4. Stop all applications and channels putting messages to cluster queue.
Ensure IPPROCS and OPPROCS are 0 for SCTQ.
5. Use reset cluster to remove the decommissioned queue manager
Try with both qmgr and qmid options.
6. Find out PID of 'amqrrmfa' process of the relevant qmgr and kill the
process.
7. Wait for 10 seconds or more
*** Ensure there are no application messages in the SCTQ by browsing the queue.
Alternatively the messages can be saved using qload or dmpmqmsg utilities before clearing the queue.
8. Clear SYSTEM.CLUSTER.REPOSITORY.QUEUE. If “clear ql” fails with an
"OBJECT IN USE" error, wait longer and try clearing the queue again.
9. Restart the queue manager
10. If you see any inconsistency on any PR, issue refresh cluster on the
affected PR. You can also issue refresh cluster on the FR if the number of
queue managers on the cluster is not high(e.g. 25, 50, 100...). |
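The IPPROCS/OPPROCS check in step 4 can be scripted against the text that runmqsc prints for `DISPLAY QSTATUS(SYSTEM.CLUSTER.TRANSMIT.QUEUE) TYPE(QUEUE) CURDEPTH IPPROCS OPPROCS`. The following is a minimal sketch, assuming the usual `NAME(value)` layout of MQSC display output; the sample text and its depth value are made up for illustration and are not captured from a real queue manager.

```python
import re

def parse_mqsc_attrs(text):
    """Collect NAME(value) pairs from runmqsc display output into a dict."""
    return {name: value.strip()
            for name, value in re.findall(r"([A-Z]+)\(([^)]*)\)", text)}

def queue_is_idle(text):
    """True only if both open-handle counts are reported as zero."""
    attrs = parse_mqsc_attrs(text)
    return attrs.get("IPPROCS") == "0" and attrs.get("OPPROCS") == "0"

# Illustrative sample only (hypothetical values).
SAMPLE = """
   QUEUE(SYSTEM.CLUSTER.TRANSMIT.QUEUE)    TYPE(QUEUE)
   CURDEPTH(412)                           IPPROCS(0)
   OPPROCS(0)
"""

if __name__ == "__main__":
    print(parse_mqsc_attrs(SAMPLE))
    print("idle" if queue_is_idle(SAMPLE) else "still in use")
```

The check deliberately treats a missing IPPROCS or OPPROCS value as "not idle", so a failed or truncated command does not look like a green light to clear the queue.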
|
|
 |
PeterPotkay |
Posted: Sun Jan 31, 2016 6:18 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
Just issue the RESET CLUSTER command to purge the cluster of any knowledge of the QM that is now gone.
At this point, no new messages should be generated by the FRs to the QM that is gone.
Browse the S.C.T.Q., looking for messages with Correlation IDs that match the name of the channel to the QM that is now gone. Those cluster administrative messages can never be delivered (the QM is gone) and are not meant for any other QM. Save those messages to a file in case you ever need them again. Now delete those messages you saved from the S.C.T.Q. Do not issue the clear queue command, or use some tool that destructively chews thru all messages on a queue, because you might delete in flight messages meant for other QMs in the cluster. Selectively and precisely delete just the messages that are for the QM that is gone.
Since I am just some random idiot on the internet, you might want to validate my suggestion by opening a PMR first, before doing this on a Production system. _________________ Peter Potkay
Keep Calm and MQ On |
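The selective delete described above comes down to a filter on the correlation ID. Here is a minimal in-memory sketch of that selection logic; it does not talk to MQ at all. It assumes the CorrelId on SCTQ messages is the CLUSSDR channel name blank-padded to the 24-byte MQMD field (check this by browsing a few messages first), and the channel names are hypothetical.

```python
CORREL_ID_LEN = 24  # MQMD CorrelId is a fixed 24-byte field

def correl_id_for_channel(channel_name):
    """Assumed encoding: channel name as ASCII, blank-padded to 24 bytes."""
    return channel_name.encode("ascii").ljust(CORREL_ID_LEN, b" ")

def split_messages(messages, dead_channels):
    """Partition (correl_id, payload) pairs into (to_remove, to_keep).

    Only messages whose CorrelId exactly matches a defunct channel are
    selected; everything else, including in-flight traffic for live
    queue managers, stays in to_keep.
    """
    dead_ids = {correl_id_for_channel(c) for c in dead_channels}
    to_remove = [m for m in messages if m[0] in dead_ids]
    to_keep = [m for m in messages if m[0] not in dead_ids]
    return to_remove, to_keep

if __name__ == "__main__":
    # Hypothetical channel names and payloads, for illustration only.
    browsed = [
        (correl_id_for_channel("TO.DEADQM1"), b"cluster update"),
        (correl_id_for_channel("TO.LIVEQM"), b"application message"),
    ]
    remove, keep = split_messages(browsed, ["TO.DEADQM1"])
    print(len(remove), "to save and delete;", len(keep), "left untouched")
```

An exact match on the full padded field, rather than a prefix match, avoids catching a live channel whose name merely starts with a defunct channel's name.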
|
|
 |
fjb_saper |
Posted: Sun Jan 31, 2016 7:08 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
PeterPotkay wrote: |
Just issue the RESET CLUSTER command to purge the cluster of any knowledge of the QM that is now gone.
At this point, no new messages should be generated by the FRs to the QM that is gone.
Browse the S.C.T.Q., looking for messages with Correlation IDs that match the name of the channel to the QM that is now gone. Those cluster administrative messages can never be delivered (the QM is gone) and are not meant for any other QM. Save those messages to a file in case you ever need them again. Now delete those messages you saved from the S.C.T.Q. Do not issue the clear queue command, or use some tool that destructively chews thru all messages on a queue, because you might delete in flight messages meant for other QMs in the cluster. Selectively and precisely delete just the messages that are for the QM that is gone.
Since I am just some random idiot on the internet, you might want to validate my suggestion by opening a PMR first, before doing this on a Production system. |
The reset cluster command should be issued after the SCTQ has been purged. Issuing it before will have little to no effect as the FR will still know about the now defunct qmgr, due to the messages on the SCTQ...  _________________ MQ & Broker admin |
|
|
 |
umatharani |
Posted: Sun Jan 31, 2016 7:40 am Post subject: |
|
|
Apprentice
Joined: 23 Oct 2008 Posts: 39
|
Cluster inconsistency caused by the loss of cluster command messages can be resolved using refresh cluster. An easy option would be to clear the SCTQ and then issue refresh cluster on the PR to resync with the FR. The refresh is required only if clearing the messages on the SCTQ has caused a cluster inconsistency.
Please also swap steps 4 and 5 so that the 2 FRs can sync the changes. If the channels are not running then there will not be any communication between 2 FRs.
4. Use reset cluster to remove the decommissioned queue manager
Try with both qmgr and qmid options.
5. Stop all applications and channels putting messages to cluster queue.
Ensure IPPROCS and OPPROCS are 0 for SCTQ.
If the number of messages is small, the messages can be selectively removed using a tool without clearing the SCTQ, as explained by PeterPotkay. If you are sure there are no application messages, the SCTQ can be safely cleared.
Thanks,
mahesh |
|
|
 |
PeterPotkay |
Posted: Sun Jan 31, 2016 8:13 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
fjb_saper wrote: |
PeterPotkay wrote: |
Just issue the RESET CLUSTER command to purge the cluster of any knowledge of the QM that is now gone.
At this point, no new messages should be generated by the FRs to the QM that is gone.
Browse the S.C.T.Q., looking for messages with Correlation IDs that match the name of the channel to the QM that is now gone. Those cluster administrative messages can never be delivered (the QM is gone) and are not meant for any other QM. Save those messages to a file in case you ever need them again. Now delete those messages you saved from the S.C.T.Q. Do not issue the clear queue command, or use some tool that destructively chews thru all messages on a queue, because you might delete in flight messages meant for other QMs in the cluster. Selectively and precisely delete just the messages that are for the QM that is gone.
Since I am just some random idiot on the internet, you might want to validate my suggestion by opening a PMR first, before doing this on a Production system. |
The reset cluster command should be issued after the SCTQ has been purged. Issuing it before will have little to no effect as the FR will still know about the now defunct qmgr, due to the messages on the SCTQ...  |
The reason I proposed the order I did is if you clean up the S.C.T.Q, and then issue the RESET CLUSTER command, there is a period of time between those 2 steps where additional messages may be produced into the S.C.T.Q., and you will have to do that step again anyway.
I don't think the presence of messages to an obsolete QM in the S.C.T.Q. would negatively impact the successful completion of the RESET CLUSTER command. _________________ Peter Potkay
Keep Calm and MQ On |
|
|
 |
bruce2359 |
Posted: Sun Jan 31, 2016 10:15 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
The RESET CLUSTER command issues cluster admin messages to FRs and PRs, which go to SCTQ. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
|