Clustered queues not appearing in a full repos
skydoor
Posted: Thu Oct 11, 2007 2:31 am Post subject: Clustered queues not appearing in a full repos
Apprentice
Joined: 24 Jul 2007 Posts: 43 Location: Cape Town
Hi All
I have a cluster with two full repos and 118 partial repos. One of the full repository qmgrs failed and I could not recover it. I then created another full repository with a listener on a different port. MaxChannels was reached due to the number of auto-defined channels being created, the default being 100. I then raised MaxChannels and MaxActiveChannels to 512. However, not all of the clustered queues on the partial repositories are coming through to the new full repository I created, only some.
My theory is that some of the cluster administration messages came through, then MaxChannels was exceeded and none of the other cluster administration messages could get to the new full repos. Cluster administration messages should go via SYSTEM.CLUSTER.COMMAND.QUEUE and, if they cannot reach the destination qmgr, should back out to the DLQ and stay there. My question is this: are these messages persistent, and will they survive a reset? My SYSTEM.CLUSTER.COMMAND.QUEUE is empty, as is the DLQ. What happened to my cluster administration messages?
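For what it's worth, the check I am describing amounts to something like this in runmqsc on the new full repos (just a sketch; the DLQ name is whatever DISPLAY QMGR reports for DEADQ on your qmgr):

* in runmqsc on the new full repository qmgr
DISPLAY QMGR DEADQ
DISPLAY QLOCAL(SYSTEM.CLUSTER.COMMAND.QUEUE) CURDEPTH
DISPLAY QLOCAL(SYSTEM.CLUSTER.TRANSMIT.QUEUE) CURDEPTH
* substitute the queue named by DEADQ above
DISPLAY QLOCAL(SYSTEM.DEAD.LETTER.QUEUE) CURDEPTH

Both the command queue and the DLQ show a depth of zero here.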
The other question is, will a REFRESH CLUSTER resolve my problem and how advisable is it to do a REFRESH CLUSTER?
Versions: Full repos and Broker qmgrs
Name: WebSphere MQ
Version: 6.0.2.1
CMVC level: p600-201-070323
BuildType: IKAP - (Production)
Versions: the other partial repositories are on 6.0.0.0
Vitor
Posted: Thu Oct 11, 2007 2:38 am
Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
Why would the messages go to the DLQ, rather than sit on the xmitq until comms was restored?
The advisability of a refresh cluster depends on how many members the cluster has and how much network capacity you have. If it's a large cluster I'd schedule it out of hours.
_________________
Honesty is the best policy.
Insanity is the best defence.
skydoor
Posted: Thu Oct 11, 2007 2:59 am
Apprentice
Joined: 24 Jul 2007 Posts: 43 Location: Cape Town
Hi Vitor
According to this link http://middleware.its.state.nc.us/middleware/Documentation/en_US/htm/csqzah04/csqzah0417.htm#HDRCSQ6844 under the heading "What happens when a repository fails?"
it states the following "Cluster information is carried to repositories (whether full or partial) on a local queue called SYSTEM.CLUSTER.COMMAND.QUEUE. If this queue fills up, perhaps because the queue manager has stopped working, the cluster-information messages are routed to the dead-letter queue. If you observe that this is happening, from the messages on your queue-manager log or z/OS system console, you need to run an application to retrieve the messages from the dead-letter queue and reroute them to the correct destination.
If errors occur on a repository queue manager, messages tell you what error has occurred and how long the queue manager will wait before trying to restart. On WebSphere MQ for z/OS the SYSTEM.CLUSTER.COMMAND.QUEUE is get-disabled. When you have identified and resolved the error, get-enable the SYSTEM.CLUSTER.COMMAND.QUEUE so that the queue manager can restart successfully. "
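If I read that correctly, the recovery it describes (on z/OS at least) is just to re-enable gets on the command queue once the underlying error is fixed, i.e. something like this in runmqsc (a sketch of what the Info Center describes, not something I have actually had to run):

ALTER QLOCAL(SYSTEM.CLUSTER.COMMAND.QUEUE) GET(ENABLED)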
I thought that cluster administration happened via the SYSTEM.CLUSTER.COMMAND queue rather than via the SYSTEM.CLUSTER.TRANSMIT.QUEUE, or is this z/OS specific?
Will the REFRESH CLUSTER in this scenario rebuild my new full repos correctly?
Vitor
Posted: Thu Oct 11, 2007 3:09 am
Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
skydoor wrote:
If you observe that this is happening, from the messages on your queue-manager log or z/OS system console, you need to run an application to retrieve the messages from the dead-letter queue and reroute them to the correct destination.
Things you learn. I'd follow the advice & retrieve the messages before I'd run a refresh - much less impact.
And stick to the documentation on the IBM Info Centre - you've no guarantee this other link is kept current, or even transcribed accurately!
_________________
Honesty is the best policy.
Insanity is the best defence.
skydoor
Posted: Thu Oct 11, 2007 3:31 am
Apprentice
Joined: 24 Jul 2007 Posts: 43 Location: Cape Town
I don't have any messages on my DLQs. That is probably why I need a refresh. I now have a theory as to what happened. All the partial repos connected to the new full repos and exceeded MaxChannels, because a TO.FULLREPOS receiver is created for every partial that connects. The partials whose clustered queues I am missing probably could not connect because a receiver could not be created, hence the administration messages were never created. This would be daft, however, as those cluster objects would then never be registered. Hopefully I am wrong about this and will find some other reason why this happened.
I could not find in-depth IBM manuals on this issue. I am sure I would if I looked a bit harder.
I cannot find any docs on this either: are the cluster administration messages persistent?
PeterPotkay
Posted: Thu Oct 11, 2007 9:55 am Post subject: Re: Clustered queues not appearing in a full repos
Poobah
Joined: 15 May 2001 Posts: 7722
skydoor wrote:
My theory is that some of the cluster administration messages came through, then MaxChannels was exceeded and none of the other cluster administration messages could get to the new full repos. Cluster administration messages should go via SYSTEM.CLUSTER.COMMAND.QUEUE and, if they cannot reach the destination qmgr, should back out to the DLQ and stay there. My question is this: are these messages persistent, and will they survive a reset? My SYSTEM.CLUSTER.COMMAND.QUEUE is empty, as is the DLQ. What happened to my cluster administration messages?
Nothing went to the DLQ. If the messages couldn't make it across the channel from PR1 to FR1 because that channel can't start, the messages will sit on the S.C.T.Q. of PR1. There is no reason for them to go to the DLQ on PR1 and no way for them to go to the DLQ on FR1.
If some of the PRs couldn't connect to FR1 because its Max Channels was temporarily exceeded, then those channels would have simply gone into retry until they could connect.
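You can see that from a PR with something like this (just a sketch; the channel name here is made up, use whatever the CLUSSDR to your new FR is actually called):

* in runmqsc on one of the PRs
DISPLAY CHSTATUS(TO.NEWFR) STATUS
* anything waiting to reach the FRs sits here, not on the DLQ
DISPLAY QLOCAL(SYSTEM.CLUSTER.TRANSMIT.QUEUE) CURDEPTH

A channel stuck in retry will show STATUS(RETRYING).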
Check all your PRs one by one. Look at their CLUSSNDR and make sure it's pointing at a valid FR, either your new one or the other one that is still OK.
Make sure the new FR has its CLUSRCVR defined properly, and that its CLUSSNDR points at your other FR.
If you want to issue Refresh cluster (don't see a reason to), you can do one PR at a time. Use the REPOS(YES) option to ensure the PR talks to the FR over your new CLUSSNDR definition. Refresh cluster doesn't "refresh the cluster". All it does is FOR THE ONE QM YOU ARE ISSUING THE COMMAND FOR, tells that QM to purge all its cluster info and send all its definitions to the FRs ONLY. That command is not going to make all the QMs in the cluster start sending everything to everybody.
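For example, on one PR (the cluster name here is made up, use your own):

* confirm which FRs this PR can see and over which channels
DISPLAY CLUSQMGR(*) CHANNEL QMTYPE
* then, only if you really need it, refresh this one PR
REFRESH CLUSTER(MYCLUSTER) REPOS(YES)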
As other PRs now need to send info to the QM that you just did a refresh on, the cluster will start chattering, because the FRs need to let the other PRs know that this PR has a valid queue. But it's done on an as-needed basis.
_________________
Peter Potkay
Keep Calm and MQ On
skydoor
Posted: Thu Oct 11, 2007 11:59 pm
Apprentice
Joined: 24 Jul 2007 Posts: 43 Location: Cape Town
PeterPotkay/Vitor
Thank you for the quick assistance.
I have resolved the problem; my new FR is up and running with all the clustered queues visible.
I might be wrong, but my procedure to rebuild it worked, so I think my theory is correct. My theory is this: the PRs connected to the new FR, quickly exceeding MaxChannels. Because the PRs with the shared queues could then not connect, due to MaxChannels being exceeded, they did not create the cluster administration messages. I am saying this because there were no cluster administration messages on the S.C.T.Q., the S.C.C.Q. or the DLQ. I am assuming that cluster administration messages are persistent and would not just disappear, therefore no messages were created. I think the cluster would have refreshed this anyway, but only after 27 days.
My procedure to resolve this was as follows (a rough MQSC sketch is below):
1. Create the qmgr.
2. Increase MaxChannels and MaxActiveChannels.
3. Restart the MQ service to pick up the new settings.
4. Alter the qmgr to be a FR: REPOS(CLUSTER).
The FR took a while to be updated with all the shared queues and auto-defined channels, but it did come up.
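For anyone hitting the same thing, this is roughly what it looked like on the new qmgr; all names, hosts and ports below are placeholders for my own, and on a distributed qmgr MaxChannels/MaxActiveChannels go in the Channels stanza of qm.ini, which is why the restart in step 3 is needed:

* step 4, in runmqsc on the new full repository qmgr
ALTER QMGR REPOS('MYCLUSTER')
* plus the CLUSRCVR and manual CLUSSDR Peter mentioned, so the new FR
* is reachable and can talk to the surviving FR
DEFINE CHANNEL(TO.NEWFR) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) CONNAME('newfrhost(1415)') CLUSTER('MYCLUSTER')
DEFINE CHANNEL(TO.OTHERFR) CHLTYPE(CLUSSDR) TRPTYPE(TCP) CONNAME('otherfrhost(1414)') CLUSTER('MYCLUSTER')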
Thanks again for the help.