Restoring a cluster after deleting a full repository QM |
topwoman |
Posted: Wed Feb 04, 2004 6:11 am Post subject: Restoring a cluster after deleting a full repository QM |
|
|
 Novice
Joined: 03 Feb 2004 Posts: 20 Location: Netherlands
|
I'm getting the following message in my error log:
----- amqrrmfa.c : 34146 ------------------------------------------------------
02/04/04 13:10:55
AMQ9412: Repository command received for 'EGVAMW_R15_2004-01-15_09.11.55'.
EXPLANATION:
The repository manager received a command intended for some other queue
manager, whose identifier is 'EGVAMW_R15_2004-01-15_09.11.55'. The command was
sent by the queue manager with identifier 'EGVAGW1R15_2004-01-15_09.11.22'.
ACTION:
Check the channel and cluster definitions of the sending queue manager.
Here is some background:
Yesterday, in a fit of madness, I deleted a cluster queue manager, EGVAMW_R15, which was the full repository for the cluster.
I spent a day rebuilding the cluster, and managed finally to restore the cluster repository environment, and another queue manager which was also a member of the cluster.
A third queue manager, EGVAGW1R15, is the one still giving me problems. I've forced it to leave the cluster, deleted the cluster sender and receiver channels for this queue manager, and redefined them, but it still keeps trying to find the old repository QM, EGVAMW_R15_2004-01-15_09.11.55, instead of the new one, whatever that might be.
I really need to avoid deleting and rebuilding EGVAGW1R15, as it is a gateway for multiple endpoints, already defined and communicating with it, so I would then need to delete and rebuild all of these, which is a daunting thought (not to mention that it would involve the market end users, which is to be avoided at all costs, if at all possible).
So, friends, what do I do next? Ideally, I'd like to go to wherever the information about the old version of the full repository is located, and replace it with current information, whatever that might be. Is there a way to do this directly, or some series of commands I can execute which will accomplish the same end?
Needless to say, I will NEVER delete a full repository from a cluster again! |
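The ACTION text in the AMQ9412 message asks for a check of the channel and cluster definitions on the sending queue manager. A minimal sketch of what that check might look like in runmqsc follows; it only displays information and changes nothing. It assumes it is run on EGVAGW1R15, the sender named in the message.

```mqsc
* Run in runmqsc on the sending queue manager (EGVAGW1R15 in this thread).
* Display-only commands; nothing is altered.

* Which queue managers, and which QMIDs, does the local cluster cache hold?
DISPLAY CLUSQMGR(*) QMID QMTYPE DEFTYPE STATUS

* Where do the manually defined cluster channels point?
DISPLAY CHANNEL(*) TYPE(CLUSSDR) CONNAME CLUSTER
DISPLAY CHANNEL(*) TYPE(CLUSRCVR) CONNAME CLUSTER
```

If the first command still lists the QMID 'EGVAMW_R15_2004-01-15_09.11.55', the local cache still holds the deleted repository.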
|
mqonnet |
Posted: Wed Feb 04, 2004 6:22 am Post subject: |
|
|
 Grand Master
Joined: 18 Feb 2002 Posts: 1114 Location: Boston, Ma, Usa.
|
You might want to look at the following thread which dealt with rebuilding a new cluster cache.
http://www.mqseries.net/phpBB2/viewtopic.php?t=12847
You are having problems because the cluster cache has not been refreshed and still seems to hold the old information, so it needs to be rebuilt. I would think that issuing REFRESH CLUSTER on the queue manager that has issues should resolve the problem. Also take a look at the following from the manuals.
"Use REFRESH CLUSTER to discard all locally held cluster information (including any autodefined channels that are in doubt), and force it to be rebuilt. This enables you to perform a "cold-start" on the cluster.
Notes:
The CLUSRCVR channels were removed from a cluster, or their CONNAMEs were altered on two or more full repository queue managers while they could not communicate.
"
Hope this helps.
Cheers
Kumar |
|
Michael Dag |
Posted: Wed Feb 04, 2004 6:46 am Post subject: |
|
|
 Jedi Knight
Joined: 13 Jun 2002 Posts: 2607 Location: The Netherlands (Amsterdam)
|
Kumar,
in the thread you are referring to, Peter never re-created the queue manager; he restored the old one from a backup, so all internal names were still the same.
To topwoman,
was the third queue manager connected to the repository you deleted?
If yes, stop your manually defined CLUSSDR and CLUSRCVR channels, point your CLUSSDR at the other repository, then start the CLUSSDR and then the CLUSRCVR.
Hope this helps...
Michael |
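In runmqsc terms, the stop / repoint / start sequence suggested above might look roughly like the sketch below. The channel names, host and port are hypothetical placeholders, not values from this thread. It also assumes the existing CLUSSDR already carries the same name as the CLUSRCVR on the target repository, which a cluster-sender must; if it does not, a new CLUSSDR with the matching name would have to be defined instead of altering the old one.

```mqsc
* Run in runmqsc on the partial-repository queue manager (EGVAGW1R15 here).
* TO.REPOS, TO.EGVAGW1R15 and the CONNAME are hypothetical placeholders.

* Stop the manually defined cluster channels.
STOP CHANNEL(TO.REPOS)
STOP CHANNEL(TO.EGVAGW1R15)

* Point the cluster-sender at the surviving (or new) full repository.
ALTER CHANNEL(TO.REPOS) CHLTYPE(CLUSSDR) +
      CONNAME('repos.example.com(1414)')

* Start the sender first, then re-enable the receiver.
START CHANNEL(TO.REPOS)
START CHANNEL(TO.EGVAGW1R15)
```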
|
mqonnet |
Posted: Wed Feb 04, 2004 7:04 am Post subject: |
|
|
 Grand Master
Joined: 18 Feb 2002 Posts: 1114 Location: Boston, Ma, Usa.
|
Michael, the reason I posted the reference to Peter's thread was that this situation might need the same treatment.
"A third queue manager, EGVAGW1R15, is the one still giving me problems. I've forced it to leave the cluster, deleted the cluster sender and receiver channels for this queue manager, and redefined them, but it still keeps trying to find the old repository QM, EGVAMW_R15_2004-01-15_09.11.55, instead of the new one, whatever that might be."
topwoman seems to have already redefined the CLUSSDR and CLUSRCVR channels and still has issues, because this QM still has an old cache that points to the old queue manager. Refreshing would remove that entry; after that, you create the new CLUSSDR and CLUSRCVR channels. It should then work fine.
Cheers
Kumar |
|
mqonnet |
Posted: Wed Feb 04, 2004 7:20 am Post subject: |
|
|
 Grand Master
Joined: 18 Feb 2002 Posts: 1114 Location: Boston, Ma, Usa.
|
Just wanted to add to what I mentioned in my previous post: for what I described to work, you need to still have the older queue manager with its cluster definitions.
Since you don't seem to have that, I would go with Michael's suggestion.
Sorry for the confusion.
Cheers
Kumar |
|
topwoman |
Posted: Wed Feb 04, 2004 7:43 am Post subject: |
|
|
 Novice
Joined: 03 Feb 2004 Posts: 20 Location: Netherlands
|
Colleagues,
Thank you kindly for your suggestions.
We tried the "refresh" route you described, but without success. However, we finally did solve the problem, and I want to post it here for the benefit of others:
Since we are using the Microsoft Management Console with the MQSeries add-in, I'll describe the procedure from that standpoint. It will vary slightly for you runmqsc buffs, but the idea is the same.
1. The third queue manager was a partial repository in another cluster for a bunch of "end-point" queue managers. We first SHOWed all these end-point queue managers, so that we could easily JOIN them to the cluster again later.
2. We then REMOVEd all these endpoint queue managers from the second cluster.
3. We then STOPped what I have been referring to as the "third" queue manager (the partial repository guy), and deleted it. We then recreated it, and redefined the cluster sender and receiver channels, and all the queue aliases for the endpoint queue managers.
4. We JOINed the previously saved endpoint queue managers to the partial repository cluster, and did a REFRESH on each of these endpoints.
A lot of work, but not destructive, and everything now recognizes everything else.
I hope no one else does something as foolish as what I did, but at least there is now posted to this site a fool-proof way of recovering.
Consider this my heart-felt thank-you for the conscientious responses I've received to my posts. |
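For the runmqsc buffs, the channel redefinition in step 3 might look roughly like the sketch below. This is an illustration rather than the exact commands used: MYCLUSTER, the host names and the ports are hypothetical placeholders, and the channel names simply follow the common TO.<qmgr> convention.

```mqsc
* Run in runmqsc on the recreated queue manager (EGVAGW1R15 in this thread).
* MYCLUSTER, the hosts and the ports are hypothetical placeholders.

* Cluster-receiver: tells the rest of the cluster how to reach this QM.
DEFINE CHANNEL(TO.EGVAGW1R15) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) +
       CONNAME('gateway.example.com(1414)') CLUSTER(MYCLUSTER)

* Cluster-sender: must have the same name as the full repository's CLUSRCVR.
DEFINE CHANNEL(TO.EGVAMW_R15) CHLTYPE(CLUSSDR) TRPTYPE(TCP) +
       CONNAME('repos.example.com(1414)') CLUSTER(MYCLUSTER)

* Check that the new QM now sees the full repository.
DISPLAY CLUSQMGR(*) QMID QMTYPE STATUS
```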
|
Michael Dag |
Posted: Wed Feb 04, 2004 8:10 am Post subject: |
|
|
 Jedi Knight
Joined: 13 Jun 2002 Posts: 2607 Location: The Netherlands (Amsterdam)
|
good to read you solved your problem!!!
Michael |
|
PeterPotkay |
Posted: Fri Feb 06, 2004 8:43 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
topwoman, I'm glad you are working now, but you went through a lot of work!
As I understand it, the problem was that the cluster was in a state where it saw both the old QM and the new QM, even though they had the same name. A QM is assigned a QMID when it is created, so even though you recreated the QM with the same name, the cluster saw both the new and the old QMID.
The way to solve this is the RESET command. RESET allows you to forcibly remove a QM from the cluster. You must issue this from the FULL repository. If you only have one QM name to get rid of, you can use the command with just the QM name. But in your case, you have 2 entries for the same name, so you must use the QMID with the command, like this:
Code: |
reset cluster(YourClusterName) qmid('TheQMIDofTheOldQMYouDeleted') action(FORCEREMOVE) queues(YES)
|
The QMID can be found by issuing DISPLAY CLUSQMGR(*) ALL.
This would have gotten rid of all the references to the old QM from the entire cluster.
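Applied to this thread, the sequence might look something like the sketch below. The cluster name MYCLUSTER is a hypothetical placeholder, and it assumes the stale QMID is the one reported in the AMQ9412 message in the first post.

```mqsc
* Run in runmqsc on the FULL repository queue manager.
* MYCLUSTER is a hypothetical cluster name.

* List the entries for the repository name; two QMIDs for one name
* means the old, deleted QM is still known to the cluster.
DISPLAY CLUSQMGR(EGVAMW_R15) QMID QMTYPE STATUS

* Force out the old entry by QMID (taken from the AMQ9412 message).
RESET CLUSTER(MYCLUSTER) QMID('EGVAMW_R15_2004-01-15_09.11.55') +
      ACTION(FORCEREMOVE) QUEUES(YES)

* Confirm that only the new QMID remains.
DISPLAY CLUSQMGR(*) QMID
```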
If at this point you also wanted to flush the Gateway QM's partial repository, you could use the REFRESH command, like this:
Code: |
refresh cluster(YourClusterName) Repos(YES)
|
This causes the QM to wipe EVERYTHING it knows about the cluster from its repository and then push all its own info out to the FULL repository. It then learns about the rest of the cluster as it needs to. With REPOS(YES), this REFRESH command can only be run on a partial repository.
Remember: RESET deletes all knowledge of a QM from a cluster; REFRESH wipes a QM's partial repository clean and pushes out only its own info to the full repository.
If you are still at 5.2, get to 5.3 ASAP if you are clustering. Back when we were at 5.2, these commands would not always work. We had to resort to manually clearing the SYSTEM.CLUSTER.REPOSITORY.QUEUE on each queue manager. Since it involved a QM restart, and needed to be done on all QMs in the cluster at the same time to make sure they didn't get polluted from another repository that still had garbage, it was quite disruptive. At 5.3, there are fewer clustering problems, and the commands actually seem to work.
As a final note, you can delete a full repository QM. If you need to, make sure you follow all the steps: Make it a partial repository first, then decluster it, then delete it. _________________ Peter Potkay
Keep Calm and MQ On |
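As a rough sketch of those steps in runmqsc (demote to partial, decluster, then delete), something like the following could be used. MYCLUSTER and the channel names are hypothetical placeholders. The sketch also glosses over waiting for each change to propagate, and over any other QMs whose manual CLUSSDR channels point at this repository and would need repointing first.

```mqsc
* Run in runmqsc on the full repository queue manager being retired.
* MYCLUSTER, TO.THISQM and TO.OTHERREPOS are hypothetical placeholders.

* 1. Make it a partial repository.
ALTER QMGR REPOS(' ')

* 2. Decluster it: suspend, withdraw the CLUSRCVR, remove the channels.
SUSPEND QMGR CLUSTER(MYCLUSTER)
ALTER CHANNEL(TO.THISQM) CHLTYPE(CLUSRCVR) CLUSTER(' ')
STOP CHANNEL(TO.THISQM)
DELETE CHANNEL(TO.THISQM)
STOP CHANNEL(TO.OTHERREPOS)
DELETE CHANNEL(TO.OTHERREPOS)

* 3. Only then end and delete the queue manager itself, outside runmqsc
*    (the endmqm and dltmqm control commands).
```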
|
PeterPotkay |
Posted: Fri Feb 06, 2004 8:52 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
Oh yeah, forget using MQExplorer with clusters! It doesn't work 100%, particularly the commands. And it cheats in the views it shows you. When you see a cluster view, it's not showing you what that QM's partial repository holds, but rather what it thinks it should hold. It cheats by connecting to the full repositories itself and then displaying everything the partial should eventually know. That's quite a headache when MQExplorer shows some queues on the partial repository but in actuality the QM doesn't yet know about them for some reason, and you can't figure out why it's not working.
You have to get familiar with the runmqsc commands. Also, the reference manual for the commands is a must-read. It contains tons of info that is not found anywhere else.
http://publibfp.boulder.ibm.com/epubs/html/csqzaj08/csqzaj08tfrm.htm
It is a pain to log onto every server just to run a runmqsc command. That is why I love the MO71 support pack. I have 100 queue managers listed in it. By highlighting one in that GUI, with 1 more click I can open up a runmqsc window directly into that box. And I can have multiple ones running at the same time. Get it, its free. You will be glad you did.
http://www-306.ibm.com/software/integration/support/supportpacs/individual/mo71.html |
|