Author |
Message
|
sam@prof |
Posted: Mon Feb 19, 2007 3:05 am Post subject: Problem Adding a Queue Manager to a Cluster |
|
|
Apprentice
Joined: 15 Aug 2006 Posts: 30
|
Hi,
I am having problems adding a new queue manager to a cluster. I have created a cluster sender channel on the new queue manager to one of the queue managers in the cluster with the full repository. I have also created a cluster receiver channel for the new queue manager.
I have checked the host names and port numbers are correct on all the channels.
When I run DISPLAY CLUSQMGR(*) on the queue manager holding the full repository, there are no details about the queue manager I am trying to add to the cluster.
When I run the same command from the queue manager I am adding to the cluster, no details are returned for the repository queue manager or for the cluster I am trying to join.
I have also run the REFRESH CLUSTER command on both machines but that has not made a difference.
I am guessing this is a problem with the clustered repository updating but I don’t know what else I can do. I have added queue managers from other boxes to this cluster and have not had a problem in the past. Does anyone have any ideas on what is wrong? |
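For reference, the two channel definitions described in this post would look roughly like the following in MQSC. This is only a sketch: the cluster name DEMO.CLUSTER, the receiver name TO.NEWQM, and the host names and ports are placeholders that do not come from this thread. The cluster-sender name must match the name of the cluster-receiver defined on the full repository.

```mqsc
* Run in runmqsc on the queue manager being added.
* All names, hosts and ports below are placeholders.
DEFINE CHANNEL(TO.NEWQM) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) +
       CONNAME('newqm.example.com(1414)') CLUSTER(DEMO.CLUSTER)
DEFINE CHANNEL(TO.REPOS) CHLTYPE(CLUSSDR) TRPTYPE(TCP) +
       CONNAME('repos.example.com(1414)') CLUSTER(DEMO.CLUSTER)
```

The CONNAME on the cluster-receiver is the address other queue managers use to reach this one, so it has to be resolvable from the full repository as well.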
|
|
 |
Mr Butcher |
Posted: Mon Feb 19, 2007 3:58 am Post subject: |
|
|
 Padawan
Joined: 23 May 2005 Posts: 1716
|
What is the status of the TO.repository channel that you created on the queue manager you are trying to add to the cluster? Try to get this one into running state first! _________________ Regards, Butcher |
|
|
 |
sam@prof |
Posted: Mon Feb 19, 2007 4:11 am Post subject: |
|
|
Apprentice
Joined: 15 Aug 2006 Posts: 30
|
The TO.repository channel is running and the RQMNAME contains the value of the queue manager holding the full repository.
The receiver channel on the queue manager I am adding to the cluster is also running but it does not have the RQMNAME of the repository queue manager. |
|
|
 |
jefflowrey |
Posted: Mon Feb 19, 2007 4:49 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
What RQMNAME? I'm sure I've never specified an RQMNAME when defining clusters.
Did you review the steps in the Queue Manager Clusters manual for adding a queue manager? Did you get the CLUSTER name right?
Can you amqsputc to a qcluster from the new queue manager? _________________ I am *not* the model of the modern major general. |
|
|
 |
sam@prof |
Posted: Mon Feb 19, 2007 5:31 am Post subject: |
|
|
Apprentice
Joined: 15 Aug 2006 Posts: 30
|
[i]>What RQMNAME? I'm sure I've never specified an RQMNAME when defining clusters.[/i]
I did not specify an RQMNAME but when I enter DISPLAY CHSTATUS(channelName), this is what appears in the RQMNAME field.
[i]>Did you review the steps in the Queue Manager Clusters manual for adding a queue manager? Did you get the CLUSTER name right? [/i]
Yes and yes, I have added queue managers to this cluster in the past and have not had any difficulties
[i]>Can you amqsputc to a qcluster from the new queue manager?[/i]
Nope, I get the error message 'MQOPEN ended with reason code 2085' |
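Reason code 2085 is MQRC_UNKNOWN_OBJECT_NAME: the queue manager could not resolve the queue name, which for a cluster queue usually means the local repository has no record of it. One way to confirm what the new queue manager currently knows, sketched here with standard MQSC display commands, is:

```mqsc
* Run in runmqsc on the queue manager being added.
* Cluster queues this queue manager has learned about
DISPLAY QCLUSTER(*) CLUSTER CLUSQMGR
* Cluster queue managers it has learned about
DISPLAY CLUSQMGR(*) CLUSTER QMTYPE STATUS
```

If both commands return nothing, the partial repository has probably not received any information from the full repository yet.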
|
|
 |
jefflowrey |
Posted: Mon Feb 19, 2007 5:38 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
That seems pretty clearly a problem.
Are the cluster channels in both directions running? Are there any errors or FDCs?
Does it make a difference if you restart the qmgr? _________________ I am *not* the model of the modern major general. |
|
|
 |
fjb_saper |
Posted: Mon Feb 19, 2007 5:39 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
Can you display the output of
Code: |
dis q(system.cluster.*) curdepth ipprocs opprocs |
?
And if the ipprocs and opprocs on any of the 3 queues displayed are 0 you need to bounce the qmgr.
Enjoy  _________________ MQ & Broker admin |
|
|
 |
sam@prof |
Posted: Mon Feb 19, 2007 6:11 am Post subject: |
|
|
Apprentice
Joined: 15 Aug 2006 Posts: 30
|
Jeff - restarting the queue manager does not make a difference. The sender channel is running but the receiver is not running for this cluster. There are no errors or FDCs.
fjb_saper - The output is:
dis q(system.cluster.*) curdepth ipprocs opprocs
21 : dis q(system.cluster.*) curdepth ipprocs opprocs
AMQ8409: Display Queue details.
QUEUE(SYSTEM.CLUSTER.COMMAND.QUEUE) IPPROCS(0)
OPPROCS(0) CURDEPTH(534)
AMQ8409: Display Queue details.
QUEUE(SYSTEM.CLUSTER.REPOSITORY.QUEUE)
IPPROCS(0) OPPROCS(0)
CURDEPTH(2)
AMQ8409: Display Queue details.
QUEUE(SYSTEM.CLUSTER.TRANSMIT.QUEUE) IPPROCS(0)
OPPROCS(0) CURDEPTH(2)
What do you mean by 'bounce the qmgr'?
Thanks everyone for your replies! |
|
|
 |
jefflowrey |
Posted: Mon Feb 19, 2007 6:12 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
He meant restart the qmgr.
If the receiver is not running, then that's a problem.
What does dis chs(*) on the FR show? _________________ I am *not* the model of the modern major general. |
|
|
 |
sam@prof |
Posted: Mon Feb 19, 2007 6:44 am Post subject: |
|
|
Apprentice
Joined: 15 Aug 2006 Posts: 30
|
On the FR, it shows the cluster receiver as running.
I ran the same command on the queue manager I am adding to the cluster and it shows the cluster sender as running, but the cluster receiver is not displayed.
I have been going through the log files and I found the following error messages:
AMQ9419: No cluster-receiver channels for cluster ''
EXPLANATION:
The repository manager has received information about a cluster for which no
cluster-receiver channels are known.
ACTION:
Define cluster-receiver channels for the cluster on the local queue manager.
AMQ6184: An internal WebSphere MQ error has occurred on queue manager
BT.QM.PMT4.
EXPLANATION:
An error has been detected, and the WebSphere MQ error recording routine has
been called. The failing process is process 23434.
ACTION:
Use the standard facilities supplied with your system to record the problem
identifier, and to save the generated output files. Contact your IBM support
center. Do not discard these files until the problem has been resolved. |
|
|
 |
jefflowrey |
Posted: Mon Feb 19, 2007 6:56 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
You should see both the cluster sender AND the cluster receiver running on the FR.
The same goes for the PR.
It seems like the CLUSTER attribute on one of the channels is not correct. _________________ I am *not* the model of the modern major general. |
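One way to compare the CLUSTER attribute across the channel definitions is sketched below. The blank cluster name in the AMQ9419 message ('') may point at a cluster-receiver whose CLUSTER and CLUSNL attributes are both empty, though that is an inference and not something confirmed in this thread. TO.NEWQM and DEMO.CLUSTER are placeholder names.

```mqsc
* Run on both the full and the partial repository, then compare
* the CLUSTER (or CLUSNL) value on each cluster channel.
DISPLAY CHANNEL(*) CHLTYPE CLUSTER CLUSNL CONNAME
* If the cluster-receiver has a blank or wrong cluster name
* (TO.NEWQM and DEMO.CLUSTER are placeholders):
ALTER CHANNEL(TO.NEWQM) CHLTYPE(CLUSRCVR) CLUSTER(DEMO.CLUSTER)
```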
|
|
 |
fjb_saper |
Posted: Tue Feb 20, 2007 3:50 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
sam@prof wrote: |
fjb_saper - The output is:
dis q(system.cluster.*) curdepth ipprocs opprocs
21 : dis q(system.cluster.*) curdepth ipprocs opprocs
AMQ8409: Display Queue details.
QUEUE(SYSTEM.CLUSTER.COMMAND.QUEUE) IPPROCS(0)
OPPROCS(0) CURDEPTH(534)
AMQ8409: Display Queue details.
QUEUE(SYSTEM.CLUSTER.REPOSITORY.QUEUE)
IPPROCS(0) OPPROCS(0)
CURDEPTH(2)
AMQ8409: Display Queue details.
QUEUE(SYSTEM.CLUSTER.TRANSMIT.QUEUE) IPPROCS(0)
OPPROCS(0) CURDEPTH(2) |
I'd like to know what the system says about the same command (see above) after he restarts the qmgr.
Obviously, from the output, the cluster repository manager has stopped running.
As it is supposed to be under qmgr control, he has to bounce the qmgr.
At the same time he might want to clear the system.cluster.command.queue before bouncing the qmgr. There just might be some "junk" in there causing the qmgr to terminate its cluster repository process. If it is a partial repository, clear the system.cluster.repository.queue as well.
I'd check the messages on the system.cluster.transmit.queue and find out why the channel to their destination is not running.
He might have gotten some FDC with object changed error on some cluster queue (changed with a "Force" command?)
Enjoy  _________________ MQ & Broker admin |
|
|
 |
sam@prof |
Posted: Tue Feb 20, 2007 9:39 am Post subject: |
|
|
Apprentice
Joined: 15 Aug 2006 Posts: 30
|
I have:
Suspended the cluster
Stopped all the channels
Deleted the TO.REPOS channel
Killed the amqrrmfa process
Restarted the queue manager
Redefined the TO.REPOS channel
Resumed the cluster
Started all the channels
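The MQSC parts of the steps above map roughly onto the commands below. This is a sketch and not a record of what was actually typed: DEMO.CLUSTER and the connection name are placeholders, and killing amqrrmfa and restarting the queue manager happen at the operating system level, outside runmqsc.

```mqsc
* DEMO.CLUSTER and the CONNAME value are placeholders.
SUSPEND QMGR CLUSTER(DEMO.CLUSTER)
STOP CHANNEL(TO.REPOS)
DELETE CHANNEL(TO.REPOS)
* (queue manager restarted here, outside runmqsc)
DEFINE CHANNEL(TO.REPOS) CHLTYPE(CLUSSDR) TRPTYPE(TCP) +
       CONNAME('repos.example.com(1414)') CLUSTER(DEMO.CLUSTER)
RESUME QMGR CLUSTER(DEMO.CLUSTER)
START CHANNEL(TO.REPOS)
```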
I was unable to delete messages in SYSTEM.CLUSTER.REPOSITORY.QUEUE as this queue manager is a full repository for another cluster.
I re-entered dis q(system.cluster.*) curdepth ipprocs opprocs:
dis q(system.cluster.*) curdepth ipprocs opprocs
7 : dis q(system.cluster.*) curdepth ipprocs opprocs
AMQ8409: Display Queue details.
QUEUE(SYSTEM.CLUSTER.COMMAND.QUEUE) IPPROCS(1)
OPPROCS(0) CURDEPTH(0)
AMQ8409: Display Queue details.
QUEUE(SYSTEM.CLUSTER.REPOSITORY.QUEUE)
IPPROCS(1) OPPROCS(1)
CURDEPTH(7)
AMQ8409: Display Queue details.
QUEUE(SYSTEM.CLUSTER.TRANSMIT.QUEUE) IPPROCS(3)
OPPROCS(3) CURDEPTH(26)
But the receiver channel on the queue manager I am adding to the cluster is still not running.
The log file shows the following error 3 times:
----- amqxfdcx.c : 673 --------------------------------------------------------
02/20/07 04:50:38 PM
AMQ6183: An internal WebSphere MQ error has occurred.
EXPLANATION:
An error has been detected, and the WebSphere MQ error recording routine has been
called. The failing process is process 9105.
ACTION:
Use the standard facilities supplied with your system to record the problem
identifier, and to save the generated output files. Contact your IBM support
center. Do not discard these files until the problem has been resolved.
_______________________________________________________
And 5 FDC files were created during the time I did the above process, all containing:
AMQ6109: An internal WebSphere MQ error has occurred
Major Errorcode :- xecE_W_UNEXPECTED_ASYNC_SIGNAL |
|
|
 |
jefflowrey |
Posted: Tue Feb 20, 2007 10:13 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
Those would possibly be from the "killed amqrrmfa process" step. _________________ I am *not* the model of the modern major general. |
|
|
 |
fjb_saper |
Posted: Tue Feb 20, 2007 1:24 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
The output now shows that at least the repository manager is running...
Now you will have to check why the channels are not running between the FR and the PR.
Make sure that the IP and port are available from the PR and the FR.
If need be use telnet to check.
Check the 26 messages in the cluster xmitq and browse them. You need to find out where they are going and why they seem stuck.
Enjoy  _________________ MQ & Broker admin |
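The checks suggested here can be started from runmqsc on the partial repository. The commands below are a sketch using standard MQSC display commands and take no names from the thread other than the system queue.

```mqsc
* Which cluster queue managers are known, and over which channel and address
DISPLAY CLUSQMGR(*) CHANNEL CONNAME STATUS
* Status of every channel that currently has status information
DISPLAY CHSTATUS(*) STATUS
* Depth and open handles on the cluster transmission queue
DISPLAY QLOCAL(SYSTEM.CLUSTER.TRANSMIT.QUEUE) CURDEPTH IPPROCS OPPROCS
```

MQSC cannot browse message contents, so looking at the stuck messages themselves needs a browse tool such as the amqsbcg sample program.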
|
|
 |
|