cluster troubleshooting |
sebastia |
Posted: Fri Apr 27, 2007 3:47 am Post subject: cluster troubleshooting |
|
|
 Grand Master
Joined: 07 Oct 2004 Posts: 1003
|
We have a problem: the primary repository queue manager went down.
Now it does not see the remote cluster queues.
Is there a standard sequence of troubleshooting actions for a cluster?
Thanks.
|
|
 |
Vitor |
Posted: Fri Apr 27, 2007 4:02 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
There's the Cluster manual.
If the full repository has gone down, your cluster should still function (because all the interconnections remain defined), but you won't be able to add queues or queue managers. Unless you've followed the recommendations in the manual and have 2 full repositories!
As to troubleshooting, treat it like a normal queue manager. Make sure it's running normally, check that the cluster channels are working, and it should all sort itself out in time. You might have to refresh the cluster, depending on what failed when.
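As a rough sketch of those checks in runmqsc on the affected queue manager (DEMO.CLUSTER is a placeholder cluster name, not one from this thread):

```
* How this queue manager currently sees the other cluster members
DISPLAY CLUSQMGR(*) QMTYPE STATUS SUSPEND

* Current status of the channels, including the cluster sender and receiver channels
DISPLAY CHSTATUS(*)

* Only if the local cluster information still looks wrong once the channels are running.
* REPOS(NO) keeps the knowledge of which queue managers are full repositories.
REFRESH CLUSTER(DEMO.CLUSTER) REPOS(NO)
```

REFRESH CLUSTER discards locally held cluster information and rebuilds it, so it is better kept as a last step than used routinely.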
|
|
 |
sebastia |
Posted: Fri Apr 27, 2007 5:58 am Post subject: |
|
|
 Grand Master
Joined: 07 Oct 2004 Posts: 1003
|
Believe me - the Cluster manual has been read from the first page to the last.
But the troubleshooting section (Part 4, Appendixes, Appendix A) is quite light ...
My particular problem went away with a "SUSPEND QMGR" command.
But I would still ask for a "Cluster Troubleshooting Guide" ...
Cheers!
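For anyone finding this later, the suspend/resume pair looks roughly like this in runmqsc. DEMO.CLUSTER is a placeholder cluster name, and the sketch only shows the command syntax, not why it cleared the problem in this case:

```
* Tell the rest of the cluster to avoid routing work to this queue manager where possible
SUSPEND QMGR CLUSTER(DEMO.CLUSTER)

* Confirm the suspended state as seen locally
DISPLAY CLUSQMGR(*) SUSPEND

* Make the queue manager available to the cluster again
RESUME QMGR CLUSTER(DEMO.CLUSTER)
```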
|
|
 |
bruce2359 |
Posted: Fri Apr 27, 2007 7:26 am Post subject: |
|
|
Guest
|
"...primary repository QM has gone down" Are you saying that you only have ONE full repository??
If so, what you experienced is the expected outcome of having one full repository. In this case, there is little diagnostic work to be done.
Having two (or more) full repositories ensures that the failure of one full repository will not cause a catastrophic failure like the one you experienced. If one of the full repository queue managers fails, the remaining one carries on doing the full repository work.
|
|
 |
sebastia |
Posted: Fri Apr 27, 2007 11:08 am Post subject: |
|
|
 Grand Master
Joined: 07 Oct 2004 Posts: 1003
|
No, Bruce - I would say my customer is very clever
and does NOT have only one Full Repository.
But, somehow, it all went wrong,
and remote queues are not "seen" at one of the Full Repositories
(the one that crashed).
SUSPEND/RESUME has fixed that,
but I still feel very unsupported when my clusters have problems.
Last month, for example,
we had one FR on z/OS and another FR on an AIX machine.
I was issuing the "DISPLAY CLUSQMGR" command every 10 seconds
on the AIX machine, and the results were different every time.
There was some kind of "TCP" storm on the network.
So our cluster was ... not stable.
But I felt I had no MQ tools to see what was happening ...
Like today.
Thanks for your patience.
|
|
 |
bruce2359 |
Posted: Fri Apr 27, 2007 12:23 pm Post subject: |
|
|
Guest
|
From the Queue Manager Clusters manual:
DISPLAY CLUSQMGR
Use the DISPLAY CLUSQMGR command to display cluster information about queue managers in a cluster. If you issue this command from a queue manager with a full repository, the information returned pertains to every queue manager in the cluster. If you issue this command from a queue manager that does not have a full repository, the information returned pertains only to the queue managers in which it has an interest. That is, every queue manager to which it has tried to send a message and every queue manager that holds a full repository.
The information includes most channel attributes that apply to cluster-sender and cluster-receiver channels, and also:
DEFTYPE   How the queue manager was defined. DEFTYPE can be one of the following:
    CLUSSDR    Defined explicitly as a cluster-sender channel
    CLUSSDRA   Defined by auto-definition as a cluster-sender channel
    CLUSSDRB   Defined as a cluster-sender channel, both explicitly and by auto-definition
    CLUSRCVR   Defined as a cluster-receiver channel
QMTYPE    Whether it holds a full repository or only a partial repository.
CLUSDATE  The date at which the definition became available to the local queue manager.
CLUSTIME  The time at which the definition became available to the local queue manager.
STATUS    The current status of the cluster-sender channel for this queue manager.
SUSPEND   Whether the queue manager is suspended.
CLUSTER   What clusters the queue manager is in.
CHANNEL   The cluster-receiver channel name for the queue manager.
--------------------------------------------------------------------
When you say that the results were different each time you issued the DIS CLUSQMGR command every 10 seconds, do you mean that successive results differed on the AIX box, or that the AIX results differed from those on the other full repository (on z/OS)? Which?
What kind of things were different?
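A minimal sketch of a command that requests the attributes described above, so that successive outputs can be compared field by field:

```
* Run on the full repository being investigated; (*) matches every cluster queue manager it knows
DISPLAY CLUSQMGR(*) DEFTYPE QMTYPE CLUSDATE CLUSTIME STATUS SUSPEND
```

Comparing CLUSDATE and CLUSTIME between two runs should help show whether definitions are actually being replaced or whether only the channel STATUS is changing.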
|
|
 |
sebastia |
Posted: Fri Apr 27, 2007 1:47 pm Post subject: |
|
|
 Grand Master
Joined: 07 Oct 2004 Posts: 1003
|
I was on AIX the whole time, using "runmqsc".
Two consecutive displays showed different cluster configurations,
and they were different again after 10-15 seconds.
Also, different remote (cluster) queues were displayed.
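To pin down which cluster queues come and go between displays, something along these lines could be captured on each run (a sketch; the generic name * simply matches every cluster queue known locally):

```
* Cluster queues known to this queue manager, with the hosting queue manager
* and the date and time each definition became available locally
DISPLAY QCLUSTER(*) CLUSTER CLUSQMGR CLUSDATE CLUSTIME
```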