McueMart |
Posted: Fri Jun 27, 2014 8:25 am Post subject: Cluster Validation |
|
|
 Chevalier
Joined: 29 Nov 2011 Posts: 490 Location: UK...somewhere
|
Over the past few months, I have observed a number of 'corruptions' in our MQ clusters (which range from 2 to 20 QMs).
Sometimes I have seen multiple QMGRs with the same name listed when 'display clusqmgr(*)' is run, and other times I have got RC2189 (Cluster resolution error) for no apparent reason.
I am not blaming the MQ product for all these issues - they may well have been caused by user error (due to us re-creating QMGRs etc!). We are also running at 7.5.0.2, so upgrading to 7.5.0.3 should be one of our first actions (and this is something we will do in the near future!).
Another thing I wanted to do was create a kind of 'Validate my cluster!' program - which brings me to the point of this post. Has anyone ever tried to create a program which programmatically validates that an MQ cluster is healthy?
My thinking is that it could check a number of things:
1) Run 'DISPLAY CLUSQMGR(*)' on each QMGR and make sure the output is consistent across all QMGRs, with no duplicate entries.
2) Check that the cluster-sender (CLUSSDR) and cluster-receiver (CLUSRCVR) channels are defined correctly and in an acceptable state.
3) Check that the output of 'DISPLAY QUEUE(*) TYPE(QCLUSTER)' is consistent across all QMGRs. This will check that all cluster queues can successfully be 'seen' on all the QMGRs.
I would be very interested to hear if anyone has ever attempted something like this - and if so, whether they have any constructive feedback about how it can be achieved.
Many thanks! |
|
|
 |
exerk |
Posted: Fri Jun 27, 2014 8:36 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
McueMart wrote: |
...Sometimes I have seen multiple QMGRs with the same name listed when 'display clusqmgr(*)' is run... |
And you have removed the 'offending' QMIDS? _________________ It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys. |
|
|
 |
PeterPotkay |
Posted: Fri Jun 27, 2014 8:51 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
I have test queues in the cluster that I use to test the cluster. Each QM has the same clustered queue, and its own local non-clustered queue. One by one the script connects to a QM and tries to send messages to the local non-clustered queue on each of the other QMs (exercising each channel), then throws a burst of messages to the clustered queue. On to the next QM and repeat.
When done you've proven each QM can talk to the others via cluster channels, and you can see load balancing works.
Nothing beats actually using the cluster to prove it works. This identifies channels with problems, suspended QMs that should not be, etc.
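A rough sketch of how such a test run could be laid out, not the actual script described above. The queue names TEST.LOCAL.<qm> and TEST.CLUSTER, the burst size, and the put callable are all hypothetical; put stands in for whatever MQ client code connects to a QM and puts messages, and it is assumed to address a non-clustered queue on another QM by supplying that QM's name as the target queue manager.

```python
def build_plan(qmgrs, local_queue_fmt="TEST.LOCAL.{qm}",
               cluster_queue="TEST.CLUSTER", burst=20):
    """Return a list of (connect_to, target_qmgr, queue, message_count) steps."""
    plan = []
    for source in qmgrs:
        # One message to every other QM's own test queue exercises
        # the cluster channel from source to that QM.
        for target in qmgrs:
            if target != source:
                plan.append((source, target, local_queue_fmt.format(qm=target), 1))
        # A burst to the shared clustered queue (no target QM named)
        # lets you see how the messages were spread across instances.
        plan.append((source, "", cluster_queue, burst))
    return plan


def run_plan(plan, put):
    """Run each step through put(connect_to, target_qmgr, queue, count).

    Any exception raised by put is recorded rather than aborting the run,
    so one broken channel does not hide problems elsewhere.
    """
    failures = []
    for step in plan:
        try:
            put(*step)
        except Exception as exc:  # broad on purpose: collect every failure
            failures.append((step, str(exc)))
    return failures
```

After the run, comparing the queue depths of the clustered test queue on each QM would show whether the burst was actually balanced.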
Can it be better / more comprehensive? Sure. But it does a pretty good job in finding most common problems. Or just giving us the warm and fuzzies that all the QMs be clustering. _________________ Peter Potkay
Keep Calm and MQ On |
|
|
 |
tczielke |
Posted: Fri Jun 27, 2014 11:45 am Post subject: |
|
|
Guardian
Joined: 08 Jul 2010 Posts: 941 Location: Illinois, USA
|
We have noted some odd cluster behavior since we moved a majority of our queue managers (including the full repositories) to 7.5.0.2. We are actually in the process of moving the full repositories to 7.5.0.3, because a PMR suggested it could potentially help with some of what we were seeing. |
|
|
 |
bruce2359 |
Posted: Fri Jun 27, 2014 1:30 pm Post subject: Re: Cluster Validation |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
McueMart wrote: |
Sometimes I have seen multiple QMGRs with the same name listed when 'display clusqmgr(*)' is run... |
Do the qmgrs with the same name have the exact same QMID value? _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
rammer |
Posted: Sun Jun 29, 2014 5:50 am Post subject: Re: Cluster Validation |
|
|
Partisan
Joined: 02 May 2002 Posts: 359 Location: England
|
bruce2359 wrote: |
McueMart wrote: |
Sometimes I have seen multiple QMGRs with the same name listed when 'display clusqmgr(*)' is run... |
Do the qmgrs with the same name have the exact same QMID value? |
In addition to checking for different QMIDs, check the CONNAME, as you may find someone is "testing" by building a Queue Manager and hooking it up to your REPOS but using the name of an existing QMGR. That way you can see which server is the offending one. Also on 7.5 add CHLAUTH rules to allow connections in from only the IPs of the Queue Managers you know about. |
|
|
 |
McueMart |
Posted: Mon Jun 30, 2014 1:50 am Post subject: |
|
|
 Chevalier
Joined: 29 Nov 2011 Posts: 490 Location: UK...somewhere
|
bruce2359 wrote: |
Do the qmgrs with the same name have the exact same QMID value? |
No, different QMIDs - clearly caused by QMGRs being re-created with the same name.
exerk wrote: |
And you have removed the 'offending' QMIDS? |
Yep - RESET CLUSTER can successfully remove them.
tczielke wrote: |
We have noted some odd cluster behavior since we moved a majority of our queue managers (including the full repositories) to 7.5.0.2. We are actually in the process of moving the full repositories to 7.5.0.3, because a PMR suggested it could potentially help with some of what we were seeing.
|
Glad it's not just me going crazy then. Looking at the APAR list for 7.5.0.3, there are a good number related to clustering so I think this will be a high priority for us.
@PeterPotkay - I like that approach a lot. I'll look at how we can automatically add something like this to our environments (maybe in addition to validating the output of some of the runmqsc commands!).
Many thanks all! |
|
|
 |
|