MQ Cluster as part of HA. Messages stuck on TRANSMIT.QUEUE
tez_i
Posted: Mon Sep 29, 2008 8:02 am Post subject: MQ Cluster as part of HA. Messages stuck on TRANSMIT.QUEUE
Novice
Joined: 03 Apr 2008 Posts: 12
I have a cluster set up as follows:
One queue manager QM-SEND
four queue managers QM-RCV-1, QM-RCV-2, QM-RCV-3, QM-RCV-4
All are part of a cluster, where QM-RCV-1 and QM-RCV-2 are the full repositories (FRs), and QM-RCV-1, 2, 3 and 4 all host the same queues.
The idea being mostly load-balancing, so QM-SEND can round-robin to the QM-RCVs. This works great when all 4 are running.
However, it is also intended that this setup be (A SMALL PART) of the overall HA setup. It would be my assumption that if we took out QM-RCV-1 (one of the FRs) then everything should carry on as normal.
BUT - of course, it doesn't, or I wouldn't be posting here.....
When QM-RCV-1 is shut down, messages stay on the SYSTEM.CLUSTER.TRANSMIT.QUEUE of QM-SEND.
The messages all have TO.QM-RCV-1 in their correlation IDs, so it's obvious why they are not being delivered.
TO.QM-RCV-1 is in retry state (as would happen if it fell over unexpectedly, i.e. deliberately what I am testing).
TO.QM-RCV-2 (to talk to the other FR) is not started, obviously, because the messages are not being routed there, but it can be started. Likewise for QM-RCV-3 and QM-RCV-4, the partial repositories (PRs), which also host the relevant queue.
DIS CLUSQMGR on QM-SEND shows what I consider to be the correct setup: correct CLUSSDR channel names, correct CONNAME, correct cluster name. The statuses are 'RETRYING' or 'RUNNING' respectively.
To the best of my ability, I have checked that the app putting the messages onto QM-SEND is not specifying a queue manager name and is using bind not fixed.
Is there something wrong with this setup that is stopping the re-routing of the messages to another QM-RCV?
What else should I be looking at to debug the problem?
Mr Butcher
Posted: Mon Sep 29, 2008 8:16 am
Padawan
Joined: 23 May 2005 Posts: 1716
If you display the cluster queue on QM-SEND, do you see all 4 instances of that queue?
I'd expect that when the TO.QM-RCV-1 channel goes into retry, a different destination is picked for the messages, if one of these destinations is available.
_________________
Regards, Butcher
PeterPotkay
Posted: Mon Sep 29, 2008 8:21 am
Poobah
Joined: 15 May 2001 Posts: 7722
Are these even application messages that are backed up? If you take an FR down, you will get some cluster admin messages starting to pile up trying to go to the FR.
If they are app messages, are you sure the app queue is defined on the other QMs, that it is clustered correctly, and that it is not put inhibited? And are you 100% sure that the QM name is not set and Bind On Open wasn't used? If yes, how? Because the app told you so?
What happens if you send some test messages from your test app connected to QM-SEND to a test queue you have defined on all 4 RCV QMs?
_________________
Peter Potkay
Keep Calm and MQ On
tez_i
Posted: Mon Sep 29, 2008 8:34 am
Novice
Joined: 03 Apr 2008 Posts: 12
All four instances of the cluster queue can be seen on QM-SEND.
And, yes, the application message that I am tracking through the system is one of the ones stuck on the transmit queue - as well as, of course, cluster admin and other messages.
The queues are not put inhibited - the problem is definitely that the message is being routed to the "wrong" queue manager..... and as Mr Butcher said, I would have expected the new destination to be picked once the channel was in retry.
As to being 100% sure about the bindings....well no....the app developer says so!!!! And so, for now, I have to believe it. I am not even 100% convinced this has ever been tested before. But it's my job to prove or disprove the MQ resilience (and I can't blame the other side just yet!)
.....More tests needed.....and any more ideas welcome!
PeterPotkay
Posted: Mon Sep 29, 2008 9:10 am
Poobah
Joined: 15 May 2001 Posts: 7722
Set up your own test queues and test this yourself, where you have control over the sending app and the options used.
What are the short and long retry intervals on the CLUSRCVR channel on QM-RCV-1?
_________________
Peter Potkay
Keep Calm and MQ On
fjb_saper
Posted: Mon Sep 29, 2008 11:42 am
Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
Use RFHUtil to browse the messages stuck on the cluster xmitq.
Check the correlId to make sure it contains the right channel. You'll have to switch between display modes of the correlId.
Make sure that the channel in the correlId is the current channel in that cluster and that it is active and in running status... This should get the messages to flow...
_________________
MQ & Broker admin
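A note on switching between the display modes of the correlId: on the cluster transmit queue the 24-byte CorrelId holds the name of the cluster-sender channel, so the hex view and the character view are two renderings of the same bytes. The following is a minimal Python sketch of that decoding, not part of any MQ tool. It assumes an ASCII platform and right-padding with blanks or nulls, and the sample value is made up from the channel name mentioned in this thread.

```python
def channel_from_correl_id(correl_hex: str) -> str:
    """Return the channel name held in a CorrelId given as a hex string."""
    raw = bytes.fromhex(correl_hex)
    if len(raw) != 24:
        raise ValueError(f"expected 24 bytes, got {len(raw)}")
    # Assumption: ASCII text, padded on the right with blanks or nulls.
    return raw.decode("ascii").rstrip(" \x00")


# Hypothetical sample: "TO.QM-RCV-1" padded with blanks to 24 bytes.
sample_hex = "TO.QM-RCV-1".ljust(24).encode("ascii").hex().upper()

print(channel_from_correl_id(sample_hex))  # prints TO.QM-RCV-1
```

If the decoded name is a channel that is no longer current in the cluster, or one that is in retry, that would explain why the message is not moving.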
tez_i
Posted: Tue Sep 30, 2008 6:29 am
Novice
Joined: 03 Apr 2008 Posts: 12
We may possibly have narrowed this down to an issue with JMS, especially as this is an upgraded system (MQ, WAS and Java).
The thing is, does anyone have a good way to test putting messages to the cluster queue on QM-SEND? Obviously we can't use RFHUtil, and by using our test harness, we may not be excluding the JMS/Java problem.
...to reiterate.... the CorrelId of the message on the XmitQ has the incorrect value in it, that's the problem.
And it looks like, although the code *IS* supposed to be bind-not-fixed, it actually seems to be bind-on-open.
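To illustrate why the binding matters here, the following is a toy Python model, not the real cluster workload algorithm. It assumes a single open handle that was bound to QM-RCV-1 at open time in the bind-on-open case, and a plain rotation over the available queue managers in the bind-not-fixed case. The function name `route` and the message labels are made up for this sketch.

```python
from itertools import cycle

DESTINATIONS = ["QM-RCV-1", "QM-RCV-2", "QM-RCV-3", "QM-RCV-4"]


def route(messages, available, bind_on_open):
    """Toy model: return (delivered, stuck) for one open handle."""
    delivered = {qm: [] for qm in DESTINATIONS}
    stuck = []
    # Assumption: with bind-on-open the handle was fixed to QM-RCV-1.
    bound_to = DESTINATIONS[0]
    # With bind-not-fixed a destination is picked per message; this toy
    # version simply rotates over the queue managers that are available.
    candidates = cycle([qm for qm in DESTINATIONS if qm in available])
    for msg in messages:
        target = bound_to if bind_on_open else next(candidates, None)
        if target in available:
            delivered[target].append(msg)
        else:
            stuck.append(msg)  # stays on the transmit queue in this model
    return delivered, stuck


msgs = ["m1", "m2", "m3", "m4", "m5", "m6"]
up = {"QM-RCV-2", "QM-RCV-3", "QM-RCV-4"}  # QM-RCV-1 is shut down

print(route(msgs, up, bind_on_open=True)[1])   # all six messages are stuck
print(route(msgs, up, bind_on_open=False)[1])  # nothing is stuck
```

In this model the bind-on-open handle keeps targeting QM-RCV-1 even though three other instances of the queue are reachable, which matches the symptom of every stuck message carrying TO.QM-RCV-1.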
PeterPotkay
Posted: Tue Sep 30, 2008 7:09 am
Poobah
Joined: 15 May 2001 Posts: 7722
tez_i wrote: "The thing is, does anyone have a good way to test putting messages to the cluster queue on QM-SEND? Obviously we can't use RFHUtil,"
Use any program that puts MQ messages: RFHUtil, amqsput, MO71, and about 100 other ones.
And to reiterate, you should have a suite of test queues on all your QMs in the cluster to test with. Relying on apps to test MQ infrastructure changes is bad. You should be confident your own tests worked before calling the app teams and asking them to test their apps.
_________________
Peter Potkay
Keep Calm and MQ On
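When testing with a program where you control the open options, the binding is selected by an MQOO_* flag combined into the options value. The sketch below only does that arithmetic in Python, using the MQOO_* values from the MQ C header (cmqc.h). The helper name `open_options` is made up for illustration.

```python
# MQOO_* open option values as defined in the MQ C header (cmqc.h).
MQOO_OUTPUT = 0x00000010             # 16
MQOO_FAIL_IF_QUIESCING = 0x00002000  # 8192
MQOO_BIND_ON_OPEN = 0x00004000       # 16384
MQOO_BIND_NOT_FIXED = 0x00008000     # 32768


def open_options(*flags: int) -> int:
    """Combine individual MQOO_* flags into one open-options value."""
    combined = 0
    for flag in flags:
        combined |= flag
    return combined


not_fixed = open_options(MQOO_OUTPUT, MQOO_FAIL_IF_QUIESCING, MQOO_BIND_NOT_FIXED)
on_open = open_options(MQOO_OUTPUT, MQOO_FAIL_IF_QUIESCING, MQOO_BIND_ON_OPEN)

print(not_fixed)  # 40976
print(on_open)    # 24592
```

If your copy of the amqsput sample accepts numeric open options as its third parameter (check the header comment in amqsput0.c for your version), a value like 40976 could be passed after the queue name and queue manager name to force bind not fixed for the test, and 24592 to reproduce bind on open.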
fjb_saper
Posted: Tue Sep 30, 2008 7:36 am
Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
tez_i wrote: "We may possibly have narrowed this down to an issue with JMS, especially as this is an upgraded system (MQ, WAS and Java).
The thing is, does anyone have a good way to test putting messages to the cluster queue on QM-SEND? Obviously we can't use RFHUtil, and by using our test harness, we may not be excluding the JMS/Java problem.
...to reiterate.... the CorrelId of the message on the XmitQ has the incorrect value in it, that's the problem.
And it looks like, although the code *IS* supposed to be bind-not-fixed, it actually seems to be bind-on-open."
Did you check the JNDI definition of the destination? Does it say bind on open there? What is the default binding (DEFBIND) on your queue definition? Are you using a URI that specifies bind not fixed? Do you have a qmgr name in your URI?
_________________
MQ & Broker admin
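As a quick way to check the last question, the MQ JMS queue URI has the form `queue://[qmgr]/qname[?name=value&...]`, so an empty queue manager part shows up as three slashes. The following Python sketch splits such a URI so the queue manager part can be inspected. The function name `parse_queue_uri` and the queue name APP.QUEUE are made up, and the sketch only handles the `queue://` form, not a bare queue name.

```python
from urllib.parse import parse_qs


def parse_queue_uri(uri: str):
    """Split 'queue://[qmgr]/qname[?k=v&...]' into (qmgr, queue, properties)."""
    prefix = "queue://"
    if not uri.startswith(prefix):
        raise ValueError("not a queue URI: " + uri)
    rest = uri[len(prefix):]
    path, _, query = rest.partition("?")
    qmgr, sep, queue = path.partition("/")
    if not sep or not queue:
        raise ValueError("missing queue name: " + uri)
    # Keep only the first value of each property for simplicity.
    props = {key: values[0] for key, values in parse_qs(query).items()}
    return qmgr, queue, props


for uri in ("queue:///APP.QUEUE", "queue://QM-RCV-1/APP.QUEUE?targetClient=1"):
    qmgr, queue, props = parse_queue_uri(uri)
    verdict = "no qmgr name set" if qmgr == "" else "qmgr name set: " + qmgr
    print(queue, verdict, props)
```

A non-empty queue manager part naming one of the cluster queue managers would direct the messages to that queue manager regardless of the binding, so it is worth ruling out before looking at the bind setting.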