Cluster DLQ behavior |
Author |
Message
|
mattfarney |
Posted: Tue Oct 04, 2011 2:41 pm Post subject: |
|
|
 Disciple
Joined: 17 Jan 2006 Posts: 167 Location: Ohio
|
I may not have made this clear in my original post.
This layout/design has worked for 7+ years. I have seen exactly one instance of this oddity out of hundreds of millions of transactions, but it worried me enough to ask a question.
Code: |
."..¿...DLH ....'...SCS.LOGGING.QUEUE SCSAPP1. "...µ...MQSTR ....amqrmppa_nd 2011100305001407<?xml version="1.0"? |
As requested, that is the DLH from the message. I truncated the XML payload since it doesn't matter here.
SCSAPP1. is DELTA, but the name is missing the rest of the qualifiers. I did not modify that. It is missing in the DLH. The queue name as listed is correct and is a local queue on DELTA shared in the Greek cluster only.
I understand why the message went to the DLQ. It cannot deliver the message as presented.
My question is: how did it get to the other box in the first place? Either the cluster processes delivered it to the wrong machine, or the wrong information was put into the transmit header when the message was written to the SCTQ.
-mf |
|
|
 |
mattfarney |
Posted: Tue Oct 04, 2011 2:43 pm Post subject: |
|
|
 Disciple
Joined: 17 Jan 2006 Posts: 167 Location: Ohio
|
Additional information:
WebSphere MQ 6.0, Message Broker 5.0 (upgrade pending).
The message is being routed from Broker directly with no other application involved. |
|
|
 |
fjb_saper |
Posted: Tue Oct 04, 2011 7:52 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
Matt,
This got me thinking, and perhaps you will never get the resolution.
My thought is: can you run a
Code: |
dis clusqmgr(*) channel conname |
on all the cluster members in both clusters? Is it possible that old information about a deleted qmgr lingers somewhere and got acted upon (an old channel?), so that the other end of the current channel no longer matches and handles the message according to the rules?
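A slightly wider form of that display can make stale entries easier to spot. This is only a sketch: it asks for a few more standard DISPLAY CLUSQMGR attributes so that each entry shows which cluster it belongs to, whether it is a full repository, how the channel was defined, and its current state. What counts as "stale" in this setup is an assumption; compare the output against the list of queue managers you expect.

```mqsc
* Run in runmqsc on each cluster member.
* CLUSTER  - which cluster the entry belongs to
* QMTYPE   - REPOS (full repository) or NORMAL (partial repository)
* DEFTYPE  - how the channel definition came about (manual or automatic)
* STATUS   - current channel state for that entry
DISPLAY CLUSQMGR(*) CLUSTER CHANNEL CONNAME QMTYPE DEFTYPE STATUS
```

An entry whose CONNAME or cluster name does not match a known, active queue manager would be the kind of leftover worth chasing.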
Have fun  _________________ MQ & Broker admin |
|
|
 |
mattfarney |
Posted: Wed Oct 05, 2011 10:30 am Post subject: |
|
|
 Disciple
Joined: 17 Jan 2006 Posts: 167 Location: Ohio
|
The Colors cluster is not mine, so I cannot run it there.
To my knowledge, the queue in question has never been defined in the Colors cluster. It has only ever been defined on Delta in the Greek cluster. The queue in question is for logging information created by the brokers in the Greek cluster.
None of the Colors machines (past or present) has ever been a part of the Greek cluster, so I don't see how latent poison info could exist.
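One way to check this from the Greek side is to ask a queue manager what it currently knows about the queue. A minimal sketch, assuming the queue name shown in the DLH (SCS.LOGGING.QUEUE) and that the commands are run in runmqsc on a Greek member and on Delta respectively:

```mqsc
* On any Greek cluster member: list every cluster instance of the queue
* that this queue manager knows about, with its cluster and hosting qmgr.
DISPLAY QCLUSTER(SCS.LOGGING.QUEUE) CLUSTER CLUSQMGR

* On Delta (the hosting qmgr): confirm how the local queue is shared.
* Only one of CLUSTER and CLUSNL should be non-blank.
DISPLAY QLOCAL(SCS.LOGGING.QUEUE) CLUSTER CLUSNL
```

If the first command ever lists an instance outside the Greek cluster, that would contradict the assumption above.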
-mf |
|
|
 |
fjb_saper |
Posted: Wed Oct 05, 2011 11:59 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
mattfarney wrote: |
The Colors cluster is not mine, so I cannot run it there.
To my knowledge, the queue in question has never been defined in the Colors cluster. It has only ever been defined on Delta in the Greek cluster. The queue in question is for logging information created by the brokers in the Greek cluster.
None of the Colors machines (past or present) has ever been a part of the Greek cluster, so I don't see how latent poison info could exist.
-mf |
I am getting a little bit more complex there.
Imagine there used to be a qmgr in the greek cluster with a channel named to.colors1. If that channel exists in the colors cluster and is known to the greek cluster, this might be a way to float a msg to the wrong destination.
When checking the destination on the greek cluster do you only have names of known greek active qmgrs?  _________________ MQ & Broker admin |
|
|
 |
mattfarney |
Posted: Wed Oct 05, 2011 3:57 pm Post subject: |
|
|
 Disciple
Joined: 17 Jan 2006 Posts: 167 Location: Ohio
|
In the dis clusqmgr(*) channel conname cluster results, I see only active queue managers. The lists match what I expect in name, cluster, conname, and which machines are aware of both clusters.
This gives rise to another question. In my setup, I have queue managers that are in two clusters. Theoretically, you could set up cluster object definitions using a list, or just have two independent sets of cluster objects for the clusters. Is either method preferred? Safer?
-mf |
|
|
 |
bruce2359 |
Posted: Wed Oct 05, 2011 5:56 pm Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
If you want a definition to be visible to one cluster only, use the CLUSTER(clustername) attribute in the object definition.
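If you want the definition to be visible to a list of clusters, use the CLUSNL(namelistname) attribute in the object definition, where namelistname is a NAMELIST object whose NAMES attribute lists the clusters. CLUSTER and CLUSNL are mutually exclusive; only one of them may be non-blank on a given object. _________________ I like deadlines. I like to wave as they pass by.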
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
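The two options can be sketched in MQSC as follows. The cluster names GREEK and COLORS stand in for the pseudonyms used in this thread, and the namelist name and second queue name are hypothetical. Note that CLUSNL takes the name of a NAMELIST object, not an inline list of cluster names.

```mqsc
* Option 1: visible in one cluster only.
DEFINE QLOCAL(SCS.LOGGING.QUEUE) CLUSTER(GREEK)

* Option 2: visible in several clusters via a namelist.
* GREEK.COLORS.NL and SHARED.EXAMPLE.QUEUE are hypothetical names.
DEFINE NAMELIST(GREEK.COLORS.NL) NAMES(GREEK, COLORS)
DEFINE QLOCAL(SHARED.EXAMPLE.QUEUE) CLUSNL(GREEK.COLORS.NL)
```

For a queue that must never leave one cluster, such as the logging queue discussed here, the single CLUSTER attribute keeps the scope explicit.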
|
|
 |
fjb_saper |
Posted: Wed Oct 05, 2011 9:13 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
However you need to be very careful with qmgrs bridging clusters i.e. having a membership in both clusters.
Say you have Q1 a queue in the "greek" cluster with instances only on greek qmgrs.
Say you have Q1 a queue in the "colors" cluster with instances only on the color qmgrs.
Say you have GreekColor a qmgr participating in both clusters and it does not host Q1.
Now if you have a msg destined for Q1 that would by any chance happen to be routed through the GreekColor qmgr, where would that message end up? There is no telling whether the message would wind up in the greek or color cluster, because from the point of view of the GreekColor qmgr both are a viable destination. With the qmgr name blank both are acceptable choices.
If you have a cluster alias, it is then advisable to have 2 separate gateway qmgrs linked by p2p channels, each in its own cluster and advertising in its own cluster a path to the other cluster...
Having overlapping clusters makes this only so much more complicated.
At this point you'd want each qmgr in the cluster to advertise the cluster alias, except for the overlapping qmgr, which somewhat defeats the purpose of routing from one cluster to the other.
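The two-gateway layout could look roughly like this on the Greek side. This is a sketch under assumptions, not a description of the poster's configuration: GREEKGW and COLORSGW are hypothetical gateway queue managers (each a member of one cluster only), and the channel name, alias name, and connection name are invented for illustration.

```mqsc
* On GREEKGW, a member of the GREEK cluster only.
* Transmission queue and point-to-point sender towards the Colors gateway.
DEFINE QLOCAL(COLORSGW) USAGE(XMITQ)
DEFINE CHANNEL(GREEKGW.TO.COLORSGW) CHLTYPE(SDR) TRPTYPE(TCP) +
       CONNAME('colorsgw.example.com(1414)') XMITQ(COLORSGW)

* Queue manager alias advertised to the GREEK cluster. Messages addressed
* to qmgr name COLORS.ALIAS are drawn to GREEKGW and forwarded to COLORSGW
* over the point-to-point channel. RNAME is blank for a qmgr alias.
DEFINE QREMOTE(COLORS.ALIAS) RNAME(' ') RQMNAME(COLORSGW) +
       XMITQ(COLORSGW) CLUSTER(GREEK)

* COLORSGW needs the matching RCVR channel, plus a mirror-image set of
* definitions for traffic flowing from Colors to Greek.
```

Because neither gateway belongs to both clusters, a blank queue manager name can only ever resolve within the sender's own cluster, which avoids the ambiguity described above.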
 _________________ MQ & Broker admin |
|
|
 |
PeterPotkay |
Posted: Thu Oct 06, 2011 3:31 pm Post subject: Re: Cluster DLQ behavior |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
mattfarney wrote: |
Either Alpha or Beta generated a message destined for a local queue on Delta. |
Well, is that really true? How can you prove that? How do you know the message didn't originate from an app connected to a QM in the Colors cluster? _________________ Peter Potkay
Keep Calm and MQ On |
|
|
 |
mattfarney |
Posted: Fri Oct 07, 2011 2:25 pm Post subject: |
|
|
 Disciple
Joined: 17 Jan 2006 Posts: 167 Location: Ohio
|
Technically, I can't really prove anything.
Close inspection of the message tells me two things:
The DLH has information that the Colors QM does not know: a qlocal definition not shared to Colors and a machine name that they do not (and could not) talk to.
The payload of the message in the DLH was probably generated by a Colors QM.
-mf |
|
|
 |
bruce2359 |
Posted: Fri Oct 07, 2011 4:36 pm Post subject: Re: Cluster DLQ behavior |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
I find myself with some time to ponder this, and try to better understand your configuration... Please be patient with me. I've read, and reread your posts, and I remain a bit confused. I'd like to help, but I need some clarifications from you.
mattfarney wrote: |
Observed something that has me worried. I have a pair of intersecting clusters, some of which are my machines, and some are not.
My machines are Alpha, Beta, Gamma, Delta, and Epsilon where Alpha and Beta are full repositories in the Greek cluster. |
The usual definition of intersecting clusters implies that one or more of the qmgrs is an FR for the two clusters - that is, the qmgrs (Alpha and Beta) have a REPOSNL namelist naming both Colors and Greek as a qmgr attribute. Is this true?
mattfarney wrote: |
Alpha and Beta are also satellite QMs in another cluster consisting of Purple and Black, both of which are full repositories in the Colors cluster. There are probably more QMs here but I can't see them (nor do I need to). |
What exactly do you mean by "Satellite"? "Satellite" only appears once in the Intercommunications manual; and appears zero times in the WMQ Clusters manual.
Do you mean that there are SENDER/RECEIVER or other non-cluster channels that connect Colors to Greek?
Clearly, you only see some of the configuration. If there are other qmgrs in Colors that you can't see, you should not summarily dismiss them so quickly. There may be some other configuration issues that you don't see - that might have some bearing on this issue.
mattfarney wrote: |
Here's what I saw:
Either Alpha or Beta generated a message destined for a local queue on Delta. |
From the MQMD and message application data, you are certain of the source of the message? The creating application can not possibly exist on one of the other qmgrs, or in the other cluster?
mattfarney wrote: |
There is only one copy of this queue shared in the Greek cluster (and none in the Colors cluster). |
You mean only one instance of the queue (not copy, which implies a 2nd instance elsewhere).
mattfarney wrote: |
We had some volume issues on the Delta and Epsilon QMs (same machine). When I was looking for outstanding issues and stranded traffic, I found the message showed up in the Dead Letter Queue.....on Purple. [Technically, an email from their MQ contact told me.] |
Stranded traffic? What does this mean - in WMQ terminology, please? Stranded in WMQ-speak often refers to messages on a qmgr that is currently not operating (shut down, for example). Exactly what did you find in the WMQ error logs? Or is "stranded traffic" unrelated to WMQ operation?
mattfarney wrote: |
I cannot think of a single reason this could have occurred, that doesn't involve a problem inside MQ. |
Do you mean that there are no explicitly defined channels between qmgrs in the two clusters?
If there are no WMQ channels (of any channel type - cluster or non-cluster) between Greek and Colors, you do have a mystery.
If you have channels between Greek and Colors, then there is/are object definitions (queue-manager alias, for example) that caused the message to be forwarded.
If there are no object definitions that moved the message, it is possible that one of the qmgrs has a default transmission queue (attribute of qmgr object), and a SENDER/SERVER channel that takes messages from that xmit queue, and that's how the message was moved. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
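The possibilities listed above can be checked with a few displays on each Greek queue manager. A minimal sketch, assuming runmqsc access to the queue manager in question; which results would count as suspicious depends on the configuration:

```mqsc
* Is a default transmission queue set? Also shows the dead-letter queue
* and any full-repository settings on this queue manager.
DISPLAY QMGR DEFXMITQ DEADQ REPOS REPOSNL

* List channels with their type, transmission queue and target, to find
* any non-cluster SDR/SVR channel that could serve a default xmit queue.
DISPLAY CHANNEL(*) CHLTYPE XMITQ CONNAME

* List remote queue definitions, including queue manager aliases
* (blank RNAME), that could redirect a message to another qmgr.
DISPLAY QREMOTE(*) RNAME RQMNAME XMITQ CLUSTER
```

A blank DEFXMITQ and no unexpected QREMOTE definitions would rule out the last two explanations.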
|
|
 |
mattfarney |
Posted: Mon Oct 10, 2011 1:10 pm Post subject: |
|
|
 Disciple
Joined: 17 Jan 2006 Posts: 167 Location: Ohio
|
Quote: |
The usual definition of intersecting clusters implies that one or more of the qmgrs is an FR for the two clusters - that is, the qmgrs (Alpha and Beta) have a REPOSNL namelist naming both Colors and Greek as a qmgr attribute. Is this true?
|
No. The full repositories for the Colors cluster are not any of my machines.
Quote: |
What exactly do you mean by "Satellite"? "Satellite" only appears once in the Intercommunications manual; and appears zero times in the WMQ Clusters manual. |
I use satellite as short hand for non-repository queue manager in a cluster. Sorry if that was confusing.
Quote: |
Do you mean that there are SENDER/RECEIVER or other non-cluster channels that connect Colors to Greek? |
There are no non-cluster channel definitions between the clusters.
Quote: |
From the MQMD and message application data, you are certain of the source of the message? The creating application can not possibly exist on one of the other qmgrs, or in the other cluster? |
As best as I can tell, the destination information was unique to Greek, while the transaction data was unique to Colors.
Quote: |
You mean only one instance of the queue (not copy, which implies a 2nd instance elsewhere). |
Yes, one instance. Copy was a poor choice of words.
I use the term stranded traffic to refer to things that I have to fix: messages in a DLQ, messages stuck in a transmit queue, along with some other conditions where I have to resubmit traffic from our logging process. It's basically: stuff I need to fix.
Quote: |
Do you mean that there are no explicitly defined channels between qmgrs in the two clusters?
If there are no WMQ channels (of any channel type - cluster or non-cluster) between Greek and Colors, you do have a mystery. |
There are only cluster senders and cluster receivers that bridge the two clusters.
Could a default xmit queue be SYSTEM.CLUSTER.TRANSMIT.QUEUE?
I'm leaning towards the theory that the original write of the message to the SCTQ had an issue. MQ tried to interpret the header as best it could and didn't end up with a reasonable solution... but the proof for that theory is long gone.
-mf |
|
|
 |
fjb_saper |
Posted: Mon Oct 10, 2011 10:56 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
Quote: |
As best as I can tell, the destination information was unique to Greek, while the transaction data was unique to Colors. |
Here we're going somewhere. If at the time of the post the cluster name resolution algorithm for some reason could not route adequately (FR not available to deliver routing choices in a timely fashion) the message would have hit the DLQ in the Colors cluster.
Sometimes a simple DLQ handler with a retry will do the trick down the road, once comms with the FRs are fine again.
Have fun  _________________ MQ & Broker admin |
|
|
 |
|
|
|
|
|
|
|