Author |
Message
|
Wally |
Posted: Wed Sep 22, 2010 7:24 am Post subject: |
|
|
Novice
Joined: 22 Sep 2010 Posts: 15
|
Now I am lost - just waited 5 minutes before sending messages and all messages arrived at the on-line qmgr - no loss !?!?
Ok, if MQ is not recognizing the channel failure in time at least the message should stay in the S.C.T.Q - no ??
So I now have a scenario where it works, but it is still a riddle to me where the message goes when I try to send a message "too early" ... |
|
Vitor |
Posted: Wed Sep 22, 2010 7:31 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
Wally wrote: |
Ok, if MQ is not recognizing the channel failure in time at least the message should stay in the S.C.T.Q - no ?? |
This comes back to the message being visible or not, committed or not, etc, etc.
Wally wrote: |
So I now have a scenario where it works, but it is still a riddle to me where the message goes when I try to send a message "too early" ... |
You also know 1st hand why trying to use a WMQ cluster for this kind of purpose isn't a good idea. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
bruce2359 |
Posted: Wed Sep 22, 2010 7:31 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9471 Location: US: west coast, almost. Otherwise, enroute.
|
Quote: |
the message should stay in the S.C.T.Q - no ?? |
Yes, presuming that a message was created AND the message didn't expire OR end up in a dead-letter queue.
MQ doesn't lose messages. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
Wally |
Posted: Wed Sep 22, 2010 7:38 am Post subject: |
|
|
Novice
Joined: 22 Sep 2010 Posts: 15
|
Vitor wrote: |
How do you mean "targeted"? If a message is addressed to a given queue manager it bypasses the cluster workload distribution.
|
I mean targeted in the sense that round-robin would try to send the message to this qmgr.
Vitor wrote: |
So if you have 3 messages browsable in the SCTQ and bring one of the queue managers on line what happens?
|
After the channel reconnects, the messages are transferred fine.
Vitor wrote: |
If a message (M1) isn't sent to the on-line queue manager but M2 & M3 are, what happens if you then bring the other queue manager on-line?
|
The message also does not appear on the previously off-line, now on-line qmgr.
Vitor wrote: |
Are you certain expiry isn't in use?
|
At least I don't set it anywhere, and browsing the SCTQ shows expiry unlimited.
I just discovered another strange thing: when I stop both T1 and T2 and try to send 4 messages via S1, only 3 are kept in the SCTQ and again the first message is lost - how strange is this ?? |
|
Wally |
Posted: Wed Sep 22, 2010 7:41 am Post subject: |
|
|
Novice
Joined: 22 Sep 2010 Posts: 15
|
bruce2359 wrote: |
Quote: |
the message should stay in the S.C.T.Q - no ?? |
Yes, presuming that a message was created AND the message didn't expire OR end up in a dead-letter queue.
MQ doesn't lose messages. |
I thought this as well ... till now ... but I swear the system.dead.letter.queue is empty ! |
|
Vitor |
Posted: Wed Sep 22, 2010 7:45 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
Wally wrote: |
I just discovered another strange thing: when I stop both T1 and T2 and try to send 4 messages via S1, only 3 are kept in the SCTQ and again the first message is lost - how strange is this ?? |
Which 3?
Have you checked your code to ensure all are put?
If you can prove that when both queue managers are down and the channels are stopped, all 4 messages are put with RC 0 and committed and only 3 are in the SCTQ then it's time for a PMR. _________________ Honesty is the best policy.
Insanity is the best defence. |
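One way to gather the kind of evidence asked for above is to have the sending application record the outcome of every put and of the commit. The following is only a minimal sketch, not WMQ API code: `put_fn` and `commit_fn` are hypothetical callables standing in for whatever the application really uses to put and commit, and the `(completion code, reason code)` pair returned by `put_fn` is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class PutRecord:
    msg_id: str
    comp_code: int  # 0 is treated as "OK" in this sketch
    reason: int     # 0 is treated as "no error" in this sketch


def send_with_audit(
    msg_ids: List[str],
    put_fn: Callable[[str], Tuple[int, int]],  # hypothetical: returns (comp_code, reason)
    commit_fn: Callable[[], None],             # hypothetical: commits the unit of work
) -> Tuple[List[PutRecord], bool]:
    """Put every message, commit only if every put succeeded.

    Returns the per-message records and whether the commit was issued.
    """
    records = []
    for msg_id in msg_ids:
        comp_code, reason = put_fn(msg_id)
        records.append(PutRecord(msg_id, comp_code, reason))

    all_ok = all(r.comp_code == 0 and r.reason == 0 for r in records)
    if all_ok:
        commit_fn()
    return records, all_ok


if __name__ == "__main__":
    # Stand-in put/commit functions so the sketch runs without a queue manager.
    records, committed = send_with_audit(
        ["m1", "m2", "m3", "m4"],
        put_fn=lambda msg_id: (0, 0),
        commit_fn=lambda: None,
    )
    for record in records:
        print(record)
    print("committed:", committed)
```

If a log like this shows four successful puts and a commit while only three messages are browsable in the SCTQ, that is the sort of proof that would back up a PMR.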
|
bruce2359 |
Posted: Wed Sep 22, 2010 7:49 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9471 Location: US: west coast, almost. Otherwise, enroute.
|
Is this a new application?
Let's do a simple test.
1. Stop the clussdr channel on the qmgr where you run the application.
2. Any messages in the SCTQ?
3. Run the application putting exactly 3 messages.
4. Any messages in the SCTQ now?
Repeat this test a few times.
Does your code catch errors?
By the way, have you read the WMQ Clusters manual? _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
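Steps 2 and 4 of the test above come down to comparing the depth of the SCTQ before and after the puts. A minimal sketch of that comparison follows, assuming the depth is read from the text output of an MQSC `DISPLAY QLOCAL(SYSTEM.CLUSTER.TRANSMIT.QUEUE) CURDEPTH` command. The sample strings are illustrative, not captured from a real queue manager, and the helper names are hypothetical.

```python
import re


def parse_curdepth(display_output: str) -> int:
    """Extract the CURDEPTH(n) attribute from MQSC-style display text."""
    match = re.search(r"CURDEPTH\((\d+)\)", display_output)
    if match is None:
        raise ValueError("no CURDEPTH attribute found in output")
    return int(match.group(1))


def depth_matches(before: str, after: str, expected_puts: int) -> bool:
    """True if the queue grew by exactly the number of messages put."""
    return parse_curdepth(after) - parse_curdepth(before) == expected_puts


if __name__ == "__main__":
    # Illustrative sample text only; real output contains more attributes.
    before = "QUEUE(SYSTEM.CLUSTER.TRANSMIT.QUEUE) TYPE(QLOCAL)\nCURDEPTH(0)"
    after = "QUEUE(SYSTEM.CLUSTER.TRANSMIT.QUEUE) TYPE(QLOCAL)\nCURDEPTH(3)"
    print(depth_matches(before, after, expected_puts=3))
```

Repeating the check a few times, as suggested, shows whether a shortfall is consistent or intermittent.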
|
Wally |
Posted: Wed Sep 22, 2010 8:15 am Post subject: |
|
|
Novice
Joined: 22 Sep 2010 Posts: 15
|
Vitor wrote: |
Wally wrote: |
I just discovered another strange thing: when I stop both T1 and T2 and try to send 4 messages via S1, only 3 are kept in the SCTQ and again the first message is lost - how strange is this ?? |
Which 3?
Have you checked your code to ensure all are put?
If you can prove that when both queue managers are down and the channels are stopped, all 4 messages are put with RC 0 and committed and only 3 are in the SCTQ then it's time for a PMR. |
I have checked again: it seems the message only gets lost in a situation where previous messages have been distributed across my two target qmgrs and the qmgrs are then shut down.
So I guess it all comes back to the detection of the broken channel communication - but still the message should not be lost.
On a freshly, partially started cluster all messages stay fine in the SCTQ - the same is true when simply stopping the cluster sender channels.
By the way I use MQ v7.0.1.0 - anyone using the same version and not having this issue? |
|
bruce2359 |
Posted: Wed Sep 22, 2010 8:24 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9471 Location: US: west coast, almost. Otherwise, enroute.
|
Quote: |
I have checked again: it seems the message only gets lost in a situation where previous messages have been distributed across my two target qmgrs and the qmgrs are then shut down. |
Are you saying that you looked into the target queues down the network, AND that all the messages were successfully deployed to the target queues? Or only that the messages were no longer in the SCTQ?
Are you saying that after you shut one or both of the qmgrs down, one or more of the messages disappear from the target queues?
Please be more precise and specific in your posts. Messages can only be in a few places: transmission queue waiting to be sent, target queue or dead-letter queue. This presumes that a message was actually created.
Again, does your app catch errors? _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
Wally |
Posted: Wed Sep 22, 2010 8:28 am Post subject: |
|
|
Novice
Joined: 22 Sep 2010 Posts: 15
|
bruce2359 wrote: |
Is this a new application?
Let's do a simple test.
1. Stop the clussdr channel on the qmgr where you run the application.
2. Any messages in the SCTQ?
3. Run the application putting exactly 3 messages.
4. Any messages in the SCTQ now?
Repeat this test a few times.
Does your code catch errors?
By the way, have you read the WMQ Clusters manual? |
Yes, my application catches exceptions and is also transactional and I have tried to read the MQ cluster manual as best as I can ...
So my observation on this is whenever MQ tries to transfer a message to a failed cluster component and has not run its state synch process a message could be lost.
In cases where the overall status is determined the messages are buffered or correctly rerouted. |
|
Wally |
Posted: Wed Sep 22, 2010 8:41 am Post subject: |
|
|
Novice
Joined: 22 Sep 2010 Posts: 15
|
bruce2359 wrote: |
Quote: |
I have checked again: it seems the message only gets lost in a situation where previous messages have been distributed across my two target qmgrs and the qmgrs are then shut down. |
Are you saying that you looked into the target queues down the network, AND that all the messages were successfully deployed to the target queues? Or only that the messages were no longer in the SCTQ?
Are you saying that after you shut one or both of the qmgrs down, one or more of the messages disappear from the target queues?
Please be more precise and specific in your posts. Messages can only be in a few places: transmission queue waiting to be sent, target queue or dead-letter queue. This presumes that a message was actually created.
Again, does your app catch errors? |
I will try to describe the failure scenario better:
- Start > T1, T2 and S1 started
- Send messages m1, m2, m3 and m4 to S1
- Check status > T1: m1 and m3 in q; T2: m2 and m4 in q
- Stop T1 (next targeted qmgr with round robin) until stopped status is indicated
- Send messages m5, m6, m7 and m8 (without any exception)
- Check status > S1: SCTQ empty and SDLQ empty; T1: off-line; T2: m6, m7 and m8 in q
- Start T1 and check status > T1: empty q
- m5 missing
Positive scenario:
- Start > S1 started; T1 and T2 off-line
- Send messages m1, m2, m3 and m4 to S1
- Check status > S1: SCTQ m1 - m4;
- Start T1 and check status > S1: SCTQ empty and SDLQ empty; T1: m1 - m4
Also positive scenario:
- Start > T1, T2 and S1 started
- Send messages m1, m2, m3 and m4 to S1
- Check status > T1: m1 and m3 in q; T2: m2 and m4 in q
- Stop T1 (next targeted qmgr with round robin) and wait about 5 minutes
- Send messages m5, m6, m7 and m8 (without any exception)
- Check status > S1: SCTQ empty and SDLQ empty; T1: off-line; T2: m5, m6, m7 and m8 in q |
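The bookkeeping in the scenarios above can be written down as a small check: given the messages that were sent and what was observed in each place a message can be, list the ones that are not accounted for. This is only a sketch with a hypothetical helper name; the data is copied from the failure scenario and the second positive scenario as reported, not read from any queue manager.

```python
from typing import Dict, List


def unaccounted(sent: List[str], locations: Dict[str, List[str]]) -> List[str]:
    """Return sent message ids found in none of the observed locations, in send order."""
    seen = set()
    for ids in locations.values():
        seen.update(ids)
    return [msg_id for msg_id in sent if msg_id not in seen]


sent = ["m5", "m6", "m7", "m8"]

# Failure scenario: T1 stopped, messages sent straight away.
failure = {
    "S1 SCTQ": [],
    "S1 SDLQ": [],
    "T1 queue (after restart)": [],
    "T2 queue": ["m6", "m7", "m8"],
}

# Second positive scenario: T1 stopped, messages sent after about 5 minutes.
positive = {
    "S1 SCTQ": [],
    "S1 SDLQ": [],
    "T2 queue": ["m5", "m6", "m7", "m8"],
}

print(unaccounted(sent, failure))   # → ['m5']
print(unaccounted(sent, positive))  # → []
```

Listing every location explicitly, including the dead-letter queue, makes it clear which places were actually checked.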
|
Vitor |
Posted: Wed Sep 22, 2010 9:26 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
Wally wrote: |
- Stop T1 (next targeted qmgr with round robin) until stopped status is indicated |
Pedantically you don't know that's next in line. Just because the last put was made to T2 doesn't guarantee that T1 is next. Indeed, the 2nd positive scenario is positive because the target changes.
But if you feel you've found a repeatable bug a PMR is your next action. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
Vitor |
Posted: Wed Sep 22, 2010 9:27 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
Wally wrote: |
has not run its state synch process a message could be lost. |
For the record, which process is this? I've never heard of it. All I know about is the workload balancer, which doesn't hold state that I'm aware of.  _________________ Honesty is the best policy.
Insanity is the best defence. |
|
bruce2359 |
Posted: Wed Sep 22, 2010 9:55 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9471 Location: US: west coast, almost. Otherwise, enroute.
|
In some, but not all of your narrative, you say "Check status > S1: SCTQ empty"
Does this mean that you actually saw messages in the SCTQ? Was there an instance when you saw messages in the SCTQ?
Did you do the test that I suggested?
Quote: |
...has not run its state synch process a message could be lost. |
I'm thinking that it's your application that, for whatever reason, did not commit, and therefore the message(s) were backed out of the SCTQ. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
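The commit hypothesis above can be illustrated with a toy model of a unit of work. This is not WMQ code and makes no claim about how the queue manager is implemented; `ToyQueue` and its methods are hypothetical names. It only shows the general behaviour being described: a message put inside a transaction is not visible until the commit, and it disappears without a trace if the transaction is backed out.

```python
from typing import List


class ToyQueue:
    """Toy transactional queue: puts become visible only on commit."""

    def __init__(self) -> None:
        self._committed: List[str] = []
        self._pending: List[str] = []

    def put(self, msg: str) -> None:
        # The put "succeeds", but the message is only pending for now.
        self._pending.append(msg)

    def commit(self) -> None:
        # Make all pending messages visible.
        self._committed.extend(self._pending)
        self._pending.clear()

    def backout(self) -> None:
        # Discard all pending messages; nothing is dead-lettered.
        self._pending.clear()

    def browse(self) -> List[str]:
        # Only committed messages can be seen.
        return list(self._committed)


queue = ToyQueue()
queue.put("m5")
queue.backout()          # e.g. the application hit an error path and rolled back
queue.put("m6")
queue.commit()
print(queue.browse())    # → ['m6']
```

In this model m5 was put without error yet is in no queue afterwards, which is why the question of whether the application really committed each message matters.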
|
mvic |
Posted: Wed Sep 22, 2010 11:22 am Post subject: |
|
|
 Jedi
Joined: 09 Mar 2004 Posts: 2080
|
Wally wrote: |
So when I run the "failover test" I stop the complete qmgr T1; wait a time and then again try to send messages. |
Failover? I'm not sure why you would call it that.
Your Full Repositories should be up 24x7. They should be on good solid networks, and in constant contact with each other.
Also, IMHO (I know opinions differ) it's good practice to NOT have them host application queues, but simply to hold all the cluster information.
Last edited by mvic on Wed Sep 22, 2010 11:31 am; edited 1 time in total |
|