Message loss with failover in MQ V7 Cluster
Wally
Posted: Wed Sep 22, 2010 7:24 am

Novice

Joined: 22 Sep 2010
Posts: 15

Now I am lost - just waited 5 minutes before sending messages and all messages arrived at the on-line qmgr - no loss !?!?

Ok, if MQ is not recognizing the channel failure in time at least the message should stay in the S.C.T.Q - no ??

So now I have a scenario where it works, but it is still a riddle to me where the message goes when I try to send a message "too early" ...
Vitor
Posted: Wed Sep 22, 2010 7:31 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

Wally wrote:
Ok, if MQ is not recognizing the channel failure in time at least the message should stay in the S.C.T.Q - no ??


This comes back to the message being visible or not, committed or not, etc, etc.

Wally wrote:
So now I have a scenario where it works, but it is still a riddle to me where the message goes when I try to send a message "too early" ...


You also know 1st hand why trying to use a WMQ cluster for this kind of purpose isn't a good idea.
_________________
Honesty is the best policy.
Insanity is the best defence.
bruce2359
Posted: Wed Sep 22, 2010 7:31 am

Poobah

Joined: 05 Jan 2008
Posts: 9470
Location: US: west coast, almost. Otherwise, enroute.

Quote:
the message should stay in the S.C.T.Q - no ??

Yes, presuming that a message was created AND the message didn't expire OR end up in a dead-letter queue.

MQ doesn't lose messages.
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
Wally
Posted: Wed Sep 22, 2010 7:38 am

Novice

Joined: 22 Sep 2010
Posts: 15

Vitor wrote:

How do you mean "targeted"? If a message is addressed to a given queue manager it bypasses the cluster workload distribution.


I mean targeted in the sense that round-robin would try to send the message to this qmgr.

Vitor wrote:

So if you have 3 messages browsable in the SCTQ and bring one of the queue managers on line what happens?

After the channel reconnects, the messages are transferred fine.

Vitor wrote:

If a message (M1) isn't sent to the on-line queue manager but M2 & M3 are, what happens if you then bring the other queue manager on-line?

The message does not appear in the previously off-line, now on-line qmgr either.

Vitor wrote:

Are you certain expiry isn't in use?

At least I don't set it anywhere, and browsing the SCTQ shows expiry unlimited.

I just discovered another strange thing: when I stop both T1 and T2 and try to send 4 messages via S1, only 3 are kept in the SCTQ and again the first message is lost - how strange is this ??
Wally
Posted: Wed Sep 22, 2010 7:41 am

Novice

Joined: 22 Sep 2010
Posts: 15

bruce2359 wrote:
Quote:
the message should stay in the S.C.T.Q - no ??

Yes, presuming that a message was created AND the message didn't expire OR end up in a dead-letter queue.

MQ doesn't lose messages.


I thought this as well ... till now ... but I swear the system.dead.letter.queue is empty !
Vitor
Posted: Wed Sep 22, 2010 7:45 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

Wally wrote:
I just discovered another strange thing: when I stop both T1 and T2 and try to send 4 messages via S1, only 3 are kept in the SCTQ and again the first message is lost - how strange is this ??


Which 3?

Have you checked your code to ensure all are put?

If you can prove that when both queue managers are down and the channels are stopped, all 4 messages are put with RC 0 and committed and only 3 are in the SCTQ then it's time for a PMR.
_________________
Honesty is the best policy.
Insanity is the best defence.
bruce2359
Posted: Wed Sep 22, 2010 7:49 am

Poobah

Joined: 05 Jan 2008
Posts: 9470
Location: US: west coast, almost. Otherwise, enroute.

Is this a new application?

Let's do a simple test.
1. Stop the clussdr channel on the qmgr where you run the application.
2. Any messages in the SCTQ?
3. Run the application putting exactly 3 messages.
4. Any messages in the SCTQ now?
Repeat this test a few times.
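
The test above can be scripted so it is easy to repeat. A minimal sketch in Python, assuming two hypothetical callables that are not part of any MQ API: `get_depth()` returns the current depth of the SCTQ and `put_message(body)` puts one message from the application. The in-memory `FakeQueue` only stands in for a real queue manager so the sketch runs on its own.

```python
from typing import Callable, List


class FakeQueue:
    """In-memory stand-in for the SCTQ (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def put(self, body: str) -> None:
        self.messages.append(body)

    def depth(self) -> int:
        return len(self.messages)


def run_depth_check(get_depth: Callable[[], int],
                    put_message: Callable[[str], None],
                    count: int = 3) -> dict:
    """Steps 2-4 of the test: depth before, put `count` messages, depth after."""
    before = get_depth()
    for i in range(count):
        put_message(f"test-{i + 1}")
    after = get_depth()
    return {
        "before": before,
        "after": after,
        "expected_after": before + count,
        "all_accounted_for": after == before + count,
    }


sctq = FakeQueue()
result = run_depth_check(sctq.depth, sctq.put, count=3)
print(result)
```

With the clussdr channel stopped (step 1), a run against a real queue manager where `all_accounted_for` comes back false would be the case worth investigating further.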

Does your code catch errors?

By the way, have you read the WMQ Clusters manual?
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
Wally
Posted: Wed Sep 22, 2010 8:15 am

Novice

Joined: 22 Sep 2010
Posts: 15

Vitor wrote:
Wally wrote:
I just discovered another strange thing: when I stop both T1 and T2 and try to send 4 messages via S1, only 3 are kept in the SCTQ and again the first message is lost - how strange is this ??


Which 3?

Have you checked your code to ensure all are put?

If you can prove that when both queue managers are down and the channels are stopped, all 4 messages are put with RC 0 and committed and only 3 are in the SCTQ then it's time for a PMR.


I have checked again: it seems the message only gets lost in a situation where previous messages have been distributed across my two target qmgrs and the qmgrs are then shut down.

So I guess it all comes back to the detection of the broken channel communication - but still the message should not be lost.

On a freshly, partially started cluster all messages stay fine in the SCTQ - the same is true when simply stopping the cluster sender channels.

By the way I use MQ v7.0.1.0 - anyone using the same version and not having this issue?
bruce2359
Posted: Wed Sep 22, 2010 8:24 am

Poobah

Joined: 05 Jan 2008
Posts: 9470
Location: US: west coast, almost. Otherwise, enroute.

Quote:
I have checked again: it seems the message only gets lost in a situation where previous messages have been distributed across my two target qmgrs and the qmgrs are then shut down.

Are you saying that you looked into the target queues down the network, AND that all the messages were successfully deployed to the target queues? Or only that the messages were no longer in the SCTQ?

Are you saying that after you shut one or both of the qmgrs down, one or more of the messages disappear from the target queues?

Please be more precise and specific in your posts. Messages can only be in a few places: transmission queue waiting to be sent, target queue or dead-letter queue. This presumes that a message was actually created.

Again, does your app catch errors?
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
Wally
Posted: Wed Sep 22, 2010 8:28 am

Novice

Joined: 22 Sep 2010
Posts: 15

bruce2359 wrote:
Is this a new application?

Let's do a simple test.
1. Stop the clussdr channel on the qmgr where you run the application.
2. Any messages in the SCTQ?
3. Run the application putting exactly 3 messages.
4. Any messages in the SCTQ now?
Repeat this test a few times.

Does your code catch errors?

By the way, have you read the WMQ Clusters manual?


Yes, my application catches exceptions and is also transactional and I have tried to read the MQ cluster manual as best as I can ...

So my observation is: whenever MQ tries to transfer a message to a failed cluster component and has not run its state synch process, a message could be lost.
In cases where the overall status has been determined, the messages are buffered or correctly rerouted.
Wally
Posted: Wed Sep 22, 2010 8:41 am

Novice

Joined: 22 Sep 2010
Posts: 15

bruce2359 wrote:
Quote:
I have checked again: it seems the message only gets lost in a situation where previous messages have been distributed across my two target qmgrs and the qmgrs are then shut down.

Are you saying that you looked into the target queues down the network, AND that all the messages were successfully deployed to the target queues? Or only that the messages were no longer in the SCTQ?

Are you saying that after you shut one or both of the qmgrs down, one or more of the messages disappear from the target queues?

Please be more precise and specific in your posts. Messages can only be in a few places: transmission queue waiting to be sent, target queue or dead-letter queue. This presumes that a message was actually created.

Again, does your app catch errors?


I will try to describe the failure scenario better:

- Start > T1, T2 and S1 started
- Send messages m1, m2, m3 and m4 to S1
- Check status > T1: m1 and m3 in q; T2: m2 and m4 in q
- Stop T1 (next targeted qmgr with round robin) until stopped status is indicated
- Send messages m5, m6, m7 and m8 (without any exception)
- Check status > S1: SCTQ empty and SDLQ empty; T1: off-line; T2: m6, m7 and m8 in q
- Start T1 and check status > T1: empty q
- m5 missing

Positive scenario:

- Start > S1 started; T1 and T2 off-line
- Send messages m1, m2, m3 and m4 to S1
- Check status > S1: SCTQ m1 - m4;
- Start T1 and check status > S1: SCTQ empty and SDLQ empty; T1: m1 - m4

Also positive scenario:

- Start > T1, T2 and S1 started
- Send messages m1, m2, m3 and m4 to S1
- Check status > T1: m1 and m3 in q; T2: m2 and m4 in q
- Stop T1 (next targeted qmgr with round robin) and wait about 5 minutes
- Send messages m5, m6, m7 and m8 (without any exception)
- Check status > S1: SCTQ empty and SDLQ empty; T1: off-line; T2: m5, m6, m7 and m8 in q
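
The first (failure) scenario above can be written down as simple bookkeeping: every message that was sent should show up in exactly one of the places that were checked. A minimal sketch in Python, using only the observations listed in that scenario; the `reconcile` helper and the location labels are made up for illustration and are not MQ calls.

```python
def reconcile(sent, locations):
    """Return the sent message IDs not found in any checked location."""
    found = set()
    for ids in locations.values():
        found |= set(ids)
    return sorted(set(sent) - found)


# Observations from the failure scenario, after T1 was restarted.
sent = ["m5", "m6", "m7", "m8"]
locations = {
    "S1 SCTQ": [],
    "S1 dead-letter queue": [],
    "T1 target queue": [],
    "T2 target queue": ["m6", "m7", "m8"],
}

missing = reconcile(sent, locations)
print(missing)  # prints ['m5']
```

Running the same check on the second positive scenario, where T2 holds m5 to m8, gives an empty list.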
Vitor
Posted: Wed Sep 22, 2010 9:26 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

Wally wrote:
- Stop T1 (next targeted qmgr with round robin) until stopped status is indicated


Pedantically you don't know that's next in line. Just because the last put was made to T2 doesn't guarantee that T1 is next. Indeed, the 2nd positive scenario is positive because the target changes.

But if you feel you've found a repeatable bug a PMR is your next action.
_________________
Honesty is the best policy.
Insanity is the best defence.
Vitor
Posted: Wed Sep 22, 2010 9:27 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

Wally wrote:
has not run its state synch process a message could be lost.


For the record, which process is this? I've never heard of it. All I know about is the workload balancer, which doesn't hold state that I'm aware of.
_________________
Honesty is the best policy.
Insanity is the best defence.
bruce2359
Posted: Wed Sep 22, 2010 9:55 am

Poobah

Joined: 05 Jan 2008
Posts: 9470
Location: US: west coast, almost. Otherwise, enroute.

In some, but not all of your narrative, you say "Check status > S1: SCTQ empty"

Does this mean that you actually saw messages in the SCTQ? Was there an instance when you saw messages in the SCTQ empty?

Did you do the test that I suggested?

Quote:
...has not run its state synch process a message could be lost.

I'm thinking that it's your application that, for whatever reason, did not commit, and therefore the message(s) were backed out of the SCTQ.
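
To illustrate what a backed-out put would look like from the outside, here is a toy model in Python. It assumes each message is put in its own unit of work and that the first unit of work is backed out; it does not model MQ internals, and the class and method names are hypothetical. It only shows that a put followed by a backout leaves nothing on the queue.

```python
class ToySyncpointQueue:
    """Toy model of puts under syncpoint. Not MQ code; names are hypothetical."""

    def __init__(self):
        self.committed = []   # messages visible to getters and channels
        self.pending = []     # messages put in the current unit of work

    def put(self, body):
        self.pending.append(body)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending = []

    def backout(self):
        self.pending = []


q = ToySyncpointQueue()

q.put("m5")
q.backout()            # unit of work never committed, e.g. an error path

for body in ("m6", "m7", "m8"):
    q.put(body)
    q.commit()

print(q.committed)     # prints ['m6', 'm7', 'm8']
```

Whether this is what happened here can only be settled by checking that every put and every commit in the application actually completed.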
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
mvic
Posted: Wed Sep 22, 2010 11:22 am

Jedi

Joined: 09 Mar 2004
Posts: 2080

Wally wrote:
So when I run the "failover test" I stop the complete qmgr T1; wait a time and then again try to send messages.

Failover? I'm not sure why you would call it that.

Your Full Repositories should be up 24x7. They should be on good solid networks, and in constant contact with each other.

Also, IMHO (I know opinions differ) it's good practice to NOT have them host application queues, but simply to hold all the cluster information.


Last edited by mvic on Wed Sep 22, 2010 11:31 am; edited 1 time in total