Author |
Message
|
PeterPotkay |
Posted: Mon Nov 05, 2007 8:13 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
JYama wrote: |
What I'd like to know is why the clussdr channel indicated 'RUNNING' even though the target QMgr was NOT running, why the status didn't change to 'RETRYING' after sending ONE message to the target QMgr that was not running, why it took 20 to 30 secs for the status of the clussdr to change to 'RETRYING', and what this 'long' interval is.
How can I shorten this interval?
|
I agree with you. If you send 1 message down a channel that is not valid, you would think it should go into retrying and thus not accept more messages. Maybe not immediately, but certainly within 5 or 10 seconds it should know enough.
Read this doc; it will help. There is a Japanese version too:
http://www-1.ibm.com/support/docview.wss?rs=203&uid=swg24006699&loc=en_US&cs=utf-8&lang=en
But I don't see that it tells us exactly how fast a channel will go into retrying when it realizes there is a problem. _________________ Peter Potkay
Keep Calm and MQ On |
|
fjb_saper |
Posted: Mon Nov 05, 2007 2:41 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20757 Location: LI,NY
|
Peter, I think that goes back to the retry interval and retry count of the cluster channel. If the retry count is 10 and the retry interval is 1 second, you potentially have 10 seconds until the channel notifies the qmgr that it is in retry mode...
A lot of messages can get to the cluster xmitq in 10 seconds.
Note: do not confuse retry count and retry interval with short retry and long retry. Short and long retry only apply after the retry count has been hit.  _________________ MQ & Broker admin |
|
PeterPotkay |
Posted: Mon Nov 05, 2007 3:02 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
FJ, those 2 parms you mention (retry interval and retry count) apply only to the RCVR side of a channel and only come into play when the RCVR-type MCA cannot put to the destination queue. It waits retry interval ms before trying to put the message again, and attempts this retry count times. It then puts the message to the DLQ, discards it, or stops the channel, depending on the scenario.
Those 2 parms don't factor into JYama's problem here, which is how fast the SNDR side will realize there is a problem. _________________ Peter Potkay
Keep Calm and MQ On |
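A minimal sketch of the receiver-side retry behaviour described in this post, written as a plain Python simulation rather than MQ code. The function `deliver_with_retry`, its parameters, and the `put` / `dead_letter` callables are hypothetical stand-ins for what the receiving MCA does internally; the two numeric parameters play the role of the channel's message retry count and message retry interval. What happens once the DLQ put also fails is left as a simple "failed" result, since the post says it depends on the scenario.

```python
import time
from typing import Callable


def deliver_with_retry(
    put: Callable[[], bool],
    dead_letter: Callable[[], bool],
    retry_count: int,
    retry_interval_ms: int,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Simulate a receiving MCA putting one message (hypothetical model).

    Returns "delivered", "dead-lettered", or "failed".
    """
    # The first put attempt happens immediately.
    if put():
        return "delivered"
    # Each retry waits the retry interval first, up to retry_count times.
    for _ in range(retry_count):
        sleep(retry_interval_ms / 1000.0)
        if put():
            return "delivered"
    # Retries exhausted: try the dead-letter queue.
    if dead_letter():
        return "dead-lettered"
    # Discarding the message or stopping the channel is not modelled here.
    return "failed"
```

Note that nothing in this loop involves the sending side, which is the point being made: these two values do not determine how quickly a sender notices that its partner is gone.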
|
PeterPotkay |
Posted: Mon Nov 05, 2007 3:28 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
JYama,
You never answered the question about the NPMSPEED attribute of the CLUSRCVR channels on QM2. If it's set to FAST and the messages are non-persistent, then you are seeing expected behaviour.
If the messages are persistent -or- the message speed of the channel is set to NORMAL, then the 1st message down the channel that is no longer able to talk to QM2 should throw the channel into retry*. All future messages should get routed to QM1. Any uncommitted messages in the channel's batch, including the one that made the channel retry, should get rolled back and would be eligible to go to another QM in the cluster, unless those messages are specifically addressed to QM2.
* A Heartbeat attempt will also do this. _________________ Peter Potkay
Keep Calm and MQ On |
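The rule in this post can be written down as a small decision function. This is only an illustrative model under the assumptions stated in the post: the `Message` class and both function names are hypothetical, and the sketch ignores the caveat about messages addressed specifically to QM2.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Message:
    msg_id: str
    persistent: bool


def sent_under_syncpoint(msg: Message, npmspeed: str) -> bool:
    """True if the message travels inside the channel's batch (recoverable)."""
    # Only non-persistent messages on a FAST channel bypass the batch.
    return msg.persistent or npmspeed.upper() == "NORMAL"


def on_channel_failure(
    in_flight: List[Message], npmspeed: str
) -> Tuple[List[str], List[str]]:
    """Split in-flight messages into (rolled_back, possibly_discarded)."""
    rolled_back: List[str] = []
    at_risk: List[str] = []
    for msg in in_flight:
        if sent_under_syncpoint(msg, npmspeed):
            rolled_back.append(msg.msg_id)  # eligible to go to another QM
        else:
            at_risk.append(msg.msg_id)      # may be discarded
    return rolled_back, at_risk
```

Under this model, non-persistent messages on a NORMAL channel all end up in the rolled-back list, which is why the NPMSPEED value in effect on the running channel matters so much for this question.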
|
JYama |
Posted: Mon Nov 05, 2007 4:15 pm Post subject: |
|
|
 Master
Joined: 27 Mar 2002 Posts: 281
|
PeterPotkay wrote: |
JYama,
You never answered the question about the NPMSPEED attribute of the CLUSRCVR channels on QM2. If it's set to FAST and the messages are non-persistent, then you are seeing expected behaviour.
If the messages are persistent -or- the message speed of the channel is set to NORMAL, then the 1st message down the channel that is no longer able to talk to QM2 should throw the channel into retry*. All future messages should get routed to QM1. Any uncommitted messages in the channel's batch, including the one that made the channel retry, should get rolled back and would be eligible to go to another QM in the cluster, unless those messages are specifically addressed to QM2.
* A Heartbeat attempt will also do this. |
What you're saying is exactly what I expected!
But that is NOT what happens; this is my problem.
BTW, regarding NPMSPEED, since I want msgs to be 're-routed', I changed the value to NORMAL. Also msgs are non-per.
I think what I've been discussing can be summarized in two questions:
1. Why were msgs routed to the 'invalid' route (or channel) NOT re-routed or rolled back?
2. Why did it take such a long time to recognize the invalid route, even though the first msg had been exchanged?
Any ideas?
Thank you for your kindness.  |
|
PeterPotkay |
Posted: Mon Nov 05, 2007 4:24 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
JYama wrote: |
BTW, regarding NPMSPEED, since I want msgs to be 're-routed', I changed the value to NORMAL. Also msgs are non-per.
|
You made this change before or after you tested? _________________ Peter Potkay
Keep Calm and MQ On |
|
JYama |
Posted: Mon Nov 05, 2007 4:29 pm Post subject: |
|
|
 Master
Joined: 27 Mar 2002 Posts: 281
|
PeterPotkay wrote: |
JYama wrote: |
BTW, regarding NPMSPEED, since I want msgs to be 're-routed', I changed the value to NORMAL. Also msgs are non-per.
|
You made this change before or after you tested? |
NPMSPEED=NORMAL is my initial setting, so this value was fixed BEFORE the test.
'Changed' means that I changed it on purpose from the default value 'FAST' to 'NORMAL'.
Sorry for the confusion. |
|
fjb_saper |
Posted: Mon Nov 05, 2007 8:48 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20757 Location: LI,NY
|
PeterPotkay wrote: |
FJ, those 2 parms you mention (retry interval and retry count) apply only to the RCVR side of a channel and only come into play when the RCVR-type MCA cannot put to the destination queue. It waits retry interval ms before trying to put the message again, and attempts this retry count times. It then puts the message to the DLQ, discards it, or stops the channel, depending on the scenario.
Those 2 parms don't factor into JYama's problem here, which is how fast the SNDR side will realize there is a problem. |
Thanks for setting me straight. I missed the fact that this was only valid for the cluster receiver, receiver and requester channel types  _________________ MQ & Broker admin |
|
PeterPotkay |
Posted: Tue Nov 06, 2007 8:42 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
JYama, based on all the info so far, I think you have a case for opening a ticket with IBM support. You shouldn't lose 4 messages. The channel should go into retrying mode as soon as the network layer reports back to the SNDR MCA that the connection is no longer valid. And if there are any uncommitted messages in that channel batch, they should get rolled back and be eligible to be put to another QM in the cluster. _________________ Peter Potkay
Keep Calm and MQ On |
|
Nigelg |
Posted: Tue Nov 06, 2007 12:54 pm Post subject: |
|
|
Grand Master
Joined: 02 Aug 2004 Posts: 1046
|
If the msgs are small it is possible to lose more than 1 msg, since the buffer will not actually be sent down the wire until it is full.
I disagree that a PMR is needed; msgs will not be 'lost' - more accurately, discarded by the system since the user did not specify that they were important enough to keep - if the channel attributes, particularly NPMSPEED, are properly set. Note that for cluster channels the attributes have to be set on the CLUSRCVR, and that changed attributes do not take effect until after a running channel is restarted. _________________ MQSeries.net helps those who help themselves.. |
|
PeterPotkay |
Posted: Tue Nov 06, 2007 1:59 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
Nigelg wrote: |
If the msgs are small it is possible to lose more than 1 msg, since the buffer will not actually be sent down the wire until it is full.
|
What buffer?
Nigelg wrote: |
I disagree that a PMR is needed; msgs will not be 'lost' - more accurately, discarded by the system since the user did not specify that they were important enough to keep - if the channel attributes, particularly NPMSPEED, are properly set. |
If the channel speed is Normal, and the messages didn't expire or get committed to the destination QM and there are 4 messages "missing", isn't that a problem?
Is waiting 30 seconds before the channel starts retrying unexpected? _________________ Peter Potkay
Keep Calm and MQ On |
|
JYama |
Posted: Tue Nov 06, 2007 4:46 pm Post subject: |
|
|
 Master
Joined: 27 Mar 2002 Posts: 281
|
Nigelg wrote: |
If the msgs are small it is possible to lose more than 1 msg, since the buffer will not actually be sent down the wire until it is full.
|
What are you talking about? msg length? batch size or something??
Could you elaborate on that, please?
My test msgs were about 2KB each, BTW.
Nigelg wrote: |
-msgs will not be 'lost' - |
Right, it is exactly the behavior that I expected.
The problem is I found multiple msgs lost in my environment.
Nigelg wrote: |
if the channel attributes, particularly NPMSPEED, are properly set. Note that for cluster channels the attributes have to be set on the CLUSRCVR, and that changed attributes do not take effect until after a running channel is restarted. |
Regarding NPMSPEED attribute and channel restart, NPMSPEED=NORMAL is an initial attribute of my cluster channels, so I've never changed it since I configured my MQ Cluster environment.
What is the point you want to emphasize? |
|
bruce2359 |
Posted: Tue Nov 06, 2007 7:00 pm Post subject: |
|
|
Guest
|
Quote: |
The problem is I found multiple msgs lost in my environment |
For NPMSPEED(FAST) and non-persistent messages that can't be delivered to the destination queue or the dlq (possible reasons: queue full, msg too big for queue, queue put-inhibited), you have directed the message channel agent to erase, eradicate, purge, destroy, delete, vaporize the message(s).
MQ does NOT lose messages.
MQ does NOT lose messages.
MQ does NOT lose messages.
Repeat as necessary. |
|
JYama |
Posted: Tue Nov 06, 2007 10:06 pm Post subject: |
|
|
 Master
Joined: 27 Mar 2002 Posts: 281
|
One possible situation in which a msg would be lost in my case is that the message had already arrived at QMgr2 just before QMgr2's node shut down. (In my test, I executed 'halt -q'.) In this case, I believe this non-persistent msg would be lost even if NPMSPEED=NORMAL.
What I can't understand is that the MQ cluster seemed to keep routing incoming msgs to QMgr2, which should not have been chosen as a valid route because the node hosting QMgr2 was not available.
Quote: |
APL (MQPUT/GET to AQ) --> QMgr0 AQ (alt queue, tgtQ=Q1)
                            +
                            +--- QMgr1(Q1)
                            +
                            +--- QMgr2(Q1) |
Last edited by JYama on Tue Nov 06, 2007 10:29 pm; edited 2 times in total |
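To make the expectation in this post concrete, here is a toy simulation of round-robin routing across the two instances of Q1. It is a sketch under loud assumptions, not the real cluster workload algorithm: the function `route_messages`, the status strings, and the idea of applying a status change "just before message i" are all hypothetical modelling choices.

```python
from itertools import cycle
from typing import Dict, Iterator, List


def route_messages(
    n_messages: int,
    channel_status: Dict[str, str],
    status_changes: Dict[int, Dict[str, str]],
) -> List[str]:
    """Round-robin n_messages across queue managers whose channel is RUNNING.

    status_changes maps a message index to status updates applied just before
    that message is routed (a stand-in for the sender noticing a failure).
    """
    status = dict(channel_status)
    order: Iterator[str] = cycle(sorted(status))
    routed: List[str] = []
    for i in range(n_messages):
        status.update(status_changes.get(i, {}))
        if "RUNNING" not in status.values():
            raise RuntimeError("no available destination")
        # Advance the rotation, skipping channels that are not RUNNING.
        qmgr = next(q for q in order if status[q] == "RUNNING")
        routed.append(qmgr)
    return routed


if __name__ == "__main__":
    start = {"QMgr1": "RUNNING", "QMgr2": "RUNNING"}
    # Assume the channel to QMgr2 is only seen as RETRYING from message 4 on.
    print(route_messages(6, start, {4: {"QMgr2": "RETRYING"}}))
    # → ['QMgr1', 'QMgr2', 'QMgr1', 'QMgr2', 'QMgr1', 'QMgr1']
```

In this model, every message routed before the status flips still goes to QMgr2 on its turn, so the length of the detection delay directly determines how many messages head toward the dead node.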
|
Nigelg |
Posted: Tue Nov 06, 2007 10:08 pm Post subject: |
|
|
Grand Master
Joined: 02 Aug 2004 Posts: 1046
|
Quote: |
What is the point you want to emphasize? |
WMQ does not lose msgs.
This is incompatible with your statements in this post. Easily the most likely resolution of this syllogism is that your statements are mistaken, and that the conditions in which the channel is running are not as you state. _________________ MQSeries.net helps those who help themselves.. |
|