Author |
Message
|
LouML |
Posted: Wed Jan 09, 2008 10:50 am Post subject: Determining when an INDOUBT Channel is a problem |
|
|
 Partisan
Joined: 10 Nov 2005 Posts: 305 Location: Jersey City, NJ / Bethpage, NY
|
We're using Omegamon to monitor our MQ environment. In our Channel Status workspace, Omegamon is basically taking a snapshot every 5 minutes. Some of our channels show as INDOUBT in one snapshot but not in the next. I know that this is not a problem because of how MQ works: the channel just happens to be in doubt at the moment the snapshot is taken.
My question is - what would be good criteria for determining when the indoubt state is a valid concern, versus the annoyingly benign alert?
Out of the box, the monitor just checks if the Queue Manager is active and the channel is indoubt. I'd like to add other checks to eliminate the false alerts. Would any of the following make sense?
1 - I thought about comparing CurSeqNum and LastCommittedSeqNum, but I see that when the channel is indoubt, these are off by one and when it is not indoubt, they are equal, so this does not sound like it will work.
2 - Then I thought about checking channel status. However, if I'm not mistaken, when a running channel goes indoubt, the channel status does not change from 'running' so this too won't work.
3 - Finally, I thought I could check the xmit queue depth - if there is a depth, then report the problem.
Any other possibilities? |
|
|
 |
jefflowrey |
Posted: Wed Jan 09, 2008 10:55 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
I don't normally expect to see channels indoubt without errors being thrown...
Are you short of log space on one side or another? _________________ I am *not* the model of the modern major general. |
|
|
 |
LouML |
Posted: Wed Jan 09, 2008 11:08 am Post subject: |
|
|
 Partisan
Joined: 10 Nov 2005 Posts: 305 Location: Jersey City, NJ / Bethpage, NY
|
jefflowrey wrote: |
I don't normally expect to see channels indoubt without errors being thrown...
Are you short of log space on one side or another? |
Not that I'm aware of. I did a search on MQSeries.net and saw this thread:
http://www.mqseries.net/phpBB2/viewtopic.php?p=159310&sid=15a2786b0dc8c19be34929226dfab715
and a quote from Nigelg:
Nigelg wrote: |
Of course, every channel is indoubt for a short time after each batch, until the confirm reply is received, but usually this lasts only a few milliseconds at most. |
|
|
|
 |
jefflowrey |
Posted: Wed Jan 09, 2008 11:12 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
Hrm, interesting.
My new thing to learn for the day. _________________ I am *not* the model of the modern major general. |
|
|
 |
PeterPotkay |
Posted: Wed Jan 09, 2008 11:32 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
Check for an In Doubt condition on 2 (or more) consecutive checks by your tool before alerting. The odds of a healthy channel being in doubt 2 times in a row are very small if the "healthy" in doubts only last a few milliseconds.
I've probably issued 100s of Channel Status * commands in MO71 and never saw a "bogus" in doubt. I think plain old Channel Status is what you should be checking. (#2 in your suggestions)
We don't monitor channel statuses by the way, for reasons you've discovered. Happy channels cycle through various states as they do their thing. We monitor XMITQ depth along with dequeue count. As long as the XMITQ is empty or the # of messages leaving the XMITQ is > x, things are fine. We added a delay requiring the alert condition to be true for n minutes before we get paged. This gives MQ time to recover on its own from transitory problems. _________________ Peter Potkay
Keep Calm and MQ On |
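For anyone wanting to prototype the two rules above outside the monitoring tool, here is a minimal Python sketch. It is not Omegamon configuration. `Snapshot`, `ConsecutiveAlert`, `xmitq_unhealthy` and the `min_dequeues` threshold are hypothetical names made up for illustration, and how each sample is collected from the queue manager is assumed and left out.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One monitoring sample for a sender channel (hypothetical field names)."""
    indoubt: bool        # channel reported INDOUBT in this sample
    xmitq_depth: int     # current depth of the transmission queue
    dequeue_count: int   # messages that left the XMITQ since the previous sample


class ConsecutiveAlert:
    """Fires only once a condition has held for `required` samples in a row."""

    def __init__(self, required: int = 2):
        self.required = required
        self.streak = 0

    def update(self, condition: bool) -> bool:
        # A single healthy sample resets the streak, so a transient
        # in-doubt caught by one snapshot never alerts on its own.
        self.streak = self.streak + 1 if condition else 0
        return self.streak >= self.required


def xmitq_unhealthy(sample: Snapshot, min_dequeues: int) -> bool:
    # Healthy when the XMITQ is empty or more than `min_dequeues` messages
    # left it during the interval; anything else counts as a bad sample.
    return sample.xmitq_depth > 0 and sample.dequeue_count <= min_dequeues
```

Feeding `ConsecutiveAlert(required=2)` the in-doubt flags `True, False, True, True` from four successive snapshots gives `False, False, False, True`: only the second consecutive in-doubt sample raises the alert. The same class can wrap `xmitq_unhealthy` to get the "true for n minutes" delay, with `required` set to n divided by the sampling interval.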
|
|
 |
jefflowrey |
Posted: Wed Jan 09, 2008 11:39 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
It can be useful to know if a channel is RETRYING or STOPPED. _________________ I am *not* the model of the modern major general. |
|
|
 |
LouML |
Posted: Tue Jan 15, 2008 6:00 am Post subject: Re: Determining when an INDOUBT Channel is a problem |
|
|
 Partisan
Joined: 10 Nov 2005 Posts: 305 Location: Jersey City, NJ / Bethpage, NY
|
LouML wrote: |
2 - Then I thought about checking channel status. However, if I'm not mistaken, when a running channel goes indoubt, the channel status does not change from 'running' so this too won't work.
|
Just to follow up on this point -
Is the Channel Status of an Indoubt channel always Running?
Or is it only Running in the scenario mentioned in my 1st post?
I would assume that the Status of a Channel that is 'truly' Indoubt would be Stopped, Retrying, or some state other than Running. If this is the case, then I can just check Channel Status not equal to Running. |
|
|
 |
jefflowrey |
Posted: Tue Jan 15, 2008 6:09 am Post subject: Re: Determining when an INDOUBT Channel is a problem |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
LouML wrote: |
I can just check Channel Status not equal to Running. |
No. That will alert you way too much. _________________ I am *not* the model of the modern major general. |
|
|
 |
LouML |
Posted: Tue Jan 15, 2008 6:31 am Post subject: Re: Determining when an INDOUBT Channel is a problem |
|
|
 Partisan
Joined: 10 Nov 2005 Posts: 305 Location: Jersey City, NJ / Bethpage, NY
|
jefflowrey wrote: |
LouML wrote: |
I can just check Channel Status not equal to Running. |
No. That will alert you way too much. |
I meant - Channel Status not equal to Running AND Indoubt equal to Yes (not just Channel Status alone) |
|
|
 |
SAFraser |
Posted: Tue Jan 15, 2008 7:31 am Post subject: |
|
|
 Shaman
Joined: 22 Oct 2003 Posts: 742 Location: Austin, Texas, USA
|
I do not alert on INACTIVE senders, as I want a sender to go to this state when the DISCINT expires. I do not alert on STOPPING or STOPPED, as such a state is nearly always caused by me (or a cron job). I do not alert on STARTING, just because I've never had a channel hang in STARTING, but others might have different experience.
I do alert on RETRYING, PAUSED, INITIALIZING and BINDING, as these are states that a channel should either not be in, or not be in for more than a few seconds. It is really spiffy if you can set your alert to check for INITIALIZING and BINDING twice before your alert fires (if you can afford the time lag) as healthy channels do cycle through these when starting.
I do not alert on INDOUBT as a healthy channel can be in that state. If the INDOUBT status does not resolve itself, the channel will go RETRYING so I just rely on that as the trigger for an alert.
As Peter mentioned, you can alert on xmitq conditions instead; this works well if the dequeue rate is predictable, or if the curdepth is predictable. For alerts to be really sophisticated, a baseline understanding of how the environment normally operates is needed.
Shirley |
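The status policy described above could be sketched roughly as follows in Python. The status names are the ones listed in the post; the two sets and the `should_alert` function are hypothetical, and the sketch assumes the monitor can hand over the status seen on the previous check as well as the current one.

```python
from typing import Optional

# Alert as soon as the state is seen.
ALERT_IMMEDIATELY = {"RETRYING", "PAUSED"}
# Healthy channels pass through these while starting, so require two
# consecutive checks in the same state before alerting.
ALERT_IF_REPEATED = {"INITIALIZING", "BINDING"}


def should_alert(previous: Optional[str], current: str) -> bool:
    """Decide whether a channel status sample warrants an alert."""
    current = current.upper()
    if current in ALERT_IMMEDIATELY:
        return True
    if current in ALERT_IF_REPEATED:
        return previous is not None and previous.upper() == current
    # INACTIVE, STOPPING, STOPPED, STARTING and RUNNING (in doubt or not)
    # are deliberately ignored; an unresolved in-doubt is expected to show
    # up later as RETRYING, which is caught above.
    return False
```

With this shape, a channel seen in BINDING right after RUNNING does not alert, while BINDING on two checks in a row does. Sites that do see channels hang in STARTING would presumably want to move that state into the repeated set.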
|
|
 |
|