Channel Triggering |
Author |
Message
|
swatkats |
Posted: Fri Jan 31, 2014 3:48 am Post subject: Channel Triggering |
|
|
 Novice
Joined: 22 May 2010 Posts: 22
|
I would like some suggestions about a channel trigger that failed to fire, just once.
The setup had been working for years. When the problem (an xmitq pile-up) was reported, all I did was start the channel manually to clear it, then look through the logs to find out why the channel had not started automatically. Unfortunately there wasn't much in the qmgr logs.
All the log showed, after I started the SDR channel manually, was:
1) the channel starting message, AMQ9002
2) about 10 seconds later, AMQ9514 (channel in use)
3) immediately after, AMQ9999 (channel program ended abnormally)
4) after the disconnect interval (10 minutes), an entry indicating that the channel ended normally.
Since this incident, the same triggering has been working normally (as I said, the channel trigger settings are correct).
No FDCs were found on the day of the issue.
Suggestions please on where I should focus to find out how it could have gone wrong only once.
MQ Version: V7.0.1.7
We have SVRCONN channels
Application: Java |
|
exerk |
Posted: Fri Jan 31, 2014 4:38 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
What problems, if any, were indicated at the receiving end? _________________ It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys. |
|
swatkats |
Posted: Fri Jan 31, 2014 5:15 am Post subject: No Error |
|
|
 Novice
Joined: 22 May 2010 Posts: 22
|
Only the normal channel start and end messages were logged; nothing else at the receiving end. |
|
PaulClarke |
Posted: Fri Jan 31, 2014 3:16 pm Post subject: |
|
|
 Grand Master
Joined: 17 Nov 2005 Posts: 1002 Location: New Zealand
|
Was your system heavily loaded at the time ? Is it possible that you just didn't have enough resource to start another process or thread ? What are you using by the way - processes or threads ?
P. _________________ Paul Clarke
MQGem Software
www.mqgem.com |
|
bruce2359 |
Posted: Fri Jan 31, 2014 4:30 pm Post subject: Re: Channel Triggering |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
swatkats wrote: |
... all i did when the problem(xmitq pileup) was reported was to start the channel to fix the problem first and find something in logs to check why the channel didn't start automatically. unfortunately there wasn't much in the qmgr logs.
...
We have svrconn channel's
Application : Java
|
Are you saying that the problem was with a Java application? Java apps connect to SVRCONN channels. SVRCONN channels start because an inbound flow comes from a client. Are you saying that you issued a START CHANNEL command to a SVRCONN channel?
Please copy the errors from the error log, and paste them here. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
Last edited by bruce2359 on Fri Jan 31, 2014 8:35 pm; edited 2 times in total |
|
swatkats |
Posted: Fri Jan 31, 2014 7:05 pm Post subject: |
|
|
 Novice
Joined: 22 May 2010 Posts: 22
|
Quote: |
Was your system heavily loaded at the time ? Is it possible that you just didn't have enough resource to start another process or thread ? What are you using by the way - processes or threads ?
|
1) No, other channels were starting normally (as I can see in the logs).
2) Channels start as processes.
Quote: |
Are you saying that the problem was with a Java application? Java apps connect to SVRCONN channels. SVRCONN channels start because an inbound flow comes from a client. Are you saying that you issued a START CHANNEL command to a SVRCONN channel?
Please copy the errors from the error log, and paste them here. |
1) No, the SVRCONN channels were mentioned for information only; our focus is on why the sender channel did not trigger properly.
2) As per the error logs, I assume that at 16:46:16, soon after I issued the manual start command, the channel started and the messages were sent. About 10 seconds later, I am not sure why, the channel tried to start again (maybe the triggering happened then?!), returned "channel in use", and ended abnormally.
10 minutes later, after the disconnect interval, the channel went down normally.
29/01/14 16:46:16 - Process(27423.5929) User(mqm) Program(amqrmppa)
Host(XXXXX)
AMQ9002: Channel 'QM1.QM2.CHL' is starting.
EXPLANATION:
Channel 'QM1.QM2.CHL' is starting.
ACTION:
None.
-------------------------------------------------------------------------------
29/01/14 16:46:27 - Process(3697.1) User(mqm) Program(runmqchl)
Host(XXXXX)
AMQ9514: Channel 'QM1.QM2.CHL' is in use.
EXPLANATION:
The requested operation failed because channel 'QM1.QM2.CHL' is
currently active.
ACTION:
Either end the channel manually, or wait for it to close, and retry the
operation.
----- amqrcsia.c : 1029 -------------------------------------------------------
29/01/14 16:46:27 - Process(3697.1) User(mqm) Program(runmqchl)
Host(XXXXX)
AMQ9999: Channel program ended abnormally.
EXPLANATION:
Channel program 'QM1.QM2.CHL' ended abnormally.
ACTION:
Look at previous error messages for channel program 'QM1.QM2.CHL' in
the error files to determine the cause of the failure.
29/01/14 16:56:07 - Process(3696.1) User(mqm) Program(runmqchl)
Host(XXXXX)
AMQ9545: Disconnect interval expired.
EXPLANATION:
Channel 'QM1.QM2.CHL' closed because no messages arrived on the
transmission queue within the disconnect interval period.
ACTION:
None. |
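To make the timing of the entries above easier to read, here is a minimal Python sketch that reduces error-log text to a timeline of message IDs. It assumes the entry header layout seen in this post (`dd/mm/yy hh:mm:ss - Process(...) User(...) Program(...)`); the function names `parse_entries` and `timeline` are made up for illustration and are not part of any MQ tooling. The sample text is abbreviated from the log pasted above.

```python
import re
from datetime import datetime

# Header line of an error-log entry, as it appears in the post above, e.g.
# "29/01/14 16:46:16 - Process(27423.5929) User(mqm) Program(amqrmppa)"
HEADER = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) - Process\(([\d.]+)\) "
    r"User\((\w+)\) Program\((\w+)\)"
)
MSG_ID = re.compile(r"^(AMQ\d{4}):")


def parse_entries(text):
    """Return a list of (timestamp, program, message_id) tuples."""
    entries = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            stamp = datetime.strptime(header.group(1), "%d/%m/%y %H:%M:%S")
            current = [stamp, header.group(4), None]
            entries.append(current)
            continue
        msg = MSG_ID.match(line)
        # Only the first AMQnnnn line after a header is the entry's message ID.
        if msg and current is not None and current[2] is None:
            current[2] = msg.group(1)
    return [tuple(entry) for entry in entries]


def timeline(entries):
    """Return (message_id, program, seconds since the first entry)."""
    start = entries[0][0]
    return [
        (msg_id, program, int((stamp - start).total_seconds()))
        for stamp, program, msg_id in entries
    ]


# Abbreviated copy of the entries posted above (headers and message lines only).
SAMPLE = """\
29/01/14 16:46:16 - Process(27423.5929) User(mqm) Program(amqrmppa)
AMQ9002: Channel 'QM1.QM2.CHL' is starting.
29/01/14 16:46:27 - Process(3697.1) User(mqm) Program(runmqchl)
AMQ9514: Channel 'QM1.QM2.CHL' is in use.
29/01/14 16:46:27 - Process(3697.1) User(mqm) Program(runmqchl)
AMQ9999: Channel program ended abnormally.
29/01/14 16:56:07 - Process(3696.1) User(mqm) Program(runmqchl)
AMQ9545: Disconnect interval expired.
"""

if __name__ == "__main__":
    for msg_id, program, offset in timeline(parse_entries(SAMPLE)):
        print(f"+{offset:4d}s  {msg_id}  {program}")
```

Run against a full AMQERR log, the same functions would show at a glance how far apart the start, the "in use" failure, and the disconnect really were.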
|
bruce2359 |
Posted: Fri Jan 31, 2014 8:46 pm Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
swatkats wrote: |
2)As per error logs, i assume, at 16.46.16, soon after i issued manual start cmd, channel started, messages were sent. |
Yes.
swatkats wrote: |
...after 10 sec's not sure why channel again tried to start(may be triggering happened now?!) then returned channel in use and ended abnormally. |
No. Runmqchl is you (or automation) issuing the START CHANNEL command - to a channel that is in use. It is runmqchl's attempt to start a channel in use that failed - not the channel itself.
swatkats wrote: |
10 mins later after disconnect interval the channel went down normally. |
The disconnect interval expired. |
|
swatkats |
Posted: Sat Feb 01, 2014 12:43 am Post subject: |
|
|
 Novice
Joined: 22 May 2010 Posts: 22
|
Yes, correct. But my concern is to understand why triggering didn't happen. I can assure you the trigger settings are correct. It's odd, and there must be some unusual reason why triggering didn't happen. |
|
PaulClarke |
Posted: Sat Feb 01, 2014 12:45 am Post subject: |
|
|
 Grand Master
Joined: 17 Nov 2005 Posts: 1002 Location: New Zealand
|
I suspect you've reached the point where you should raise a PMR. MQ Service are the best people to know whether there is a bug in your version of MQ. There could be a timing window that we are unaware of.
Cheers,
P. |
|
swatkats |
Posted: Sat Feb 01, 2014 1:41 am Post subject: |
|
|
 Novice
Joined: 22 May 2010 Posts: 22
|
Nice, thank you. That is the last and best option we have. |
|
bruce2359 |
Posted: Sat Feb 01, 2014 7:24 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
A one-time event will likely be difficult to diagnose, but...
Go back 10 minutes (your disconnect interval) earlier in the error logs. Any errors?
Check the DLQ of the SENDER qmgr. If triggering fails, the trigger event message will be MQPUT into the dead-letter queue on the SENDER channel qmgr.
What is the trigger-interval value of the qmgr?
Did you look ONLY at the error logs of this qmgr? Also check the error logs held one level above the qmgrs directory in the filesystem for related errors. You should check both sets of error logs when diagnosing a problem. |
|
swatkats |
Posted: Sat Feb 01, 2014 10:06 pm Post subject: |
|
|
 Novice
Joined: 22 May 2010 Posts: 22
|
Quote: |
A one-time event will likely be difficult to diagnose, but... |
Yes
Quote: |
Go back 10 minutes (your disconnect interval) earlier in the error logs. Any errors? |
As I mentioned, only the response to the manual start was logged (channel starting; about 10 seconds later, the channel in use error, followed by the channel ending abnormally; by the time all this happened the messages had been sent, and the channel then ended normally after the disconnect interval).
Quote: |
Check the DLQ of the SENDER qmgr. If triggering fails, the trigger event message will be MQPUT into the dead-letter queue on the SENDER channel qmgr. |
Thank you, I have now checked this and there is no trigger message in the DLQ.
Quote: |
What is the trigger-interval value of the qmgr? |
The default value, TRIGINT(999999999), so I believe no other trigger message would have been put after the first message (TRIGTYPE is FIRST).
Quote: |
Did you look ONLY at the error logs of this qmgr? Check the error logs above qmgrs in the filesystem for related errors. You should check both sets of error logs when diagnosing a problem. |
You mean, check for FDCs? Yes, I have checked; nothing there, and there is nothing in the error logs either (the last update there was 20 days ago).
Is there anything to suspect in the application committing the messages? Even in that case, the messages were sitting on the queue (normally) until the manual start, which worked fine (the messages were sent, I mean). So I couldn't narrow anything down here.
As a last shot, I checked with the network team whether there was any error at the network level. Again, in that case I would expect some errors in the queue manager logs; however, as I said, there is nothing in the qmgr logs. |
|
bruce2359 |
Posted: Sun Feb 02, 2014 6:45 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
swatkats wrote: |
You mean, check for FDC's? yes i have cheked, nothing there and in the error logs too there is nothing (last update there was 20 days before till date). |
No. There are two folders/directories in the filesystem that hold error log files - one under your qmgr, and one under websphere mq. The one under your qmgr captures errors detected by your qmgr. The errors directory at the websphere-mq level captures other errors detected by WMQ-level components. I have seen network errors logged here, and not in the qmgr-level error logs.
Last edited by bruce2359 on Sun Feb 02, 2014 6:51 am; edited 2 times in total |
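As a rough illustration of checking both sets of error logs in one pass, here is a small Python sketch. The paths `/var/mqm/errors` and `/var/mqm/qmgrs/<qmgr>/errors` are assumed default locations on a UNIX-like install and may differ on your system (the directory name can also differ from the queue manager name if the name contains special characters). The helper names `error_log_files` and `lines_mentioning`, and the queue manager name `QM1` in the usage lines, are hypothetical.

```python
from pathlib import Path

# Assumed default locations on a UNIX-like system; adjust for your install.
SYSTEM_ERRORS = Path("/var/mqm/errors")
QMGR_ERRORS_TEMPLATE = "/var/mqm/qmgrs/{qmgr}/errors"


def error_log_files(qmgr, system_dir=SYSTEM_ERRORS, qmgr_dir=None):
    """List AMQERR*.LOG files from the system-level and qmgr-level directories."""
    if qmgr_dir is None:
        qmgr_dir = QMGR_ERRORS_TEMPLATE.format(qmgr=qmgr)
    files = []
    for directory in (Path(system_dir), Path(qmgr_dir)):
        if directory.is_dir():
            files.extend(sorted(directory.glob("AMQERR*.LOG")))
    return files


def lines_mentioning(files, needle):
    """Return (path, line_number, line) for every line containing needle."""
    hits = []
    for path in files:
        with open(path, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line:
                    hits.append((path, number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # "QM1" is a guess at the sender qmgr name, taken from the channel name.
    for path, number, line in lines_mentioning(error_log_files("QM1"), "QM1.QM2.CHL"):
        print(f"{path}:{number}: {line}")
```

Searching for the channel name is only one choice; a host name or an AMQ message ID would work the same way.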
|
PeterPotkay |
Posted: Sun Feb 02, 2014 6:48 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
swatkats wrote: |
anything to suspect from application commiting the messages?, even in that case, the messages were lying(normally) until the manual start worked fine(messages sent i mean). so i couldnt narrow anything down here.
|
If an application puts an app message to a triggered on first queue and that message is not committed, but that app message otherwise satisfies trigger conditions, a trigger message is produced in the initiation queue. BUT that trigger message is held back uncommitted as well. Triggering does not yet occur.
If and when the application commits the app message that caused the trigger conditions to be satisfied, only then will the trigger message also be committed at which point the trigger monitor can finally get it and trigger.
If the application backs its app message out, the original trigger message is still committed and triggering will still occur. This was done by IBM to allow for additional app messages that may have been placed and committed on the queue after that first message caused a trigger condition but before the first app message decided if it was going to commit or rollback.
Hypothesis:
Is it possible that the XMITQ was empty, the channel was not running, and you had an app put a message that resolved to the XMITQ, but under syncpoint? No trigger yet. It didn't commit or roll back, so the trigger message was being held back. Channel not triggering, trigger message in limbo, additional app messages piling up in the XMITQ.
You then come along and manually start the channel with no issues. The channel starts processing all the app messages that are committed (not that first one, which is still in syncpoint, along with the trigger message!).
Finally the app rolls back or commits that first message, and the trigger message is committed. The trigger monitor goes to start the channel, finds it already running because you started it, and you get the channel in use error. And finally DISCINT kicks in to end the channel normally after the disconnect interval passes with no activity.
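The sequence in this hypothesis can be replayed with a toy model. The Python sketch below is not MQ code and uses no MQ API; `ToyQueueManager` and all of its methods are invented for illustration. It encodes only the rules described in this post, plus two modelling assumptions: uncommitted puts count towards queue depth, and no trigger message is generated while the channel already has the queue open.

```python
class ToyQueueManager:
    """Toy model of trigger-on-first with syncpoint; not real MQ behaviour."""

    def __init__(self):
        self.committed = []         # messages a running channel could send
        self.uncommitted = {}       # unit-of-work id -> messages put under syncpoint
        self.held_triggers = set()  # units of work holding back a trigger message
        self.initq = []             # committed trigger messages
        self.channel_running = False
        self.sent = []
        self.events = []

    def _depth(self):
        # Modelling assumption: uncommitted puts count towards queue depth.
        return len(self.committed) + sum(len(m) for m in self.uncommitted.values())

    def _drain(self):
        # A running channel sends every committed message.
        if self.channel_running:
            self.sent.extend(self.committed)
            self.committed.clear()

    def put(self, message, uow=None):
        # Trigger on first: depth goes from 0 to 1 while nothing serves the queue.
        first = self._depth() == 0 and not self.channel_running
        if uow is None:
            self.committed.append(message)
            if first:
                self.initq.append("TRIGGER")
            self._drain()
        else:
            self.uncommitted.setdefault(uow, []).append(message)
            if first:
                self.held_triggers.add(uow)  # held back until commit or backout

    def _release_trigger(self, uow):
        if uow in self.held_triggers:
            self.held_triggers.discard(uow)
            self.initq.append("TRIGGER")

    def commit(self, uow):
        self.committed.extend(self.uncommitted.pop(uow, []))
        self._release_trigger(uow)
        self._drain()

    def backout(self, uow):
        self.uncommitted.pop(uow, None)
        self._release_trigger(uow)  # per the post, the trigger is still committed

    def start_channel(self, who):
        if self.channel_running:
            self.events.append(f"{who}: channel in use")
            return False
        self.channel_running = True
        self.events.append(f"{who}: channel started")
        self._drain()
        return True

    def run_initiator(self):
        # Stand-in for the trigger monitor consuming the initiation queue.
        while self.initq:
            self.initq.pop(0)
            self.start_channel("initiator")


def replay():
    qm = ToyQueueManager()
    qm.put("m1", uow="A")         # first message, under syncpoint: trigger held back
    qm.put("m2")                  # committed, but depth was already 1: no new trigger
    qm.put("m3")
    qm.run_initiator()            # nothing on the initiation queue yet
    qm.start_channel("operator")  # manual start; m2 and m3 flow
    qm.commit("A")                # releases m1 and the held trigger message
    qm.run_initiator()            # too late: the channel is already running
    return qm


if __name__ == "__main__":
    for event in replay().events:
        print(event)
```

In this model the only two events are the operator's successful start and the initiator's late "channel in use", which matches the shape of the log entries posted earlier; whether real channel triggering behaves this way is the open question.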
Hopefully someone (Paul Clarke) with insight into the inner workings of channels will come along soon to comment. I know this is how triggering on first works for application triggering. If that's how it works for channels too, seems like we would see this type of thing more often. _________________ Peter Potkay
Keep Calm and MQ On |
|
PaulClarke |
Posted: Sun Feb 02, 2014 10:05 am Post subject: |
|
|
 Grand Master
Joined: 17 Nov 2005 Posts: 1002 Location: New Zealand
|
Peter,
The triggering mechanism used for transmission queues is almost identical to that of any other local queue. The only concession, as we've heard before, is the lack of a need for a process object for triggering to occur.
Your theory that the message that caused the trigger message is part of a transaction which has neither committed nor backed out is a fair one. This could certainly have caused the symptoms we are seeing I believe. I can't answer why we don't see it more often, I guess because most applications, by the time they reach production anyway, are well behaved and don't hold on to transactions for inordinate amounts of time.
Cheers,
Paul. |
|