Author |
Message
|
somsayan |
Posted: Thu Oct 24, 2013 2:21 am Post subject: Intermittent delay in message processing |
|
|
Apprentice
Joined: 06 Apr 2007 Posts: 49
|
Hi,
We are facing a issue that is occuring intermittently. Our message flows start with an MQInput node and does computation in 2 compute nodes and then publishes the message.
The 2 different compute nodes is required as there is multiple database operations that needs to be done( name value lookup, stored procedure calls, database inserts etc..) on different databases that are not compatible.
The message flow is triggered when another flow(picking events from Event table) publishes the messages.
When we run the flow most of the times, the datas are updated right away, but sometimes there is a delay of more than 30 mins for the first message and then the messages are updating the end systems right away.
Now we are clueless from where to start the investigation . Can anyone please help. |
|
Back to top |
|
 |
Esa |
Posted: Thu Oct 24, 2013 2:55 am Post subject: |
|
|
 Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
|
What do you see if you run user trace? |
|
Back to top |
|
 |
McueMart |
Posted: Thu Oct 24, 2013 2:55 am Post subject: |
|
|
 Chevalier
Joined: 29 Nov 2011 Posts: 490 Location: UK...somewhere
|
Can you try enabling user trace? That should clearly show you what is happening.
(Seems like Esa beat me to it! Great mind and all that....) |
|
Back to top |
|
 |
somsayan |
Posted: Thu Oct 24, 2013 6:05 am Post subject: |
|
|
Apprentice
Joined: 06 Apr 2007 Posts: 49
|
Esa wrote: |
What do you see if you run user trace? |
I'm not sure I'd be able to run a trace as this problem is not a regular issue. The next instance of such a issue might be after running the flow for 1000 times without the issue happening. |
|
Back to top |
|
 |
kimbert |
Posted: Thu Oct 24, 2013 6:53 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
somsayan said:
Quote: |
The message flow is triggered when another flow(picking events from Event table) publishes the messages. |
Quote: |
sometimes there is a delay of more than 30 mins for the first message |
Maybe you could write a flow or script or program that
- detects when the flow is triggered
- waits 5 seconds
- if the first message has not yet completed, switches on user trace. _________________ Before you criticize someone, walk a mile in their shoes. That way you're a mile away, and you have their shoes too. |
|
Back to top |
|
 |
NealM |
Posted: Thu Oct 24, 2013 7:59 am Post subject: |
|
|
 Master
Joined: 22 Feb 2011 Posts: 230 Location: NC or Utah (depends)
|
Kimbert, it seems that would be like trying to figure out how the horses got out after you spotted them in the neighbor's pasture but the fence isn't down (a real life situation that I'm unfortunately familiar with). If the flow isn't very high volume, and nothing else in the EG is being user traced, and since the user trace files are circular, why not just set user trace to "Normal" for the flow and wait to capture the trace just after the situation occurs? The "Normal" trace setting will give a timestamp of when each Node in the flow is being exited, so samsayan would at least then know which Node is having an issue, if any, so he could start on a path to finding the cause.
The "if any" above is because we periodically see a similar issue at my client that has never been fully explained, but we at least have proven that it is not a Broker delay issue, rather it is an MQ issue, with the remote qmgr running on Z/VSE being the culprit.
Last edited by NealM on Fri Oct 25, 2013 7:15 am; edited 1 time in total |
|
Back to top |
|
 |
mqjeff |
Posted: Thu Oct 24, 2013 10:40 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
based on the symptoms, my first guess would be that the database is undergoing maintenance when the broker flow runs slow. |
|
Back to top |
|
 |
dogorsy |
Posted: Thu Oct 24, 2013 9:49 pm Post subject: |
|
|
Knight
Joined: 13 Mar 2013 Posts: 553 Location: Home Office
|
mqjeff wrote: |
based on the symptoms, my first guess would be that the database is undergoing maintenance when the broker flow runs slow. |
That would not delay the processing for 30 minutes, would it ? if the database is offline or locked, you'd get an error, or a timeout. In any case there would be an error message is the system log/event log |
|
Back to top |
|
 |
Esa |
Posted: Thu Oct 24, 2013 10:58 pm Post subject: |
|
|
 Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
|
@somsayan:
How do you detect the delay? From your message audit logging?
Timestamps in the source and target databases?
The delay could happen in
- updating the event table
- the triggering flow
- publish/subscribe
- the second flow
Have you figured out exactly where it happens? |
|
Back to top |
|
 |
somsayan |
Posted: Thu Oct 24, 2013 11:28 pm Post subject: |
|
|
Apprentice
Joined: 06 Apr 2007 Posts: 49
|
Esa wrote: |
@somsayan:
How do you detect the delay? From your message audit logging?
Timestamps in the source and target databases?
The delay could happen in
- updating the event table
- the triggering flow
- publish/subscribe
- the second flow
Have you figured out exactly where it happens? |
Esa, I can see the delay from the combination of the queue properties, current depth, uncommited message, message age.
We are currently trying to replicate the same issue with trace enabled on the flow. |
|
Back to top |
|
 |
Esa |
Posted: Fri Oct 25, 2013 12:18 am Post subject: |
|
|
 Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
|
somsayan wrote: |
Esa, I can see the delay from the combination of the queue properties, current depth, uncommited message, message age.
|
So you detect the delay while it's still happening?
Have you tried if you can stop the flow while it's delaying?
If you can, you can check the backout count of the first message in the queue. That could tell you something.
Does the MQInput node have catch or failure terminals connected? |
|
Back to top |
|
 |
fjb_saper |
Posted: Fri Oct 25, 2013 4:50 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
How do you know the delay is at the flow level and not before the message ever hits the flow?  _________________ MQ & Broker admin |
|
Back to top |
|
 |
NealM |
Posted: Fri Oct 25, 2013 7:12 am Post subject: |
|
|
 Master
Joined: 22 Feb 2011 Posts: 230 Location: NC or Utah (depends)
|
Quote: |
but sometimes there is a delay of more than 30 mins for the first message and then the messages are updating the end systems right away. |
Samsayan, so your flow is triggered by MQ messages that come in batches? Or the MQ messages just periodically pile up on the input queue behind a current message (not necessarily the first in a batch) that for some reason is hanging for a while in the flow? |
|
Back to top |
|
 |
somsayan |
Posted: Fri Oct 25, 2013 7:35 am Post subject: |
|
|
Apprentice
Joined: 06 Apr 2007 Posts: 49
|
fjb_saper wrote: |
How do you know the delay is at the flow level and not before the message ever hits the flow?  |
@fjb_saper
IMHO the uncommitted message count say the flow is processing the message. So with no new messages and queue depth fixed, we can be sure that the flow is procesing the same message. Am I wrong here ?
NealM wrote: |
Samsayan, so your flow is triggered by MQ messages that come in batches? Or the MQ messages just periodically pile up on the input queue behind a current message (not necessarily the first in a batch) that for some reason is hanging for a while in the flow? |
The flow is triggered by Database events which are populated in batches. We have a flow that only monitors the event tables and publishes the events to MQ.
Here is a snapshot of the flow stats:
Average flows :
Code: |
ProcessID='6619188', Key='44', Label=''MapToNotifyProductRetailEvent_FromPMMToCanonicalMapping'', Type=''ComputeNode'', TotalElapsedTime='2671753', MaximumElapsedTime='2671753', MinimumElapsedTime='2671753', TotalCPUTime='146411', MaximumCPUTime='146411', MinimumCPUTime='146411', CountOfInvocations='1', NumberOfInputTerminals='1', NumberOfOutputTerminals='6'. |
Little poor performance flow :
Code: |
ProcessID='6619188', Key='14', Label=''MapToNotifyProductRetailEvent_FromPMMToCanonicalMapping'', Type=''ComputeNode'', TotalElapsedTime='100435059', MaximumElapsedTime='100435059', MinimumElapsedTime='100435059', TotalCPUTime='85927', MaximumCPUTime='85927', MinimumCPUTime='85927', CountOfInvocations='1', NumberOfInputTerminals='1', NumberOfOutputTerminals='6'. |
|
|
Back to top |
|
 |
fjb_saper |
Posted: Fri Oct 25, 2013 11:49 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
So what is your DB deadlock timeout? :innnocent: _________________ MQ & Broker admin |
|
Back to top |
|
 |
|