MQSeries.net :: View topic - msgs getting stuck in IBMi queue after processing few msgs

MQSeries.net

Tech Exchange

Education

Certifications

Library

Info Center

SupportPacs

FAQÂ Â

Usergroups

RSS Feed - WebSphere MQ Support

RSS Feed - Message Broker Support

MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » msgs getting stuck in IBMi queue after processing few msgs

Goto page Previous 1, 2

msgs getting stuck in IBMi queue after processing few msgs

« View previous topic :: View next topic »

Author

Message

Vitor

Posted: Wed Aug 17, 2016 5:51 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

vishBroker wrote:

The message does not even enter debug - prior messages enter debug and get processed.

User trace. The debugger will affect your processing.

What is the channel status of the client connection when the messages get "stuck"?
_________________
Honesty is the best policy.
Insanity is the best defence.

mqjeff

Posted: Wed Aug 17, 2016 6:02 am Post subject:

Grand Master

Joined: 25 Jun 2008
Posts: 17447

A message that is "stuck" is not committed.
_________________
chmod -R ugo-wx /

vishBroker

Posted: Wed Aug 17, 2016 6:17 am Post subject:

Centurion

Joined: 08 Dec 2010
Posts: 135

Quote:

A message that is "stuck" is not committed.

The PUT is commited, seems so - as another application (RFHUtilc) can read the 'stuck'ed message.

smdavies99

Posted: Wed Aug 17, 2016 6:20 am Post subject:

Jedi Council

Joined: 10 Feb 2003
Posts: 6076
Location: Somewhere over the Rainbow this side of Never-never land.

Vitor wrote:

User trace.

This is clearly the case where debugger is not the right tool for the job.
Using usertrace will let you trace the path of a message through the flow in a timely manner and without waiting for the blob of mostly water that is trying to operate the debugger in real time.
If the 'stuck' message never gets into the flow then it is just not ready to be read.
Normally, this means that it is uncomitted.
_________________
WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995

Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions.

vishBroker

Posted: Wed Aug 17, 2016 7:17 am Post subject:

Centurion

Joined: 08 Dec 2010
Posts: 135

Here are excerpts from userTrace.
++++
2016-08-17 10:26:30.750740 22711 UserTrace BIP11506I: Committed a local transaction.
2016-08-17 10:26:30.750806 22711 UserTrace BIP11504I: Waiting for data from input node ''CDCIn_MQInput''.
2016-08-17 10:26:35.867408 22711 UserTrace BIP11507W: Rolled back a local transaction.
2016-08-17 10:27:30.868978 22711 UserTrace BIP13034I: Rolled back a locally-coordinated transaction on WebSphere MQ queue manager ''CDCQMD''.
2016-08-17 10:27:30.869078 22711 UserTrace BIP13031I: Closed WebSphere MQ queue ''CDC_IGNORED_MSGS.QUEUE''.
2016-08-17 10:27:30.869152 22711 UserTrace BIP13024I: Disconnected from WebSphere MQ queue manager ''CDCQMD''.
2016-08-17 10:27:30.899962 22711 UserTrace BIP13034I: Rolled back a locally-coordinated transaction on WebSphere MQ queue manager ''PTL.SUPPORT''.
2016-08-17 10:27:30.930376 22711 UserTrace BIP13031I: Closed WebSphere MQ queue ''RIPTIDE.OTR.ROAD_CALL''.
2016-08-17 10:27:30.961586 22711 UserTrace BIP13024I: Disconnected from WebSphere MQ queue manager ''PTL.SUPPORT''.
22083 22407 22711 3361 3424 3425
++++

PTL.SUPPORT => This is remote QM, connected from MQInput node using 'clientConnection'.
CDCQMD - is a local QM.

I do see, messages are being rolled back - but reason is not sure. Because - when we use the SAME code , but read the message from local queue (instead of remote QM queue) - we DO NOT see this behaviour for SAME messages.

vishBroker

Posted: Wed Aug 17, 2016 7:53 am Post subject:

Centurion

Joined: 08 Dec 2010
Posts: 135

Is there a way to upload a file here?

here is the trace for one of the thread. If needed, I can copy/upload complete userTrace.

Code:

2016-08-17 10:26:25.888580 22711 UserTrace BIP13026I: Read message from WebSphere MQ queue ''RIPTIDE.OTR.ROAD_CALL''.
2016-08-17 10:26:25.888638 22711 UserTrace BIP11501I: Received data from input node ''CDCIn_MQInput''.
2016-08-17 10:26:25.888706 22711 UserTrace BIP3907I: Message received and propagated to 'out' terminal of input node 'com.ptl.cdc.PTL_CDCPattern_MF.CDCIn_MQInput'.
2016-08-17 10:26:25.888844 22711 UserTrace BIP6060I: Node 'com.ptl.cdc.PTL_CDCPattern_MF.CDCIn_MQInput' used parser type ''Properties'' to process a portion of the incoming message of length '0' bytes beginning at offset '0'.
2016-08-17 10:26:25.888880 22711 UserTrace BIP6061I: Node 'com.ptl.cdc.PTL_CDCPattern_MF.CDCIn_MQInput' used parser type ''MQMD'' to process a portion of the incoming message of length '364' bytes beginning at offset '0'. The parser type was selected based on value ''MQHMD'' from the previous parser.
2016-08-17 10:26:25.888942 22711 UserTrace BIP6061I: Node 'com.ptl.cdc.PTL_CDCPattern_MF.CDCIn_MQInput' used parser type ''DFDL'' to process a portion of the incoming message of length '1429' bytes beginning at offset '364'. The parser type was selected based on value ''DFDL'' from the previous parser.
2016-08-17 10:26:25.889280 22711 UserTrace BIP2537I: Node 'com.ptl.cdc.PTL_CDCPattern_MF.CheckMessageValidity_Compute': Executing statement ''BEGIN ... END;'' at ('com.ptl.cdc.PTL_CDCPattern_CheckMessageValidity_Compute.Main', '2.2').
2016-08-17 10:26:25.889312 22711 UserTrace BIP2537I: Node 'com.ptl.cdc.PTL_CDCPattern_MF.CheckMessageValidity_Compute': Executing statement ''CALL CopyEntireMessage();'' at ('com.ptl.cdc.PTL_CDCPattern_CheckMessageValidity_Compute.Main', '3.3').
....
2016-08-17 10:26:25.917926 22711 UserTrace BIP13030I: Opened WebSphere MQ queue ''CDC_IGNORED_MSGS.QUEUE''.
2016-08-17 10:26:25.918132 22711 UserTrace BIP13027I: Wrote message to WebSphere MQ queue ''CDC_IGNORED_MSGS.QUEUE''.
2016-08-17 10:26:25.918140 22711 UserTrace BIP2622I: Message successfully output by ''com.ptl.cdc.PTL_CDCPattern_MF.IGNORED_MSGS_QUEUE'' to queue ''CDC_IGNORED_MSGS.QUEUE'' on queue manager ''CDCQMD''.
2016-08-17 10:26:25.918206 22711 UserTrace BIP2537I: Node 'com.ptl.cdc.PTL_CDCPattern_MF.CheckMessageValidity_Compute': Executing statement ''RETURN FALSE;'' at ('com.ptl.cdc.PTL_CDCPattern_CheckMessageValidity_Compute.Main', '29.8').
2016-08-17 10:26:26.289974 22711 UserTrace BIP13033I: Committed a locally-coordinated transaction on WebSphere MQ queue manager ''PTL.SUPPORT''.
2016-08-17 10:26:26.292414 22711 UserTrace BIP13033I: Committed a locally-coordinated transaction on WebSphere MQ queue manager ''CDCQMD''.
2016-08-17 10:26:26.292428 22711 UserTrace BIP11506I: Committed a local transaction.
2016-08-17 10:26:26.526078 22711 UserTrace BIP11507W: Rolled back a local transaction.
2016-08-17 10:26:26.855080 22711 UserTrace BIP11504I: Waiting for data from input node ''CDCIn_MQInput''.
2016-08-17 10:26:30.683046 22711 UserTrace BIP13026I: Read message from WebSphere MQ queue ''RIPTIDE.OTR.ROAD_CALL''.
2016-08-17 10:26:30.683096 22711 UserTrace BIP11501I: Received data from input node ''CDCIn_MQInput''.
2016-08-17 10:26:30.683164 22711 UserTrace BIP3907I: Message received and propagated to 'out' terminal of input node 'com.ptl.cdc.PTL_CDCPattern_MF.CDCIn_MQInput'.
2016-08-17 10:26:30.683284 22711 UserTrace BIP6060I: Node 'com.ptl.cdc.PTL_CDCPattern_MF.CDCIn_MQInput' used parser type ''Properties'' to process a portion of the incoming message of length '0' bytes beginning at offset '0'.

.....

2016-08-17 10:26:30.712790 22711 UserTrace BIP2622I: Message successfully output by ''com.ptl.cdc.PTL_CDCPattern_MF.IGNORED_MSGS_QUEUE'' to queue ''CDC_IGNORED_MSGS.QUEUE'' on queue manager ''CDCQMD''.
2016-08-17 10:26:30.712840 22711 UserTrace BIP2537I: Node 'com.ptl.cdc.PTL_CDCPattern_MF.CheckMessageValidity_Compute': Executing statement ''RETURN FALSE;'' at ('com.ptl.cdc.PTL_CDCPattern_CheckMessageValidity_Compute.Main', '29.8').
2016-08-17 10:26:30.749544 22711 UserTrace BIP13033I: Committed a locally-coordinated transaction on WebSphere MQ queue manager ''PTL.SUPPORT''.
2016-08-17 10:26:30.750730 22711 UserTrace BIP13033I: Committed a locally-coordinated transaction on WebSphere MQ queue manager ''CDCQMD''.
2016-08-17 10:26:30.750740 22711 UserTrace BIP11506I: Committed a local transaction.
2016-08-17 10:26:30.750806 22711 UserTrace BIP11504I: Waiting for data from input node ''CDCIn_MQInput''.
2016-08-17 10:26:35.867408 22711 UserTrace BIP11507W: Rolled back a local transaction.
2016-08-17 10:27:30.868978 22711 UserTrace BIP13034I: Rolled back a locally-coordinated transaction on WebSphere MQ queue manager ''CDCQMD''.
2016-08-17 10:27:30.869078 22711 UserTrace BIP13031I: Closed WebSphere MQ queue ''CDC_IGNORED_MSGS.QUEUE''.
2016-08-17 10:27:30.869152 22711 UserTrace BIP13024I: Disconnected from WebSphere MQ queue manager ''CDCQMD''.
2016-08-17 10:27:30.899962 22711 UserTrace BIP13034I: Rolled back a locally-coordinated transaction on WebSphere MQ queue manager ''PTL.SUPPORT''.
2016-08-17 10:27:30.930376 22711 UserTrace BIP13031I: Closed WebSphere MQ queue ''RIPTIDE.OTR.ROAD_CALL''.
2016-08-17 10:27:30.961586 22711 UserTrace BIP13024I: Disconnected from WebSphere MQ queue manager ''PTL.SUPPORT''.

[/code]

vishBroker

Posted: Thu Aug 18, 2016 8:13 am Post subject:

Centurion

Joined: 08 Dec 2010
Posts: 135

Trying to summarize the issue and observations -

Details â€“

1. IIB flow has MQInput node connecting to remote QM using client Connections and has additional instances = 2.
2. Issue â€“ Message(s) get stuck on to queue after processing few messages. (Or sometimes even for first message itself)
3. We see QDEPTH is greater than 0 even if IPPROCs is 2 (or more)
4. Sometimes, The â€˜stuckâ€™ed message gets processed after sometime. [This happens after â€˜MQConnIdleTimeOutâ€™ occurred - IPPROCS drop to 1 and new connection is created IPPROCS becomes 2 again and processes the message]
5. Sometime â€“ messages get stuck for long.
6. And, sometimes, if we put one more message on the queue â€“ [ IPPROCS increases and] new message and stuck message get processed.
7. The solution uses GC and interacts with DB in the same flow.

Things done to narrow down the issue
8. To prove issue is not because of GC â€“ We removed the GC logic and we were doing DB interactions directly. In this case also, we saw above behavior.
9. To prove that the issue is not because of DB interaction code â€“ We created simple flow [MQInput -> MQOutput] reading from remoteQ using clientConnection and writing into local queue. => We still observed same behavior.

We tend to believe that the issue is in the way IIB flow is reading the messages from queue on remote QM using client connection properties. - Please correct/guide.

NOTE â€“ we change the # of instances to 0 [meaning only single thread is reading from the queue] => we DO NOT SEE the issue.
(we do not see the issue for simple flow[#9] as well as original solution[#7]

Vitor

Posted: Thu Aug 18, 2016 8:30 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

What is SHARECONV set to on the channel?
_________________
Honesty is the best policy.
Insanity is the best defence.

mqjeff

Posted: Thu Aug 18, 2016 8:33 am Post subject:

Grand Master

Joined: 25 Jun 2008
Posts: 17447

When you see that qdepth >0 but messages aren't being processed, what is the UNCOM value when you get the qstatus?
_________________
chmod -R ugo-wx /

vishBroker

Posted: Thu Aug 18, 2016 8:53 am Post subject:

Centurion

Joined: 08 Dec 2010
Posts: 135

SHARED connections are 10 on the server channel.
The uncommited messages were NO.

I do not have access to remote QM. I am getting these details from MQExplorer. [I wish, I could share the screenshot, but it seems here screenshots/images are not allowed.]
WIll try to get these details by asking IBMi support person.

Well, using MS0P to monitor remote queue - could capture few things.
Here is .csv file from MQ Activity viewer when message got stuck.

Code:

"Queue name","App name","Process","Thread","App type","App description","Browse","Inquire","Input","Output","Set","Open options","User ID","Channel name","Conn name","UOW type","Handle state","Asynchronous state","QM UOW ID","External UOW ID"
"RIPTIDE.OTR.ROAD_CALL","DataFlowEngine","3047","0","User","WebSphere MQ Channel","No","Yes","Shared","No","No","Save all context, Inquire, Fail if quiescing, Input shared","MQUSER","RIPTIDE.SVRCONN","xx.xxx.xxx.xxx","Queue manager","Active","Suspended","0.0","XA_FORMATID[00000000] XA_GTRID[] XA_BQUAL[]"
"RIPTIDE.OTR.ROAD_CALL","DataFlowEngine","3047","0","User","WebSphere MQ Channel","No","Yes","Shared","No","No","Save all context, Inquire, Fail if quiescing, Input shared","MQUSER","RIPTIDE.SVRCONN","xxx.xxx.xx.xx","Queue manager","Inactive","Active","0.0","XA_FORMATID[00000000] XA_GTRID[] XA_BQUAL[]"
"RIPTIDE.OTR.ROAD_CALL","803530/PPPP/ROAD_CALL","75942","1","User","","No","No","No","Yes","No","Output, Fail if quiescing","PPP","","","Queue manager","Inactive","None","0.0","XA_FORMATID[00000000] XA_GTRID[] XA_BQUAL[]"

Last edited by vishBroker on Thu Aug 18, 2016 9:05 am; edited 2 times in total

vishBroker

Posted: Thu Aug 18, 2016 8:58 am Post subject:

Centurion

Joined: 08 Dec 2010
Posts: 135

One DFE thread is in suspended state.

Quote:

"RIPTIDE.OTR.ROAD_CALL","DataFlowEngine","3047","0","User","WebSphere MQ Channel","No","Yes","Shared","No","No","Save all context, Inquire, Fail if quiescing, Input shared","MQUSER","RIPTIDE.SVRCONN","x.xx.xx.xxx","Queue manager","Active","Suspended","0.0","XA_FORMATID[00000000] XA_GTRID[] XA_BQUAL[]"

Kindly point - where should I start looking.

Have ran user trace - do not see any errors/exceptions.
But, I do see 'local transactions are rolled back'. Here in the flow - we have only MQInput node and MQOutput node - no compute node. I could see previous messages(before getting stuck) are properly PUT on the target queue (which is local for IIB NODE).

Seems like the MQconn/handle is getting closed by the server but client still holds it {or other way round}.
What should I be looking for?
kindly help.

vishBroker

Posted: Thu Aug 18, 2016 9:08 am Post subject:

Centurion

Joined: 08 Dec 2010
Posts: 135

FYI - edited the CSV file to mask few things.

When, this is captured - 2 threads were reading from the queue (IPPROC = 2) and QDepth was 1.
The message got picked up after 1 minute or so.

vishBroker

Posted: Thu Aug 18, 2016 11:59 am Post subject:

Centurion

Joined: 08 Dec 2010
Posts: 135

I looked into infocenter for the meaning of 'Asynchronous state' = suspended. This is what I got - but not able to understand it completely.

Quote:

Suspended: The asynchronous consumption callback has been suspended so that asynchronous message consumption cannot currently proceed on this object handle. This can be either because an MQCB call with Operation MQOP_SUSPEND has been issued against this object handle by the application, or because it has been suspended by the system. If it has been suspended by the system, as part of the process of suspending asynchronous message consumption the callback function will be called with the reason code that describes the problem resulting in suspension. This will be reported in the Reason field in the MQCBC structure that is passed to the callback function. For asynchronous message consumption to proceed, the application must issue an MQCB call with the Operation parameter set to MQOP_RESUME.

THe consumer application in this case is IIB. We are NOT coding anything out-of-ordinary for MQ Input node.
Transaction Mode - YES
We are using MQ client connection properties to read to remote queue.
Additional instances = 2
OrderMode = default

So, MQ connections are made and handled by IIB itself.
(IIB Node does has Local QM configured.)
And we are NOT doing anything thing in the code - it is just MQInput ->MQOutput. (for debug purpsoe)

I don't see any place to provide MOOP_RESUME option. (And hoping, we are not required to provide it as well)

mqjeff

Posted: Fri Aug 19, 2016 3:58 am Post subject:

Grand Master

Joined: 25 Jun 2008
Posts: 17447

All that stuff about MQCB fields is, indeed, handled by IIB itself.

It sounds like something is going wrong with the client connection - either on the IIB side or the MQ side.

I haven't worked with the MQ*Nodes using client connections, so I'm not sure if there are parameters for things like timeouts or loop intervals or whatever.

It sounds like you've done enough troubleshooting to identify the next place to troubleshoot and also figured out how to replicate the issue. These are both very good things!

... look at the AMQERR logs for the qmgr in question, and look at the system log on the broker side.

If you don't see anything, enable broker service trace, and very carefully examine the output for any mq events that look like they might involve the suspend.

If it looks like the broker is receiving a message to suspend the connection, then look at the qmgr side and run a trace on it. You can limit the trace to specific things - possibly specific enough to only show the broker client connections... but I don't remember if it gets that tight. You should at least be able to identify the amq process that's running the thread that's talking to broker. Then you can only look at that trace file. (their names include the pid.)

If that still gets you nowhere, or you want help going through this process... then open a PMR. And expect to be told to upgrade your system to the latest FP if it's not there.
_________________
chmod -R ugo-wx /

vishBroker

Posted: Mon Aug 22, 2016 2:23 am Post subject:

Centurion

Joined: 08 Dec 2010
Posts: 135

Thank you, MQJeff!

We have opened a PMR. Will keep updated.
Thanks all for the inputs.

Display posts from previous:

Goto page Previous 1, 2

Page 2 of 2

MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » msgs getting stuck in IBMi queue after processing few msgs

Jump to:

You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum

Protected by Anti-Spam ACP