ASG
IBM
Zystems
Cressida
Icon
Netflexity
 
  MQSeries.net
Search  Search       Tech Exchange      Education      Certifications      Library      Info Center      SupportPacs      LinkedIn  Search  Search                                                                   FAQ  FAQ   Usergroups  Usergroups
 
Register  ::  Log in Log in to check your private messages
 
RSS Feed - WebSphere MQ Support RSS Feed - Message Broker Support

MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » msgs getting stuck in IBMi queue after processing few msgs

Post new topic  Reply to topic Goto page Previous  1, 2
 msgs getting stuck in IBMi queue after processing few msgs « View previous topic :: View next topic » 
Author Message
Vitor
PostPosted: Wed Aug 17, 2016 5:51 am    Post subject: Reply with quote

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

vishBroker wrote:
The message does not even enter debug - prior messages enter debug and get processed.


User trace. The debugger will affect your processing.

What is the channel status of the client connection when the messages get "stuck"?
_________________
Honesty is the best policy.
Insanity is the best defence.
Back to top
View user's profile Send private message
mqjeff
PostPosted: Wed Aug 17, 2016 6:02 am    Post subject: Reply with quote

Grand Master

Joined: 25 Jun 2008
Posts: 17447

A message that is "stuck" is not committed.
_________________
chmod -R ugo-wx /
Back to top
View user's profile Send private message
vishBroker
PostPosted: Wed Aug 17, 2016 6:17 am    Post subject: Reply with quote

Centurion

Joined: 08 Dec 2010
Posts: 135

Quote:
A message that is "stuck" is not committed.

The PUT is commited, seems so - as another application (RFHUtilc) can read the 'stuck'ed message.
Back to top
View user's profile Send private message
smdavies99
PostPosted: Wed Aug 17, 2016 6:20 am    Post subject: Reply with quote

Jedi Council

Joined: 10 Feb 2003
Posts: 6076
Location: Somewhere over the Rainbow this side of Never-never land.

Vitor wrote:


User trace.




This is clearly the case where debugger is not the right tool for the job.
Using usertrace will let you trace the path of a message through the flow in a timely manner and without waiting for the blob of mostly water that is trying to operate the debugger in real time.
If the 'stuck' message never gets into the flow then it is just not ready to be read.
Normally, this means that it is uncomitted.
_________________
WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995

Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions.
Back to top
View user's profile Send private message
vishBroker
PostPosted: Wed Aug 17, 2016 7:17 am    Post subject: Reply with quote

Centurion

Joined: 08 Dec 2010
Posts: 135

Here are excerpts from userTrace.
++++
2016-08-17 10:26:30.750740 22711 UserTrace BIP11506I: Committed a local transaction.
2016-08-17 10:26:30.750806 22711 UserTrace BIP11504I: Waiting for data from input node ''CDCIn_MQInput''.
2016-08-17 10:26:35.867408 22711 UserTrace
BIP11507W: Rolled back a local transaction.
2016-08-17 10:27:30.868978 22711 UserTrace BIP13034I: Rolled back a locally-coordinated transaction on WebSphere MQ queue manager ''CDCQMD''.
2016-08-17 10:27:30.869078 22711 UserTrace BIP13031I: Closed WebSphere MQ queue ''CDC_IGNORED_MSGS.QUEUE''.
2016-08-17 10:27:30.869152 22711 UserTrace BIP13024I: Disconnected from WebSphere MQ queue manager ''CDCQMD''.
2016-08-17 10:27:30.899962 22711 UserTrace BIP13034I: Rolled back a locally-coordinated transaction on WebSphere MQ queue manager ''PTL.SUPPORT''.
2016-08-17 10:27:30.930376 22711 UserTrace BIP13031I: Closed WebSphere MQ queue ''RIPTIDE.OTR.ROAD_CALL''.
2016-08-17 10:27:30.961586 22711 UserTrace BIP13024I: Disconnected from WebSphere MQ queue manager ''PTL.SUPPORT''.
22083 22407 22711 3361 3424 3425
++++

PTL.SUPPORT => This is remote QM, connected from MQInput node using 'clientConnection'.
CDCQMD - is a local QM.

I do see, messages are being rolled back - but reason is not sure. Because - when we use the SAME code , but read the message from local queue (instead of remote QM queue) - we DO NOT see this behaviour for SAME messages.
Back to top
View user's profile Send private message
vishBroker
PostPosted: Wed Aug 17, 2016 7:53 am    Post subject: Reply with quote

Centurion

Joined: 08 Dec 2010
Posts: 135

Is there a way to upload a file here?

here is the trace for one of the thread. If needed, I can copy/upload complete userTrace.
Code:

2016-08-17 10:26:25.888580    22711   UserTrace   BIP13026I: Read message from WebSphere MQ queue ''RIPTIDE.OTR.ROAD_CALL''.
2016-08-17 10:26:25.888638    22711   UserTrace   BIP11501I: Received data from input node ''CDCIn_MQInput''.
2016-08-17 10:26:25.888706    22711   UserTrace   BIP3907I: Message received and propagated to 'out' terminal of input node 'com.ptl.cdc.PTL_CDCPattern_MF.CDCIn_MQInput'.
2016-08-17 10:26:25.888844    22711   UserTrace   BIP6060I: Node 'com.ptl.cdc.PTL_CDCPattern_MF.CDCIn_MQInput' used parser type ''Properties'' to process a portion of the incoming message of length '0' bytes beginning at offset '0'.
2016-08-17 10:26:25.888880    22711   UserTrace   BIP6061I: Node 'com.ptl.cdc.PTL_CDCPattern_MF.CDCIn_MQInput' used parser type ''MQMD'' to process a portion of the incoming message of length '364' bytes beginning at offset '0'. The parser type was selected based on value ''MQHMD'' from the previous parser.
2016-08-17 10:26:25.888942    22711   UserTrace   BIP6061I: Node 'com.ptl.cdc.PTL_CDCPattern_MF.CDCIn_MQInput' used parser type ''DFDL'' to process a portion of the incoming message of length '1429' bytes beginning at offset '364'. The parser type was selected based on value ''DFDL'' from the previous parser.
2016-08-17 10:26:25.889280    22711   UserTrace   BIP2537I: Node 'com.ptl.cdc.PTL_CDCPattern_MF.CheckMessageValidity_Compute': Executing statement   ''BEGIN ... END;'' at ('com.ptl.cdc.PTL_CDCPattern_CheckMessageValidity_Compute.Main', '2.2').
2016-08-17 10:26:25.889312    22711   UserTrace   BIP2537I: Node 'com.ptl.cdc.PTL_CDCPattern_MF.CheckMessageValidity_Compute': Executing statement   ''CALL CopyEntireMessage();'' at ('com.ptl.cdc.PTL_CDCPattern_CheckMessageValidity_Compute.Main', '3.3').
....
2016-08-17 10:26:25.917926    22711   UserTrace   BIP13030I: Opened WebSphere MQ queue ''CDC_IGNORED_MSGS.QUEUE''.
2016-08-17 10:26:25.918132    22711   UserTrace   BIP13027I: Wrote message to WebSphere MQ queue ''CDC_IGNORED_MSGS.QUEUE''.
2016-08-17 10:26:25.918140    22711   UserTrace   BIP2622I: Message successfully output by ''com.ptl.cdc.PTL_CDCPattern_MF.IGNORED_MSGS_QUEUE'' to queue ''CDC_IGNORED_MSGS.QUEUE'' on queue manager ''CDCQMD''.
2016-08-17 10:26:25.918206    22711   UserTrace   BIP2537I: Node 'com.ptl.cdc.PTL_CDCPattern_MF.CheckMessageValidity_Compute': Executing statement   ''RETURN FALSE;'' at ('com.ptl.cdc.PTL_CDCPattern_CheckMessageValidity_Compute.Main', '29.8').
2016-08-17 10:26:26.289974    22711   UserTrace   BIP13033I: Committed a locally-coordinated transaction on WebSphere MQ queue manager ''PTL.SUPPORT''.
2016-08-17 10:26:26.292414    22711   UserTrace   BIP13033I: Committed a locally-coordinated transaction on WebSphere MQ queue manager ''CDCQMD''.
2016-08-17 10:26:26.292428    22711   UserTrace   BIP11506I: Committed a local transaction.
2016-08-17 10:26:26.526078    22711   UserTrace   BIP11507W: Rolled back a local transaction.
2016-08-17 10:26:26.855080    22711   UserTrace   BIP11504I: Waiting for data from input node ''CDCIn_MQInput''.
2016-08-17 10:26:30.683046    22711   UserTrace   BIP13026I: Read message from WebSphere MQ queue ''RIPTIDE.OTR.ROAD_CALL''.
2016-08-17 10:26:30.683096    22711   UserTrace   BIP11501I: Received data from input node ''CDCIn_MQInput''.
2016-08-17 10:26:30.683164    22711   UserTrace   BIP3907I: Message received and propagated to 'out' terminal of input node 'com.ptl.cdc.PTL_CDCPattern_MF.CDCIn_MQInput'.
2016-08-17 10:26:30.683284    22711   UserTrace   BIP6060I: Node 'com.ptl.cdc.PTL_CDCPattern_MF.CDCIn_MQInput' used parser type ''Properties'' to process a portion of the incoming message of length '0' bytes beginning at offset '0'.

.....

2016-08-17 10:26:30.712790    22711   UserTrace   BIP2622I: Message successfully output by ''com.ptl.cdc.PTL_CDCPattern_MF.IGNORED_MSGS_QUEUE'' to queue ''CDC_IGNORED_MSGS.QUEUE'' on queue manager ''CDCQMD''.
2016-08-17 10:26:30.712840    22711   UserTrace   BIP2537I: Node 'com.ptl.cdc.PTL_CDCPattern_MF.CheckMessageValidity_Compute': Executing statement   ''RETURN FALSE;'' at ('com.ptl.cdc.PTL_CDCPattern_CheckMessageValidity_Compute.Main', '29.8').
2016-08-17 10:26:30.749544    22711   UserTrace   BIP13033I: Committed a locally-coordinated transaction on WebSphere MQ queue manager ''PTL.SUPPORT''.
2016-08-17 10:26:30.750730    22711   UserTrace   BIP13033I: Committed a locally-coordinated transaction on WebSphere MQ queue manager ''CDCQMD''.
2016-08-17 10:26:30.750740    22711   UserTrace   BIP11506I: Committed a local transaction.
2016-08-17 10:26:30.750806    22711   UserTrace   BIP11504I: Waiting for data from input node ''CDCIn_MQInput''.
2016-08-17 10:26:35.867408    22711   UserTrace   BIP11507W: Rolled back a local transaction.
2016-08-17 10:27:30.868978    22711   UserTrace   BIP13034I: Rolled back a locally-coordinated transaction on WebSphere MQ queue manager ''CDCQMD''.
2016-08-17 10:27:30.869078    22711   UserTrace   BIP13031I: Closed WebSphere MQ queue ''CDC_IGNORED_MSGS.QUEUE''.
2016-08-17 10:27:30.869152    22711   UserTrace   BIP13024I: Disconnected from WebSphere MQ queue manager ''CDCQMD''.
2016-08-17 10:27:30.899962    22711   UserTrace   BIP13034I: Rolled back a locally-coordinated transaction on WebSphere MQ queue manager ''PTL.SUPPORT''.
2016-08-17 10:27:30.930376    22711   UserTrace   BIP13031I: Closed WebSphere MQ queue ''RIPTIDE.OTR.ROAD_CALL''.
2016-08-17 10:27:30.961586    22711   UserTrace   BIP13024I: Disconnected from WebSphere MQ queue manager ''PTL.SUPPORT''.
[/code]
Back to top
View user's profile Send private message
vishBroker
PostPosted: Thu Aug 18, 2016 8:13 am    Post subject: Reply with quote

Centurion

Joined: 08 Dec 2010
Posts: 135

Trying to summarize the issue and observations -

Details –

1. IIB flow has MQInput node connecting to remote QM using client Connections and has additional instances = 2.
2. Issue – Message(s) get stuck on to queue after processing few messages. (Or sometimes even for first message itself)
3. We see QDEPTH is greater than 0 even if IPPROCs is 2 (or more)
4. Sometimes, The ‘stuck’ed message gets processed after sometime. [This happens after ‘MQConnIdleTimeOut’ occurred - IPPROCS drop to 1 and new connection is created IPPROCS becomes 2 again and processes the message]
5. Sometime – messages get stuck for long.
6. And, sometimes, if we put one more message on the queue – [ IPPROCS increases and] new message and stuck message get processed.
7. The solution uses GC and interacts with DB in the same flow.

Things done to narrow down the issue
8. To prove issue is not because of GC – We removed the GC logic and we were doing DB interactions directly. In this case also, we saw above behavior.
9. To prove that the issue is not because of DB interaction code – We created simple flow [MQInput -> MQOutput] reading from remoteQ using clientConnection and writing into local queue. => We still observed same behavior.

We tend to believe that the issue is in the way IIB flow is reading the messages from queue on remote QM using client connection properties. - Please correct/guide.

NOTE – we change the # of instances to 0 [meaning only single thread is reading from the queue] => we DO NOT SEE the issue.
(we do not see the issue for simple flow[#9] as well as original solution[#7]
Back to top
View user's profile Send private message
Vitor
PostPosted: Thu Aug 18, 2016 8:30 am    Post subject: Reply with quote

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

What is SHARECONV set to on the channel?
_________________
Honesty is the best policy.
Insanity is the best defence.
Back to top
View user's profile Send private message
mqjeff
PostPosted: Thu Aug 18, 2016 8:33 am    Post subject: Reply with quote

Grand Master

Joined: 25 Jun 2008
Posts: 17447

When you see that qdepth >0 but messages aren't being processed, what is the UNCOM value when you get the qstatus?
_________________
chmod -R ugo-wx /
Back to top
View user's profile Send private message
vishBroker
PostPosted: Thu Aug 18, 2016 8:53 am    Post subject: Reply with quote

Centurion

Joined: 08 Dec 2010
Posts: 135

SHARED connections are 10 on the server channel.
The uncommited messages were NO.

I do not have access to remote QM. I am getting these details from MQExplorer. [I wish, I could share the screenshot, but it seems here screenshots/images are not allowed.]
WIll try to get these details by asking IBMi support person.

Well, using MS0P to monitor remote queue - could capture few things.
Here is .csv file from MQ Activity viewer when message got stuck.
Code:
"Queue name","App name","Process","Thread","App type","App description","Browse","Inquire","Input","Output","Set","Open options","User ID","Channel name","Conn name","UOW type","Handle state","Asynchronous state","QM UOW ID","External UOW ID"
"RIPTIDE.OTR.ROAD_CALL","DataFlowEngine","3047","0","User","WebSphere MQ Channel","No","Yes","Shared","No","No","Save all context, Inquire, Fail if quiescing, Input shared","MQUSER","RIPTIDE.SVRCONN","xx.xxx.xxx.xxx","Queue manager","Active","Suspended","0.0","XA_FORMATID[00000000] XA_GTRID[] XA_BQUAL[]"
"RIPTIDE.OTR.ROAD_CALL","DataFlowEngine","3047","0","User","WebSphere MQ Channel","No","Yes","Shared","No","No","Save all context, Inquire, Fail if quiescing, Input shared","MQUSER","RIPTIDE.SVRCONN","xxx.xxx.xx.xx","Queue manager","Inactive","Active","0.0","XA_FORMATID[00000000] XA_GTRID[] XA_BQUAL[]"
"RIPTIDE.OTR.ROAD_CALL","803530/PPPP/ROAD_CALL","75942","1","User","","No","No","No","Yes","No","Output, Fail if quiescing","PPP","","","Queue manager","Inactive","None","0.0","XA_FORMATID[00000000] XA_GTRID[] XA_BQUAL[]"


Last edited by vishBroker on Thu Aug 18, 2016 9:05 am; edited 2 times in total
Back to top
View user's profile Send private message
vishBroker
PostPosted: Thu Aug 18, 2016 8:58 am    Post subject: Reply with quote

Centurion

Joined: 08 Dec 2010
Posts: 135

One DFE thread is in suspended state.
Quote:
"RIPTIDE.OTR.ROAD_CALL","DataFlowEngine","3047","0","User","WebSphere MQ Channel","No","Yes","Shared","No","No","Save all context, Inquire, Fail if quiescing, Input shared","MQUSER","RIPTIDE.SVRCONN","x.xx.xx.xxx","Queue manager","Active","Suspended","0.0","XA_FORMATID[00000000] XA_GTRID[] XA_BQUAL[]"

Kindly point - where should I start looking.

Have ran user trace - do not see any errors/exceptions.
But, I do see 'local transactions are rolled back'. Here in the flow - we have only MQInput node and MQOutput node - no compute node. I could see previous messages(before getting stuck) are properly PUT on the target queue (which is local for IIB NODE).

Seems like the MQconn/handle is getting closed by the server but client still holds it {or other way round}.
What should I be looking for?
kindly help.
Back to top
View user's profile Send private message
vishBroker
PostPosted: Thu Aug 18, 2016 9:08 am    Post subject: Reply with quote

Centurion

Joined: 08 Dec 2010
Posts: 135

FYI - edited the CSV file to mask few things.

When, this is captured - 2 threads were reading from the queue (IPPROC = 2) and QDepth was 1.
The message got picked up after 1 minute or so.
Back to top
View user's profile Send private message
vishBroker
PostPosted: Thu Aug 18, 2016 11:59 am    Post subject: Reply with quote

Centurion

Joined: 08 Dec 2010
Posts: 135

I looked into infocenter for the meaning of 'Asynchronous state' = suspended. This is what I got - but not able to understand it completely.
Quote:
Suspended: The asynchronous consumption callback has been suspended so that asynchronous message consumption cannot currently proceed on this object handle. This can be either because an MQCB call with Operation MQOP_SUSPEND has been issued against this object handle by the application, or because it has been suspended by the system. If it has been suspended by the system, as part of the process of suspending asynchronous message consumption the callback function will be called with the reason code that describes the problem resulting in suspension. This will be reported in the Reason field in the MQCBC structure that is passed to the callback function. For asynchronous message consumption to proceed, the application must issue an MQCB call with the Operation parameter set to MQOP_RESUME.


THe consumer application in this case is IIB. We are NOT coding anything out-of-ordinary for MQ Input node.
Transaction Mode - YES
We are using MQ client connection properties to read to remote queue.
Additional instances = 2
OrderMode = default

So, MQ connections are made and handled by IIB itself.
(IIB Node does has Local QM configured.)
And we are NOT doing anything thing in the code - it is just MQInput ->MQOutput. (for debug purpsoe)

I don't see any place to provide MOOP_RESUME option. (And hoping, we are not required to provide it as well)
Back to top
View user's profile Send private message
mqjeff
PostPosted: Fri Aug 19, 2016 3:58 am    Post subject: Reply with quote

Grand Master

Joined: 25 Jun 2008
Posts: 17447

All that stuff about MQCB fields is, indeed, handled by IIB itself.

It sounds like something is going wrong with the client connection - either on the IIB side or the MQ side.

I haven't worked with the MQ*Nodes using client connections, so I'm not sure if there are parameters for things like timeouts or loop intervals or whatever.

It sounds like you've done enough troubleshooting to identify the next place to troubleshoot and also figured out how to replicate the issue. These are both very good things!

... look at the AMQERR logs for the qmgr in question, and look at the system log on the broker side.

If you don't see anything, enable broker service trace, and very carefully examine the output for any mq events that look like they might involve the suspend.

If it looks like the broker is receiving a message to suspend the connection, then look at the qmgr side and run a trace on it. You can limit the trace to specific things - possibly specific enough to only show the broker client connections... but I don't remember if it gets that tight. You should at least be able to identify the amq process that's running the thread that's talking to broker. Then you can only look at that trace file. (their names include the pid.)

If that still gets you nowhere, or you want help going through this process... then open a PMR. And expect to be told to upgrade your system to the latest FP if it's not there.
_________________
chmod -R ugo-wx /
Back to top
View user's profile Send private message
vishBroker
PostPosted: Mon Aug 22, 2016 2:23 am    Post subject: Reply with quote

Centurion

Joined: 08 Dec 2010
Posts: 135

Thank you, MQJeff!

We have opened a PMR. Will keep updated.
Thanks all for the inputs.
Back to top
View user's profile Send private message
Display posts from previous:   
Post new topic  Reply to topic Goto page Previous  1, 2 Page 2 of 2

MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » msgs getting stuck in IBMi queue after processing few msgs
Jump to:  



You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
Protected by Anti-Spam ACP
 
 


Theme by Dustin Baccetti
Powered by phpBB © 2001, 2002 phpBB Group

Copyright © MQSeries.net. All rights reserved.