vishBroker |
Posted: Tue Aug 16, 2016 5:12 pm Post subject: msgs getting stuck in IBMi queue after processing few msgs |
|
|
Centurion
Joined: 08 Dec 2010 Posts: 135
|
Issue -
Messages are piling up on the IBMi queue after a few messages have been processed, even though IPPROCS is greater than 0.
We ran DISPLAY QSTATUS against the queue to confirm the queue depth and IPPROCS.
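A check of this kind can be sketched in MQSC as follows. This is only an illustration: the queue name APP.INPUT.QUEUE is a made-up placeholder, not the real queue from this post. UNCOM reports whether uncommitted gets or puts are pending on the queue, and TYPE(HANDLE) lists which connections currently hold the queue open.

```
* Placeholder queue name; substitute the real IBMi queue.
DISPLAY QSTATUS('APP.INPUT.QUEUE') TYPE(QUEUE) CURDEPTH IPPROCS OPPROCS UNCOM
* One entry per open handle: shows the channel and connection name behind each handle.
DISPLAY QSTATUS('APP.INPUT.QUEUE') TYPE(HANDLE) APPLTAG CHANNEL CONNAME INPUT
```

If UNCOM shows pending work while CURDEPTH stays constant, that would point towards an open unit of work rather than a missing consumer.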
Environment
IIB/MQ
IBM MQ - VERSION(08000004)
IIB - v10.0.0.3
OS - Red Hat Enterprise Linux Server release 6.7 (Santiago)
IBMi
IBMi MQ - version 08000004
Oracle DB
High-level flow
1. IIB reads from the IBMi queue using an MQInput node over a client connection.
2. The IIB flow does some transformation and interacts with the Oracle DB.
3. The MQInput node's 'Failure' and 'Catch' terminals are connected to a local queue.
4. DB interactions are done in a Compute node, using the PASSTHRU function.
5. We are reading from the IBMi queue with 2 additional instances (3 threads in total).
[We want to process DB transactions in parallel and are using GC to decide which transaction to ignore and which one to process for the DB operation.]
Transactionality
1. The MQInput node has Transaction mode set to 'YES'.
2. The MQOutput nodes have Transaction mode set to 'AUTOMATIC'.
3. The Compute node has Transaction mode set to 'AUTOMATIC'.
We have tried changing the Transaction mode of the MQOutput node and the Compute node to 'YES' as well as 'NO', but the issue remains.
Kindly suggest why messages are NOT getting picked up and where to look.
We have looked at the server channel's details and used the chstat utility (from http://www-01.ibm.com/support/docview.wss?uid=swg24017810) to look for connection-related issues, but could not find anything conclusive.
I have a few more observations that I would like to share, depending on the suggestions.
[Updated the issue, after response from 'smdavies99'. Thanks.]
Last edited by vishBroker on Wed Aug 17, 2016 3:23 am; edited 1 time in total |
|
 |
vishBroker |
Posted: Tue Aug 16, 2016 7:05 pm Post subject: |
|
|
Centurion
Joined: 08 Dec 2010 Posts: 135
|
To add more:
1. We are interacting with only one DB.
2. There is only one Compute node interacting with the DB.
3. Currently we have not configured the flow/node for global transactions.
I would like to understand the issue before configuring XA transactions.
|
 |
smdavies99 |
Posted: Tue Aug 16, 2016 9:51 pm Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
As you seem to have a local (to the IIB server) QMGR, have you tried configuring a SDR/RCVR channel from the client system QMGR to the IIB QMGR?
If that works and your flow works, then you might have issues in either
1) your MQ connection auth
or
2) An issue in IIB that should be raised with IBM via a PMR _________________ WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995
Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions. |
|
 |
vishBroker |
Posted: Wed Aug 17, 2016 3:20 am Post subject: |
|
|
Centurion
Joined: 08 Dec 2010 Posts: 135
|
Thanks.
(I see, what I missed in my question)
Well, we are using 'MQ Client connection' properties of MQInput node. Sender-receiver channel pair is not created.
The issue is: a few messages get processed and, after some time, the messages get stuck. Sometimes messages stay stuck for a very long time, and sometimes they get processed after about one minute. Once an idle MQ connection timeout happens, IPPROCS first drops from 3 to 2 [or sometimes from 2 to 1], then comes back to 3 [or 2], and then the message gets processed.
The issue seems to be with the way transactions are handled. It seems that, somehow, the IIB thread is not getting hold of the message(s) after some time. The behavior is random. Sometimes the 4th message gets stuck [we have 3 threads running in total] and sometimes the 2nd message itself gets stuck.
There are NO errors during DB operations and no errors on the MQ/IIB side.
Also, when we split the processing into two steps:
1. First read the messages from the IBMi queue (using an MQInput node with a client connection) and put them onto a local queue.
2. Then do the DB operations.
we do not see this issue.
Also, when we change the transactionality of the MQInput node to NO, we do not see this issue.
This makes us believe the issue arises because we have three things (remote QM, local QM/IIB and Oracle DB) in a single transaction, but we are not able to understand how exactly the transaction is behaving.
Kindly suggest. |
|
 |
mqjeff |
Posted: Wed Aug 17, 2016 3:54 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
Either the application that is putting the messages on a queue is failing to commit them.
Or the IIB flow is not handling errors correctly - causing messages to sit in a transaction with the GET until something causes them to be rolled back. _________________ chmod -R ugo-wx / |
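One way to look for a long-running unit of work of the kind described above is to list the connections on the IBMi queue manager together with their unit-of-work state. The following is a minimal MQSC sketch and uses no names from this thread. A connection whose UOWSTATE stays ACTIVE with an old UOWSTDA/UOWSTTI start time, while the queue depth does not fall, would be consistent with a GET that has been neither committed nor rolled back.

```
* Lists every connection with its unit-of-work state and start date/time.
DISPLAY CONN(*) TYPE(CONN) APPLTAG CHANNEL CONNAME UOWSTATE UOWSTDA UOWSTTI
```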
|
 |
smdavies99 |
Posted: Wed Aug 17, 2016 4:07 am Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
vishBroker wrote: |
Well, we are using 'MQ Client connection' properties of MQInput node. Sender-receiver channel pair is not created.
|
I was suggesting going this way to eliminate any issues with the MQInput node's access to the queue as an MQ client.
Do you have a BOQ defined for this queue? Perhaps your flow fails and rolls back and the message becomes a poison message? Without knowing the detailed logic of your flow we can't tell.
Also, using a SDR/RCVR channel will make sure that the flow only ever tries to read committed messages.
Finally, making your flow work properly using a local queue is a good thing. Then, when you are 100000% sure all is good and that errors are handled correctly, then and only then would I switch to a remote connection. That way you know at that point in time that any errors are due to the connection and/or the MQ auths and not your flow.
Last edited by smdavies99 on Wed Aug 17, 2016 4:10 am; edited 1 time in total |
|
 |
vishBroker |
Posted: Wed Aug 17, 2016 4:08 am Post subject: |
|
|
Centurion
Joined: 08 Dec 2010 Posts: 135
|
Thanks for the response.
Quote: |
Either the application that is putting the messages on a queue is failing to commit them. |
Well, we are able to read the message with RFHUtilc, so I guess the PUT is committed.
Quote: |
Or the IIB flow is not handling errors correctly - causing messages to sit in a transaction with the GET until something causes them to be rolled back. |
We suspect this too, but are not able to understand how it is happening.
Here are our observations -
1. We have connected the 'catch' and 'failure' terminals of the MQInput node.
2. The compute node - which is doing actual DB interaction is in the 'Try' path of 'tryCatch' node.
3. We are not seeing any DB errors or data errors.
4. For example, we have 4 update transactions on the same primary key. Sometimes all 4 of them get processed without any issue and we see the updates in the DB. And sometimes the 2nd update itself gets stuck.
When we then place a 3rd update onto the queue, all of them get processed.
So we are not sure whether this is caused by wrong error handling.
Our error handling is pretty straightforward as of now: we just put the message onto an error queue.
FYI - the messages are IBMi journals, i.e. fixed-length messages. The message header contains the operation to perform [update/delete/insert]. |
|
 |
smdavies99 |
Posted: Wed Aug 17, 2016 4:12 am Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
vishBroker wrote: |
2. The compute node - which is doing actual DB interaction is in the 'Try' path of 'tryCatch' node.
|
What is connected to the Catch terminal of the TryCatch node? Does it report errors correctly?
Why do you need it?
If so, then please try to understand what happens when an error happens and the transaction rolls back. |
|
 |
vishBroker |
Posted: Wed Aug 17, 2016 4:13 am Post subject: |
|
|
Centurion
Joined: 08 Dec 2010 Posts: 135
|
To add more about 'not handling errors correctly':
1. When we pick the message up from the IBMi queue and put it onto a local queue,
2. and then do the DB operation using a separate flow
[the code is the same, just in a separate flow that reads from the local queue instead of reading from the IBMi queue directly over a client connection],
the same messages get processed without any issue. The local queue does not get stuck or pile up. |
|
 |
mqjeff |
Posted: Wed Aug 17, 2016 4:13 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
Disconnect the failure terminal.
End your catch logic with a Throw node. |
|
 |
vishBroker |
Posted: Wed Aug 17, 2016 4:34 am Post subject: |
|
|
Centurion
Joined: 08 Dec 2010 Posts: 135
|
Thanks. Will try that and update.
FYI - TryCatch block logic ->
We are retrying the DB operation ONLY ONCE and ONLY if it is connectivity ERROR. And then we put the 'erred' message onto another queue.
+++
IF (iErrorCode = 08001 OR iErrorCode = 08006) THEN -- need to test this in QA
    SET Environment.DBConnectionErrorCount = COALESCE(Environment.DBConnectionErrorCount, 0) + 1;
    IF Environment.DBConnectionErrorCount = 1 THEN
        RETURN TRUE;
    ELSE
+++ |
|
 |
smdavies99 |
Posted: Wed Aug 17, 2016 4:39 am Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
vishBroker wrote: |
Thanks. Will try that and update.
FYI - TryCatch block logic ->
We are retrying the DB operation ONLY ONCE and ONLY if it is connectivity ERROR. And then we put the 'erred' message onto another queue.
+++
IF (iErrorCode = 08001 OR iErrorCode = 08006) THEN -- need to test this in QA
    SET Environment.DBConnectionErrorCount = COALESCE(Environment.DBConnectionErrorCount, 0) + 1;
    IF Environment.DBConnectionErrorCount = 1 THEN
        RETURN TRUE;
    ELSE
+++ |
If you think that those are the only error codes you need to test for then you are sorely mistaken.
In a project a long time ago, the so-called 'architect' deemed that we should handle each and every error that we got back from the DB. Once we'd coded for 100+ different errors, they saw the error of their ways.
In general, DB connections are pretty solid these days. In more recent projects, errors have been due to deadlocking rather than connectivity. |
|
 |
vishBroker |
Posted: Wed Aug 17, 2016 5:06 am Post subject: |
|
|
Centurion
Joined: 08 Dec 2010 Posts: 135
|
True that.
But this is one of those 'client wants it this way' cases. They want us to retry the DB operation ONLY ONCE and ONLY in case of a connectivity error. Hence the code.
[We have tried hard to explain that a connectivity error will not be resolved immediately. But we know how clients are: the client wants to retry once and, if the connectivity error happens again, put the message onto a queue. That is why we check the error count and, if it is greater than 1, put the message on the queue.]
Anyhow, currently we are not hitting those errors, and the messages coming to the 'catch' leg of the 'tryCatch' node are put on the queue.
The MQOutput node has Transaction mode set to 'AUTOMATIC' and the output queue is a local queue.
Last edited by vishBroker on Fri Aug 19, 2016 11:05 am; edited 1 time in total |
|
 |
mqjeff |
Posted: Wed Aug 17, 2016 5:09 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
You should handle the "if error occurs, put message to queue" using backout processing - BOTHRESH, BOQNAME. |
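As a rough sketch of what backout processing involves on the MQ side: the input queue carries a backout threshold and the name of a backout requeue queue, and the MQInput node can use these to move a message aside after repeated rollbacks. In MQSC the two queue attributes are named BOTHRESH and BOQNAME. Both queue names below are made-up placeholders and the threshold value is only an example.

```
* Placeholder queue names; the threshold value is only an example.
DEFINE QLOCAL('APP.INPUT.BACKOUT')
ALTER QLOCAL('APP.INPUT.QUEUE') BOTHRESH(3) BOQNAME('APP.INPUT.BACKOUT')
* Confirm the settings on the input queue.
DISPLAY QLOCAL('APP.INPUT.QUEUE') BOTHRESH BOQNAME
```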
|
 |
vishBroker |
Posted: Wed Aug 17, 2016 5:34 am Post subject: |
|
|
Centurion
Joined: 08 Dec 2010 Posts: 135
|
The backout threshold is set to 3 and there is a backout queue defined.
Well, when the error count is one, we do not put the message back onto the input queue; we reprocess it by connecting the out terminal to the 'tryCatch' node.
Anyhow, we got rid of this retry business altogether and we are still facing the issue. A few messages get processed and then it gets stuck.
The stuck message does not even enter the debugger; prior messages enter the debugger and get processed. |
|
 |
|