Author |
Message
|
rconn2 |
Posted: Fri Dec 12, 2014 6:39 am Post subject: How to troubleshoot messages stuck in transmission queue? |
|
|
Voyager
Joined: 09 Aug 2007 Posts: 79 Location: MD, USA
|
We have a transmission queue that appears to have dozens of messages stuck in it (depth = 45 or so), yet the corresponding channel is running (and has been), and other messages are going through the channel. I see no obvious errors in the logs.
Is there any way to troubleshoot this? We can't browse messages on a transmission queue. So, how can I determine what the messages are and why they might be getting stuck?
Or might this just be some kind of artifact and there are no stuck messages?
I've tried to investigate this issue of troubleshooting stuck transmission queue messages in the past, but reached a dead end and moved on to other issues. But, I figure I'll give it another try as it's important to get this figured out.
Thanks for any help -- this forum has always been my go-to resource.
Last edited by rconn2 on Fri Dec 12, 2014 7:43 am; edited 1 time in total |
|
Back to top |
|
 |
bruce2359 |
Posted: Fri Dec 12, 2014 6:44 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
If the xmitq is get-disabled, then the channel cannot be running. Make sure the sender channel definition points to the correct xmitq. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
rconn2 |
Posted: Fri Dec 12, 2014 7:46 am Post subject: |
|
|
Voyager
Joined: 09 Aug 2007 Posts: 79 Location: MD, USA
|
bruce2359 - Thanks. I just edited my original post. Gets and Puts are enabled (as I can see looking at the properties), but Browsing of a transmission queue is still not allowed (?).
Also, the channels and xmitq configuration are okay as messages are going through the channel. But, over time (days, weeks), the transmission queue begins to show a depth and that depth never goes away and only increases. |
|
|
 |
mqjeff |
Posted: Fri Dec 12, 2014 8:18 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
The channel has the xmitq open for exclusive get.
So it's open for get, but nobody else can get from it.
What is the uncommitted message count on the xmitq? |
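For anyone following along, the value being asked about can be read in runmqsc with `DISPLAY QSTATUS`. Here is a minimal Python sketch, not a definitive tool, that builds that command and parses the reply text. The queue name `MY.XMITQ` and the sample output are illustrative assumptions, and the sketch assumes UNCOM comes back either as a number or as YES/NO depending on platform.

```python
import re

# Matches KEYWORD(value) pairs in the layout runmqsc uses for its replies.
_ATTR = re.compile(r"([A-Z][A-Z0-9]*)\(([^()]*)\)")

# Illustrative sample only; the queue name and values are hypothetical.
SAMPLE_OUTPUT = """\
AMQ8450: Display queue status details.
   QUEUE(MY.XMITQ)                         TYPE(QUEUE)
   CURDEPTH(65)                            UNCOM(65)
"""


def qstatus_command(queue_name):
    """Build the MQSC text to feed to runmqsc for one queue."""
    return f"DISPLAY QSTATUS('{queue_name}') TYPE(QUEUE) CURDEPTH UNCOM"


def parse_attrs(mqsc_output):
    """Collect every KEYWORD(value) pair from the reply into a dict."""
    return {key: value.strip() for key, value in _ATTR.findall(mqsc_output)}


def summarize(attrs):
    """Compare queue depth with the uncommitted indicator."""
    depth = int(attrs["CURDEPTH"])
    raw = attrs.get("UNCOM", "NO").upper()
    if raw == "NO":
        uncommitted = 0
    elif raw == "YES":
        uncommitted = None  # only a flag was reported, not a count
    else:
        uncommitted = int(raw)
    return {
        "depth": depth,
        "uncommitted": uncommitted,
        "has_uncommitted": uncommitted != 0,
    }


if __name__ == "__main__":
    print(qstatus_command("MY.XMITQ"))
    print(summarize(parse_attrs(SAMPLE_OUTPUT)))
```

To use it against a real queue manager, the command string would be piped into `runmqsc <QMGR>` and the captured reply passed to `parse_attrs`.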
|
|
 |
bruce2359 |
Posted: Fri Dec 12, 2014 8:37 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
Display the xmit queue attributes. What is the UNCOM value? |
|
|
 |
rconn2 |
Posted: Fri Dec 12, 2014 10:48 am Post subject: |
|
|
Voyager
Joined: 09 Aug 2007 Posts: 79 Location: MD, USA
|
The current depth is 65 and uncommitted is also 65. |
|
|
 |
mqjeff |
Posted: Fri Dec 12, 2014 10:49 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
So either the channel is having trouble sending the messages to the other side, or one of your applications is failing to commit its PUTs. |
|
|
 |
tczielke |
Posted: Fri Dec 12, 2014 11:18 am Post subject: |
|
|
Guardian
Joined: 08 Jul 2010 Posts: 941 Location: Illinois, USA
|
If it is appropriate to do this for a short duration, I would stop the sender channel for this XMITQ, as this might help in being able to browse the uncommitted messages (if the MCA is the process that has these messages under syncpoint). |
|
|
 |
PaulClarke |
Posted: Fri Dec 12, 2014 11:58 am Post subject: |
|
|
 Grand Master
Joined: 17 Nov 2005 Posts: 1002 Location: New Zealand
|
A channel moves messages in a single stream, so if the channel is moving messages then I don't think it is likely that messages are getting 'stuck'.
Instead it seems far more likely that an application has put messages to the transmission queue but failed to issue an MQCMIT().
Try issuing a display connection list and look for an old transaction start time. If you have a really old transaction it is likely that this connection is the culprit.
Cheers,
Paul. _________________ Paul Clarke
MQGem Software
www.mqgem.com |
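As a rough illustration of the "look for an old transaction start time" suggestion, here is a Python sketch that sorts connections by unit-of-work start time. It assumes the text comes from an MQSC command along the lines of `DISPLAY CONN(*) TYPE(CONN) APPLTAG UOWSTDA UOWSTTI`, with dates shown as yyyy-mm-dd and times as hh.mm.ss. The connection ids, application tags, and timestamps in the sample are hypothetical.

```python
import re
from datetime import datetime

# Matches KEYWORD(value) pairs in the layout runmqsc uses for its replies.
_ATTR = re.compile(r"([A-Z][A-Z0-9]*)\(([^()]*)\)")

# Illustrative sample only; ids, tags, and timestamps are hypothetical.
SAMPLE_OUTPUT = """\
AMQ8276: Display Connection details.
   CONN(AAAA000000000001)
   TYPE(CONN)
   APPLTAG(app.one)   UOWSTDA(2014-12-01)   UOWSTTI(09.15.02)
AMQ8276: Display Connection details.
   CONN(AAAA000000000002)
   TYPE(CONN)
   APPLTAG(app.two)   UOWSTDA( )   UOWSTTI( )
AMQ8276: Display Connection details.
   CONN(AAAA000000000003)
   TYPE(CONN)
   APPLTAG(app.three)   UOWSTDA(2014-12-12)   UOWSTTI(06.30.00)
"""


def parse_connections(mqsc_output):
    """Split the reply into one dict per connection record."""
    records, current = [], None
    for line in mqsc_output.splitlines():
        if line.startswith("AMQ"):
            # A message line starts a new connection record.
            current = {}
            records.append(current)
        elif current is not None:
            current.update((k, v.strip()) for k, v in _ATTR.findall(line))
    return records


def oldest_units_of_work(records):
    """Return (start_time, record) pairs, oldest first.

    Connections with a blank start date or time have no unit of work
    in progress and are skipped.
    """
    started = []
    for rec in records:
        date, time = rec.get("UOWSTDA", ""), rec.get("UOWSTTI", "")
        if not date or not time:
            continue
        start = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H.%M.%S")
        started.append((start, rec))
    started.sort(key=lambda pair: pair[0])
    return started


if __name__ == "__main__":
    for start, rec in oldest_units_of_work(parse_connections(SAMPLE_OUTPUT)):
        print(start.isoformat(), rec["CONN"], rec.get("APPLTAG", ""))
```

The connection at the top of the sorted list is the first candidate to check for a missing commit.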
|
|
 |
rconn2 |
Posted: Thu Jan 08, 2015 8:18 am Post subject: |
|
|
Voyager
Joined: 09 Aug 2007 Posts: 79 Location: MD, USA
|
Thanks everyone for your answers... sorry I haven't responded over the holidays.
I think this is what's happening: the application isn't issuing an MQCMIT(). I'm not the app developer, but the admin, so I can't dig into the code. But, I was told there are transactions occurring in conjunction with some database processing. So, this does seem to be the issue.
But, how should I handle busted transactions? Not only are messages piling up on the xmit queue, but there are 114 messages stuck on a local queue that are all uncommitted messages. Other messages are put to and got from this queue, but the 114 stay stuck.
If an application connection is broken (we do rolling restarts of the WAS application), shouldn't uncommitted messages get automatically rolled-back? It seems these are just orphaned.
Should I just set a Retention Interval? |
|
|
 |
rconn2 |
Posted: Thu Jan 08, 2015 8:26 am Post subject: |
|
|
Voyager
Joined: 09 Aug 2007 Posts: 79 Location: MD, USA
|
@PaulClarke - All the active client connections began since yesterday as we did a rolling restart of the application. But, the orphaned, uncommitted messages remain. |
|
|
 |
PaulClarke |
Posted: Thu Jan 08, 2015 9:04 am Post subject: |
|
|
 Grand Master
Joined: 17 Nov 2005 Posts: 1002 Location: New Zealand
|
Well, if the messages/transactions are lasting longer than the connections that made them then that suggests that the messages are 'prepared', i.e. are 'indoubt'.
I'm not sure what platform you are on but if it's Distributed then issue dspmqtrn and see whether you have any prepared transactions.
Of course if the transaction coordinator is WAS then one would expect that these prepared transactions would be resolved sooner or later. Is there any possibility that you have defined your XAOpen string in such a way that the resulting QM you connect to is not unique? In other words, are there multiple QMs that WAS could connect to?
Cheers,
Paul. |
|
|
 |
rconn2 |
Posted: Thu Jan 08, 2015 10:52 am Post subject: |
|
|
Voyager
Joined: 09 Aug 2007 Posts: 79 Location: MD, USA
|
dspmqtrn -m <QMGR> returns:
There are no matching prepared transactions.
Running MQ 7.0.1.6 on AIX
There's only one QM to connect to -- its name is unique.
The WAS app is clustered running on two nodes; we stop and restart one, then the other for a rolling restart. The uncommitted, orphaned, messages remain through restarts.
I'm guessing this is what they are -- busted transaction orphans (I'm making this term up)? If their connections have broken, I'd have guessed they'd be automatically rolled back. I'm not too familiar with the app itself and how transactions are programmed. |
|
|
 |
PaulClarke |
Posted: Thu Jan 08, 2015 10:59 am Post subject: |
|
|
 Grand Master
Joined: 17 Nov 2005 Posts: 1002 Location: New Zealand
|
Well I am confused then. If these are local transactions I don't see how they can live beyond the life of the connection that made them.
Are you absolutely certain that the connections/channels are no longer running?
Cheers,
Paul. |
|
|
 |
rconn2 |
Posted: Thu Jan 08, 2015 1:59 pm Post subject: |
|
|
Voyager
Joined: 09 Aug 2007 Posts: 79 Location: MD, USA
|
Both app servers were restarted yesterday, so all connections are new as of then.
There were 114 uncommitted messages on one local queue that remained after the restarts. And, they're still there today... though the number may be up a few more.
There are also 215 uncommitted messages on a transmission queue. I haven't restarted the corresponding Srvr channel. But, I would guess that the ones on the transmission queue are uncommitted from transactions involving a remote queue... for example, a "put" on the remote queue is really a "put" on the xmit queue, so those messages are ready to be committed and released, but were never committed?
This is an active production application. |
|
|
 |
|