"Lost" messages |
Boomn4x4 |
Posted: Mon Jan 20, 2014 12:12 pm Post subject: "Lost" messages |
|
|
Disciple
Joined: 28 Nov 2011 Posts: 172
|
I need some help tracking down "lost" messages. I'm looking for ideas, as I'm clueless.
I have a local QMGR that sends data across a cluster, where a broker consumes the data off the cluster queue and puts it in a database. It only happens about once or twice per 10,000 messages, but messages are seemingly getting lost. We have stopped the broker and let the messages back up on the queue, and the messages are missing from there too, so I don't think it's a problem with the broker code. On the sender-side QMGR, all message transactions are logged: if I get an MQCC_OK, I log the transaction and its data; if I get anything other than an MQCC_OK on my QPUT or QOpen, I log the error. I'm not showing any errors in the logs.
Any ideas how to track down these so-called "lost" messages? |
|
exerk |
Posted: Mon Jan 20, 2014 12:44 pm Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
Obvious starter questions...
1. Is message expiry set?
2. Are they non-persistent messages?
3. What's the channel NPMSPEED set to?
4. Is there a DLQ?
5. Are you certain the Broker isn't discarding them?
6. Are they on a back-out queue somewhere?
I'm sure others will be along to ask further questions. _________________ It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys. |
|
bruce2359 |
Posted: Mon Jan 20, 2014 12:45 pm Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
How do you know that some messages are lost? What evidence?
Does the producing app count the number of requests it receives? Does it count the number of messages it produces? Does this number not match the number of rows in the table? Does the consuming app count the number of messages?
Are the messages non-persistent? Do the messages have an expiry value?
Do the messages flow across an MQ network? Do the send/receive counts (channel status) match the counts the apps produce?
Are there any messages in the dead-letter queue(s)?
What do the "lost" messages have in common? Some account number? Time of day? _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
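The count-matching questions above can be turned into a small reconciliation script. What follows is only a sketch: it assumes the sender-side log and the receiving side (database table or backed-up queue) both expose some shared message identifier, and the field names `msg_id`, `account` and `minute` are hypothetical, not taken from the original application.

```python
from collections import Counter

def find_missing(sent, received_ids):
    """Return the sent records whose message ID never arrived."""
    received = set(received_ids)
    return [rec for rec in sent if rec["msg_id"] not in received]

def common_traits(missing, keys):
    """Count how often each value of the given fields occurs among the missing records."""
    return {key: Counter(rec[key] for rec in missing) for key in keys}

# Hypothetical sender-side log entries (field names are illustrative only).
sent_log = [
    {"msg_id": "A1", "account": "1001", "minute": "09:41"},
    {"msg_id": "A2", "account": "1002", "minute": "09:42"},
    {"msg_id": "A3", "account": "1003", "minute": "09:43"},
    {"msg_id": "A4", "account": "1004", "minute": "09:43"},
]
# Hypothetical IDs found in the target table or on the destination queue.
arrived = ["A1", "A2"]

missing = find_missing(sent_log, arrived)
traits = common_traits(missing, ["account", "minute"])
print([rec["msg_id"] for rec in missing])
print(traits["minute"].most_common(1))
```

If the missing records cluster around one time window, that window is a good place to start reading the queue manager error logs.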
|
Mr Butcher |
Posted: Tue Jan 21, 2014 1:36 am Post subject: |
|
|
 Padawan
Joined: 23 May 2005 Posts: 1716
|
Check the MQ transaction log (depending on your platform, the utilities available, and message persistence) to see whether those "lost" messages were ever put to MQSeries.
In my experience, in 99.999999999% of cases the "lost" messages were never put, or were already consumed in some "strange" way, so they are assumed to be lost. _________________ Regards, Butcher |
|
exerk |
Posted: Tue Jan 21, 2014 1:38 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
Mr Butcher wrote: |
Check the MQ transaction log (depending on your platform, the utilities available, and message persistence) to see whether those "lost" messages were ever put to MQSeries.
In my experience, in 99.999999999% of cases the "lost" messages were never put, or were already consumed in some "strange" way, so they are assumed to be lost. |
OP stated "...On the sender-side QMGR, all message transactions are logged: if I get an MQCC_OK, I log the transaction and its data; if I get anything other than an MQCC_OK on my QPUT or QOpen, I log the error. I'm not showing any errors in the logs..." |
|
Boomn4x4 |
Posted: Tue Jan 21, 2014 4:39 am Post subject: |
|
|
Disciple
Joined: 28 Nov 2011 Posts: 172
|
exerk wrote: |
Obvious starter questions...
1. Is message expiry set?
2. Are they non-persistent messages?
3. What's the channel NPMSPEED set to?
4. Is there a DLQ?
5. Are you certain the Broker isn't discarding them?
6. Are they on a back-out queue somewhere?
I'm sure others will be along to ask further questions. |
1. No.
2. Well... this may be my problem. The messages were supposed to be persistent, and at one time they were, but it appears that at some point they were changed to non-persistent.
3. NPMSPEED is the default, which is FAST... which could be yet another problem.
4. Yes, and the messages are not showing up on the DLQ.
5. Yes. The broker was shut down completely and the messages were allowed to accumulate on the destination queue; the transactions were missing there too.
6. Not sure; one more thing for me to look into.
Thank you! I have some things to research. |
|
exerk |
Posted: Tue Jan 21, 2014 4:56 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
Boomn4x4 wrote: |
1. No.
2. Well... this may be my problem. The messages were supposed to be persistent, and at one time they were, but it appears that at some point they were changed to non-persistent.
3. NPMSPEED is the default, which is FAST... which could be yet another problem.
4. Yes, and the messages are not showing up on the DLQ.
5. Yes. The broker was shut down completely and the messages were allowed to accumulate on the destination queue; the transactions were missing there too.
6. Not sure; one more thing for me to look into. |
2. All the joys of relying on the queue definition - today's persistent messages are tomorrow's non-persistent messages;
3. You could try changing it to NORMAL to see if that makes a difference, but...
4. ...lack of messages on the DLQ suggests no discards - unless the channel is not authorized to the DLQ? (speculation as I don't know your WMQ version);
5. Which suggests that either they were never there at the sending end, which may mean your logging at that end isn't as tight as you think it is, or messages are being discarded during transit (see 3. and 4.). |
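To illustrate point 2, here is a minimal sketch of how a message's persistence is resolved when the application leaves it to the queue definition. The three constants mirror the MQI values for the MQMD Persistence field; the function itself is a simplified model and deliberately ignores how alias, remote and cluster queue definitions are resolved to find the DEFPSIST that applies.

```python
# MQI constants for the MQMD Persistence field.
MQPER_NOT_PERSISTENT = 0
MQPER_PERSISTENT = 1
MQPER_PERSISTENCE_AS_Q_DEF = 2

def effective_persistence(mqmd_persistence, queue_defpsist):
    """Model how the final persistence of a message is decided.

    queue_defpsist is the DEFPSIST attribute of the queue the message
    is put to: "YES" or "NO".
    """
    if mqmd_persistence == MQPER_PERSISTENCE_AS_Q_DEF:
        # The application deferred to the queue, so an admin change to
        # DEFPSIST silently changes what the application sends.
        if queue_defpsist == "YES":
            return MQPER_PERSISTENT
        return MQPER_NOT_PERSISTENT
    # The application set persistence explicitly; the queue default is ignored.
    return mqmd_persistence

print(effective_persistence(MQPER_PERSISTENCE_AS_Q_DEF, "YES"))
print(effective_persistence(MQPER_PERSISTENCE_AS_Q_DEF, "NO"))
print(effective_persistence(MQPER_PERSISTENT, "NO"))
```

Setting persistence explicitly in the application is the usual way to stop a queue attribute change from altering message behaviour.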
|
Boomn4x4 |
Posted: Tue Jan 21, 2014 6:26 am Post subject: |
|
|
Disciple
Joined: 28 Nov 2011 Posts: 172
|
Something else I found that may help: during the time frame in which these messages are being lost, I'm seeing network issues in the logs. Could this be playing into the issue?
01/20/14 09:41:47 - Process(22084.108) User(mqm) Program(amqrmppa)
Host(pos8777)
AMQ9002: Channel 'TO.QM' is starting.
EXPLANATION:
Channel 'TO.QM' is starting.
ACTION:
None.
-------------------------------------------------------------------------------
01/20/14 09:43:05 - Process(22084.108) User(mqm) Program(amqrmppa)
Host(pos8777)
AMQ9208: Error on receive from host xx.xx.xx.xx.
EXPLANATION:
An error occurred receiving data from xx.xx.xx.xx over TCP/IP. This may be due
to a communications failure.
ACTION:
The return code from the TCP/IP read() call was 104 (X'68'). Record these
values and tell the systems administrator.
----- amqccita.c : 3433 -------------------------------------------------------
01/20/14 09:43:05 - Process(22084.108) User(mqm) Program(amqrmppa)
Host(pos8777)
AMQ9999: Channel program ended abnormally.
EXPLANATION:
Channel program 'TO.QM' ended abnormally.
ACTION:
Look at previous error messages for channel program 'TO.QM' in the
error files to determine the cause of the failure. |
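The return code in the AMQ9208 entry can be decoded with a few lines of script. This is a sketch that assumes the queue manager host is Linux, where errno 104 is ECONNRESET (connection reset by peer); the lookup table holds Linux values only and other platforms number these errors differently. The function name is made up for illustration.

```python
import re

# Linux errno values commonly seen on TCP/IP calls (assumption: host is Linux).
LINUX_ERRNO_NAMES = {
    32: "EPIPE",
    104: "ECONNRESET",
    110: "ETIMEDOUT",
    111: "ECONNREFUSED",
}

def decode_tcp_return_code(log_text):
    """Extract the decimal/hex return code pair from an AMQ9208 ACTION text."""
    match = re.search(r"was (\d+) \(X'([0-9A-Fa-f]+)'\)", log_text)
    if match is None:
        return None
    code = int(match.group(1))
    # The log prints the same value twice; check that both forms agree.
    if code != int(match.group(2), 16):
        raise ValueError("decimal and hex return codes disagree")
    return code, LINUX_ERRNO_NAMES.get(code, "unknown")

sample = "The return code from the TCP/IP read() call was 104 (X'68'). Record these"
result = decode_tcp_return_code(sample)
print(result)
```

A connection reset on read() means the other end, or something in between such as a firewall, dropped the TCP session while the channel was running.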
|
exerk |
Posted: Tue Jan 21, 2014 6:32 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
Boomn4x4 wrote: |
Something else I found that may help: during the time frame in which these messages are being lost, I'm seeing network issues in the logs. Could this be playing into the issue? |
From the manual: "...If a channel terminates while fast, non-persistent messages are in transit, the messages may be lost and it is up to the application to arrange for their recovery if required..." |
|
Boomn4x4 |
Posted: Tue Jan 21, 2014 6:45 am Post subject: |
|
|
Disciple
Joined: 28 Nov 2011 Posts: 172
|
exerk wrote: |
Boomn4x4 wrote: |
Something else I found that may help: during the time frame in which these messages are being lost, I'm seeing network issues in the logs. Could this be playing into the issue? |
From the manual: "...If a channel terminates while fast, non-persistent messages are in transit, the messages may be lost and it is up to the application to arrange for their recovery if required..." |
What has me scratching my head about that is that it also says:
"If the receiving channel cannot put the message to its destination queue then it is placed on the dead letter queue, if one has been defined. If not, the message is discarded."
But I'm not seeing the messages on the DLQ. |
|
exerk |
Posted: Tue Jan 21, 2014 6:56 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
Boomn4x4 wrote: |
exerk wrote: |
Boomn4x4 wrote: |
Something else I found that may help: during the time frame in which these messages are being lost, I'm seeing network issues in the logs. Could this be playing into the issue? |
From the manual: "...If a channel terminates while fast, non-persistent messages are in transit, the messages may be lost and it is up to the application to arrange for their recovery if required..." |
What has me scratching my head about that is that it also says:
"If the receiving channel cannot put the message to its destination queue then it is placed on the dead letter queue, if one has been defined. If not, the message is discarded."
But I'm not seeing the messages on the DLQ. |
Pre-Version 7.1
If the target queue is full, and no DLQ is defined in the queue manager, or the MCAUSER on the channel is not authorized to the DLQ or the DLQ is itself full, a non-persistent NPMSPEED(FAST) message will be discarded by the receiving MCA.
Post-Version 7.1
The above still applies but in addition: USEDLQ determines whether the dead-letter queue is used when messages cannot be delivered by channels. NO - Messages that cannot be delivered by a channel are treated as a failure. The channel either discards the message, or the channel ends, in accordance with the NPMSPEED setting.
So, speculatively, you could have an MCAUSER authorised to the DLQ over-ridden by the channel setting - but I'd want to test that to be sure.
Again, it's conjecture on my part because I don't know what version you're on. |
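The rules described in this post can be written down as a small decision function. It is a simplified model of the description above, not a statement of how the MCA is implemented: the function and parameter names are made up for illustration, and the `usedlq` flag only exists from version 7.1 onwards, so it defaults to True to mimic earlier versions.

```python
def undeliverable_outcome(persistent, npmspeed, dlq_defined,
                          mca_authorized, dlq_full, usedlq=True):
    """Model what happens when the receiving MCA cannot put to the target queue."""
    # The DLQ is only an option if it exists, the channel is allowed to use it,
    # the MCA's user is authorized to it, and it has room.
    dlq_usable = usedlq and dlq_defined and mca_authorized and not dlq_full
    if dlq_usable:
        return "dead-letter queue"
    # Only fast, non-persistent messages are ever thrown away.
    if not persistent and npmspeed == "FAST":
        return "discarded"
    # Anything else is treated as a failure and the channel ends.
    return "channel ends"

# Non-persistent, NPMSPEED(FAST), MCA user not authorized to the DLQ.
print(undeliverable_outcome(False, "FAST", True, False, False))
# Same message with a usable DLQ.
print(undeliverable_outcome(False, "FAST", True, True, False))
# Persistent message with no usable DLQ.
print(undeliverable_outcome(True, "FAST", False, False, False))
```

Note that this path only applies once the receiving MCA actually holds the message; it does not cover messages lost while the channel itself is failing.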
|
Boomn4x4 |
Posted: Tue Jan 21, 2014 7:20 am Post subject: |
|
|
Disciple
Joined: 28 Nov 2011 Posts: 172
|
exerk wrote: |
Pre-Version 7.1
If the target queue is full, and no DLQ is defined in the queue manager, or the MCAUSER on the channel is not authorized to the DLQ or the DLQ is itself full, a non-persistent NPMSPEED(FAST) message will be discarded by the receiving MCA.
Post-Version 7.1
The above still applies but in addition: USEDLQ determines whether the dead-letter queue is used when messages cannot be delivered by channels. NO - Messages that cannot be delivered by a channel are treated as a failure. The channel either discards the message, or the channel ends, in accordance with the NPMSPEED setting.
So, speculatively, you could have an MCAUSER authorised to the DLQ over-ridden by the channel setting - but I'd want to test that to be sure.
Again, it's conjecture on my part because I don't know what version you're on. |
Using MQ V7.0.1.
To start, thank you for all your help. Second question: I do not have an MCAUSER defined, so my assumption is that the default user, mqm, is used, and that user does have put authority to the DLQ. HOWEVER, the messages are being put onto the cluster.transmit.queue as another user. This user does NOT have authority to put messages on the DLQ. Does the user that originally put the message on the cluster.transmit.queue stay with the message, or does a failed message get put onto the DLQ using the MCAUSER defined for that channel? |
|
PeterPotkay |
Posted: Tue Jan 21, 2014 7:34 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
Boomn4x4 wrote: |
exerk wrote: |
Boomn4x4 wrote: |
Something else I found that may help: during the time frame in which these messages are being lost, I'm seeing network issues in the logs. Could this be playing into the issue? |
From the manual: "...If a channel terminates while fast, non-persistent messages are in transit, the messages may be lost and it is up to the application to arrange for their recovery if required..." |
What has me scratching my head about that is that it also says:
"If the receiving channel cannot put the message to its destination queue then it is placed on the dead letter queue, if one has been defined. If not, the message is discarded."
But I'm not seeing the messages on the DLQ. |
You're quoting exerk on one aspect of channel processing, and then referring to another topic in your reply.
If the channel pukes before the RCVR MCA ever has a chance to try to put the message, and the message is non-persistent and the channel speed is FAST, the message might be lost.
Your quote about the DLQ is more along the lines of what happens if the destination queue is not defined, or full, or inhibited, etc., and the RCVR MCA has the message safely in hand and wants to put it. It's just telling you it won't throw it away; it will try to put it to the DLQ. _________________ Peter Potkay
Keep Calm and MQ On |
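The distinction drawn in this post, between messages lost in transit and messages the receiver cannot deliver, can be sketched as follows. This is a toy model under a stated assumption: with NPMSPEED(FAST), non-persistent messages travel outside the channel's batch syncpoint, while everything else stays on the transmission queue until the batch is confirmed and is resent after a failure. The function name and message layout are hypothetical.

```python
def at_risk_if_channel_fails(in_flight, npmspeed):
    """Return the in-flight messages that may be lost if the channel dies now."""
    if npmspeed != "FAST":
        # With NORMAL, every message is part of the batch and is recovered.
        return []
    # With FAST, only the non-persistent messages are outside the batch.
    return [msg for msg in in_flight if not msg["persistent"]]

# Hypothetical messages on the wire when the TCP connection is reset.
batch = [
    {"id": "M1", "persistent": True},
    {"id": "M2", "persistent": False},
    {"id": "M3", "persistent": False},
]

fast_risk = [msg["id"] for msg in at_risk_if_channel_fails(batch, "FAST")]
normal_risk = [msg["id"] for msg in at_risk_if_channel_fails(batch, "NORMAL")]
print(fast_risk)
print(normal_risk)
```

Messages lost this way never reach the receiving MCA, which is why nothing shows up on the DLQ.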
|
exerk |
Posted: Tue Jan 21, 2014 7:53 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
Boomn4x4 wrote: |
Using MQ V7.0.1.
To start, thank you for all your help. Second question: I do not have an MCAUSER defined, so my assumption is that the default user, mqm, is used, and that user does have put authority to the DLQ. HOWEVER, the messages are being put onto the cluster.transmit.queue as another user. This user does NOT have authority to put messages on the DLQ. Does the user that originally put the message on the cluster.transmit.queue stay with the message, or does a failed message get put onto the DLQ using the MCAUSER defined for that channel? |
If an MCAUSER has authority to a DLQ it will put any messages it cannot put to other queues to the DLQ, irrespective of the userid 'travelling' within the message's MQMD. If no MCAUSER is defined, the channel will start under the authority of the user under which the queue manager is running and no checking of the originating userid is carried out. |
|
PeterPotkay |
Posted: Tue Jan 21, 2014 10:41 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
exerk wrote: |
If no MCAUSER is defined, the channel will start under the authority of the user under which the queue manager is running... |
The channel in this case will start with the ID that the MQ Listener is running under, which is almost always the same ID that started the Queue Manager. But it might be something different. |
|