Author |
Message
|
edhi |
Posted: Wed Jan 13, 2010 6:12 am Post subject: Channel program ended abnormall-what happens with roll back? |
|
|
Novice
Joined: 10 Jan 2006 Posts: 15
|
Hi,
Traffic over our MQSeries servers grows every month, as more and more applications use it to communicate.
Yesterday we encountered a problem: "AMQ9513: Maximum number of channels reached." and, as a consequence of that, "AMQ9999: Channel program ended abnormally."
MaxChannels was not specified in qm.ini so it still had the default value (100).
After I added it (MaxChannels=200) and restarted MQSeries, the problem disappeared.
But we also seem to have lost some messages and I was wondering how that is possible?
All our queues are defined as persistent. When I try to reproduce the error (forced stop of server-connection channel) while debugging a client program, everything works fine. Uncommitted messages are rolled back.
Question: What happens with commit control when a channel ended abnormally (AMQ9999)?
Kind Regards
Edmond Paulussen |
|
|
 |
Vitor |
Posted: Wed Jan 13, 2010 6:23 am Post subject: Re: Channel program ended abnormall-what happens with roll b |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
edhi wrote: |
But we also seem to have lost some messages and I was wondering how that is possible? |
A misunderstanding about how the different types of channel work, combined with interesting application code?
edhi wrote: |
All our queues are defined as persistent. |
As has been stated many, many times on this forum, that setting is a default. There's no reason an application couldn't put a non-persistent message on a queue with DEFPSIST(YES).
edhi wrote: |
When I try to reproduce the error (forced stop of server-connection channel) while debugging a client program, everything works fine. Uncommitted messages are rolled back. |
Server connection channels (i.e. client connections) are different things to sender/receiver pairs. Notably, they establish a synchronous connection with their other end and don't handshake. If the queue manager runs out of connections, the application will receive a 2009 or 2019 error on the next WMQ operation and must act accordingly. This means rolling back the transaction if necessary.
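A minimal sketch of what "act accordingly" could look like on the client side. It is written in Python against hypothetical stand-ins (`MQError`, `FakeClient`, `process_one`), not a real WMQ binding; only the reason codes 2009 and 2019 come from this thread.

```python
# Hypothetical stand-ins for illustration; this is not a real WMQ client API.
CONNECTION_BROKEN = 2009  # MQRC_CONNECTION_BROKEN
HOBJ_ERROR = 2019         # MQRC_HOBJ_ERROR


class MQError(Exception):
    """Carries the reason code of a failed (simulated) MQ call."""

    def __init__(self, reason):
        super().__init__(f"MQ call failed, reason {reason}")
        self.reason = reason


class FakeClient:
    """Test double that fails on a chosen operation with a chosen reason."""

    def __init__(self, fail_on=None, reason=CONNECTION_BROKEN):
        self.fail_on = fail_on
        self.reason = reason
        self.committed = False

    def _maybe_fail(self, op):
        if op == self.fail_on:
            raise MQError(self.reason)

    def get_under_syncpoint(self):
        self._maybe_fail("get")
        return b"payload"

    def commit(self):
        self._maybe_fail("commit")
        self.committed = True


def process_one(client, handle_message):
    """Get one message under syncpoint; commit only after it is handled.

    Returns "committed", or "reconnect" when the connection is unusable.
    """
    try:
        msg = client.get_under_syncpoint()
        handle_message(msg)
        client.commit()
        return "committed"
    except MQError as err:
        if err.reason in (CONNECTION_BROKEN, HOBJ_ERROR):
            # The connection is gone: do not assume the commit happened.
            # The caller should undo local side effects and reconnect.
            return "reconnect"
        raise
```

The design point is that the application never reports success unless the commit call itself returned without error; any work done in `handle_message` is treated as provisional until then.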
edhi wrote: |
Question: What happens with commit control when a channel ended abnormally (AMQ9999)?
|
In summary - nothing. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
bruce2359 |
Posted: Wed Jan 13, 2010 6:32 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
Quote: |
But we also seem to have lost some messages and I was wondering how that is possible? |
If there are no more slots available in the channel status table, the channel will not start. Messages in a transmission queue will remain in the xmit queue until the channel starts. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
Vitor |
Posted: Wed Jan 13, 2010 6:52 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
bruce2359 wrote: |
Quote: |
But we also seem to have lost some messages and I was wondering how that is possible? |
If there are no more slots available in the channel status table, the channel will not start. Messages in a transmission queue will remain in the xmit queue until the channel starts. |
The OP mentions a server connection channel. I wasn't aware those had a transmission queue associated with them. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
edhi |
Posted: Wed Jan 13, 2010 7:12 am Post subject: |
|
|
Novice
Joined: 10 Jan 2006 Posts: 15
|
Hi,
Vitor wrote: |
A misunderstanding about how the different types of channel work, combined with interesting application code? |
We have been using MQSeries for more than 8 years now. Millions of messages every month and nothing gets lost => the client programs that we have developed have proven to be very reliable!
According to the MQSeries logging, only Server Connection channels suffered from the fact that the max number of channels was reached.
In this case the messages are sent to a queue over a sender/receiver channel. From the queue, the messages are read over a Server connection channel.
Indeed the client program received 2009 and 2019 as reason code.
Vitor wrote: |
This means rolling back the transaction if necessary. |
How for god's sake can you roll back a transaction when you no longer have a connection to the mq manager (because the channel ended abnormally)???
MQ takes care of that. When I forced a stop on Server Connection channel during debugging, MQ immediately rolled back uncommitted messages.
Thanks anyway for your quick response, but it was useless and had a quite arrogant tone.
Edmond |
|
|
 |
Vitor |
Posted: Wed Jan 13, 2010 7:28 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
edhi wrote: |
According to the MQSeries logging, only Server Connection channels suffered from the fact that the max number of channels was reached. |
For the reason my associate mentioned - the sender & receiver obtain slots in the channel table at start up and don't relinquish them. It's only the more dynamic client connections that are affected.
edhi wrote: |
How for god's sake can you roll back a transaction when you no longer have a connection to the mq manager (because the channel ended abnormally)??? |
The same way you'd handle any other WMQ failure at the client end.
edhi wrote: |
MQ takes care of that. |
Clearly not, or you wouldn't be posting about missing messages.
edhi wrote: |
When I forced a stop on Server Connection channel during debugging, MQ immediately rolled back uncommitted messages. |
There's a difference between a forced stop & an out of slot condition. In the same way there's a difference between a forced stop and a network failure.
edhi wrote: |
Thanks anyway for your quick response, but it was useless and had a quite arrogant tone. |
I'm sorry you felt that. I wish you success in tracing your problem. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
JosephGramig |
Posted: Wed Jan 13, 2010 7:37 am Post subject: |
|
|
 Grand Master
Joined: 09 Feb 2006 Posts: 1244 Location: Gold Coast of Florida, USA
|
When you run out of channels due to reaching MAX channels, that means you don't get another channel. Not that your channel stopped working. So, if there is a lost message, then it has nothing to do with MQ as the code was not talking to MQ at that point.
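The distinction between "no new channel" and "a running channel stopped" can be sketched with a toy model. `ChannelTable` below is a hypothetical illustration in Python, not how the queue manager is implemented; it only shows that a full table refuses new starts while the instances that already hold a slot are left alone.

```python
class ChannelTable:
    """Toy model of a fixed-size channel status table (hypothetical)."""

    def __init__(self, max_channels):
        self.max_channels = max_channels
        self.running = set()

    def start(self, name):
        """Return True if the channel instance got a slot, False if refused."""
        if len(self.running) >= self.max_channels:
            # Analogous to AMQ9513: no slot, so this instance never starts.
            return False
        self.running.add(name)
        return True

    def stop(self, name):
        """Release the slot held by a running channel instance."""
        self.running.discard(name)


table = ChannelTable(max_channels=2)
results = [table.start(f"client-{i}") for i in range(3)]
print(results)                # [True, True, False]
print(sorted(table.running))  # ['client-0', 'client-1']
```

The third start is refused, yet `client-0` and `client-1` keep running; a new instance can only start once one of them releases its slot.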
On the "only affects SVRCONNs", did you set both MAX params to 10 and define 11 senders and try to start all of them?
Lastly, 200 channels is quite low... Might want to consider capacity planning. |
|
|
 |
nathanw |
Posted: Wed Jan 13, 2010 7:47 am Post subject: |
|
|
 Knight
Joined: 14 Jul 2004 Posts: 550
|
edhi wrote: |
Thanks anyway for your quick response, but it was useless and had a quite arrogant tone.
Edmond |
Mmm, considering Vitor helps out a lot of people on here, and considering how many posts he has and his reputation, maybe there is a reason for sounding arrogant: he knows what he is talking about.
He may not have explained his solution quite how you would understand it, but I wouldn't say it was useless.
Also, you may need his help in the future when no one else can help, so maybe that is not the best way to finish a post! _________________ Who is General Failure and why is he reading my hard drive?
Artificial Intelligence stands no chance against Natural Stupidity.
Only the User Trace Speaks The Truth  |
|
|
 |
edhi |
Posted: Wed Jan 13, 2010 8:19 am Post subject: |
|
|
Novice
Joined: 10 Jan 2006 Posts: 15
|
An extract from the logging:
"01/12/10 10:23:20 - Process(16694.1150923) User(mqm) Program(amqrmppa)
AMQ9513: Maximum number of channels reached.
....."
And the immediate next message:
"01/12/10 10:23:20 - Process(16694.1150923) User(mqm) Program(amqrmppa)
AMQ9999: Channel program ended abnormally.
EXPLANATION:
Channel program 'VKOBRA.SCN01' ended abnormally.
ACTION:
Look at previous error messages for channel program 'VKOBRA.SCN01' in the error files to determine the cause of the failure."
If I interpret this correctly then this means that a server-connection channel has STOPPED working, because max. no. of channels was reached, not an out-of-slot condition.
If a channel stops working, connection to mq manager has been lost and roll back will not work.
When I simulate this situation by forcing a server-connection channel to stop, MQ detects that the connection was lost and rolls back the uncommitted message, which appears on the queue again. So everything looks fine.
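The behaviour described for the forced-stop test can be modelled with a toy queue. This is a hypothetical Python sketch (`ToyQueue` and its methods are invented for illustration); it assumes the get was done under syncpoint and says nothing about gets or puts made outside a unit of work.

```python
class ToyQueue:
    """Toy queue with one unit of work per connection (hypothetical model)."""

    def __init__(self, messages):
        self.messages = list(messages)
        self.in_flight = {}  # connection id -> messages got but not committed

    def get_under_syncpoint(self, conn):
        """Remove the next message, but remember it until commit."""
        msg = self.messages.pop(0)
        self.in_flight.setdefault(conn, []).append(msg)
        return msg

    def commit(self, conn):
        """Make the gets of this connection permanent."""
        self.in_flight.pop(conn, None)

    def connection_lost(self, conn):
        """Back out: uncommitted gets reappear at the front, in order."""
        self.messages[0:0] = self.in_flight.pop(conn, [])


queue = ToyQueue(["a", "b"])
queue.get_under_syncpoint("c1")
queue.connection_lost("c1")   # no commit yet, so "a" comes back
print(queue.messages)         # ['a', 'b']
```

If the connection is lost after `commit`, nothing is restored, which is why the outcome depends on exactly where in the unit of work the failure hits.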
Despite the fact that the error has occurred 100's of times, we only lost a few messages.
@JosephGramig: You are right, we have to consider capacity planning |
|
|
 |
bruce2359 |
Posted: Wed Jan 13, 2010 8:21 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
Vitor wrote: |
bruce2359 wrote: |
Quote: |
But we also seem to have lost some messages and I was wondering how that is possible? |
If there are no more slots available in the channel status table, the channel will not start. Messages in a transmission queue will remain in the xmit queue until the channel starts. |
The OP mentions a server connection channel. I wasn't aware those had a transmission queue associated with them. |
Of course. Yes, the OP mentioned a svrconn channel, but only in regards to testing something by forcing a channel into stopped state - which was not the original problem - insufficient channel status table slots. SVRCONN came up in the attempt to reproduce the error. What error? Lost messages... or something.
Quote: |
When I try to reproduce the error (forced stop of server-connection channel) while debugging a client program, everything works fine. |
_________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
bruce2359 |
Posted: Wed Jan 13, 2010 8:24 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
Quote: |
If I interpret this correctly then this means that a server-connection channel has STOPPED working, because max. no. of channels was reached, not an out-of-slot condition. |
Quote: |
AMQ9999: Channel program ended abnormally.
|
The channel program could not start the named channel because no slots were available.
No, you did not interpret this correctly.
The channel failed to START because no slots were available. The channel did not STOP. This means that your client app did not successfully connect to the svrconn channel, and did not successfully put a message to a queue; and therefore, no messages were created AND no msgs were lost (by MQ).
[EDIT]:
For clarity and/or precision, the above statement should have read:
This means that your client app did not successfully connect to the svrconn channel, OR did not successfully put a message to a queue; and therefore, no messages were created AND no msgs were lost (by MQ).
It is possible that, following a successful MQCONN, the DISCINT for the channel expired and the slot was consumed by another app. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
PeterPotkay |
Posted: Wed Jan 13, 2010 9:39 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
It's possible your app failed to connect because there were no more channel "slots" available, didn't check the return code and so assumed it was connected, went ahead and issued the MQPUT, which also failed, didn't check that return code either, and ended. And now they say MQ lost the message when in fact they never gave it to MQ in the first place.
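The failure mode described above can be sketched as follows. Everything here is a hypothetical Python stand-in (`fake_connect`, `fake_put`, `careless_send`, `careful_send`), not the MQI; the reason codes 2009 and 2019 are used only because they were reported earlier in the thread, and which code a real connect or put returns in this situation is an assumption.

```python
CONNECTION_BROKEN = 2009  # reason codes as reported in this thread
HOBJ_ERROR = 2019


def fake_connect(slots_free):
    """Return (handle, reason); handle is None when no slot is available."""
    if slots_free > 0:
        return object(), 0
    return None, CONNECTION_BROKEN


def fake_put(handle, queue, message):
    """Return a reason code; 0 means the message was accepted."""
    if handle is None:
        return HOBJ_ERROR
    queue.append(message)
    return 0


def careless_send(queue, message, slots_free):
    """Ignores both reason codes and always claims success."""
    handle, _ = fake_connect(slots_free)
    fake_put(handle, queue, message)
    return "sent"


def careful_send(queue, message, slots_free):
    """Checks every reason code before claiming success."""
    handle, reason = fake_connect(slots_free)
    if reason != 0:
        return f"connect failed: {reason}"
    reason = fake_put(handle, queue, message)
    if reason != 0:
        return f"put failed: {reason}"
    return "sent"
```

With `slots_free=0`, `careless_send` still returns `"sent"` while the queue stays empty, which from the outside looks exactly like a lost message; `careful_send` reports the failed connect instead.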
To correctly reproduce the problem, start up enough channels to reach Max Active Channels for your test QM, and then fire up your application to do a put. _________________ Peter Potkay
Keep Calm and MQ On |
|
|
 |
bruce2359 |
Posted: Wed Jan 13, 2010 9:48 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
and then fire your application developer.  _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
Vitor |
Posted: Wed Jan 13, 2010 10:00 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
PeterPotkay wrote: |
It's possible your app failed to connect, because there were no more channel "slots" available, it didn't check the return code so it assumed it was connected, it went ahead and issued the MQPUT, which also failed, but didn't check that return code, and ended. And now they say MQ lost the message when in fact they never gave it to MQ in the first place. |
The kind of code I describe as "interesting", in accordance with the Chinese curse.
An application developer once told me, after suffering a similar problem, that his code didn't check return code because "MQ is an assured delivery product so it's certain to work". Not saying the same attitude prevails here of course, simply illustrating the point with a war story. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
Vitor |
Posted: Wed Jan 13, 2010 10:01 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
bruce2359 wrote: |
and then fire your application developer.  |
Trout on a 1st offense, fire on a 2nd. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
|