Author |
Message
|
bombyk |
Posted: Tue Aug 20, 2002 7:20 am Post subject: Java MQ polling mysteriously vanishes |
|
|
Novice
Joined: 06 Nov 2001 Posts: 10
|
I am running a Java web app under Tomcat standalone on Win2000 Server that establishes a connection to an MQ queue manager and starts a thread which polls a queue for messages, sleeps for 20000 ms, and then polls again (using MQQueue.getCurrentDepth()). If the current depth is > 0, we get all messages. This works fine 99% of the time. However, on three occasions in the last month, the polling has stopped. No exception that I can see is thrown either from the thread class or from the MQ classes. The polling just stops. I don't know if the thread is still running or not. But messages queue up, and the message count on the server connection channel through which the Java poller is reading stops incrementing, though the connection is still marked as active. I have to stop and start the Tomcat service to re-establish polling.
I don't even know how to begin to debug this, since I am not seeing any exceptions thrown. I don't know if the polling thread is running or not.
Any ideas on how I might debug this? Would an MQ trace help me? I haven't been able to recreate the problem on a test server.
Thanks for any insight you might have. |
|
ram_2000 |
Posted: Thu Sep 12, 2002 9:34 pm Post subject: |
|
|
 Novice
Joined: 29 Mar 2002 Posts: 12 Location: Vancouver, Canada
|
I suspect the thread quit (could be due to an exception)...
Make sure you are catching all exceptions/errors in the run() method of your thread..
public void run() {
    try {
        // do your thing here...
    } catch (Throwable t) {
        // Throwable covers Errors (e.g. OutOfMemoryError) as well as Exceptions
        t.printStackTrace();
    }
}
You will not see any stacktrace unless you specifically caught the exception and printed the stacktrace. (only exception being the main() method)....
You should be able to get more info on why the thread quit (if it actually did quit)..
One other possibility is that your thread is hanging....say on some I/O...but this is usually very unlikely...
Let me know if you still can't figure it out....i may have some other wild ideas..  |
|
bombyk |
Posted: Fri Sep 13, 2002 10:52 am Post subject: |
|
|
Novice
Joined: 06 Nov 2001 Posts: 10
|
Thanks, but I have try/catch blocks with stacktrace prints around all suspect code. Further inspection of the problem has revealed that we are indeed not throwing any exceptions, but are making calls to getCurrentDepth() which never return. We have worked around the problem by updating a heartbeat and then monitoring it from another thread. If the MQ call does not return in 90 seconds, we stop the thread and re-establish the connection. This has been happening several times a day.
We have recently seen some OutOfMemoryError exceptions in JSP threads in this Tomcat instance, which may be related. I am going to increase the heap memory allocated to the JVM in Tomcat as soon as I can restart the instance.
Any other insights you might have would be appreciated. |
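For readers following along, the heartbeat workaround described above might look roughly like the sketch below. It is only an illustration: the Heartbeat class and its method names are hypothetical, a long sleep stands in for the MQ call that never returns, and the timeout is shortened from 90 seconds to 200 ms so the demo finishes quickly.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: the polling thread calls beat() around each MQ call,
// and a monitor thread calls isStale() to detect a call that never returned.
class Heartbeat {
    private final AtomicLong lastBeatMillis;
    private final long timeoutMillis;

    Heartbeat(long timeoutMillis, long nowMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastBeatMillis = new AtomicLong(nowMillis);
    }

    /** Records that the polling thread was alive at the given time. */
    void beat(long nowMillis) {
        lastBeatMillis.set(nowMillis);
    }

    /** True when more than timeoutMillis has passed since the last beat. */
    boolean isStale(long nowMillis) {
        return nowMillis - lastBeatMillis.get() > timeoutMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        final Heartbeat heartbeat = new Heartbeat(200L, System.currentTimeMillis());

        Thread poller = new Thread(() -> {
            heartbeat.beat(System.currentTimeMillis());
            try {
                Thread.sleep(60_000L); // stands in for an MQ call that never returns
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stopped by the monitor
            }
        });
        poller.setDaemon(true);
        poller.start();

        // Monitor loop: check the heartbeat until it goes stale.
        while (!heartbeat.isStale(System.currentTimeMillis())) {
            Thread.sleep(50L);
        }
        System.out.println("No heartbeat within timeout; poller presumed hung");
        poller.interrupt();
    }
}
```

Note that interrupt() wakes the sleeping stand-in here, but it is not guaranteed to unblock a thread stuck in a real network read, so the monitor should not assume the old thread has actually gone away.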
|
kingdon |
Posted: Fri Sep 13, 2002 11:31 am Post subject: |
|
|
Acolyte
Joined: 14 Jan 2002 Posts: 63 Location: UK
|
Hi,
Out of interest, why do you feel the need to call getCurrentDepth(), as opposed to just trying to GET a message and handling the 2033 if there is none available? Similarly, what's the reason for the sleep instead of using a GET with a wait interval?
Regards,
James. |
|
ram_2000 |
Posted: Fri Sep 13, 2002 12:33 pm Post subject: |
|
|
 Novice
Joined: 29 Mar 2002 Posts: 12 Location: Vancouver, Canada
|
Are you catching Exception or Throwable ?
I have seen cases where only Exception is caught...but this does not include Errors (like OutOfMemoryError), which extend Throwable but not Exception... |
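A minimal sketch of the difference, in case it helps. The class and method names are made up for illustration, and the OutOfMemoryError is thrown deliberately rather than caused by real memory pressure.

```java
// Shows that catch (Exception e) lets an Error pass through,
// while catch (Throwable t) intercepts it.
class CatchScopeDemo {

    /** Runs the task with a catch (Exception) guard only. */
    static String runCatchingException(Runnable task) {
        try {
            task.run();
            return "completed";
        } catch (Exception e) {
            return "caught Exception";
        }
    }

    /** Runs the task with a catch (Throwable) guard. */
    static String runCatchingThrowable(Runnable task) {
        try {
            task.run();
            return "completed";
        } catch (Throwable t) {
            return "caught Throwable: " + t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Simulated failure; a real OutOfMemoryError would come from the JVM.
        Runnable failing = () -> { throw new OutOfMemoryError("simulated"); };

        System.out.println(runCatchingThrowable(failing));
        try {
            System.out.println(runCatchingException(failing));
        } catch (Error e) {
            // Reached because the Error was not caught inside runCatchingException.
            System.out.println("Error escaped catch (Exception): " + e);
        }
    }
}
```

In a polling thread, the second pattern at least gets the failure logged before the thread dies.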
|
bombyk |
Posted: Fri Sep 13, 2002 12:46 pm Post subject: |
|
|
Novice
Joined: 06 Nov 2001 Posts: 10
|
Yes, absolutely, you may be on to something there. We are not catching Throwable. This problem has been chronic, but it is only in the past day or two that I became aware that we were short on memory, because a different JSP failed and the OutOfMemoryError was caught by Tomcat. Thanks for your help. |
|
bombyk |
Posted: Fri Sep 20, 2002 9:02 am Post subject: |
|
|
Novice
Joined: 06 Nov 2001 Posts: 10
|
Re. why call getCurrentDepth(), as opposed to just trying to GET a message and handling the 2033 if there is none available; and sleeping instead of using a GET with a wait interval:
Is there a reason why we shouldn't do it this way? |
|
nimconsult |
Posted: Mon Sep 23, 2002 4:20 am Post subject: |
|
|
 Master
Joined: 22 May 2002 Posts: 268 Location: NIMCONSULT - Belgium
|
1) unnecessary overhead (one additional MQ API call)
2) does not prevent RC 2033 in a concurrent environment (getCurrentDepth() may return 1, but the queue may be empty by the time you reach your get())
_________________
Nicolas Maréchal
Senior Architect - Partner
NIMCONSULT Software Architecture Services (Belgium)
http://www.nimconsult.be |
|
kingdon |
Posted: Tue Sep 24, 2002 1:27 am Post subject: Different approaches |
|
|
Acolyte
Joined: 14 Jan 2002 Posts: 63 Location: UK
|
Simplicity would be one reason. A get with wait loop is about as simple as it gets, and my view is that any extra complexity should only be added if strictly necessary. As Nicolas pointed out, calling getCurrentDepth() doesn't buy you much since it may return misleading results, not only due to concurrency but also the use of syncpoints - see the APR for details.
Latency would be a second, since if the code is sat in a blocking GET, then the message will be processed as soon as it arrives, as opposed to up to the sleep interval later for a check/sleep approach.
Still, the calls should work as advertised, and I was curious to see if the approach you had taken was driven by specific requirements that we might be able to meet in a more satisfactory way in future releases.
Regards,
James. |
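To illustrate the get-with-wait shape discussed above, here is a rough sketch. It is a stand-in only: a java.util.concurrent queue plays the role of the MQ queue, a null return plays the role of reason code 2033, and none of the names below are MQ API. In the MQ classes for Java the equivalent would presumably be a get() with MQGMO_WAIT and a waitInterval set on the MQGetMessageOptions, treating an MQException with reason code 2033 as "nothing arrived".

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Stand-in sketch: no depth check and no sleep, just a timed blocking get.
class GetWithWaitSketch {

    /**
     * Waits up to waitMillis for a message.
     * Returns null if none arrived in time (the analogue of RC 2033).
     */
    static String getWithWait(BlockingQueue<String> queue, long waitMillis) {
        try {
            return queue.poll(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        // A producer delivers one message shortly after the consumer starts waiting.
        Thread producer = new Thread(() -> {
            try {
                Thread.sleep(100L);
                queue.put("hello");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        // The consumer is woken when the message arrives, not at the next poll.
        String first = getWithWait(queue, 5_000L);
        System.out.println("received: " + first);

        // Nothing else is coming, so this wait expires and yields null.
        String second = getWithWait(queue, 200L);
        System.out.println("received: " + second);
    }
}
```

Because the check and the get are a single operation, there is no window in which another consumer can empty the queue between them.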
|
bombyk |
Posted: Tue Sep 24, 2002 5:20 am Post subject: |
|
|
Novice
Joined: 06 Nov 2001 Posts: 10
|
Thanks for your reply...it would make sense to change this.
Still I don't understand why our getCurrentDepth() requests are hanging up. This happens intermittently. I had suspected memory problems in our Tomcat instance (from which the call is made) but I have corrected that problem and the hung/blocked condition persists. As a workaround, we developed a monitor thread in our app that detects the hung/blocked condition, kills the polling thread, and re-establishes the connection to MQ over the server connection channel. Unfortunately after this happens enough times, we encounter a "maximum number of channels reached" condition...I presume because we do not close our connection prior to starting a new one. So we have introduced another bug in our workaround. It is very difficult to deploy workarounds because this is a production system and we have been unable to recreate the problem on a test system - we don't get many chances to try things. What I am really looking for is a methodology for determining the root cause of the hung/blocked condition we are encountering with getCurrentDepth(), because it seems likely that we would have the same problem if we simply did a get and waited. Or would we? Perhaps I am missing something here. |
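On the "maximum number of channels reached" side effect, one possible shape for the reconnect step is sketched below. Handle, Factory and Reconnector are hypothetical stand-ins, not MQ classes. In the MQ classes for Java the release step would presumably be MQQueue.close() followed by MQQueueManager.disconnect(), and whether those calls return promptly on a connection that is already hung is an open question.

```java
// Hypothetical stand-in for an open connection to the queue manager.
interface Handle {
    void disconnect() throws Exception;
}

// Keeps at most one connection open by releasing the old one before reconnecting.
class Reconnector {

    /** Hypothetical factory that opens a new connection. */
    interface Factory {
        Handle connect();
    }

    private final Factory factory;
    private Handle current;

    Reconnector(Factory factory) {
        this.factory = factory;
    }

    /** Releases the previous connection (best effort), then opens a new one. */
    synchronized Handle reconnect() {
        Handle old = current;
        current = null;
        if (old != null) {
            try {
                old.disconnect();
            } catch (Exception e) {
                // Log and carry on; a failed disconnect should not block the reconnect.
                System.err.println("disconnect failed: " + e);
            }
        }
        current = factory.connect();
        return current;
    }

    public static void main(String[] args) {
        final int[] open = {0}; // counts connections currently open
        Reconnector reconnector = new Reconnector(() -> {
            open[0]++;
            return () -> open[0]--;
        });
        for (int i = 0; i < 5; i++) {
            reconnector.reconnect();
        }
        System.out.println("open connections after 5 reconnects: " + open[0]);
    }
}
```

If the disconnect itself can hang, it may need to run on its own thread with a timeout so that the monitor is not blocked as well.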
|
kingdon |
Posted: Tue Sep 24, 2002 11:19 pm Post subject: |
|
|
Acolyte
Joined: 14 Jan 2002 Posts: 63 Location: UK
|
I've not heard of any of the MQI calls just randomly hanging. If you're convinced this is what is happening then you should call on the service team to have the problem investigated. However, I'm concerned that you haven't been able to reproduce the problem on another system, since if you can't reproduce it then it seems unlikely that the service team will be able to either, and that makes life very difficult.
Have you checked the event logs for any MQSeries or tcp/ip errors?
I can't say whether or not changing to a get-wait will help, since I can't imagine why the getCurrentDepth() call might be hanging. However, since it is a simplification and removes the call that appears to be causing the trouble, it's probably worth trying.
Cheers,
James. |
|