Author |
Message
|
seacuke23 |
Posted: Fri Apr 05, 2013 9:01 am Post subject: onMessage occasionally stops delivering messages. |
|
|
Novice
Joined: 04 Apr 2013 Posts: 11
|
I've been assigned to look at a legacy system we have implemented in house. It's a Java SE application that creates a QueueSession and sets a message listener. Everything usually works fine but occasionally messages stop being delivered and they pile up on the queue until an application restart. There are no logs printed whatsoever. No errors on the MQ server and nothing in the application logs. The connection does have an ExceptionListener in place which simply logs calls to the onException method. I know that's questionable error handling at best, but I don't see any logs generated from the exception listener anyway. We're unable to recreate the problem in a test environment. The code is too involved to just paste in here, but I have created a test program that mimics the steps being taken.
Code: |
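import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Hashtable;

import javax.jms.ExceptionListener;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.Queue;
import javax.jms.QueueConnection;
import javax.jms.QueueConnectionFactory;
import javax.jms.QueueReceiver;
import javax.jms.QueueSession;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.InitialDirContext;

public class Main {
    private static final String URL = "file:bindings/";
    private static final String ICF = "com.sun.jndi.fscontext.RefFSContextFactory";

    public static void main(String[] args) {
        // JNDI environment pointing at the file-based bindings
        Hashtable<String, String> environment = new Hashtable<String, String>();
        environment.put(Context.INITIAL_CONTEXT_FACTORY, ICF);
        environment.put(Context.PROVIDER_URL, URL);
        environment.put(Context.REFERRAL, "throw");
        QueueConnection qc = null;
        QueueSession qs = null;
        Queue q = null;
        QueueReceiver qr = null;
        BufferedReader in = null;
        InputStreamReader isr = null;
        try {
            InitialDirContext ctx = new InitialDirContext(environment);
            QueueConnectionFactory qcf = (QueueConnectionFactory) ctx.lookup("queueconnfact");
            qc = qcf.createQueueConnection();
            // Connection-level problems are only logged, as in the real application
            qc.setExceptionListener(new ExceptionListener() {
                @Override
                public void onException(JMSException arg0) {
                    System.out.println("Connection exception encountered.");
                    arg0.printStackTrace();
                }
            });
            qs = qc.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
            q = (Queue) ctx.lookup("tqueue");
            qr = qs.createReceiver(q);
            // Asynchronous delivery: the provider calls onMessage for each message
            qr.setMessageListener(new MessageListener() {
                @Override
                public void onMessage(Message arg0) {
                    if (arg0 instanceof TextMessage) {
                        try {
                            System.out.println("Received message " + ((TextMessage) arg0).getText());
                        }
                        catch (JMSException e) {
                            e.printStackTrace();
                        }
                    }
                    else {
                        System.out.println("Received message of type " + arg0.getClass());
                    }
                }
            });
            qc.start();
            isr = new InputStreamReader(System.in);
            in = new BufferedReader(isr);
            // Block until the user types "quit"; also stop on end of input
            // (readLine() returns null there, which the original loop did not handle)
            String line;
            while ((line = in.readLine()) != null && !line.equalsIgnoreCase("quit")) {
                // keep waiting
            }
        }
        catch (NamingException e) {
            e.printStackTrace();
        }
        catch (JMSException e) {
            e.printStackTrace();
        }
        catch (IOException e) {
            e.printStackTrace();
        }
        finally {
            // Release JMS resources in reverse order of creation, then the readers
            if (qr != null) {
                try {
                    qr.close();
                }
                catch (JMSException e) {
                    e.printStackTrace();
                }
            }
            if (qs != null) {
                try {
                    qs.close();
                }
                catch (JMSException e) {
                    e.printStackTrace();
                }
            }
            if (qc != null) {
                try {
                    qc.close();
                }
                catch (JMSException e) {
                    e.printStackTrace();
                }
            }
            if (in != null) {
                try {
                    in.close();
                }
                catch (IOException e) {
                    e.printStackTrace();
                }
            }
            if (isr != null) {
                try {
                    isr.close();
                }
                catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
} |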
We have created a script to try to gather info (thread dump, lsof, etc) next time it happens but I don't have any further information at this time.
One thing I noticed when running my test program: if I physically unplug the network connection, the program carries on as if nothing happened for almost 8 minutes, and only then is the ExceptionListener notified. I'm not sure whether connectivity is a factor at all. Is there some way to detect connectivity problems more quickly?
I also noticed that the consumer appears to be single-threaded, so slow onMessage calls would cause messages to pile up on the queue. However, I don't see much evidence of synchronization or other potentially slow work in the onMessage method, so that seems an unlikely cause.
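One way to rule the slow-handler theory in or out is to time every handler call and record the ones that exceed a threshold. The following is only a sketch under assumptions: `SlowCallDetector` is a hypothetical name, and it wraps a plain `java.util.function.Consumer` rather than a real `javax.jms.MessageListener` so that it compiles without the JMS jar. In the real application the same timing logic would sit inside onMessage.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

/** Hypothetical helper: wraps a handler and counts calls slower than a threshold. */
final class SlowCallDetector<T> implements Consumer<T> {
    private final Consumer<T> delegate;
    private final long thresholdMillis;
    private final AtomicInteger slowCalls = new AtomicInteger();

    SlowCallDetector(Consumer<T> delegate, long thresholdMillis) {
        this.delegate = delegate;
        this.thresholdMillis = thresholdMillis;
    }

    @Override
    public void accept(T message) {
        long start = System.nanoTime();
        try {
            delegate.accept(message);
        } finally {
            // Measure even when the delegate throws
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            if (elapsedMillis >= thresholdMillis) {
                slowCalls.incrementAndGet();
                System.err.println("Slow handler call: " + elapsedMillis + " ms");
            }
        }
    }

    /** Number of calls that took at least thresholdMillis so far. */
    int slowCallCount() {
        return slowCalls.get();
    }
}
```

If the count stays at zero across an outage, slow processing is probably not what stalls delivery.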
Right now I'm just grasping at straws, so any ideas on where I can start looking would be greatly appreciated.
Last edited by seacuke23 on Fri Apr 05, 2013 9:54 am; edited 1 time in total |
|
|
 |
lancelotlinc |
Posted: Fri Apr 05, 2013 9:26 am Post subject: |
|
|
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
|
The code is relying on the route to QMGR being static and accessible forever, which is not possible. Upon error, or every so many messages, reinitialize the ctx, qcf, and qc variables. _________________ http://leanpub.com/IIB_Tips_and_Tricks
Save $20: Coupon Code: MQSERIES_READER |
|
|
 |
seacuke23 |
Posted: Fri Apr 05, 2013 9:48 am Post subject: |
|
|
Novice
Joined: 04 Apr 2013 Posts: 11
|
lancelotlinc wrote: |
The code is relying on the route to QMGR being static and accessible forever, which is not possible. |
Can you clarify what you mean by "the route to QMGR"?
lancelotlinc wrote: |
Upon error, or every so many messages, reinitialize the ctx, qcf, and qc variables. |
I am never notified of an exception, so I can't do anything there. Are you saying I should close the queue receiver, session, connection, and reinitialize the context and recreate the whole connection and reregister the connection listener at message count intervals? |
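The message-count idea discussed here could be sketched roughly as follows. Everything in it is an assumption for illustration: `MessageCountRecycler` is a hypothetical name, and the `reconnect` Runnable stands in for whatever code closes the receiver, session and connection and then recreates them. The sketch hands the reconnect to an `Executor` instead of running it inline because, as far as I know, `Connection.close()` waits for running message listeners to return, so tearing the connection down from inside onMessage risks blocking on itself.

```java
import java.util.concurrent.Executor;

/** Hypothetical helper: schedules a reconnect after every `limit` handled messages. */
final class MessageCountRecycler {
    private final long limit;
    private final Runnable reconnect; // assumed: closes and recreates receiver, session, connection
    private final Executor executor;  // runs the reconnect off the delivery thread
    private long handled;

    MessageCountRecycler(long limit, Runnable reconnect, Executor executor) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        this.limit = limit;
        this.reconnect = reconnect;
        this.executor = executor;
    }

    /** Call at the end of each onMessage. Returns true if a reconnect was scheduled. */
    synchronized boolean messageHandled() {
        handled++;
        if (handled < limit) {
            return false;
        }
        handled = 0; // start counting again for the next connection
        executor.execute(reconnect);
        return true;
    }
}
```

In real use the executor would typically be something like `Executors.newSingleThreadExecutor()`, so that only one reconnect runs at a time.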
|
|
 |
lancelotlinc |
Posted: Fri Apr 05, 2013 10:05 am Post subject: |
|
|
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
|
You are opening a connection to a qmgr and expecting it to be fault free forever, without interruption. This is not possible, even on a local box.
Find a way to re-initialize the connection every so often. Maybe every 100 messages if your message rate is several per hour. |
|
|
 |
seacuke23 |
Posted: Fri Apr 05, 2013 10:17 am Post subject: |
|
|
Novice
Joined: 04 Apr 2013 Posts: 11
|
lancelotlinc wrote: |
Find a way to re-initialize the connection every so often. Maybe every 100 messages if your message rate is several per hour. |
That is something I've been considering, even at time intervals...but have been avoiding it because it seems like more of a workaround than a production solution. It would be like recreating a database connection every 100 queries....you should only have to do that if there's a bug in the driver. Are there any known bugs that you're aware of that could cause this? Do you know how I can find release notes for mqseries releases or bug reports? |
|
|
 |
lancelotlinc |
Posted: Fri Apr 05, 2013 10:29 am Post subject: |
|
|
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
|
seacuke23 wrote: |
lancelotlinc wrote: |
Find a way to re-initialize the connection every so often. Maybe every 100 messages if your message rate is several per hour. |
That is something I've been considering, even at time intervals...but have been avoiding it because it seems like more of a workaround than a production solution. It would be like recreating a database connection every 100 queries....you should only have to do that if there's a bug in the driver. Are there any known bugs that you're aware of that could cause this? Do you know how I can find release notes for mqseries releases or bug reports? |
Or, you could have a nightly restart or some other way to occasionally reset the connection. If you coded it right, you would not have to reset the DB conx, and you would only be resetting the QMGR conx details.
Another idea: set a timer and reset the conx after ten minutes of inactivity. |
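The inactivity timer suggested above might look something like this. It is a minimal sketch, not a tested production component: `IdleWatchdog`, `touch()` and `check()` are hypothetical names, the clock is injected so the logic can be exercised without waiting ten minutes, and `onIdle` stands in for the code that resets the connection. Note that a genuinely quiet queue looks the same as a stalled consumer, so this trades some unnecessary reconnects for not stalling silently.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical helper: fires a callback when no activity was seen for maxIdleMillis. */
final class IdleWatchdog {
    private final long maxIdleMillis;
    private final LongSupplier clock; // e.g. System::currentTimeMillis
    private final Runnable onIdle;    // assumed: resets the connection
    private final AtomicLong lastActivity;

    IdleWatchdog(long maxIdleMillis, LongSupplier clock, Runnable onIdle) {
        this.maxIdleMillis = maxIdleMillis;
        this.clock = clock;
        this.onIdle = onIdle;
        this.lastActivity = new AtomicLong(clock.getAsLong());
    }

    /** Call from onMessage whenever a message arrives. */
    void touch() {
        lastActivity.set(clock.getAsLong());
    }

    /** Call periodically. Returns true if the idle callback was fired. */
    boolean check() {
        long now = clock.getAsLong();
        if (now - lastActivity.get() < maxIdleMillis) {
            return false;
        }
        lastActivity.set(now); // restart the idle window so the callback does not fire on every tick
        onIdle.run();
        return true;
    }
}
```

The periodic call could come from a `ScheduledExecutorService`, for example `scheduler.scheduleAtFixedRate(watchdog::check, 1, 1, TimeUnit.MINUTES)`.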
|
|
 |
seacuke23 |
Posted: Fri Apr 05, 2013 10:32 am Post subject: |
|
|
Novice
Joined: 04 Apr 2013 Posts: 11
|
lancelotlinc wrote: |
Or, you could have a nightly restart or some other way to occasionally reset the connection. If you coded it right, you would not have to reset the DB conx, and you would only be resetting the QMGR conx details.
Another idea: set a timer and reset the conx after ten minutes of inactivity. |
I wasn't saying we have to reset the database connection, rather, likening the mq connection to a database connection and saying that neither should really need to be recreated unless there's a bug in the drivers or they're not being used correctly. |
|
|
 |
lancelotlinc |
Posted: Fri Apr 05, 2013 10:42 am Post subject: |
|
|
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
|
seacuke23 wrote: |
lancelotlinc wrote: |
Or, you could have a nightly restart or some other way to occasionally reset the connection. If you coded it right, you would not have to reset the DB conx, and you would only be resetting the QMGR conx details.
Another idea: set a timer and reset the conx after ten minutes of inactivity. |
I wasn't saying we have to reset the database connection, rather, likening the mq connection to a database connection and saying that neither should really need to be recreated unless there's a bug in the drivers or they're not being used correctly. |
In theory, as long as traffic is flowing, you should never have to reset the conx. Have you investigated why you do not get an exception when the program stops receiving messages? Can you recreate the problem to troubleshoot it? |
|
|
 |
seacuke23 |
Posted: Fri Apr 05, 2013 10:46 am Post subject: |
|
|
Novice
Joined: 04 Apr 2013 Posts: 11
|
lancelotlinc wrote: |
Have you investigated why you do not get an exception when the program stops receiving messages? Can you recreate the problem to troubleshoot it? |
I have been trying to figure out why I wouldn't receive any exception notification and have so far come up empty. I am not able to recreate the problem and it only occurs sporadically at a client site. The last two occurrences were two weeks apart. |
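One thing that may be worth checking while hunting for the missing exception: the ExceptionListener reports connection-level problems, so an unchecked exception thrown from the message handler itself would not necessarily show up there. A guard around the handler makes such failures visible. This is a sketch under assumptions: `LoggingGuard` is a hypothetical name, and it wraps a `java.util.function.Consumer` in place of a real MessageListener so that it compiles without the JMS jar. It rethrows after logging, so behaviour is unchanged and only visibility improves.

```java
import java.util.function.Consumer;

/** Hypothetical helper: logs any unchecked failure from a handler, then rethrows it. */
final class LoggingGuard<T> implements Consumer<T> {
    private final Consumer<T> delegate;
    private volatile Throwable lastFailure;

    LoggingGuard(Consumer<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void accept(T message) {
        try {
            delegate.accept(message);
        } catch (RuntimeException | Error e) {
            lastFailure = e;
            System.err.println("Message handler failed: " + e);
            e.printStackTrace();
            throw e; // keep the original behaviour, the failure is just no longer silent
        }
    }

    /** Most recent failure seen, or null if the handler has never thrown. */
    Throwable lastFailure() {
        return lastFailure;
    }
}
```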
|
|
 |
lancelotlinc |
Posted: Fri Apr 05, 2013 10:47 am Post subject: |
|
|
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
|
seacuke23 wrote: |
lancelotlinc wrote: |
Have you investigated why you do not get an exception when the program stops receiving messages? Can you recreate the problem to troubleshoot it? |
I have been trying to figure out why I wouldn't receive any exception notification and have so far come up empty. I am not able to recreate the problem and it only occurs sporadically at a client site. The last two occurrences were two weeks apart. |
What OS and what version of MQ? |
|
|
 |
seacuke23 |
Posted: Fri Apr 05, 2013 10:49 am Post subject: |
|
|
Novice
Joined: 04 Apr 2013 Posts: 11
|
lancelotlinc wrote: |
What OS and what version of MQ? |
I've asked for that info and so far have been told that they're running "7.0.1.x"...no OS details yet. |
|
|
 |
seacuke23 |
Posted: Fri Apr 05, 2013 11:48 am Post subject: |
|
|
Novice
Joined: 04 Apr 2013 Posts: 11
|
Version is:
Name: WebSphere MQ
Version: 7.0.1.7 |
|
|
 |
lancelotlinc |
Posted: Fri Apr 05, 2013 11:51 am Post subject: |
|
|
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
|
OS - AIX, Sun, HP, Windows ?? |
|
|
 |
seacuke23 |
Posted: Fri Apr 05, 2013 11:54 am Post subject: |
|
|
Novice
Joined: 04 Apr 2013 Posts: 11
|
lancelotlinc wrote: |
OS - AIX, Sun, HP, Windows ?? |
Application runs on Solaris, MQ server is on AIX |
|
|
 |
lancelotlinc |
Posted: Fri Apr 05, 2013 11:55 am Post subject: |
|
|
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
|
You can look in /var/adm/user.log or debug.log to see any errors on the QMGR side. Similarly for Sun on the client side. |
|
|
 |
|