Author |
Message
|
awatson72 |
Posted: Tue May 09, 2006 5:42 am Post subject: Sporadic 2058 errors using WebSphere MQ Base Java |
|
|
Acolyte
Joined: 14 Apr 2004 Posts: 69 Location: Freeport, Maine
|
A Java application that uses the MQ base Java classes (not JMS) is generating occasional 2058 errors. The code contains logic to retry the connection to MQ if the first attempt fails, and the retry usually works fine. These errors have become more frequent since upgrading to MQ 6.0 (refresh pack 1). The server connection channel being used has Heartbeat Interval set to 10 and Keep Alive Interval set to auto, and qm.ini contains:
TCP:
KeepAlive=YES
An obvious suspect would be the network between the server running the JVM hosting the Java app and the MQ server, but this network is quite reliable.
Any thoughts or comments or similar experiences on this would be appreciated. _________________ Andrew Watson
L.L. Bean, Inc. |
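Because the retry usually succeeds, the first failure is easy to lose. Below is a minimal sketch of a retry wrapper that records every failed attempt so the sporadic 2058s stay visible even when the retry works. The class name `ConnectRetry` and its parameters are hypothetical and are not taken from the application discussed here; the action passed in would be whatever code performs the MQ connect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

/** Hypothetical helper: retries an action and keeps a record of every failed attempt. */
final class ConnectRetry {
    /** Failure descriptions collected across attempts, in order. */
    final List<String> failures = new ArrayList<>();

    /**
     * Runs the action up to maxAttempts times, sleeping delayMillis between attempts.
     * Returns the first successful result, or rethrows the last failure.
     */
    <T> T call(Callable<T> action, int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                // Record the failure even if a later attempt succeeds.
                failures.add("attempt " + attempt + ": " + e);
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }
}
```

After a successful call, `failures` still lists what went wrong on earlier attempts, which can be written to the application log together with the connection parameters in use at the time.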
|
jefflowrey |
Posted: Tue May 09, 2006 5:49 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
What statement is receiving the 2058? _________________ I am *not* the model of the modern major general. |
|
wschutz |
Posted: Tue May 09, 2006 5:53 am Post subject: |
|
|
 Jedi Knight
Joined: 02 Jun 2005 Posts: 3316 Location: IBM (retired)
|
and do you see anything relevant in the AMQERR01.LOGs on the server? _________________ -wayne |
|
Tibor |
Posted: Tue May 09, 2006 6:02 am Post subject: |
|
|
 Grand Master
Joined: 20 May 2001 Posts: 1033 Location: Hungary
|
As far as I remember, this reason code is "qmgr name error" (MQRC_Q_MGR_NAME_ERROR), so the network connection must already be working before you can get this message. For a network problem, the API usually returns rc=2059 (MQRC_Q_MGR_NOT_AVAILABLE).
HTH,
Tibor |
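The distinction between the two reason codes can be made explicit in application logging. Below is a minimal sketch that maps the numeric codes mentioned in this thread to their symbolic names; the class `ReasonCodes` is hypothetical, and only a handful of codes are covered.

```java
/** Hypothetical lookup for a few connection-related MQ reason codes. */
final class ReasonCodes {
    private ReasonCodes() {}

    /** Returns the symbolic name for the codes covered here, or a generic label. */
    static String describe(int reasonCode) {
        switch (reasonCode) {
            case 2009: return "MQRC_CONNECTION_BROKEN";
            case 2058: return "MQRC_Q_MGR_NAME_ERROR";
            case 2059: return "MQRC_Q_MGR_NOT_AVAILABLE";
            default:   return "unmapped reason code " + reasonCode;
        }
    }

    /** True when the code points at the queue manager name rather than at reachability. */
    static boolean isNameProblem(int reasonCode) {
        return reasonCode == 2058;
    }
}
```

Logging `ReasonCodes.describe(rc)` next to the raw number makes it clearer at a glance whether a failure is a name mismatch or a reachability problem.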
|
awatson72 |
Posted: Tue May 09, 2006 6:20 am Post subject: |
|
|
Acolyte
Joined: 14 Apr 2004 Posts: 69 Location: Freeport, Maine
|
Here is an example of the exception from the JVM SysOut log:
[5/9/06 9:50:27:317 EDT] 77374840 SystemOut O >>---->Unable to connect, generating LLBSoftException.
>>---->The Exception thrown is: com.ibm.mq.MQException: MQJE001: An MQException
occurred: Completion Code 2, Reason 2058
MQJE036: Queue manager rejected connection attempt
Yes, 2058 is Qmgr name error. We are assuming that the error is reported when the application attempts to connect to the QM. I'm not a Java programmer, but I think this is where the problem occurs:
try {
    MQException.log = null;
    MQEnvironment.hostname = aLLBMQInfo.getQMgrHostName();
    MQEnvironment.channel = channelName;
    MQEnvironment.port = aLLBMQInfo.getQMgrPort();
    mqQueueManager = new MQQueueManager(qMgrName);
    if (mqQueueManager.isConnected()) {
        aLLBMQInfo.setQMQQueueManager(mqQueueManager);
    } else {
        exceptionToThrow = ExceptionUtilities.generateSoftException(errorMsg);
        log.warn(errorMsg);
    }
There are no associated errors in any of the QM logs. _________________ Andrew Watson
L.L. Bean, Inc. |
|
jefflowrey |
Posted: Tue May 09, 2006 6:24 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
awatson72 wrote: |
We are assuming that the error is reported when the application attempts to connect to the QM. |
You can waste a lot of time troubleshooting assumptions.
Is there more of a Java stack trace that would include the class and line number that threw the exception? _________________ I am *not* the model of the modern major general. |
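If the application only logs the exception message, the class and line number are lost. Below is a minimal sketch of turning a full stack trace into a string that any logger can write; the helper name `StackTraces` is hypothetical and not part of the application shown above.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

/** Hypothetical helper: renders a Throwable's full stack trace as text for logging. */
final class StackTraces {
    private StackTraces() {}

    static String asText(Throwable t) {
        StringWriter buffer = new StringWriter();
        // PrintWriter is closed (and flushed) by try-with-resources before the buffer is read.
        try (PrintWriter out = new PrintWriter(buffer)) {
            t.printStackTrace(out);
        }
        return buffer.toString();
    }
}
```

Passing the caught exception to `StackTraces.asText(...)` inside the catch block and logging the result would show which statement actually raised the 2058.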
|
wschutz |
Posted: Tue May 09, 2006 6:50 am Post subject: |
|
|
 Jedi Knight
Joined: 02 Jun 2005 Posts: 3316 Location: IBM (retired)
|
again...
wschutz wrote: |
and do you see anything relevant in the AMQERR01.LOGs on the server? |
_________________ -wayne |
|
awatson72 |
Posted: Tue May 09, 2006 7:52 am Post subject: |
|
|
Acolyte
Joined: 14 Apr 2004 Posts: 69 Location: Freeport, Maine
|
Again:
Quote: |
There are no associated errors in any of the QM logs. |
Including AMQERR01.LOG _________________ Andrew Watson
L.L. Bean, Inc. |
|
RogerLacroix |
Posted: Tue May 09, 2006 9:13 pm Post subject: |
|
|
 Jedi Knight
Joined: 15 May 2001 Posts: 3264 Location: London, ON Canada
|
Hi Andrew,
My guess is that the Java code has whacked the 'qMgrName' variable or there is a bug in the isConnected() code. If I remember correctly, isConnected() only reports the state of the connection from the last MQ API call - it does not actually perform a real test!!
Personally, I would rewrite it as (plus add more logging):
Code: |
try
{
    MQException.log = null;
    MQEnvironment.hostname = aLLBMQInfo.getQMgrHostName();
    MQEnvironment.channel = channelName;
    MQEnvironment.port = aLLBMQInfo.getQMgrPort();
    mqQueueManager = new MQQueueManager(qMgrName);
    //
    aLLBMQInfo.setQMQQueueManager(mqQueueManager);
}
catch (MQException mqex)
{
    exceptionToThrow = ExceptionUtilities.generateSoftException(errorMsg);
    log.warn(errorMsg);
    System.err.println("aLLBMQInfo.getQMgrHostName() : " + aLLBMQInfo.getQMgrHostName());
    System.err.println("channelName : " + channelName);
    System.err.println("aLLBMQInfo.getQMgrPort() : " + aLLBMQInfo.getQMgrPort());
    System.err.println("qMgrName : " + qMgrName);
}
catch (Exception ex)
{
    System.err.println("Exception : " + ex);
} |
Also, how come 2 parameters (hostname & port) are stored in the aLLBMQInfo class but the other 2 parameters (channelName & qMgrName) are not?
Regards,
Roger Lacroix
Capitalware Inc. _________________ Capitalware: Transforming tomorrow into today. |
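One way to test the "whacked qMgrName" theory is to check and log the name just before connecting. Below is a minimal sketch, assuming the usual MQ object naming rules (up to 48 characters drawn from letters, digits, period, forward slash, underscore and percent). The class `QMgrNameCheck` is hypothetical, and treating an empty name as suspect is a choice made for this sketch only.

```java
import java.util.regex.Pattern;

/** Hypothetical pre-connect check that makes a damaged queue manager name visible in logs. */
final class QMgrNameCheck {
    // Assumed naming rule: 1 to 48 characters from A-Z a-z 0-9 . / _ %
    private static final Pattern VALID = Pattern.compile("[A-Za-z0-9._/%]{1,48}");

    private QMgrNameCheck() {}

    static boolean looksValid(String name) {
        return name != null && VALID.matcher(name).matches();
    }

    /** Brackets the name so stray whitespace or control characters stand out in the log. */
    static String forLog(String name) {
        if (name == null) {
            return "qMgrName=<null>";
        }
        return "qMgrName=[" + name + "] length=" + name.length() + " valid=" + looksValid(name);
    }
}
```

Writing `QMgrNameCheck.forLog(qMgrName)` on every failed connect would show whether the name was intact at the moment the 2058 occurred.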
|
awatson72 |
Posted: Fri May 12, 2006 5:15 am Post subject: |
|
|
Acolyte
Joined: 14 Apr 2004 Posts: 69 Location: Freeport, Maine
|
Hi Roger, thanks for your reply.
The parameters are treated differently because of how the code randomizes a connection attempt across a list of one or more queue managers. Information about these QMs (host, channel, port) is stored in XML files.
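One possibility that this thread does not confirm: the `MQEnvironment` fields are static, so if several threads pick different queue managers at the same time, one thread's host or port could be overwritten before its `new MQQueueManager(qMgrName)` runs, and the name would then not match the queue manager actually reached. Below is a minimal sketch that keeps each queue manager's settings together in one immutable object and builds a per-connection properties table. The class `QMgrTarget` is hypothetical, the property keys are assumed to match the `MQC.HOST_NAME_PROPERTY`, `MQC.PORT_PROPERTY` and `MQC.CHANNEL_PROPERTY` constants, and passing the table to an `MQQueueManager(String, Hashtable)` constructor should be verified against the installed MQ version.

```java
import java.util.Hashtable;
import java.util.List;
import java.util.Random;

/** Hypothetical immutable description of one queue manager to connect to. */
final class QMgrTarget {
    final String qMgrName;
    final String host;
    final int port;
    final String channel;

    QMgrTarget(String qMgrName, String host, int port, String channel) {
        this.qMgrName = qMgrName;
        this.host = host;
        this.port = port;
        this.channel = channel;
    }

    /** Per-connection properties; key strings are assumed to equal the MQC.*_PROPERTY values. */
    Hashtable<String, Object> toProperties() {
        Hashtable<String, Object> props = new Hashtable<>();
        props.put("hostname", host);
        props.put("port", Integer.valueOf(port));
        props.put("channel", channel);
        return props;
    }

    /** Picks one target at random; name, host, port and channel always travel together. */
    static QMgrTarget pick(List<QMgrTarget> targets, Random random) {
        if (targets.isEmpty()) {
            throw new IllegalArgumentException("no queue managers configured");
        }
        return targets.get(random.nextInt(targets.size()));
    }
}
```

Because nothing is written to shared static state, two threads choosing different targets cannot mix one queue manager's name with another's host and port.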
The problem occurs on well below 1% of the connection attempts, so if there is a bug in the code, it only crops up very occasionally. I've thought about turning on a trace, but doing so would require changing the code to enable it, which is difficult to coordinate for a production system.
Not sure which direction to head with this at the moment.... I was hoping someone had seen similar behavior and had found a solution. _________________ Andrew Watson
L.L. Bean, Inc. |
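If a code change is eventually scheduled, tracing could be made switchable from outside the application so that later trace runs need no further changes. Below is a minimal sketch that reads a trace level from a text value such as a JVM system property. The class `TraceSwitch` and the property name `mq.trace.level` are hypothetical, and the 1 to 5 range is an assumption about the level accepted by `MQEnvironment.enableTracing(int)` that should be checked against the installed version.

```java
/** Hypothetical helper: derives a trace level from an externally supplied text value. */
final class TraceSwitch {
    private TraceSwitch() {}

    /** Returns 0 (tracing off) for missing or invalid input, otherwise a level clamped to 0..5. */
    static int levelFrom(String raw) {
        if (raw == null) {
            return 0;
        }
        try {
            int level = Integer.parseInt(raw.trim());
            return Math.max(0, Math.min(5, level));
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    /** Reads the hypothetical system property, e.g. set with -Dmq.trace.level=3 */
    static int levelFromSystemProperty() {
        return levelFrom(System.getProperty("mq.trace.level"));
    }
}
```

The application would call `levelFromSystemProperty()` once at startup and enable tracing only when the result is greater than zero.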
|