MQSeries.net Forum Index » General IBM MQ Support » Orphaned connections bringing MQ down

Orphaned connections bringing MQ down
HenriqueS
Posted: Fri Sep 24, 2010 1:51 pm    Post subject: Orphaned connections bringing MQ down

Master

Joined: 22 Sep 2006
Posts: 235

Hello,

Folks, I am here after almost 2 weeks of trials and research, and I am still not sure what is going on.

1) We have a WAS cluster (2 nodes) hosted on virtual machines and an MQ server also hosted on a virtual machine.

2) The WAS cluster was recently migrated to 7.0; MQ is still at 6.0.2.10.

3) All MQ connections from one of the WAS nodes perform well: the connection pool is cleaned up and all put/get operations succeed.

4) BUT the MQ connections from the other WAS node do not perform well: the connection pool is left with dozens of TCP connections until the maximum number of connections configured on MQ is reached. Also, many get operations (using JMS) keep failing, returning null, until all of a sudden a batch of 50 messages is delivered, and soon after they go back to returning null for some period.

We made several changes to the JMS connection pool management on the WAS side, but they did not give any practical results.

One thing I noticed is that MQ has a lot of TCP/IP errors relating to the bad WAS node:
Quote:

09/23/2010 03:34:56 AM - Process(17795.7298) User(root) Program(amqrmppa)
AMQ9209: Connection to host 'wasnode01-t (172.17.105.110)' closed.

EXPLANATION:
An error occurred receiving data from 'wasnode01-t (172.17.105.110)' over
TCP/IP. The connection to the remote host has unexpectedly terminated.
ACTION:
Tell the systems administrator.
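
To see how the closed connections are distributed across hosts, the error log can be tallied with a few lines of Python. This is only a rough sketch: it assumes the AMQ9209 line always has the wording quoted above, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Pattern for the AMQ9209 line exactly as quoted above; this wording is an
# assumption and may differ on other MQ versions or locales.
AMQ9209 = re.compile(r"AMQ9209: Connection to host '([^']+)' closed\.")


def count_closed_connections(log_text):
    """Return a Counter mapping each host string to its number of AMQ9209 entries."""
    return Counter(match.group(1) for match in AMQ9209.finditer(log_text))


if __name__ == "__main__":
    sample = (
        "09/23/2010 03:34:56 AM - Process(17795.7298) User(root) Program(amqrmppa)\n"
        "AMQ9209: Connection to host 'wasnode01-t (172.17.105.110)' closed.\n"
    )
    # List hosts from most to least affected.
    for host, count in count_closed_connections(sample).most_common():
        print(host, count)
```

If nearly all entries name the bad WAS node, that at least confirms the problem is tied to that one client rather than to the queue manager as a whole.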


So, everything points to some network-related issue, but I wanted to hear whether any forum members have seen something like this before.

I am getting to the point of:
1) Asking the operating systems team to move the faulty WAS node to the same host where the good one is running.
2) Asking the networking team to do some deep network traffic analysis to discover why I have so many broken connections.

The networking team already told me about the occurrence of some large frames flowing on the network (called jumbo frames), typically seen in virtual LANs (VLANs), which may be rejected at some point because of their size, causing issues with TCP sequencing.

Any other guesses?
RogerLacroix
Posted: Fri Sep 24, 2010 2:18 pm    Post subject: Re: Orphaned connections bringing MQ down

Jedi Knight

Joined: 15 May 2001
Posts: 3264
Location: London, ON Canada

HenriqueS wrote:
09/23/2010 03:34:56 AM - Process(17795.7298) User(root) Program(amqrmppa)

I'll start with the one error that I can see. Why are you starting/running the queue manager with the 'root' UserID? You should be starting the queue manager with the 'mqm' UserID.

Regards,
Roger Lacroix
Capitalware Inc.
_________________
Capitalware: Transforming tomorrow into today.
Connected to MQ!
HenriqueS
Posted: Fri Sep 24, 2010 2:49 pm    Post subject: Re: Orphaned connections bringing MQ down

Master

Joined: 22 Sep 2006
Posts: 235

I have no idea why this information is shown...maybe at some point after a crash (max conn reached), I restarted the queue manager by hand...

Currently, this is what it shows:

Code:

[DEINF.SEGAN@sbcdf365]$ ps aux | grep amqrmppa
mqm       2848  0.1  0.5 112364  5196 ?        Ssl  19:21   0:01 /opt/mqm/bin/amqrmppa -m QM.MQ_T_BC                                     
60050     4610  0.0  0.0  61172   720 pts/0    R+   19:46   0:00 grep amqrmppa
mqm      22994  0.0  0.4 108512  5108 ?        Ssl  Sep21   0:06 /opt/mqm/bin/amqrmppa -m QM.MQ_T_MON                                     
mqm      23015  0.0  0.4 109800  4828 ?        Ssl  Sep21   0:08 /opt/mqm/bin/amqrmppa -m QM.MQ_T_IF                                     
mqm      30220  0.0  0.8 138360  8692 ?        Ssl  18:09   0:05 /opt/mqm/bin/amqrmppa -m QM.MQ_T_BC                                     
[DEINF.SEGAN@sbcdf365]$


HenriqueS
Posted: Wed Sep 29, 2010 2:17 pm    Post subject:

Master

Joined: 22 Sep 2006
Posts: 235

So, I am leaving this here for reference...problem solved.

Apparently the WebSphere team found it. There is a fault in WAS 7.0 (I don't know the exact patch level) that makes it ignore external MQ libraries (from 6.0.X), or even newer ones from the latest patches.

They discovered this because, thank God, we had a good WAS node where MQ and WAS were doing just fine. They checked the files opened by the WAS process (the Linux 'lsof' command gives this), compared them to those on the faulty servers, and noticed that the referenced jars were not the same!
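
For anyone wanting to repeat that comparison, here is a minimal sketch of the idea in Python. It assumes the lsof output of each WAS process was saved to text and that the file path is the last whitespace-separated column, so paths containing spaces would need extra handling. The function names and the sample paths are made up for illustration.

```python
def jar_paths(lsof_output):
    """Collect the .jar paths found in the last column of saved lsof output."""
    jars = set()
    for line in lsof_output.splitlines():
        fields = line.split()
        # Assumption: the path is the final field and contains no spaces.
        if fields and fields[-1].endswith(".jar"):
            jars.add(fields[-1])
    return jars


def diff_jars(good_output, faulty_output):
    """Return (jars only on the good node, jars only on the faulty node)."""
    good = jar_paths(good_output)
    faulty = jar_paths(faulty_output)
    return sorted(good - faulty), sorted(faulty - good)


if __name__ == "__main__":
    # Hypothetical lsof lines; real output has the same column layout.
    good = "java 100 was mem REG 8,1 1000 11 /example/good/com.ibm.mq.jar\n"
    faulty = "java 200 was mem REG 8,1 1000 21 /example/faulty/com.ibm.mq.jar\n"
    only_good, only_faulty = diff_jars(good, faulty)
    print("only on good node:", only_good)
    print("only on faulty node:", only_faulty)
```

Comparing full paths only shows that the jars are loaded from different locations; it does not by itself prove the jar contents differ.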

So they tricked the faulty WAS servers by moving the bad MQ jars somewhere else and creating a symbolic link to the good ones.

After that, we had no more orphaned, buggy JMS connections from WAS accumulating until MQ refused new ones.

They are still researching why the WAS namespace got messed up and are in contact with IBM to check what really happened.

***Well, it was not only using the wrong lib but also ignoring any customizations they made to the setupcmdline.sh script, including where the MQ 6 libs were expressly pointed to (it's a command-line script where WAS admins usually push more libs into the namespace). ***

The faulty WAS servers were using this lib:

Code:

Manifest-Version: 1.0
Ant-Version: Apache Ant 1.7.0
Created-By: 1.4.2 (IBM Corporation)
Copyright-Notice: Licensed Materials - Property of IBM          5724-H
 72, 5655-L82, 5724-L26          (c) Copyright IBM Corp. 2008 All Righ
 ts Reserved.          US Government Users Restricted Rights -       
   Use,duplication or disclosure restricted by GSA ADP Schedule Contra
 ct          with IBM Corp.
Sealed: false
Specification-Title: J2EE Connector Architecture
Specification-Version: 1.5
Implementation-Title: WebSphere MQ Resource Adapter
Implementation-Version: 7.0.0.0-k700-L080820
Implementation-Vendor: IBM Corporation


The good WAS server was using this lib

Code:

Manifest-Version: 1.0
Ant-Version: Apache Ant 1.7.0
Created-By: 1.4.2 (IBM Corporation)
Copyright-Notice: Licensed Materials - Property of IBM       5724-H72,
  5655-R36, 5724-L26, 5655-L82          (c) Copyright IBM Corp. 2008 A
 ll Rights Reserved.          US Government Users Restricted Rights -
          Use,duplication or disclosure restricted by GSA ADP Schedule
  Contract          with IBM Corp.
Sealed: false
Specification-Title: J2EE Connector Architecture
Specification-Version: 1.5
Implementation-Title: WebSphere MQ Resource Adapter
Implementation-Version: 7.0.1.2-k701-102-100504
Implementation-Vendor: IBM Corporation
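
The version check on those two manifests can also be scripted. The sketch below assumes the manifest text has already been extracted from each jar, and it only handles the main section; the function names are made up for illustration. Continuation lines (the ones starting with a single space, as in the Copyright-Notice above) are joined back onto the previous attribute.

```python
def parse_manifest(text):
    """Parse the main section of a JAR manifest into a dict of attributes."""
    attrs = {}
    key = None
    for line in text.splitlines():
        if line.startswith(" ") and key is not None:
            # Continuation line: drop the single leading space and append.
            attrs[key] += line[1:]
        elif ": " in line:
            key, value = line.split(": ", 1)
            attrs[key] = value
        else:
            # A blank line ends the main section; stop reading attributes.
            if not line.strip() and attrs:
                break
            key = None
    return attrs


def implementation_version(text):
    """Return the Implementation-Version attribute, or None if it is absent."""
    return parse_manifest(text).get("Implementation-Version")


if __name__ == "__main__":
    # Version strings copied from the two manifests shown above.
    faulty = "Manifest-Version: 1.0\nImplementation-Version: 7.0.0.0-k700-L080820\n"
    good = "Manifest-Version: 1.0\nImplementation-Version: 7.0.1.2-k701-102-100504\n"
    if implementation_version(faulty) != implementation_version(good):
        print("resource adapter versions differ between nodes")
```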
mqjeff
Posted: Thu Sep 30, 2010 1:28 am    Post subject:

Grand Master

Joined: 25 Jun 2008
Posts: 17447

That suggests you had the two servers at different patch levels of WAS, which is not really a good idea.