Author |
Message
|
RikBaeten |
Posted: Tue Jan 04, 2011 5:46 am Post subject: |
|
|
 Novice
Joined: 26 Feb 2007 Posts: 19
|
Thanks everyone for your help. The issue is solved. I'm adding my findings as a future reference for other people.
I've done some testing on Solaris UNIX with three different queue managers: one with default settings, one with the KeepAlive=YES setting and a third one with the ClientIdle=3600 setting. On one machine I connected three different apps to these queue managers. These apps establish a connection and then wait forever. On another machine I created the same setup, but afterwards I disconnected the network card abruptly to create orphaned connections.
During these tests I observed the sockets, the output from "dis conn" and the output from "dis chstatus" in runmqsc.
Here are the test results:
- The KeepAlive setting only cleans up orphaned connections, and only once the OS-configured KeepAlive interval has elapsed. Sockets, connections and channels are then cleaned up immediately.
- The ClientIdle setting cleans up both orphaned and idle connections after 1 hour. It cleans up both connections and channels, but the sockets take between 8 and 11 additional minutes to time out and be released.
- Using the default settings both orphaned and idle connection resources remain active: sockets, connections and channels.
After these tests I tried to manually clean up orphaned connections in the following ways:
- stop conn => this only stops the connections and leaves channel instances and sockets active. Therefore it doesn't seem a very useful command in my opinion.
- stop channel(xyz) mode(quiesce) => does nothing
- stop channel(xyz) mode(force) or mode(terminate) => both do the same thing for me. They effectively clean up orphaned connections: both the connection and the channel instance are removed. The socket takes about 8 additional minutes to time out.
Conclusions:
- I will implement the KeepAlive setting on all queue managers. I don't understand why it's not configured by default (other than to avoid impact in case of a very long network outage where the app is not capable of reestablishing its connections).
- As far as I understand, the most granular manual cleanup possible is to stop all channel instances of a specific name from a specific IP (find the info using "dis chstatus") by issuing stop channel(xyz) conname(abc). Since our application is running on two different nodes, we can stop one node and then clean up all connections originating from that node's IP. If you are sharing the same channel names and IPs with different applications, or you only have one node, you have no means for an online manual cleanup. |
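As a rough sketch of the targeted cleanup described in the conclusions above, the following Python helper only builds the MQSC command text; it does not talk to a queue manager. The function name `stop_channel_command` and its parameters are hypothetical and chosen for illustration. The resulting string would still have to be fed to runmqsc by hand or by a script.

```python
def stop_channel_command(channel, conname=None, mode="FORCE", status=None):
    """Build an MQSC STOP CHANNEL command string (illustrative helper only)."""
    mode = mode.upper()
    if mode not in {"QUIESCE", "FORCE", "TERMINATE"}:
        raise ValueError(f"unsupported mode: {mode}")

    parts = [f"STOP CHANNEL('{channel}')"]
    if conname:
        # Restrict the stop to channel instances coming from one address.
        parts.append(f"CONNAME('{conname}')")
    parts.append(f"MODE({mode})")
    if status:
        status = status.upper()
        if status not in {"STOPPED", "INACTIVE"}:
            raise ValueError(f"unsupported status: {status}")
        parts.append(f"STATUS({status})")
    return " ".join(parts)


if __name__ == "__main__":
    # 'xyz' and 'abc' are the placeholders used in the post, not real names.
    print(stop_channel_command("xyz", conname="abc"))
```

Building the command in one place makes it harder to forget the conname filter, which is what keeps the stop limited to one node's connections.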
|
|
 |
HubertKleinmanns |
Posted: Tue Jan 04, 2011 7:08 am Post subject: |
|
|
 Shaman
Joined: 24 Feb 2004 Posts: 732 Location: Germany
|
Just one remark:
RikBaeten wrote: |
- stop conn => this only stops the connections and leaves channel instances and sockets active. Therefore it doesn't seem a very useful command in my opinion. |
This may not be useful for cleaning up orphaned channels. But you could use this command to disconnect a resource-consuming application which is connected locally via bindings or remotely via a client channel. Other - maybe more important - applications connected to this QMgr could then go on working. _________________ Regards
Hubert |
|
|
 |
George Carey |
Posted: Mon Apr 18, 2011 2:21 pm Post subject: Excellent info |
|
|
Knight
Joined: 29 Jan 2007 Posts: 500 Location: DC
|
This is a very good set of posts ... on some of the more esoteric aspects of MQ and Client connections ...
Might want to put this in some ... Keeper file !
GTC _________________ "Truth is ... grasping the virtually unconditioned",
Bernard F. Lonergan S.J.
(from book titled "Insight" subtitled "A Study of Human Understanding") |
|
|
 |
George Carey |
Posted: Wed Apr 20, 2011 2:27 pm Post subject: trying to reactivate this post |
|
|
Knight
Joined: 29 Jan 2007 Posts: 500 Location: DC
|
This is a bit late to the gate ... but the subject of these posts is of interest to me at this time ... if anyone is still available to respond on it.
1.) How do you know that is the behaviour of the 'stop conn' command? Namely to stop the connection to the queue manager but leave the socket connection to the client live. I mean where is that documented?
I do indeed see that as the apparent behavior: when I stop all connections associated with an IP, MQ Explorer still shows the client channel instances for that IP as running.
2.) Also when I go into the MQ Explorer and do a 'status' on the client channel. I see all instances of that client channel with different IP associated if any. However in MQ Explorer I cannot 'stop' an instance there is no option for it from the status screen. So no equivalent to the CLI 'stop chl(client.chl) conname(1.2.3.4)' for example. How come ! C'est la vie ?
3.) Also, there is no way to restart a conn once stopped by IP/conname, as it is not an option on start conn.
4.) What is the value/purpose of being able to 'stop conn' if it is just the connection to the QMGR and not also the socket connection to client?
As I have currently run into a problem. I have stopped all conns coming from conname(1.2.3.4) and have also stopped all channel instances of client channel 'client.chl' with conname (1.2.3.4), yet MQ Explorer shows one instance for that channel and that IP as 'stopping' but it never stops !!
I rebooted MQ Explorer to see if that would refresh its display but no good.
The channels were originally coming from a DataPower client connection from an MQ Manager object, but I even disabled that and still the client channel just shows 'stopping'. So how do I get rid of this stopping channel instance without rebooting the QMGR and killing all other channel instances?
5.) Why would there not be a cross command tie for connID shown in the output of the commands of dis chs(client.chl) conname(1.2.3.4) and the command dis conn(*) conname(1.2.3.4) ? I don't see that there is.
Thanks in advance for any and all responders to these questions.
GTC |
|
|
 |
fjb_saper |
Posted: Wed Apr 20, 2011 9:36 pm Post subject: Re: trying to reactivate this post |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
George Carey wrote: |
This is a bit late to the gate ... but the subject of these posts is of interest to me at this time ... if anyone is still available to respond on it.
1.) How do you know that is the behavior of the 'stop conn' command? Namely to stop the connection to the queue manager but leave the socket connection to the client live. I mean where is that documented?
I do indeed see that as the apparent behavior: when I stop all connections associated with an IP, MQ Explorer still shows the client channel instances for that IP as running.
2.) Also when I go into the MQ Explorer and do a 'status' on the client channel. I see all instances of that client channel with different IP associated if any. However in MQ Explorer I cannot 'stop' an instance there is no option for it from the status screen. So no equivalent to the CLI 'stop chl(client.chl) conname(1.2.3.4)' for example. How come ! C'est la vie ? |
Try MO71. It gives you the option...
George Carey wrote: |
3.) Also, there is no way to restart a conn once stopped by IP/conname, as it is not an option on start conn.
4.) What is the value/purpose of being able to 'stop conn' if it is just the connection to the QMGR and not also the socket connection to client?
As I have currently run into a problem. I have stopped all conns coming from conname(1.2.3.4) and have also stopped all channel instances of client channel 'client.chl' with conname (1.2.3.4), yet MQ Explorer shows one instance for that channel and that IP as 'stopping' but it never stops !! |
Known problem. You will need to use stop chl(chlname) mode(force) status(inactive) conname(1.2.3.4). This will allow the client to reconnect immediately...
George Carey wrote: |
I rebooted MQ Explorer to see if that would refresh its display but no good.
The channels were originally coming from a DataPower client connection from an MQ Manager object, but I even disabled that and still the client channel just shows 'stopping'. So how do I get rid of this stopping channel instance without rebooting the QMGR and killing all other channel instances? |
See my comment above.
George Carey wrote: |
5.) Why would there not be a cross command tie for connID shown in the output of the commands of dis chs(client.chl) conname(1.2.3.4) and the command dis conn(*) conname(1.2.3.4) ? I don't see that there is.
Thanks in advance for any and all responders to these questions.
GTC |
_________________ MQ & Broker admin |
|
|
 |
George Carey |
Posted: Thu Apr 21, 2011 3:00 pm Post subject: Tnks for responses |
|
|
Knight
Joined: 29 Jan 2007 Posts: 500 Location: DC
|
Thanks much for the responses, I will check them out.
On my question 5.), it looks like you were going to reply to it, but I see no reply.
Anything on that question?
In other words, why wouldn't a 'ConnID' appear in the 'dis chs(abc.client.chl) where (conname lk 1.2.3.4*) all' output that could be tied to the 'dis conn(*) conname(1.2.3.4) all' output?
Rgrds,
GTC |
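Since the two displays share no connection identifier, about the best one can do from the text output is to group records by channel name and conname. The Python sketch below assumes runmqsc output made of ATTR(value) pairs, with each record introduced by an AMQnnnn message line. The sample text is invented for illustration and is not captured from a real queue manager, and the helper names `parse_records` and `group_by_endpoint` are hypothetical.

```python
import re
from collections import defaultdict

ATTR = re.compile(r"([A-Z]+)\(([^)]*)\)")


def parse_records(text):
    """Split runmqsc-style text into one dict per 'AMQnnnn:' record."""
    records, current = [], None
    for line in text.splitlines():
        if re.match(r"\s*AMQ\d+", line):
            # A message line starts a new record.
            current = {}
            records.append(current)
        elif current is not None:
            for key, value in ATTR.findall(line):
                current[key] = value.strip()
    return records


def group_by_endpoint(chs_records, conn_records):
    """Group channel instances and connections sharing CHANNEL and CONNAME."""
    groups = defaultdict(lambda: {"chstatus": [], "conns": []})
    for rec in chs_records:
        groups[(rec.get("CHANNEL"), rec.get("CONNAME"))]["chstatus"].append(rec)
    for rec in conn_records:
        groups[(rec.get("CHANNEL"), rec.get("CONNAME"))]["conns"].append(rec)
    return dict(groups)


# Invented sample data, shaped like 'dis chs' and 'dis conn' output.
SAMPLE_CHS = """
AMQ8417: Display Channel Status details.
   CHANNEL(ABC.CLIENT.CHL)   CHLTYPE(SVRCONN)
   CONNAME(1.2.3.4)          STATUS(RUNNING)
"""
SAMPLE_CONN = """
AMQ8276: Display Connection details.
   CONN(0000000000000001)
   CHANNEL(ABC.CLIENT.CHL)   CONNAME(1.2.3.4)
   PID(1234)   TID(7)
"""

if __name__ == "__main__":
    grouped = group_by_endpoint(parse_records(SAMPLE_CHS), parse_records(SAMPLE_CONN))
    for endpoint, found in grouped.items():
        print(endpoint, len(found["chstatus"]), "instance(s),", len(found["conns"]), "conn(s)")
```

This only narrows things down to an endpoint. With several instances from the same IP on the same channel it still cannot say which connection belongs to which channel instance, which is exactly the gap question 5.) is asking about.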
|
|
 |
George Carey |
Posted: Tue May 10, 2011 10:29 am Post subject: trying to sync up commands |
|
|
Knight
Joined: 29 Jan 2007 Posts: 500 Location: DC
|
I am using a combination of the latest MQ Explorer client output and the output from the dis chs and dis conn commands for a queue (call it ABC.Q). The queue is being accessed by 19 client threads, all from the same amqrmppa process associated with a client channel (call it ABC.client.chl). Only one thread shows its handle state as active; all other threads show as inactive in the qstatus output from the MQ Explorer client.
The dis conn output shows no difference across all 19 connections when selecting on conname equal to the client IP, and dis chs selecting on the same shows no differences either: all are in running status.
And even netstat -n | grep {client IP} shows all 19 connections ESTABLISHED.
1.) Why does the handle state of the queue opened for input, which shows inactive in 18 of 19 connections, not have a more clearly discernible correlation with or effect on the output of the dis conn and dis chs commands? Clearly only one thread from the amqrmppa process associated with the MQ client channel 'ABC.client.chl' is doing any work getting messages from ABC.Q. This is not clear from looking at the LSTMSGTI values ... they don't seem to be updated as expected.
2.) How can one kill just the inactive threads (if one can) and why is this not an easier task ... client channel state analysis and admin seems more difficult than it should be.
I think if the thread number in qstatus is the same as the TID in the dis conn output, maybe that combo could be used to do a stop conn(*) where(tid eq 123) ... I don't know, I'd have to test.
As I say more difficult than need be. |
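The untested idea above, matching the thread shown in qstatus against the TID in dis conn, can at least be sketched offline. The Python below assumes the handle and connection records have already been parsed into dicts keyed by attribute name (HSTATE, PID, TID, CONN). The function name `stop_commands_for_inactive_handles` and the sample values are hypothetical, and whether PID/TID really identify the same connection in both displays is precisely the assumption that would need testing.

```python
from collections import defaultdict


def stop_commands_for_inactive_handles(handle_records, conn_records):
    """Return STOP CONN commands for connections whose queue handle is inactive.

    Matching is done on (PID, TID), which is an unverified assumption.
    """
    conns_by_thread = defaultdict(list)
    for conn in conn_records:
        conns_by_thread[(conn.get("PID"), conn.get("TID"))].append(conn.get("CONN"))

    commands = []
    for handle in handle_records:
        if handle.get("HSTATE") != "INACTIVE":
            continue
        for conn_id in conns_by_thread.get((handle.get("PID"), handle.get("TID")), []):
            commands.append(f"STOP CONN({conn_id})")
    return commands


if __name__ == "__main__":
    # Invented records for illustration only.
    handles = [
        {"HSTATE": "ACTIVE", "PID": "100", "TID": "1"},
        {"HSTATE": "INACTIVE", "PID": "100", "TID": "2"},
    ]
    conns = [
        {"CONN": "AAAA", "PID": "100", "TID": "1"},
        {"CONN": "BBBB", "PID": "100", "TID": "2"},
    ]
    for command in stop_commands_for_inactive_handles(handles, conns):
        print(command)
```

An inactive handle may simply belong to a consumer that is idle at that moment, so the generated commands are better reviewed by hand than piped straight into runmqsc.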
|
|
 |
|