Unexpected Port Issues
vsathyan |
Posted: Mon Dec 29, 2014 9:50 pm Post subject: Unexpected Port Issues |
Centurion
Joined: 10 Mar 2014 Posts: 121
Hi all,
Lately, we have been facing some port issues on our Red Hat Linux 6.5 servers, MQ v7.5.0.2.
Even though the backlog is set to 100, it's maxing out and the application faces connectivity issues to the queue manager. After we restart the listener, it works fine for some time and then the issue resurfaces.
If we set the backlog to 200 in qm.ini and start the queue manager, it maxes out at 201 and we have to restart the listener again.
This is occurring randomly for different applications connecting to MQ queue managers on Linux 6.5 and MQ 7.5.0.2.
Below is the output of the ss -l command on the Linux server.
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 *:isoft-p2p *:*
LISTEN 0 100 *:33101 *:*
LISTEN 0 128 *:55149 *:*
LISTEN 0 100 *:33102 *:*
LISTEN 0 128 *:ssad *:*
LISTEN 0 128 *:33551 *:*
LISTEN 0 100 *:33103 *:*
LISTEN 0 128 *:sunrpc *:*
LISTEN 0 100 *:33104 *:*
LISTEN 0 128 *:8176 *:*
LISTEN 0 100 *:beacon-port *:*
LISTEN 0 2 127.0.0.1:findviatv *:*
LISTEN 0 128 *:ssh *:*
LISTEN 129 128 *:33501 *:*
If you observe the last line, the Recv-Q is 129, which is when the application starts facing the MQRC_HOST_NOT_AVAILABLE issue. After restarting the listener, the Recv-Q drops to 0 and then keeps increasing over a period of time (around 2 hours) until it finally reaches the maximum value and the app disconnects.
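For reference, this is roughly how we are watching the backlog build up; a minimal sketch (port 33501 is just the listener port from the output above):
Code:
# watch the accept backlog (Recv-Q) on the listener port every 5 seconds
watch -n 5 "ss -ltn sport = :33501"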
We checked with the network team, and they say the tuning parameters, keepalive, etc. are fine.
We also tried using a different port; same issue. This is occurring randomly with different applications connecting to MQ on different ports on different servers.
Has anyone faced this kind of issue before?
Thanks in advance. |
fjb_saper |
Posted: Tue Dec 30, 2014 4:15 am Post subject: |
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
What are your values for max channels and max active channels in qm.ini?
_________________
MQ & Broker admin
PaulClarke |
Posted: Tue Dec 30, 2014 6:30 am Post subject: |
 Grand Master
Joined: 17 Nov 2005 Posts: 1002 Location: New Zealand
This seems very odd. There is very little that happens in MQ between receiving a notification from listen() and the TCP accept() call. Sockets should not stay on the queue for long, unless, of course, something is going wrong. I assume that you have checked ALL the MQ error logs? Have you tried running an MQ trace of the listener to see whether there is a blockage/bottleneck of some sort?
What I don't fully understand is that you seem to imply that the first 100 or so applications connecting in work just fine and yet the backlog is still filling. I don't see how both can happen. Either MQ accepts the socket, and therefore processes it, so it is taken off the backlog; or MQ has a problem issuing the accept, which would lead to the build-up you are seeing. However, in that case the application clearly wouldn't work. If you really are seeing this behaviour, is it possible that we are looking at some sort of bug in the Linux stack?
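If you do decide to trace it, something along these lines should capture the listener side (a rough sketch; trace options and file paths may need adjusting for your installation):
Code:
# start a detailed queue manager trace, recreate the backlog build-up, then stop it
strmqtrc -m QMNAME -t detail -t all
# ... reproduce the problem here ...
endmqtrc -m QMNAME
# format the binary trace files so they can be read
dspmqtrc /var/mqm/trace/AMQ*.TRC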
Cheers,
Paul. _________________ Paul Clarke
MQGem Software
www.mqgem.com |
mqjeff |
Posted: Tue Dec 30, 2014 6:33 am Post subject: |
Grand Master
Joined: 25 Jun 2008 Posts: 17447
Are all of the failing applications using the same general pattern for connecting to MQ?
Do you see lots of left-over clntconn or svrconn instances on the qmgr?
What errors are the clients reporting before they start throwing MQRC_HOST_NOT_AVAILABLE?
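A quick way to check the svrconn build-up (a sketch; substitute your own queue manager and channel names):
Code:
# count the current instances of the application's server-connection channel
echo "DIS CHSTATUS('APP.SVRCONN') STATUS CONNAME" | runmqsc QMNAME | grep -c "CHSTATUS("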
fjb_saper |
Posted: Tue Dec 30, 2014 6:34 am Post subject: |
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
Paul,
Could a channel stanza left at default value account for this kind of behavior?
What happens if, on top of a default channel stanza, connecting apps are not well behaved and never close a channel, or disconnect ?  _________________ MQ & Broker admin |
mqjeff |
Posted: Tue Dec 30, 2014 6:41 am Post subject: |
Grand Master
Joined: 25 Jun 2008 Posts: 17447
fjb_saper wrote: |
connecting apps are not well behaved and never close a channel, or disconnect ?  |
It would be easy for a single application to do this and cause problems for every other application connected to the same queue manager, in an apparently random pattern based on when each of the other apps reconnects.
It would be easy for a single application to cause the same random failure to connect by connecting and disconnecting very rapidly (faster than the keepalive), causing sockets to build up on the listener.
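From the queue manager side you can often spot that second pattern as a pile-up of short-lived sockets against the listener port; a rough sketch (the port number is just a placeholder):
Code:
# a large, fast-growing count of TIME-WAIT sockets on the listener port suggests
# clients that are connecting and dropping faster than the listener can keep up
ss -tan state time-wait "( sport = :33501 )" | wc -l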
PaulClarke |
Posted: Tue Dec 30, 2014 7:04 am Post subject: |
 Grand Master
Joined: 17 Nov 2005 Posts: 1002 Location: New Zealand
The Backlog value is completely different from, say, MaxChannels. On my Windows client machine, Microsoft (bless them) limits the backlog to only 5. However, I can still connect and run thousands of channels into my machine. I can't, however, have more than 5 trying to connect at exactly the same instant.
Cheers,
Paul. _________________ Paul Clarke
MQGem Software
www.mqgem.com |
vsathyan |
Posted: Tue Dec 30, 2014 7:59 am Post subject: |
Centurion
Joined: 10 Mar 2014 Posts: 121
@fjb_saper - Below is an extract from qm.ini:

TCP:
   KeepAlive=Yes
   ListenerBacklog=128

CHANNELS:
   MaxChannels=8192
   MaxActiveChannels=8192
@Jeff
Not all applications use the same pattern. Some are hosted in WebLogic, some are C# clients.
Yes, there are a lot of svrconn instances on the queue manager. Though max channels is set to 8192 in qm.ini, maxinst is 512 and maxinstc is 256 on the server-connection channel which the applications use to connect to MQ.
No other errors are reported in the client logs. As soon as the application service is started, the errors begin after around 1 minute for the C# client and after 5 to 8 minutes for the WebLogic application.
@Paul,
We are not leaving the channels stanza at default values.
Also, MQ should be accepting the socket connections before a backlog is built up by the clients.
There are multiple services running on the queue manager, and this may have adverse effects on other running app instances.
We have already opened a PMR today and sent the traces. We are waiting for further updates from them.
Still not sure whether the Linux kernel needs a network patch or whether it is an MQ issue in accepting multiple socket connections on the same port in a very short time.
This is in development, so thankfully it was identified here and not carried through to production.
Thanks.
mqjeff |
Posted: Tue Dec 30, 2014 8:35 am Post subject: |
Grand Master
Joined: 25 Jun 2008 Posts: 17447
Are the svrconn instances all from the same application?
Are they all in running status? |
tczielke |
Posted: Tue Dec 30, 2014 9:03 am Post subject: Re: Unexpected Port Issues |
Guardian
Joined: 08 Jul 2010 Posts: 941 Location: Illinois, USA
vsathyan wrote: |
Lately, we have been facing some port issues on our Red Hat Linux 6.5 servers, MQ v7.5.0.2.
Even though the backlog is set to 100, it's maxing out and the application faces connectivity issues to the queue manager.
[...]
LISTEN 129 128 *:33501 *:*
If you observe the last line, the Recv-Q is 129, which is when the application starts facing the MQRC_HOST_NOT_AVAILABLE issue. |
To clarify, your MQ listener was listening on port 33501 in the example above? That looks like an ephemeral port. I would think it would not be a good idea to assign your MQ listener to an ephemeral port, as that is a dynamic port that could be handed to another application.
Also, please note that your Send-Q (max backlog value) is 128. When you set the backlog value on your listener to 200, did the Send-Q stay at 128? That is what I have observed on Linux: it will not let you go over 128 for the max backlog value on a listening socket, even if you ask for a backlog value greater than 128.
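(If it helps, on Linux that 128 cap usually comes from the net.core.somaxconn kernel setting; a rough sketch of checking and raising it, assuming root access:)
Code:
# show the kernel cap applied to every listen() backlog (128 by default on RHEL 6)
sysctl net.core.somaxconn
# raise the cap, then restart the MQ listener so it re-issues listen() with the larger value
sysctl -w net.core.somaxconn=512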
bruce2359 |
Posted: Tue Dec 30, 2014 9:04 am Post subject: |
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
vsathyan wrote: |
Though the max channels are set to 8192 in qm.ini, the maxinst is 512 and maxinstc is 256 on the server conn channel which is used by the application to connect to MQ. |
Does a single instance of an application really need 256 concurrent connections (maxinstc) to the SVRCONN channel?
If the application is used concurrently with other instances of the same app, do you really need 512 (maxinst) of the same channel?
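You can confirm what the channel is actually defined with using something like this (the queue manager and channel names are placeholders):
Code:
# display the per-channel connection limits on the server-connection channel
echo "DIS CHANNEL('APP.SVRCONN') MAXINST MAXINSTC" | runmqsc QMNAME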
I'm suspecting a run-away application - connecting, but not disconnecting - over and over and over. The best-practice is that once instantiated, an application will connect once, and disconnect only once, at end-of-job. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
vsathyan |
Posted: Tue Dec 30, 2014 11:24 am Post subject: |
Centurion
Joined: 10 Mar 2014 Posts: 121
Thanks tczielke. Your suggestion was informative indeed.
But we faced this issue on port 3501 as well. Some of the other ports which faced this problem: 3101, 3106, 3021, 3026. I referred to Google and read the Wikipedia article on ephemeral ports; the 3000-series ports don't seem to be in that category.
tczielke wrote: |
Also, please note that your Send-Q (max backlog value) is 128. When you set your backlog value on your listener to 200, did the Send-Q stay at 128? This is what I have observed on Linux, that Linux will not let you go over 128 for the max backlog value on a listening socket, even if you ask for a backlog value of > 128. |
This seems interesting. Initially, we had set the backlog to 300, but it maxed at 128. Later we changed it to 128. Yes, what you said is true.
bruce2359 wrote: |
Does a single instance of an application really need 256 concurrent connections (maxinstc) to the SVRCONN channel? |
Hi Bruce, as mentioned earlier, this is in WebLogic, and the particular application service has around 216 queues. Each service opening a queue connects as a separate instance, and hence maxinstc is set to 256.
Thank you. |
tczielke |
Posted: Tue Dec 30, 2014 12:21 pm Post subject: |
Guardian
Joined: 08 Jul 2010 Posts: 941 Location: Illinois, USA
It does sound like your listener is being overrun by too many TCP connections: it can't keep up, and eventually you hit the maximum number of outstanding established connections on the MQ listener's backlog queue. (As a side note, my understanding is that Linux keeps two backlog queues for a passive, or listening, socket: the SYN queue and the established-connection queue. With ss you are viewing the established-connection backlog queue, which is probably the only one you care about anyway.)
If it were me, I would check things like the following:
1. Run an Application Activity Trace and see if there is an application doing an excessive number of MQCONNs (see the sketch after this list). The amqsactz program can be helpful in quickly summarizing this data for you. If so, get the app team to fix this! As bruce2359 mentioned, your applications should connect sparingly to the queue manager.
2. Do you have a high-connecting MQ application that has a more favorable nice (dispatch priority) value than the MQ listener process? If so, consider changing the MQ queue manager to dispatch at the same or a lower nice value as the application (the less nice you are, the more dispatch priority you get).
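For item 1, a rough sketch of turning on activity trace and checking the listener's priority (the queue manager name is a placeholder, and the paths assume a default Linux installation):
Code:
# enable application activity trace on the queue manager (MQ 7.5 and later)
echo "ALTER QMGR ACTVTRC(ON)" | runmqsc QMNAME
# format the records written to SYSTEM.ADMIN.TRACE.ACTIVITY.QUEUE with the sample formatter
/opt/mqm/samp/bin/amqsact -m QMNAME -v
# compare the nice value of the MQ listener with that of the busiest applications
ps -C runmqlsr -o pid,ni,comm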
fjb_saper |
Posted: Tue Dec 30, 2014 12:31 pm Post subject: |
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
vsathyan wrote: |
Hi Bruce, as mentioned earlier, this is in weblogic, and the particular application service have around 216 queues. Each service opening the queue connects as a separate instance, and hence, the number of maxinstc is set to 256.
Thank you. |
If your WebLogic app really reads from 216 queues, I do hope that you don't have 216 MDBs running on it. That would mean, to me, at least 648 connections (assuming a shareconv of 1 and a request/reply pattern)...
_________________
MQ & Broker admin
vsathyan |
Posted: Wed Dec 31, 2014 1:15 pm Post subject: |
Centurion
Joined: 10 Mar 2014 Posts: 121
Update:
We had a working session with one of the C# app teams to determine the cause of this unexpected port issue.
We started off with a completely different queue manager (non-clustered), with only two queues (one inbound and one outbound) and one server-connection channel with MCAUSER set to mqm.
The application connected and continued to run for more than 3 hours.
Next step: on the server-connection channel, we changed the MCAUSER to the user ID intended to be used by the application and restarted the application.
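(In other words, something along these lines; the channel and user names here are placeholders:)
Code:
# point the server-connection channel at the application's own user ID instead of mqm
echo "ALTER CHANNEL('APP.SVRCONN') CHLTYPE(SVRCONN) MCAUSER('appuser')" | runmqsc QMNAME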
The application started facing channel max-out issues immediately after it was started, and the port backlog issue within 1 minute.
The following was found in the queue manager error log:
Code: |
----- amqrmrsa.c : 898 --------------------------------------------------------
12/26/2014 03:35:49 AM - Process(3204.44929) User(mqm) Program(amqrmppa)
Host(mqhostname) Installation(Installation1)
VRMF(7.5.0.2) QMgr(QMGRNAME)
AMQ9999: Channel 'CHANNEL' to host 'apphostname (10.123.117.92)' ended abnormally.
EXPLANATION:
The channel program running under process ID 3204 for channel 'CHANNEL'
ended abnormally. The host name is 'apphostname (10.123.117.92)'; in some
cases the host name cannot be determined and so is shown as '????'.
ACTION:
Look at previous error messages for the channel program in the error logs to
determine the cause of the failure. Note that this message can be excluded
completely or suppressed by tuning the "ExcludeMessage" or "SuppressMessage"
attributes under the "QMErrorLog" stanza in qm.ini. Further information can be
found in the System Administration Guide.
|
The previous error messages contained the same information.
The same application code works fine when connected to a Windows queue manager: it makes exactly 5 connections, no more and no less.
But when pointed at a Linux queue manager, the channel maxed out within seconds of starting the application, and soon the port backlog reached its maximum, resulting in MQRC_HOST_NOT_AVAILABLE.
We then compared the Windows and Linux queues attribute by attribute, and then checked the queue object authorities. The only difference was in the queue object authorities.
Just to check, we set up the same authorities for the queues on Linux as on Windows (+allmqi) and started the application. Perfect. No issues reported. The application worked fine without any channel max-out issues, and even the port backlog issue stopped.
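(Concretely, the check and the fix looked roughly like this; the queue and user names are placeholders:)
Code:
# show what the application's user is currently authorized to do on the queue
dspmqaut -m QMNAME -n APP.QUEUE -t queue -p appuser
# grant the same authorities as on the Windows queue manager
setmqaut -m QMNAME -n APP.QUEUE -t queue -p appuser +allmqi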
I'm still not convinced why a queue authority setup would result in channel max-out and port backlog issues. Trust me, but this is what happened. I've been working with MQ for almost 11 years now, and never expected a queue authority to result in channels or ports maxing out.
The error logs don't say the object is missing permissions.
When we did the same for the WebLogic application, it also just started working fine.
I'm working with the app team tomorrow to do a binary search on the queue object authorities, turning each one off/on, to identify which authority in particular is creating this problem.
I'm updating the status here, as it may be helpful for someone in the future who faces similar issues.
Thank you.
Wishing you a happy new year 2015.