Author |
Message
|
vicks_mq |
Posted: Mon May 13, 2019 8:45 am Post subject: |
|
|
Disciple
Joined: 03 Oct 2017 Posts: 162
|
bruce2359 wrote: |
vicks_mq wrote: |
... when we migrated our application to MQ Appliance 9, we saw the channel instances dropping automatically after a period of inactivity (a few hours). We don't want that; we want to keep the channel instances running. |
Ok. But other than that, what other symptoms do you see? Are you/your apps missing service levels (SLAs)?
What is the expected/usual transaction rate? 50,000/second? 5,000/second? 500/second? One or two per hour? |
bruce2359 wrote: |
If there is no activity on the channel for hours, why would you want it to stay open? |
Quote: |
These are Spring-based listeners. Once a channel instance drops, the listeners drop too and are not able to come back up, even during peak incoming traffic, so we don't want the channel instances to drop.
During peak time the rate is around 20 messages per second; in normal time it is only 1 message per second, and most of the time the channel is inactive. |
|
|
|
 |
PaulClarke |
Posted: Mon May 13, 2019 9:26 am Post subject: |
|
|
 Grand Master
Joined: 17 Nov 2005 Posts: 1002 Location: New Zealand
|
I am a little confused. You say you want to keep the channels running and yet on your Appliance you have set a DISCINT of 3600, which will drop the connection after a period of inactivity. Why do you not just set the DISCINT on your appliance to 0? _________________ Paul Clarke
MQGem Software
www.mqgem.com |
|
|
 |
vicks_mq |
Posted: Mon May 13, 2019 11:10 am Post subject: |
|
|
Disciple
Joined: 03 Oct 2017 Posts: 162
|
PaulClarke wrote: |
I am a little confused. You say you want to keep the channels running and yet on your Appliance you have set a DISCINT of 3600, which will drop the connection after a period of inactivity. Why do you not just set the DISCINT on your appliance to 0? |
There are bigger powers in my organization that don't allow this.
One thing I am a little confused about: the DISCINT definition says "This attribute is the length of time after which a channel closes down, if no message arrives during that period." Does this "no message" rule apply to messages only? It has nothing to do with the 28 bytes the channel receives as part of the heartbeat, right? |
|
|
 |
PaulClarke |
Posted: Mon May 13, 2019 11:26 am Post subject: |
|
|
 Grand Master
Joined: 17 Nov 2005 Posts: 1002 Location: New Zealand
|
DISCINT used to apply only to message channels (not CLNTCONN/SVRCONN channels), and sadly much of the documentation could be clearer. That quote you give applies only to Sender/Receiver type channels. Look at the section which explicitly talks about SVRCONN/CLNTCONN channels to see how it might affect a client.
However, if 'bigger powers' prevent you from changing the DISCINT value then I suspect you can never fix the problem of MQ dropping inactive clients. _________________ Paul Clarke
MQGem Software
www.mqgem.com
Last edited by PaulClarke on Mon May 13, 2019 2:59 pm; edited 1 time in total |
|
|
 |
vicks_mq |
Posted: Mon May 13, 2019 11:41 am Post subject: |
|
|
Disciple
Joined: 03 Oct 2017 Posts: 162
|
PaulClarke wrote: |
DISCINT used to apply only to message channels (not CLNTCONN/SVRCONN channels), and sadly much of the documentation could be clearer. That quote you give applies only to Sender/Receiver type channels. Look at the section which explicitly talks about SVRCONN/CLNTCONN channels to see how it might affect a client. |
Code: |
For SVRCONN channels using the TCP protocol, DISCINT has a different interpretation. It is the minimum time in seconds for which the SVRCONN instance remains active without any communication from its partner client. A value of zero disables this disconnect processing. The SVRCONN inactivity interval applies only between IBM WebSphere MQ API calls from a client, so no client is disconnected during an extended MQGET with wait call. This attribute is ignored for SVRCONN channels using protocols other than TCP. |
So whenever the application has no MQI call in progress, the inactivity interval starts counting, and once that interval reaches the DISCINT value the channel instance will be disconnected. |
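The rule in the documentation extract above can be written down as a small toy model. This is only a sketch of the quoted wording, not MQ code: the class name `SvrconnInactivityModel` and its method are hypothetical, and it assumes a TCP SVRCONN channel, since the extract says the attribute is ignored for other protocols.

```python
from dataclasses import dataclass


@dataclass
class SvrconnInactivityModel:
    """Toy model of the quoted SVRCONN DISCINT rule (hypothetical, not MQ code)."""

    discint: int  # seconds; 0 disables disconnect processing

    def eligible_for_disconnect(self, idle_seconds: float,
                                in_mqget_wait: bool = False) -> bool:
        # A value of zero disables this disconnect processing.
        if self.discint == 0:
            return False
        # The interval applies only between MQI calls, so a client sitting
        # in an extended MQGET with wait is not disconnected.
        if in_mqget_wait:
            return False
        # DISCINT is the minimum idle time before the instance may be ended.
        return idle_seconds >= self.discint


if __name__ == "__main__":
    model = SvrconnInactivityModel(discint=3600)
    print(model.eligible_for_disconnect(2400))        # idle for 40 minutes
    print(model.eligible_for_disconnect(4000))        # idle for over an hour
    print(model.eligible_for_disconnect(4000, True))  # waiting inside MQGET
```

Under this reading, a client that is idle for only 40 minutes between MQI calls would not be eligible for disconnect with DISCINT(3600).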
|
|
 |
vicks_mq |
Posted: Mon May 13, 2019 3:55 pm Post subject: |
|
|
Disciple
Joined: 03 Oct 2017 Posts: 162
|
I increased DISCINT to 20 hours, but the channel instances are still dropping, which is not good news, and every time I receive the following errors.
These are the errors in our MQ Appliance logs:
Code: |
---- amqrccca.c : 1131 -------------------------------------------------------
05/13/19 18:57:04 - Process(1400849.37299) User(mqsystem) Program(amqrmppa)
Host(MQA005B12) Installation(MQAppliance)
VRMF(9.1.0.0) QMgr(CHAMAN)
Time(2019-05-13T22:57:04.161Z)
RemoteHost(10.11.122.133)
CommentInsert1(LONDON.TO.PARIS)
CommentInsert2(10.11.122.133)
CommentInsert3(select() [TIMEOUT] 65 seconds)
AMQ9271E: Channel 'LONDON.TO.PARIS' timed out.
EXPLANATION:
A timeout occurred while waiting to receive from the other end of channel
'LONDON.TO.PARIS'. The address of the remote end of the connection was
'10.11.122.133'.
ACTION:
The return code from the select() [TIMEOUT] 65 seconds call was 0 (X'0').
Record these values and tell the systems administrator.
----- amqccita.c : 4779 -------------------------------------------------------
05/13/19 18:57:04 - Process(1400849.37299) User(mqsystem) Program(amqrmppa)
Host(MQA005B12) Installation(MQAppliance)
VRMF(9.1.0.0) QMgr(CHAMAN)
Time(2019-05-13T22:57:04.161Z)
RemoteHost(10.11.122.133)
CommentInsert1(LONDON.TO.PARIS)
CommentInsert2(1400849)
CommentInsert3(10.11.122.133)
AMQ9999E: Channel 'LONDON.TO.PARIS' to host '10.11.122.133' ended abnormally.
EXPLANATION:
The channel program running under process ID 1400849 for channel
'LONDON.TO.PARIS' ended abnormally. The host name is '10.11.122.133'; in some
cases the host name cannot be determined and so is shown as '????'.
ACTION:
Look at previous error messages for the channel program in the error logs to
determine the cause of the failure. Note that this message can be excluded
completely or suppressed by tuning the "ExcludeMessage" or "SuppressMessage"
attributes under the "QMErrorLog" stanza in qm.ini. Further information can be
found in the System Administration Guide.
|
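When many of these entries pile up, it can help to pull out just the timeout records and their timestamps. The following is a minimal sketch, assuming the error log has been copied off the appliance as plain text in the layout shown above; `SAMPLE_LOG` is an abridged copy of the two posted entries, and the function name `find_timeouts` is hypothetical.

```python
import re
from datetime import datetime, timezone

# Abridged copy of the two log entries posted above.
SAMPLE_LOG = """\
---- amqrccca.c : 1131 -------------------------------------------------------
05/13/19 18:57:04 - Process(1400849.37299) User(mqsystem) Program(amqrmppa)
Time(2019-05-13T22:57:04.161Z)
RemoteHost(10.11.122.133)
CommentInsert1(LONDON.TO.PARIS)
CommentInsert3(select() [TIMEOUT] 65 seconds)
AMQ9271E: Channel 'LONDON.TO.PARIS' timed out.
----- amqccita.c : 4779 -------------------------------------------------------
05/13/19 18:57:04 - Process(1400849.37299) User(mqsystem) Program(amqrmppa)
Time(2019-05-13T22:57:04.161Z)
AMQ9999E: Channel 'LONDON.TO.PARIS' to host '10.11.122.133' ended abnormally.
"""

# Entries are delimited by lines such as "---- amqrccca.c : 1131 -----".
SEPARATOR = re.compile(r"^-{4,} \S+ : \d+ -+$", re.MULTILINE)
TIMESTAMP = re.compile(r"Time\((?P<ts>[^)]+)\)")
TIMED_OUT = re.compile(r"AMQ9271E: Channel '(?P<channel>[^']+)' timed out\.")
SELECT_WAIT = re.compile(r"select\(\) \[TIMEOUT\] (?P<secs>\d+) seconds")


def find_timeouts(log_text):
    """Return one dict per AMQ9271E entry: channel, UTC time, wait in seconds."""
    found = []
    for entry in SEPARATOR.split(log_text):
        timed_out = TIMED_OUT.search(entry)
        if not timed_out:
            continue  # not a channel timeout entry
        stamp = TIMESTAMP.search(entry)
        wait = SELECT_WAIT.search(entry)
        when = None
        if stamp:
            when = datetime.strptime(
                stamp.group("ts"), "%Y-%m-%dT%H:%M:%S.%fZ"
            ).replace(tzinfo=timezone.utc)
        found.append({
            "channel": timed_out.group("channel"),
            "when": when,
            "wait_seconds": int(wait.group("secs")) if wait else None,
        })
    return found


if __name__ == "__main__":
    for hit in find_timeouts(SAMPLE_LOG):
        print(hit["channel"], hit["when"], hit["wait_seconds"])
```

Listing the timestamps of every AMQ9271E entry this way makes it easier to see how long each channel instance lived before it timed out.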
|
|
|
 |
hughson |
Posted: Tue May 14, 2019 7:53 am Post subject: |
|
|
 Padawan
Joined: 09 May 2013 Posts: 1959 Location: Bay of Plenty, New Zealand
|
vicks_mq wrote: |
when we migrated our application to MQ Appliance 9, we saw the channel instances dropping automatically after a period of inactivity (a few hours). We don't want that; we want to keep the channel instances running.
DISCINT which is set to 3600 in new MQ appliance (where channel instances are dropping) |
With your previous value of DISCINT(3600) your channel instances were dropping automatically after an hour of inactivity.
With your new value of DISCINT(72000) how long before your channel instances are dropping?
Cheers,
Morag _________________ Morag Hughson @MoragHughson
IBM MQ Technical Education Specialist
Get your IBM MQ training here!
MQGem Software |
|
|
 |
vicks_mq |
Posted: Tue May 14, 2019 9:21 am Post subject: |
|
|
Disciple
Joined: 03 Oct 2017 Posts: 162
|
hughson wrote: |
vicks_mq wrote: |
when we migrated our application to MQ Appliance 9, we saw the channel instances dropping automatically after a period of inactivity (a few hours). We don't want that; we want to keep the channel instances running.
DISCINT which is set to 3600 in new MQ appliance (where channel instances are dropping) |
With your previous value of DISCINT(3600) your channel instances were dropping automatically after an hour of inactivity. (not an hour, it is dropping in around 40 minutes)
With your new value of DISCINT(72000) how long before your channel instances are dropping? (same still dropping in around 40 minutes)
Cheers,
Morag |
|
|
|
 |
gbaddeley |
Posted: Tue May 14, 2019 3:51 pm Post subject: |
|
|
 Jedi Knight
Joined: 25 Mar 2003 Posts: 2538 Location: Melbourne, Australia
|
Quote: |
A timeout occurred while waiting to receive from the other end of channel
'LONDON.TO.PARIS'. The address of the remote end of the connection was
'10.11.122.133'.
ACTION:
The return code from the select() [TIMEOUT] 65 seconds call was 0 (X'0').
Record these values and tell the systems administrator. |
This is not normal. The MQ channel agent (amqrmppa) was waiting to receive a TCP packet from the remote end 10.11.122.133 but it didn't arrive within 65 seconds. The OS TCP Socket function "select()" timed out.
This is outside the processing that occurs for heartbeat, keepalive and discint. Were there any issues or errors recorded at the remote end? _________________ Glenn |
|
|
 |
bruce2359 |
Posted: Tue May 14, 2019 4:10 pm Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9472 Location: US: west coast, almost. Otherwise, enroute.
|
gbaddeley wrote: |
This is outside the processing that occurs for heartbeat, keepalive and discint. |
Sounds like a network/firewall-imposed issue. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
hughson |
Posted: Tue May 14, 2019 10:54 pm Post subject: |
|
|
 Padawan
Joined: 09 May 2013 Posts: 1959 Location: Bay of Plenty, New Zealand
|
vicks_mq wrote: |
hughson wrote: |
vicks_mq wrote: |
when we migrated our application to MQ Appliance 9, we saw the channel instances dropping automatically after a period of inactivity (a few hours). We don't want that; we want to keep the channel instances running.
DISCINT which is set to 3600 in new MQ appliance (where channel instances are dropping) |
With your previous value of DISCINT(3600) your channel instances were dropping automatically after an hour of inactivity. (not an hour, it is dropping in around 40 minutes)
With your new value of DISCINT(72000) how long before your channel instances are dropping? (same still dropping in around 40 minutes)
Cheers,
Morag |
|
Then it is not DISCINT that is dropping the connections. Sorry - I was under the impression that the connections were dropping "after a few hours" before.
This sounds very much like a firewall is dropping the connection after 40 minutes, although not due to inactivity since you know that heartbeat flows are going across. The timeout message you report is going to be the sender of the heartbeat waiting for the answer back from the heartbeat flow and not getting anything because the socket is no longer there. The 65 second timeout suggests this because other receive-wait (select) calls would use the negotiated heartbeat value (plus a bit) which you have told us is 300.
Cheers,
Morag _________________ Morag Hughson @MoragHughson
IBM MQ Technical Education Specialist
Get your IBM MQ training here!
MQGem Software |
|
|
 |
vicks_mq |
Posted: Wed May 15, 2019 2:52 am Post subject: |
|
|
Disciple
Joined: 03 Oct 2017 Posts: 162
|
Thank you all for replying to this.
I will collect more data today and share across. |
|
|
 |
vicks_mq |
Posted: Wed May 15, 2019 4:58 am Post subject: |
|
|
Disciple
Joined: 03 Oct 2017 Posts: 162
|
hughson wrote: |
Then it is not DISCINT that is dropping the connections. Sorry - I was under the impression that the connections were dropping "after a few hours" before.
This sounds very much like a firewall is dropping the connection after 40 minutes, although not due to inactivity since you know that heartbeat flows are going across. The timeout message you report is going to be the sender of the heartbeat waiting for the answer back from the heartbeat flow and not getting anything because the socket is no longer there. The 65 second timeout suggests this because other receive-wait (select) calls would use the negotiated heartbeat value (plus a bit) which you have told us is 300.
Cheers,
Morag |
I have a question: if the firewall is dropping connections after 65 seconds, then all the connections should drop after 65 seconds of inactivity. Why does a connection take 40 minutes to drop, and why is it fairly random after that? Sometimes one connection drops after 40 minutes and the next one 2-3 minutes later, then there is another wait of 35-40 minutes, and so on. |
|
|
 |
bruce2359 |
Posted: Wed May 15, 2019 5:46 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9472 Location: US: west coast, almost. Otherwise, enroute.
|
vicks_mq wrote: |
I have a question: if the firewall is dropping connections after 65 seconds, then all the connections should drop after 65 seconds of inactivity. Why does a connection take 40 minutes to drop, and why is it fairly random after that? Sometimes one connection drops after 40 minutes and the next one 2-3 minutes later, then there is another wait of 35-40 minutes, and so on. |
Questions best brought to your network firewall/router team. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
vicks_mq |
Posted: Wed May 15, 2019 7:00 am Post subject: |
|
|
Disciple
Joined: 03 Oct 2017 Posts: 162
|
bruce2359 wrote: |
vicks_mq wrote: |
I have a question: if the firewall is dropping connections after 65 seconds, then all the connections should drop after 65 seconds of inactivity. Why does a connection take 40 minutes to drop, and why is it fairly random after that? Sometimes one connection drops after 40 minutes and the next one 2-3 minutes later, then there is another wait of 35-40 minutes, and so on. |
Questions best brought to your network firewall/router team. |
I just restarted the connecting application at 9:49 EST. I got the 1st TIMEOUT error at 05/15/19 10:28:08 AM, the 2nd at 05/15/19 10:30:22, the 3rd at 05/15/19 10:33:09 and the 4th at 05/15/19 10:35:22, and there has been no TIMEOUT for the last 25 minutes.
I will check again. |
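The gaps between those timestamps can be worked out with a few lines of Python. This is a rough sketch that assumes the restart time and the log timestamps are on the same clock and that the restart happened at exactly 09:49:00, since the seconds were not given; the helper name `gaps_in_seconds` is hypothetical.

```python
from datetime import datetime

FMT = "%m/%d/%y %H:%M:%S"

# Restart time as reported; the ":00" seconds are an assumption.
restart = datetime.strptime("05/15/19 09:49:00", FMT)

# TIMEOUT timestamps as reported in the post.
timeouts = [
    datetime.strptime(stamp, FMT)
    for stamp in (
        "05/15/19 10:28:08",
        "05/15/19 10:30:22",
        "05/15/19 10:33:09",
        "05/15/19 10:35:22",
    )
]


def gaps_in_seconds(start, events):
    """Seconds from start to the first event, then between consecutive events."""
    gaps = []
    previous = start
    for event in events:
        gaps.append(int((event - previous).total_seconds()))
        previous = event
    return gaps


gaps = gaps_in_seconds(restart, timeouts)

if __name__ == "__main__":
    for when, gap in zip(timeouts, gaps):
        print(f"{when:%H:%M:%S}  +{gap // 60}m{gap % 60:02d}s")
```

On those figures the first timeout comes about 39 minutes after the restart, and the next three follow at intervals of between two and three minutes each.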
|
|
 |
|