Author |
Message
|
fjb_saper |
Posted: Thu May 16, 2019 10:59 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
Vitor wrote: |
fjb_saper wrote: |
15 mins later the connection was working and they still assured me they hadn't done a thing....  |
Isn't it weird how network problems go away shortly after you manage to get the network people involved (often at gunpoint) yet not one of them has ever done a contact admin thing to fix them? |
It's not that they never do anything to fix it. They do. But they'd never admit to having done it (because then it would have become their fault???).  _________________ MQ & Broker admin |
|
 |
Vitor |
Posted: Thu May 16, 2019 11:13 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
fjb_saper wrote: |
Vitor wrote: |
fjb_saper wrote: |
15 mins later the connection was working and they still assured me they hadn't done a thing....  |
Isn't it weird how network problems go away shortly after you manage to get the network people involved (often at gunpoint) yet not one of them has ever done a contact admin thing to fix them? |
It's not that they never do anything to fix it. They do. But they'd never admit to having done it (because then it would have become their fault???).  |
 _________________ Honesty is the best policy.
Insanity is the best defence. |
|
 |
bruce2359 |
Posted: Thu May 16, 2019 11:55 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
It's not just the network folks. I've been lied to by developers who swore that they made no changes recently to the offending app.
If I had a dollar ... _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
 |
vicks_mq |
Posted: Thu May 16, 2019 2:28 pm Post subject: |
|
|
Disciple
Joined: 03 Oct 2017 Posts: 162
|
hughson wrote: |
vicks_mq wrote: |
#1. I noticed one more behaviour: whenever the channel TIMEOUT error occurs, all the channel instances which started together go down, even though for some of those instances the LSTMSGTI is current. |
vicks_mq wrote: |
#2. Hi Morag, I forgot to mention that the heartbeat flows are going across every 5 mins, and I am seeing the corresponding value of BYTSSENT increase by 28 for all the channel instances which are dropping, but their LSTMSGTI has not changed for the last 30-40 mins. I know HBINT and LSTMSGTI are not related, but I just want to mention that the SVRCONN instances which are dropping are the ones whose LSTMSGTI has not been updated for the last 30-40 mins. |
These two statements from you seem contradictory at first reading. Perhaps there is more information behind them? For example, are applications making more than one connection, and then, when one connection that has not made an API call for 40 minutes (thus LSTMSGTI is 40 minutes ago) is dropped, the application ends all the other connections it made at the same time? Could that be the pattern of your applications? While I still think your networking team should be assisting in the diagnosis here, it would be interesting to understand more about the client end of the application rather than just the set of disparate SVRCONNs that are seen on the queue manager.
Cheers,
Morag |
Hi Morag, I take back my statement in point #1, as that is something I have seen only once or twice.
But the behaviour in point #2 is consistent and happens every time. |
|
 |
fjb_saper |
Posted: Fri May 17, 2019 4:48 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
Can you please show us the qm.ini for the queue manager?
The channel dropping after 40 mins of "inactivity" still smells like a firewall idle timeout to me.  _________________ MQ & Broker admin |
|
 |
vicks_mq |
Posted: Fri May 17, 2019 9:41 am Post subject: |
|
|
Disciple
Joined: 03 Oct 2017 Posts: 162
|
Network team has finally responded saying all the MQ channel instances dropping are because of aged-out, whatever that means. |
|
 |
Vitor |
Posted: Fri May 17, 2019 9:54 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
vicks_mq wrote: |
Network team has finally responded saying all the MQ channel instances dropping are because of aged-out, whatever that means. |
It means either that they've been open longer than the firewall rules allow, or they've been open and idle (according to the firewall definition of "idle") for longer than the firewall rules allow.
Enjoy. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
 |
hughson |
Posted: Fri May 17, 2019 10:46 pm Post subject: |
|
|
 Padawan
Joined: 09 May 2013 Posts: 1959 Location: Bay of Plenty, New Zealand
|
Vitor wrote: |
vicks_mq wrote: |
Network team has finally responded saying all the MQ channel instances dropping are because of aged-out, whatever that means. |
It means either that they've been open longer than the firewall rules allow, or they've been open and idle (according to the firewall definition of "idle") for longer than the firewall rules allow. |
So have you requested that they change the rules to allow longer connections for your particular IP addresses so that they no longer drop?
Cheers,
Morag _________________ Morag Hughson @MoragHughson
IBM MQ Technical Education Specialist
Get your IBM MQ training here!
MQGem Software |
|
 |
vicks_mq |
Posted: Tue May 21, 2019 4:42 am Post subject: |
|
|
Disciple
Joined: 03 Oct 2017 Posts: 162
|
hughson wrote: |
Vitor wrote: |
vicks_mq wrote: |
Network team has finally responded saying all the MQ channel instances dropping are because of aged-out, whatever that means. |
It means either that they've been open longer than the firewall rules allow, or they've been open and idle (according to the firewall definition of "idle") for longer than the firewall rules allow. |
So have you requested that they change the rules to allow longer connections for your particular IP addresses so that they no longer drop?
Cheers,
Morag |
Hi Hughson, no, not yet. There are corporate hurdles to get through before that can be done. I am trying increasing the HBINT and see if it helps. I will share the results here. |
|
 |
hughson |
Posted: Tue May 21, 2019 6:30 am Post subject: |
|
|
 Padawan
Joined: 09 May 2013 Posts: 1959 Location: Bay of Plenty, New Zealand
|
vicks_mq wrote: |
I am trying increasing the HBINT and see if it helps. I will share the results here. |
By increasing, do you mean "increasing the frequency of the heartbeats"? i.e. by making the number in the HBINT value (at both ends of the channel) smaller?
Cheers,
Morag _________________ Morag Hughson @MoragHughson
IBM MQ Technical Education Specialist
Get your IBM MQ training here!
MQGem Software |
|
 |
vicks_mq |
Posted: Wed May 22, 2019 5:48 am Post subject: |
|
|
Disciple
Joined: 03 Oct 2017 Posts: 162
|
hughson wrote: |
vicks_mq wrote: |
I am trying increasing the HBINT and see if it helps. I will share the results here. |
By increasing, do you mean "increasing the frequency of the heartbeats"? i.e. by making the number in the HBINT value (at both ends of the channel) smaller?
Cheers,
Morag |
Hi Morag, I meant increasing the value of HBINT, which means decreasing the frequency of the heartbeats, but that didn't resolve the issue.
Our JBoss team added this to the MQ resource adapter configuration file, and that seems to have helped to rebuild the channel instances:
<transaction-support>XATransaction</transaction-support>
So now, when the channel instances drop after 30 mins, the application is able to create new channel instances automatically, which keeps our IPPROCS count consistent on the local queue. |
|
 |
gbaddeley |
Posted: Sun May 26, 2019 3:52 pm Post subject: |
|
|
 Jedi Knight
Joined: 25 Mar 2003 Posts: 2538 Location: Melbourne, Australia
|
vicks_mq wrote: |
Network team has finally responded saying all the MQ channel instances dropping are because of aged-out, whatever that means. |
Find out the network aged-out interval. Set HBINT on your channels to less than this, e.g. aged-out = 500 sec, set HBINT = 400 sec. This forces MQ to regularly send a heartbeat over the TCP socket session, which resets the network aged-out timer to 0 each time. _________________ Glenn |
|
 |
vicks_mq |
Posted: Mon May 27, 2019 3:42 am Post subject: |
|
|
Disciple
Joined: 03 Oct 2017 Posts: 162
|
gbaddeley wrote: |
vicks_mq wrote: |
Network team has finally responded saying all the MQ channel instances dropping are because of aged-out, whatever that means. |
Find out the network aged-out interval. Set HBINT on your channels to less than this, e.g. aged-out = 500 sec, set HBINT = 400 sec. This forces MQ to regularly send a heartbeat over the TCP socket session, which resets the network aged-out timer to 0 each time. |
Hi, our HBINT is 300 seconds, and these are the parameters defined in the firewall configuration:
TCP Timeout = 1800 sec
TCP Half closed = 120 sec
TCP Time Wait = 15 sec |
|
 |
gbaddeley |
Posted: Mon May 27, 2019 3:47 pm Post subject: |
|
|
 Jedi Knight
Joined: 25 Mar 2003 Posts: 2538 Location: Melbourne, Australia
|
vicks_mq wrote: |
Hi, our HBINT is 300 seconds, and these are the parameters defined in the firewall configuration:
TCP Timeout = 1800 sec
TCP Half closed = 120 sec
TCP Time Wait = 15 sec |
Half Closed and Time Wait apply to socket sessions that are being finalized (closed) by the apps. The TIMEOUT is greater than your HBINT, so the firewall should not be dropping the sessions due to lack of packet traffic. Ask your network team to provide firewall logs for the source and destination IP addresses that demonstrate it dropping a session due to timeout, even though, by your account, MQ heartbeats (HBINT) should be regularly sending packets over the session. _________________ Glenn |
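The arithmetic behind the HBINT advice in this thread can be sketched in a few lines of Python. This is only an illustration of the comparison being discussed, not anything MQ or the firewall provides: the function names and the 80% safety margin are assumptions made up for the example, chosen so that the 500 sec / 400 sec case quoted above falls out of it.

```python
def suggest_hbint(idle_timeout_s: int, margin_percent: int = 80) -> int:
    """Suggest a heartbeat interval (seconds) below a firewall idle timeout.

    Hypothetical helper: the 80% default margin is an assumption for this
    sketch, not an IBM MQ or firewall vendor recommendation.
    """
    if idle_timeout_s <= 1:
        raise ValueError("idle timeout must be greater than 1 second")
    if not 0 < margin_percent < 100:
        raise ValueError("margin_percent must be between 1 and 99")
    # Integer arithmetic avoids floating-point rounding surprises.
    return max(1, idle_timeout_s * margin_percent // 100)


def heartbeat_beats_timeout(hbint_s: int, idle_timeout_s: int) -> bool:
    """True if a heartbeat every hbint_s seconds arrives before the idle timer expires."""
    return hbint_s < idle_timeout_s


# The example from the thread: aged-out = 500 sec -> HBINT = 400 sec.
print(suggest_hbint(500))
# The values reported in the thread: HBINT = 300 sec, TCP timeout = 1800 sec.
print(heartbeat_beats_timeout(300, 1800))
```

With the values reported here the check passes, which is why the firewall logs are the next thing to ask for: on paper, a 300 sec heartbeat should reset an 1800 sec idle timer well before it expires.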
|
 |
|