Author |
Message
|
dudetom |
Posted: Fri Sep 29, 2017 4:08 am Post subject: Server can't see our active listener while connected |
|
|
Apprentice
Joined: 29 Sep 2017 Posts: 45
|
We have IBM MQ v8.0.0.5 running, and we connect from .NET using the IBM.XMS library.
We encounter several problems, but the most important one is the following:
When connected for a long time (more than 1 day), the system engineer at the server side says we aren't listening anymore.
But at the client side, we can confirm that the connection is up because we are getting the following reason codes every 3 minutes:
• MQRC_RECONNECTING_ERRORCODE = 2544
• MQRC_RECONNECTED_ERRORCODE = 2545
This means we are connected, right? So as you can see, we have two problems.
Problem one: We keep getting the reason code 'reconnecting' and 'reconnected'. We don't want this behaviour because our event viewer is full of errors.
Problem two: At client-side we can't receive any messages through the queue anymore once a day has passed (but we keep getting the reason codes 2544 and 2545, which means there is a working connection).
Can someone help me with this? |
|
|
 |
exerk |
Posted: Fri Sep 29, 2017 5:09 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
If the error codes are being generated after one day I would suspect a firewall. Have you checked for that? _________________ It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys. |
|
|
 |
Vitor |
Posted: Fri Sep 29, 2017 5:17 am Post subject: Re: Server can't see our active listener while connected |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
dudetom wrote: |
This means we are connected, right? |
No.
If you're getting a continuous cycle of reconnecting/reconnected, then a) something is causing the client to need to reconnect and b) the client likely doesn't keep a working connection long enough to read messages.
dudetom wrote: |
Problem one: We keep getting the reason code 'reconnecting' and 'reconnected'. We don't want this behaviour because our event viewer is full of errors. |
2544 is a warning, 2545 is informational. Fix your event handler accordingly.
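As a rough illustration of the point above, an event handler could map the two reason codes to log levels instead of reporting everything as an error. This is only a sketch in Python, not IBM.XMS code: the constant names, `severity_for` and `log_reason` are hypothetical, and defaulting unknown codes to ERROR is an assumption.

```python
import logging

# Reason codes quoted in the thread. The severity mapping follows the reply
# above (2544 = warning, 2545 = informational).
MQRC_RECONNECTING = 2544
MQRC_RECONNECTED = 2545

SEVERITY_BY_REASON = {
    MQRC_RECONNECTING: logging.WARNING,
    MQRC_RECONNECTED: logging.INFO,
}


def severity_for(reason_code: int) -> int:
    """Return the logging level for a reason code (hypothetical helper)."""
    # Assumption: unknown codes default to ERROR so real failures stay visible.
    return SEVERITY_BY_REASON.get(reason_code, logging.ERROR)


def log_reason(logger: logging.Logger, reason_code: int) -> None:
    """Log a reason code at the level chosen by severity_for."""
    logger.log(severity_for(reason_code), "MQ reason code %d", reason_code)
```

Logging at the right level only cleans up the event viewer. It does not explain why the client needs to reconnect every few minutes.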
dudetom wrote: |
Problem two: At client-side we can't receive any messages through the queue anymore once a day has passed (but we keep getting the reason codes 2544 and 2545, which means there is a working connection). |
See above. I would theorize (given that it fails after a day) that your connection is being flagged as stale by a firewall or some other network component & being terminated. This would also explain the churning of reconnection (because the client reconnects and the firewall kills it again) and why the system engineer on the server doesn't see you listening (because he's the other side of the firewall) _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
dudetom |
Posted: Fri Sep 29, 2017 5:21 am Post subject: |
|
|
Apprentice
Joined: 29 Sep 2017 Posts: 45
|
exerk wrote: |
If the error codes are being generated after one day I would suspect a firewall. Have you checked for that? |
Thanks for your time. The error codes 2544 and 2545 are generated constantly, even on the first day while everything works correctly. After one day we aren't able to receive messages, but we keep getting these error codes.
EDIT: We didn't check the firewalls. I want to be sure it's not our fault before contacting the firewall guys...
Last edited by dudetom on Fri Sep 29, 2017 5:26 am; edited 1 time in total |
|
|
 |
dudetom |
Posted: Fri Sep 29, 2017 5:24 am Post subject: |
|
|
Apprentice
Joined: 29 Sep 2017 Posts: 45
|
And the event viewer errors we are getting are like this:
Error on receive from host .
An error occurred receiving data from over TCP/IP. This may be due to a communications failure.
The return code from the TCP/IP (socket.Receive) call was 10054 (X'2746'). Record these values and tell the systems administrator.
To be clear: on the first day, everything works properly even while these errors are thrown. So I think these are separate problems we are facing.
This is the complete error:
Code: |
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="WebSphere MQ Managed Client" />
    <EventID Qualifiers="0">9208</EventID>
    <Level>2</Level>
    <Task>0</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2017-09-27T14:09:00.000000000Z" />
    <EventRecordID>65571</EventRecordID>
    <Channel>Application</Channel>
    <Computer>SRV01</Computer>
    <Security />
  </System>
  <EventData>
    <Data>Error on receive from host . An error occurred receiving data from over TCP/IP. This may be due to a communications failure. The return code from the TCP/IP (socket.Receive) call was 10054 (X'2746'). Record these values and tell the systems administrator.</Data>
  </EventData>
</Event>
|
And level is 'Error'.
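For anyone collecting many of these events, the fields that matter can be pulled out of the XML with a few lines of Python. This is a minimal sketch: `summarize_event` and `SAMPLE_EVENT` are hypothetical names, and the sample is a shortened copy of the event posted above. Windows socket return code 10054 is WSAECONNRESET (connection reset by peer), and Level 2 is how the Windows event log encodes 'Error'.

```python
import re
import xml.etree.ElementTree as ET

# Namespace taken from the event posted in the thread.
NS = {"ev": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Shortened copy of the posted event, kept only for illustration.
SAMPLE_EVENT = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="WebSphere MQ Managed Client" />
    <EventID Qualifiers="0">9208</EventID>
    <Level>2</Level>
  </System>
  <EventData>
    <Data>The return code from the TCP/IP (socket.Receive) call was 10054 (X'2746').</Data>
  </EventData>
</Event>"""


def summarize_event(xml_text: str) -> dict:
    """Extract event id, level and socket return code (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    event_id = int(root.findtext("ev:System/ev:EventID", namespaces=NS))
    level = int(root.findtext("ev:System/ev:Level", namespaces=NS))
    data = root.findtext("ev:EventData/ev:Data", default="", namespaces=NS)
    # The decimal return code follows the words "call was" in the message text.
    match = re.search(r"call was (\d+)", data)
    socket_rc = int(match.group(1)) if match else None
    return {"event_id": event_id, "level": level, "socket_rc": socket_rc}
```

Calling `summarize_event(SAMPLE_EVENT)` yields the event id 9208, level 2 and socket return code 10054. A reset from the peer is consistent with something between client and server dropping the connection, though it does not prove which component did it.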
Last edited by dudetom on Fri Sep 29, 2017 6:41 am; edited 2 times in total |
|
|
 |
exerk |
Posted: Fri Sep 29, 2017 5:28 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
dudetom wrote: |
And the event viewer errors we are getting are like this:
Error on receive from host .
An error occurred receiving data from over TCP/IP. This may be due to a communications failure.
The return code from the TCP/IP (socket.Receive) call was 10054 (X'2746'). Record these values and tell the systems administrator. |
Google returned THIS, and there are others... _________________ It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys. |
|
|
 |
dudetom |
Posted: Fri Sep 29, 2017 5:36 am Post subject: |
|
|
Apprentice
Joined: 29 Sep 2017 Posts: 45
|
exerk wrote: |
dudetom wrote: |
And the event viewer errors we are getting are like this:
Error on receive from host .
An error occurred receiving data from over TCP/IP. This may be due to a communications failure.
The return code from the TCP/IP (socket.Receive) call was 10054 (X'2746'). Record these values and tell the systems administrator. |
Google returned THIS, and there are others... |
This is actually not the main problem. We know that we can solve this problem by using KeepAlive in the mqclient.ini file.
The other problem is more important actually (listener that stops working after one day). So it's most probably a firewall issue? |
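For readers unfamiliar with what a KeepAlive setting does underneath, here is a small Python sketch of TCP keepalive at the socket level. It assumes the mqclient.ini setting ends up enabling the operating system's keepalive probes on the client socket. It is not the MQ client's implementation, and `make_keepalive_socket` is a hypothetical name.

```python
import socket


def make_keepalive_socket() -> socket.socket:
    """Create a TCP socket with OS-level keepalive probes enabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_KEEPALIVE asks the OS to send periodic probes on an idle connection,
    # so intermediate devices keep seeing traffic on it.
    # Assumption: in mqclient.ini the matching setting is KeepAlive=YES
    # in the TCP stanza.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock
```

How often the probes are sent is decided by operating system settings, so keepalive only helps if that interval is shorter than the idle timeout of whatever sits between client and server.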
|
|
 |
dudetom |
Posted: Fri Sep 29, 2017 5:41 am Post subject: Re: Server can't see our active listener while connected |
|
|
Apprentice
Joined: 29 Sep 2017 Posts: 45
|
Vitor wrote: |
This would also explain the churning of reconnection (because the client reconnects and the firewall kills it again) and why the system engineer on the server doesn't see you listening (because he's the other side of the firewall) |
But we receive those error codes from the IBM server. So the system engineer must at least see those messages being sent to the client?
On which side do you think the firewall shuts down the connection after one day? It could be both sides, I think?
Can I maybe see somewhere (website url?) the PORTS and PROTOCOLS that have to be allowed on the firewalls? Then I can ask the firewall guys to check the connection on those ports if it gets shut down |
|
|
 |
Vitor |
Posted: Fri Sep 29, 2017 5:55 am Post subject: Re: Server can't see our active listener while connected |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
dudetom wrote: |
Vitor wrote: |
This would also explain the churning of reconnection (because the client reconnects and the firewall kills it again) and why the system engineer on the server doesn't see you listening (because he's the other side of the firewall) |
But we receive those error codes from the IBM server. So the system engineer must at least see those messages being sent to the client? |
Why do you think the server is sending those messages? How could the server send a 2544 message when, by definition, the connection has failed and is being reestablished? Those codes come from the client.
dudetom wrote: |
On which side do you think the firewall shuts down the connection after one day? It could be both sides, I think? |
What do you mean by "both sides" of the firewall? Firewalls don't take sides, they handle connections. And, as my worthy associate & I have pointed out, that 1 day time frame indicates a firewall thinks your connection has expired. Or should have. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
Vitor |
Posted: Fri Sep 29, 2017 5:59 am Post subject: Re: Server can't see our active listener while connected |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
dudetom wrote: |
Can I maybe see somewhere (website url?) the PORTS and PROTOCOLS that have to be allowed on the firewalls? |
Yes, there's bound to be some kind of administrative website that shows the ports and protocols the firewall is allowing that you can look at. It would be a shame if all the hackers trying to get into the network the firewall is protecting had to guess what the firewall would allow; that would really slow down their attacks......
dudetom wrote: |
Then I can ask the firewall guys to check the connection on those ports if it gets shut down |
Just ask the firewall people. Who will undoubtedly look on the kind of website you describe because they have access to it. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
fjb_saper |
Posted: Fri Sep 29, 2017 6:05 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
Don't just ask your firewall guys, also ask your network engineers.
The previously shown TCP error is one we had when there was a routing problem and the connection could not succeed because of it...
In case of a multi-instance qmgr make sure the connection will work from / to both of its hosts...  _________________ MQ & Broker admin |
|
|
 |
dudetom |
Posted: Fri Sep 29, 2017 6:39 am Post subject: Re: Server can't see our active listener while connected |
|
|
Apprentice
Joined: 29 Sep 2017 Posts: 45
|
Vitor wrote: |
Why do you think the server is sending those messages? How could the server send a 2544 message when, by definition, the connection has failed and is being reestablished? Those codes come from the client. |
Ok, if it comes from the client (me) and the client gives 2545 (reconnected). Reconnected with what? I suppose it reconnected to the server? And I ask now, why can't the server side system engineer see that the client is reconnected? |
|
|
 |
Vitor |
Posted: Fri Sep 29, 2017 6:45 am Post subject: Re: Server can't see our active listener while connected |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
dudetom wrote: |
Ok, if it comes from the client (me) and the client gives 2545 (reconnected). Reconnected with what? With itself? I suppose it reconnected to the server? |
No, it reconnected with whatever it sees at the other end of the network connection, i.e. it's got an ACK and thinks it's good.
dudetom wrote: |
And I ask now, why can't the server side system engineer see that the client is reconnected? |
This is further evidence that whatever the client has connected to, it's not the queue manager on the remote server.
But if you think I'm wrong (and I've dragged my worthy associates into the maelstrom of my error) and there's no network issue, raise a PMR with IBM to get MQ fixed. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
mqjeff |
Posted: Fri Sep 29, 2017 7:29 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
telnet to the host and port.
The admin should see errors complaining about bad messages.
If you can't even connect, then the listener is down.
The admin should troubleshoot that. _________________ chmod -R ugo-wx / |
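Where telnet is not installed, the same reachability check can be approximated with a short Python probe. This is a sketch only: `probe` is a hypothetical helper, and the host and port passed to it must be the real listener address, which the thread does not give.

```python
import socket


def probe(host: str, port: int, timeout: float = 5.0) -> str:
    """Try a plain TCP connect and report what happened (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            # The TCP handshake completed. This shows that something accepted
            # the connection, not that it is the queue manager's listener.
            return "connected"
    except ConnectionRefusedError:
        # Nothing is listening on that port, or a device actively rejected it.
        return "refused"
    except socket.timeout:
        # No answer at all, which often means packets are dropped on the way.
        return "timeout"
    except OSError as exc:
        # Name resolution failures, unreachable networks and similar errors.
        return f"error: {exc}"
```

For example, `probe("mq.example.com", 1414)` would test a placeholder host on a placeholder port. A "connected" result from the client machine, combined with nothing visible on the server side, would suggest the connection is terminating somewhere in between.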
|
|
 |
Vitor |
Posted: Fri Sep 29, 2017 7:39 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
mqjeff wrote: |
If you can't even connect, then the listener is down. |
Or you're not reaching the listener.
The admin will not be able to resolve telnet returning a "connection refused" error. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
|