TCP/IP errors from AIX to Windows 64
MartynB
Posted: Mon Feb 18, 2008 4:38 am    Post subject: TCP/IP errors from AIX to Windows 64

Novice

Joined: 14 Jan 2008
Posts: 24

Hi,

We are seeing some strange problems when we start a sender channel from a queue manager on AIX to a queue manager on Windows.

The channel starts and a few messages flow, then a TCP/IP error is reported against the Windows receiver channel:

Quote:

Channel program ended abnormally.
Look at previous error messages for channel program in the error files to determine the cause of the failure.


Then...

Quote:

Error on receive from host.
An error occurred receiving data from over TCP/IP. This may be due to a communications failure.
The return code from the TCP/IP (recv) call was 10054 (X'2746'). Record these values and tell the systems administrator.


recv is the sockets API call used to read data from a socket. It is failing with error 10054.

In the C header files this equates to:

Quote:

#define WSAECONNRESET (WSABASEERR+54)
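
As a quick check of that arithmetic, here is a minimal Python sketch. It assumes WSABASEERR is 10000, which is its value in the Winsock headers. The helper name is made up for illustration and is not part of any MQ or Winsock API.

```python
# WSABASEERR is 10000 in the Winsock headers; WSAE* codes are offsets from it.
WSABASEERR = 10000
WSAECONNRESET = WSABASEERR + 54


def format_mq_return_code(code: int) -> str:
    """Render a return code in the decimal-plus-hex style of the MQ error log (hypothetical helper)."""
    return f"{code} (X'{code:X}')"


print(format_mq_return_code(WSAECONNRESET))  # prints: 10054 (X'2746')
```

So the 10054 (X'2746') in the receiver's error log and the WSAECONNRESET define are the same value.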


In the docs this equates to:

Quote:

WSAECONNRESET The virtual circuit was reset by the remote side executing a hard or abortive close. The application should close the socket because it is no longer usable.
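
To make "hard or abortive close" concrete, the following is a rough Python sketch that provokes a reset on a loopback connection. It is not how MQ itself behaves, only an illustration under stated assumptions: one end sets SO_LINGER with a zero timeout so that close() sends a reset, and the other end's recv then fails. The function name is hypothetical.

```python
import socket
import struct
import sys


def provoke_reset() -> str:
    """Open a loopback connection, abort it from one end, report what the other end sees."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname(), timeout=5)
    server_side, _ = listener.accept()

    # struct linger is two ints on Unix-like systems and two unsigned shorts on Windows.
    fmt = "HH" if sys.platform == "win32" else "ii"
    # l_onoff=1, l_linger=0 turns close() into an abortive close (RST instead of FIN).
    server_side.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack(fmt, 1, 0))
    server_side.close()

    try:
        client.recv(1024)
        return "clean close"
    except ConnectionResetError:
        # This is the condition Winsock reports as 10054 (WSAECONNRESET).
        return "connection reset"
    finally:
        client.close()
        listener.close()


print(provoke_reset())
```

The point of the sketch is that the reset is something the receiving end is told about. The side that reports it is not necessarily the side at fault, and a device in between can also generate a reset.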


I'm not an expert on TCP/IP, but to my knowledge this is a definite indication that the problem is on the sender channel on the Unix queue manager. However, I'm not massively skilled with Unix.

We see this in the unix logs...

Quote:

Error Details:
12/02/08 15:41:22 - Process(7696492.1) User(mqm) Program(runmqchl_nd)
AMQ9206: Error sending data to host.
EXPLANATION:
An error occurred sending data over TCP/IP. This
may be due to a communications failure.
ACTION:
The return code from the TCP/IP(write) call was 32 X('20'). Record these values
and tell your systems administrator.


So this looks to me like there is a definite issue with TCP/IP.
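
The return code of 32 in the AMQ9206 message can be looked up in the same way as the Windows one. A small Python sketch follows, on the assumption that the machine running it numbers its errors the same way AIX does. That holds for low errno values like this one on common Unix systems, where 32 is EPIPE.

```python
import errno
import os

AIX_WRITE_RC = 32  # the value reported in the AMQ9206 message above

# errno.errorcode maps numeric values to symbolic names for the local platform.
name = errno.errorcode.get(AIX_WRITE_RC, "unknown")
print(f"{AIX_WRITE_RC} (X'{AIX_WRITE_RC:X}') -> {name}: {os.strerror(AIX_WRITE_RC)}")
```

EPIPE means the write was attempted on a connection whose other end had already gone away. So the AIX side is also reporting that the connection was dropped underneath it, which fits with the reset seen on Windows.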

We have tried:
1). Using a different queue manager on the unix server.
2). Using a different queue manager on a different windows server.
3). Using a different queue manager on a different windows server from a new queue manager on a different unix server.

We experience the same problem each time.

Note that we DO have connectivity from the original Unix server to other Windows queue managers, including ones on 64-bit Windows operating systems.

I believe that all the problematic Windows servers are on the same network segment, and I'm currently trying to test against one which isn't.

Does anyone know what might be causing this issue? Do you concur with my thoughts of this being network related?

I'm banging my head against a wall here so any help would be appreciated to make my headache go away!

Thanks,

Martyn
Mr Butcher
Posted: Mon Feb 18, 2008 5:03 am

Padawan

Joined: 23 May 2005
Posts: 1716

Maybe this helps...

http://www-1.ibm.com/support/docview.wss?rs=171&context=SSFKSJ&uid=swg21237211
_________________
Regards, Butcher
MartynB
Posted: Mon Feb 18, 2008 5:25 am

Novice

Joined: 14 Jan 2008
Posts: 24

Thanks for your input Mr Butcher.
MartynB
Posted: Mon Feb 18, 2008 7:08 am

Novice

Joined: 14 Jan 2008
Posts: 24

We've had someone from networks attempt to look at this issue but we are still struggling.

Despite the fact that the sender channel is TCP type, we can see from a network trace NetBIOS calls from Unix to the Windows server with host unreachable.
From the working link (from the same Unix server) we do not see any NetBIOS calls.

I have looked at the channel and queue manager definitions to determine if there are settings which would result in a NetBIOS call, all to no avail.

As mentioned the same queue manager on unix does communicate successfully to other windows queue managers so I am struggling to determine what the problem is.

Should MQ be making netbios calls over a TCP type sender channel? Is this just a red herring?

Can anyone shed any light?

Any help greatly appreciated.

In the meantime I'll take some more aspirin for my ever worsening headache.

Regards,

Martyn
Vitor
Posted: Mon Feb 18, 2008 7:21 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

MartynB wrote:
Should MQ be making netbios calls over a TCP type sender channel? Is this just a red herring?


I've never seen it do it, but that doesn't mean it doesn't - not spent a lot of time looking at packet sniffers.

MartynB wrote:
Any help greatly appreciated.


Going to first principles, can you telnet from the AIX box to the WMQ port on the Windows box? Does that listener then throw a "Who are you? You're not an MQ channel!" error?

If that doesn't work, attack the network wizard with a sharpened trout until he fixes it.
_________________
Honesty is the best policy.
Insanity is the best defence.
MartynB
Posted: Mon Feb 18, 2008 7:29 am

Novice

Joined: 14 Jan 2008
Posts: 24

Quote:

If that doesn't work, attack the network wizard with a sharpened trout until he fixes it.


Thanks - sharpened trout on order.

Failing that I'll hit him with a wet carp.
MartynB
Posted: Mon Feb 18, 2008 7:37 am

Novice

Joined: 14 Jan 2008
Posts: 24

Can you perhaps shed a bit more light on this technique?

Quote:

Going to first principles, can you telnet from the AIX box to the WMQ port on the Windows box? Does that listener then throw a "Who are you? You're not an MQ channel!" error?


I can open a telnet session from the unix server but what will this prove?
What should I be looking for in the windows or unix logs?

Regards,
MartynB
Posted: Mon Feb 18, 2008 7:39 am

Novice

Joined: 14 Jan 2008
Posts: 24

Failing that, perhaps you can instead expand on the technique of applying pressure utilising a sharpened fish!

jefflowrey
Posted: Mon Feb 18, 2008 7:51 am

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

The point is that if you can't establish a connection to the MQ listener using "telnet hostname port#", then there's no way that MQ itself can establish a connection.
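
If telnet isn't installed, the same check can be scripted. This is a minimal Python sketch of the idea, not an MQ tool: it only proves that a TCP connection to the listener port can be opened, nothing about the channel itself. The host name and port in the example are placeholders. 1414 is the usual default for an MQ listener, but yours may differ.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened, as telnet would."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, unreachable hosts and DNS failures.
        return False


if __name__ == "__main__":
    # Hypothetical values: substitute the Windows server's name and its listener port.
    print(can_connect("localhost", 1414))
```

A False here points at the network, a firewall, or the listener not running. A True only says the port is reachable, so a channel that connects and later drops needs further digging.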

As for the proper techniques of applying sharpened trout, it's really subject to local variations. But do keep your carpal tunnel in mind, and keep your wrist firm.
_________________
I am *not* the model of the modern major general.
Vitor
Posted: Mon Feb 18, 2008 8:41 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

MartynB wrote:
Failing that, perhaps you can instead expand on the technique of applying pressure utilising a sharpened fish!




For those confused about the new WMQ de facto standard for dealing with tricky, people-based issues -The Search Button Is Your Friend.
_________________
Honesty is the best policy.
Insanity is the best defence.
Vitor
Posted: Mon Feb 18, 2008 8:43 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

MartynB wrote:
I can open a telnet session from the unix server but what will this prove?


Not to the default telnet port, but to the one the MQ listener is using.
_________________
Honesty is the best policy.
Insanity is the best defence.
MartynB
Posted: Mon Feb 18, 2008 9:56 am

Novice

Joined: 14 Jan 2008
Posts: 24

Yeah, I appreciate that you meant the listener's port. I just thought (initially) you meant I should then send some special characters to invoke a conversation.

As a further update we've now isolated the problem just between two windows queue managers without involving the unix server. They are both on the same network segment and have only one network hop between them. This introduces the same problems as before only now I ALSO periodically see a TCP/IP send API failing (10054 again) on the queue manager hosting the sender channel.

If I run a queue manager to queue manager link on the same server I don't get any issues at all. This hopefully eliminates MQ (which I installed).

Given that there is no additional network hop for the errant link, I am assuming this means that there can't possibly be any network device that is failing. It's also odd that I can ping between the two servers consistently even when the channel cheekily shuts itself down. This seems to imply there is no intermittent cabling/network problem at all?!?

I've also stuck a network sniffer on both servers and I can't see any failing packets, so we are starting to believe this is something to do with the build of the servers themselves. However, I don't know what.

Is there a way of enabling some form of Windows TCP/IP trace? I don't even think, when the error occurs, that anything is getting down to the wire.

I think I definitely need to bring a wet fish into work tomorrow to slap someone with.
jefflowrey
Posted: Mon Feb 18, 2008 10:18 am

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

Clearly you've isolated the issue to network configuration on Windows.

I'd say the best bet is a TCP/IP timeout: the network stack is timing out (and therefore closing) the connection after too short a time.

Firewall?
_________________
I am *not* the model of the modern major general.
Vitor
Posted: Mon Feb 18, 2008 11:35 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

MartynB wrote:
Is there a way of enabling some form of Windows TCP/IP trace? I don't even think, when the error occurs, that anything is getting down to the wire.


I'm not aware of anything that useful in Windoze, but I'll be corrected on that.

MartynB wrote:
I think I definitely need to bring a wet fish into work tomorrow to slap someone with.


Even if it doesn't help, it's fun. And therapeutic.
_________________
Honesty is the best policy.
Insanity is the best defence.
MartynB
Posted: Tue Feb 19, 2008 9:00 am

Novice

Joined: 14 Jan 2008
Posts: 24

We are still seeing the IP problems from MQ, so as a further test I ran a test tool I wrote to exercise the Windows TCP/IP APIs outside of MQ.

Even running on the same server using two instances of the tool I am getting the 10054 error periodically.
I thought this had eliminated MQ from the equation.

HOWEVER:

There is another application that talks TCP/IP on the server, and while running a Python script against this application the 10054 error NEVER occurred.
There is no "retry" logic in the Python script, nor are there any errors on the responding TCP/IP application.

We have also used a standard network tool: netcat from one errant server to another to check that IP is working correctly and this test does not reproduce the 10054 problem.

I also found that my test application intermittently got the 10054 problem when I talked to the responding TCP/IP application. However, for some reason this does now seem more stable on one of the machines.

The only thing I can think is that my tool and MQ are using sockets in a subtly different way to netcat and the IP application. Having said that, this environment used to work and I'm told there have been no changes.
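
For anyone wanting to try something similar, this is roughly the kind of test I mean, sketched in Python rather than taken from the actual tool. It is an assumption-laden illustration: the function names, payload size and round count are all made up. One end echoes data back, the other sends and receives in a loop and reports the round and error at which the connection first fails.

```python
import socket
import threading
from typing import Tuple


def echo_server(listener: socket.socket) -> None:
    """Accept one connection and echo everything back until the peer closes."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def soak(host: str, port: int, rounds: int, payload: bytes = b"x" * 512) -> Tuple[int, str]:
    """Send payload `rounds` times; return (completed rounds, first error or 'ok')."""
    with socket.create_connection((host, port), timeout=5) as sock:
        for i in range(rounds):
            try:
                sock.sendall(payload)
                received = b""
                while len(received) < len(payload):
                    chunk = sock.recv(len(payload) - len(received))
                    if not chunk:
                        return i, "peer closed the connection"
                    received += chunk
            except OSError as exc:
                # A reset connection surfaces here as ConnectionResetError (an OSError).
                return i, f"{type(exc).__name__}: {exc}"
    return rounds, "ok"


if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # loopback only; port chosen by the OS
    listener.listen(1)
    threading.Thread(target=echo_server, args=(listener,), daemon=True).start()
    print(soak("127.0.0.1", listener.getsockname()[1], rounds=1000))
```

Running the echo end on one server and the soak end on the other, and comparing the result against the same pair run on a known-good machine, is one way to separate "how the sockets are used" from "which box they are used on".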

These servers are running a 64 bit fault tolerant version of MQ supplied specifically for win64 by IBM.
The version is just plain 6.0.0.0. Note also that we have working win64 servers running identical windows service packs using MQ 6.0.0.0.

I've tried installing MQ fix pack 6.0.2.1 but this is complaining about prerequisites. When I try to install the prerequisites, msiexec.exe just sits there in Task Manager doing bugger all (and yes, I've tried the MQPINUSEOK=1 flag).

I'm slowly losing the will to live, so please feel free to respond with any suggestions before I slap myself to death with the wet fish I brought in to work this morning.