Author |
Message
|
smdavies99 |
Posted: Thu Aug 01, 2013 12:23 am Post subject: TCPClientOutput and Retries |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
Version 7.0.0.4 on Windows Server 2008 R2
Flow is basically
MQInput---->TCPClientOutput
With the output host/port set to localhost:5566 and the backout threshold on the input queue set to 10, I can see (via user trace) that the connection is attempted 10 times. Each time the error is Connection Refused. At the end of the retries the message is sent to the BOQ.
However, when using localhost, the node does not wait for the timeout period (10 secs) configured on the node.
Changing the output host/port to, say, 192.168.98.23:5566 (the network exists but the address does not) results in the connection attempt timing out, BUT the backout count loop is not performed. After one failure, the input message ends up on the BOQ.
Any thoughts on the following most welcome:-
1) Why the different behaviour depending upon the address used?
2) Why does using a remote address cause the bypassing of the BOQ retry mechanism? _________________ WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995
Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions. |
|
lancelotlinc |
Posted: Thu Aug 01, 2013 3:40 am Post subject: |
|
|
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
|
One thing to check is how your specific laptop/Windows system resolves each of the two addresses. For example, on Linux, 127.0.0.1 would be in the hosts file but another IP would not. _________________ http://leanpub.com/IIB_Tips_and_Tricks
Save $20: Coupon Code: MQSERIES_READER |
|
smdavies99 |
Posted: Thu Aug 01, 2013 4:49 am Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
I think I have solved the problem by adding a Compute node connected to the Failure terminal of the TCPClientOutput node and then doing the Throw from there. There is a bit of logic that controls the error number, but on the whole it seems to work fine (famous last words methinks...) |
|
smdavies99 |
Posted: Fri Aug 02, 2013 12:47 am Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
The next issue I'm seeing is when I try to override the Timeout value that is set on the node.
I set the node value to 3 (seconds) for testing.
Then I experimented and tried to override it.
Here is the LocalEnvironment from a trace node just before the TCPClientOutput node:
Code: |
11:37:12.862032 4156 UserTrace BIP4060I: Data ''===Prior to TCPClient Output node
( ['MQROOT' : 0xae5fcc0]
(0x01000000:Name):Destination = (
(0x01000000:Name):TCPIP = (
(0x01000000:Name):Output = (
(0x03000000:NameValue):Timeout = 120 (INTEGER)
)
)
)
)
|
However, when the connection (which in this case was working) was terminated by shutting down the receiver, the flow seems to take only 1 second to detect this and return an error.
Code: |
2013-08-02 11:37:10.841419 4156 UserTrace BIP4067I: Message propagated to output terminal for trace node 'APPSRV1_XMIT.Trace1'.
The trace node 'APPSRV1_XMIT.Trace1' has received a message and is propagating it to any nodes connected to its output terminal.
No user action required.
2013-08-02 11:37:11.843984 4156 UserTrace BIP2231E: Error detected whilst processing a message in node 'APPSRV1_XMIT.Send_Data_TO_ABC_Server'.
The message broker detected an error whilst processing a message in node 'APPSRV1_XMIT.Send_Data_TO_ABC_Server'. The message has been augmented with an exception list and has been propagated to the node's failure terminal for further processing.
See the following messages for details of the error.
2013-08-02 11:37:11.844013 4156 RecoverableException BIP3586E: Failed to create a client connection using hostname: ''192.168.98.217'', port: ''7778''. Reason: ''Connection refused: no further information''.
The connection to the remote computer failed.
|
I had hoped that a combination of a high Backout Count on the input queue AND a high timeout might allow me to set up the flow so that, in effect, it could handle a network outage of, say, 10 minutes without problems. It looks like I may have to go back to the drawing board
UNLESS someone reading this knows different. |
|
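A rough way to size that plan: the number of failed attempts needed is the outage length divided by how long each failed attempt takes. This sketch assumes the message is redelivered immediately after each rollback, with no delay in between, which is an assumption and not something the thread confirms. `attempts_needed` is a hypothetical helper; the figures are the 10-minute outage, the roughly 1-second failure and the 120-second override mentioned in the post.

```python
import math


def attempts_needed(outage_seconds: float, seconds_per_attempt: float) -> int:
    """Failed attempts required before the retries span the whole outage."""
    if seconds_per_attempt <= 0:
        raise ValueError("seconds_per_attempt must be positive")
    return math.ceil(outage_seconds / seconds_per_attempt)


# 10-minute outage, each failed attempt returning after about 1 second
print(attempts_needed(600, 1))    # 600
# Same outage if each attempt really waited the full 120-second override
print(attempts_needed(600, 120))  # 5
```

With fast failures the threshold would have to be very large, so the time each attempt actually takes matters more than the threshold itself.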
dogorsy |
Posted: Fri Aug 02, 2013 12:59 am Post subject: |
|
|
Knight
Joined: 13 Mar 2013 Posts: 553 Location: Home Office
|
Reason: ''Connection refused: no further information''
If the connection is refused (for example, because nothing is listening on that port), there is no need to wait for the timeout period. So you need to work out why the connection was refused.
The node times out only when no response is received within the specified period. If a response is received, whether good or bad, the timeout is not activated. |
|
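The refused-versus-timed-out distinction can be reproduced outside the broker. What follows is a minimal Python sketch, not the TCPClientOutput node's implementation: `free_local_port` and `probe_connect` are hypothetical helpers, and the port is one that was just released, so (by assumption) nothing is listening on it. A refused connection returns as soon as the OS receives the refusal, well before the timeout expires.

```python
import socket
import time
from typing import Tuple


def free_local_port() -> int:
    """Bind to port 0 so the OS picks a free port, then release it again."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def probe_connect(host: str, port: int, timeout: float) -> Tuple[str, float]:
    """Try one TCP connect and return (outcome, elapsed seconds)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"
    except ConnectionRefusedError:
        # The peer answered, but with a refusal: no need to wait for the timeout.
        outcome = "refused"
    except (socket.timeout, TimeoutError):
        # Nothing answered within the allowed time.
        outcome = "timed out"
    except OSError as exc:
        outcome = f"other error: {exc}"
    return outcome, time.monotonic() - start


if __name__ == "__main__":
    port = free_local_port()
    outcome, elapsed = probe_connect("127.0.0.1", port, timeout=10.0)
    print(f"{outcome} after {elapsed:.2f}s (timeout was 10s)")
```

Pointing `probe_connect` at an address that never answers would be the way to see the "timed out" branch instead.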
smdavies99 |
Posted: Fri Aug 02, 2013 1:16 am Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
You make a good point.
So disabling the network interface on the target system results in this.
Code: |
2013-08-02 12:01:40.730360 244 UserTrace BIP4060I: Data ''===Prior to TCPClient Output node
( ['MQROOT' : 0xae5fcc0]
(0x01000000:Name):Destination = (
(0x01000000:Name):TCPIP = (
(0x01000000:Name):Output = (
(0x03000000:NameValue):Timeout = 120 (INTEGER)
)
)
)
)
'' from trace node 'APPSRVR_XMIT.Trace1'.
The trace node 'APPSRVR_XMIT.Trace1' has output the specified trace data.
This is an information message provided by the message flow designer. The user response will be determined by the local environment.
2013-08-02 12:01:40.730368 244 UserTrace BIP4067I: Message propagated to output terminal for trace node 'APPSRVR_XMIT.Trace1'.
The trace node 'APPSRVR_XMIT.Trace1' has received a message and is propagating it to any nodes connected to its output terminal.
No user action required.
2013-08-02 12:02:01.745792 244 UserTrace BIP2231E: Error detected whilst processing a message in node 'APPSRVR_XMIT.Send_Data_TO_ABC_Server'.
The message broker detected an error whilst processing a message in node 'APPSRVR_XMIT.Send_Data_TO_ABC_Server'. The message has been augmented with an exception list and has been propagated to the node's failure terminal for further processing.
See the following messages for details of the error.
2013-08-02 12:02:01.745814 244 RecoverableException BIP3586E: Failed to create a client connection using hostname: ''192.168.98.217'', port: ''7778''. Reason: ''Connection timed out: no further information''.
The connection to the remote computer failed.
Check that the connection details are correct and the remote system is listening on the correct port.
|
The output node now waits for 21 seconds before throwing an exception.
If anyone knows where this value comes from I'd be very interested. |
|
dogorsy |
Posted: Fri Aug 02, 2013 1:23 am Post subject: |
|
|
Knight
Joined: 13 Mar 2013 Posts: 553 Location: Home Office
|
''Connection timed out: no further information''.
that is better, now it is timing out. But why only after 21 secs ? |
|
smdavies99 |
Posted: Fri Aug 02, 2013 1:29 am Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
dogorsy wrote: |
. But why only after 21 secs ? |
Yep, that's what I would like to know. |
|
dogorsy |
Posted: Fri Aug 02, 2013 2:10 am Post subject: |
|
|
Knight
Joined: 13 Mar 2013 Posts: 553 Location: Home Office
|
smdavies99 wrote: |
dogorsy wrote: |
. But why only after 21 secs ? |
Yep, that's what I would like to know. |
OK, this is only guesswork, but it could well be that the OS is timing out and not the node.
For example, if you ping ''192.168.98.217'' you will get a timeout as the network is disabled, so the node does get a response (which just happens to be 'timed out').
So it would be interesting to try, if you can, enabling the network and calling a service that does not reply, and see whether the node waits the specified timeout period for a response. |
|
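If the OS is the one timing out, one plausible but unconfirmed source of the 21 seconds is TCP SYN retransmission. Assuming a 3-second initial retransmission timeout that doubles on each retry, and two retransmissions before giving up (believed to be the defaults on this generation of Windows, but not verified in this thread), the waits add up to 3 + 6 + 12 = 21 seconds. The function and parameter names below are illustrative only.

```python
def connect_timeout_seconds(initial_rto: float, syn_retransmissions: int) -> float:
    """Total wait before connect() gives up, with the timeout doubling per attempt."""
    attempts = syn_retransmissions + 1  # the original SYN plus each retransmission
    return sum(initial_rto * (2 ** i) for i in range(attempts))


# Assumed defaults (not confirmed in this thread): 3 s initial timeout, 2 retransmissions
print(connect_timeout_seconds(3.0, 2))  # 3 + 6 + 12 = 21.0
```

If this is the cause, the value is a TCP stack setting on the machine running the broker and not something the node's Timeout property controls.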
smdavies99 |
Posted: Fri Aug 02, 2013 2:35 am Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
Yes it is the O/S that is handling this. It seems to be buried deep in the Windows registry somewhere.
Why does MS treat us as if we were idiots by hiding all this stuff away?
Sigh TGIF. Is it beer O'Clock yet? |
|
fatherjack |
Posted: Fri Aug 02, 2013 2:38 am Post subject: |
|
|
 Knight
Joined: 14 Apr 2010 Posts: 522 Location: Craggy Island
|
smdavies99 wrote: |
Is it beer O'Clock yet? |
Any time is beer o'clock. Drink! Drink! Drink! _________________ Never let the facts get in the way of a good theory. |
|
dogorsy |
Posted: Fri Aug 02, 2013 2:40 am Post subject: |
|
|
Knight
Joined: 13 Mar 2013 Posts: 553 Location: Home Office
|
smdavies99 wrote: |
Is it beer O'Clock yet?
 |
The Dolphin is open ! |
|
smdavies99 |
Posted: Fri Aug 02, 2013 2:52 am Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
dogorsy wrote: |
The Dolphin is open ! |
It is a pity that I'm 30 miles away up the A31 |
|
mqjeff |
Posted: Fri Aug 02, 2013 5:21 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
smdavies99 wrote: |
dogorsy wrote: |
The Dolphin is open ! |
It is a pity that I'm 30 miles away up the A31  |
That's a ten minute drive.
At least the way I drive.... |
|
dogorsy |
Posted: Fri Aug 02, 2013 6:06 am Post subject: |
|
|
Knight
Joined: 13 Mar 2013 Posts: 553 Location: Home Office
|
smdavies99 wrote: |
dogorsy wrote: |
The Dolphin is open ! |
It is a pity that I'm 30 miles away up the A31  |
Five pints past 3... |
|