MQSeries.net :: View topic - Where is TCP timeout value set on z/OS

Where is TCP timeout value set on z/OS

zpat

Posted: Mon Mar 04, 2019 6:13 am Post subject: Where is TCP timeout value set on z/OS

Jedi Council

Joined: 19 May 2001
Posts: 5867
Location: UK

On a sender channel between two z/OS QMs (QMR1 at v7.1, QMR2 at v8.0) - we see a timeout from TCP.

My question is where is the value for this timeout configured?

Quote:

08.08.27 S0026645 +CSQX259E !QMR1 CSQXRCTL Connection timed out, 016
016 channel QMR1.QMR2
016 connection (nn.nn.nn.nn)
016 (queue manager QMR2)
016 TRPTYPE=TCP

08.08.27 S0026645 +CSQX599E !QMR1 CSQXRCTL Channel QMR1.QMR2 ended abnormally

Also any suggestions as to possible cause would be helpful. The two QMs are located in different organisations.
_________________
Well, I don't think there is any question about it. It can only be attributable to human error. This sort of thing has cropped up before, and it has always been due to human error.

Vitor

Posted: Mon Mar 04, 2019 6:37 am Post subject: Re: Where is TCP timeout value set on z/OS

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

zpat wrote:

My question is where is the value for this timeout configured?

Ask your system programmers. There are a number of ways TCP can be set up in z/OS, from a standard stack to VIPA, and it's going to be both site specific & inaccessible to normal mortals in most RACF configurations.
_________________
Honesty is the best policy.
Insanity is the best defence.

hughson

Posted: Mon Mar 04, 2019 3:24 pm Post subject: Re: Where is TCP timeout value set on z/OS

Padawan

Joined: 09 May 2013
Posts: 1977
Location: Bay of Plenty, New Zealand

zpat wrote:

My question is where is the value for this timeout configured?

As with all IBM MQ error messages (or in fact error messages from any software product) it is always worth having a read about the error in the manual.

Knowledge Center wrote:

CSQX259E: csect-name Connection timed out, channel channel-name connection conn-id (queue manager qmgr-name) TRPTYPE=trptype

Explanation
The connection conn-id timed out. The associated channel is channel-name and the associated remote queue manager is qmgr-name; in some cases the names cannot be determined and so are shown as '????'. trptype shows the communications system used:

TCP TCP/IP
LU62 APPC/MVS

Probable causes are:

A communications failure.
For a message channel, if the Receive Timeout function is being used (as set by the RCVTIME, RCVTTYPE, and RCVTMIN queue manager attributes) and no response was received from the partner within this time.
For an MQI channel, if the Client Idle function is being used (as set by the DISCINT server-connection channel attribute) and the client application did not issue an MQI call within this time.

Severity
8

System action
The channel stops.

System programmer response
For a message channel, check the remote end to see why the time out occurred. Note that, if retry values are set, the remote end will restart automatically. If necessary, set the receive wait time for the queue manager to be higher.

For an MQI channel, check that the client application behaviour is correct. If so, set the disconnect interval for the channel to be higher.

So you are timing out on a TCP receive call, but on a sender channel (as evidenced by the error being reported on queue manager QMR1, the sending end of channel QMR1.QMR2, which is connecting to queue manager QMR2).

You may be thinking that a sender channel shouldn't be in a TCP receive call, but do remember that after the sender channel has sent a batch of messages, it then waits for the receiver channel to acknowledge that batch of messages before starting the next batch. It will be this receive call that has timed out (or possibly waiting for the acknowledgment of a heartbeat flow if your channel is currently idle).

This suggests some slow down on the queue manager QMR2 end of the channel. Any problems there?

Please also issue the following commands to find out the values of the various settings involved.

Code:

DISPLAY QMGR RCVTIME RCVTTYPE RCVTMIN

DISPLAY CHANNEL(QMR1.QMR2) HBINT

Cheers,
Morag
_________________
Morag Hughson @MoragHughson
IBM MQ Technical Education Specialist
Get your IBM MQ training here!
MQGem Software

zpat

Posted: Tue Mar 05, 2019 12:25 am Post subject:

Jedi Council

Joined: 19 May 2001
Posts: 5867
Location: UK

RCVTIME 2
RCVTTYPE MULTIPLY
RCVTMIN 300
HBINT 60

The timeout seems like 5 mins. Can MQ change that?

It's a shared channel on QSG through a VIPA.

I think that the wait time would be 2 x 60, i.e. 120 secs, but the RCVTMIN of 300 secs is applied. So we could reduce that, but it's QM wide.
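To make the arithmetic above explicit, here is a minimal Python sketch of how the receive wait time appears to be derived from the settings shown in this post. The function name `receive_timeout` is made up for illustration. It assumes MULTIPLY and ADD are relative to the negotiated HBINT (which may differ from the value defined locally on the channel) with RCVTMIN acting as a floor, and that EQUAL is an absolute value to which RCVTMIN is not applied. It does not model a negotiated HBINT of zero. Check the ALTER QMGR documentation for your MQ level before relying on it.

```python
def receive_timeout(rcvtime: int, rcvttype: str, rcvtmin: int, hbint: int) -> int:
    """Approximate receive wait time (seconds) for a message channel.

    Illustrative only: assumes RCVTMIN is a floor for the HBINT-relative
    types (MULTIPLY, ADD) and is not applied to EQUAL.
    """
    if hbint <= 0:
        # A negotiated HBINT of zero is deliberately not modelled here.
        raise ValueError("this sketch only handles a positive negotiated HBINT")

    kind = rcvttype.upper()
    if kind == "MULTIPLY":
        relative = rcvtime * hbint   # RCVTIME is a multiplier of HBINT
    elif kind == "ADD":
        relative = rcvtime + hbint   # RCVTIME is seconds added to HBINT
    elif kind == "EQUAL":
        return rcvtime               # absolute value in seconds (assumption: no floor)
    else:
        raise ValueError(f"unknown RCVTTYPE: {rcvttype}")

    # The relative result is raised to RCVTMIN if it falls below it.
    return max(relative, rcvtmin)


# Values from this thread: 2 x 60 = 120, raised to the RCVTMIN of 300.
print(receive_timeout(rcvtime=2, rcvttype="MULTIPLY", rcvtmin=300, hbint=60))  # 300
```

With these values the multiplier has no effect until RCVTIME x HBINT exceeds 300, which is why lowering RCVTMIN (a queue manager wide setting) would be the only way to shorten the wait without changing HBINT.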

Last edited by zpat on Tue Mar 05, 2019 12:35 am; edited 1 time in total

hughson

Posted: Tue Mar 05, 2019 12:33 am Post subject:

Padawan

Joined: 09 May 2013
Posts: 1977
Location: Bay of Plenty, New Zealand

Yes that looks like your receive time out should be 300 seconds.

hughson wrote:

This suggests some slow down on the queue manager QMR2 end of the channel. Any problems there?

You didn't answer this?

Cheers,
Morag

zpat

Posted: Tue Mar 05, 2019 12:36 am Post subject:

Jedi Council

Joined: 19 May 2001
Posts: 5867
Location: UK

They claim not......

hughson

Posted: Tue Mar 05, 2019 12:39 am Post subject:

Padawan

Joined: 09 May 2013
Posts: 1977
Location: Bay of Plenty, New Zealand

zpat wrote:

I think that the wait time would be 2 x 60, ie 120 secs but RCVTMIN of 300 secs is applied. So we could reduce that but it's QM wide.

I wouldn't reduce anything until you figure out what is causing your timeout. Clearly your current timeout is not long enough for whatever is going on.

I would be inclined to turn on MONCHL(HIGH) and take a look at the numbers in NETTIME for your round trips.

Cheers,
Morag

zpat

Posted: Tue Mar 05, 2019 1:03 am Post subject:

Jedi Council

Joined: 19 May 2001
Posts: 5867
Location: UK

Nettime is usually very stable at around 30,000 microseconds.

When these very occasional timeouts occur (roughly once every 2 weeks), NETTIME rises just before the timeout.
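If the NETTIME values are being collected anyway, a rise like this can be flagged automatically. The following is a rough Python sketch, not an MQ facility: the function name `flag_spikes`, the threshold factor of 3 and the sample values are all hypothetical, and it assumes you already have a list of NETTIME readings in microseconds gathered by some means of your own (for example from periodic DISPLAY CHSTATUS output).

```python
from statistics import median


def flag_spikes(samples_us, factor=3.0):
    """Return the indexes of readings more than `factor` times the median.

    The median is used as the baseline so that a few large readings
    do not drag the baseline up with them.
    """
    if not samples_us:
        return []
    baseline = median(samples_us)
    return [i for i, value in enumerate(samples_us) if value > factor * baseline]


# Hypothetical readings: stable around 30,000 microseconds, then rising.
samples = [30_000, 29_500, 31_200, 30_400, 95_000, 240_000]
print(flag_spikes(samples))  # [4, 5]
```

Timestamps for the flagged readings would give the network team something concrete to correlate against their own logs.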

hughson

Posted: Tue Mar 05, 2019 6:53 am Post subject:

Padawan

Joined: 09 May 2013
Posts: 1977
Location: Bay of Plenty, New Zealand

So that suggests a network issue, rather than a problem at queue manager QMR2. What does your network team say about it?

bruce2359

Posted: Tue Mar 05, 2019 7:22 am Post subject:

Poobah

Joined: 05 Jan 2008
Posts: 9489
Location: US: west coast, almost. Otherwise, enroute.

Wait! Wait. I know the answer to this one: "... there are no network problems, and nothing has changed."
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.

zpat

Posted: Tue Mar 05, 2019 11:18 am Post subject:

Jedi Council

Joined: 19 May 2001
Posts: 5867
Location: UK

That's the one.

Apparently no network or firewall issues either side.

The channel is fairly high volume (around 300 messages per sec) and works fine for weeks on end then gets a timeout, causing a service outage.

That's why I am looking into reducing the time it takes to timeout so that the outage is shorter (I know that won't fix the network).

There is only one sender channel. I don't suppose dividing the traffic over two channels would make it any faster or more reliable, although a timeout might then result in only a 50% outage?

Of course there is the other viewpoint, the more channels, the more chance one will get a network glitch (a bit like having more engines on a plane).
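The two viewpoints can be put into numbers with a small Python sketch. Everything here is an assumption for illustration: `p` is a made-up probability that one channel hits a timeout in some period, traffic is split evenly, and channel failures are treated as independent. If both channels share the same network path they would probably fail together, in which case splitting gains nothing.

```python
def outage_profile(p: float, channels: int):
    """Compare outage likelihood and impact when traffic is split evenly.

    p        -- assumed probability that a single channel times out in a period
    channels -- number of channels sharing the traffic (failures assumed independent)
    """
    # Chance that at least one channel times out: the "more engines" effect.
    p_any = 1 - (1 - p) ** channels
    # Share of traffic stalled by one channel timing out.
    fraction_per_outage = 1 / channels
    # Expected share of traffic stalled in a period: n channels, each failing
    # with probability p and each carrying 1/n of the traffic.
    expected_loss = channels * p * fraction_per_outage
    return p_any, fraction_per_outage, expected_loss


for n in (1, 2):
    p_any, share, loss = outage_profile(p=0.05, channels=n)
    print(f"{n} channel(s): P(any timeout)={p_any:.4f}, "
          f"traffic hit per timeout={share:.0%}, expected loss={loss:.4f}")
```

Under these assumptions the expected amount of stalled traffic is the same either way. Two channels give more frequent but smaller outages, so the choice depends on whether a partial outage is easier to live with than a total one.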

bruce2359

Posted: Tue Mar 05, 2019 12:17 pm Post subject: Re: Where is TCP timeout value set on z/OS

Poobah

Joined: 05 Jan 2008
Posts: 9489
Location: US: west coast, almost. Otherwise, enroute.

zpat wrote:

On a sender channel between two z/OS QMs (QMR1 at v7.1, QMR2 at v8.0) - we see a timeout from TCP.

Backing up a bit... please describe the network configuration.

Are these two qmgrs on the same z/OS instance?

Are these z/OS instances in the same physical z box?

Over what type of channel are they communicating? HiperSockets? CTC? Copper cat 5 or 6? What type of adapters?

What z/OS releases at both ends of the channel?

Last edited by bruce2359 on Tue Mar 05, 2019 12:44 pm; edited 1 time in total

bruce2359

Posted: Tue Mar 05, 2019 12:43 pm Post subject:

Poobah

Joined: 05 Jan 2008
Posts: 9489
Location: US: west coast, almost. Otherwise, enroute.

Moved to mainframe forum.
