Where is TCP timeout value set on z/OS
hughson
Posted: Tue Mar 05, 2019 7:47 pm

Grand Master

Joined: 09 May 2013
Posts: 1216
Location: Bay of Plenty, New Zealand

bruce2359 wrote:
zpat wrote:
On a sender channel between two z/OS QMs (QMR1 at v7.1, QMR2 at v8.0) - we see a timeout from TCP.

Backing up a bit... please describe the network configuration.

Are these two qmgrs on the same z/OS instance?

Are these z/OS instances in the same physical z box?

Over what type of channel are they communicating? HiperSockets? CTC? Copper cat 5 or 6? What type of adapters?

What z/OS releases at both ends of the channel?


He did say this:-

zpat wrote:
The two QMs are located in different organisations.

_________________
Morag Hughson @MoragHughson
IBM MQ Technical Education Specialist
Get your IBM MQ training here!
MQGem Software
bruce2359
Posted: Tue Mar 05, 2019 8:34 pm

Poobah

Joined: 05 Jan 2008
Posts: 8460
Location: US: west coast, almost. Otherwise, enroute.

Oops. Missed that. So much for my investment in a speed-reading course.

What are the CHINUT settings at both ends?
_________________
There are two types of people in this world:
1) Those that can extrapolate from incomplete data
zpat
Posted: Wed Mar 06, 2019 12:08 am

Jedi Council

Joined: 19 May 2001
Posts: 5673
Location: UK

As mentioned. They are in different organisations.

Ours is z/OS 2.2. The issue is likely with the external network but we can't prove it.

CHINUT ?
_________________
Well, I don't think there is any question about it. It can only be attributable to human error. This sort of thing has cropped up before, and it has always been due to human error.
fjb_saper
Posted: Wed Mar 06, 2019 3:46 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20091
Location: LI,NY

zpat wrote:
As mentioned. They are in different organisations.

Ours is z/OS 2.2. The issue is likely with the external network but we can't prove it.

CHINUT ?

I figure he meant Channel INIT or CHINIT...
_________________
MQ & Broker admin
bruce2359
Posted: Wed Mar 06, 2019 5:28 am

Poobah

Joined: 05 Jan 2008
Posts: 8460
Location: US: west coast, almost. Otherwise, enroute.

Yes, CHINIT channel initiator address space.
bruce2359
Posted: Wed Mar 06, 2019 12:57 pm

Poobah

Joined: 05 Jan 2008
Posts: 8460
Location: US: west coast, almost. Otherwise, enroute.

What are adapters and dispatchers values?
hughson
Posted: Wed Mar 06, 2019 1:23 pm

Grand Master

Joined: 09 May 2013
Posts: 1216
Location: Bay of Plenty, New Zealand

bruce2359 wrote:
What are adapters and dispatchers values?

Are you thinking that the network slow-down is based on having too few dispatchers?

We've already ruled out commit slow down since NETTIME is seen to increase just before timeout is seen, so I don't think the number of adapters is at fault.
bruce2359
Posted: Wed Mar 06, 2019 5:14 pm

Poobah

Joined: 05 Jan 2008
Posts: 8460
Location: US: west coast, almost. Otherwise, enroute.

hughson wrote:
bruce2359 wrote:
What are adapters and dispatchers values?

Are you thinking that the network slow-down is based on having too few dispatchers?

We've already ruled out commit slow down since NETTIME is seen to increase just before timeout is seen, so I don't think the number of adapters is at fault.

It's a possibility. I've seen small test-system values accidentally percolate into production. Dispatchers face the network. Adapters face inward to support MQI calls.

Generally, 300 msgs/sec is not a very heavy load for z/OS MQ. I'm always suspicious of firewalls.

What else is going on in the entire network at the time of the failure? Is someone FTPing huge files? Streaming video?

EREP reporting anything with NIC cards? RMF reporting anything?
zpat
Posted: Thu Mar 07, 2019 6:18 am

Jedi Council

Joined: 19 May 2001
Posts: 5673
Location: UK

60 dispatchers started.

Some big FTPs on the same adapter but not over the same external network link. No obvious correlation with the time of restart.

No apparent hardware errors, and the timeouts have happened on this channel, which has a CHLDISP of SHARED, on both sides of the QSG, which are on different sites and hardware.
bruce2359
Posted: Thu Mar 07, 2019 7:50 am

Poobah

Joined: 05 Jan 2008
Posts: 8460
Location: US: west coast, almost. Otherwise, enroute.

I'm guessing that your replies re EREP and RMF are about your end of the channel. Do you have access to SYSLOG or a helpful sysprog at the other end?
zpat
Posted: Tue Mar 12, 2019 6:01 am

Jedi Council

Joined: 19 May 2001
Posts: 5673
Location: UK

Other end can't see any issues.

Recently we've been seeing relatively high network latency on this link, without actual timeouts.

Can't seem to find the cause of this latency as seen in the sender channel NETTIME value.

Network guys can't see any issues. But the nettime values are almost 10 times higher than usual.

Could anything in the z/OS TCP stack cause delays? It seems unlikely to me.

Restarting the channel resumed normal latency.


hughson
Posted: Tue Mar 12, 2019 4:25 pm

Grand Master

Joined: 09 May 2013
Posts: 1216
Location: Bay of Plenty, New Zealand

zpat wrote:
Restarting the channel resumed normal latency.


The fact that closing the old socket and making a new one returns the network to normal latency suggests that the socket had perhaps gone into re-transmission mode. Perhaps a router in the network has been having issues.

Just a guess though.
zpat
Posted: Wed Mar 13, 2019 12:01 pm

Jedi Council

Joined: 19 May 2001
Posts: 5673
Location: UK

That's what I have been trying to convince the network team of.

So just to be totally clear. What things can cause NETTIME to increase?

z/OS TCP software layer (causes?)
Internal network (VIPA sysplex adapter)
Our firewall
Virtual Circuit from telecom company
Firewall at 3rd Party
Network inside 3rd Party
z/OS TCP at 3rd Party

Are all these possible?

None of these are MQ itself. Can we rule out z/OS MQ on the two QMs as a cause of latency as measured by NETTIME? Is there any point in taking an MQ trace?

Sorry to be pedantic - but what exactly is NETTIME measuring?

When working normally it is around 30 millisecs, when slow it is consistently up at 250 millisecs which leads to delays as it can't process the messages fast enough.

After a stop/start it has been running fine, but this slowdown has happened occasionally, so it will probably recur unless we can find the root cause.

Thanks.
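
To put the 30 ms versus 250 ms figures in perspective, here is a rough back-of-envelope sketch in Python. It assumes the sender waits for one end-of-batch round trip per batch, that every batch is full, and that time spent actually transmitting message data is ignored. The batch size of 50 is purely an assumption for illustration; the thread does not state the channel's BATCHSZ.

```python
def max_msgs_per_sec(batch_size: int, round_trip_ms: float) -> float:
    """Upper bound on throughput when each batch costs one confirmation round trip."""
    return batch_size / (round_trip_ms / 1000.0)


# Assumption: batch size is not given in the thread; 50 is used for illustration only.
BATCH_SIZE = 50

for nettime_ms in (30, 250):
    bound = max_msgs_per_sec(BATCH_SIZE, nettime_ms)
    print(f"NETTIME {nettime_ms} ms -> at most about {round(bound)} msgs/sec")
```

Under those assumptions the bound drops from well over a thousand messages per second to about 200, which would be below the roughly 300 msgs/sec load mentioned earlier in the thread. The real limit depends on the actual batch size, message sizes and how full each batch is.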
hughson
Posted: Wed Mar 13, 2019 2:19 pm

Grand Master

Joined: 09 May 2013
Posts: 1216
Location: Bay of Plenty, New Zealand

zpat wrote:
What exactly is NETTIME measuring?

An MQ channel, when it is doing a round-trip (end of batch or a heartbeat) remembers the time it sent the "Request for confirmation" flow, and when it gets back the "Acknowledgement" flow, it takes the time again. Inside the "Acknowledgement" flow is an amount of time that the partner end spent doing the MQCMIT (if it was an end of batch), and this value is removed from the time taken to do the round-trip.

So NETTIME is as close to only measuring the time spent in the network as it can be (from the perspective of MQ, the owner of the socket).

Its intent was to give MQ Administrators some ammunition when talking to the network team to point out to them that there was a problem on the network.
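
The calculation described above can be sketched as follows. This is only an illustration of the arithmetic, not MQ's actual code: the `RoundTrip` type, its field names and the clamp to zero are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RoundTrip:
    sent_at_us: int         # time the "Request for confirmation" flow was sent
    ack_at_us: int          # time the "Acknowledgement" flow came back
    partner_commit_us: int  # commit time reported by the partner (0 for a heartbeat)


def net_time_us(rt: RoundTrip) -> int:
    """Round-trip time with the partner's reported commit time removed."""
    elapsed = rt.ack_at_us - rt.sent_at_us
    # Assumption: never report a negative value if the clocks disagree slightly.
    return max(elapsed - rt.partner_commit_us, 0)


# End of batch: 45 ms round trip, of which the partner spent 15 ms in its commit.
batch = RoundTrip(sent_at_us=1_000_000, ack_at_us=1_045_000, partner_commit_us=15_000)
# Heartbeat: no commit at the far end, so nothing is subtracted.
heartbeat = RoundTrip(sent_at_us=2_000_000, ack_at_us=2_030_000, partner_commit_us=0)

print(net_time_us(batch))      # 30000
print(net_time_us(heartbeat))  # 30000
```

Anything the partner does that is not part of the reported commit time stays in the result, which is why the value is only "as close as it can be" to pure network time.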

Cheers,
Morag
zpat
Posted: Thu Mar 14, 2019 1:22 pm

Jedi Council

Joined: 19 May 2001
Posts: 5673
Location: UK

Thanks, looking at the KC, I see this

Quote:

The NETTIME value is the amount of time, displayed in microseconds, taken to send an end of batch request to the remote end of the channel and receive a response minus the time to process the end of batch request. This value can be large for either of the following reasons:

The network is slow.

A slow network can affect the time it takes to complete a batch. The measurements that result in the indicators for the NETTIME field are measured at the end of a batch. However, the first batch affected by a slowdown in the network is not indicated with a change in the NETTIME value because it is measured at the end of the batch.

Requests are queued at the remote end, for example a channel can be retrying a put, or a put request may be slow due to page set I/O. Once any queued requests have completed, the duration of the end of batch request is measured. So if you get a large NETTIME value, check for unusual processing at the remote end.



I am confused by the last paragraph which suggests that MQ processing delays at the remote end are included in NETTIME.
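
Since the quoted text says NETTIME is displayed in microseconds while the figures in this thread are in milliseconds, a small conversion helper may be useful when collecting evidence. The sketch below assumes the status output contains a `NETTIME(short,long)` pair of short-term and long-term indicators. The sample line, the channel name and the "five times the long-term value" threshold are invented for illustration.

```python
import re

# Matches a numeric NETTIME(short,long) pair; blank indicators are not handled.
NETTIME_RE = re.compile(r"NETTIME\(\s*(\d+)\s*,\s*(\d+)\s*\)")


def parse_nettime_ms(status_text: str) -> tuple[float, float]:
    """Return the (short-term, long-term) NETTIME indicators in milliseconds."""
    match = NETTIME_RE.search(status_text)
    if match is None:
        raise ValueError("no numeric NETTIME(short,long) pair found")
    short_us, long_us = (int(group) for group in match.groups())
    return short_us / 1000.0, long_us / 1000.0


# Hypothetical status line; real output will contain many more attributes.
sample = "CHSTATUS(QMR1.TO.QMR2) NETTIME(251300,48200) STATUS(RUNNING)"

short_ms, long_ms = parse_nettime_ms(sample)
print(f"short-term {short_ms} ms, long-term {long_ms} ms")

# Assumed rule of thumb: flag when recent latency is far above the longer-term value.
if short_ms > 5 * long_ms:
    print("recent NETTIME is well above the longer-term average")
```

Logging these values with timestamps over a few days would give the network team a concrete record of when the latency rises and falls.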