Heartbeat Interval vs. AdoptMCA
PeterPotkay
Posted: Sun Aug 11, 2002 7:12 pm

Poobah

Joined: 15 May 2001
Posts: 7716

AdoptMCA allows a QM to accept a request to start a new receiver channel even if it already has a "running" channel by the same name.

For instance, suppose receiver channel QM1.QM2 is just sitting there in running status because a network failure kept it from ever getting the command from the sender that the discinterval time had passed. If the QM1.QM2 sender subsequently gets more messages to send after the network is back up, QM2 will now accept a second channel by that same name.

My question is: why would the original QM1.QM2 receiver channel ever be in that permanent running state after a network failure if it was using Heartbeats? If the heartbeat interval was, say, 1 minute, wouldn't the QM1.QM2 receiver realize that it's not getting any more heartbeats and after a minute (actually a minute + 60 secs) put itself in an inactive state?

Why bother with AdoptMCA?
_________________
Peter Potkay
Keep Calm and MQ On
oz1ccg
Posted: Mon Aug 12, 2002 12:00 am

Yatiri

Joined: 10 Feb 2002
Posts: 628
Location: Denmark

Peter, you're right, it shouldn't be necessary if you also use KeepAlive.

But we're living in a world with a lot of pirates with only one goal, getting our systems down; this could be a network failure, QMGR crash, etc.
I've been using both TCP/IP keepalive and Heartbeat, but I've missed some channels running on the WAN through a lot of firewalls (the problem could be here; we traced a lot in TCP/IP and found some clues, but not the one we were looking for).

What really helped was the introduction of ADOPTMCA, so the channels could reinitiate communications after a problem. After this was implemented we haven't had big problems (we're using triggering on XMITQ with a triggering of 1 minute).
_________________
Regards, Jørgen
Home of BlockIP2, the last free MQ Security exit ver. 3.00
Cert. on WMQ, WBIMB, SWIFT.
mrlinux
Posted: Mon Aug 12, 2002 3:49 am

Grand Master

Joined: 14 Feb 2002
Posts: 1261
Location: Detroit,MI USA

One of the problems is that the receiver code will block waiting for a response from the sender, and if it never comes the code will hang waiting.
_________________
Jeff

IBM Certified Developer MQSeries
IBM Certified Specialist MQSeries
IBM Certified Solutions Expert MQSeries
PeterPotkay
Posted: Mon Aug 12, 2002 6:35 am

Poobah

Joined: 15 May 2001
Posts: 7716

But Jeff, wouldn't the receiver code stop waiting and unblock itself after the heartbeat interval passed?
_________________
Peter Potkay
Keep Calm and MQ On
mrlinux
Posted: Mon Aug 12, 2002 7:34 am

Grand Master

Joined: 14 Feb 2002
Posts: 1261
Location: Detroit,MI USA

No, the code is hung on a function (select(), I think) waiting for a message.
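
To illustrate the kind of hang described here, the following is a minimal Python sketch. It is not MQ's receiver code; the function names and the use of a local socket pair are assumptions made purely for illustration. A select() call issued with no timeout on a connection whose peer has gone silent does not return until something actually arrives.

```python
import select
import socket
import threading


def wait_for_flow(sock, woke):
    """Block with NO timeout until the socket becomes readable."""
    select.select([sock], [], [])  # never returns while the peer is silent
    woke.set()


def receiver_stays_blocked(silence_s=0.2):
    """Return (still_blocked_during_silence, woke_after_data_arrived)."""
    receiver_end, sender_end = socket.socketpair()
    woke = threading.Event()
    waiter = threading.Thread(
        target=wait_for_flow, args=(receiver_end, woke), daemon=True
    )
    waiter.start()

    # The "sender" stays silent, as it would during a network outage.
    still_blocked = not woke.wait(silence_s)

    # Only an incoming flow lets the waiting code continue.
    sender_end.send(b"heartbeat")
    woke_after_flow = woke.wait(2.0)

    waiter.join(2.0)
    receiver_end.close()
    sender_end.close()
    return still_blocked, woke_after_flow


if __name__ == "__main__":
    print(receiver_stays_blocked())
```

In this sketch the waiting thread has no way to notice elapsed time by itself, which is the behaviour being described: nothing runs on the receiving side until a flow arrives.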
_________________
Jeff

IBM Certified Developer MQSeries
IBM Certified Specialist MQSeries
IBM Certified Solutions Expert MQSeries
PeterPotkay
Posted: Mon Aug 12, 2002 8:18 am

Poobah

Joined: 15 May 2001
Posts: 7716

Then what's the point of Heartbeat?

I thought its purpose was to alert channels (both sides) of network failures and to allow them to then go Inactive.
_________________
Peter Potkay
Keep Calm and MQ On
mrlinux
Posted: Mon Aug 12, 2002 8:34 am

Grand Master

Joined: 14 Feb 2002
Posts: 1261
Location: Detroit,MI USA

The Heartbeat Interval gives the RCVR channel the ability to shut down if it has exceeded the disconnect interval.

From the IBM manual:


The heartbeat exchange gives the receiving MCA the opportunity to quiesce the channel.

Note:
You should set this value to be significantly less than the value of DISCINT. WebSphere MQ checks only that it is within the permitted range however.

HBINT(integer)
This parameter has a different interpretation depending upon the channel type, as follows:
For channels with a channel type (CHLTYPE) of SDR, SVR, RCVR, RQSTR, CLUSSDR, or CLUSRCVR, this is the time, in seconds, between heartbeat flows passed from the sending MCA when there are no messages on the transmission queue. The heartbeat exchange gives the receiving MCA the opportunity to quiesce the channel. This type of heartbeat is valid only on AIX, Compaq OpenVMS, HP-UX, Linux, OS/2 Warp, OS/400, Solaris, Windows, and z/OS.

Note:
You should set this value to be significantly less than the value of DISCINT. WebSphere MQ checks only that it is within the permitted range however.
For channels with a channel type (CHLTYPE) of SVRCONN or CLNTCONN, this is the time, in seconds, between heartbeat flows passed from the server MCA when that MCA has issued an MQGET with WAIT on behalf of a client application. This allows the server to handle situations where the client connection fails during an MQGET with WAIT. This type of heartbeat is valid only for AIX, Compaq OpenVMS, HP-UX, Linux, OS/2 Warp, OS/400, Solaris, and Windows.

The value must be in the range zero through 999 999. A value of zero means that no heartbeat exchange takes place. The value that is used is the larger of the values specified at the sending side and the receiving side.
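
As a small worked example of the last paragraph of the quote, here is a Python sketch of the "larger of the two values" rule and the permitted range. The function name is hypothetical, and the sketch applies only what the quoted text states; how a zero on just one side is treated is not covered by the quote and is not modelled here.

```python
HBINT_MIN = 0
HBINT_MAX = 999_999  # upper bound of the range given in the quoted text


def effective_hbint(sender_hbint, receiver_hbint):
    """Return the heartbeat interval (seconds) per the 'larger value' rule."""
    for value in (sender_hbint, receiver_hbint):
        if not HBINT_MIN <= value <= HBINT_MAX:
            raise ValueError(f"HBINT {value} is outside 0..999999")
    # The quoted manual text: the larger of the two sides' values is used.
    return max(sender_hbint, receiver_hbint)


if __name__ == "__main__":
    print(effective_hbint(60, 300))
```

So a sender defined with HBINT(60) talking to a receiver defined with HBINT(300) would, by this rule, exchange heartbeats every 300 seconds, which matters when comparing the result against DISCINT.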


KAINT(integer)
_________________
Jeff

IBM Certified Developer MQSeries
IBM Certified Specialist MQSeries
IBM Certified Solutions Expert MQSeries
PeterPotkay
Posted: Mon Aug 12, 2002 10:46 am

Poobah

Joined: 15 May 2001
Posts: 7716

OK, my question still stands: wouldn't Heartbeat solve the problem here, and if so, is there any benefit to using AdoptMCA if you already have Heartbeat set at a reasonable value?

Heartbeat will unblock a receiver channel if the network goes down, getting it to a point (inactive) where it will accept a new connection attempt.
_________________
Peter Potkay
Keep Calm and MQ On
mrlinux
Posted: Mon Aug 12, 2002 11:59 am

Grand Master

Joined: 14 Feb 2002
Posts: 1261
Location: Detroit,MI USA

If there is a network error which lasts longer than the discint of the sender (or retries are exhausted, or someone issues a stop channel), the sender channel will terminate the socket connection. From then on, any attempt to start the sender channel is trying to create a new connection, while the rcvr thinks it still has a valid connection.
_________________
Jeff

IBM Certified Developer MQSeries
IBM Certified Specialist MQSeries
IBM Certified Solutions Expert MQSeries
PeterPotkay
Posted: Mon Aug 12, 2002 12:05 pm

Poobah

Joined: 15 May 2001
Posts: 7716

Light bulb just went off.

If the sender tries to re-establish communication BEFORE the Heartbeat interval passes, then AdoptMCA would kick in and the connection would be made.

If the sender tries to re-establish communication AFTER the Heartbeat interval passes, then the receiver is already in an Inactive state because of Heartbeat, and AdoptMCA doesn't matter.

The larger your Heartbeat number, the more important AdoptMCA is for timely reconnections. I say timely because even without AdoptMCA, sooner or later the Heartbeat interval would pass, and since the sender is retrying, it would eventually catch the receiver in the inactive state.

Correct?
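
The reasoning in this post can be written down as a tiny Python decision function. This is only a sketch of the hypothesis as stated here, with hypothetical names and result strings; it assumes the receiver really does go Inactive by itself after some timeout, which is exactly the point under discussion in this thread.

```python
def reconnect_outcome(retry_at_s, receiver_timeout_s, adopt_mca):
    """Model the post's hypothesis for one sender reconnect attempt.

    retry_at_s         -- seconds after the outage when the sender retries
    receiver_timeout_s -- seconds after which the receiver is ASSUMED to
                          have gone Inactive by itself
    adopt_mca          -- whether AdoptMCA is enabled on the receiving QM
    """
    if retry_at_s >= receiver_timeout_s:
        # Receiver is assumed to be Inactive already; AdoptMCA is irrelevant.
        return "connected: receiver already inactive"
    if adopt_mca:
        # Receiver still shows running, but the new request is accepted.
        return "connected: running receiver adopted"
    return "refused: receiver still running, sender retries later"


if __name__ == "__main__":
    print(reconnect_outcome(30, 120, adopt_mca=True))
    print(reconnect_outcome(30, 120, adopt_mca=False))
    print(reconnect_outcome(300, 120, adopt_mca=False))
```

Under this model, a larger receiver timeout widens the window in which only the AdoptMCA branch produces an immediate connection.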
_________________
Peter Potkay
Keep Calm and MQ On
mrlinux
Posted: Tue Aug 13, 2002 4:57 am

Grand Master

Joined: 14 Feb 2002
Posts: 1261
Location: Detroit,MI USA

Not sure what you mean by this statement:

"it would eventually catch the receiver in the inactive state."


"The larger your Heartbeat number, the more important AdoptMCA is for timely reconnections. I say timely because even without AdoptMCA, sooner or later the Heartbeat interval would pass, and since the sender is retrying, it would eventually catch the receiver in the inactive state."
_________________
Jeff

IBM Certified Developer MQSeries
IBM Certified Specialist MQSeries
IBM Certified Solutions Expert MQSeries
PeterPotkay
Posted: Tue Aug 13, 2002 5:13 am

Poobah

Joined: 15 May 2001
Posts: 7716

A receiver must be in an Inactive state to re-establish communications after a network failure.

You can get the RCVR in this state by manually stopping and starting it (blah), or by letting the Heartbeat Interval pass. Once that interval passes, the RCVR will go into the Inactive state on its own. It's at this point that the SNDR's request for a connection will finally work, i.e. it finally caught the RCVR in an Inactive state.

If the SNDR never tried to re-establish communications until after the Heartbeat Interval put the RCVR in an Inactive state, there would be no need for AdoptMCA. But I would bet that usually more messages are coming to the sender before that has a chance to happen, and in this case AdoptMCA kicks in and lets the connection re-establish itself before Heartbeat does its thing.
_________________
Peter Potkay
Keep Calm and MQ On
mrlinux
Posted: Tue Aug 13, 2002 5:20 am

Grand Master

Joined: 14 Feb 2002
Posts: 1261
Location: Detroit,MI USA

From my understanding and observations, this is incorrect:

"Once that interval passes, the RCVR will go into the Inactive state on its own."


The rcvr channel must receive the heartbeat message in order to come out of its block waiting for a TCP message; then it can check its discint timer and time out if required, or go back to waiting for the next TCP message.
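
The behaviour described above can be sketched as a small Python simulation. This is an illustration of the description in this post only, not MQ internals; the event format and function name are assumptions. The receiver's timer check runs only when a flow arrives, so if flows stop, the check never runs.

```python
def receiver_quiesce_time(events, discint):
    """Return the time (s) at which the receiver quiesces, or None.

    events  -- time-ordered list of (time_s, kind), kind being "message"
               or "heartbeat"; each entry is a flow that reaches the receiver
    discint -- disconnect interval in seconds
    """
    last_message_at = 0.0  # channel assumed started at t=0
    for time_s, kind in events:
        # The receiver only gets here when a flow wakes it from its wait.
        if kind == "message":
            last_message_at = time_s
        elif time_s - last_message_at >= discint:
            return time_s  # a heartbeat gave it the chance to time out
    # No more flows arrive: the receiver stays blocked and still "running".
    return None


if __name__ == "__main__":
    healthy = [(0, "message")] + [(t, "heartbeat") for t in range(60, 301, 60)]
    outage = [(0, "message"), (60, "heartbeat")]  # network fails after t=60
    print(receiver_quiesce_time(healthy, discint=300))
    print(receiver_quiesce_time(outage, discint=300))
```

With heartbeats arriving every 60 seconds, the receiver in this model quiesces once the disconnect interval is reached; when the network fails and heartbeats stop arriving, it never reaches the check at all.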
_________________
Jeff

IBM Certified Developer MQSeries
IBM Certified Specialist MQSeries
IBM Certified Solutions Expert MQSeries
PeterPotkay
Posted: Tue Aug 13, 2002 5:37 am

Poobah

Joined: 15 May 2001
Posts: 7716

Hmm, maybe I got it wrong. I based my assumption on the quote below, taken from the Dallas Tech Conference Session M16 "Keeping Your TCP/IP Channels Up and Running", page 16.

*If the network is down the heartbeat packets will not be received by the receiver MCA.

*Although the sender expects a reply, it will not respond to the absence of a reply. It will go into "Inactive" state, ready to be restarted by the arrival of message on the XMITQ.

*The Heartbeat is not dependent on the availability of the sender channel. If no heartbeat packets are received within the Heartbeat check interval the receiver will assume an outage and go "Inactive".


That third point is why I said what I did.
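
For comparison, the third point of the conference quote corresponds to a wait that carries its own timeout. The Python sketch below is an assumption-laden illustration (hypothetical names, a local socket pair standing in for the channel, state strings borrowed from the thread), not a statement of how any MQ version is implemented.

```python
import select
import socket


def receiver_state_after_wait(sock, check_interval_s):
    """Wait up to check_interval_s for a flow, then decide the state."""
    readable, _, _ = select.select([sock], [], [], check_interval_s)
    # Nothing arrived within the check interval: assume an outage.
    return "RUNNING" if readable else "INACTIVE"


def demo(check_interval_s=0.1):
    receiver_end, sender_end = socket.socketpair()
    try:
        # Sender is silent, as during a network failure.
        during_silence = receiver_state_after_wait(receiver_end, check_interval_s)
        # Sender delivers a heartbeat before the next check expires.
        sender_end.send(b"heartbeat")
        after_heartbeat = receiver_state_after_wait(receiver_end, check_interval_s)
        return during_silence, after_heartbeat
    finally:
        receiver_end.close()
        sender_end.close()


if __name__ == "__main__":
    print(demo())
```

The design difference is the fourth argument to select(): with a timeout, the receiving side regains control even when no flow ever arrives, and can then decide by itself that the connection is gone.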
_________________
Peter Potkay
Keep Calm and MQ On
mrlinux
Posted: Tue Aug 13, 2002 5:43 am

Grand Master

Joined: 14 Feb 2002
Posts: 1261
Location: Detroit,MI USA

Well, at my current employer we have had network outages that lasted longer than the disconnect interval, and the receiver channel would still be in the running state. To fix it we had to force the receiver down, until we implemented the adoptmca setting; we have not had those issues since. Of course this is with a v5.1 queue manager (rcvr side) and v5.2 (sender side), so maybe it has been changed.
_________________
Jeff

IBM Certified Developer MQSeries
IBM Certified Specialist MQSeries
IBM Certified Solutions Expert MQSeries