  MQSeries.net
Lowering effective wait time on a problematic cluster qm
mattfarney
Posted: Tue Jan 18, 2011 2:05 pm    Post subject: Lowering effective wait time on a problematic cluster qm

Disciple

Joined: 17 Jan 2006
Posts: 167
Location: Ohio

I am trying to improve my understanding of workload balancing. The clustering manual does not go into much detail about how the workload balancing is performed.

I have three queue managers, QMA, QMB, and QMC, which are non-repository QMs in a cluster that I do not own. They are sent data from other machines in the cluster. Let's say the connection from someQM->QMB stops for some reason. My goal is to minimize the time that messages intended for QMB are stuck on a remote server. The communications issues are being researched, but I've been asked to try to help minimize the impact.


As soon as the channel goes to RETRYING, I assume that the traffic targeted for QMB will be redistributed to QMA/QMC. I guess technically, this happens when that someQM detects that the channel is in RETRYING, since the communications issues could affect that determination too. Correct?


What settings contribute to this wait time? The short and long retry timers only matter after a channel has gone to RETRYING, correct?


So if the channel is stuck in an odd situation (network failure during a batch), I believe I should be looking at heartbeat and keepalive settings.
It is implied in the intercommunication book that these work the same for cluster channels. Anyone have any past experience with heartbeats in a clustered environment?


Am I leaving anything out?

-mf
mqjeff
Posted: Tue Jan 18, 2011 2:44 pm    Post subject: Re: Lowering effective wait time on a problematic cluster qm

Grand Master

Joined: 25 Jun 2008
Posts: 17447

mattfarney wrote:
As soon as the channel goes to RETRYING, I assume that the traffic targeted for QMB will be redistributed to QMA/QMC.


That depends on how the messages were sent.
bruce2359
Posted: Tue Jan 18, 2011 2:56 pm

Poobah

Joined: 05 Jan 2008
Posts: 9469
Location: US: west coast, almost. Otherwise, enroute.

That depends on how the queue was opened.

MQOO_BIND_NOT_FIXED vs. MQOO_BIND_FIXED (the MQI constant is actually named MQOO_BIND_ON_OPEN). If BIND_FIXED, you have directed the clustering software to send messages only to the queue instance resolved at MQOPEN time, rather than resolving the destination at MQPUT time.

BIND_NOT_FIXED allows messages in the SCTQ (SYSTEM.CLUSTER.TRANSMIT.QUEUE) to be routed to the next available instance of the queue.
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
mattfarney
Posted: Tue Jan 18, 2011 3:39 pm

Disciple

Joined: 17 Jan 2006
Posts: 167
Location: Ohio

I knew I left some important information out.
The clustered queue is the same name on all three servers and is DEFBIND(NOTFIXED).

MQ6.0 - windows

-mf
exerk
Posted: Tue Jan 18, 2011 4:10 pm    Post subject: Re: Lowering effective wait time on a problematic cluster qm

Jedi Council

Joined: 02 Nov 2006
Posts: 6339

mattfarney wrote:
...I have three queue managers, QMA, QMB, and QMC, which are non-repository QMs in a cluster that I do not own...


If you do not own them, and you have done everything as stated by others, then there is nothing more you can do - it is up to the network owner and the owners of the 'other' queue managers to resolve the issue.
_________________
It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys.
bruce2359
Posted: Tue Jan 18, 2011 4:20 pm

Poobah

Joined: 05 Jan 2008
Posts: 9469
Location: US: west coast, almost. Otherwise, enroute.

Quote:
The clustered queue is the same name on all three servers and is DEFBIND(NOTFIXED).

This is a queue attribute.

The app developer can specify one of these in MQOO (open options):
1. BIND_FIXED
2. BIND_NOT_FIXED
3. BIND_AS_Q_DEF

If the developer specifies BIND_FIXED or BIND_NOT_FIXED, the queue DEFBIND attribute has no effect on the open.
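
To make the precedence concrete, here is a minimal Python sketch of how the effective binding falls out of the open options and the queue's DEFBIND attribute. The function name `effective_binding` is made up for illustration, and the constant values are assumed to match cmqc.h (MQOO_BIND_ON_OPEN is the real name of the option this thread calls BIND_FIXED); check them against your own headers. It is a model of the rule, not MQ code.

```python
# Assumed values; verify against cmqc.h for your MQ version.
MQOO_BIND_AS_Q_DEF = 0x0000   # default: defer to the queue's DEFBIND
MQOO_BIND_ON_OPEN = 0x4000    # what the thread calls BIND_FIXED
MQOO_BIND_NOT_FIXED = 0x8000


def effective_binding(open_options: int, defbind: str) -> str:
    """Return "OPEN" or "NOTFIXED" for a hypothetical MQOPEN call.

    defbind is the queue's DEFBIND attribute, "OPEN" or "NOTFIXED".
    Other open-option bits (MQOO_OUTPUT and so on) are ignored here.
    """
    if defbind not in ("OPEN", "NOTFIXED"):
        raise ValueError("defbind must be 'OPEN' or 'NOTFIXED'")

    on_open = bool(open_options & MQOO_BIND_ON_OPEN)
    not_fixed = bool(open_options & MQOO_BIND_NOT_FIXED)

    if on_open and not_fixed:
        # Contradictory request; this sketch simply treats it as invalid.
        raise ValueError("conflicting bind options")
    if on_open:
        return "OPEN"       # explicit option wins over DEFBIND
    if not_fixed:
        return "NOTFIXED"   # explicit option wins over DEFBIND
    return defbind          # BIND_AS_Q_DEF: the queue attribute decides


# DEFBIND(NOTFIXED) on the queue, but the app asks for fixed binding:
print(effective_binding(MQOO_BIND_ON_OPEN, "NOTFIXED"))   # OPEN
# The app leaves the default, so the queue attribute applies:
print(effective_binding(MQOO_BIND_AS_Q_DEF, "NOTFIXED"))  # NOTFIXED
```

So DEFBIND(NOTFIXED) on the queue only helps when the application opens with the default option, which is why the application's open options matter.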
bruce2359
Posted: Tue Jan 18, 2011 4:42 pm

Poobah

Joined: 05 Jan 2008
Posts: 9469
Location: US: west coast, almost. Otherwise, enroute.

You will need to examine the application code to determine exactly which open options are being used.
mattfarney
Posted: Tue Jan 18, 2011 5:00 pm

Disciple

Joined: 17 Jan 2006
Posts: 167
Location: Ohio

If it were only that easy...
I've asked the question, but there's no guarantee I'll get an answer.

Personal Opinion: I wish there was a way to force certain MQ options (persistence, defbind, etc.) since trusting the generator of the data is problematic.

The traffic I see across the three systems is well balanced, so I think we can safely assume that the BIND_FIXED option is not being used [though as I said above, I've asked them to check].

-mf
bruce2359
Posted: Tue Jan 18, 2011 5:18 pm

Poobah

Joined: 05 Jan 2008
Posts: 9469
Location: US: west coast, almost. Otherwise, enroute.

Quote:
The traffic I see across the three systems is well balanced, so I think we can safely assume that the BIND_FIXED option is not being used ...

Is a cluster workload exit being used?

Quote:
I wish there was a way to force certain MQ options (persistence, defbind, etc.) since trusting the generator of the data is problematic.

Hmmmm. So, you believe it is the function of a system administrator to ensure quality of data?

Do you also believe it's the function of a sysadmin to ensure that business processes (arithmetic, database updates, etc.) are correct in every application program? Do you check every line of code to make this happen?

I'd strongly suggest that you draw a line of separation between what a sysadmin can (should) do, and what is the responsibility of an application developer. Fixing things so bad application code behaves better does nothing to improve the application code. Firing app developers for low-quality code is a better choice.
fjb_saper
Posted: Tue Jan 18, 2011 9:20 pm    Post subject: Re: Lowering effective wait time on a problematic cluster qm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

mattfarney wrote:

As soon as the channel goes to RETRYING, I assume that the traffic targeted for QMB will be redistributed to QMA/QMC. I guess technically, this happens when that someQM detects that the channel is in RETRYING, since the communications issues could affect that determination too. Correct?


Not quite right. Imagine that QMB is the reply-to qmgr. There is nothing wrong with the channels from QMB to QMC and QMA, but the channel from QMC to QMB is in retry mode... So your requests go out and get processed, but the replies do not come back until the problem is resolved if they were processed by QMC or transited through QMC...


mattfarney wrote:
What settings contribute to this wait time? The short and long retry timers only matter after a channel has gone to RETRYING, correct?


So if the channel is stuck in an odd situation (network failure during a batch), I believe I should be looking at heartbeat and keepalive settings.
It is implied in the intercommunication book that these work the same for cluster channels. Anyone have any past experience with heartbeats in a clustered environment?


Am I leaving anything out?

-mf

Yes, you are leaving out one of the biggest offenders: a full queue on the destination qmgr. The receiving MCA has a retry count and a retry interval for this scenario (the MRRTY and MRTMR channel attributes; look up the specifics in the MQSC manual) before it puts the message to the DLQ. Tweaking those parms can significantly speed up the MCA putting the messages on the DLQ.

If your communications are essentially sub-second, a full destination queue will wreak havoc on your MQ cluster network. The easy fix is to increase the max queue depth on the fly. Of course, you need to scale or fix the consumer right after that....
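
As a back-of-the-envelope illustration of why those parms matter, here is a small Python sketch. It assumes the retry count and interval mentioned above correspond to the MRRTY and MRTMR channel attributes, that MRTMR is in milliseconds, and that the MCA waits roughly one interval per retry; the function name and the example values are hypothetical, so check your own channel definitions.

```python
def seconds_before_dlq(retry_count: int, retry_interval_ms: int) -> float:
    """Rough upper bound on how long one message is retried before the DLQ.

    retry_count       -- assumed to map to MRRTY on the receiving channel
    retry_interval_ms -- assumed to map to MRTMR, in milliseconds
    """
    if retry_count < 0 or retry_interval_ms < 0:
        raise ValueError("retry count and interval must be non-negative")
    return retry_count * retry_interval_ms / 1000.0


# Assumed example values: 10 retries, 1000 ms apart.
print(seconds_before_dlq(10, 1000))  # 10.0
# Hypothetical tuned values: 2 retries, 200 ms apart.
print(seconds_before_dlq(2, 200))    # 0.4
```

While the MCA is retrying one message, the rest of the batch waits behind it, so on a sub-second network even a few seconds per message adds up quickly.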

Have fun
_________________
MQ & Broker admin
skoobee
Posted: Tue Jan 18, 2011 10:29 pm

Acolyte

Joined: 26 Nov 2010
Posts: 52

Look at the BATCHHB attribute. This checks the network connection just before committing the msgs, so if there is a problem the batch can be backed out and the msgs redistributed.
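
A toy Python model of that decision, for illustration only. It assumes BATCHHB is an interval in milliseconds, that 0 turns the check off, and that the sending side only sends a heartbeat when it has not heard from its partner within that interval. The function names and the `partner_responds` callback are invented for this sketch and are not part of any MQ API.

```python
from typing import Callable


def needs_batch_heartbeat(ms_since_last_contact: int, batchhb_ms: int) -> bool:
    """True if the sender should probe the receiver before committing."""
    if batchhb_ms == 0:
        return False  # assumed meaning of 0: batch heartbeating disabled
    return ms_since_last_contact > batchhb_ms


def end_of_batch(ms_since_last_contact: int,
                 batchhb_ms: int,
                 partner_responds: Callable[[], bool]) -> str:
    """Return "commit" or "backout" for a batch in this toy model."""
    if needs_batch_heartbeat(ms_since_last_contact, batchhb_ms):
        if not partner_responds():
            # No answer: back out so the messages stay available on the
            # transmit queue and can be rerouted (if not bound on open).
            return "backout"
    return "commit"


# Receiver silent for 1.5 s, BATCHHB of 1000 ms, and no reply to the probe:
print(end_of_batch(1500, 1000, lambda: False))  # backout
# Recent contact, so no probe is sent and the batch is committed:
print(end_of_batch(200, 1000, lambda: False))   # commit
```

The trade-off is an extra network flow whenever the channel has been quiet for longer than the interval, so the value is usually a balance between overhead and how quickly a dead connection is noticed.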
Vitor
Posted: Wed Jan 19, 2011 3:45 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

mattfarney wrote:
Personal Opinion: I wish there was a way to force certain MQ options (persistence, defbind, etc.) since trusting the generator of the data is problematic.


<plug>There is but it requires 3rd party software</plug>

This is in no way an attempt to address the question:

Quote:
I'd strongly suggest that you draw a line of separation between what a sysadmin can (should) do, and what is the responsibility of an application developer


because I agree with both of you.
_________________
Honesty is the best policy.
Insanity is the best defence.
mattfarney
Posted: Wed Jan 19, 2011 9:51 am

Disciple

Joined: 17 Jan 2006
Posts: 167
Location: Ohio

bruce2359 wrote:
Quote:
The traffic I see across the three systems is well balanced, so I think we can safely assume that the BIND_FIXED option is not being used ...

Is a cluster workload exit being used?


None is defined.

bruce2359 wrote:
Quote:
I wish there was a way to force certain MQ options (persistence, defbind, etc.) since trusting the generator of the data is problematic.

Hmmmm. So, you believe it is the function of a system administrator to ensure quality of data?

Do you also believe it's the function of a sysadmin to ensure that business processes (arithmetic, database updates, etc.) are correct in every application program? Do you check every line of code to make this happen?

I'd strongly suggest that you draw a line of separation between what a sysadmin can (should) do, and what is the responsibility of an application developer. Fixing things so bad application code behaves better does nothing to improve the application code. Firing app developers for low-quality code is a better choice.


IMO, this paradigm works well when the content is being created by people under the control of the processing application. If I am receiving content from outside my organization/company/entity, relying on their programmers to set the appropriate flags and settings is a troublesome burden.

-mf
bruce2359
Posted: Wed Jan 19, 2011 10:00 am

Poobah

Joined: 05 Jan 2008
Posts: 9469
Location: US: west coast, almost. Otherwise, enroute.

Quote:
...a troublesome burden.

Yes, but only if you choose to accept it.
fjb_saper
Posted: Wed Jan 19, 2011 9:42 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

bruce2359 wrote:
Quote:
...a troublesome burden.

Yes, but only if you choose to accept it.

You need to make sure that the programs that do not have this set correctly do not pass your deliverables' acceptance criteria...