Cluster Resolution /Reply to message delaying Problem
jeevan
Posted: Sun Nov 22, 2009 2:55 pm    Post subject: Cluster Resolution /Reply to message delaying Problem

Grand Master

Joined: 12 Nov 2005
Posts: 1432

Our MQ setup consists of two overlapping clusters. Two bridge qmgrs belong to both clusters, and the two clusters communicate through them.

Once (according to my colleague), he saw about 10,000 messages on the cluster transmission queue on the full repositories, and the cluster channels were stopped. When he started the channels, they went back to stopped mode. But when he bounced the bridge qmgr, all the messages were cleared.

There is a problem with delayed messages from the backend queue managers (which belong to one of the two clusters) to the store servers (members of the other cluster).

When replying, the reply-to queue manager is the qmgr alias defined on the bridge queue manager. Each of these queue managers has the full list of qmgr aliases.

Reply messages are taking a long time to get back to the store servers; however, request messages reach the request queue without delay. The request queues are queue aliases (QA) on the bridge qmgrs and physical queues on the backend queue managers.

I suspect that there are some bad and duplicate records in the full repositories about the partial repositories.

One thing uncommon in our cluster setup is that all partial repositories are connected to both FRs. That means there are two cluster sender channels in each of the partial repositories.


Any thoughts?

Thanks


Last edited by jeevan on Sun Dec 06, 2009 2:47 pm; edited 2 times in total
exerk
Posted: Sun Nov 22, 2009 4:06 pm    Post subject: Re: Cluster Resolution /Reply to message delaying Problem

Jedi Council

Joined: 02 Nov 2006
Posts: 6339

jeevan wrote:
...One thing uncommon in our cluster setup is that, all partial repositories are connected to both FRs...


Are we to assume from that statement that manually defined CLUSSDR's are used to connect?
_________________
It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys.
jeevan
Posted: Sun Nov 22, 2009 4:24 pm    Post subject: Re: Cluster Resolution /Reply to message delaying Problem

Grand Master

Joined: 12 Nov 2005
Posts: 1432

exerk wrote:
jeevan wrote:
...One thing uncommon in our cluster setup is that, all partial repositories are connected to both FRs...


Are we to assume from that statement that manually defined CLUSSDR's are used to connect?


Yes, each partial repository has 3 cluster channels: one cluster receiver channel and two cluster sender channels to the two FRs.


Last edited by jeevan on Mon Dec 07, 2009 6:16 am; edited 2 times in total
bruce2359
Posted: Sun Nov 22, 2009 4:26 pm    Post subject:

Poobah

Joined: 05 Jan 2008
Posts: 9469
Location: US: west coast, almost. Otherwise, enroute.

Quote:
...cluster channels were stopped.

A channel in STOPPED state was either stopped manually (by someone or something issuing a STOP CHANNEL command, for example), OR the channel encountered errors from which it could not recover.

Did you take a look at the error logs for the failing channel(s) to see why they failed? Please do so, and post the results here.
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
jeevan
Posted: Sun Nov 22, 2009 4:36 pm    Post subject:

Grand Master

Joined: 12 Nov 2005
Posts: 1432

bruce2359 wrote:
Quote:
...cluster channels were stopped.

A channel in STOPPED state was either stopped manually (by someone or something issuing a STOP CHANNEL command, for example), OR the channel encountered errors from which it could not recover.

Did you take a look at the error logs for the failing channel(s) to see why they failed? Please do so, and post the results here.


All of this is past history. I have been asked to figure out the problems, and this was explained to me as something that happened once in the past. So I do not have any error logs for it.

But when the channels were restarted, they went to stopped mode, and they came up OK once the bridge qmgrs were bounced. So I am suspecting some problem in the repositories.

I assume, the channel went down because they could not deliver the message. I think in that situation, mq stops the channel (ready to be corrected)


Last edited by jeevan on Mon Nov 23, 2009 6:21 am; edited 1 time in total
bruce2359
Posted: Sun Nov 22, 2009 7:42 pm    Post subject:

Poobah

Joined: 05 Jan 2008
Posts: 9469
Location: US: west coast, almost. Otherwise, enroute.

Quote:
I assume, the channel went down because they could not deliver the message. I think in that situation, mq stops the channel (ready to be corrected)

You have no error logs and no other definitive documentation of the error and its symptoms - other than someone said this happened?

There are a variety of reasons for the symptoms reported to you; but without any trail (errors logged), there is little chance of intelligently taking corrective action. We could guess, I suppose.

There is a shotgun approach to problem resolution. Let's say you are told to change a few things, then wait to see if it happens again. If the problem does not recur, which thing fixed the problem? What did you learn? If it didn't fix the problem, what then?

As a side note, I've recommended to clients that they archive error logs at qmgr startup. This allows for this type of historical analysis. It's only disk space, or tape. This action alone may satisfy management that you are doing the right thing to best track down the problem if it should happen again.
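
The archiving step could be as small as the Python sketch below, run from whatever script starts the qmgr. It is only an illustration under assumptions: the function name, the timestamped folder layout, and both directory arguments are hypothetical, and it assumes the error logs follow the usual AMQERR*.LOG naming in the qmgr's errors directory.

```python
import shutil
import time
from pathlib import Path


def archive_error_logs(errors_dir, archive_root, stamp=None):
    """Copy AMQERR*.LOG files from errors_dir into archive_root/<stamp>/.

    Both directories are supplied by the caller (hypothetical layout).
    Returns the names of the files that were copied, in sorted order.
    """
    # Use a timestamped folder so each qmgr start keeps its own snapshot.
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    target = Path(archive_root) / stamp
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for log in sorted(Path(errors_dir).glob("AMQERR*.LOG")):
        # copy2 preserves modification times, which helps later analysis.
        shutil.copy2(log, target / log.name)
        copied.append(log.name)
    return copied
```

Copying rather than moving leaves the live logs in place for the qmgr, at the cost of some duplicated disk space.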
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
fjb_saper
Posted: Sun Nov 22, 2009 9:15 pm    Post subject:

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

Can you please specify the version of MQ and whether the cluster channels are using message compression?
_________________
MQ & Broker admin
jeevan
Posted: Mon Nov 23, 2009 3:49 am    Post subject:

Grand Master

Joined: 12 Nov 2005
Posts: 1432

fjb_saper wrote:
Can you please specify the version of MQ and whether the cluster channels are using message compression?


MQ 6.0.2.7 for AIX
No, we are not using compression, but all channels, including the cluster channels, are SSL enabled.


Last edited by jeevan on Mon Nov 23, 2009 5:48 am; edited 2 times in total
PeterPotkay
Posted: Mon Nov 23, 2009 4:55 am    Post subject:

Poobah

Joined: 15 May 2001
Posts: 7722

fjb_saper wrote:
Can you please specify the version of MQ and whether the cluster channels are using message compression?

_________________
Peter Potkay
Keep Calm and MQ On
fjb_saper
Posted: Mon Nov 23, 2009 3:15 pm    Post subject:

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

PeterPotkay wrote:
fjb_saper wrote:
Can you please specify the version of MQ and whether the cluster channels are using message compression?

There is an APAR / PMR for this. The compression (depending on size of message) has a problem with alignments... IBM has a fix for this (V6.0.2.6 & V6.0.2.7)

@jeevan
You may have the same type of problem with SSL. Open a PMR!
_________________
MQ & Broker admin
jeevan
Posted: Sun Dec 06, 2009 2:41 pm    Post subject:

Grand Master

Joined: 12 Nov 2005
Posts: 1432

fjb_saper wrote:
PeterPotkay wrote:
fjb_saper wrote:
Can you please specify the version of MQ and whether the cluster channels are using message compression?

There is an APAR / PMR for this. The compression (depending on size of message) has a problem with alignments... IBM has a fix for this (V6.0.2.6 & V6.0.2.7)

@jeevan
You may have the same type of problem with SSL. Open a PMR!


We found the solution to the message delay. Let me explain it.

As I said above, our MQ network consists of two overlapping clusters. Two queue managers function as gateway queue managers and are members of both clusters. The gateway queue managers hold alias queues for, let's say, the client side of the cluster and qmgr aliases for the server side of the cluster. When a server-side qmgr replies to a message, the reply may go to either of the bridge queue managers.

The bridge queue manager does not send the message directly to the client qmgr (the reply-to qmgr) every time. One time it sends it directly; another time it sends the message to the other bridge queue manager. So the message goes to and fro a few times between the bridge queue managers. This was causing the delay and sometimes causing the messages to expire.

Once we created manual cluster sender channels between these bridge queue managers and stopped them, the messages started reaching the target as expected.

We also had a problem with the amqrrmfa (repository) process dying when we stopped and started the queue manager. We opened a PMR with IBM. They sent us a replacement amqrrmfa, and that seems to be working.

By the way, do any of you guys have overlapping clusters? Have you faced a situation similar to ours?

Thanks
exerk
Posted: Sun Dec 06, 2009 3:08 pm    Post subject:

Jedi Council

Joined: 02 Nov 2006
Posts: 6339

jeevan wrote:
...By the way, do any of you guys have overlapping clusters? Have you faced a situation similar to ours?...


Not when a cluster has been correctly configured, e.g. as per the manual and with only one manually defined CLUSSDR from a PR to an FR
_________________
It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys.
zonko
Posted: Sun Dec 06, 2009 11:48 pm    Post subject:

Voyager

Joined: 04 Nov 2009
Posts: 78

You chose the wrong solution, creating a manual CLUSSDR.

The cause was that the qmgr alias had the same name as the actual qmgr, and since each had the same status as valid destinations, the msg was load-balanced between the qmgr hosting the alias, and the actual qmgr.

The real solution was to create qmgr aliases with different names to the real qmgr.
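
The ping-pong effect described above can be seen in a toy Python model. This is not MQ code and not the actual cluster workload algorithm; it simply assumes each bridge qmgr rotates strictly round-robin over the destinations it considers valid. The names BRIDGE1, BRIDGE2 and CLIENT1 are made up for the illustration.

```python
from itertools import cycle


def make_router(candidates):
    """Return a function that picks the next destination round-robin."""
    rotation = cycle(candidates)
    return lambda: next(rotation)


def deliver(routers, start, target, max_hops=20):
    """Follow one message from start until it reaches target; return its path."""
    path = [start]
    current = start
    while current != target and len(path) <= max_hops:
        current = routers[current]()  # the current bridge picks the next hop
        path.append(current)
    return path


def simulate(n_messages, alias_shares_name):
    """Route n_messages replies that all arrive at BRIDGE1 for CLIENT1.

    alias_shares_name=True models a qmgr alias named like the real qmgr,
    so the other bridge looks like an equally valid destination.
    """
    if alias_shares_name:
        routers = {
            "BRIDGE1": make_router(["CLIENT1", "BRIDGE2"]),
            "BRIDGE2": make_router(["CLIENT1", "BRIDGE1"]),
        }
    else:
        # Distinct alias names: only the real qmgr matches the name.
        routers = {
            "BRIDGE1": make_router(["CLIENT1"]),
            "BRIDGE2": make_router(["CLIENT1"]),
        }
    return [deliver(routers, "BRIDGE1", "CLIENT1") for _ in range(n_messages)]


if __name__ == "__main__":
    for path in simulate(5, alias_shares_name=True):
        print(" -> ".join(path))
```

In this model some replies go straight to CLIENT1 while others bounce between the two bridges first; with distinct alias names every reply takes the direct path.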
jeevan
Posted: Mon Dec 07, 2009 6:01 am    Post subject:

Grand Master

Joined: 12 Nov 2005
Posts: 1432

zonko wrote:
You chose the wrong solution, creating a manual CLUSSDR.

The cause was that the qmgr alias had the same name as the actual qmgr, and since each had the same status as valid destinations, the msg was load-balanced between the qmgr hosting the alias, and the actual qmgr.

The real solution was to create qmgr aliases with different names to the real qmgr.


For the sake of discussion, let's say we have the following systems.

8 qmgrs in the client cluster
2 are bridge servers
2 FRs
4 qmgrs where the client applications connect, put messages, and wait for responses

6 qmgrs in the server cluster
2 FRs (same as above)
2 bridge servers (same as above)
2 qmgrs for the server (responding) app, where the responding applications connect and reply to the messages

Note: The bridge servers belong to both clusters, and the FRs are the same for both clusters.

Each of the two bridge queue managers holds alias queues for the cluster queues that physically reside on the 2 qmgrs where the responding applications connect and reply to the messages received from the client app, and qmgr aliases for the 4 queue managers where the client applications connect and put the messages.


When the client app sends a message, the reply-to queue manager is picked up automatically. The ReplyToQ may be set, as it is the same throughout all the queue managers.

But according to you, if the qmgr alias has a different name from the actual qmgr, and again one name on one bridge queue manager and another name on the other bridge queue manager, would it not be a mess?

A client app that connects to the 4 client-side queue managers would have to use different ReplyToQmgr values.

Also, if we have the same alias name on both bridge queue managers, the same problem arises as we face now. If we have different alias names for the same queue manager, then it would create more hassle for the application to set the reply-to qmgr.

Could you please explain how it would work without hassle for the application?