MQSeries.net :: View topic - cluster resolution problem

cluster resolution problem

guy11

Posted: Wed May 16, 2012 10:05 pm Post subject: cluster resolution problem

Newbie

Joined: 16 May 2012
Posts: 8

I am getting frequent cluster resolution errors (2189) on my full repository queue managers when an application tries to access queues defined on a partial repository queue manager. I captured amqrfdm output on one of the repository queue managers. Upon inspecting the logs I found the output below. Can somebody tell me whether it is normal or whether it indicates a problem? IBM Support is avoiding my question and not providing any answer, despite directly asking them

FQM1 (FR), FMQ2 (FR) and PQM3 (PR) are all members of three different clusters, CLUS1, CLUS2 and CLUS3, with dedicated cluster channels.

Cluster queue QUEUE.CLUS1 is defined on PQM3.

Below is part of amqrfdm output captured in FMQ1 (FR). Note QUEUE.CLUS1 is showing under all the 3 clusters of which PQM3 is also a member. QUEUE.CLUS1 has been only in cluster CLUS1 for ages.
The MQ version is 7.0.1.7 and the platform is Sun SPARC.

Q(QUEUE.CLUS1 ) Seq(501)
@1708B8
Cluster(CLUS1 )
UUID(PQM3 )
SubID(135 2CF6DC89)
Exp(Fri May 25 03:05:30 2012) Upd(Wed Apr 25 03:05:31 2012)
Flags(No Ack ClusQ )
Flags(0) MsgId(414D5120514D5F4950535633202020204F5717572156E2D4)
EnumPrev(1709A8 ) EnumNext(107A7C0 )

Q(QUEUE.CLUS1 ) Seq(507)
@107A7C0
Cluster(CLUS2 )
UUID(PQM3 )
SubID(132 AEBD6E28)
Exp(Fri May 25 03:05:30 2012) Upd(Wed Apr 25 03:05:31 2012)
Flags(No Ack ClusQ )
Flags(1) MsgId(414D5120514D5F4950535633202020204F5717572156E2D3)
EnumPrev(1708B8 ) EnumNext(107A6D0 )

Q(QUEUE.CLUS1 ) Seq(507)
@107A6D0
Cluster(CLUS3 )
UUID(PQM3 )
SubID(132 AEBD6CA1)
Exp(Fri May 25 03:05:30 2012) Upd(Wed Apr 25 03:05:31 2012)
Flags(No Ack ClusQ )
Flags(1) MsgId(414D5120514D5F4950535633202020204F5717572156E2D2)
EnumPrev(107A7C0 ) EnumNext(107A5E0 )
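To make the pattern in the dump above easier to see, here is a minimal Python sketch that groups the records by queue name and owning UUID and lists the clusters each one appears under. The record layout (a `Q(...)` line, then `Cluster(...)` and `UUID(...)` lines, with records separated by blank lines) is assumed from this excerpt only, and the function name `clusters_by_queue` is made up for illustration. The sketch only groups records; it says nothing about what kind of record these are.

```python
import re
from collections import defaultdict

# Trimmed copy of the records posted above; layout assumed from this excerpt.
DUMP = """\
Q(QUEUE.CLUS1 ) Seq(501)
@1708B8
Cluster(CLUS1 )
UUID(PQM3 )

Q(QUEUE.CLUS1 ) Seq(507)
@107A7C0
Cluster(CLUS2 )
UUID(PQM3 )

Q(QUEUE.CLUS1 ) Seq(507)
@107A6D0
Cluster(CLUS3 )
UUID(PQM3 )
"""


def clusters_by_queue(dump_text):
    """Map each (queue, uuid) pair to the clusters named in its records."""
    result = defaultdict(list)
    # Records are separated by blank lines in the excerpt.
    for record in re.split(r"\n\s*\n", dump_text.strip()):
        queue = re.search(r"^Q\(([^)]*)\)", record, re.MULTILINE)
        cluster = re.search(r"^Cluster\(([^)]*)\)", record, re.MULTILINE)
        uuid = re.search(r"^UUID\(([^)]*)\)", record, re.MULTILINE)
        if queue and cluster:
            key = (queue.group(1).strip(), uuid.group(1).strip() if uuid else "")
            result[key].append(cluster.group(1).strip())
    return dict(result)


if __name__ == "__main__":
    for (queue, uuid), clusters in clusters_by_queue(DUMP).items():
        print(queue, uuid, clusters)
```

Run against a full dump, this would show at a glance which names appear under more than one cluster.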

mqjeff

Posted: Thu May 17, 2012 3:21 am Post subject:

Grand Master

Joined: 25 Jun 2008
Posts: 17447

What is the exact text of the exact error that shows up?

Exactly which queue manager does it show up on?

Exactly which queue manager is the application connected to at the time of the error?

Exactly what object is the application attempting to use at the time of the error?

Exactly what MQ operation is the application attempting on that object at the time of the error?

Exactly what MQRC does the application receive from that attempt?

Vitor

Posted: Thu May 17, 2012 5:38 am Post subject: Re: cluster resolution problem

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

guy11 wrote:

IBM Support is avoiding my question and not providing any answer, despite directly asking them

How are they avoiding the question? Do they just keep asking for more information?

Like the topology of your clusters, the number & status of channels, the circumstances of the error, that sort of thing?
_________________
Honesty is the best policy.
Insanity is the best defence.

guy11

Posted: Thu May 17, 2012 6:04 am Post subject: Re: cluster resolution problem

Newbie

Joined: 16 May 2012
Posts: 8

Vitor wrote:

guy11 wrote:

IBM Support is avoiding my question and not providing any answer, despite directly asking them

How are they avoiding the question? Do they just keep asking for more information?

Like the topology of your clusters, the number & status of channels, the circumstances of the error, that sort of thing?

Exactly. We have provided the topology, configuration details, etc. It is always the same standard answer: provide logs/traces from when the problem is occurring.

mqjeff

Posted: Thu May 17, 2012 6:15 am Post subject:

Grand Master

Joined: 25 Jun 2008
Posts: 17447

That's not avoiding the question.

That's telling you that there isn't yet enough information available to answer the question.

You haven't provided enough information here for anyone else to answer the question either.

The reason code indicates that the partial repository that the application is using is having difficulties communicating with the full repository in order to resolve the object being opened.

This could be because of a corrupt set of information in the full repository, although that's unlikely and would also have been exposed by the PMR by now.

It's more likely that there are channel issues with the clusrcvr or clussdr on the PR.

It's also likely that you have not constructed the topology that you think you have constructed. You have said you expect the queue is only shared in one of the three clusters, but you are seeing definitions for it in the FR that indicate it's shared in all three. This is most likely caused by a misunderstanding of what you have configured - that is, that you have *actually* shared it in all three clusters even though you think you only shared it in one.

But there isn't enough information here to determine which of these, if any, is actually the scenario in effect.

guy11

Posted: Thu May 17, 2012 6:18 am Post subject:

Newbie

Joined: 16 May 2012
Posts: 8

mqjeff wrote:

The application accessing the queue through JMS gets 2189 (it runs on the same machine as the full repository queue manager).

The error always happens on FQM1 or FMQ2 (full repositories), and the queue being accessed is on PQM3 (partial repository).

The object being accessed by the application is a cluster queue on the partial repository (QUEUE.CLUS1).

The application is attempting MQPUT/MQPUT1 through JMS.

The error received by the application is MQJMS001: Completion Code 2, Reason 2189.

This is not a new setup; it has been working for ages. The problem resolves once I refresh the cluster, but then it happens again for a different queue in a different cluster. The rest is always the same with regard to the full and partial repositories. I am quite familiar with MQ and clusters, but this is something I couldn't sort out.

In the output I posted, a cluster queue that is part of one cluster appears under three different clusters in the amqrfdm output. Does that indicate corruption in the repository cache, or is it normal? As I stated, the partial repository that hosts the queue is also a member of the other two clusters that the queue shows up under in the cache.
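For anyone collecting these errors from application logs, a small Python sketch like the one below can pull the completion code and reason code out of the error text and attach the symbolic name. Reason 2189 is MQRC_CLUSTER_RESOLUTION_ERROR. The function name `parse_mq_error` and the two-entry lookup table are illustrative assumptions, and the regular expression is written only for text shaped like the message quoted above.

```python
import re

# A deliberately tiny lookup table; extend it from the MQ reason code list.
REASON_NAMES = {
    2085: "MQRC_UNKNOWN_OBJECT_NAME",
    2189: "MQRC_CLUSTER_RESOLUTION_ERROR",
}


def parse_mq_error(text):
    """Return (completion_code, reason_code, reason_name), or None if absent."""
    # Optional single quotes around the numbers are tolerated.
    match = re.search(r"Completion Code\s*'?(\d+)'?,\s*Reason\s*'?(\d+)'?", text)
    if match is None:
        return None
    completion, reason = int(match.group(1)), int(match.group(2))
    return completion, reason, REASON_NAMES.get(reason, "unknown")


if __name__ == "__main__":
    print(parse_mq_error("MQJMS001: Completion Code 2, Reason 2189."))
```

Counting these per queue and per timestamp would give the kind of "when does it happen" evidence that is being asked for.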

Vitor

Posted: Thu May 17, 2012 6:23 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

mqjeff wrote:

That's not avoiding the question.

That's telling you that there isn't yet enough information available to answer the question.

mqjeff wrote:

You haven't provided enough information here for anyone else to answer the question either.

mqjeff wrote:

It's also likely that you have not constructed the topology that you think you have constructed

mqjeff wrote:

But there isn't enough information here to determine which of these, if any, is actually the scenario in effect.


guy11

Posted: Thu May 17, 2012 6:38 am Post subject:

Newbie

Joined: 16 May 2012
Posts: 8

mqjeff wrote:

I am damn sure the queue was shared in only one cluster.
This is not a new configuration; it was working for ages.
The trouble started after we migrated from V6 to V7.
We have 15+ clusters, and the problem has happened for different queues in different clusters, even with different partial repository queue managers on different platforms (z/OS, AIX, Solaris).

My question is simple: does it indicate corruption? The PMR has not exposed anything; they are not giving any answer.

cicsprog

Posted: Thu May 17, 2012 7:51 am Post subject:

Partisan

Joined: 27 Jan 2002
Posts: 347

Are you scripting ALTER or DEFINE commands as the server or MQM starts? I've seen newbie MQ admins DEFINE cluster objects every time the server is recycled. This can't be done; there is state data that needs to be maintained.

bruce2359

Posted: Thu May 17, 2012 8:32 am Post subject:

Poobah

Joined: 05 Jan 2008
Posts: 9475
Location: US: west coast, almost. Otherwise, enroute.

I sense that the OP has (mis)used the REFRESH and/or RESET CLUSTER commands... oooooohhhhhhhhhmmmmmmm.
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.

cicsprog

Posted: Thu May 17, 2012 8:53 am Post subject:

Partisan

Joined: 27 Jan 2002
Posts: 347

+1

You may want to do a DIS CLUSQMGR(*) CLUSTER(<cluster name>) to see if the QMID and CLUSTER date and times match between MQMs. If they don't, you have corrupt repositories.
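A rough Python sketch of that comparison follows, assuming the DIS CLUSQMGR output from two queue managers has been saved as text. It parses the `ATTR(value)` pairs and reports where QMID, CLUSDATE or CLUSTIME differ for the same CLUSQMGR and CLUSTER. The sample outputs, the QMID value and the function names are made up for illustration, and splitting records on the AMQ8441 message line is an assumption about the runmqsc layout.

```python
import re

# Hypothetical saved output from two full repositories (values are invented).
FR1_OUTPUT = """\
AMQ8441: Display Cluster Queue Manager details.
   CLUSQMGR(PQM3)                          CLUSTER(CLUS1)
   QMID(PQM3_2012-01-01_00.00.00)          CLUSDATE(2012-04-25)
   CLUSTIME(03.05.31)
"""
FR2_OUTPUT = FR1_OUTPUT.replace("CLUSTIME(03.05.31)", "CLUSTIME(03.07.10)")


def parse_clusqmgr(output):
    """Parse the text into {(CLUSQMGR, CLUSTER): {attribute: value}}."""
    records = {}
    # Each record is assumed to start with an AMQ8441 message line.
    for block in re.split(r"^AMQ8441.*$", output, flags=re.MULTILINE):
        # Simple ATTR(value) matcher; values with nested parentheses
        # (for example a CONNAME with a port) are not captured whole.
        attrs = dict(re.findall(r"(\w+)\(([^)]*)\)", block))
        if "CLUSQMGR" in attrs and "CLUSTER" in attrs:
            records[(attrs["CLUSQMGR"], attrs["CLUSTER"])] = attrs
    return records


def differences(view_a, view_b, fields=("QMID", "CLUSDATE", "CLUSTIME")):
    """List (key, field, value_a, value_b) wherever the two views disagree."""
    diffs = []
    for key in sorted(set(view_a) & set(view_b)):
        for field in fields:
            a, b = view_a[key].get(field), view_b[key].get(field)
            if a != b:
                diffs.append((key, field, a, b))
    return diffs


if __name__ == "__main__":
    for diff in differences(parse_clusqmgr(FR1_OUTPUT), parse_clusqmgr(FR2_OUTPUT)):
        print(diff)
```

The sketch only reports mismatches; what a mismatch means for the state of the repositories is a separate question.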

exerk

Posted: Thu May 17, 2012 10:34 am Post subject:

Jedi Council

Joined: 02 Nov 2006
Posts: 6339

When you migrated your queue managers did you ensure you migrated your FRs first?
_________________
It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys.

mvic

Posted: Thu May 17, 2012 11:26 am Post subject: Re: cluster resolution problem

Jedi

Joined: 09 Mar 2004
Posts: 2080

guy11 wrote:

Below is part of amqrfdm output captured in FMQ1 (FR). Note QUEUE.CLUS1 is showing under all the 3 clusters of which PQM3 is also a member.

What you have dumped there is subscription information held on a FR in respect of an interest registered by a PR. It therefore does not mean what you took it to mean, unfortunately.

Continue working with IBM. Cluster problems can take weeks to work out.

My general idea is: check the health of all your channels, and check that all channels are defined as you expect them to be.
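As one way to start on the channel check, here is a minimal Python sketch that scans saved DISPLAY CHSTATUS output and lists channels whose STATUS is anything other than RUNNING. The sample text and channel names are hypothetical, the function name `channels_not_running` is made up, and splitting records on the AMQ8417 message line is an assumption about the runmqsc layout. A channel that is not running is not automatically a fault, so treat the result as a list of things to look at.

```python
import re

# Hypothetical saved output; channel names and statuses are invented.
CHSTATUS_OUTPUT = """\
AMQ8417: Display Channel Status details.
   CHANNEL(TO.PQM3)                        CHLTYPE(CLUSSDR)
   STATUS(RUNNING)
AMQ8417: Display Channel Status details.
   CHANNEL(TO.FQM1)                        CHLTYPE(CLUSRCVR)
   STATUS(RETRYING)
"""


def channels_not_running(output):
    """Return [(channel, status)] for records whose STATUS is not RUNNING."""
    problems = []
    # Each record is assumed to start with an AMQ8417 message line.
    for block in re.split(r"^AMQ8417.*$", output, flags=re.MULTILINE):
        attrs = dict(re.findall(r"(\w+)\(([^)]*)\)", block))
        if "CHANNEL" in attrs and attrs.get("STATUS") != "RUNNING":
            problems.append((attrs["CHANNEL"], attrs.get("STATUS")))
    return problems


if __name__ == "__main__":
    print(channels_not_running(CHSTATUS_OUTPUT))
```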

What fix pack are you running on your various qmgrs?

I hope you've already checked through all the MQ v7 APARs that say "queue went missing" or that sort of thing. Ensure you're at latest maintenance, as a first step.

Hmm. I normally go to http://www.ibm.com/support/docview.wss?uid=swg21254675 for a list of fixes but the page is not there right now. Does anyone have a better link?

guy11

Posted: Thu May 17, 2012 10:10 pm Post subject:

Newbie

Joined: 16 May 2012
Posts: 8

cicsprog wrote:

+1

You may want to do a DIS CLUSQMGR(*) CLUSTER(<cluster name>) to see if the QMID and CLUSTER date and times match between MQMs. If they don't, you have corrupt repositories.

Thanks. QMID matches, but the CLUSDATE and CLUSTIME don't match for some of the clusters, so are those cluster caches corrupted?

mvic

Posted: Fri May 18, 2012 2:13 am Post subject:

Jedi

Joined: 09 Mar 2004
Posts: 2080

guy11 wrote:

QMID matches, but the CLUSDATE and CLUSTIME don't match for some of the clusters, so are those cluster caches corrupted?

Not likely, but possibly you have a breakdown in communications that means the updates are not being pushed through the system.
