"ghost" entry of FR QMGR in PR QMGR's repository
zhanghz
Posted: Tue Aug 26, 2008 11:12 pm    Post subject: "ghost" entry of FR QMGR in PR QMGR's repository

Disciple

Joined: 17 Jun 2008
Posts: 186

Hi guys, please help (I mentioned this in one of my previous posts).

Two QMGRs on two AIX boxes hold the full repositories. Let's call these two QMGRs MQ1 and MQ2. z/OS QMGR Z5 holds a partial repository.

Everything ran smoothly until one day the Z5 CLUSSDR to MQ1 was inactive. DIS CLUSQMGR on Z5 showed 4 entries: 2 MQ1, 1 MQ2 and 1 Z5. Of the 2 MQ1 entries, one has a QMID of MQ1_2008-05-31-xxx and a CONNAME of "DR server IP address"; the other has a QMID of MQ1_2008-06-30-yyy and a CONNAME of "production server IP address". DIS CLUSQMGR on MQ1 showed 3 entries: 1 MQ1 with a QMID of MQ1_2008-06-30-yyy and a CONNAME of "production server IP address", 1 MQ2 and 1 Z5.

I do not understand why Z5 all of a sudden had that additional "ghost" entry in its repository.

This has happened several times. Because I resolved the problem with REFRESH CLUSTER(xx) REPOS(YES) on the z/OS side (which I shouldn't have done), I am being chased on what happened and why.

Some of you mentioned in my previous post that this may be because the DR server got connected to the production network. If that is ruled out, what else might cause the problem?

My team lead suspects that the AIX team used their production box to test the set-up of the QMGR while the server had the DR IP address, then later changed it to the production IP address but didn't clean up properly after the change. I agree with that, given the dates in the QMIDs. But my question remains: why was it fine for so long before that DR IP entry suddenly appeared in the z/OS QMGR?

Any thought is much appreciated! Thanks.

zhanghz
Posted: Tue Aug 26, 2008 11:16 pm

Disciple

Joined: 17 Jun 2008
Posts: 186

And how can I prevent this from happening again?

If I hadn't checked DIS CLUSQMGR on the z/OS QMGR, I wouldn't have found out about this additional "ghost" entry. The channel status is inactive, which we usually treat as good, no problem.
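
A rough sketch of how that check could be automated, so a second QMID for the same queue manager name is noticed without someone eyeballing DIS CLUSQMGR. This is only an illustration: the function name `find_duplicate_qmgrs` is hypothetical, it assumes the (CLUSQMGR, QMID) pairs have already been extracted from the DIS CLUSQMGR output by some other means, and the MQ2 and Z5 QMIDs below are placeholders, not real values.

```python
from collections import defaultdict


def find_duplicate_qmgrs(entries):
    """Return {qmgr_name: sorted QMIDs} for every name seen with more than one QMID.

    `entries` is an iterable of (clusqmgr_name, qmid) pairs. How the pairs are
    obtained from DIS CLUSQMGR is deliberately left out of this sketch.
    """
    qmids_by_name = defaultdict(set)
    for name, qmid in entries:
        qmids_by_name[name].add(qmid)
    # A healthy cluster view has exactly one QMID per queue manager name.
    return {
        name: sorted(qmids)
        for name, qmids in qmids_by_name.items()
        if len(qmids) > 1
    }


# The view described for Z5 above; "xxx"/"yyy" are elided in the post and kept as-is.
z5_view = [
    ("MQ1", "MQ1_2008-05-31-xxx"),
    ("MQ1", "MQ1_2008-06-30-yyy"),
    ("MQ2", "MQ2_placeholder"),  # placeholder QMID
    ("Z5", "Z5_placeholder"),    # placeholder QMID
]

if __name__ == "__main__":
    for name, qmids in find_duplicate_qmgrs(z5_view).items():
        print(f"{name} appears with {len(qmids)} QMIDs: {', '.join(qmids)}")
```

Run on a schedule against each queue manager's view, something like this would flag the extra MQ1 entry even while the channel status looks harmless.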

Mr Butcher
Posted: Wed Aug 27, 2008 1:11 am

Padawan

Joined: 23 May 2005
Posts: 1716

Every queue manager in a cluster is given a unique ID. Now if your DR server is meant to replace one of your running servers, and it has a queue manager with the same name as a member of the cluster, you will see the queue manager twice: same qmgr name but a different qmgr ID for the cluster. However this became visible on the z/OS queue manager, it's not a very good design.
I assume the information went from one of the full repositories to the z/OS queue manager, so check all FRs for that duplicate definition and remove it if required.
I'd change the cluster design too, so as not to have two queue managers with the same name in the cluster.
_________________
Regards, Butcher

zhanghz
Posted: Thu Aug 28, 2008 1:06 am

Disciple

Joined: 17 Jun 2008
Posts: 186

The AIX DR server is meant for DR only. It's never supposed to be connected to any production server, AIX side or z/OS side.

As it's the DR server, it has the same QMGR name and the same cluster name as the production server. But the IP addresses for the AIX DR servers and the z/OS DR environment are different from those in the production environment. That's what I don't understand: even if we assume there was a slim chance that the AIX DR server was exposed to the production network, the channel definitions in the AIX DR QMGR point to DR IP addresses, so the channels should never be able to reach running status, and the AIX DR server's QMGR information should have no way of reaching any production QMGR in the cluster.

Also, if we say the problem is due to the AIX guys' unclean job (i.e. they created the QMGR and cluster using the DR IP address, then later simply changed the server's IP address to the production IP address assigned to it), again, why is the cluster repository fine most of the time, and then suddenly the "ghost" entry appears?

Headache.. I pray the user doesn't ask further questions on this and that such problems don't happen again..

ranganathan
Posted: Thu Aug 28, 2008 9:36 am

Centurion

Joined: 03 Jul 2008
Posts: 104

1. What is the current status ?!
2. What happens if you do a DIS CLUSQMGR in MQ1/2 (FRs) ?
3. Did you try SUSPEND the Z/OS QM and rejoin it ?!

fjb_saper
Posted: Thu Aug 28, 2008 3:31 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

zhanghz wrote:
The AIX DR server is meant for DR only. It's never supposed to be connected to any production server, AIX side or z/OS side.

As it's the DR server, it has the same QMGR name and the same cluster name as the production server. But the IP addresses for the AIX DR servers and the z/OS DR environment are different from those in the production environment. That's what I don't understand: even if we assume there was a slim chance that the AIX DR server was exposed to the production network, the channel definitions in the AIX DR QMGR point to DR IP addresses, so the channels should never be able to reach running status, and the AIX DR server's QMGR information should have no way of reaching any production QMGR in the cluster.

Also, if we say the problem is due to the AIX guys' unclean job (i.e. they created the QMGR and cluster using the DR IP address, then later simply changed the server's IP address to the production IP address assigned to it), again, why is the cluster repository fine most of the time, and then suddenly the "ghost" entry appears?

Headache.. I pray the user doesn't ask further questions on this and that such problems don't happen again..

You need to go down to a single full repository for a very short time.
On that full rep, force the wrong MQ1 out of the cluster:
RESET CLUSTER(mycluster) ACTION(FORCEREMOVE) QMID('MQ1_2008-05-31-xxx') QUEUES(YES)
You should then run that same command on the z/OS qmgr.
You can then make the other FR a full repository again.


The problem here is that, for whatever reason, a definition of the DR MQ1 got into one of your repositories, or that z/OS has a manually defined cluster connection to MQ1 in DR. Should this be the case, removing that connection is your very first task.

Within a cluster you CANNOT have 2 qmgrs with the same QMName without running into serious routing difficulties.

Have fun...
_________________
MQ & Broker admin

zhanghz
Posted: Thu Aug 28, 2008 8:34 pm

Disciple

Joined: 17 Jun 2008
Posts: 186

ranganathan wrote:
1. What is the current status ?!
2. What happens if you do a DIS CLUSQMGR in MQ1/2 (FRs) ?
3. Did you try SUSPEND the Z/OS QM and rejoin it ?!

1. I did REFRESH CLUSTER(XXX) REPOS(YES) on the z/OS QMGR (which I shouldn't have done, as the z/OS QMGR is shared by many applications; REFRESH CLUSTER from an AIX FR QMGR should resolve the problem too, and the AIX QMGRs serve only that one application). The "ghost" entry is gone now. What I am afraid of is that this may happen again without warning..
2. DIS CLUSQMGR in MQ1 and MQ2 shows the 3 QMGRs without any "ghost" entries.
3. No SUSPEND or any other operation was done. Change control on z/OS is always very stringent. Nothing has changed on the z/OS QMGR since the cluster channels and queues were defined for the application.

Thanks.

zhanghz
Posted: Thu Aug 28, 2008 9:44 pm

Disciple

Joined: 17 Jun 2008
Posts: 186

fjb_saper wrote:

You need to go down to a single full repository for a very short time.
On that full rep, force the wrong MQ1 out of the cluster:
RESET CLUSTER(mycluster) ACTION(FORCEREMOVE) QMID('MQ1_2008-05-31-xxx') QUEUES(YES)
You should then run that same command on the z/OS qmgr.
You can then make the other FR a full repository again.

The thing is that MQ1 and MQ2 had no "ghost" entry in their repositories. I tried RESET CLUSTER on the z/OS QMGR without realizing that the command is for FR QMGRs only.

fjb_saper wrote:

The problem here is that, for whatever reason, a definition of the DR MQ1 got into one of your repositories, or that z/OS has a manually defined cluster connection to MQ1 in DR. Should this be the case, removing that connection is your very first task.

The z/OS QMGR does not have any channels to DR IP addresses. Its CONNAME points to the production AIX server's DNS host name. So it's a mystery now. I don't know what really caused the problem..

fjb_saper
Posted: Fri Aug 29, 2008 6:40 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

zhanghz wrote:
The thing is that MQ1 and MQ2 had no "ghost" entry in their repositories. I tried RESET CLUSTER on the z/OS QMGR without realizing that the command is for FR QMGRs only.


No, the command is appropriate for a partial repository too.
In fact, the command with the addition of REPOS(YES) should only be executed on a partial repository. It will clear the cluster information and force the partial repository to reacquire it all from a full repository.
This is why you never want to execute it on a partial repository while the network is having problems.

Enjoy
_________________
MQ & Broker admin

PeterPotkay
Posted: Sat Aug 30, 2008 6:03 am

Poobah

Joined: 15 May 2001
Posts: 7722

FJ,
The RESET command is only valid from a Full Repository.

http://publib.boulder.ibm.com/infocenter/wmqv6/v6r0/topic/com.ibm.mq.csqzah.doc/qc11190_.htm
Quote:
You can issue the RESET CLUSTER command only from full repository queue managers.

_________________
Peter Potkay
Keep Calm and MQ On

fjb_saper
Posted: Sat Aug 30, 2008 6:42 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

Sorry, of course!!!! Got confused with REFRESH CLUSTER as the addition of REPOS(YES) would have pointed to....

The fingers jumped the link ahead of the mind:
Issue RESET cluster action(forceremove) on the full repository
Then issue REFRESH cluster repos(yes) on the offending partial.
This should guarantee that all traces of the offending channel have been removed from the cluster.
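
As a rough illustration of that two-step order, here is a small sketch that only builds the MQSC command text for each side; it does not talk to any queue manager. The function name `cleanup_commands` and its parameters are hypothetical, and the cluster name and QMID are the placeholders already used in this thread.

```python
def cleanup_commands(cluster, stale_qmid, refresh_partial=True):
    """Build the MQSC command strings for removing a stale cluster entry.

    Step 1 is meant for a full repository, step 2 (optional) for the
    affected partial repository. Nothing is executed here; the caller
    decides how and where to run the commands.
    """
    full_repository = [
        f"RESET CLUSTER({cluster}) ACTION(FORCEREMOVE) "
        f"QMID('{stale_qmid}') QUEUES(YES)"
    ]
    # REFRESH CLUSTER with REPOS(YES) applies to the partial repository only.
    partial_repository = (
        [f"REFRESH CLUSTER({cluster}) REPOS(YES)"] if refresh_partial else []
    )
    return {
        "full_repository": full_repository,
        "partial_repository": partial_repository,
    }


if __name__ == "__main__":
    plan = cleanup_commands("mycluster", "MQ1_2008-05-31-xxx")
    for target, commands in plan.items():
        for command in commands:
            print(f"{target}: {command}")
```

Keeping the refresh step behind a flag makes it easy to run only the RESET first and check DIS CLUSQMGR on the partial repository before deciding whether the refresh is needed at all.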


_________________
MQ & Broker admin

zhanghz
Posted: Mon Sep 01, 2008 3:10 am

Disciple

Joined: 17 Jun 2008
Posts: 186

fjb_saper wrote:
Sorry, of course!!!! Got confused with REFRESH CLUSTER as the addition of REPOS(YES) would have pointed to....

The fingers jumped the link ahead of the mind:
Issue RESET cluster action(forceremove) on the full repository
Then issue REFRESH cluster repos(yes) on the offending partial.
This should guarantee that all traces of the offending channel have been removed from the cluster.


Regarding how to clear the "ghost" entry, I did a test on my laptop. It seems "RESET CLUSTER(XXX) ACTION(FORCEREMOVE)" alone on an FR QMGR already clears the entry on the PR QMGR. There is no need to issue REFRESH CLUSTER(XXX) REPOS(YES) on the PR QMGR.

PS: my test couldn't reproduce my production problem, but I noticed the above.

ranganathan
Posted: Mon Sep 01, 2008 3:27 am

Centurion

Joined: 03 Jul 2008
Posts: 104

RESET cluster will inform all other queue managers in the cluster that the queue manager is no longer available. In order to rejoin the QM to the cluster you need to give a REFRESH CLUSTER...

http://publib.boulder.ibm.com/infocenter/wmqv6/v6r0/index.jsp?topic=/com.ibm.mq.csqzah.doc/qc11190_.htm

mqjeff
Posted: Mon Sep 01, 2008 3:55 am

Grand Master

Joined: 25 Jun 2008
Posts: 17447

ranganathan wrote:
In order to rejoin the QM to the cluster you need to give a REFRESH CLUSTER...


NO.

You are unlikely to need to use this command, except in exceptional circumstances.

fjb_saper
Posted: Mon Sep 01, 2008 3:13 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

zhanghz wrote:
Regarding how to clear the "ghost" entry, I did a test on my laptop. It seems "RESET CLUSTER(XXX) ACTION(FORCEREMOVE)" alone on an FR QMGR already clears the entry on the PR QMGR. There is no need to issue REFRESH CLUSTER(XXX) REPOS(YES) on the PR QMGR.

PS: my test couldn't reproduce my production problem, but I noticed the above.


You are right. Regarding REFRESH CLUSTER, like Jeff said, it is an exceptional measure. We had to use it because even the reset on the FR did not remove the bad definition from the PR. Our PR is also an FR in a different cluster. So we kicked the PR out of the cluster on the FR.
The only way to get the PR to rejoin the cluster was to use
refresh cluster(mycluster) repos(yes).
This allowed us to limit the refresh to the cluster where the qmgr is a PR and forced it to reacquire all its definitions for that cluster from the FR.

Enjoy
_________________
MQ & Broker admin