MQSeries.net :: View topic - CLUSSDR channel goes into retry state after an hour(approx.)


CLUSSDR channel goes into retry state after an hour(approx.)


Sam Uppu

Posted: Mon Mar 01, 2010 8:49 pm Post subject: CLUSSDR channel goes into retry state after an hour(approx.)

Yatiri

Joined: 11 Nov 2008
Posts: 610

Hi Guys,
We are on MQ 7.0.1, running on Solaris and AIX boxes.

We have 2 queue managers (QM1, QM2) in a cluster: QM1 running on AIX and QM2 on a Solaris box.

When both queue managers are configured in the cluster, the cluster sender/receiver channels are up and running, but around an hour later the sender on QM1 pointing to QM2 (i.e., TO.QM2) goes into retry state and later into binding state. When we rebuild the queue manager QM1 with the same config details, the cluster sender/receiver channels are up and running for an hour and then the sender channel goes into retry state again.

When I do a DIS CHS(TO.*) on QM1:

dis chs(*)
1 : dis chs(*)
AMQ8417: Display Channel Status details.
CHANNEL(TO.QM2) CHLTYPE(CLUSSDR)
CONNAME(xx.xxx.xx.xxx(1414)) CURRENT
RQMNAME( ) STATUS(RETRYING)
SUBSTATE( ) XMITQ(SYSTEM.CLUSTER.TRANSMIT.QUEUE)
AMQ8417: Display Channel Status details.
CHANNEL(TO.QM1) CHLTYPE(CLUSRCVR)
CONNAME(yy.yyy.yy.yyy) CURRENT
RQMNAME(QM1) STATUS(RUNNING)
SUBSTATE(RECEIVE) XMITQ( )

whereas the cluster sender/receiver channels on QM2 are up and running all the time.

There is an FDC on QM1 saying BAD_DATA_RECEIVED from QM2. Not sure whether this is related to the issue.

Can you guys shed some light on this?

Thanks.

Vitor

Posted: Mon Mar 01, 2010 9:04 pm Post subject: Re: CLUSSDR channel goes into retry state after an hour(appr

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

Sam Uppu wrote:

When we rebuild the queue manager QM1 with the same config details

I do hope that's not exactly what you mean, and what you are actually doing is removing/ejecting QM1 from the cluster then recreating it and readding it.

Sam Uppu wrote:

There is an FDC on QM1 saying BAD_DATA_RECEIVED from QM2. Not sure whether this is related to the issue.

If you're not playing fast & loose with the cluster (and it does sound like the queue manager is getting replication information it's not expecting) then it's PMR time.

Unless someone wants to correct my belief default replication is 60 mins or so....?
_________________
Honesty is the best policy.
Insanity is the best defence.

Mr Butcher

Posted: Tue Mar 02, 2010 12:59 am Post subject:

Padawan

Joined: 23 May 2005
Posts: 1716

Maybe the explicitly defined CLUSSDR uses a correct CONNAME.

Maybe the implicitly (automatically) defined CLUSSDR uses a bad CONNAME.

Maybe the disconnect interval is 1 hour.

After setting up the cluster, the explicitly defined CLUSSDR with the correct CONNAME is working. Then the disconnect interval makes the channel go inactive. When it is next used, it starts with the implicitly defined channel definition that has the bad CONNAME, and goes into retry and binding.

Just a guess. Other channel attributes could come into play too, e.g. exits defined on QM1 which do not exist on QM2 and so on. I'd also check the AMQERR* log files.
_________________
Regards, Butcher

Sam Uppu

Posted: Tue Mar 02, 2010 10:01 am Post subject:

Yatiri

Joined: 11 Nov 2008
Posts: 610

Sorry, this has been resolved. The issue was that we used a different IP (the IP the channel was resolving to after a period of time) in the cluster receiver channel on the destination queue manager.

I should have double-checked the cluster receiver channel. My apologies for not checking this earlier.

One question though:
How come the cluster sender/receiver channels were able to start when I provided the wrong IP in the cluster receiver channel of the destination queue manager? I was able to send msgs across in both directions and communication was OK for an hour, and later the sender channel of the source qmgr went into retry mode, which caused me to believe something was changing in flight. How come the sender/receiver pair works for the initial hour even though I provided a wrong IP in the cluster receiver channel of the destination qmgr?

bruce2359

Posted: Tue Mar 02, 2010 10:04 am Post subject:

Poobah

Joined: 05 Jan 2008
Posts: 9489
Location: US: west coast, almost. Otherwise, enroute.

Moved to Clustering forum.
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.

fjb_saper

Posted: Tue Mar 02, 2010 9:40 pm Post subject:

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20772
Location: LI,NY

Sam Uppu wrote:

Depends... how did you change the channel?

stop cluster receiver (mode force status stopped)
remove cluster info from cluster receiver
change ip/dns name on cluster receiver
make sure the cluster receiver displays the right ip/dns name using chl display
I'd prefer dns name because it allows network traversal etc...
change channel adding cluster information
verify the change took on at least the FRs (full repositories) and one PR (partial repository)
done

Of course you want the qmgr suspended from the cluster while you do this...
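
The steps above might look roughly like this in MQSC, run on the queue manager that owns the cluster receiver (QM2 in this thread). This is only a sketch under assumptions: the cluster name DEMOCLUS and the host name qm2host.example.com are placeholders that do not come from the thread, the channel is assumed to use CLUSTER rather than a CLUSNL namelist, and the channel name TO.QM2 and port 1414 are taken from the original post.

```
* Run in: runmqsc QM2   (the queue manager that owns the CLUSRCVR)
* DEMOCLUS and qm2host.example.com are hypothetical placeholders.

* Suspend the queue manager from the cluster while the change is made
SUSPEND QMGR CLUSTER(DEMOCLUS)

* Stop the cluster receiver and leave it in STOPPED state
STOP CHANNEL(TO.QM2) MODE(FORCE) STATUS(STOPPED)

* Remove the cluster information from the cluster receiver
ALTER CHANNEL(TO.QM2) CHLTYPE(CLUSRCVR) CLUSTER(' ')

* Correct the connection name (a DNS name rather than an IP)
ALTER CHANNEL(TO.QM2) CHLTYPE(CLUSRCVR) CONNAME('qm2host.example.com(1414)')

* Check that the definition now shows the right connection name
DISPLAY CHANNEL(TO.QM2) CONNAME CLUSTER

* Put the channel back into the cluster and re-enable it
ALTER CHANNEL(TO.QM2) CHLTYPE(CLUSRCVR) CLUSTER(DEMOCLUS)
START CHANNEL(TO.QM2)

* Resume normal cluster participation
RESUME QMGR CLUSTER(DEMOCLUS)
```

To verify that the change took, DISPLAY CLUSQMGR(QM2) CONNAME on a full repository and on one partial repository should show the corrected connection name.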

Have fun

_________________
MQ & Broker admin

Mr Butcher

Posted: Wed Mar 03, 2010 12:37 am Post subject:

Padawan

Joined: 23 May 2005
Posts: 1716

Quote:

How come the cluster sender/receiver channels were able to start when I provided the wrong IP in the cluster receiver channel of the destination queue manager? I was able to send msgs across in both directions and communication was OK for an hour, and later the sender channel of the source qmgr went into retry mode, which caused me to believe something was changing in flight. How come the sender/receiver pair works for the initial hour even though I provided a wrong IP in the cluster receiver channel of the destination qmgr?

As I wrote before... for the initial contact, your explicitly defined cluster sender channel (with the correct IP) is used. The cluster receiver definition (with the wrong IP) is then received and used to create the implicitly defined cluster sender channel. The next time the channel starts, this implicitly defined channel with the wrong IP is used, and you end up with a non-working connection that was working before.
This could happen either through a manual stop/start, or just by channels going inactive because of the disconnect interval, or through other channel disruptions...
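
One way to check for this situation is to compare the manually defined CLUSSDR with what the cluster has learned about QM2. The following is a sketch only, assuming the names from this thread (QM1 as the sending queue manager, TO.QM2 as the channel); nothing in it comes from the original poster's actual configuration.

```
* Run in: runmqsc QM1   (the queue manager whose sender goes into retry)

* The manually defined cluster sender, with the CONNAME as typed in
DISPLAY CHANNEL(TO.QM2) CONNAME DISCINT

* What the cluster knows about QM2, including the CONNAME taken
* from QM2's cluster receiver definition
DISPLAY CLUSQMGR(QM2) CONNAME DEFTYPE STATUS
```

If the two CONNAME values differ, the one reported by DISPLAY CLUSQMGR is worth checking against the cluster receiver on QM2, since that is where it comes from. A DEFTYPE of CLUSSDRB means the sender is defined both explicitly and automatically.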
_________________
Regards, Butcher

Sam Uppu

Posted: Wed Mar 03, 2010 7:42 am Post subject:

Yatiri

Joined: 11 Nov 2008
Posts: 610

Mr Butcher wrote:

Quote:

Mr. Butcher,
This is exactly what happened.

It was a blunder on my end, but since both cluster sender/receiver channels were up and running, I thought I had defined everything properly when I had not.

Thanks for sharing the info.
