kris.pilaet |
Posted: Tue Jun 21, 2011 11:56 pm Post subject: Solaris cluster WMB61 EG's do nothing |
Newbie
Joined: 21 Jun 2011 Posts: 8
We're running WMB v6.1 & MQ v7 on a clustered Solaris 10 environment.
On node B, everything runs fine. When stopping the broker & starting it, it takes its time, but everything comes back up.
When switching to the other node (A), we're having problems.
I've found a lot of topics here that describe part of my problem, but never completely and never with a cluster.
Here's mine. So, on node B, I'm having no problem.
When starting our broker on node A, ps -fu usernameBroker shows:
* Configuration Manager & broker processes are there
* DataFlowEngines are there
** When all flows of all DataFlowEngines were stopped on node B just before the switch, I'm getting
"WebSphere Broker v6108[379]: [ID 702911 user.info] (BRBA01T.eg-names)[1]BIP2208I: Execution group (64) started"
** When giving the command to start a flow of one EG, I'm getting "BIP2066E: Broker 'BRBA01T' (UUID 'a38dfaed-1a01-0000-0080-8fff11ab5618') was unable to retrieve an internal configuration response message for execution group"
** No matter which EG I try, they all give me the same error.
** In the meantime, the appropriate messages pile up on SYSTEM.BROKER.EXECUTIONGROUP.QUEUE
** Every now and then they disappear. On node B, this queue has an open input count of 28 (as it should); on node A it is 0 (see the sketch below).
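(For reference, a minimal sketch of that queue check, assuming the broker's queue manager is named QM_BRBA01T; the real queue manager name isn't given in the post:)

    # Depth and open input count (IPPROCS) of the EG configuration queue.
    # On the healthy node IPPROCS should equal the number of execution
    # groups (28 here); on the failing node it shows 0.
    echo "DISPLAY QSTATUS('SYSTEM.BROKER.EXECUTIONGROUP.QUEUE') CURDEPTH IPPROCS" | runmqsc QM_BRBA01T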
Extra info, perhaps: when stopping the broker on node B, I'm getting errors "BIP2804E: The broker has detected that Execution Group X has not shut down", which is really strange because it has no flows running. I'm getting this for all EGs.
Normally, when we switch, flows aren't stopped before that switch. I've done this to try to give my system more time and start things in a more controlled way.
But even when doing that, I immediately get BIP2066E for each EG.
I've started/stopped everything with cluster commands, and I've tried it all outside the cluster... all the same.
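(For reference, the manual cycle outside the cluster looks roughly like this. A sketch using the broker name from the post; the process grep is a generic check for leftovers, not a documented procedure:)

    mqsistop BRBA01T                 # stop the broker; -i forces an immediate stop if EGs hang (BIP2804E)
    ps -ef | grep -i dataflowengine  # any DataFlowEngine processes left behind?
    mqsistart BRBA01T                # restart the broker
    tail -f /var/adm/messages        # Solaris syslog: watch for BIP2208I / BIP2066E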
Any help, ideas, suggestions?
Our admins have no clue, we have no traces except the ones described.
We're now thinking of rebooting node A completely.
regards
kris
lancelotlinc |
Posted: Wed Jun 22, 2011 4:44 am Post subject: |
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
This is an example of why active-passive is a bad idea. It's a lot better to run active-active or active-active-active.
mqjeff |
Posted: Wed Jun 22, 2011 4:47 am Post subject: |
Grand Master
Joined: 25 Jun 2008 Posts: 17447
lancelotlinc wrote:
This is an example of why active-passive is a bad idea. It's a lot better to run active-active or active-active-active.

This is an example of a badly managed install, not anything else.
Vitor |
Posted: Wed Jun 22, 2011 4:54 am Post subject: |
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
lancelotlinc wrote:
This is an example of why active-passive is a bad idea. It's a lot better to run active-active or active-active-active.

Active / Active!
Vitor |
Posted: Wed Jun 22, 2011 4:56 am Post subject: |
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
mqjeff wrote:
lancelotlinc wrote:
This is an example of why active-passive is a bad idea. It's a lot better to run active-active or active-active-active.

This is an example of a badly managed install, not anything else.
@kris.pilaet
For clarity - When you say "Clustered Solaris 10" do you mean clustered, or do you mean zoned? Are there zones anywhere in all this?
Does the install on node A work with a stand-alone queue manager & broker?
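(For reference, a throwaway stand-alone test on node A might look like this. All names here are hypothetical, and on WMB 6.1 mqsicreatebroker still needs an existing broker database for -n:)

    crtmqm TESTQM                     # local queue manager on local disk, no cluster storage
    strmqm TESTQM
    mqsicreatebroker TESTBK -i mqbrkr -a passw0rd -q TESTQM -n TESTDB
    mqsistart TESTBK                  # if this comes up cleanly, suspect the cluster setup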
lancelotlinc |
Posted: Wed Jun 22, 2011 5:13 am Post subject: |
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
This passive node probably has not been operational in quite a long time. I think if both nodes were active and one became non-operational, the problem could have been corrected much sooner.
Vitor |
Posted: Wed Jun 22, 2011 5:19 am Post subject: |
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
lancelotlinc wrote:
This passive node probably has not been operational in quite a long time.
What leads you to that assumption?
Why is that more likely than the scenario where they're setting up an active / passive pair and are having trouble with the post-install testing?
There's nothing in the original post that I can see to indicate one way or the other. So what leads you to one scenario rather than the other?
lancelotlinc |
Posted: Wed Jun 22, 2011 5:20 am Post subject: |
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
I suppose we can ask the OP: how long has it been since you failed over to the other node?
Vitor |
Posted: Wed Jun 22, 2011 5:22 am Post subject: |
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
lancelotlinc wrote:
I suppose we can ask the OP: how long has it been since you failed over to the other node?
It is the logical way to find out.
My question about how you inferred one scenario rather than the other stands.
lancelotlinc |
Posted: Wed Jun 22, 2011 5:35 am Post subject: |
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
I know you do not like assumptions, so I guess I am guilty of making one there.
I based the assumption on the fact that he is on a 7-year-old OS running WMB 6.1, so I could not imagine it was a new configuration.
Vitor |
Posted: Wed Jun 22, 2011 5:49 am Post subject: |
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
lancelotlinc wrote:
I based the assumption on the fact that he is on a 7-year-old OS running WMB 6.1, so I could not imagine it was a new configuration.
I suppose Solaris 10 did come out 7 years ago, but isn't it still in support? When did Solaris 11 come out?
Bear in mind there are still people installing WMB 6.1 despite WMB 7 being out. While your site has an aggressive update policy, others do not, especially if they have a large existing estate and set up new machines off a standard install.
I will agree that somewhere in a dark corner of the OP's site, someone should be working on a new install with WMB v7, WMQ v7 & Solaris 11. But they may not be, and that's not going to solve the OP's problem.
Which could also be happening on an active / passive setup that's not been properly maintained and/or tested, as per your assumption.
kris.pilaet |
Posted: Wed Jun 22, 2011 6:22 am Post subject: |
Newbie
Joined: 21 Jun 2011 Posts: 8
Vitor wrote:
mqjeff wrote:
lancelotlinc wrote:
This is an example of why active-passive is a bad idea. It's a lot better to run active-active or active-active-active.

This is an example of a badly managed install, not anything else.

@kris.pilaet
For clarity - When you say "Clustered Solaris 10" do you mean clustered, or do you mean zoned? Are there zones anywhere in all this?
Does the install on node A work with a stand-alone queue manager & broker?
It is zoned and clustered. Our MQ & Broker zone runs on a cluster.
Queue manager data is shared between both nodes.
We have one queue manager per zone, contacted by node A or B.
When we switch, the only things that change are the hardware and the OS instance (the same OS runs on A & B).
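(For reference, with queue manager data shared between nodes, a classic failover trap is the service account resolving to different numeric ids on the two nodes. A hedged sketch of what to compare; usernameBroker is the account from the first post, and the paths are the MQ defaults:)

    # Run on node A and on node B, then diff the output:
    id usernameBroker          # uid/gid must match numerically on both nodes
    ls -ln /var/mqm/qmgrs      # numeric owner of the shared queue manager data
    ls -ln /var/mqm/log
    ipcs -a                    # stale semaphores / shared memory owned by the broker user?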
kris.pilaet |
Posted: Wed Jun 22, 2011 6:25 am Post subject: |
Newbie
Joined: 21 Jun 2011 Posts: 8
We've been having the problem since the end of May.
We are actually migrating to the version 7 broker, but I have a lot of applications running, all with their deadlines and SLAs.
Migration is, how to say, a bit of a struggle at my firm.
About migrating to the V7 broker: Fix Pack 2 hasn't been out that long, so why hurry?
Is v6.1 suddenly something prehistoric?
lancelotlinc |
Posted: Wed Jun 22, 2011 6:30 am Post subject: |
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
Your passive node has been non-operational for three and a half weeks. The point I was making was: if you were simply active-active, with no clustering or zones, you could have resolved the problem within minutes or hours rather than weeks.
WMB 6.1 is fine, especially if you have lots of apps that need TLC to migrate.
kris.pilaet |
Posted: Wed Jun 22, 2011 6:39 am Post subject: |
Newbie
Joined: 21 Jun 2011 Posts: 8
lancelotlinc wrote:
Your passive node has been non-operational for three and a half weeks. The point I was making was: if you were simply active-active, with no clustering or zones, you could have resolved the problem within minutes or hours rather than weeks.
WMB 6.1 is fine, especially if you have lots of apps that need TLC to migrate.

Okay lancelotlinc, I'm not going to argue that. But the solution remains the same either way, no? The thing is, what is the solution?