Author |
Message
|
psiracusa |
Posted: Tue Apr 01, 2008 12:35 pm Post subject: Cluster Corruption When Rebooting Servers |
|
|
Apprentice
Joined: 17 Nov 2006 Posts: 34
|
We have a cluster configuration with 2 full repositories and 2 partial repositories. Our test team has gotten into the habit of rebooting the two servers that host the full repositories at the same time. Whenever they come back up, we get the 2085 error on the first open and then the 2189 after that. This is easily solved by rebooting just one full repository server at a time, but I was wondering if anyone else had noticed this type of behavior or if this is to be expected. All versions are 6.0.2.2; the full repository servers are Solaris on Intel and the partials are Windows 2003. |
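For anyone finding this thread later, here is a minimal sketch of the "one full repository server at a time" workaround described above. The two reason codes are the standard MQ names for 2085 and 2189. Everything else is an assumption for illustration: `rolling_restart`, the `restart` and `is_up` callables, and the host names are hypothetical placeholders for whatever reboot and health-check mechanism your site actually uses. This is not an IBM-supplied procedure.

```python
import time
from typing import Callable, Iterable, List

# Standard MQ reason codes for the two errors mentioned in the post.
REASON_CODES = {
    2085: "MQRC_UNKNOWN_OBJECT_NAME",
    2189: "MQRC_CLUSTER_RESOLUTION_ERROR",
}


def describe(code: int) -> str:
    """Return the symbolic name for a reason code, if this sketch knows it."""
    return REASON_CODES.get(code, f"unrecognised reason code {code}")


def rolling_restart(
    hosts: Iterable[str],
    restart: Callable[[str], None],   # hypothetical: triggers the reboot
    is_up: Callable[[str], bool],     # hypothetical: health check for the host
    poll_seconds: float = 5.0,
    max_polls: int = 60,
) -> List[str]:
    """Restart hosts strictly one at a time.

    The next host is only touched once the previous one reports healthy,
    so at least one full repository server stays available throughout.
    """
    done: List[str] = []
    for host in hosts:
        restart(host)
        for _ in range(max_polls):
            if is_up(host):
                break
            time.sleep(poll_seconds)
        else:
            # Stop here rather than take the remaining full repository down.
            raise RuntimeError(f"{host} did not come back; aborting rolling restart")
        done.append(host)
    return done
```

Called as, say, `rolling_restart(["fr1", "fr2"], restart=my_reboot, is_up=my_check)`, it refuses to move on to the second server until the first is back, which is the property the simultaneous reboots were violating.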
|
jefflowrey |
Posted: Tue Apr 01, 2008 1:42 pm Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
What part of "FRs must be highly available" doesn't your team understand?
If this is in production, you are getting off very lightly for all your
 _________________ I am *not* the model of the modern major general. |
|
psiracusa |
Posted: Wed Apr 02, 2008 6:47 am Post subject: |
|
|
Apprentice
Joined: 17 Nov 2006 Posts: 34
|
It's not my team and no, it's not production, hence the term "test". I was just wondering if anyone else had experienced the issue or if it was a bug. I couldn't find any documentation that said corruption would occur if a partial lost connectivity to all full repos for any period of time. |
|
Vitor |
Posted: Wed Apr 02, 2008 6:51 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
psiracusa wrote: |
I was just wondering if anyone else had experienced the issue or if it was a bug. |
Unlikely, as very few clusters (even test ones) are run with both FRs out. It's not a bug, as the documentation does say each cluster must have at least one FR.
psiracusa wrote: |
I couldn't find any documentation that said corruption would occur if a partial lost connectivity to all full repos for any period of time. |
See above re: definition of a cluster. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
psiracusa |
Posted: Wed Apr 02, 2008 7:08 am Post subject: |
|
|
Apprentice
Joined: 17 Nov 2006 Posts: 34
|
I was leaning in that direction but I just wanted to bounce it off someone else. Thanks Vitor. |
|
PeterPotkay |
Posted: Wed Apr 02, 2008 7:15 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
A cluster should function just fine if both its Full Repositories are down. Messages will flow. You just can't make any cluster configuration changes since those require at least 1 of the 2 FRs to process the change.
Note that I have never tried doing this; I have just read that it's true in posts on the listserv and maybe here, and heard about it in the clustering classes.
Ian? Nigel?
(I would defer to one of these 2 guys to be 100% sure) _________________ Peter Potkay
Keep Calm and MQ On |
|