cicsprog |
Posted: Tue Aug 21, 2018 5:33 am Post subject: Elongated Recovery in a DRP test |
|
|
Partisan
Joined: 27 Jan 2002 Posts: 347
|
Running MQ v8 on z/OS and Linux. A large MQ cluster spans this MQ network. Both full repositories live on z/OS and are dedicated mostly to handling subscription messages for the large cluster.
During a DRP test covering only the Linux MQ apps, the Linux queue managers take a long time to become functional. From what I am told, MQ is trying to resolve the connections for the cluster.
While this probably isn’t an optimal DRP test with the full repos on the mainframe, the customer would like to reduce this recovery time for MQ. My only thought for a resolution is to add another full repository on a Linux queue manager.
Would appreciate your input on my solution or another approach. Thanks!!! |
|
Vitor |
Posted: Tue Aug 21, 2018 7:03 am Post subject: Re: Elongated Recovery in a DRP test |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
cicsprog wrote: |
Would appreciate your input on my solution or another approach. |
What's causing the delay in "resolving the connections for the cluster"? Do you just mean it's taking a while for the DNS to work through? If so, why, and why would having a Linux full repository help with that?
As to your proposed solution, DO NOT ADD A THIRD FR! If you think moving one of the z/OS ones to distributed will help then so be it, but there are 2 FRs in a cluster. Not 3, and 1 FR only on the way to 2. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
cicsprog |
Posted: Tue Aug 21, 2018 7:56 am Post subject: |
|
|
Partisan
Joined: 27 Jan 2002 Posts: 347
|
Ya, seeing is believing. Just doing some gap analysis for a client.
I'm being told it's 6 hrs trying to resolve connections. Since the FRs aren't accessible during the test, it's not surprising MQ would thrash around trying to recover. |
|
Vitor |
Posted: Tue Aug 21, 2018 8:04 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
cicsprog wrote: |
it's not surprising MQ would thrash around trying to recover. |
No, it is.
The channels should obey their short and long retry intervals rather than "thrash about", and even if the FRs are unavailable all of the PRs in the cluster should start up the already defined auto channels to the other PRs. Connection should take about 30 seconds. You're going to get a lot of retry messages out of the manually defined channels to the FRs (obviously) but there's no good reason why all that is taking 6 hours. 6 minutes would be a long time. How long until the cluster is available after you fail back and the FRs are available again?
Again, what does "resolve connections" mean? The DNS? The PRs certainly don't need to "thrash about" to find all the other PRs (and the cluster doesn't work that way anyway) because they know where they are, so what is happening for 6 hours? _________________ Honesty is the best policy.
Insanity is the best defence. |
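To put rough numbers on the retry behaviour described in this post, here is a minimal Python sketch of a short/long retry schedule. The values used (10 short retries 60 seconds apart, then long retries every 1200 seconds) are what I understand to be the usual defaults for SHORTRTY, SHORTTMR and LONGTMR; treat them as assumptions and check your own channel definitions. The function names are made up for illustration, and the model ignores the time each connection attempt itself takes.

```python
def retry_times(short_count, short_interval, long_interval):
    """Yield the seconds after the first failure at which each retry is made.

    Simplified model: short retries first, then long retries forever
    (a real channel stops after its long retry count is exhausted).
    """
    t = 0
    for _ in range(short_count):
        t += short_interval
        yield t
    while True:
        t += long_interval
        yield t


def first_retry_at_or_after(available_at, short_count=10,
                            short_interval=60, long_interval=1200):
    """Return the time of the first retry made once the partner is reachable.

    The default arguments are assumed values, not read from any system.
    """
    for t in retry_times(short_count, short_interval, long_interval):
        if t >= available_at:
            return t


if __name__ == "__main__":
    # Partner (for example an FR) becomes reachable after 5, 15 and 360 minutes.
    for minutes in (5, 15, 360):
        available = minutes * 60
        reconnect = first_retry_at_or_after(available)
        print(f"reachable at {available}s, next retry at {reconnect}s, "
              f"extra wait {reconnect - available}s")
```

Under these assumed values a channel that has dropped into long retry is never more than 20 minutes away from its next attempt, so retry intervals alone would not account for a 6 hour wait.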
|
cicsprog |
Posted: Tue Aug 21, 2018 8:17 am Post subject: |
|
|
Partisan
Joined: 27 Jan 2002 Posts: 347
|
I will try and see if I can get some logs from Linux MQs from past DRs. Otherwise, I need to be present to see what's happening. |
|
bruce2359 |
Posted: Tue Aug 21, 2018 8:48 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
During this 6-hour lag, are the channels in RUNNING state? RETRY state? Or any state other than RUNNING?
Did your DISPLAYs during the 6-hour lag indicate any in-doubt transactions or channels?
You didn't indicate what happens at the end of 6 hours. Did the qmgrs suddenly become functional? _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
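If someone captures channel status output during the next test, a small script can summarise how many channels are in each state. The following is only a sketch: it assumes the captured text uses the KEY(VALUE) attribute style that runmqsc prints, the sample text is invented rather than taken from a real system, and `tally_channel_status` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches KEY(VALUE) attribute pairs, e.g. STATUS(RUNNING).
ATTR = re.compile(r"(\w+)\(([^)]*)\)")


def tally_channel_status(text):
    """Count how many STATUS(...) attributes carry each value."""
    counts = Counter()
    for key, value in ATTR.findall(text):
        if key == "STATUS":
            counts[value] += 1
    return counts


# Hypothetical sample, not captured from a real queue manager.
SAMPLE = """
   CHANNEL(TO.FR1)   CHLTYPE(CLUSSDR)   STATUS(RETRYING)
   CHANNEL(TO.FR2)   CHLTYPE(CLUSSDR)   STATUS(RETRYING)
   CHANNEL(TO.PR7)   CHLTYPE(CLUSSDR)   STATUS(RUNNING)
"""

if __name__ == "__main__":
    for status, n in tally_channel_status(SAMPLE).most_common():
        print(status, n)
```

A summary like this, taken a few times during the lag, would show whether the channels to the other PRs are actually running while only the FR channels retry.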
|
Vitor |
Posted: Tue Aug 21, 2018 9:02 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
cicsprog wrote: |
Otherwise, I need to be present to see what's happening. |
Or you need someone to be a lot more precise and detailed about what's actually happening. Who's telling you that it's because the queue managers are "resolving connections" (whatever that means) and how did they make that determination? Did the DR'd system start working after 6 hours and they pulled this explanation out of thin air because it sounded technical? Where's their evidence for what's happening?
Most important of all, why would an FR on the distributed side make any difference? Do you think it will make a difference, or is this mythical person saying that having a Linux-based FR will "resolve connections" quicker?
I suspect the logs will show the queue managers starting after a few minutes and then sitting there. But you don't need to be there to see it; you need this person to tell you, in detail, what they saw and how they drew conclusions from it. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
cicsprog |
Posted: Tue Aug 21, 2018 7:07 pm Post subject: |
|
|
Partisan
Joined: 27 Jan 2002 Posts: 347
|
Thanks for the input... waiting for a call back from the MQ admin to get some more info and the PMR that was supposedly opened at the time of the issue. |
|
fjb_saper |
Posted: Sun Aug 26, 2018 8:24 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
cicsprog wrote: |
Thanks for the input... waiting for a call back from the MQ admin to get some more info and the PMR that was supposedly opened at the time of the issue. |
You may also consider disabling reverse DNS lookup. I could imagine that it adds quite a bit to your delays in DR...  _________________ MQ & Broker admin |
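One way to check whether reverse DNS is a plausible culprit is to time a few reverse lookups from a DR host. Below is a minimal sketch using only the Python standard library; `time_reverse_lookup` is a hypothetical helper name, and the address in the usage section is just the loopback address as a placeholder. Substitute the partner addresses your channels actually connect from.

```python
import socket
import time


def time_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (hostname or None, elapsed seconds) for one reverse lookup.

    The resolver is injectable so the timing logic can be exercised
    without touching the network.
    """
    start = time.perf_counter()
    try:
        hostname = resolver(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        hostname = None
    return hostname, time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder address; replace with the real partner IPs.
    for ip in ["127.0.0.1"]:
        name, seconds = time_reverse_lookup(ip)
        print(f"{ip}: {name or 'no PTR record'} in {seconds:.3f}s")
```

If each lookup takes several seconds to time out in the DR environment, that points at DNS rather than at the cluster itself.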
|