MQSeries.net Forum Index » General IBM MQ Support » Multiple Gateway QMs in cluster (Page 2 of 2)
hughson
Posted: Sun Apr 19, 2020 4:42 pm

Grand Master

Joined: 09 May 2013
Posts: 1433
Location: Bay of Plenty, New Zealand

Received the following image and text in an email from the OP.

zrux wrote:
Please find the picture. I have edited your diagram and tried to make it look like the topology I had in mind.

The basic issue I am trying to address at this design stage is:
How can the external QM, which is not part of the cluster, send to the GW QMs in DC1 and DC2, given that if connectivity to DC1 or its infrastructure is down, the messages should be able to go to DC2?

The earlier design you sent assumes the GW QM in DC1 is always up and running. I understand that the GW QM in DC1 can be made multi-instance (MI) or put under a VCS cluster, but even so we cannot assume that it will always be reachable by the external QM; there might be cases where DC1 is not reachable due to network issues, an NFS failure in DC1, etc.

I think the SDRs from DC1/DC2 to the external QM need to be uniquely named, to rule out SDRs going out of sync.
The RCVRs at DC1/DC2 need to have the same name, so that the remote queue on the EXT QM can use the same SDR channel to send messages to either DC1 or DC2.

Scripts in DC1/DC2 would reset the sequence numbers whenever an error appears in the error logs, using the expected sequence number reported there.


I have asked for confirmation about which queue managers have the same name, as this is not shown in the picture, but have had no response as yet.
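(For readers following along: the channel scheme zrux describes could be sketched in MQSC roughly as below. All object names, hosts and ports are illustrative, not taken from the diagram.)

```
* On GWQM1 and GWQM2: receiver channels with the SAME name,
* so the external QM's one sender can connect to either DC.
DEFINE CHANNEL(EXT.TO.GW) CHLTYPE(RCVR) TRPTYPE(TCP) REPLACE

* On each gateway QM: a UNIQUELY named sender back to the external QM,
* so the two senders cannot go out of sync with each other.
DEFINE CHANNEL(GWQM1.TO.EXT) CHLTYPE(SDR) +
       CONNAME('extqm.example.com(1414)') XMITQ(EXTQM) REPLACE

* On the external QM: one sender whose connection list names both DCs;
* the entries are tried in order each time the channel starts.
DEFINE CHANNEL(EXT.TO.GW) CHLTYPE(SDR) +
       CONNAME('dc1gw.example.com(1414),dc2gw.example.com(1414)') +
       XMITQ(GW.XMITQ) REPLACE

* What the reset scripts would issue when the logs report a
* sequence-number mismatch (run at the sending end):
RESET CHANNEL(EXT.TO.GW) SEQNUM(1)
```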



Cheers,
Morag
_________________
Morag Hughson @MoragHughson
IBM MQ Technical Education Specialist
Get your IBM MQ training here!
MQGem Software
fjb_saper
Posted: Mon Apr 20, 2020 6:56 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20305
Location: LI,NY

This is all very distressing.
Reading the OP you'd think that both DCs participate in a single cluster.
However, this is not at all reflected in the picture.
It looks here as if the two DCs are completely independent...

Count me confused...
_________________
MQ & Broker admin
hughson
Posted: Mon Apr 20, 2020 5:36 pm

Grand Master

Joined: 09 May 2013
Posts: 1433
Location: Bay of Plenty, New Zealand

Got an answer to my email question about same-named queue managers.

zrux wrote:
Initially my thinking was that GWQM1/2 could have the same name, but the cluster won't like that.
So they have been renamed GWQM1 and GWQM2. I am open to suggestions on this.


I am also confused by this since the diagram doesn't have GWQM1 and GWQM2 in the same cluster.

However, it is very good practice to avoid designing something with same-named queue managers, whether they are in the same cluster or not, so I am glad to hear that this is now the case.

@zrux, if you see this post, please can you help us to understand the two different clusters shown as DC1 and DC2 on your diagram. Are they all one cluster, or two different clusters?

Cheers,
Morag
_________________
Morag Hughson @MoragHughson
IBM MQ Technical Education Specialist
Get your IBM MQ training here!
MQGem Software
zrux
Posted: Tue Apr 21, 2020 12:18 am

Apprentice

Joined: 21 May 2006
Posts: 27
Location: UK

The QMs in DC1 and DC2 are part of the same cluster.
pcelari
Posted: Wed Apr 22, 2020 6:39 am

Partisan

Joined: 31 Mar 2006
Posts: 375
Location: New York

Can we assume your DC1 and DC2 are geographically so far apart (DC1 in NY, DC2 in LA) that a multi-instance QM as GWQM is not feasible due to latency?

If that is the case, the discussion needs to continue; I'm sure you're not the first one encountering this need...

If it's NOT the case, you can readily put an F5 GTM in front of the two DCs so that external parties retrieve the currently active IP, based on which DC has the active GWQM, provided multi-instance is an option for you. Or you can keep a copy of the GWQM synchronized using a metro-cluster solution at the O/S and network level, without using multi-instance.

It's an interesting discussion...
_________________
pcelari
-----------------------------------------
- a master of always being a newbie
Vitor
Posted: Wed Apr 22, 2020 7:28 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 25995
Location: Texas, USA

pcelari wrote:
Can we assume your DC1 and DC2 are geographically so far apart (DC1 in NY, DC2 in LA) that a multi-instance QM as GWQM is not feasible due to latency?

If that is the case, the discussion needs to continue; I'm sure you're not the first one encountering this need...


...this is exactly the situation I find myself in, and my earlier post refers to it. We have one (1) gateway queue manager with one (1) external-facing IP address that runs in any of our data centers, but only ever in one at any given time. So in the event of a failure it looks from the outside like our gateway queue manager went down and, a short time later, came back. From the inside it's a completely different instance of the same queue manager running somewhere different.
_________________
Honesty is the best policy.
Insanity is the best defence.
zrux
Posted: Wed Apr 22, 2020 9:29 am

Apprentice

Joined: 21 May 2006
Posts: 27
Location: UK

@pcelari - The DCs are thousands of miles apart, and hence cannot be under MQ MI.

I will need to explore the "metro-cluster solution at the O/S" level. If you have any more info on this, please share. Also, has anyone successfully implemented it?

-------------

@Vitor - As I said earlier, the setup I am preparing needs to take into account that the external QM may not be able to connect to the DC1 GWQM; there might be cases where DC1 is not reachable due to network issues, an NFS failure on the DC1 GWQM MI pair, etc.

The external QM should be able to connect to DC2 if the network link or the HA MQ gateway in DC1 is not available.

-------------
Vitor
Posted: Wed Apr 22, 2020 10:03 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 25995
Location: Texas, USA

zrux wrote:

@Vitor - As I said earlier, the setup I am preparing needs to take into account that the external QM may not be able to connect to the DC1 GWQM; there might be cases where DC1 is not reachable due to network issues, an NFS failure on the DC1 GWQM MI pair, etc.

The external QM should be able to connect to DC2 if the network link or the HA MQ gateway in DC1 is not available.

-------------


I get all of that. That's the assumption in my setup.

So I have a queue manager called GWQM which all my external clients connect to at 256.300.218.1, and it runs in DCA in New York. It doesn't run in DC1 (because that's your data center), nor does it run in DCB (Austin), DCC (San Francisco) or DCE (Toronto).

One day Godzilla comes up out of the Hudson and stomps DCA flat, doing some damage to the rest of New York to cover up the fact he's working for a rival financial institution bent on sabotage. The data center is a total loss: no connectivity, nothing. The external client channels go into retry.

A brief period of time later, GWQM starts up again and starts receiving traffic on 256.300.218.1. Our clients relax and start sending us stuff. They don't know that the queue manager is now running in DCB with the IP address redirected, and they don't care. If any compute capacity survives in DCA and we can reconnect to it once the cable's fixed, then we can start sending cluster traffic from DCB to DCA in the same way we used to send it from DCA to DCB.

If Godzilla takes a road trip and starts smashing up Austin, then we move the IP address again and use the queue manager in one of the surviving data centers, including (in extremis) a queue manager running in Azure or AWS.

But there's still only one queue manager called GWQM in our topology.
_________________
Honesty is the best policy.
Insanity is the best defence.
zrux
Posted: Wed Apr 22, 2020 12:02 pm

Apprentice

Joined: 21 May 2006
Posts: 27
Location: UK

@Vitor - So with your setup, which have you got? Is it:

1) GWQM on DCA and GWQM on DCB, with the QM names being the same (unlikely, as a cluster doesn't like same-named QMs)

or

2) GWQM on DCA and GWQMx on DCB, with the QM names being different, but with a QM alias created on DCB/DCA for when external QMs try to connect?

or

3) moving the GWQM's files to DCB in the event of a failure in DCA

or

4) something else (not sure what...)

If you are doing 2), and the QM at DCB is a different instance altogether, resolved by an alias, are you resetting the SDR/RCVR channel sequence numbers when a failover occurs from DCA to DCB?

Or is there any other way you are managing the failovers?
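(For reference, option 2) would typically use a queue-manager alias, i.e. a QREMOTE with a blank RNAME. A minimal MQSC sketch, with hypothetical object and channel names:)

```
* On GWQMx in DCB: accept messages still addressed to GWQM
* by aliasing that queue manager name to the local QM.
DEFINE QREMOTE(GWQM) RNAME('') RQMNAME(GWQMx) REPLACE

* Because GWQMx is a different queue manager instance, the channel
* sequence numbers will not match after a failover, so the sending
* end would also need something like:
RESET CHANNEL(EXT.TO.GW) SEQNUM(1)
```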
Vitor
Posted: Wed Apr 22, 2020 12:30 pm

Grand High Poobah

Joined: 11 Nov 2005
Posts: 25995
Location: Texas, USA

zrux wrote:
3) moving the GWQM's files to DCB in the event of a failure in DCA


Though we're not moving them in the event of a failure; we're replicating them in case there's a failure. One data center is the designated failover site and gets the files in real time; the rest are replicated as and when (typically less than 60 seconds later).

This isn't an MQ thing; this is a "what do we do if we lose all or part of DCA" thing. As a practice we try to distribute everything over all the data centers, but there are a few things for which this is not feasible (the gateway queue manager being a good example).

Note that you don't mention changing the external IP address so it resolves to DCB, not DCA.
_________________
Honesty is the best policy.
Insanity is the best defence.
pcelari
Posted: Thu Apr 23, 2020 6:33 am

Partisan

Joined: 31 Mar 2006
Posts: 375
Location: New York

Vitor wrote:
... we're replicating them in case there's a failure. One data center is the designated failover site and gets the files in real time; the rest are replicated as and when (typically less than 60 seconds later).


@Vitor, this is a brilliant design. It seems your failover site is a relay point for real-time data synchronization: a hub for replicating to the other sites in the event the primary site goes down. The other sites stay synchronized with a slight, acceptable delay. Presumably your failover site must be somewhere in the middle geographically.

But given the physical distance, a single IP point of entry will likely result in significant latency for clients at the far end of the continent. How do you remedy that?

@zrux, Metro-Cluster data synchronization will not work for you, as your DCs are too far apart. It works only for DCs within a "metropolitan area", say 50 miles apart.
_________________
pcelari
-----------------------------------------
- a master of always being a newbie
Vitor
Posted: Thu Apr 23, 2020 6:58 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 25995
Location: Texas, USA

pcelari wrote:
But given the physical distance, a single IP point of entry will likely result in significant latency for clients at the far end of the continent. How do you remedy that?


Dedicated fiber, and telling the clients to suck it up.


Seriously, we have the best backbone money can lease, and we do packet-shaping at the network level to give priority to external clients with low SLAs.

But (and don't tell anyone on this forum) this is one reason we're moving away from MQ as a transport protocol and onto REST. It gives a lot more flexibility in routing clients to geographically adjacent data centers, and external clients with low SLAs / poor tolerance for latency are being encouraged to move away from MQ so they can use a more local DC.

A good whack of our available bandwidth is used (and reserved) for replication as we chase the Holy Grail of multiple synchronized data centers over which the entire workload can be evenly distributed. We also have an enormous amount of automation and monitoring controlling the replication and failover of components, most of which works most of the time, but we still make sacrifices to the gods of business continuity during the quarterly failover testing.

It's also worth pointing out that we are a North American institution with some Canada and a spot of Mexico, so we're not trying to do this globally, which limits the possible physical distance to "gosh" rather than "yikes".
_________________
Honesty is the best policy.
Insanity is the best defence.
zrux
Posted: Wed Jun 17, 2020 7:57 am

Apprentice

Joined: 21 May 2006
Posts: 27
Location: UK

Has anyone tried RDQM (available from v9 onwards) instead of copying the gateway QM's files to the second site?
exerk
Posted: Wed Jun 17, 2020 8:45 am

Jedi Council

Joined: 02 Nov 2006
Posts: 6168

zrux wrote:
Has anyone tried RDQM (available from v9 onwards) instead of copying the gateway QM's files to the second site?

Will the link between the DCs meet the minimum latency requirement for RDQM?

The KC states:

RDQM HA
Quote:
If you do choose to locate the nodes in different data centers, then be aware of the following limitations:

* Performance degrades rapidly with increasing latency between data centers. Although IBM will support a latency of up to 5 ms, you might find that your application performance cannot tolerate more than 1 to 2 ms of latency.


RDQM DR
Quote:
You should be aware of the following limitations:

* Performance degrades rapidly with increasing latency between data centers. IBM will support a latency of up to 5 ms for synchronous replication and 50 ms for asynchronous replication.
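(For reference, the replicated queue manager would be created with crtmqm's RDQM options; a hedged sketch with a hypothetical QM name, DRBD/Pacemaker prerequisites and storage sizing omitted. Check the crtmqm documentation for the exact replication options on your version:)

```
# HA RDQM (nodes must be low-latency, per the limits quoted above):
crtmqm -sx GWQM        # on the node where the QM should first run
crtmqm -sxs GWQM       # on each of the other two nodes in the HA group

# DR RDQM between two DCs (asynchronous replication tolerates up to ~50 ms):
crtmqm -rr p -rl <DC1-IP> -ri <DC2-IP> -rn <DC2-node> -rp 7001 GWQM   # DC1, primary
crtmqm -rr s -rl <DC2-IP> -ri <DC1-IP> -rn <DC1-node> -rp 7001 GWQM   # DC2, recovery
```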

_________________
It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys.
