MQSeries.net Forum Index » IBM MQ Java / JMS » Is it safe to load balance JMS sessions at the TCP level?

paulau
Posted: Wed Jul 24, 2019 4:40 pm    Post subject: Is it safe to load balance JMS sessions at the TCP level?

Novice

Joined: 06 Feb 2017
Posts: 19

Hi,
I have been asked to look at the MQ integration approach for a Camel/Spring/JMS over MQ application that's running on OpenShift containers.

One of the requirements is to support connections to more than one queue manager from each container. We tried the CCDT Scenario 6 'queue manager groups' approach documented here https://www-01.ibm.com/support/docview.wss?uid=swg27020848

Unfortunately it doesn't work, because the Spring layer binds to the first connection factory and never allows another new connection factory call through to JMS. This means there is only ever one JMS connection factory channel instance to MQ, and all of the subsequent sessions (which create their own SVRCONN instances) are associated with that one queue manager.
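For reference, the queue manager group setup we tried looks roughly like this. This is a sketch assuming the mq-jms-spring-boot-starter (property names may differ by version); the CCDT path and group name GRP1 are placeholders:

```properties
# Point the client at a CCDT instead of a fixed host/port
ibm.mq.ccdtUrl=file:///opt/mq/ccdt/AMQCLCHL.TAB
# The '*' prefix means "connect to any queue manager in group GRP1"
ibm.mq.queueManager=*GRP1
```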

When we route the connections and sessions through MQIPT, they get load balanced appropriately using the SampleRoutingExit documented here: https://www.ibm.com/support/knowledgecenter/SSFKSJ_9.1.0/com.ibm.mq.con.doc/ipt4191_.htm

The application is event based, so it doesn't have any issues with requests and responses going to different queue managers.

I have seen some posts where it's claimed that JMS traffic shouldn't be routed through a load balancer, e.g.: "Load Balancer the load balancer could be provided by the container orchestration engine, or be a standard network load balancer such as an F5. Regardless it can provide TCP load balancing in front of the IBM MQ Queue Managers. This approach removes the application from making the connection selection, and instead will forward onto the load balancer IP and port. Often load balancers have the capability for more feature rich workload management strategies, that may make it more appealing than CCDT. Although there are certain capabilities such as JMS that are not recommended with an external load balancer, so these need to be considered when selecting an effective strategy."


Does anybody have any details about JMS load-balancing issues? It seems to work OK for us.

Any help would be appreciated.
Paul
hughson
Posted: Thu Jul 25, 2019 7:36 pm

Grand Master

Joined: 09 May 2013
Posts: 1282
Location: Bay of Plenty, New Zealand

Load balancing of client connections is generally OK. Obviously you need to take care of ensuring that wherever the client connection ends up, the queues it expects to use are available to it.

The reason load balancing of client connections is generally OK is because they are stateless, unlike QMgr-QMgr channels where you very definitely shouldn't load balance between them.

There is one situation where clients are not stateless, and that is when they are using XA transactions. If you are using XA transactions, please don't load balance, or state about prepared but not yet completed transactions may not be found during recovery.

Cheers,
Morag
_________________
Morag Hughson @MoragHughson
IBM MQ Technical Education Specialist
Get your IBM MQ training here!
MQGem Software
paulau
Posted: Thu Jul 25, 2019 10:38 pm

Novice

Joined: 06 Feb 2017
Posts: 19

Hi Morag, Thanks for the response.

They currently use XA to save state between steps in the payments process, so that's bad news. I understand that if XA thinks there are things in the LUW that the queue manager doesn't know about, because it's on a different connection, it would cause issues during recovery.

I guess we will have to keep looking for a way to get the ConnectionFactory to load balance. I am assuming that XA is happy if you keep all of the session connections on the same queue manager that the ConnectionFactory connected to.

Paul
fjb_saper
Posted: Fri Jul 26, 2019 4:36 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20133
Location: LI,NY

It is also a very bad idea to try to balance what is in essence a JMS MessageDrivenBean. (Think about load balancing an activation specification... nightmare!!)
When dealing with MDBs you should set the max number of instances for a specific queue, and then set up one instance of the MDB for each queue manager hosting the queue. This way you will get the message wherever it lands...

Have fun
_________________
MQ & Broker admin
paulau
Posted: Tue Jul 30, 2019 3:46 pm

Novice

Joined: 06 Feb 2017
Posts: 19

Well, after checking with the development team, it turns out that they are NOT using XA for transaction management. I think they may have started down the XA path a couple of years ago and then removed it during the performance test phase of the project.

So what they have now is the ability to specify concurrentConsumers in Camel, then some weird behaviour in Spring where it uses 1 ConnectionFactory and 2 × concurrentConsumers to generally create a cached pool of 9 MQ connections for the 4 application threads in Camel. There are only 4 open handles on the queue. It doesn't seem to matter what you do with maxConnections and maxSessionsPerConnection, as the MQ activity trace shows a new connection for every session.
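For context, the consumer side is configured along these lines. This is just a sketch of a Camel JMS endpoint URI; the queue name is a placeholder:

```
jms:queue:PAYMENTS.IN?concurrentConsumers=4&maxConcurrentConsumers=4&cacheLevelName=CACHE_CONSUMER
```

concurrentConsumers, maxConcurrentConsumers and cacheLevelName are standard Camel JMS component options that map onto the underlying Spring listener container.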

There are 150+ different microservices, some of which have very low volumes. The development team would like more options for managing the connections to MQ. With the Camel/Spring setup that means changing the Spring code to support independent ConnectionFactories, or changing the infrastructure to route the JMS sessions to different queue managers. We have also asked the MQ admins to deploy AMQSCLM so they don't have to have an application instance for each of the 4 queue managers.

Personally I think changing the code to have a connection factory for each queue manager is the more consistent way of managing MQ connections between the OpenShift containers and the queue managers in the cluster; however, I understand there is a benefit to not changing all of the legacy application code.

Part of the solution is increased visibility of what's happening with the connections so they are in the process of adding MQ statistics for channels and queue handles into the existing Prometheus monitoring framework.

@fjb_saper I understand where you are coming from with the nightmare comment, and agree that a connection factory per queue manager is the preferred way to listen to multiple queue managers. The current setup is a bit of a nightmare to understand, because the Spring framework does some weird stuff, and it was also being masked by the shared conversations setting on the SVRCONN channel. Now that they have more visibility of the MQ connections and open handles, and a better understanding of the framework behaviour, they can start to tune settings and make more informed decisions.
fjb_saper
Posted: Wed Jul 31, 2019 9:19 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20133
Location: LI,NY

Using MDBs I would also set shared conversations to 1 on the SVRCONN channel, or to 0 if some old compatibility behaviour is required.
Remember that you will need a minimum of (number of MDB sessions + 1) connections to read from the queue. (The +1 is for the activation specification.)
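Setting shared conversations on the server-connection channel is a one-line MQSC change (the channel name here is a placeholder):

```
ALTER CHANNEL('APP.SVRCONN') CHLTYPE(SVRCONN) SHARECNV(1)
```

Note that existing channel instances keep their negotiated SHARECNV value; only new channel instances pick up the change.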
Have fun
_________________
MQ & Broker admin
paulau
Posted: Mon Aug 12, 2019 4:25 pm

Novice

Joined: 06 Feb 2017
Posts: 19

The queue managers are V8 on Solaris and they currently have shared conversations set to 10. Agree the target should be 1, as it's proven to be compatible and is more efficient than 10.

The downside of the Camel/Spring Boot/JMS/MQ stack is that we start with 4 threads in Camel, and the Spring Boot cache seems to manage a connection pool that's over 30. So our microservices end up using a lot of MQ connections for a small number of threads. The MQ team want us to keep under 600 channel instances in production, and the actual number of MQ conversations is likely to be a couple of thousand.

So there's no option of moving to 1 conversation in the short term, until we understand how to measure and manage the connection pool.
paulau
Posted: Mon Sep 23, 2019 6:26 pm

Novice

Joined: 06 Feb 2017
Posts: 19

We were able to take a step back and adopt a broader view of the issues and options for Camel/Spring Boot connections to MQ. The 9.1.3 Uniform Cluster support for JMS encouraged us to do a POC. The Uniform Cluster support for JMS is great, and I think the automated load balancing will allow us to have a pattern that's reliable enough for payments, with most of the complex stuff supported in the infrastructure.
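For anyone trying the same POC, the client side is just a CCDT with one clientConnection entry per queue manager, all sharing the same group name. A sketch in the JSON CCDT format (available from MQ 9.1.2); host names, channel name and the group name UNIGRP are placeholders:

```json
{
  "channel": [
    {
      "name": "APP.SVRCONN",
      "type": "clientConnection",
      "clientConnection": {
        "connection": [ { "host": "unimq1.example.com", "port": 1414 } ],
        "queueManager": "UNIGRP"
      },
      "connectionManagement": { "clientWeight": 1, "affinity": "none" }
    },
    {
      "name": "APP.SVRCONN",
      "type": "clientConnection",
      "clientConnection": {
        "connection": [ { "host": "unimq2.example.com", "port": 1414 } ],
        "queueManager": "UNIGRP"
      },
      "connectionManagement": { "clientWeight": 1, "affinity": "none" }
    }
  ]
}
```

The application then connects with queue manager name *UNIGRP and lets the uniform cluster rebalance the connections.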

One of the problems with Camel/Spring is that it's big (bloated) and fairly generic. In my view, enhancing Spring to support multiple ConnectionFactories with application-level load balancing would just add to the complexity and build in an ongoing maintenance/migration cost. Startup time for each microservice to connect through to MQ is over 30s, which is not exactly consistent with the intent of a microservice design.

So at the moment we are looking at keeping the application code simple and having MQ monitor application connections via AMQSCLM and Uniform Clusters. In theory using AMQSCLM reduces the need for multiple ConnectionFactories for each container.

The other issue we are trying to manage is connections. WAS does a good job of sharing a small cached connection pool across a lot of application threads. With microservices, each microservice gets its own cached connection pool. In our Spring environment that ends with some of the containers we have tested having 4 application threads and 75 MQ connections. Did I mention that I am starting to dislike Spring Boot?
gbaddeley
Posted: Tue Sep 24, 2019 4:11 pm

Jedi

Joined: 25 Mar 2003
Posts: 2039
Location: Melbourne, Australia

paulau wrote:
...The MQ team want us to keep under 600 channel instances in production and the actual number of MQ conversations is likely to be a couple of thousand...
...Startup time for each micro service to connect through to MQ is over 30s which is not exactly consistent with the intent of a micro service design....
...In our Spring environment that ends with some of the containers we have tested having 4 application threads and 75 MQ connections....


From an MQ design standpoint this is not a desirable situation. MQ connect time should be much less than 1 second. MQ connections should be pooled, maintaining only as many connections as needed for the concurrent work going on. 600 is excessive, unless there is a need to process at least 600 concurrent application requests.

Spring Boot may not be the answer, unless you can configure it (and the other pieces in the puzzle) to be more sociable towards MQ. Deal with these architecture/design issues now, before they create major headaches and unreliability in production.
_________________
Glenn
paulau
Posted: Tue Sep 24, 2019 6:15 pm

Novice

Joined: 06 Feb 2017
Posts: 19

Hi Glenn,

Just to clarify the maths. There are 4 V8 Solaris queue managers that I think can do cross-site failover in a fairly clumsy way. Currently each queue manager has a SVRCONN limit of 600; however, shared conversations defaults to 10. The JMS implementation is not very good at sharing conversations, so the 600 channel instances per server probably support around 2000 application sessions, e.g. most channels have 1 conversation, but there are some with 5, 7, 9 etc.

There are 150 different services, each with its own container and independent connection pool cache. So on the application side it's 150 unique services with a minimum of 4 container instances. That means there are 600 application instances, each with their own connection pool.

And if there are 600 applications, there are 2400 application threads. In my testing the Spring Boot cache really only seems to work when it's set to 3 or more. I believe they are running with a cache size of 5. That's pretty much how we get to 8000 or so connections for 2400 application threads.

Agree that for the past 20 years we have been able to have a small number of MQ connections and efficiently share them across a larger application thread pool. With microservices running in their own containers that rule changes. In a microservice world, if there are 4 application threads, we need a connection pool for those threads that is at least as large. Our application services are probably closer to IIB services in purpose and functionality than they are to a traditional MQ application that would be deployed to WAS.

Camel is essentially an open source IIB environment where you configure the route (message flow) in XML or a GUI. Camel does have a JMS implementation; however, it's not as mature and battle-hardened, according to the newsgroups. So the MQ part of the startup processing is fast, it just takes 30+ seconds for a Docker container to start the OS, load the Camel and Spring stuff, and go through the process of establishing a connection factory and 74 sessions against a queue manager.

On one of the services I have traced, the 4 application threads had 75 connections, including 55 open handles on one queue. Agree that's not good, and we are trying to optimise things.

So if we go back to basics, there are 600 independent JVMs with 4 application threads each waiting on a message to arrive at a queue. So we have to support a minimum of 2400 potentially concurrent MQGET calls. In the test environments I have been looking at, most of the time they all sit there on a get-wait call. Sadly the default for the wait in Spring Boot is 1 second, so there are 2400 MQGET calls every second. Most of those MQGET calls result in a 2033, and then another 2033, and then another.

The best part is that Spring Boot doesn't use the connection passed by Camel; instead the Spring default listener class creates a new connection and does the MQGET call on that new connection. Now both of the threads need to go in the connection pool as active, and the code isn't smart enough to know if a 2033 was work that needs to be committed, so when control is passed back to Camel it just goes ahead and does a commit, as a 2033 wasn't an error that needed backing out. So we see double the connections for message listeners, and unnecessary MQCMIT calls for MQGETs that timed out. It even does an MQCMIT on the orphaned thread that Spring Boot decided not to use.

Did I mention that I was starting to hate Spring Boot?

Platform: MQPL_UNIX
=============================================================================
Tid Date Time Operation CompCode MQRC HObj (ObjName)
242335 2019-09-17 14:18:39 MQXF_GET MQCC_FAILED 2033 2 (Pxxxxxxxx )
242335 2019-09-17 14:18:40 MQXF_CMIT MQCC_OK 0000 -
=============================================================================

MonitoringType: MQI Activity Trace


Platform: MQPL_UNIX
=============================================================================
Tid Date Time Operation CompCode MQRC HObj (ObjName)
174190 2019-09-17 14:18:40 MQXF_CMIT MQCC_OK 0000 -
=============================================================================

MonitoringType: MQI Activity Trace
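The 1-second empty gets in the trace above come from the Spring DefaultMessageListenerContainer receiveTimeout, which Camel exposes as an endpoint option; raising it would cut some of the poll/commit churn. A sketch (the queue name is a placeholder and the values are illustrative, not tested recommendations):

```
jms:queue:PAYMENTS.IN?transacted=true&receiveTimeout=30000&concurrentConsumers=4
```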

So our 2400 concurrent MQGET calls actually need 4800 connections (one orphaned and one for the MQGET), and most of the services also have at least one output queue and a poisoned-message queue, each with its own cache, so we end up with heaps of MQ connections that can potentially be used on concurrent LUWs.

I would have thought that everyone going down the microservices path is having to deal with an increase in MQ connection volumes. Agree that Spring Boot is about as bad an example of a JMS/MQ implementation as you can get. Now that Red Hat is part of IBM, there might be some effort from the open source community maintaining Spring Boot (yes, Red Hat people appear to be active) to make it work a bit more efficiently. I am clearly not the only person complaining about issues with its JMS implementation, where it's functionally OK but really poor with the non-functional stuff.

It would be a big call for the team here to move off Spring Boot. There are some options to avoid the doubling of MQGET connections; however, they compromise the transaction management. The most likely scenario here is that they just have to learn to deal with a large number of MQ connections, as most of them can be concurrent.

Paul
PeterPotkay
Posted: Wed Sep 25, 2019 5:00 am

Poobah

Joined: 15 May 2001
Posts: 7579

Thanks for taking the time to post all that info. It will be helpful to more than 1 person I'm sure.
_________________
Peter Potkay
Keep Calm and MQ On
paulau
Posted: Wed Sep 25, 2019 3:18 pm

Novice

Joined: 06 Feb 2017
Posts: 19

I got a lot of the Spring details from this blog post: http://tmielke.blogspot.com/2012/03/camel-jms-with-transactions-lessons.html

There are other relevant posts, but that's the most comprehensive and detailed. We could see the doubling up of connections, but that post explained the why, e.g.: "The transaction manager does not re-use the already instantiated JMS connection or session but uses the registered connection factory to obtain a new connection and to create a new JMS session."

There are a lot of very smart people on this forum, so explaining the scenario should encourage some feedback if I have overlooked something.

While the original requirement was for a JVM to support connections to multiple queue managers, I am recommending against that. They could:
1) use the load balancer pattern at the Camel layer, with each JMS endpoint pointing to a different queue manager
2) extend the Spring framework with load balancing and multiple beans/connection factories for each queue manager (other sites have successfully done this)
3) use MQIPT at the TCP layer to spray the session connections across multiple queue managers (because of the orphaned connections the results are really inconsistent; sometimes we saw all the orphaned connections go to the same queue manager, so I wouldn't trust this in prod)
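Option 2 boils down to one connection factory bean per queue manager and a Camel JMS component per factory, roughly like this. A sketch only; bean names, hosts, queue manager names and the channel are placeholders:

```xml
<!-- One MQConnectionFactory per queue manager -->
<bean id="qm1Cf" class="com.ibm.mq.jms.MQConnectionFactory">
  <property name="transportType" value="1"/> <!-- 1 = client transport -->
  <property name="queueManager" value="QM1"/>
  <property name="connectionNameList" value="mqhost1(1414)"/>
  <property name="channel" value="APP.SVRCONN"/>
</bean>

<!-- A Camel JMS component bound to that factory -->
<bean id="jmsQm1" class="org.apache.camel.component.jms.JmsComponent">
  <property name="connectionFactory" ref="qm1Cf"/>
</bean>
<!-- repeat for QM2..QM4, then route from "jmsQm1:queue:PAYMENTS.IN" etc. -->
```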

I think if they use AMQSCLM and Uniform Clusters they can get adequate HA with more predictability, and keep the application layer as simple as possible.

Stats for the number of connections per container range from 16 to 307 in the 20 I looked at yesterday. So while the implementation isn't purist microservices, and the containers are treated more like pets than cattle, I still think it's a reasonable compromise, and they get a lot of benefits from OpenShift etc.

In reality the biggest issue is the decision around using bloated packages and frameworks versus custom efficient code. So if you go with a big bit of software like Camel/Spring, then expect to have to live with the overheads.