MQSeries.net :: View topic - Significance of Times Max Thread Reached in flow stats

Significance of Times Max Thread Reached in flow stats

abd.wsu

Posted: Thu Jun 01, 2017 8:19 am Post subject: Significance of Times Max Thread Reached in flow stats

Acolyte

Joined: 12 Sep 2012
Posts: 65

Hi,

We have a flow that makes an HTTP request call to an external web service, and there have been complaints of slowness. We turned on snapshot stats on our broker running on IIB10 and see that the Average Elapsed time for the entire flow is about 300ms, with the request node taking about 200ms. Now, this doesn't look slow to us, and it has been pretty much the same ever since we started collecting the stats.

However, while looking at the stats from Tivoli, I noticed we have an unusually high `Times Max Threads Reached` count. We are using 6 additional instances, so the no. of threads is showing as 7, but at a certain point I saw `Times Max Threads Reached` at 60K. Should this concern me? Is this indicative of anything wrong with the flow?
The slowness resolved on its own, but we really don't see any difference in the stats between when the issue was happening and after it was resolved. I am just trying to clean up from my side and this huge number caught my eye. Please suggest.

mqjeff

Posted: Thu Jun 01, 2017 8:26 am Post subject:

Grand Master

Joined: 25 Jun 2008
Posts: 17447

I think "Times Max Threads Reached" means "the number of times that all threads were in use".

That suggests that you should increase the number of instances to better match the volume of incoming messages...

If you can capture it in a lower environment than production, a user trace would be useful.
_________________
chmod -R ugo-wx /

Vitor

Posted: Thu Jun 01, 2017 8:29 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

abd.wsu wrote:

We are using 6 additional instances, so the no. of threads is showing as 7, but at a certain point I saw `Times Max Threads Reached` at 60K. Should this concern me? Is this indicative of anything wrong with the flow?

If you have 360K messages going through the flow during the period of the snapshot, probably not. If you have 60K, then probably. You don't mention volumes.

What that statistic is telling you is the number of times (during the interval) all of the threads were in use. At its simplest, it's telling you that (based on the flow throughput you don't mention) you're using all of the thread resource you've allocated and therefore there's a possibility of running out if throughput increases.

abd.wsu wrote:

The slowness resolved on its own

I am always highly suspicious of situations where the magic just came back on its own. It's much more likely that whatever problem was causing the slowdown (and that problem could be external to IIB) was resolved.

I would not want to be in a position where users complain about slowness and my response is "don't worry; it happens sometimes, it'll fix itself in a minute if we just wait patiently" or "oops, out of magic again; I'll just pour some pixie dust in the back of the server"
_________________
Honesty is the best policy.
Insanity is the best defence.

abd.wsu

Posted: Thu Jun 01, 2017 9:18 am Post subject:

Acolyte

Joined: 12 Sep 2012
Posts: 65

Thanks for the replies.

@mqjeff, I'll see what I can do in a lower environment. I need something to generate a load close enough to production. I'll try some options.

@Vitor, sorry I didn't mention the total input messages. It's actually the same number as the Times Max Threads Reached at that point: 60K.

We didn't wait for the magic to come back, but followed the 'Bounce everything, hold hands and sing kumbaya' mantra to please the Gods. Yes, we are digging in to see what caused the slowness. There is an LDAP authentication layer which is being looked at closely. But from my observation, the only thing to indicate any issue is these stats. We don't see any errors in the logs and, like I said, the Average Elapsed time in the flow stats is pretty much the same before and after the issue.

Vitor

Posted: Fri Jun 02, 2017 4:53 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

abd.wsu wrote:

We didn't wait for the magic to come back, but followed the 'Bounce everything, hold hands and sing kumbaya' mantra to please the Gods.

It's an equally valid strategy, and more proactive. Indeed with Windoze, it's the de facto first step.

Vitor

Posted: Fri Jun 02, 2017 4:55 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

abd.wsu wrote:

@Vitor, sorry I didn't mention the total input messages. It's actually the same number as the Times Max Threads Reached at that point: 60K.

So whatever's causing the slowness, it's clear that you were at the limit of the thread resources. This could cause your customers to receive actual timeouts and connection refused type errors rather than just slow performance.

fjb_saper

Posted: Fri Jun 02, 2017 5:35 am Post subject:

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20771
Location: LI,NY

I could not find the interval, but I looked at the math and this is what it gives:
300 ms avg elapsed flow time gives about 3 TPS per thread
3 TPS times 7 threads = about 21 TPS.

60 K messages @ 21 TPS = about 48 mins running nearly full tilt...
(60000 * 300 / 1000 / 60 / 7 = 42.9 mins)

If you expected to be done any sooner you have a capacity problem....
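
The arithmetic above can be sketched in a few lines of Python. This is only a rough illustration: the function name and parameters are made up for this example, and it assumes every thread is busy the whole time and every message takes exactly the average elapsed time. Using the rounded 3 TPS per thread gives the ~48 min figure; using the unrounded 1000/300 gives ~42.9 min.

```python
def drain_time_minutes(messages, avg_elapsed_ms, threads, per_thread_tps=None):
    """Estimate how long `threads` flow instances need to process `messages`.

    Hypothetical helper: assumes all threads are always busy and each
    message takes exactly `avg_elapsed_ms` to process.
    """
    if per_thread_tps is None:
        # One thread completes 1000 / avg_elapsed_ms messages per second.
        per_thread_tps = 1000.0 / avg_elapsed_ms
    total_tps = per_thread_tps * threads
    seconds = messages / total_tps
    return seconds / 60.0


# Figures from this thread: 60K messages, 300 ms average, 7 threads.
exact = drain_time_minutes(60_000, 300, 7)
rounded = drain_time_minutes(60_000, 300, 7, per_thread_tps=3)

print(f"unrounded: {exact:.1f} min")    # unrounded: 42.9 min
print(f"rounded:   {rounded:.1f} min")  # rounded:   47.6 min
```

Either way, the flow needs most of an hour at full tilt to get through 60K messages with 7 threads.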

_________________
MQ & Broker admin

abd.wsu

Posted: Tue Jun 06, 2017 9:15 am Post subject:

Acolyte

Joined: 12 Sep 2012
Posts: 65

So @fjb_saper, when you say capacity problem, is it within the flow, the EG, or the broker and server itself? I checked the RHEL6 vitals at the time of the issue and there were no constraints. At the OS level everything seems to be humming along nicely.

Now, what would fix whatever this capacity problem is? Increasing the number of flow instances?

mqjeff

Posted: Tue Jun 06, 2017 9:30 am Post subject:

Grand Master

Joined: 25 Jun 2008
Posts: 17447

Ok. To review (mainly for my own memory):

Your flow makes an HTTP request.
Your flow is receiving 60k messages, and sending out 60k HTTP requests.
During processing of those 60k requests, you are receiving 60k "Times Max Threads Reached".
You are running 7 instances.

If, while processing those 60k messages, "Times Max Threads Reached" comes to more than about 25% of the message count, you are probably running into issues with the server receiving the HTTP requests.
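
As a rough sketch of that rule of thumb, the check is just a ratio of the two counters from the snapshot stats. The function names below are hypothetical, and the 25% threshold is only the figure suggested in this post, not a documented product limit.

```python
def saturation_ratio(times_max_threads_reached, total_input_messages):
    """Fraction of input messages for which all flow threads were in use."""
    if total_input_messages <= 0:
        raise ValueError("total_input_messages must be positive")
    return times_max_threads_reached / total_input_messages


def looks_saturated(times_max_threads_reached, total_input_messages,
                    threshold=0.25):
    """Apply the ~25% rule of thumb (an assumption taken from this thread)."""
    ratio = saturation_ratio(times_max_threads_reached, total_input_messages)
    return ratio > threshold


# The case reported here: 60K max-thread hits for 60K input messages.
print(saturation_ratio(60_000, 60_000))   # 1.0
print(looks_saturated(60_000, 60_000))    # True

# The "probably fine" case mentioned earlier: 60K hits for 360K messages.
print(looks_saturated(60_000, 360_000))   # False
```

A ratio of 1.0, as in this thread, means every message arrived while all 7 threads were already busy.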

If the goal is to respond to the applications calling your flow as quickly as possible, then you should reduce the timeout on your HTTPRequest node. That way it will time out faster, and return a response to the calling application saying "I timed out". The application then needs to retry the call.

If the application can *never* receive a timeout response, and have to retry, then a) increase your thread instances, b) get a new job...

You don't say how you are receiving messages to start the flow.

If you are using an HTTP transport node (soap, http, json, etc) then you can adjust the properties of the relevant HTTP connector to get more threads waiting in the http listener than in the flow.

abd.wsu

Posted: Tue Jun 06, 2017 10:51 am Post subject:

Acolyte

Joined: 12 Sep 2012
Posts: 65

Yup. I did the first thing: reduced the timeout on the HTTP Request node so it times out sooner.
If increasing the instances is gonna cost me my job then maybe don't do it.

I'll research the last part and see how it will help with my issue. We are using a broker-wide HTTP listener, so I am not sure how this would affect the other flows. Let me research/google that. Thanks.

Vitor

Posted: Tue Jun 06, 2017 11:08 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

abd.wsu wrote:

If increasing the instances is gonna cost me my job then maybe don't do it.

There's an implied "or" between the a) and b) options of my most worthy associate. The b) option is the last resort for those of us faced with impossible requirements.

Of course, increasing the number of instances unwisely could cost you your job.

mqjeff

Posted: Tue Jun 06, 2017 11:11 am Post subject:

Grand Master

Joined: 25 Jun 2008
Posts: 17447

I neither meant that increasing the instances of your flow would cost you your job, nor implied an "or".

I said if you were faced with ridiculous/impossible requirements (that the application can't handle a timeout and can't do a retry), then you should band-aid issues until you get a new job.
