Author |
Message
|
kordi |
Posted: Wed Mar 18, 2015 12:19 am Post subject: Measuring the time message spend on MQ |
|
|
Centurion
Joined: 28 May 2012 Posts: 146 Location: PL
|
Hello,
We have some complaints about the response time of reply messages that go through our MQ network. Is there a more or less precise way to calculate the time a message spends in MQ queues, from the time the application puts the message to the time the remote application gets it, and the other way round?
I was considering statistics, but they show only general information.
Cheers
Kordian |
|
|
 |
fjb_saper |
Posted: Wed Mar 18, 2015 5:48 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
If you are looking for latency across your MQ network, this is a fickle task.
First you'd need to make sure that all the servers in your MQ network are time synchronized. If they aren't, you may see messages that appear to be received before they were put...
Say host B is 5 mins ahead of host A. QMA resides on A and QMB resides on B. Network latency between A and B is negligible.
No matter what you do, if you look at message put timestamp you will see:
time on A 8:05, time on B 8:10 (5 mins ahead)
Message sent from A to B put time 8:05 received at 8:10 => message age 5 mins !!!
Message sent from B to A put time 8:11 received at 8:06 => message received 5 mins before it was sent !!!
Now you can get some type of measure by looking at averages on the round trip.
Let's say the average round trip is two seconds, the average difference between put and receive is 5 mins, and the average processing time is 1 second...
So you now have 1 second to account for the put, receive, network, receive, put, network and receive operations...
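The skew arithmetic above can be sketched in a few lines of Python. This is only an illustration: the one-way transit times (0.4 s and 0.6 s) are made-up values, and host A's clock is arbitrarily taken as the reference. It shows that each one-way measurement is off by the skew, while the skew cancels out of the round trip.

```python
from datetime import datetime, timedelta

# Assumed true one-way transit times (hypothetical values for illustration).
TRUE_A_TO_B = timedelta(milliseconds=400)
TRUE_B_TO_A = timedelta(milliseconds=600)
# Host B's clock runs 5 minutes ahead of host A's, as in the example above.
SKEW_B_MINUS_A = timedelta(minutes=5)


def clock_a(real):
    """Time shown by host A's clock (taken as the reference)."""
    return real


def clock_b(real):
    """Time shown by host B's clock, which is ahead by the skew."""
    return real + SKEW_B_MINUS_A


start = datetime(2015, 3, 18, 8, 5, 0)

# Request: put on A (stamped with A's clock), got on B (read with B's clock).
apparent_a_to_b = clock_b(start + TRUE_A_TO_B) - clock_a(start)

# Reply: put on B (stamped with B's clock), got on A (read with A's clock).
reply_put = start + TRUE_A_TO_B
apparent_b_to_a = clock_a(reply_put + TRUE_B_TO_A) - clock_b(reply_put)

# The +skew and -skew terms cancel, leaving only the true transit times.
round_trip = apparent_a_to_b + apparent_b_to_a

print("apparent A->B:", apparent_a_to_b)
print("apparent B->A:", apparent_b_to_a)
print("round trip   :", round_trip)
```

The one-way figures are only as good as the clock synchronization, which is why the round-trip average is the more trustworthy number.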
Assuming rather small messages, this would translate essentially into network time... What does network time cover?
- name resolution (round trip to DNS server)
- ARP enquiries (if necessary)
- routing (see the setup of the routing table; the routing setup might be faulty)
- actual transmission time (negligible over small distances, may be a matter of concern for intercontinental transmissions if very low latency is required).
This is still not taking into account any time needed for encryption/decryption, whether at the channel level or VPN setup or both.
Have fun  _________________ MQ & Broker admin
Last edited by fjb_saper on Wed Mar 18, 2015 6:02 am; edited 1 time in total |
|
|
 |
Gaya3 |
Posted: Wed Mar 18, 2015 6:01 am Post subject: |
|
|
 Jedi
Joined: 12 Sep 2006 Posts: 2493 Location: Boston, US
|
Are you using any SSL between the qmgrs? _________________ Regards
Gayathri
-----------------------------------------------
Do Something Before you Die |
|
|
 |
kordi |
Posted: Wed Mar 18, 2015 11:41 am Post subject: |
|
|
Centurion
Joined: 28 May 2012 Posts: 146 Location: PL
|
fjb_saper All our servers are connected to the same time server, so we may assume that the time is the same on all MQ servers. For the rest... well, you are right.
My question is: how does MQ support measuring the transit time of messages transmitted between MQ servers over the network? What commands, what attributes, etc.? How can I do that regardless of the issues you mentioned?
Gaya3 Yes, we do use TLS on all message channels. |
|
|
 |
jsware |
Posted: Wed Mar 18, 2015 12:54 pm Post subject: |
|
|
 Chevalier
Joined: 17 May 2001 Posts: 455
|
kordi wrote: |
My question is: how does MQ support measuring the transit time of messages transmitted between MQ servers over the network? What commands, what attributes, etc.? How can I do that regardless of the issues you mentioned? |
There are some timing programs that come with the rfhutil supportpac. They use a variety of methods.
If the expiry interval is maintained by the receive/respond server program (and you know how long it takes to respond), you could compare the expiry interval of the response with the expiry interval set in the request. This would give you accuracy to 1/10th of a second, plus or minus however much error IBM states in its online docs, which may be an acceptable margin of error for what you need. Caveat emptor: this is not an exact value and may drift. It should be unaffected by message broker if you're using that, as the expiry interval is converted to a time in the future, so any delay in message broker brings you closer to that time and an output message gets a correspondingly reduced expiry interval.
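As a rough sketch of the expiry comparison, assuming the responding program really does carry the remaining expiry over into the reply: the MQMD Expiry field is expressed in tenths of a second, so the difference divided by ten gives elapsed seconds. The function name and the sample values are made up for illustration.

```python
def elapsed_from_expiry(request_expiry, reply_expiry):
    """Return the elapsed seconds implied by two MQMD Expiry values.

    Expiry is expressed in tenths of a second; -1 means unlimited expiry.
    """
    if request_expiry < 0 or reply_expiry < 0:
        raise ValueError("unlimited expiry (-1) carries no timing information")
    return (request_expiry - reply_expiry) / 10.0


# Hypothetical values: request sent with 30.0 s expiry, reply shows 28.7 s left.
print(elapsed_from_expiry(300, 287))
```

The result includes the time the responder spent processing, so that has to be subtracted separately if only the MQ portion is of interest.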
If you can pass an epoch value (say, the current time in milliseconds) in the request and have it carried through in the response (e.g. set the msgid to the epoch and check the correl id in the response), you can then compare the epoch in the correl id against the current epoch, and that will tell you how much time has passed. This might be the most accurate.
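A minimal sketch of the epoch idea, leaving out the actual MQPUT/MQGET calls: MsgId and CorrelId are 24-byte fields, so an epoch-milliseconds value fits in the first 8 bytes. The helper names are hypothetical, and it assumes the replying application copies the request MsgId into the reply CorrelId. Note that a bare timestamp is not guaranteed unique if two requests are sent in the same millisecond.

```python
import struct
import time

MQ_ID_LENGTH = 24  # MsgId and CorrelId are 24-byte fields in the MQMD


def epoch_to_id(epoch_ms):
    """Pack an epoch-milliseconds value into a 24-byte identifier."""
    return struct.pack(">Q", epoch_ms).ljust(MQ_ID_LENGTH, b"\x00")


def id_to_epoch(identifier):
    """Recover the epoch-milliseconds value from the first 8 bytes."""
    (epoch_ms,) = struct.unpack(">Q", identifier[:8])
    return epoch_ms


def elapsed_ms(correl_id, now_ms=None):
    """Milliseconds between the epoch stored in correl_id and now."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - id_to_epoch(correl_id)


# Hypothetical request time; in practice use int(time.time() * 1000).
msg_id = epoch_to_id(1426681140000)
# ... put the request with msg_id, get the reply whose CorrelId == msg_id ...
print(elapsed_ms(msg_id, now_ms=1426681140250))
```

Because the same clock stamps and reads the value, this measurement is not affected by clock skew between servers.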
You may also be able to use the MA0W API trace supportpac to log MQ API calls and their timestamps to demonstrate how long between the MQPUT call and corresponding MQGET returns.
You may be able to set up a simulation to do the following:
1. Create a qremote definition on QMGRA that points at a queue on a remote QMGRB, using xmitq / channels etc.
2. Have the target queue on QMGRB itself be a qremote definition pointing back to a reply queue on QMGRA.
3. Use some of the rfhutil command line performance tests to send messages across that channel so they just get echoed back by QMGRB. It will be able to tell you the throughput rate per second. _________________ Regards
John
The pain of low quality far outlasts the joy of low price. |
|
|
 |
RogerLacroix |
Posted: Wed Mar 18, 2015 2:40 pm Post subject: |
|
|
 Jedi Knight
Joined: 15 May 2001 Posts: 3264 Location: London, ON Canada
|
A simple solution is to turn on COA & COD report options then send the message.
- COA (Confirmation on Arrival) will tell you when the message landed in the queue and was available for the receiving application.
- COD (Confirmation on Delivery) will tell you when the message was read (consumed) by the receiving application.
(1) The COA time minus the message's PutTime is the total time MQ used to move & make the message available.
(2) The COD time minus COA time is the time the message sat in the queue before the message was read by the receiving application.
(3) The response message's (from receiver) PutTime minus the COD time (from original message) is how long the receiving application took to do its thing.
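The three subtractions can be sketched as follows, assuming the COA and COD times are taken from the PutDate/PutTime of the report messages. MQMD PutDate is YYYYMMDD and PutTime is HHMMSSTH (hundredths of a second, GMT). The helper name and all timestamp values are made up for illustration.

```python
from datetime import datetime, timezone


def mqmd_timestamp(put_date, put_time):
    """Convert MQMD PutDate (YYYYMMDD) and PutTime (HHMMSSTH) to a UTC datetime."""
    base = datetime.strptime(put_date + put_time[:6], "%Y%m%d%H%M%S")
    hundredths = int(put_time[6:8])
    return base.replace(microsecond=hundredths * 10000, tzinfo=timezone.utc)


# Hypothetical timestamps for one request/reply exchange.
request_put = mqmd_timestamp("20150318", "14400000")  # original message
coa_time = mqmd_timestamp("20150318", "14400012")     # COA report
cod_time = mqmd_timestamp("20150318", "14400150")     # COD report
reply_put = mqmd_timestamp("20150318", "14400275")    # reply message

transit = (coa_time - request_put).total_seconds()  # (1) MQ moving the message
queued = (cod_time - coa_time).total_seconds()      # (2) waiting on the queue
app = (reply_put - cod_time).total_seconds()        # (3) application processing

print(f"transit={transit:.2f}s queued={queued:.2f}s app={app:.2f}s")
```

Interval (1) compares timestamps set on different servers, so it is still subject to any clock skew between them; intervals (2) and (3) are taken on the receiving side only.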
Regards,
Roger Lacroix
Capitalware Inc. _________________ Capitalware: Transforming tomorrow into today.
Connected to MQ!
Twitter |
|
|
 |
gbaddeley |
Posted: Wed Mar 18, 2015 3:20 pm Post subject: Re: Measuring the time message spend on MQ |
|
|
 Jedi Knight
Joined: 25 Mar 2003 Posts: 2538 Location: Melbourne, Australia
|
kordi wrote: |
Hello,
We have some complaints about the response time of reply messages that go through our MQ network. Is there a more or less precise way to calculate the time a message spends in MQ queues, from the time the application puts the message to the time the remote application gets it, and the other way round?
I was considering statistics, but they show only general information.
Cheers
Kordian |
Hi Kordian,
MQ will generally move messages through as quickly as possible, but may be constrained by resources such as network, CPU and disk I/O capacity, and contention with other MQ messaging apps. Please check all of these.
Unless the messages are very large (>10MB), you should be seeing message delivery times that are no reason for complaints (e.g. sub-100 ms or sub-10 ms).
In most cases it is the applications themselves which are causing the delays, either by bad design or inappropriate usage of MQ, or some other reason unrelated to MQ.
Ideally, apps should be able to log the date/time stamps of when they produce and consume MQ messages to at least millisecond precision. This can be correlated to the putdate/puttime in the messages.
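A minimal sketch of what such application-side logging might look like, using UTC so the lines can be compared directly with putdate/puttime. The function name, log layout, queue name and message id are all hypothetical.

```python
from datetime import datetime, timezone


def log_mq_event(event, queue, msg_id_hex, now=None):
    """Return one log line with a UTC timestamp at millisecond precision."""
    now = now or datetime.now(timezone.utc)
    stamp = now.isoformat(timespec="milliseconds")
    return f"{stamp} {event} queue={queue} msgid={msg_id_hex}"


# Fixed timestamp for illustration; omit `now` to use the current time.
fixed = datetime(2015, 3, 18, 14, 40, 0, 123000, tzinfo=timezone.utc)
line = log_mq_event("MQPUT", "REQUEST.Q", "0001", now=fixed)
print(line)
```

Logging one such line immediately before each MQPUT and after each MQGET returns makes it possible to see whether the gap sits inside the application or between the applications.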
It can be quite difficult to diagnose MQ response time issues, particularly if the messaging paths and apps are complex and message volume is high.
Are the apps using MQ Client? _________________ Glenn |
|
|
 |
PeterPotkay |
Posted: Wed Mar 18, 2015 3:30 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
Capitalware's MQ Auditor can capture all the MQ API calls and when they occurred
http://www.capitalware.com/mqa_overview.html
But it might be a little tedious to try and correlate multiple MQ API calls across multiple queue managers by reviewing the log files it produces or the messages it puts to "MQ Auditor" queues. But you could roll your own helper app to read these files and insert the data into a database, with another helper app to query the DB and present all related MQ API calls.
There are other much more expensive, much more complicated tools on the market that not only capture your MQ API calls across QMs but also tie the transaction together in one view as it hops from queue to queue, from QM to QM. Done right, it's easy to spot the big gap in time. There's your slowdown, almost always inside some application (after the MQGET of the request and before the MQPUT of the reply). _________________ Peter Potkay
Keep Calm and MQ On |
|
|
 |
Andyh |
Posted: Wed Mar 18, 2015 11:53 pm Post subject: |
|
|
Master
Joined: 29 Jul 2010 Posts: 239
|
Do the QTIME values in the queue status (QSTATUS) give you any insight into your issue?
Do the NETTIME values in the channel status give you any insight into your issue?
As others have alluded to, measuring the time taken for a message to be transferred from a putter on one system to a getter on another is quite difficult, due to the possible clock skew. MQ's status values in this area were chosen partially in order to bypass the clock skew concern.
Depending upon the accuracy you require in this measurement, some of the other solutions suggested here may not identify the issue. For example, Expiry is only decremented while a message is on a queue, so a long NETTIME would not be reflected in the delta to the Expiry interval. |
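QTIME and NETTIME are each reported as a pair of values in microseconds (one based on recent activity, one on a longer period), and they are only populated when queue and channel monitoring are switched on. A small sketch for pulling those pairs out of captured MQSC output text follows. The sample text below is made up for illustration, and the layout of real runmqsc output may differ, so treat the pattern as an assumption to adjust.

```python
import re

# Matches QTIME(a, b) or NETTIME(a, b); either value may be blank.
PAIR = re.compile(r"\b(QTIME|NETTIME)\(\s*(\d*)\s*,\s*(\d*)\s*\)")


def parse_time_pairs(mqsc_output):
    """Extract (short-term, long-term) microsecond pairs from MQSC output text."""
    result = {}
    for name, short, long_ in PAIR.findall(mqsc_output):
        result[name] = (int(short) if short else None,
                        int(long_) if long_ else None)
    return result


# Illustrative text only, not captured from a real queue manager.
sample = (
    "QUEUE(REPLY.Q) TYPE(QUEUE) QTIME(1850, 2210)\n"
    "CHANNEL(QMA.TO.QMB) NETTIME(930, 1015)"
)
print(parse_time_pairs(sample))
```

Blank values (no data collected yet, or monitoring off) come back as None rather than zero, so they are not mistaken for a genuinely fast queue.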
|
|
 |
zpat |
Posted: Thu Mar 19, 2015 12:53 am Post subject: |
|
|
 Jedi Council
Joined: 19 May 2001 Posts: 5866 Location: UK
|
For network latency tests - use non-persistent messages.
If you use persistent, then the speed of your disk subsystem will be a factor (of course you can compare the two to see the difference).
There are various ways to improve throughput of messages, especially persistent ones. _________________ Well, I don't think there is any question about it. It can only be attributable to human error. This sort of thing has cropped up before, and it has always been due to human error. |
|
|
 |
tczielke |
Posted: Thu Mar 19, 2015 5:39 am Post subject: |
|
|
Guardian
Joined: 08 Jul 2010 Posts: 941 Location: Illinois, USA
|
Are the complaints about response time in the subsecond range (e.g. we need the reply message processed in 0.001 seconds and it is processed in 0.005 seconds), or in the seconds range (e.g. it is taking 10 seconds for the reply message to get processed)? If it is in the seconds range, you can probably rely on the timestamps of when the API calls are getting processed (e.g. Application Activity Trace, strmqtrc, etc.) to get a handle on when the messages are flowing through your queue managers and where the delay might be (or at least between which points the delay might be).
Regarding clock skew, it is rare but worth noting that you can have clock skew even on the same server with certain distributed architectures (e.g. x86 with multiple processors) when reaching into the microsecond range. _________________ Working with MQ since 2010. |
|
|
 |
kordi |
Posted: Mon Apr 13, 2015 12:27 pm Post subject: |
|
|
Centurion
Joined: 28 May 2012 Posts: 146 Location: PL
|
Guys,
Thanks a lot for your comprehensive feedback. Since I am dealing with a PROD env, I can't write tools/apps to use on it. I think the best solution would be just asking the app team to send one message and then checking the queue status on the transmission and destination queues, and the other way round during the reply. But first we would have to change the monq attribute to high to get the highest sample rate on the queues.
Once again, I really do appreciate your help!
Cheers |
|
|
 |
PaulClarke |
Posted: Mon Apr 13, 2015 1:07 pm Post subject: |
|
|
 Grand Master
Joined: 17 Nov 2005 Posts: 1002 Location: New Zealand
|
One other thing you could try... and this is a shameless plug... is the new Trace Route message function in MO71. You can download it here http://www.mqgem.com/mo71_beta_dl.html and I'd be happy to send you a trial licence. Just send a message to support@mqgem.com
What it does is, from the Queue Manager of your choice, send a trace route message to a target queue of your choosing. MQ will send back report messages as the message travels through your network. MO71 then displays a diagram of the path(s) the message(s) take and it will tell you how long the round-trip takes. It will also attempt to show you how long it takes across each channel by subtracting the round-trip times from each side. Measuring performance is never easy, especially remotely, but if the basic MQ network (or TCP) has a serious problem it should help to identify where. It would also show you, in a cluster for example, whether the messages are getting evenly balanced amongst the target queues, which can be another cause of poor performance.
Cheers,
Paul. _________________ Paul Clarke
MQGem Software
www.mqgem.com |
|
|
 |
kordi |
Posted: Tue Apr 14, 2015 3:24 am Post subject: |
|
|
Centurion
Joined: 28 May 2012 Posts: 146 Location: PL
|
This plug sounds very cool.
I would love to test it. I will send an email to support@mqgem.com to request a trial key. Thank you! |
|
|
 |
|