IBM MQ8.0 z/OS performance |
|
GheorgheDragos |
Posted: Sun Jul 22, 2018 11:29 pm Post subject: IBM MQ8.0 z/OS performance |
|
|
Newbie
Joined: 28 Jun 2018 Posts: 7
|
Hello,
My name is Dragos Gheorghe, I am 30 years old and have been working as a z/OS CICS admin for the last 3.5 years, after spending around 7 as a z/OS operator in 2 countries for 3 different corporations. I pride myself that my overall MF knowledge is pretty extensive.
In my current position things are going well, but I am afraid I have reached a point where I am more or less stagnating. There are many things to learn and to improve, however, I cannot decide on a direction.
So, through a colleague, I have found a statistics gathering tool for MQ, MP1B (I can't post links, though). It is a pretty extensive one. Our current one, based on SAS (MXG, anyone?), is a little tricky for me, plus I haven't really put that much effort into it, because I prefer my own solutions to building and improving on antique (though good) designs.
I have gathered statistics for the last 2 weeks from SMF, both 115 and 116.
I would like to know, if possible, what I should use to create an Excel chart showing the daily usage (PUTs and GETs). This is to see when MQ is peaking and what I can do to improve things, for example if a buffer pool gets full and messages are being written to disk, so that I can increase the number of buffers or allocate them above the bar.
And this raises another question: what would you recommend? To have below-the-bar (4 KB) buffers for long-lived messages, and above-the-bar ones for small ones? Right now we have buffers only below the bar. But that has another impact, because I understood that if we have a large buffer pool and the utilization is around 70%, the performance actually decreases.
In any case, I am considering indexing queues based on the reports from the tool, moving queues with persistent messages to large (or unused) buffer pools, and moving non-persistent queues to PSIDs with smaller and larger used BPs. Or is indexing an option to be discussed with the application development team as well, so that they may add a MsgId or CorrelId to their messages?
Have you ever done this with your installation? Would you have any recommendations for performance tuning?
Any input would be greatly appreciated.
Dragos Gheorghe |
|
gbaddeley |
Posted: Mon Jul 23, 2018 4:18 pm Post subject: |
|
|
 Padawan
Joined: 25 Mar 2003 Posts: 1921 Location: Melbourne, Australia
|
You have some reasonable ideas for improving MQ performance on z/OS. You should really only look at changing anything if you can identify performance or efficiency bottlenecks in MQ, as MQ performs very well without much tweaking.
Indexing queues can improve performance if queue depth becomes significant (eg. > 100 - 1000 msgs) and msgs are fetched by correlid or msgid. However, high queue depths can indicate other issues.
It's a good idea not to have high volume app queues on the same pageset as system queues. If the pageset fills, the qmgr may become unresponsive.
Issues quite often only manifest under high load (eg. > 50 - 200 msgs/sec), and are usually caused by other apps (eg. CICS DBS TCP) or bad MQ messaging designs.
Have you looked at the MQ z/OS performance reports?
http://www-01.ibm.com/support/docview.wss?uid=swg27007150 _________________ Glenn |
|
GheorgheDragos |
Posted: Mon Jul 23, 2018 10:41 pm Post subject: |
|
|
Newbie
Joined: 28 Jun 2018 Posts: 7
|
We have plenty of buffer pools with enough buffers(4KB). Buffer pool 0 gets full and starts writing messages to pageset simply because it has qmgr to qmgr channel checks ( on mainframe and distributed ) and the messages are persistent.
BP 1 Some ( pages read from disk. Buffer pool may be too small
BP 1 Many (588) pages read from disk. This is typical of long lived messages. Buffer pool may be too small
Our DEVPlex qmgrs rarely go over 200-300 messages per second.
The problem might be that our app dev teams always request persistent queues, which I think is not because the messages are "critical", especially in TST, GTU etc., but because they don't want to modify their apps.
MSG throughput ( GET/PUT )
7/11/2018 1:52:40 259/sec 7/sec
7/11/2018 2:23:00 259/sec 7/sec
7/11/2018 2:52:17 257/sec 7/sec
7/11/2018 3:21:58 257/sec 10/sec
7/11/2018 3:51:50 257/sec 7/sec
7/11/2018 4:21:45 257/sec 7/sec
7/11/2018 4:51:46 257/sec 7/sec
7/11/2018 5:21:34 256/sec 7/sec
7/11/2018 5:51:38 256/sec 7/sec
7/11/2018 6:21:37 255/sec 7/sec
7/11/2018 6:51:19 254/sec 7/sec
7/14/2018 7:34:44 251/sec 7/sec
7/14/2018 8:04:45 251/sec 6/sec
7/14/2018 8:34:46 251/sec 6/sec
7/14/2018 9:04:04 250/sec 6/sec
7/14/2018 9:34:00 250/sec 6/sec
And that is just because this particular QMGR (with another one for DR purposes ) acts as a comms node ( QR, SVRCONN's etc )
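For the Excel chart asked about earlier, one possible route is to turn samples like the ones above into a CSV file that Excel can plot. This is a minimal Python sketch, assuming the two rate columns are GET and PUT in that order (as the heading suggests) and that dates are month/day/year. The function names and the output path are made up for illustration.

```python
import csv
import re
from collections import defaultdict
from datetime import datetime

# One sample line looks like: "7/11/2018 1:52:40 259/sec 7/sec"
# Assumption: first rate is GET, second is PUT, dates are M/D/YYYY.
SAMPLE = re.compile(r"(\d+/\d+/\d{4}) (\d+:\d{2}:\d{2}) (\d+)/sec (\d+)/sec")

def parse_samples(text):
    """Return a list of (timestamp, gets_per_sec, puts_per_sec) tuples."""
    rows = []
    for m in SAMPLE.finditer(text):
        ts = datetime.strptime(f"{m.group(1)} {m.group(2)}", "%m/%d/%Y %H:%M:%S")
        rows.append((ts, int(m.group(3)), int(m.group(4))))
    return rows

def daily_peaks(rows):
    """Highest GET and PUT rate seen on each calendar day."""
    peaks = defaultdict(lambda: [0, 0])
    for ts, gets, puts in rows:
        day = ts.date().isoformat()
        peaks[day][0] = max(peaks[day][0], gets)
        peaks[day][1] = max(peaks[day][1], puts)
    return {day: tuple(values) for day, values in peaks.items()}

def write_csv(rows, path):
    """Write one row per sample; Excel can chart the result directly."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["timestamp", "gets_per_sec", "puts_per_sec"])
        for ts, gets, puts in rows:
            writer.writerow([ts.isoformat(sep=" "), gets, puts])

demo = """7/11/2018 1:52:40 259/sec 7/sec
7/11/2018 3:21:58 257/sec 10/sec
7/14/2018 7:34:44 251/sec 7/sec
7/14/2018 8:04:45 251/sec 6/sec"""
rows = parse_samples(demo)
print(daily_peaks(rows))
```

Calling `write_csv(rows, "throughput.csv")` and opening the file in Excel gives a timestamp column plus two series, which is enough for a line chart of peaks over the day.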
Another repeating alert in the messages extracted with the MP1B tool is:
QEST structure xxxxxxxx long average response time 423.
However, I do not think this is an issue, because the average response time for a CF defined ( by MVS team ? ) is not "realistic". <- EDIT : This is a value we can pass to the MQSMF program to determine the "average" CF response time.
To be investigated :
Fixed pool contractions 11 > 0 ???
Variable pool contractions 12 > 0 ???
QIST read ahead message count 68 > 0 ???
BP 1 get old to get new page ratio > 41 . Queues not indexed ? Could be a lot of browse activity ???
Now this is an issue :
BP 3 Filled many (121) times. This is typical of long lived messages. Buffer pool may be too small
Thank you for your reply.
If you have any ideas about the ??? fields, it would be interesting to hear them.
Dragos Gheorghe
*******************************************************************************
Here are the PRD details :
CPUB,<qmgr>,2018/07/24,01:25:34
BP PS count tot-time avg-time rate QL
2018/07/24,01:25:34,<qmgr>,001,020,Read, 6, 12093, 2015, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,031,Read, 4, 890, 222, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,031,Write, 1, 244, 244, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,032,Read, 50, 13245, 264, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,032,Write, 14, 4538, 324, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,043,Write, 26, 6690, 257, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,052,Write, 23, 6479, 281, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,060,Read, 38, 16993, 447, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,060,Read, 7, 1670, 238, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,060,Read, 13, 4133, 317, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,060,Read, 16, 4691, 293, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,060,Write, 111, 51431, 463, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,002,053,Read, 1, 1210, 1210, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,003,056,Write, 1, 954, 954, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,001,Write, 3, 756, 252, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,001,Write, 5, 1721, 344, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,020,Read, 4299, 1343854, 312, 2,QLNAME
2018/07/24,01:25:34,<qmgr>,001,001,Write, 8, 3304, 413, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,001,Write, 6, 1992, 332, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,001,Write, 2, 634, 317, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,001,Write, 1, 358, 358, 0,QLNAME
2018/07/24,01:25:34,<qmgr>,001,001,Read, 14, 3832, 273, 0,QLNAME
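The page set I/O lines above are regular enough to summarise per buffer pool and page set. This is a rough Python sketch, assuming the comma-separated fields are date, time, qmgr, BP, PS, operation, count, tot-time, avg-time, rate and queue name (the header only names the last ones), and that avg-time is tot-time divided by count with the fraction dropped, which matches the figures shown. The time unit is not stated in the report, so it is left unnamed here.

```python
from collections import defaultdict

def summarise_io(lines):
    """Total count and time per (buffer pool, page set, operation)."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 11 or fields[5] not in ("Read", "Write"):
            continue  # skip headers and anything that is not an I/O row
        key = (fields[3], fields[4], fields[5])
        totals[key][0] += int(fields[6])  # count
        totals[key][1] += int(fields[7])  # tot-time
    # Derive the average the way the report appears to (integer division).
    return {k: (c, t, t // c if c else 0) for k, (c, t) in totals.items()}

demo = [
    "BP PS count tot-time avg-time rate QL",
    "2018/07/24,01:25:34,<qmgr>,001,020,Read, 6, 12093, 2015, 0,QLNAME",
    "2018/07/24,01:25:34,<qmgr>,001,020,Read, 4299, 1343854, 312, 2,QLNAME",
    "2018/07/24,01:25:34,<qmgr>,001,060,Write, 111, 51431, 463, 0,QLNAME",
]
for key, (count, total, avg) in sorted(summarise_io(demo).items()):
    print(key, count, total, avg)
```

Sorting the full listing this way puts page set 020 at the top for reads and page set 060 at the top for writes, both in buffer pool 1.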
I just saw this ...Will peek through it.
I saw a line in one of the reports that says MQBRO.. so I guess that when large, non-indexed queues are being browsed, that can cause a problem.
From 2018/07/24,00:55:34.364296 to 2018/07/24,01:25:27.413922, duration 1793
MQOPENs 1046623, MQCLOSEs 646838, MQGETs 1512813, MQPUTs 189382
MQPUT1s 293776, MQINQs 195024, MQSETs 498, C ALL H 1
MQSUBs 0, MQSUBRQs 0, MQCBs 1435607
MQCTLs 1010535, MQSTATs 0, Publish 0
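To put the API counts above on the same per-second scale as the throughput samples, they can be divided by the interval length. A small worked example in Python, assuming the duration of 1793 is in seconds (it matches the gap of just under 30 minutes between the two timestamps):

```python
def per_second(count, duration_seconds):
    """Average calls per second over the interval, to one decimal place."""
    return round(count / duration_seconds, 1)

duration = 1793  # seconds, taken from the report line above
counts = {"MQGET": 1512813, "MQPUT": 189382, "MQPUT1": 293776}

for verb, count in counts.items():
    print(verb, per_second(count, duration))

# MQPUT and MQPUT1 both put messages, so add them for a total put rate.
all_puts = counts["MQPUT"] + counts["MQPUT1"]
print("all puts", per_second(all_puts, duration))
```

These are call counts, so they need not equal the number of messages actually moved.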
***
I wrote the reply in such a disorganized way because :
1 - I want a reply from a helping soul who has more energy than me right now;
2 - I don't have any more energy after spending my day eye balling MQ statistics like a proper lab worm;
Greetings |
|
PeterPotkay |
Posted: Tue Jul 24, 2018 3:43 pm Post subject: Re: IBM MQ8.0 z/OS performance |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7505
|
GheorgheDragos wrote: |
Would you have any recommendations for performance tuning? |
Lyn Elkins has lots of valuable posts on this site and also on her blog:
http://www.lynsmq4zos.com/ _________________ Peter Potkay
Keep Calm and MQ On |
|
gbaddeley |
Posted: Tue Jul 24, 2018 5:02 pm Post subject: |
|
|
 Padawan
Joined: 25 Mar 2003 Posts: 1921 Location: Melbourne, Australia
|
Quote: |
The problem might be that our app dev teams always request persistent queues, which I think is not because the messages are "critical", especially in TST, GTU etc., but because they don't want to modify their apps. |
The MQ code path and overhead of persistent vs non-persistent messages is quite different. It should be a design choice based on QOS that is consistently used in all environments through to prod.
BTW, queues are NOT persistent. Each message carries an MQMD property to indicate whether it is persistent or not. This is regardless of the queue's DEFPSIST attribute. _________________ Glenn |
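To illustrate the point about DEFPSIST: the queue default only comes into play when the putting application asks for it. Below is a small Python sketch of that rule, using the numeric values of the MQPER_* constants from the MQ headers. The function itself is purely illustrative and is not part of any MQ API.

```python
# Values of the MQMD Persistence constants as defined in the MQ headers.
MQPER_NOT_PERSISTENT = 0
MQPER_PERSISTENT = 1
MQPER_PERSISTENCE_AS_Q_DEF = 2

def effective_persistence(mqmd_persistence, queue_defpsist):
    """Persistence a message ends up with at put time.

    queue_defpsist is the queue's DEFPSIST attribute, "YES" or "NO".
    It is only consulted when the application put the message with
    MQPER_PERSISTENCE_AS_Q_DEF.
    """
    if mqmd_persistence == MQPER_PERSISTENCE_AS_Q_DEF:
        return MQPER_PERSISTENT if queue_defpsist == "YES" else MQPER_NOT_PERSISTENT
    return mqmd_persistence

# Explicitly non-persistent message on a DEFPSIST(YES) queue stays non-persistent.
print(effective_persistence(MQPER_NOT_PERSISTENT, "YES"))
# Message that defers to the queue default picks up DEFPSIST(YES).
print(effective_persistence(MQPER_PERSISTENCE_AS_Q_DEF, "YES"))
```

So a queue defined with DEFPSIST(YES) can still hold non-persistent messages if the application sets the field explicitly, and the other way round.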
|
GheorgheDragos |
Posted: Thu Aug 16, 2018 12:14 am Post subject: |
|
|
Newbie
Joined: 28 Jun 2018 Posts: 7
|
Colleagues,
First of all, thank you for taking the time to read and reply. Here is the current situation. We are using Omegamon for our installation. I have configured alerting so that we are notified by mail when a buffer pool has less than 6% of its buffers available. This happened this morning, for several tens of minutes, so I had plenty of time to investigate which queues are the guilty ones. Still, with all the information from Omegamon, I have no idea which queues have been causing this. Help? Buffer pool 3 - PS03.
Below the attachment from our system ( with blanked out queue names for obvious reasons ).
https://drive.google.com/open?id=1f53xpU15h9ozVWCeWR_BEcAfBgFLnDtC
Thank you in advance for your time and patience.
Dragos
***EDIT***
OR, it might be possible that there is no buffer pool activity, because it has finished at the time when I checked ( even though I checked pretty quickly ).. and there are just messages in the buffers, which have been offloaded because of a periodical automated system checkpoint ....
I'm pretty sure there is an easy answer to all of this..
Last edited by GheorgheDragos on Thu Aug 16, 2018 12:48 am; edited 1 time in total |
|
fjb_saper |
Posted: Thu Aug 16, 2018 12:47 am Post subject: |
|
|
 Grand Poobah
Joined: 18 Nov 2003 Posts: 19880 Location: LI,NY
|
if you know the buffer pool you should also know which queues are backed by that buffer pool/ page set? Then see which of those queues had messages during the relevant interval...  _________________ MQ & Broker admin |
|
GheorgheDragos |
Posted: Thu Aug 16, 2018 1:02 am Post subject: |
|
|
Newbie
Joined: 28 Jun 2018 Posts: 7
|
fjb_saper wrote: |
if you know the buffer pool you should also know which queues are backed by that buffer pool/ page set? Then see which of those queues had messages during the relevant interval... |
But how can I know this if the messages have already been processed? Omegamon has a short-term memory of around 30 minutes to 1 hour ...
I am in close collaboration with our automation guy. I want his REXX execs to trigger a series of displays whenever the "buffer pool utilisation > 94%" message prefix appears in the syslog: display the queues that are in use based on buffer pool and PSID ( dis ql(*) ps(...) ), then pull the local queues and issue DIS QS on them to see whether they are doing I/O, with ALTER QMGR MONQ(MEDIUM) before and OFF after, of course.
Ideas ? |
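The sequence described above could be laid out roughly as follows. This is only a sketch of the command order, written in Python rather than REXX: it builds MQSC command strings and leaves issuing them, and parsing the replies, to whatever automation is in place. In a real run the queue names would come from the reply to the first DISPLAY. The page set number and queue name used here are placeholders.

```python
def monitoring_commands(psid, queues):
    """MQSC commands to capture queue activity for one page set.

    psid   -- page set number behind the busy buffer pool (placeholder)
    queues -- local queue names found on that page set (placeholder)
    """
    cmds = ["ALTER QMGR MONQ(MEDIUM)"]               # switch queue monitoring on
    cmds.append(f"DISPLAY QLOCAL(*) PSID({psid})")   # local queues on that page set
    for queue in queues:
        # Queue status: current depth, open handles, last get/put times
        cmds.append(f"DISPLAY QSTATUS({queue}) TYPE(QUEUE) ALL")
    cmds.append("ALTER QMGR MONQ(OFF)")              # switch monitoring back off
    return cmds

for cmd in monitoring_commands(3, ["APP.QUEUE.ONE"]):
    print(cmd)
```

One thing to keep in mind with this approach is that the monitoring fields in the queue status are only collected while monitoring is switched on, so turning it on at alert time may be too late to see what filled the pool.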
|
bruce2359 |
Posted: Thu Aug 16, 2018 3:48 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 8301 Location: US: west coast, almost. Otherwise, enroute.
|
In order to avoid the risk from computer viruses, I never click URLs from unknown sources. If you want me/us to look at your evidence, please post it here. _________________ There are two types of people in this world:
1) Those that can extrapolate from incomplete data |
|
GheorgheDragos |
Posted: Thu Aug 16, 2018 4:21 am Post subject: |
|
|
Newbie
Joined: 28 Jun 2018 Posts: 7
|
bruce2359 wrote: |
GheorgheDragos wrote: |
Below the attachment from our system ( with blanked out queue names for obvious reasons ).
https://drive.google.com/open?id=1f53xpU15h9ozVWCeWR_BEcAfBgFLnDtC
Thank you in advance for your time and patience.. |
In order to avoid the risk from computer viruses, I never click URLs from unknown sources. If you want me/us to look at your evidence, please post it here. |
Please open it. It's text copy/pasted from Omegamon while the situation is active, in Citrix session it's uncomfortable to take/save screenshots. So I copied them in outlook to keep the formatting. Then I saved it in a HTML format. Because if I copy paste here, I can't choose the font. Here is a part of the HTML. See, not readable...
Command ==>                                HostName : CPUF
KMQMSBMD   Buffer Manager                  QmgrName : GM10

Latest Buffer Manager SMF Sample Summary
# of Pools In Use......... 4       Low % Avail............... 15.0
Low # Avail............... 964     Zero Bufrs Count.......... 0
Synch Writes.............. 0       GetPg IO %................ 0.0
% GetPg Outside Pool...... 0.0

Buffer Pools                       Columns 2 to 7 of 19   Rows 1 to 4 of 4
Pool   % of Bufrs   Available   Low #   Zero Bufrs   Page Sets   +Queue
ID     Available    Buffers     Avail   Count        Assigned    Assig
00     91.8         964         964     0            1           0
01     53.6         5626        5626    0            2           1554
02     93.8         9844        9844    0            4           60
03     15.0         1576        1576    0            3           98

Options Menu
Select an option and then press ENTER
1. H Buffer Pool Statistics History
2. P Page Sets in Buffer Pool
3. R Recent Buffer Pool Statistics
4. S Queues in Buffer Pool

KMQQUBPS   Queues in Buffer Pool           QmgrName : GM10

Latest Sample for Queues in Buffer Pool 03
                                   Columns 2 to 6 of 29   Rows 1 to 18 of 98
Queue                  % Full   Msgs Read   Msgs Put   Total   +Las
Name                            per Sec     per Sec    Opens
SYSTEM.CLUSTER.TRANS   0.0      0.0         0.0        0       n/
SYSTEM.DEAD.LETTER.Q   0.0      0.0         0.0        1       n/ |
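The alert condition mentioned earlier (less than 6% of buffers available) can be checked against rows like the ones in the Buffer Pools panel above. This is a minimal Python sketch, assuming each data row starts with a two-digit pool ID followed by the percentage available and the available buffer count, as in the paste. The threshold default and the function names are illustrative.

```python
def parse_pools(text):
    """Map pool ID -> (percent available, available buffers)."""
    pools = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows start with a two-digit pool ID, e.g. "03 15.0 1576 ..."
        if len(parts) >= 3 and parts[0].isdigit() and len(parts[0]) == 2:
            pools[parts[0]] = (float(parts[1]), int(parts[2]))
    return pools

def pools_below(pools, threshold_pct=6.0):
    """Pool IDs whose available percentage is under the threshold."""
    return sorted(p for p, (pct, _) in pools.items() if pct < threshold_pct)

panel = """00 91.8 964 964 0 1 0
01 53.6 5626 5626 0 2 1554
02 93.8 9844 9844 0 4 60
03 15.0 1576 1576 0 3 98"""
pools = parse_pools(panel)
print(pools_below(pools))        # pools under the 6% alert level
print(pools_below(pools, 20.0))  # pools under a looser 20% level
```

In the pasted sample pool 03 is at 15.0% available, so it no longer trips the 6% alert, which would fit the suspicion in the earlier edit that the activity had already finished by the time the screen was captured.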
|
elkinsc |
Posted: Thu Oct 04, 2018 8:45 am Post subject: Sorry I tend just to look in the z/OS area |
|
|
 Centurion
Joined: 29 Dec 2004 Posts: 122 Location: Memphis
|
GheorgheDragos, did your situation get resolved? |
|