Author |
Message
|
vsathyan |
Posted: Thu Dec 17, 2015 8:58 pm Post subject: Calculating qmgr restart time by the number of logs present |
|
|
Centurion
Joined: 10 Mar 2014 Posts: 121
|
Is there a formula, or any other method, for estimating how long a queue manager may take to restart, based on the number of log files (primary and secondary) and the LogFilePages value set in qm.ini?
When a queue manager is restarted, it can take a long time. Because we do not know how many log records are in the logs, we have to wait indefinitely, with no idea how long it will take to get the queue manager running.
_________________ Custom WebSphere MQ Tools Development C# & Java
WebSphere MQ Solution Architect Since 2011
WebSphere MQ Admin Since 2004 |
|
|
 |
bruce2359 |
Posted: Thu Dec 17, 2015 9:45 pm Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
If the qmgr ends normally (endmqm command) it should restart in about the same amount of time each time.
Are you asking about restart after an abnormal termination of a qmgr? If so, restart time will include re-do of completed UofWs and undo of incomplete UofWs after the last checkpoint. The more of each at the time of termination the longer restart will take.
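A toy sketch of the redo/undo split described above. It assumes a heavily simplified log model and is not how MQ actually lays out or replays its recovery log; `LogRecord`, `classify_uows` and the record kinds are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    uow: str   # unit-of-work identifier (hypothetical field)
    kind: str  # "update" or "commit" (hypothetical record kinds)


def classify_uows(records_since_checkpoint):
    """Split the UofWs seen after the last checkpoint into redo and undo lists."""
    seen = []          # UofW ids in first-seen order
    committed = set()  # UofW ids that have a commit record
    for rec in records_since_checkpoint:
        if rec.uow not in seen:
            seen.append(rec.uow)
        if rec.kind == "commit":
            committed.add(rec.uow)
    redo = [u for u in seen if u in committed]      # completed: re-do
    undo = [u for u in seen if u not in committed]  # incomplete: undo
    return redo, undo


# Log tail written after the last checkpoint, before an abnormal termination.
log_tail = [
    LogRecord("A", "update"),
    LogRecord("B", "update"),
    LogRecord("A", "commit"),
    LogRecord("C", "update"),
]

redo, undo = classify_uows(log_tail)
print("redo:", redo)
print("undo:", undo)
```

In this model the restart work grows with the number of records written since the last checkpoint, not with the total size of the log, which matches the point being made in this post.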
The number and size of log files have less impact on restart time than the UofWs do.
_________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
 |
vsathyan |
Posted: Thu Dec 17, 2015 10:04 pm Post subject: |
|
|
Centurion
Joined: 10 Mar 2014 Posts: 121
|
When I start a queue manager after stopping it abruptly, it reports "<number> log records accessed on queue manager <qmgr> during the log replay phase".
How do I find out how many log records are waiting to be replayed?
|
|
 |
Andyh |
Posted: Fri Dec 18, 2015 12:22 am Post subject: |
|
|
Master
Joined: 29 Jul 2010 Posts: 239
|
The number of records accessed during replay shouldn't normally be very large, even in the event of a crash recovery. The queue manager tries to take checkpoints within a certain number of log records, and you should only need to replay the log records written since the last checkpoint. If you're getting very large numbers of log records replayed, it suggests there's a problem with the checkpoint process (which might be indicative of an I/O performance issue).
The number of log records that would be expected to be replayed is higher in more recent releases, and that is intended to have been accompanied by appropriate increases in shutdown and restart times. There's something of a performance balance to be struck between fast restart times, and fast mainline performance.
How many log records are we actually talking about here? In the lab I can force a large number of log records to be replayed (millions) and it still doesn't take very long, although obviously that's very I/O dependent. |
|
|
 |
vsathyan |
Posted: Fri Dec 18, 2015 2:33 am Post subject: |
|
|
Centurion
Joined: 10 Mar 2014 Posts: 121
|
Seems fishy to me.
The issue is resolved for now, but the queue manager replayed 1.7 million log records and took 3 hours to process them.
WMQ 7.5.0.5, OEL 6.5
|
|
 |
smdavies99 |
Posted: Fri Dec 18, 2015 3:04 am Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
vsathyan wrote: |
Seems fishy to me.
The issue is resolved as of now, but the queue manager replayed 1.7million log records and took 3 hours to process them..
WMQ 7.5.0.5, OEL 6.5 |
Is this repeatable? If it is then a PMR might be the next step to take. _________________ WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995
Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions. |
|
|
 |
Andyh |
Posted: Fri Dec 18, 2015 4:22 am Post subject: |
|
|
Master
Joined: 29 Jul 2010 Posts: 239
|
Even if it's not repeatable, I'd suggest raising a PMR. A restart time of 3 hours is clearly unacceptable, and if it has happened once you want to avoid any possibility of a repeat.
Were there any FDCs leading up to the restart?
By defining a recovery log with sufficient active log space for 1.7 million records, you are in effect allowing a restart that might (in exceptional circumstances) need to replay 1.7 million records (even then I wouldn't expect it to take anything like 3 hours). |
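To put the figures reported earlier in the thread (1.7 million records replayed in 3 hours) into perspective, here is the arithmetic as a short sketch. Only those two numbers come from the thread; the variable names are illustrative.

```python
# Figures reported in the thread.
records_replayed = 1_700_000
restart_seconds = 3 * 60 * 60  # 3 hours

# Average replay throughput and average time spent per record.
rate = records_replayed / restart_seconds
ms_per_record = restart_seconds / records_replayed * 1000

print(f"{rate:.0f} records/second")          # about 157 records/second
print(f"{ms_per_record:.2f} ms per record")  # about 6.35 ms per record
```

An average of several milliseconds per record is in the range of a single slow disk operation per record, which fits the suspicion of an I/O problem rather than an inherent cost of replay.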
|
|
 |
vsathyan |
Posted: Fri Dec 18, 2015 5:22 am Post subject: |
|
|
Centurion
Joined: 10 Mar 2014 Posts: 121
|
There were a few FDCs reporting Log Full (AMQ6709) and a few reporting semaphore busy (AMQ6150).
I'll raise a sev2 PMR and keep this post updated, in case it helps somebody.
Thanks for your time, guys.
Regards,
vsathyan
|
|
 |
mqjeff |
Posted: Fri Dec 18, 2015 5:51 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
You can certainly take some measures to keep track of how many records are active in the logs.
And I think (though I'm not sure how much I've forgotten) that the size of a log record is documented somewhere, so you can make some guesses about how many records you have based on your log file size.
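A rough sketch of that kind of guess. It relies on LogFilePages being expressed in 4 KB pages; the qm.ini values passed in below and the average record size are placeholder assumptions, not figures from this thread, so substitute your own.

```python
PAGE_BYTES = 4096  # LogFilePages is expressed in units of 4 KB pages


def log_capacity_bytes(primary_files, secondary_files, log_file_pages):
    """Total log space if every primary and secondary file is in use."""
    return (primary_files + secondary_files) * log_file_pages * PAGE_BYTES


def estimated_max_records(capacity_bytes, avg_record_bytes):
    """Upper-bound guess at how many records fit; avg_record_bytes is an assumption."""
    return capacity_bytes // avg_record_bytes


# Example values only: LogPrimaryFiles=3, LogSecondaryFiles=2, LogFilePages=4096.
capacity = log_capacity_bytes(primary_files=3, secondary_files=2, log_file_pages=4096)
print(capacity)                                                # 83886080 bytes (80 MiB)
print(estimated_max_records(capacity, avg_record_bytes=1024))  # 81920
```

This only bounds how many records the log could hold. How many are actually replayed at restart depends on how much was written since the last checkpoint.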
Also, never discount an application that's gone nuts and failed to commit messages it put in a giant, ridiculously out-of-control loop.
_________________ chmod -R ugo-wx /
|
|
 |
tczielke |
Posted: Fri Dec 18, 2015 6:29 am Post subject: |
|
|
Guardian
Joined: 08 Jul 2010 Posts: 941 Location: Illinois, USA
|
You might want to look at the mqlogperf.sh script (link below) to get an idea of how your disk I/O is performing for your MQ log files. Based on what you have described, it sounds like you may have some very serious file I/O performance issues.
http://www-01.ibm.com/support/docview.wss?uid=swg21678834 _________________ Working with MQ since 2010. |
|
|
 |
|