JosephGramig |
Posted: Fri Jul 12, 2024 8:39 am Post subject: File System Full on RDQM |
|
|
 Grand Master
Joined: 09 Feb 2006 Posts: 1244 Location: Gold Coast of Florida, USA
|
This is an oldie but a goodie. Take the classic case where some igmo fills the Dead Letter Queue (DLQ) with msgs bound for a queue that does not exist. In this case, the file system filled before the DLQ did, and that brought the QMgr down (stopped).
Normally, one would look for some files to delete so that the QMgr can be started with the -ns option and the msgs on the DLQ can then be processed (deleted).
Because this is RDQM, there are no extraneous files to delete to free space, and in any case this is a replicated file system. The file systems must stay in sync whatever happens next, so we can't just haul off and expand the file system.
Does anybody have a link to the docs that explain how to expand RDQM file systems?
In my case, I just deleted the DLQ file, which breaks the queue (I knew I didn't need what was on it). I then cleaned up the resource with crm_resource, which restarted the QMgr. After that I could delete and redefine the DLQ.
I'm looking for an answer where I don't blindly delete stuff. |
|
 |
fjb_saper |
Posted: Fri Jul 12, 2024 6:04 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
Hi Jo,
Next time you need to:
- Back up the RDQM qmgr as described in the RDQM doc
- Delete the qmgr from RDQM
- Recreate the qmgr with more space
- Restore the qmgr from backup
Enjoy  _________________ MQ & Broker admin |
|
 |
Andyh |
Posted: Sat Jul 13, 2024 4:22 am Post subject: |
|
|
Master
Joined: 29 Jul 2010 Posts: 239
|
If the queue manager is properly configured (three file systems, one for queue manager object data, one for recovery logs, one for /var/mqm) then it would imply an APARable MQ bug for the queue manager to crash when the qmgr data file system filled, or for the need to release file system space from logs/data in order to restart the QMgr.
In the not too distant past this wouldn't have been the case, but even before I retired (2021) we believed that we'd addressed all of the issues in this area.
A simple file system filling issue is not an acceptable reason for MQ to lose data.
You should contact IBM/MQ support over this issue. |
|
 |
bruce2359 |
Posted: Sat Jul 13, 2024 7:36 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
These same symptoms and resolutions apply as well to non-RDQM qmgrs.
Identify and punish the offending developer(s). Apply automation to watch for, report (outcall) and manage errant behaviors.
Disk space is cheap. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
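As a rough illustration of the "watch for and report" idea, here is a minimal sketch of a file system usage check in POSIX shell. The function names, the 80% threshold, and the `/var/mqm` path in the usage note are assumptions for illustration, not something taken from this thread or from MQ itself. It only reads `df -P` output and prints a warning; wiring it into real alerting is left out.

```shell
#!/bin/sh
# Print the percentage of space used on the file system that holds PATH.
# df -P prints one line per file system; field 5 looks like "42%".
fs_used_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# check_fs PATH LIMIT
# Returns 0 when usage is below LIMIT percent, 1 when it is at or above
# LIMIT, and 2 when the usage could not be determined.
check_fs() {
    path=$1
    limit=$2
    used=$(fs_used_pct "$path")
    if [ -z "$used" ]; then
        echo "ERROR: could not read usage for $path"
        return 2
    fi
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $path is ${used}% full (limit ${limit}%)"
        return 1
    fi
    echo "OK: $path is ${used}% full"
    return 0
}
```

Something like `check_fs /var/mqm 80` could then be run from cron against each MQ file system, with the non-zero return code used to trigger whatever notification you already have.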
|
 |
PeterPotkay |
Posted: Sun Jul 14, 2024 4:11 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
I have several 0.5 GB "dummy" files on each MQ file system. If we ever get critically low on space I can delete one at a time to free up 1/2 a gig as needed. When the crisis is solved, recreate the dummy files to once again have a few GB insurance. _________________ Peter Potkay
Keep Calm and MQ On |
|
 |
JosephGramig |
Posted: Mon Jul 15, 2024 5:47 am Post subject: |
|
|
 Grand Master
Joined: 09 Feb 2006 Posts: 1244 Location: Gold Coast of Florida, USA
|
fjb,
I see the point of your proposed solution and it would seem workable. This is a fairly easy to reproduce issue to test the solution. BTW, my youngest daughter is Jo and I'm Joe.
Peter,
Hmmm, that is a good stop gap measure.
Bruce,
I shall thoroughly trout the offenders.  |
|
 |
gbaddeley |
Posted: Mon Jul 15, 2024 5:37 pm Post subject: |
|
|
 Jedi Knight
Joined: 25 Mar 2003 Posts: 2538 Location: Melbourne, Australia
|
PeterPotkay wrote: |
I have several 0.5 GB "dummy" files on each MQ file system. If we ever get critically low on space I can delete one at a time to free up 1/2 a gig as needed. When the crisis is solved, recreate the dummy files to once again have a few GB insurance. |
In case anyone is wondering how to create dummy files on UNIX
Code: |
dd if=/dev/zero bs=1048576 count=MBSIZE of=FILENAME
MBSIZE = number of megabytes, e.g. 512
FILENAME = your file name of choice, e.g. dummy1, dummy2 _________________ Glenn |
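To create several of these files in one go, and to recreate only the missing ones after a crisis, the same `dd` command can be wrapped in a small loop. This is a sketch only: the function name `make_ballast` and the `dummyN` naming are assumptions for illustration.

```shell
#!/bin/sh
# make_ballast DIR COUNT SIZE_MB
# Create COUNT files named dummy1..dummyCOUNT in DIR, each SIZE_MB megabytes.
# Files that already exist are left alone, so the function can be re-run
# after some of them were deleted to free space.
make_ballast() {
    dir=$1
    count=$2
    size_mb=$3
    i=1
    while [ "$i" -le "$count" ]; do
        file="$dir/dummy$i"
        if [ ! -e "$file" ]; then
            dd if=/dev/zero of="$file" bs=1048576 count="$size_mb" 2>/dev/null || return 1
        fi
        i=$((i + 1))
    done
}
```

For example, `make_ballast /path/to/mq/filesystem 4 512` would give four 512 MB files, roughly the "few GB insurance" described above.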
|
 |
Andyh |
Posted: Tue Jul 16, 2024 4:34 am Post subject: |
|
|
Master
Joined: 29 Jul 2010 Posts: 239
|
You state "This is a fairly easy to reproduce issue to test the solution." and as such it's even more important that it gets reported to IBM and resolved.
As stated previously, a simple disk space shortage on a correctly configured queue manager is NOT an acceptable reason for the queue manager to crash, and even on a badly configured queue manager not an acceptable reason for any persistent message loss.
I'd be very disappointed if this were not still part of the regular regression tests within MQ, and hence I suspect there's some disconnect here. Can you elaborate on the steps you take to easily reproduce this? |
|
 |
bruce2359 |
Posted: Tue Jul 16, 2024 6:13 am Post subject: Re: File System Full on RDQM |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
JosephGramig wrote: |
This is an oldie but a goodie. Take the classic case where some igmo fills the Dead Letter Queue (DLQ) with msgs bound for a queue that does not exist. In this case, the file system filled before the DLQ did, and that brought the QMgr down (stopped). |
This event should have created an FDC. What did the FDC tell you? |
|
 |
bruce2359 |
Posted: Sun Aug 11, 2024 8:02 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
Any update on this? Resolved? |
|
 |
JosephGramig |
Posted: Tue Aug 13, 2024 7:22 am Post subject: |
|
|
 Grand Master
Joined: 09 Feb 2006 Posts: 1244 Location: Gold Coast of Florida, USA
|
Bruce,
Yes. It was resolved in a less than optimal way. The DLQ was truncated (wrong answer, but it worked). I cleaned up the resources, which restarted everything right away (not exactly what I expected, but I now know to expect that). I then deleted and redefined the DLQ to repair the damage from removing the queue the wrong way.
I implemented the space-occupying files in the "data" directory. In the actual PROD, there is monitoring.
Andy, running out of space on the file system is not a defect in the MQ product but a failing of the folks managing it.
The real answer, had this really happened in PROD, was given by fjb_saper. |
|
 |
bruce2359 |
Posted: Tue Aug 13, 2024 11:56 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
JosephGramig wrote: |
Bruce,
Yes. It was resolved in a less than optimal way. The DLQ was truncated (wrong answer, but it worked). I cleaned up the resources, which restarted everything right away (not exactly what I expected, but I now know to expect that). I then deleted and redefined the DLQ to repair the damage from removing the queue the wrong way.
I implemented the space-occupying files in the "data" directory. In the actual PROD, there is monitoring.
Andy, running out of space on the file system is not a defect in the MQ product but a failing of the folks managing it.
The real answer, had this really happened in PROD, was given by fjb_saper. |
What is/was the wrong way? |
|
 |
JosephGramig |
Posted: Tue Aug 13, 2024 1:43 pm Post subject: |
|
|
 Grand Master
Joined: 09 Feb 2006 Posts: 1244 Location: Gold Coast of Florida, USA
|
Bruce,
What was wrong is how the space got released from the DLQ. Don't just delete the file or truncate it. In my case, I knew nothing on it was needed. Doing so did corrupt the queue, and that is why it needed to be deleted and redefined (through MQSC). |
|
 |
bruce2359 |
Posted: Tue Aug 13, 2024 2:32 pm Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
Are you saying that you navigated down the .. Qmgrs/yourqmgr/queues/ file system and deleted the DLQ queue? |
|
 |
fjb_saper |
Posted: Wed Aug 14, 2024 7:48 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
Never delete the DLQ. Rather try the following:
Run a DLQ handler to discard the messages on the DLQ.
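A minimal sketch of what that could look like with `runmqdlq` follows. The queue manager name `QM1`, the rules file path, and the choice of `MQRC_UNKNOWN_OBJECT_NAME` as the reason to match are assumptions for illustration; check the reason codes actually present in your dead-letter headers before discarding anything. Note that the handler connects to the queue manager, so this only applies while the queue manager is running.

```shell
#!/bin/sh
# Hypothetical names; replace with your own queue manager and paths.
QMGR=${QMGR:-QM1}
DLQ=${DLQ:-SYSTEM.DEAD.LETTER.QUEUE}
RULES=${RULES:-/tmp/dlq_discard.rul}

# Write the rules table. The first entry is control data; each later
# entry is a rule. Lines starting with an asterisk are comments.
cat > "$RULES" <<EOF
* Process what is on the DLQ now, then end instead of waiting for more
INPUTQM($QMGR) INPUTQ($DLQ) WAIT(NO)
* Assumed reason code for msgs that targeted a queue that does not exist
REASON(MQRC_UNKNOWN_OBJECT_NAME) ACTION(DISCARD)
EOF

# Run the handler only if the MQ command is on the PATH.
if command -v runmqdlq >/dev/null 2>&1; then
    runmqdlq "$DLQ" "$QMGR" < "$RULES"
else
    echo "runmqdlq not found; rules table written to $RULES"
fi
```

Messages that do not match the rule are left on the DLQ, which is the safer default when you are not sure everything on it is disposable.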
 _________________ MQ & Broker admin |
|
 |
|