MQSeries.net :: View topic - Triggering on Linux is creating messages on the dead letter

bduncan · Posted: Wed Apr 18, 2001 3:31 pm

Problem Description: For some time we had been noticing that the SYSTEM.DEAD.LETTER.QUEUE on some of our machines would fill up over time. Several times this actually filled up /var/mqm, which caused the queue manager to fail. It became clear that this was only occurring on machines that had triggering enabled. This was supported by the fact that when we browsed the messages on the dead letter queue, they were all copies of the trigger messages that were supposed to be routed to the SYSTEM.DEFAULT.INITIATION.QUEUE.

What threw us off at first was that even though the trigger messages appeared to be going to the dead letter queue for some unknown reason, the triggered applications were still being launched. Because triggering was working, the problem took a low priority, and we didn't look at it again until the dead letter queues on several machines had grown very large.

As an experiment, I ran the trigger monitor in the foreground so I could see exactly what happened when a message landed on a triggered queue. One point about the way MQ triggers applications: it always passes a large string representing a Trigger Message Header structure as a parameter to the program it launches. This is for programs that need certain information from MQ to function properly. None of our programs required this information, but there was no way to turn the mechanism off.

What I observed was that when a message landed on the triggered queue, the trigger monitor would indeed execute our program, but then the shell that the trigger monitor was running under would complain about a non-existent file or unknown command, and the error always pointed at the Trigger Message Header. So it appeared that the command the trigger monitor was passing to the shell was corrupted somehow, and the shell was returning errors. When this occurs, the trigger monitor is designed to create a message describing the error and drop it on the dead letter queue, which is what we observed.

Problem Solution: I called IBM support about this, since it seemed likely that other people had run into the same problem. That turned out to be the case, and it saved us a lot of troubleshooting time. It turns out that the MQSeries trigger monitor program uses the system function to execute triggered applications. The system function takes a single string as a parameter and hands it to the shell to run as a command line.

The people at IBM informed me that the way MQ constructs this string is a little tricky. It takes three fields from the PROCESS definition we created within MQSeries to trigger our application: APPLICID, USERDATA, and ENVRDATA. APPLICID simply contains our command. We had it set to something like "csh /usr/db/feed/bin/autostartfeeds.csh &", and we left USERDATA and ENVRDATA blank. The trigger monitor constructs the system parameter by concatenating the fields in this order: APPLICID + USERDATA + Trigger Message Header + ENVRDATA. Notice that it drops the Trigger Message Header into the parameter.

So our problem occurred as follows. The system call passed the concatenated string to the shell, which executed it as a command line. As the shell parsed the string, it reached the last character of the APPLICID field, an "&", and ended the command there. The ampersand tells the shell to execute the preceding command in the background. It also has a second, more subtle purpose: it acts much like a semicolon does in Perl code, marking the end of a command. So the shell executed our application without any complaints. But that wasn't the end of our string. We had left the USERDATA and ENVRDATA fields blank, but MQ still inserted the Trigger Message Header between those two blank fields.
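The concatenation order described above can be sketched in a few lines of Python. This is only an illustration, not the trigger monitor's actual code: the header value is a made-up placeholder for the real Trigger Message Header, joining the fields with single spaces is an assumption, and both helper functions are hypothetical names. The split helper is deliberately naive (it ignores quoting and operators such as "&&"); it only shows where the shell ends the first command.

```python
# Hypothetical stand-in for the Trigger Message Header that MQ inserts;
# the real structure is a long string, not this literal.
TRIGGER_HEADER = "TMC2-PLACEHOLDER"


def build_command(applicid, userdata, envrdata, header=TRIGGER_HEADER):
    """Join the fields in the order given in the post, skipping blank ones.

    Separating the fields with a single space is an assumption.
    """
    parts = [applicid, userdata, header, envrdata]
    return " ".join(p for p in parts if p)


def split_on_background(command):
    """Naively split a command string at '&', where the shell ends a command."""
    return [piece.strip() for piece in command.split("&") if piece.strip()]


# APPLICID as we had it defined, with USERDATA and ENVRDATA left blank.
cmd = build_command("csh /usr/db/feed/bin/autostartfeeds.csh &", "", "")
print(cmd)
print(split_on_background(cmd))
```

With the ampersand at the end of APPLICID, the placeholder header ends up after the "&", so the split yields two pieces: our csh command, and the header on its own.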
So the shell would treat the Trigger Message Header as a second command (after correctly executing our application), and since this structure looks nothing like a shell command, the shell would return an error to the trigger monitor, which would in turn create a message and drop it on the dead letter queue. This explains the odd symptom: our programs were being triggered correctly, but we were still getting messages on the dead letter queue.

IBM support had a very simple fix for this problem: take the ampersand out of the APPLICID field and put it in the ENVRDATA field. This way the ampersand comes after the Trigger Message Header, which means that the entire string is treated as a single command by the system function.
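The difference between the two placements of the ampersand can be reproduced without MQ at all. The following is a minimal sketch, assuming a POSIX shell at /bin/sh. It uses Python's subprocess.run with shell=True as a rough stand-in for the C system function, "echo started" as a stand-in for our real script, and a made-up placeholder in place of the real Trigger Message Header. The function name run_like_system is hypothetical.

```python
import subprocess

# Hypothetical placeholder for the Trigger Message Header; it only needs to be
# something the shell cannot run as a command.
HEADER = "MQ_TRIGGER_HEADER_PLACEHOLDER"


def run_like_system(command):
    """Run a string through the shell, roughly as the C system() function would."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.returncode, result.stdout.strip(), result.stderr.strip()


# Ampersand at the end of APPLICID: the header lands after '&' and the shell
# tries to run it as a command of its own.
broken = run_like_system(f"echo started & {HEADER}")

# Ampersand moved to ENVRDATA: the header is just an argument to the one command.
fixed = run_like_system(f"echo started {HEADER} &")

print("broken:", broken)
print("fixed: ", fixed)
```

On a typical Linux shell the first call still prints "started" but comes back with a non-zero status and a "not found" style message on stderr, which matches the symptom we saw: the application runs, and the trigger monitor is still told that something failed. The second call should return status 0 with nothing on stderr, because the header is simply passed along as an argument.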

_________________
Brandon Duncan
IBM Certified MQSeries Specialist
MQSeries.net forum moderator

[ This Message was edited by: bduncan on 2001-05-16 14:12 ]

[ This Message was edited by: bduncan on 2001-05-16 14:13 ]