Challenger (Centurion; Joined: 31 Mar 2008; Posts: 115)
Posted: Mon Dec 01, 2008 1:20 am
Post subject: Challenge Question - 12 / 2008
Well, here comes your newest Challenger person
December 2008 Challenge:
A client application in a Windows environment is configured to be triggered from a queue under a Solaris queue manager.
The trigger monitor is the client version, running as a Windows service.
The queue has the following attributes set:
TRIGGER
TRIGDPTH(1)
TRIGTYPE(FIRST)
The queue manager's trigger interval attribute has a very low value, 10000, compared to the default of 999999999.
During testing the application team complains that multiple instances of the application are getting spawned, which causes the Windows server to crash. They ask the MQ Admin to check whether the trigger configuration is correct. The MQ Admin rechecks the configuration, confirms the queue is set to TRIGTYPE(FIRST), and advises the application team that the best practice for a triggered application is to read all the messages off the queue and exit only when it gets a 2033. The application team replies that they are indeed doing so.
The apps team and the MQ admin folks blame each other back and forth; finally the issue is reported resolved.
Tell me what may have caused the issue, which side you think won the argument, and exactly how it was eventually resolved.
All the best!!
AkankshA (Grand Master; Joined: 12 Jan 2006; Posts: 1494; Location: Singapore)
Posted: Tue Dec 02, 2008 3:02 am
My guess,
TrigInt is set to 10 secs & Trig type is first.
Under normal conditions a trigger message would be generated whenever the queue depth goes from 0 to 1.
Now if the application could not get started within 10 seconds (quite probable given the network delay from Windows to Solaris), that is, the OpenInputCount attribute of the application's local queue is still zero, another trigger message would get generated to start the application, and this continues until one instance opens the queue for processing.
Hence multiple instances of the application would be started.
Now each instance will try to read the messages from the queue. If the application is opening the queue for exclusive input (which it should), then only one instance will get messages.
Every other instance will then throw errors, presumably. This will cause the trigger monitor to dump the initiation message on the DLQ.
As all the messages are processed, the application instance will get a 2033 and close the connection.
Onus lies on both application and MQ admin here...
_________________
Cheers
fjb_saper (Grand High Poobah; Joined: 18 Nov 2003; Posts: 20756; Location: LI,NY)
Posted: Tue Dec 02, 2008 3:55 am
Is there a message affinity question embedded there?
You did not say why the apps folks were so upset about having multiple processes spawned. What was the max number of processes ever spawned?
_________________
MQ & Broker admin
PeterPotkay (Poobah; Joined: 15 May 2001; Posts: 7722)
Posted: Tue Dec 02, 2008 5:41 am
To add on to what AkankshA already said:
http://publib.boulder.ibm.com/infocenter/wmqv6/v6r0/topic/com.ibm.mq.csqzal.doc/fg13910_.htm
Quote:
"If you set TriggerInterval to a low value, and there is no application serving the application queue, trigger type FIRST might behave like trigger type EVERY. This depends on the rate that messages are being put onto the application queue, which in turn might depend on other system activity. This is because, if the trigger interval is very small, another trigger message is generated each time that a message is put onto the application queue, even though the trigger type is FIRST, not EVERY. (Trigger type FIRST with a trigger interval of zero is equivalent to trigger type EVERY.)"
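To make the quoted behaviour concrete, here is a rough toy model in Python. It is only a sketch, not MQ code: the function name `count_triggers` and its parameters are made up for illustration, and it assumes that no messages are removed before the application opens the queue and that no trigger messages are generated once it has.

```python
def count_triggers(put_times_ms, trigint_ms, app_opens_at_ms):
    """Count trigger messages for a TRIGTYPE(FIRST) queue in a toy model.

    put_times_ms    -- ascending times (ms) at which messages are put
    trigint_ms      -- the queue manager's trigger interval
    app_opens_at_ms -- time at which the triggered app finally opens the queue
    """
    triggers = 0
    last_trigger_ms = None
    depth = 0
    for t in put_times_ms:
        being_served = t >= app_opens_at_ms
        if not being_served:
            # First message on an empty queue, or a later message arriving
            # after the trigger interval has expired with nobody serving.
            if depth == 0 or t - last_trigger_ms >= trigint_ms:
                triggers += 1
                last_trigger_ms = t
        depth += 1
    return triggers


puts = list(range(0, 40000, 4000))  # one message every 4 seconds
slow_start = 35000                  # app needs 35 s before it opens the queue

print(count_triggers(puts, 10000, slow_start))      # low interval
print(count_triggers(puts, 999999999, slow_start))  # default interval
print(count_triggers(puts, 0, slow_start))          # behaves like EVERY
```

In this model the low interval yields 3 trigger messages, the default yields 1, and an interval of zero yields one per message put before the open (9 here), which is the "FIRST behaves like EVERY" effect the manual describes.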
It sounds like this application takes longer than 10,000 ms to open the queue after being triggered. Maybe it takes that long to initialize itself (connecting to DBs, etc.).
The solution is to increase the Trigger Interval or, if that can't be done (it's a QM-level attribute shared by all apps), change the app to MQOPEN the queue ASAP after being triggered, if multiple instances are indeed a problem rather than a curiosity.
_________________
Peter Potkay
Keep Calm and MQ On
Challenger (Centurion; Joined: 31 Mar 2008; Posts: 115)
Posted: Tue Dec 02, 2008 4:21 pm
OK. At this moment I don't want to comment on what AkankshA and PeterPotkay have replied. Let's see if someone can look at this problem from a different angle.
To answer fjb_saper: the Windows server on which the application runs is quite heavily loaded, and the business is quite happy with having a single instance of the receiver application processing the messages sequentially off the queue. The spawning of multiple processes is impacting other critical applications which run on the same physical Windows server.
fjb_saper (Grand High Poobah; Joined: 18 Nov 2003; Posts: 20756; Location: LI,NY)
Posted: Tue Dec 02, 2008 9:19 pm
Allow more time for the trigint...
Have the application open the queue in exclusive mode. You may run a DLQ handler to delete rc 265, could not start app.
You may also attempt to register the app at start up with some kind of registry. This registry keeps track of how many instances of the app are running and allows up to a predetermined max. When the application starts it tries to register. If it is past the max instances the registry returns some kind of information and the instance of the app shuts down. The registry returns to the app the ordinal and the max allowed. If the ordinal is greater than the max allowed the app signals to the registry that it is shutting down and does so.
This allows for instant modification if the registry has a method for reloading its .ini file. You can adjust the amount of resources dedicated to processing the queue according to the overall workload. At regular intervals the instances check with the registry and shut down if the box's workload has put their ordinal above the max allowed. If the instance's ordinal is below the max allowed it may query the registry about the total # of instances actually running. If the total # running is below the max it spawns off a new instance and initializes it (it needs to run in a different thread) to increase load and parallel processing.
If the instance has no more work it signals the registry that it is shutting down and does so.
I know this sounds quite complicated but allows for dynamic load management on the triggered side... and of course the assumption here is that there is no message affinity.
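For what it's worth, the registration part of this could be sketched roughly as below, in Python. Everything here is hypothetical: the class `InstanceRegistry`, its methods, and the helper `start_instance` are names invented for illustration, and the registry lives in-process behind a lock, whereas a real one would have to be a separate service or shared store that every triggered instance can reach.

```python
import threading


class InstanceRegistry:
    """Hypothetical registry that caps the number of running app instances."""

    def __init__(self, max_instances):
        self._max = max_instances
        self._active = set()  # ordinals currently registered
        self._lock = threading.Lock()

    def register(self):
        """Hand out the lowest free ordinal; return (ordinal, max_allowed)."""
        with self._lock:
            ordinal = 1
            while ordinal in self._active:
                ordinal += 1
            self._active.add(ordinal)
            return ordinal, self._max

    def deregister(self, ordinal):
        """Called by an instance that is shutting down."""
        with self._lock:
            self._active.discard(ordinal)

    def set_max(self, max_instances):
        """Stand-in for reloading the limit from the .ini file."""
        with self._lock:
            self._max = max_instances

    def may_continue(self, ordinal):
        """Periodic check: is this ordinal still within the allowed max?"""
        with self._lock:
            return ordinal <= self._max

    def running_count(self):
        with self._lock:
            return len(self._active)


def start_instance(registry):
    """Register at startup; return the ordinal, or None if over the max."""
    ordinal, max_allowed = registry.register()
    if ordinal > max_allowed:
        registry.deregister(ordinal)  # signal shutdown and back out
        return None
    return ordinal
```

An instance that gets `None` back simply exits; a running instance would call `may_continue(ordinal)` at regular intervals and shut down (after `deregister`) once the limit has been lowered below its ordinal.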
Have fun
_________________
MQ & Broker admin
Challenger (Centurion; Joined: 31 Mar 2008; Posts: 115)
Posted: Sun Dec 07, 2008 1:47 pm
OK. We got a couple of solutions to the problem. But all of them rest on the assumption that there was a delay of 10 seconds before the triggered application opened the queue for input.
The triggered application initializes DB resources before it opens the queue. This is a resource-heavy process, which is one reason why triggering multiple application instances is not preferred. However, it was observed that this initialization occurs within a few seconds (less than 2 seconds). Also, during the blame game period, the MQ Admin changed the trigger interval parameter to the default value 999999999. Still the strange behaviour was observed.
What could have created such a bizarre situation?
PeterPotkay (Poobah; Joined: 15 May 2001; Posts: 7722)
Posted: Sun Dec 07, 2008 5:39 pm
Challenger wrote:
"However, it was observed that this initialization occurs within a few seconds (less than 2 seconds). Also, during the blame game period, the MQ Admin changed the trigger interval parameter to the default value 999999999. Still the strange behaviour was observed.
What could have created such a bizarre situation?"
The app was successfully getting triggered, opening the queue, and then closing the queue (with more than 0 messages on it), but not ending. As soon as the app closed the non-empty queue, another triggered instance was started, and if that one did the same thing but didn't end, a 3rd instance was triggered, etc., etc.
_________________
Peter Potkay
Keep Calm and MQ On
Challenger (Centurion; Joined: 31 Mar 2008; Posts: 115)
Posted: Sun Dec 07, 2008 7:31 pm
OK. Now we got a different view of the problem. Peter has suggested a possible scenario which could result in that bizarre situation. But wait, isn't TRIGINT meant to address this problem? Or is it?
xhaxk (Apprentice; Joined: 30 Oct 2008; Posts: 31)
Posted: Sun Dec 07, 2008 10:45 pm
No. TRIGINT only applies to msgs being put to the queue; a trigger msg generated when a non-empty queue is closed does not consider TRIGINT.
PeterPotkay (Poobah; Joined: 15 May 2001; Posts: 7722)
Posted: Mon Dec 08, 2008 6:52 am
xhaxk wrote:
"No. TRIGINT only applies to msgs being put to the queue;"
except on z/OS, where you do not need another message to be put. On z/OS, the Trigger Interval passing without another message being put is enough to generate a new trigger message.
This doesn't apply to this Challenge, which involves only distributed platforms (Solaris and Windows), but thought I'd mention it.
_________________
Peter Potkay
Keep Calm and MQ On
xhaxk (Apprentice; Joined: 30 Oct 2008; Posts: 31)
Posted: Mon Dec 08, 2008 1:45 pm
Quote:
"except on z/OS, where you do not need another message to be put."
I did not know that. Do you have a reference for this, or a quote from a manual? I could not find it in the trigger conditions in the APG.
PeterPotkay (Poobah; Joined: 15 May 2001; Posts: 7722)
Posted: Mon Dec 08, 2008 2:01 pm
bruce2359 (Poobah; Joined: 05 Jan 2008; Posts: 9469; Location: US: west coast, almost. Otherwise, enroute.)
Posted: Fri Dec 12, 2008 8:52 am
Quote:
"In my opinion they should change distributed to match."
The reason for this is the abundant horsepower in the mainframe, which can take care of missing triggers, discarding expired messages, and writing log buffers to logs (see below) as background tasks, with little or no impact on SLAs.
Although not part of this challenge, the same should be said about _SYNCPOINT vs. _NO_SYNCPOINT.
It would be a delight if WMQ behaved consistently across all platforms. Of course, this would cut into some of my consulting...
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
Challenger (Centurion; Joined: 31 Mar 2008; Posts: 115)
Posted: Thu Dec 18, 2008 3:09 pm
Many thanks to all for participating in this edition of the challenge. I would like to announce Peter Potkay as the winner.
PeterPotkay wrote:
"The app was successfully getting triggered, opening the queue, and then closing the queue (with more than 0 messages on it), but not ending. As soon as the app closed the non-empty queue, another triggered instance was started, and if that one did the same thing but didn't end, a 3rd instance was triggered, etc., etc."
The real reason for this behaviour was the way in which the application has coded the loop.
Code:
boolean qEmpty = false;
while ( not(qEmpty))
{
    MQConnect(QMGR);
    MQOPEN(Q);
    MQREAD(Q);
    MQCLOSE(Q);   // closes the queue after every single message, even if it is not empty
    MQDISC(Q);
    Exception : if mqrc = 2033
        qEmpty = true;
    else
        Exception Handler code
        break;
}
A trigger message was getting generated each time the application closed the non-empty queue within the while loop.
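The difference between that loop and the usual "open once, read until 2033, then close" shape can be illustrated with a small Python simulation. This is only a sketch: `FakeQueue` is a made-up stand-in, not an MQ API, and it models just the one rule discussed in this thread, namely that a trigger message is generated when the last reader closes a TRIGTYPE(FIRST) queue that still has messages on it, regardless of TRIGINT.

```python
from collections import deque

MQRC_NO_MSG_AVAILABLE = 2033  # reason code returned when the queue is empty


class FakeQueue:
    """Toy stand-in for a triggered queue; counts trigger messages on close."""

    def __init__(self, messages):
        self.messages = deque(messages)
        self.open_input_count = 0
        self.trigger_messages = 0

    def open(self):
        self.open_input_count += 1

    def get(self):
        # Returns (message, reason_code), loosely mimicking a get call.
        if not self.messages:
            return None, MQRC_NO_MSG_AVAILABLE
        return self.messages.popleft(), 0

    def close(self):
        self.open_input_count -= 1
        # Last reader closing a non-empty queue: new trigger message.
        if self.open_input_count == 0 and self.messages:
            self.trigger_messages += 1


def drain_reopening_each_time(queue):
    """Shape of the problematic loop: open, get one message, close, repeat."""
    while True:
        queue.open()
        _, reason = queue.get()
        queue.close()
        if reason == MQRC_NO_MSG_AVAILABLE:
            break


def drain_with_single_open(queue):
    """Open once, get until 2033, and close only when the queue is empty."""
    queue.open()
    try:
        while True:
            _, reason = queue.get()
            if reason == MQRC_NO_MSG_AVAILABLE:
                break
    finally:
        queue.close()


looping = FakeQueue(range(5))
drain_reopening_each_time(looping)
print(looping.trigger_messages)

single = FakeQueue(range(5))
drain_with_single_open(single)
print(single.trigger_messages)
```

With five messages on the queue, the reopening loop produces 4 extra trigger messages in this model (one for every close that leaves messages behind), while the single-open version produces none.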
A special mention to xhaxk for clarifying that TRIGINT is not applicable in such cases, and to fjb_saper for presenting a new architecture for dynamic load management across triggered applications.
-Challenger