MQ Trigger question
ALCSALCS
Posted: Fri Oct 14, 2005 2:42 am    Post subject: MQ Trigger question

Acolyte

Joined: 14 Apr 2002
Posts: 61

We are running MQ V5.3.1 in a z/OS environment.

We have a triggered application input queue (trigger type FIRST, triggering on). Queue 'INPUT' receives messages from an external system.
Application 'A' reads the queue.

Normally queue 'INPUT' receives one message every two seconds.

When a trigger message arrives, application 'A' reads all the messages from the queue and exits.
Standard implementation of a triggerable queue.

If for some reason queue 'INPUT' reaches a certain threshold (120 messages), an offline job 'OFFLINE' runs and clears the queue.

The other day we reached this condition (120 messages) and the offline job 'OFFLINE' was called, but because this job had low priority and z/OS was very busy, it didn't start running until 1 hour later.

Looking at the contents of the 120 messages, we saw that they were put on the queue within an interval of 15 minutes, but for some unknown reason application 'A' didn't read the queue 'INPUT'.

Question 1) Any idea why application 'A' suddenly didn't read queue 'INPUT'?
a) Problem with a trigger message ? unlikely.
b) Problem with the application 'A' ? unlikely.
c) Something different in the message sent by the external system ? Difficult to know.

During the hour that the offline job 'OFFLINE' was waiting to run, we saw that application 'A' was reading messages from the queue 'INPUT', but on top of this, application 'A' was called at an average rate of 200 times/sec to read queue 'INPUT'.

This produced high activity in z/OS because application 'A' was performing, 200 times a second, a sequence of MQOPEN, MQGET (finding the queue empty) and MQCLOSE.

Question 2.
How is it possible that application 'A' was reading the queue correctly after the problem started? Application 'A' is called by the trigger message, and the trigger message is generated when the queue depth goes from 0 to 1.

Question 3.
Because job 'OFFLINE' didn't run for an hour, theoretically queue 'INPUT' contained at least 120 messages, so how is it possible that a trigger message was produced to call application 'A'?

Question 4.
How is it possible that so many trigger messages were produced?
(200 in a second)

Question 5.
Until we can discover the origin of the problem, any suggestions about what can be implemented if we receive this rate of 200 trigger messages/sec?

Any comment or possible explanation of this problem is welcome.

Thanks
jefflowrey
Posted: Fri Oct 14, 2005 3:09 am

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

I am betting your problem lies solely and completely with the code in Application A.

Probably, it decided to back out a message, and decided NOT to check the retry count.

And therefore, it spent the entire time trying to reprocess a message it could not handle (a "poison message" in the lingo).
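
To make the poison-message idea concrete, here is a minimal Python sketch. It is a toy in-memory model, not the MQI: `Message`, `BACKOUT_THRESHOLD`, `process_queue` and `handler` are hypothetical names made up for illustration. The only thing taken from MQ is the idea that each message carries a backout count (the BackoutCount field of the MQMD) which the application can compare against a limit.

```python
from collections import deque
from dataclasses import dataclass

BACKOUT_THRESHOLD = 3  # hypothetical limit, playing the role of a backout threshold


@dataclass
class Message:
    body: str
    backout_count: int = 0  # stands in for the MQMD BackoutCount field


def process_queue(queue, handler, dead_letters):
    """Drain `queue`, setting aside any message that keeps failing.

    Returns the number of get attempts, so the cost of retrying a
    poison message is visible.
    """
    attempts = 0
    while queue:
        msg = queue.popleft()            # simulated get under syncpoint
        attempts += 1
        if msg.backout_count >= BACKOUT_THRESHOLD:
            dead_letters.append(msg)     # give up and move the message aside
            continue
        try:
            handler(msg)
        except ValueError:
            msg.backout_count += 1       # simulated backout
            queue.appendleft(msg)        # message reappears at the head
    return attempts


def handler(msg):
    # Hypothetical processing step that cannot handle the "bad" message.
    if msg.body == "bad":
        raise ValueError("cannot process")


queue = deque([Message("ok-1"), Message("bad"), Message("ok-2")])
dead = []
attempts = process_queue(queue, handler, dead)
```

Without the `backout_count` check the loop above would never get past the "bad" message, which is the behaviour being guessed at here.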
_________________
I am *not* the model of the modern major general.
Mr Butcher
Posted: Fri Oct 14, 2005 3:18 am

Padawan

Joined: 23 May 2005
Posts: 1716

if the queue reaches >120 messages and the offline job is started, then are both (application A and the job) reading the queue?

how far did the job get? was the queue opened by application A only or also by the job?


Q1 - maybe the messages were uncommitted (already read by the job, but the job had not committed yet?!?) but - more likely - application A had a problem.

Q2 - yes, but this is not the only condition. there are many others. a trigger can be generated even with messages already in the queue. this also applies to Q3.

Q3 - one possible scenario - the batch job was running and application A stopped processing. because the queue was open for input by the job, there was no re-triggering (a first-triggered queue will be triggered again when the last process reading the queue closes it and it is not empty).
now - for whatever reason - the batch job ended or was cancelled without reading all the messages; then the queue is triggered as explained above and application A is restarted.

Q4 - are you sure this was caused by triggering? if application A (at that moment) was the only application processing the queue, and if it does not process the message (mqopen, mqget, mqbackout, mqclose), or if application A abends after the open, then the queue is triggered again and this will cause a loop (because of trigger first and the queue not being empty, as described above). but this only happens if A is the only one reading the queue. did you see any backouts or transaction abends?
also check the trigint of the queue manager (although this applies only when messages arrive, it will also cause a trigger).
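
the re-trigger-on-close loop from Q3/Q4 can be sketched as a small Python simulation. this is a simplified toy model under stated assumptions, not the real queue manager logic: `TriggeredQueue`, `run_trigger_monitor`, `draining_app` and `non_consuming_app` are hypothetical names, and the only rules modelled are "trigger when the depth goes from 0 to 1" and "trigger again when the last reader closes a non-empty queue".

```python
class TriggeredQueue:
    """Toy model of a queue with trigger type FIRST (not the real MQI)."""

    def __init__(self):
        self.messages = []
        self.input_handles = 0      # programs that have the queue open for input
        self.trigger_messages = 0   # trigger messages waiting on the initiation queue

    def _maybe_trigger(self):
        # Simplified rule: messages are present and nobody is serving the queue.
        if self.messages and self.input_handles == 0:
            self.trigger_messages += 1

    def put(self, msg):
        was_empty = not self.messages
        self.messages.append(msg)
        if was_empty:               # depth went from 0 to 1
            self._maybe_trigger()

    def open(self):
        self.input_handles += 1

    def close(self):
        self.input_handles -= 1
        self._maybe_trigger()       # re-trigger if messages were left behind


def draining_app(queue):
    """Well-behaved program: open, read everything, close."""
    queue.open()
    queue.messages.clear()
    queue.close()


def non_consuming_app(queue):
    """Program that opens and closes but leaves the messages on the queue."""
    queue.open()
    queue.close()


def run_trigger_monitor(queue, app, max_starts):
    """Start `app` once per trigger message, up to `max_starts` times."""
    starts = 0
    while queue.trigger_messages > 0 and starts < max_starts:
        queue.trigger_messages -= 1
        app(queue)
        starts += 1
    return starts


q_ok = TriggeredQueue()
for i in range(3):
    q_ok.put(f"msg{i}")
triggers_after_puts = q_ok.trigger_messages
starts_ok = run_trigger_monitor(q_ok, draining_app, max_starts=200)

q_bad = TriggeredQueue()
for i in range(3):
    q_bad.put(f"msg{i}")
starts_bad = run_trigger_monitor(q_bad, non_consuming_app, max_starts=200)
```

in this model the draining program is started once, while the program that leaves messages behind is restarted until the artificial `max_starts` cap stops it.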

Q5 - does the application check the backout count? can you restrict application A (e.g. if it is a cics transaction then you can limit the transaction class)?

hard to find answers to that from here.
the best thing is to look at the queue and qstats at the moment the error occurs. it will show you whether a process has the queue open and is processing it, and which one. a reset qstats may also help you investigate how many messages have been got / put (especially trigger messages). other logs may help too (cics/ims/batch/whatever application A is running in).
switch trigger off next time to see what happens. if it ends, then of course it is the trigger (or better, it is application A getting triggered, which works as designed).
_________________
Regards, Butcher
ALCSALCS
Posted: Fri Oct 14, 2005 5:00 am

Acolyte

Joined: 14 Apr 2002
Posts: 61

Answer to jefflowrey

Application A does the following

1) Open queue.

2) Get message.

3) If no message is available, it closes the queue.

4) If a message is available, it processes the message and reads the queue again until no message is available, then closes the queue.

Options: MQGMO_NO_SYNCPOINT


Answer to Mr Butcher

Job 'OFFLINE' did not execute until 1 hour later. It was waiting, so queue INPUT was not opened by job 'OFFLINE'.

Q1) Application A does what I indicated above.

Q2 and Q3) During the hour, only application A opened the queue; job OFFLINE was waiting to be executed.

Q4) A dump taken when the problem occurred shows that application 'A' was called by a trigger message at a rate of 200 times/sec and that the sequence of MQ calls was 'MQOPEN', 'MQGET' and 'MQCLOSE'.

Trigint for the queue manager is TRIGINT(999999999)

Q5) Application A doesn't do backout.

What do you mean by 'can you restrict application A'?

Many Thanks for the information provided
jefflowrey
Posted: Fri Oct 14, 2005 5:03 am

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

What does the application do if the message can't be processed?

What happens if the application abends?
_________________
I am *not* the model of the modern major general.
Mr Butcher
Posted: Fri Oct 14, 2005 5:11 am

Padawan

Joined: 23 May 2005
Posts: 1716

okay, so application A was the only one processing the queue.

i agree with jefflowrey, this looks like the typical "queue-not-empty-but-trigger-first" loop. and this is most likely caused by the application and not by the mq trigger mechanism.

by "restricting" i mean that you try to prevent your application A from being started that many times within such a short interval. i don't know your environment nor what application A is (batch, cics, ims, ....) so no more hints are possible on this one. anyway, restricting does not solve the problem.

when application A was triggered 200 times a second and did mqget, these mqgets returned 2033, you said? and there were messages in the queue. did you see uncommitted messages? if so, okay; if not, why 2033?
application error?
_________________
Regards, Butcher
ALCSALCS
Posted: Fri Oct 14, 2005 6:41 am

Acolyte

Joined: 14 Apr 2002
Posts: 61

The application does MQGET.

It checks for a completion code other than 0 together with a reason code other than 2033; if that is the case, it sends an error message.

If the completion code is 0, it processes the message and continues to read the next message until the reason code is 2033, at which point it does MQCLOSE.
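
As a rough illustration only, the read loop described above can be written as the following Python sketch. It does not call the real MQI: `fake_mqget` and `application_a` are hypothetical stand-ins, and only the numeric values of the completion and reason codes are taken from MQ. Stopping after an unexpected reason code is an assumption; the description above only says that an error message is sent.

```python
from collections import deque

MQCC_OK = 0
MQCC_FAILED = 2
MQRC_NONE = 0
MQRC_NO_MSG_AVAILABLE = 2033


def fake_mqget(queue):
    """Stand-in for MQGET: returns (message, completion_code, reason_code)."""
    if queue:
        return queue.popleft(), MQCC_OK, MQRC_NONE
    return None, MQCC_FAILED, MQRC_NO_MSG_AVAILABLE


def application_a(queue, process, report_error, get=fake_mqget):
    """Read until no message is available, recording the calls made."""
    calls = ["MQOPEN"]
    while True:
        msg, cc, rc = get(queue)
        calls.append("MQGET")
        if cc == MQCC_OK:
            process(msg)
            continue
        if rc != MQRC_NO_MSG_AVAILABLE:
            report_error(cc, rc)     # anything other than "queue empty"
        break                        # assumption: stop after any failure
    calls.append("MQCLOSE")
    return calls


processed, errors = [], []
calls_with_messages = application_a(
    deque(["m1", "m2"]), processed.append, lambda cc, rc: errors.append((cc, rc))
)
calls_on_empty_queue = application_a(
    deque(), processed.append, lambda cc, rc: errors.append((cc, rc))
)
```

With an empty queue the recorded sequence is just MQOPEN, MQGET, MQCLOSE, which matches the pattern seen in the dump.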

If the application abends, it will issue a dump. (no dump during the time of the problem)

We need to think about a way to restrict the application 'A' to prevent the consequences of the problem until we can find the origin of it.

Looking at the code for application A, the only way that the code executes MQOPEN, MQGET and MQCLOSE is that the reason code was 2033 and this is shown in the dump that we took.

Unfortunately the information from the queue 'INPUT' produced by job 'offline' after the hour is not available anymore.

So it is difficult to explain why 2033.

Many Thanks
kevinf2349
Posted: Fri Oct 14, 2005 7:05 am

Grand Master

Joined: 28 Feb 2003
Posts: 1311
Location: USA

Are you clearing out the CorrelID and MsgID fields to make all messages eligible?

Is the code re-entrant?
ALCSALCS
Posted: Mon Oct 17, 2005 2:20 am

Acolyte

Joined: 14 Apr 2002
Posts: 61

Application A reads the messages without specifying any CorrelID or MsgID making any message in the queue eligible.

The code is non re-entrant.

Thanks
Mr Butcher
Posted: Mon Oct 17, 2005 5:26 am

Padawan

Joined: 23 May 2005
Posts: 1716

if you read a message then the msgid and correlid fields are filled in from the message you get. if you reuse the storage / fields / variables (depending on your programming language) then the second get will be performed with the values returned from the first get, and will probably return 2033.
_________________
Regards, Butcher
ALCSALCS
Posted: Mon Oct 17, 2005 6:45 am

Acolyte

Joined: 14 Apr 2002
Posts: 61

Application A reads the message from the queue INPUT without any selection criteria (any message is acceptable). It doesn't use either msgID or CorrelID.

Many Thanks
kevinf2349
Posted: Mon Oct 17, 2005 7:15 am

Grand Master

Joined: 28 Feb 2003
Posts: 1311
Location: USA

Quote:
Application A reads the message from the queue INPUT without any selection criteria (any message is acceptable). It doesn't use either msgID or CorrelID.


and then you clear the fields again before the next GET, right?

If not, then you need to.
Mr Butcher
Posted: Mon Oct 17, 2005 7:16 am

Padawan

Joined: 23 May 2005
Posts: 1716

yes. my english is bad, and sometimes my explanations are bad. i try again

msg1 in queue: msgid ABC, correlid 123
msg2 in queue: msgid DEF, correlid 456

program A's mqmd data buffer: msgid '', correlid ''

the first mqget gets the first message; after that msgid is ABC, correlid is 123

the second mqget returns 2033 because it reads with the values from the first MQGET (if the same storage is used and not reset).
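
the same example as a small Python simulation. it is only a sketch under assumptions: `Mqmd` and `fake_mqget` are hypothetical stand-ins for the real structures and calls, and the empty byte strings stand in for MQMI_NONE / MQCI_NONE (binary zeros in the real MQI).

```python
from dataclasses import dataclass

MQRC_NONE = 0
MQRC_NO_MSG_AVAILABLE = 2033
MQMI_NONE = b""   # stand-in; the real value is a field of binary zeros
MQCI_NONE = b""   # stand-in; the real value is a field of binary zeros


@dataclass
class Mqmd:
    msg_id: bytes = MQMI_NONE
    correl_id: bytes = MQCI_NONE


def fake_mqget(queue, md):
    """Return a reason code; on success copy the message's ids into `md`."""
    for i, (msg_id, correl_id, _body) in enumerate(queue):
        id_ok = md.msg_id == MQMI_NONE or md.msg_id == msg_id
        correl_ok = md.correl_id == MQCI_NONE or md.correl_id == correl_id
        if id_ok and correl_ok:
            del queue[i]
            md.msg_id, md.correl_id = msg_id, correl_id   # ids are returned to the caller
            return MQRC_NONE
    return MQRC_NO_MSG_AVAILABLE


queue = [(b"ABC", b"123", "msg1"), (b"DEF", b"456", "msg2")]
md = Mqmd()

first_rc = fake_mqget(queue, md)        # gets msg1, md now holds ABC / 123
second_rc = fake_mqget(queue, md)       # md not reset, so msg2 does not match

md.msg_id, md.correl_id = MQMI_NONE, MQCI_NONE   # reset before the next get
third_rc = fake_mqget(queue, md)        # now msg2 is eligible
```

the second call fails with 2033 even though msg2 is still on the queue; after the reset the third call gets it.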

maybe it is best if you post the MQGET loop code snippet.
_________________
Regards, Butcher
ALCSALCS
Posted: Mon Oct 17, 2005 7:38 am

Acolyte

Joined: 14 Apr 2002
Posts: 61

No, my answer was not clear enough.
The field is reset.

But when the 2033 happened, I saw in the dump that the sequence of MQ macros was MQOPEN, MQGET and MQCLOSE, meaning that application A started (as a consequence of a trigger message), opened the queue 'INPUT', read for the first time, found the queue empty (2033), and closed the queue.

That was the case for all 200 times in the dump that application A was called.

Many Thanks
Mr Butcher
Posted: Mon Oct 17, 2005 10:42 pm

Padawan

Joined: 23 May 2005
Posts: 1716

does the dump show who started application A?

if it was by triggering, do you see the trigger monitor in the dump? this one should also have lots of mqgets (reading the trigger messages)
what environment is application A running in? cics? ims? ?!?

if it is a trigger problem then you should maybe involve ibm.
_________________
Regards, Butcher
Copyright © MQSeries.net. All rights reserved.