WMQI takes a break every 65 seconds
PeterPotkay
Posted: Fri Feb 28, 2003 7:29 am

Poobah

Joined: 15 May 2001
Posts: 7722

QM1 (OS/390) needs to send a message to QM2 (Solaris).

QMHUB (NT) sits in the middle and hosts WMQI (2.1 CSD2). A message flow takes the COBOL format from QM1 and converts it to XML Accord format. A second message flow takes the XML Accord reply message and converts it back to COBOL format. There are no other flows doing any work on this WMQI setup.

The request messages are about 15-25K. The replies are under 5K. Volume is fairly consistent at about 100 messages per minute.


Every 65 seconds, the Input queue to the request flows starts backing up for about 5 seconds. During this backup, the DataflowEngine.exe drops to zero CPU. Then it kicks up to about 50% to clear the backlog, and then stays around 20% for the next 60 seconds, when the backup starts again. Every 65 seconds like clockwork.

During the backup, CPU for the entire box drops down to near zero. I can't see that WMQI or any other application is busy at this time. Just every 65 seconds, WMQI decides to take a break.
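One way to confirm that the stalls really are periodic is to sample the queue depth at a fixed interval and measure the time between the moments the depth first climbs. The following is only a minimal sketch: the function names, the threshold, and the sample data are all hypothetical, and how the depth samples are collected is left out.

```python
from statistics import median

def stall_starts(samples, threshold):
    """Return the timestamps at which depth first rises above threshold.

    samples: list of (seconds, queue_depth) tuples in time order.
    """
    starts = []
    above = False
    for t, depth in samples:
        if depth > threshold and not above:
            starts.append(t)  # rising edge: a new backup begins here
        above = depth > threshold
    return starts

def stall_period(samples, threshold):
    """Median interval between consecutive stall starts, or None if fewer than two."""
    starts = stall_starts(samples, threshold)
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    return median(gaps) if gaps else None

# Hypothetical data: depth sampled once a second, backing up for 5 s every 65 s.
samples = [(t, 8 if t % 65 >= 60 else 0) for t in range(300)]
print(stall_period(samples, threshold=5))  # 65
```

A steady median with little spread points at a timer-driven cause rather than at the message content.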

Any ideas where to look for what? This is puzzling.
_________________
Peter Potkay
Keep Calm and MQ On
yaakovd
Posted: Fri Feb 28, 2003 9:00 am

Partisan

Joined: 20 Jan 2003
Posts: 319
Location: Israel

What about garbage collection?
Or some other memory/DB cleanup?
Try changing the message size to 50 K. Does the period get shorter?

It is just my guess... Maybe somebody knows the answer.
_________________
Best regards.
Yaakov
SWG, IBM Commerce, Israel
kwelch
Posted: Fri Feb 28, 2003 9:01 am

Master

Joined: 16 May 2001
Posts: 255

I work with Peter and am the WMQI support for this application. I just wanted to add that this flow and execution group has been running on this box for a couple of months now with no problems. We have not done any recent deploys and as far as we know nothing has changed on this box from an NT perspective. This started happening a couple days ago and has been happening consistently every 65 seconds ever since. We scheduled a reboot of the machine last night and the problem is still persisting. All thoughts and ideas are welcome!!!!

Karen
jefflowrey
Posted: Fri Feb 28, 2003 10:48 am

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

How are you measuring this 'backup'? Are you looking at queue depth?

What you could be seeing is an artifact of the way the MCA works. You could be recording a queue depth that shows uncommitted messages on the queue coming out of the MCA. They won't be visible to WMQI until the sending MCA has told the WMQI MCA to commit them and it's been acknowledged. On the other hand, the queue depth will still increase. Queue depth indicates both uncommitted messages on the queue and committed messages on the queue.
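The distinction between what the depth counter reports and what a getter can actually retrieve can be illustrated with a toy model. This is not MQSeries code; the class and method names are invented purely for illustration.

```python
class ToyQueue:
    """Toy model only: depth counts every message, committed or not."""

    def __init__(self):
        self.committed = []
        self.uncommitted = []

    def put_in_syncpoint(self, msg):
        # The message occupies the queue but is not yet visible to getters.
        self.uncommitted.append(msg)

    def commit(self):
        # Committing makes all pending messages visible at once.
        self.committed.extend(self.uncommitted)
        self.uncommitted.clear()

    @property
    def depth(self):
        return len(self.committed) + len(self.uncommitted)

    def gettable(self):
        return list(self.committed)

q = ToyQueue()
for i in range(3):
    q.put_in_syncpoint(f"msg{i}")
print(q.depth, len(q.gettable()))  # 3 0
q.commit()
print(q.depth, len(q.gettable()))  # 3 3
```

In this model a monitor watching only the depth would report a backlog before the consumer has anything it can read.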
kwelch
Posted: Fri Feb 28, 2003 11:18 am

Master

Joined: 16 May 2001
Posts: 255

Yes, we are looking at queue depth. I have noticed recently that it's not just the input queue to the WMQI flow: any queue that might have a message in it backs up (i.e. the input to another flow, 2 XMIT queues to the mainframe and the SYSTEM.CLUSTER.TRANSMIT.QUEUE all back up) until whatever is keeping things from running lets go. Then everything starts back up again. The fact that this happens so routinely every 60-65 seconds makes it puzzling.
PeterPotkay
Posted: Fri Feb 28, 2003 12:15 pm

Poobah

Joined: 15 May 2001
Posts: 7722

The messages are committed. I can browse them during the backup. I don't believe it has anything to do with the channels. The messages are cleanly being delivered to the input queue.



Everyone swears there have been no application changes. This has been running fine in prod for over 3 months, and then yesterday this started happening.

The fact that it happens every 65 seconds makes me think it has nothing to do with app code or the actual messages. Rather that this is a side effect of the OS doing something regularly, or WMQI doing something internally, or maybe some monitoring thing. I don't know; something is telling WMQI to take a break from processing every 65 seconds.
_________________
Peter Potkay
Keep Calm and MQ On
jefflowrey
Posted: Mon Mar 03, 2003 6:48 am

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

If you're sure the messages are committed, then it is more likely some process outside WMQI that's causing the symptoms you're seeing.

There are any number of different ways to monitor Windows to find out what resources are being used at what times and by what things. It sounds like you've done some work with Task Manager, and not gotten anywhere. Performance Monitor is the next place to start. You could also modify the policy of the local machine to log every object access into the security log. This might show you things that are accessing the box from the network.

Other things to try are enabling tracing in WMQI, and checking the broker database logs.
kirani
Posted: Mon Mar 03, 2003 9:45 pm

Jedi Knight

Joined: 05 Sep 2001
Posts: 3779
Location: Torrance, CA, USA

Peter,

Is it possible that your message flow is waiting for some I/O or your messages are backing out for some reason? Do you have multiple instances of the message flow running?
_________________
Kiran


IBM Cert. Solution Designer & System Administrator - WBIMB V5
IBM Cert. Solutions Expert - WMQI
IBM Cert. Specialist - WMQI, MQSeries
IBM Cert. Developer - MQSeries

PeterPotkay
Posted: Tue Mar 04, 2003 5:11 am

Poobah

Joined: 15 May 2001
Posts: 7722

When I browse the messages, the backout count is zero, so I don't think they are being backed out.

I think Karen's point above is important. It is not just the Input queue to the MQSI request flow that backs up every cycle. Any in-flight messages (replies coming back to the Reply Input queue, messages that have just exited the flows and are on the XMIT queues) also get stuck. The halt of work is affecting the request flow, the reply flow, and the MCAs.

It is just most visible on the request flow because the mainframe keeps pumping out requests during the slowdown.
_________________
Peter Potkay
Keep Calm and MQ On
PeterPotkay
Posted: Wed Mar 05, 2003 12:12 pm

Poobah

Joined: 15 May 2001
Posts: 7722

Problem solved, I think.

The message flow had a database node that was connecting to an Oracle database on a separate server. It would insert a record for every message it processed. That server had hardware issues. While talking with the sys admin for that server, he mentioned that periodically it would take a while to respond. I asked him to try to do things at the same time I noticed the WMQI Input queue backing up. Sure enough, every time my queue started backing up, he noticed performance problems as well. While my flow was working correctly, he noticed no problems.

The hardware was replaced (some sort of network card), the box rebooted, and the problem went away.
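A dependency like this is easier to spot when every call to an external system is timed, so that slow calls can be lined up against the moments the queue backs up. The following is a generic sketch of that idea, not WMQI or ESQL code; `timed`, `slow_calls`, and the labels are hypothetical names, and the database insert itself is left as a placeholder.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label, log, clock=time.perf_counter):
    """Append (label, elapsed_seconds) to log for the wrapped block."""
    start = clock()
    try:
        yield
    finally:
        # Record the duration even if the wrapped call raises.
        log.append((label, clock() - start))

def slow_calls(log, limit_seconds):
    """Return the entries that took longer than limit_seconds."""
    return [(label, d) for label, d in log if d > limit_seconds]

log = []
with timed("insert-1", log):
    pass  # placeholder: the database insert would go here

print(slow_calls(log, limit_seconds=1.0))
```

The `clock` parameter exists only so the timer can be replaced with a fake in tests.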


Two outstanding questions leave me not 100% sure that this was in fact the source of the problem.

1.) While the message flow was waiting for a response from the database, it was blocked with nothing to do, so it would wait, consuming 0 CPU. But why were the MCAs affected? Whenever the blocking occurred, I saw XMIT queues backing up. I remember from previous scenarios that if WMQI was 100% busy with the CPU, the MCAs would take a back seat. They are lower on the totem pole as far as resources are concerned when compared to WMQI. But would they also get blocked if WMQI was blocked (but using 0% CPU)?

2.) Our WMQI trace shows the period of inactivity. But it shows the gap after it completed the PUT of one transformed message and before it got the next. I would have thought I would have seen the gap of time where the database node was being used, not after the flow completed. The trace makes it look like this had nothing to do with the database. Time waiting doing nothing was in-between executions of the flow, not during.
_________________
Peter Potkay
Keep Calm and MQ On
jefflowrey
Posted: Wed Mar 05, 2003 12:47 pm

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

Quote:
1.) While the message flow was waiting for a response from the database, it was blocked with nothing to do, so it would wait, consuming 0 CPU. But why were the MCAs affected? Whenever the blocking occurred, I saw XMIT queues backing up. I remember from previous scenarios that if WMQI was 100% busy with the CPU, the MCAs would take a back seat. They are lower on the totem pole as far as resources are concerned when compared to WMQI. But would they also get blocked if WMQI was blocked (but using 0% CPU)?


If you are using coordinated transactions (globally or otherwise), the effect on the MCA could be explained. WMQI uses MQSeries as the TM for coordinated transactions. If the database is slow to respond to a commit request, that could be creating a block on the TM within MQSeries. The MCA could then presumably be affected by that block, because it is waiting for its own commit processing. I'm not saying this *is* happening, but it could be.

If you're explicitly setting Transaction Mode = No for all compute nodes and message flows, or using non-persistent messages for all flows, it's harder to explain (and I can't).
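To make the theory concrete, here is a toy model in which every commit passes through a single point, one at a time. That serialisation is purely an assumption made for illustration; nothing here describes how MQSeries actually implements commit processing, and the function name and timings are invented.

```python
def commit_waits(requests):
    """Model one resource that handles commits strictly one at a time.

    requests: list of (arrival_seconds, duration_seconds, owner), sorted by arrival.
    Returns a list of (owner, wait_seconds, finish_seconds).
    """
    free_at = 0.0
    results = []
    for arrival, duration, owner in requests:
        start = max(arrival, free_at)  # queue behind any commit still in progress
        finish = start + duration
        results.append((owner, start - arrival, finish))
        free_at = finish
    return results

# Hypothetical timings: one flow commit held up for 5 s by a slow database,
# followed by two quick MCA batch commits.
requests = [
    (0.0, 5.0, "flow commit (database slow)"),
    (1.0, 0.5, "MCA batch commit"),
    (2.0, 0.5, "MCA batch commit"),
]
for owner, wait, finish in commit_waits(requests):
    print(f"{owner}: waited {wait:.1f}s, finished at {finish:.1f}s")
```

Under that assumption the MCA commits wait 4.0 and 3.5 seconds while using no CPU at all, which would match the idle-but-stuck symptom described in the thread.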
PeterPotkay
Posted: Wed Mar 05, 2003 12:54 pm

Poobah

Joined: 15 May 2001
Posts: 7722

Hmmm. Interesting theory. That particular flow is set to YES.
_________________
Peter Potkay
Keep Calm and MQ On