Author |
Message
|
PeterPotkay |
Posted: Tue Feb 22, 2005 12:19 pm Post subject: MQ Channel slows down every 10 minutes |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
QMA on ServerA has a SNDR channel to QMB on ServerB.
Every 10 minutes, on the 10 minutes, for about 50 seconds, the XMITQ on QMA backs up, and then it opens up and all the messages flow fine until the next 10 minute mark.
I started up repeated pings from various places into ServerB, and the pings look fine (<1 ms) until that 10-minute mark. Then the pings take >80 ms.
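A minimal sketch of how a test like this could be logged so the slow samples line up against the 10-minute wall-clock marks. It is not the ping test itself: it times a TCP connect instead of an ICMP ping (ICMP needs raw-socket privileges), and the host name, the port 1414 (the usual MQ listener default), and the function names are all assumptions for illustration.

```python
import socket
import time
from datetime import datetime

PERIOD_S = 600   # the 10-minute cycle described in this post
WINDOW_S = 50    # the slowdown lasts about 50 seconds
SLOW_MS = 80.0   # round trips above this are treated as slow


def seconds_into_cycle(ts: datetime) -> int:
    """Seconds elapsed since the most recent 10-minute wall-clock mark."""
    return (ts.minute * 60 + ts.second) % PERIOD_S


def in_slow_window(ts: datetime) -> bool:
    """True if ts falls in the first ~50 seconds after a 10-minute mark."""
    return seconds_into_cycle(ts) < WINDOW_S


def tcp_connect_ms(host: str, port: int, timeout: float = 5.0) -> float:
    """Time one TCP connect to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def probe(host: str, port: int, samples: int, interval_s: float = 1.0) -> None:
    """Print one timestamped latency sample per interval."""
    for _ in range(samples):
        now = datetime.now()
        try:
            ms = tcp_connect_ms(host, port)
        except OSError:
            ms = float("inf")  # connect failed or timed out
        flag = "SLOW" if ms > SLOW_MS else "ok"
        print(f"{now:%H:%M:%S} +{seconds_into_cycle(now):3d}s {ms:8.1f} ms {flag}")
        time.sleep(interval_s)
```

Running something like `probe("serverb.example.com", 1414, samples=1200)` (hypothetical host) from several source machines at once would show whether the SLOW lines cluster in the first 50 seconds of each cycle regardless of where the probe originates.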
So I think MQ is a victim.
What do I tell these people to look at? I keep saying: find out why the pings take >80 ms every 10 minutes for 50 seconds. Solve that and MQ will be fine.
We are stuck here and going nowhere. What do you guys recommend I suggest to the 30 people on this conference call? Every 10 minutes, all our MQ messages into QMB get delayed for 50 or so seconds. _________________ Peter Potkay
Keep Calm and MQ On |
|
jefflowrey |
Posted: Tue Feb 22, 2005 1:19 pm Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
Is Transaction Vision doing active monitoring of MQ every ten minutes? _________________ I am *not* the model of the modern major general. |
|
PeterPotkay |
Posted: Tue Feb 22, 2005 1:29 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
TV is always on, recording everything that comes thru. I don't think it does anything only every 10 minutes. _________________ Peter Potkay
Keep Calm and MQ On |
|
jefflowrey |
Posted: Tue Feb 22, 2005 1:37 pm Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
PeterPotkay wrote: |
TV is always on, recording everything that comes thru. I don't think it does anything only every 10 minutes. |
It's typical in MQ monitoring for certain metrics, like queue statistics, to be gathered at regular intervals, because MQ doesn't provide an event-driven mechanism to gather them.
Granted, TV could have an entire suite of exits that allow it to bypass the "normal" mechanisms.
To determine if it's MQ or the network, you can try redirecting traffic from QMA to get to QMB by way of QMC - that uses a different network route.
I assume you've done things like reconcile your channel configuration against the problem (making sure your HBINT isn't ten minutes, that kind of thing).
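As a rough illustration of that reconciliation, here is a small sketch that flags any channel interval attribute whose value works out to exactly the 10-minute period. The attribute values in the usage note are made up, and the function name is hypothetical; the units reflect my understanding that HBINT, KAINT, and DISCINT are expressed in seconds while BATCHINT is in milliseconds.

```python
PERIOD_S = 600  # the 10-minute cycle being chased

# Milliseconds per unit for each interval attribute (assumed units:
# HBINT, KAINT, DISCINT in seconds; BATCHINT in milliseconds).
UNIT_MS = {"HBINT": 1000, "KAINT": 1000, "DISCINT": 1000, "BATCHINT": 1}


def suspicious_intervals(attrs, period_s=PERIOD_S):
    """Return the names of interval attributes equal to the period."""
    hits = []
    for name, value in attrs.items():
        unit = UNIT_MS.get(name.upper())
        # Skip unknown attributes and non-numeric values such as "AUTO".
        if unit is None or isinstance(value, bool) or not isinstance(value, int):
            continue
        if value * unit == period_s * 1000:
            hits.append(name)
    return sorted(hits)
```

For example, `suspicious_intervals({"HBINT": 300, "BATCHINT": 600000, "KAINT": "AUTO"})` returns `["BATCHINT"]`, since 600000 ms is exactly ten minutes.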
And your ping tests do tend to rule out MQ - but it never hurts to be a bit paranoid.
The interval very much says to me that it's some kind of monitoring interference.
Also, your QM isn't using SAN disk space, is it? I remember someone a few months back who had weird issues like this that were finally nailed down to SAN implementation interference. _________________ I am *not* the model of the modern major general. |
|
PeterPotkay |
Posted: Tue Feb 22, 2005 1:41 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
That was me with the SAN, and no, this server has no SAN.
Just to be sure, I stopped TV entirely in the QA environment, and this still occurs. Two 10-minute cycles already.
At 5 PM, we are stopping MQ completely on the server for 30 minutes to see if the behaviour goes away. _________________ Peter Potkay
Keep Calm and MQ On |
|
csmith28 |
Posted: Tue Feb 22, 2005 1:43 pm Post subject: |
|
|
 Grand Master
Joined: 15 Jul 2003 Posts: 1196 Location: Arizona
|
If you have friends in the Network group you could put a sniffer on ServerA and ServerB for a few minutes each and see what they can get by analyzing the contents of the buffer. _________________ Yes, I am an agent of Satan but my duties are largely ceremonial. |
|
PeterPotkay |
Posted: Tue Feb 22, 2005 1:45 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
OK, the network guys put the sniffer on an hour ago.
They see requests coming into the ServerB box (sniffer on the ServerB box), but they don't see the 002 box respond (no TCP-level acknowledgement).
Another thing: there are 2 boxes in our environment that are acting the exact same way. One is the QA version of the other. Every other server is OK. _________________ Peter Potkay
Keep Calm and MQ On |
|
jefflowrey |
Posted: Tue Feb 22, 2005 1:49 pm Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
Is there common hardware involved, either "same type" or "same equipment"? For example, are all three affected boxes on the same switch/hub, or are all three the same brand of box, or using the same brand and make of NIC, etc.?
Is there any monitoring of the box other than TV? Like something monitoring SNMP traps, or hardware utilization? _________________ I am *not* the model of the modern major general. |
|
PeterPotkay |
Posted: Tue Feb 22, 2005 1:50 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
Yup, we are on that path. For the life of us we cannot figure out what these 2 servers and only these 2 servers have in common that would cause something like this. _________________ Peter Potkay
Keep Calm and MQ On |
|
PeterPotkay |
Posted: Tue Feb 22, 2005 3:24 pm Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
MQ shutdown, server rebooted, server back up with no MQ, and still the pings are slow on the 10 minute mark.
At least they stopped blaming MQ and now realize MQ is handling the situation as best it can (no messages lost, just delayed) given the lack of a stable connection. _________________ Peter Potkay
Keep Calm and MQ On |
|
bower5932 |
Posted: Tue Feb 22, 2005 6:58 pm Post subject: |
|
|
 Jedi Knight
Joined: 27 Aug 2001 Posts: 3023 Location: Dallas, TX, USA
|
PeterPotkay wrote: |
At least they stopped blaming MQ and now realize MQ is handling the situation as best it can (no messages lost, just delayed) given the lack of a stable connection. |
Isn't handling network problems for your application what WMQ is all about?  |
|
csmith28 |
Posted: Tue Feb 22, 2005 10:19 pm Post subject: |
|
|
 Grand Master
Joined: 15 Jul 2003 Posts: 1196 Location: Arizona
|
Quote: |
Isn't handling network problems for your application what WMQ is all about? |
Indeed, as WMQ Admins we are expected to troubleshoot application problems, hardware, firmware/driver problems, network problems, firewall problems... you name it.
Like I said before, if there is an "M" and a "Q" in the error output (and in some cases even if there isn't), the title of the ticket is "Problem with MQ" or some such.
That said, Peter, what are you doing to resolve this issue? _________________ Yes, I am an agent of Satan but my duties are largely ceremonial. |
|
Michael Dag |
Posted: Wed Feb 23, 2005 1:25 am Post subject: |
|
|
 Jedi Knight
Joined: 13 Jun 2002 Posts: 2607 Location: The Netherlands (Amsterdam)
|
As usual, the one that deals with all components (MQ...) gets blamed.
It's almost a fact of life we all have to live with.
Peter, do let us know what it turned out to be... 10 minutes = 600 seconds = 600000 milliseconds. 'something' must be doing something at these intervals... _________________ Michael
MQSystems Facebook page |
|
kevinf2349 |
Posted: Wed Feb 23, 2005 5:41 am Post subject: |
|
|
 Grand Master
Joined: 28 Feb 2003 Posts: 1311 Location: USA
|
Peter
Do you use a security password synchronization product? I have seen ours slow things down from time to time.
Has this just started happening or has it been happening for a while but only just now been narrowed down? |
|
zpat |
Posted: Wed Feb 23, 2005 5:43 am Post subject: |
|
|
 Jedi Council
Joined: 19 May 2001 Posts: 5866 Location: UK
|
It's not the BATCHINT by any chance? |
|