Author |
Message
|
rekapalli_ravi |
Posted: Fri Dec 10, 2004 4:30 pm Post subject: Message Getting stuck in Queue manager |
|
|
 Novice
Joined: 17 Jun 2002 Posts: 10
|
I have a Java application where message flow is as follows
REQUEST_QUEUE --> Java App --> Remote Queue --> Transmit Queue --> Sender Channel --> Receiver Channel --> Response Queue.
Under high volume we noticed that messages are taking a long time to reach the destination Response Queue after the xml message is placed on the remote queue by Java App. We record entry and exit timestamps in the Java App.
We turned trace on the Queue manager and identified that the message is taking more than 55 seconds after it reached the Remote Queue. From the trace we noticed the message flow as follows
Remote Queue --> SYSTEM.DEF.SVRCONN --> transmit Queue --> SYSTEM.ADMIN.SVRCONN --> Sender Channel.
The first leg took 36 seconds and the second leg took 18 seconds. It took 1 second to get transmitted over the network.
Can someone help me identify what could have caused this delay? This is not happening for all the requests, but only for a few requests and only under load (upwards of 8,000 requests per hour).
There are some requests where Java App directly puts the message on Transmit Queue, instead of putting in the Remote Queue. In such cases we are not seeing any delays even under high loads.
Thanks for looking into this and please let me know if you want me to send some additional information. |
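As a rough sanity check on the numbers above, the load and the observed delay can be turned into an expected backlog. This is only a back-of-envelope sketch: it assumes arrivals are spread evenly over the hour and that every message waits the full 55 seconds, which is not the case here (only some requests are delayed), so treat the result as an upper bound. The function name is made up for illustration.

```python
# Back-of-envelope estimate using the figures quoted in the post.
# backlog_estimate is a hypothetical helper, not part of any MQ tooling.

def backlog_estimate(requests_per_hour: float, delay_seconds: float) -> float:
    """Messages that arrive while one message waits delay_seconds.

    This is Little's law (backlog = arrival rate x time in system),
    assuming a steady arrival rate.
    """
    arrival_rate = requests_per_hour / 3600.0  # messages per second
    return arrival_rate * delay_seconds


if __name__ == "__main__":
    rate = 8000 / 3600.0
    print(f"arrival rate: {rate:.2f} msg/s")
    print(f"backlog at 55 s delay: {backlog_estimate(8000, 55):.0f} messages")
```

If the transmission queue regularly holds far more or far fewer messages than this during a slow period, the even-arrival assumption probably does not hold.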
|
vennela |
Posted: Fri Dec 10, 2004 5:14 pm Post subject: |
|
|
 Jedi Knight
Joined: 11 Aug 2002 Posts: 4055 Location: Hyderabad, India
|
Quote: |
Remote Queue --> SYSTEM.DEF.SVRCONN --> transmit Queue --> SYSTEM.ADMIN.SVRCONN --> Sender Channel.
|
I don't understand this.
My first impression was that it was server-to-server communication. Where did the SVRCONNs come from, specifically SYSTEM.ADMIN.SVRCONN? |
|
csmith28 |
Posted: Fri Dec 10, 2004 5:37 pm Post subject: |
|
|
 Grand Master
Joined: 15 Jul 2003 Posts: 1196 Location: Arizona
|
Please don't use the SYSTEM.DEF.SVRCONN channel. It is supposed to be a template/model for when you create your own SVRCONN channels.
The SYSTEM.ADMIN.SVRCONN should only be used to allow MQ Explorer or other remote WMQ admin utilities to connect to a queue manager.
No application should be allowed to put messages directly to the XMITQ.
Pull Up!... Puuuullll UuuuuP! _________________ Yes, I am an agent of Satan but my duties are largely ceremonial. |
|
rekapalli_ravi |
Posted: Fri Dec 10, 2004 10:51 pm Post subject: |
|
|
 Novice
Joined: 17 Jun 2002 Posts: 10
|
Thank you guys for looking into this.
Actually my application does not use the DEF.SVRCONN and ADMIN.SVRCONN channels. I saw these channels in the trace file, so I assumed that MQSeries uses them internally.
We also created some scripts that check the transmission queue depth every 4 seconds, and we noticed some high accumulation on the transmit queue.
Could this accumulation on the transmit queue be the reason that we are seeing the delay in passing the message? The network was monitored between my local queue manager (Champaign, IL) and the remote queue manager (D.C.) and we have enough bandwidth there. The network team ruled out any network bottleneck.
Could you let me know if any parameters could be tuned to improve the transmission rate through the sender channel?
Here are the MQ definitions I have on my QM.
dis queue(NXRAPMQ02)
1 : dis queue(NXRAPMQ02)
AMQ8409: Display Queue details.
DESCR( ) RNAME( )
RQMNAME(NXRAPMQ02) XMITQ(CPMQ12Q0)
CLUSTER( ) CLUSNL( )
QUEUE(NXRAPMQ02) ALTDATE(2003-12-31)
ALTTIME(15.02.55) PUT(ENABLED)
DEFPRTY(0) DEFPSIST(NO)
SCOPE(QMGR) DEFBIND(OPEN)
TYPE(QREMOTE)
DEFINE CHANNEL ('NXTXML1.CPMQ12Q0') CHLTYPE(SDR) +
* ALTDATE (2004-06-05) +
* ALTTIME (00.42.51) +
TRPTYPE(TCP) +
BATCHINT(0) +
BATCHSZ(50) +
CONNAME('10.9.10.153(9050)') +
CONVERT(NO) +
DESCR('Production XMLBridge') +
DISCINT(600) +
HBINT(300) +
LONGRTY(999999999) +
LONGTMR(1200) +
SHORTRTY(10) +
SHORTTMR(60) +
MAXMSGL(104857600) +
MCATYPE(PROCESS) +
MCAUSER(' ') +
MSGDATA(' ') +
MSGEXIT(' ') +
NPMSPEED(FAST) +
RCVDATA(' ') +
RCVEXIT(' ') +
SCYDATA(' ') +
SCYEXIT(' ') +
SENDDATA(' ') +
SENDEXIT(' ') +
SEQWRAP(999999999) +
USERID(' ') +
XMITQ('CPMQ12Q0') +
REPLACE |
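The depth-checking scripts mentioned in this post are not shown, so the following is only a sketch of what such a check might look like. It assumes `runmqsc` is on the PATH and reads MQSC commands from standard input; the queue manager name is a placeholder, the sample text follows the AMQ8409 layout shown above with a made-up depth value, and all function names are hypothetical.

```python
import re
import subprocess
import time
from typing import Optional

# CURDEPTH(n) as it appears in DISPLAY QLOCAL output.
CURDEPTH_RE = re.compile(r"CURDEPTH\((\d+)\)")


def parse_curdepth(mqsc_output: str) -> Optional[int]:
    """Extract the current depth from runmqsc output, or None if absent."""
    match = CURDEPTH_RE.search(mqsc_output)
    return int(match.group(1)) if match else None


def read_depth(qmgr: str, queue: str) -> Optional[int]:
    """Run one DISPLAY command through runmqsc and return the queue depth."""
    command = f"DISPLAY QLOCAL({queue}) CURDEPTH\n"
    result = subprocess.run(
        ["runmqsc", qmgr], input=command,
        capture_output=True, text=True, check=False,
    )
    return parse_curdepth(result.stdout)


def poll(qmgr: str, queue: str, interval: float = 4.0, samples: int = 5) -> None:
    """Print a timestamped depth reading every `interval` seconds."""
    for _ in range(samples):
        print(time.strftime("%H:%M:%S"), read_depth(qmgr, queue))
        time.sleep(interval)


# Illustrative output only; the depth value 137 is invented for the example.
SAMPLE_OUTPUT = """AMQ8409: Display Queue details.
   QUEUE(CPMQ12Q0)                         TYPE(QLOCAL)
   CURDEPTH(137)
"""
print(parse_curdepth(SAMPLE_OUTPUT))
```

On a machine with the queue manager running, something like `poll("YOUR_QMGR", "CPMQ12Q0")` would take the readings. Note that the depth includes messages put under syncpoint but not yet committed, so a high reading alone does not show that the channel is slow.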
|
fjb_saper |
Posted: Sat Dec 11, 2004 4:52 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
You do not specify the message size. Assuming that your messages are persistent, this may play a role, as may the way you put the messages to the queue. Say you use syncpoint: until the commit, the messages may show up in the xmit queue depth but cannot be sent anywhere.
So there are multiple possibilities for tweaking.
0) Network latency. Have your network admin tell you about the network latency and how much it contributes to your transmission time.
a) If other traffic is using the same channel, make sure that the delivery method on the xmitq is by priority. Make sure that the priority of the messages used by your app is adequate. You might even need to use a different channel for your app.
b) Depending on message size, reduce the batch size of the channel (default 50). This should reduce the size of the channel redo logs. (Calculate the redo log size; a good ballpark figure is to keep them under 2 GB.) Make sure the overall log file space will accommodate your log size and, if you use linear logging, will not fill up too fast. You need to do this on both the sender and receiver side.
c) Reduce the transaction size (number of messages). It takes time to put a large number of messages to a queue, so the time between the actual put and the commit should not be counted. But to the human eye it is still part of the overall response time.
Hope it helps some
F.J.  |
|
rekapalli_ravi |
Posted: Sat Dec 11, 2004 8:34 am Post subject: |
|
|
 Novice
Joined: 17 Jun 2002 Posts: 10
|
1. Message size is under 1K.
2. Network team said they did not have any latency in the network.
3. Unfortunately these messages originate from the client, and arrive to our network before passing thru multiple layers (IVR, IBM Crossworlds, MQ WF Etc). We tried the priority option but no success there. Client is not willing to change the priority because of the workflow issues and all the messages that go thru this channel should have the same priority.
4. Our current batch size is defined as 50. We will try reducing that number hoping that it helps.
5. Our application treats each single call as a single transaction. In other words, each message is put on the queue and immediately committed.
Do the short retry interval (=60) and short retry count (=10) have any significance in this issue? On the OS (Solaris) we are seeing huge counts of tcpRetransBytes (=107,552,212).
Putting message on a remote queue Vs. putting on a remote Queue Manager --> Does it make any difference. |
|
vennela |
Posted: Sat Dec 11, 2004 12:32 pm Post subject: |
|
|
 Jedi Knight
Joined: 11 Aug 2002 Posts: 4055 Location: Hyderabad, India
|
Another question:
Quote: |
Putting message on a remote queue Vs. putting on a remote Queue Manager |
When you say putting on a remote Queue Manager, do you mean putting by making a client connection?
Since you say the message passes through many other products, and you also mentioned that only a few messages are delayed: are there any DB transactions involved? Are there any database deadlocks in the application DBs? Are there any DB deadlocks in the workflow runtime DB?
Are the Crossworlds collaborations transactional? Are the collaborations configured for concurrent processing? |
|
fjb_saper |
Posted: Sat Dec 11, 2004 2:22 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
rekapalli_ravi wrote: |
3. Unfortunately these messages originate from the client, and arrive to our network before passing thru multiple layers (IVR, IBM Crossworlds, MQ WF Etc). We tried the priority option but no success there. Client is not willing to change the priority because of the workflow issues and all the messages that go thru this channel should have the same priority. |
part 1
Seeing your answer, I would say that the transport mechanism (MQ) is not your bottleneck. You are doing a lot of other things with your message (IVR, Crossworlds, WF ...). You need to find out exactly where your bottleneck is (Adapter, DB, Flow etc...). It might be as simple as the sending application committing a batch of x messages in one call while you process them serially rather than concurrently, or a high volume arriving within a very short time and being processed serially.
rekapalli_ravi wrote: |
Putting message on a remote queue Vs. putting on a remote Queue Manager --> Does it make any difference. |
part 2
Definitely depending on the way you handle it. You will have to add the time for handling the connection to the target and transport of message to the target host. Depending on the frequency and if you have to open a lot and close a lot, I would suggest using a local qmgr with a channel to the target.($ involved) Anyway I would first try to solve part 1. You may not have to touch this part at all.
Enjoy  |
|
rekapalli_ravi |
Posted: Sat Dec 11, 2004 3:02 pm Post subject: |
|
|
 Novice
Joined: 17 Jun 2002 Posts: 10
|
Quote: |
You need to find out exactly where your bottleneck is (Adapter, DB, Flow etc...) |
We traced the entry and exit timestamps of the message at each and every layer. We found a delay of more than 50 seconds between the Java App writing the message to the remote queue of the local queue manager and the message being sent out from the sender channel. In all the other layers the message did not stay for more than a second. In the Java DB the message stayed for 3 seconds, which was expected.
We opened a call with IBM and they analyzed the traces and found the following:
Quote: |
This segment is from the trace file for the runmqchl process (PID 27289) which is the channel agent for channel NXTXML1.CPMQ12Q0
14:45:34.862778 27298.1 --(05)----{ recv
14:45:34.862805 27298.1 --(05)----}! recv rc=Unknown(FFFFFFFF)
14:45:34.862832 27298.1 waiting 360s select
14:45:34.862850 27298.1 --(05)----{ select
14:45:34.862868 27298.1 --(06)-----{ xcsSelect
*14:46:41.439778 27298.1 --(06)-----}! xcsSelect rc=Unknown(1)
14:46:41.439836 27298.1 --(05)----}! select rc=Unknown(1)
14:46:41.439854 27298.1 --(05)----{ recv
14:46:41.439884 27298.1 --(05)----}! recv rc=Unknown(1C)
14:46:41.439925 27298.1 Channel Name:NXTXML1.CPMQ12Q0
This shows delay of 67 seconds occurring at the time the messages were delayed. This was at the point where the channel agent was waiting for an acknowledgement back from the receiving channel agent at the destination queue manager.
No problems were found in the MQSeries error logs or the FDC files.
To determine the cause and the solution, we will need data from the receiving queue manager. To start with, we should gather the MQSeries error logs and the FDC files using the commands I sent you earlier. |
We are trying to get this info from the client. In the meanwhile did any one run into similar problem earlier? What did you do to get around this one. Does recreating the MQ objects help here ? |
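The 67-second figure in the quoted analysis can be reproduced from the trace timestamps. The sketch below is not an IBM tool; it is a hypothetical helper that assumes each trace line starts with an `HH:MM:SS.microseconds` timestamp (optionally preceded by `*`, as in the excerpt) and that the trace does not cross midnight. It reports the largest gap between consecutive lines.

```python
import re
from datetime import datetime

# Timestamp at the start of a trace line, with an optional leading "*".
TS_RE = re.compile(r"^\*?(\d{2}:\d{2}:\d{2}\.\d{6})\s")


def largest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the biggest jump
    between consecutive timestamped lines. Lines without a timestamp are skipped."""
    best = (0.0, None, None)
    prev_ts, prev_line = None, None
    for line in lines:
        match = TS_RE.match(line.strip())
        if not match:
            continue
        ts = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        if prev_ts is not None:
            gap = (ts - prev_ts).total_seconds()
            if gap > best[0]:
                best = (gap, prev_line, line)
        prev_ts, prev_line = ts, line
    return best


# The trace excerpt quoted in the post above.
TRACE = """14:45:34.862778 27298.1 --(05)----{ recv
14:45:34.862805 27298.1 --(05)----}! recv rc=Unknown(FFFFFFFF)
14:45:34.862832 27298.1 waiting 360s select
14:45:34.862850 27298.1 --(05)----{ select
14:45:34.862868 27298.1 --(06)-----{ xcsSelect
*14:46:41.439778 27298.1 --(06)-----}! xcsSelect rc=Unknown(1)
14:46:41.439836 27298.1 --(05)----}! select rc=Unknown(1)
14:46:41.439854 27298.1 --(05)----{ recv
14:46:41.439884 27298.1 --(05)----}! recv rc=Unknown(1C)
""".splitlines()

gap, before, after = largest_gap(TRACE)
print(f"{gap:.3f} s between:\n  {before}\n  {after}")
```

For this excerpt the largest gap is about 66.6 seconds, falling between entry to and exit from `xcsSelect`, which matches the roughly 67 seconds of waiting described in the analysis.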
|
Nigelg |
Posted: Sun Dec 12, 2004 1:56 pm Post subject: |
|
|
Grand Master
Joined: 02 Aug 2004 Posts: 1046
|
Why did you bother to open a call with IBM if you are just going to ignore the answer?
Of course recreating the objects will not help.
What will help is finding the cause of the delay, as already analysed by IBM support, so get the info from your client.
FYI..
I see your network guys said that there is no latency in the network. They always say that; it means they have not looked. |
|
csmith28 |
Posted: Sun Dec 12, 2004 2:36 pm Post subject: |
|
|
 Grand Master
Joined: 15 Jul 2003 Posts: 1196 Location: Arizona
|
@Nigel
rekapalli_ravi wrote: |
We are trying to get this info from the client. |
Looks like he's trying to get the information IBM asked for?....
rekapalli_ravi wrote: |
In the meanwhile did any one run into similar problem earlier? What did you do to get around this one. Does recreating the MQ objects help here ? |
Doesn't hurt to ask. _________________ Yes, I am an agent of Satan but my duties are largely ceremonial. |
|
rekapalli_ravi |
Posted: Wed Dec 15, 2004 6:52 pm Post subject: |
|
|
 Novice
Joined: 17 Jun 2002 Posts: 10
|
Hi guys,
IBM has identified transaction processes that tied up the receiving channel agent on the remote queue manager. Because of this, our sender channel did not receive the acknowledgement for an extended period of time (30+ seconds). IBM also observed that the remote queue manager was running MQ 5.2 CSD06 on SunOS, whereas our queue manager was running MQ 5.3 CSD05 on SunOS, and recommended that our client upgrade MQ.
Once the upgrade was made, we no longer saw any delays, and no accumulation was observed on the transmission queue either. Though we could not pinpoint the particular transaction process that was holding up the agent, the issue was resolved by the version upgrade. It was probably some process incompatible with MQ 5.2.
Thanks for looking into this and I appreciate your help. |
|
fjb_saper |
Posted: Wed Dec 15, 2004 7:35 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
Thanks a lot for the update.
F.J.  |
|
|