Merging Queues
manjus
Posted: Tue Aug 16, 2005 5:26 am Post subject: Merging Queues
Newbie
Joined: 16 Aug 2005 Posts: 6
Hi,
I have data coming in on multiple queues. The load is about 12 million messages/day at a peak rate of 1000 msgs/second. Data from the multiple queues has to be merged, transformed and streamed onto one TCP/IP connection for sending downstream. That is the problem description.
I am considering several options here: flat files, using queues to do the merging, and using an in-memory DB. In this context, I would like to get some input on:
-- Have you done this kind of integration, and what are your thoughts on it? The client does not want a broker, as the costs are large.
-- Capacity limits on MQ queues - how fast should the data be read so that the queues do not overflow?
-- Speed of MQ if persistence is enabled
-- Flat files seem to be used a lot in this scenario. What are the speed limitations of using flat files here?
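For a sense of scale, the figures quoted in the post above can be turned into an average rate. This is only arithmetic on the two numbers given (12 million messages/day and a 1000 msgs/second peak); the function and variable names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate(messages_per_day: float) -> float:
    """Average messages per second if the daily load were spread evenly."""
    return messages_per_day / SECONDS_PER_DAY


daily_load = 12_000_000  # messages/day, as quoted in the post
peak_rate = 1000         # messages/second, as quoted in the post

avg = average_rate(daily_load)
peak_to_average = peak_rate / avg

print(f"average rate: {avg:.1f} msgs/second")
print(f"peak is {peak_to_average:.1f}x the average")
```

On these numbers the peak is several times the daily average, so any sizing probably needs to be done against the peak rather than the average.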
KeeferG
Posted: Tue Aug 16, 2005 5:34 am
Master
Joined: 15 Oct 2004 Posts: 215 Location: Basingstoke, UK
I would suggest looking at the performance reports for the platform(s) that you are using. They contain all the information needed to understand how much traffic you can send down channels and onto queues. 1000 messages per second should be achievable with fast discs and persistent messaging if you get the design right. Non-persistent messages will handle that sort of load easily. I have rarely come across a problem where MQ is the bottleneck if the system is designed correctly.
_________________
Keith Guttridge
-----------------
Using MQ since 1995
Last edited by KeeferG on Tue Aug 16, 2005 5:42 am; edited 1 time in total
jefflowrey
Posted: Tue Aug 16, 2005 5:36 am
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
You say that data from multiple queues has to be merged... Can you be more specific?
Is it that, for example, there are five queues and the first message on each queue simply has to be concatenated?
Or is it more complicated than that? Do you have to put two messages from queue 1 together with three messages from queue 5, unless the first message on queue 4 is "XYZ", in which case you have to sort the records in the first four messages in queue 3 and then join them with the messages from queue 1 EXCEPT where the messages in queue 5 have the same record numbers...
If performance is the SINGLE criterion, then you should not use persistence - as it exacts a penalty. You also won't want to use a database or flat files - merely work directly in your program's working storage. You will still want to use syncpoint, as a) it improves the speed of NP (non-persistent) messages, and b) it provides some protection/retry if your program crashes.
The data should be read as fast as it possibly can be - and the limiting factor on this is flat out going to be the nature of the merging and transformation.
_________________
I am *not* the model of the modern major general.
manjus
Posted: Tue Aug 16, 2005 6:05 am
Newbie
Joined: 16 Aug 2005 Posts: 6
When I say data has to be merged, I mean that messages from the queues have to be picked up (round robin would be okay), and whenever there are 10 messages they have to be written to the TCP socket. The restriction to 10 messages is because we cannot change the receiver, and that is what it expects. Each message has to be transformed before it is sent. If the receiver goes down, the queue should not overflow. That is why I thought of saving the messages to a flat file as they are read off the queue, so that the receiver can restart. Can the flat file storage be avoided if I persist the messages on MQ itself? If so, what price do I pay for this persistence in terms of performance?
The current system receives input from a source that publishes it, stores it to a flat file to enable restarts of the receiver, and sends it over sockets. This is being replaced. But it has amazing performance - it can receive, store to the flat file and send over sockets 10000 messages/second.
The queue-based system is going to replace this.
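The merge described above (round robin across queues, transform each message, send in groups of exactly 10) might look roughly like the sketch below. It is a simulation only: `collections.deque` objects stand in for the MQ queues, `merge_round_robin` is a hypothetical name, and `str.lower` is a placeholder for the real transformation, which the post does not describe.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def merge_round_robin(
    queues: List[Deque[str]],
    transform: Callable[[str], str],
    batch_size: int = 10,
) -> Tuple[List[List[str]], List[str]]:
    """Take one message per non-empty queue in turn and group them into batches.

    Returns (full_batches, pending). `pending` holds transformed messages that
    do not yet make up a full batch; they are kept rather than sent, because
    the receiver is assumed to accept only complete batches.
    """
    batches: List[List[str]] = []
    pending: List[str] = []
    while any(queues):  # an empty deque is falsy
        for q in queues:
            if q:
                # popleft() preserves the arrival order within each queue
                pending.append(transform(q.popleft()))
                if len(pending) == batch_size:
                    batches.append(pending)
                    pending = []
    return batches, pending


# Two simulated input queues with 7 and 5 messages.
q1 = deque(f"A{i}" for i in range(7))
q2 = deque(f"B{i}" for i in range(5))

batches, pending = merge_round_robin([q1, q2], transform=str.lower)
print(batches)  # one full batch of 10, alternating between the two queues
print(pending)  # the 2 messages still waiting for a full batch
```

Order is preserved within each queue but not across queues, which matches the round-robin requirement as stated.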
jefflowrey
Posted: Tue Aug 16, 2005 6:20 am
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
If you write messages to a file when the receiver is down, then the filesystem could fill up before the receiver comes back.
It's entirely possible to have a queue that holds many gigs of data at once.
You don't necessarily need to use persistence for the messages if you simply want them to sit in a queue until later - even if you want to maintain them across a queue manager restart (this is a new feature as of a somewhat recent CSD of 5.3).
So you can keep the messages on the queue, and the queue won't "fill up", as long as you have a reasonable idea of what size of backlog is typical and can plan ahead just a bit.
If it's really doing 10000 messages a second, then I suspect you're doing overlapped I/O on the flat file, and you are on a high-performance machine with a high-performance I/O subsystem - and I'm also pretty sure that the messages are reasonably small - under 1 meg or so.
Is there *any* order dependence in the messages at all? That is, pretend you have 10 queues. Is it okay if ten messages from queue 1 go through, and then 1 from each queue, and then 5 from queue 3 and 5 from queue 8, etc.?
If so, you can get some concurrency on the queue reads - likely using threads, so you can process messages essentially in parallel.
As KeeferG said, if you are looking for numbers or mechanisms for designing for performance, then look at the support packs for your platform that detail performance considerations.
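The concurrent reads suggested above can be sketched with one reader thread per queue, all feeding a single hand-off queue. This is a simulation, not MQ code: plain Python lists stand in for the MQ queues, and `drain` and `read_concurrently` are hypothetical names. It assumes ordering only matters within each source queue.

```python
import queue
import threading
from typing import Dict, List, Tuple


def drain(name: str, source: List[str], out: "queue.Queue[Tuple[str, str]]") -> None:
    """Reader thread body: forward every message from one source, in order."""
    for msg in source:
        out.put((name, msg))


def read_concurrently(sources: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Read all sources in parallel and return the interleaved result."""
    out: "queue.Queue[Tuple[str, str]]" = queue.Queue()
    threads = [
        threading.Thread(target=drain, args=(name, msgs, out))
        for name, msgs in sources.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # All producers have finished, so the hand-off queue can be emptied safely.
    merged: List[Tuple[str, str]] = []
    while not out.empty():
        merged.append(out.get())
    return merged


sources = {
    "Q1": [f"q1-msg{i}" for i in range(5)],
    "Q2": [f"q2-msg{i}" for i in range(5)],
    "Q3": [f"q3-msg{i}" for i in range(5)],
}
merged = read_concurrently(sources)
print(len(merged), "messages merged")
```

The interleaving between queues varies from run to run, but because each queue has exactly one reader, messages from any single queue stay in their original order.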
manjus
Posted: Tue Aug 16, 2005 7:29 am
Newbie
Joined: 16 Aug 2005 Posts: 6
The message size is 1K max.
Another point:
When the receiver goes down and comes back up, it sends the record number of the last record that it got. So if I were reading off a queue, transforming, inserting a record number and sending:
1) How can I correlate the record number that I put in with the message ID in the queue - is there a message ID for every message that MQ stores in the queue? This has to work across multiple queues too.
2) What if the receiver comes up and gives me a record number that I have already read off the queue and committed? In that case I can never retrieve that message from the queue again unless I have a flat file or some other storage mechanism of my own, can I?
jefflowrey
Posted: Tue Aug 16, 2005 7:43 am
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
Okay, so there is an order dependency on the messages. You have to send the messages to the receiver in order of record number.
I would recommend only using ONE queue, then.
Yes, every message on a queue has a unique ID - stored in a field in the MQMD called... Message ID (MsgId).
As long as the messages are published by the sender in record order, and placed on the queue in record order, you can simply have the queue-reading process stop when the receiver shuts down. Then when the receiver starts up again, the message you need to resend will be close to the front of the queue (or possibly the first message).
Under what situations in the current process are records DELETED from the flat file?
manjus
Posted: Tue Aug 16, 2005 7:52 am
Newbie
Joined: 16 Aug 2005 Posts: 6
Messages are not deleted from the flat file now until the end of the day.
The order dependency is:
The data in a queue has to be read off in the order in which it was entered, but there is no order dependence across queues. The record number is an identification number that we put in.
But the scenario that I had described is:
Suppose there are 2 queues and I read 5 messages off each and then write them to the receiver over sockets. The TCP write does not fail, so I commit the gets. The write fails when I try to write the next 10 messages, so I stop getting messages from the queues. But when the receiver next comes up, it asks for a message from the first batch, which I committed. Those messages will no longer be in the queue, will they - since I did a get and commit?
What you said is interesting - when do we need to split data into multiple queues? The reason we are thinking of splitting is the load - is sending 24 million messages a day at 1000/second, with a message size of 1K, too much load for one queue?
jefflowrey
Posted: Tue Aug 16, 2005 8:15 am
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
The measurable load on a queue is, particularly with NP messages, only the current queue depth. That is, there is no limit to the number of messages that can pass through a queue.
A queue is only said to be backing up when the queue depth is increasing. This only happens when messages are being removed from the queue more slowly than they are being put on it.
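The point about queue depth can be put as simple arithmetic: depth changes at the put rate minus the get rate. The sketch below assumes constant rates; the 1000 msgs/second put rate comes from the thread, while the 10-minute outage and the 1500 msgs/second catch-up rate are hypothetical values chosen only to illustrate the calculation.

```python
from typing import Optional


def backlog_depth(put_rate: float, get_rate: float, seconds: float,
                  start_depth: float = 0) -> float:
    """Queue depth after `seconds`, assuming constant put and get rates."""
    return max(0.0, start_depth + (put_rate - get_rate) * seconds)


def seconds_to_drain(depth: float, put_rate: float,
                     get_rate: float) -> Optional[float]:
    """Time to empty a backlog, or None if the reader is not faster than the writer."""
    if get_rate <= put_rate:
        return None
    return depth / (get_rate - put_rate)


# Receiver down for 10 minutes (hypothetical) while puts continue at the peak rate.
outage_depth = backlog_depth(put_rate=1000, get_rate=0, seconds=600)

# Receiver back, reading at a hypothetical 1500 msgs/second while puts continue.
drain_time = seconds_to_drain(outage_depth, put_rate=1000, get_rate=1500)

print(f"backlog after outage: {outage_depth:,.0f} messages")
print(f"time to drain: {drain_time:,.0f} seconds")
```

The backlog figure is what the queue's maximum depth setting (MAXDEPTH in MQ) and the disk space behind it would need to accommodate for an outage of that length.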
If it is at all possible that, at the end of the day, the receiver will need a record sent at the start of the day, then you are probably better off either writing to a flat file or using a database. The only way to handle your current requirements using MQ is to use Browse. You would have to open the queue at the beginning of the day in one thread, browse each message as it comes in, and send in groups of ten. When the receiver indicates that it needs to restart processing at a particular record, you would close the queue and start browsing at the needed message - mapping record ID to msgID would be good for this. Then a separate process at the end of the day would either clear the queue, or destructively get messages up to a certain point.
But if the receiver needs to do anything more complicated than reset back to a particular record ID and resume from there, something like "give me thirty records starting from here, and then give me all records newer than this one", you are starting to use a queue like a database (and you are already using a file like a database). So you're much better off just using a database!
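The browse approach described above can be illustrated with a small in-memory model. Nothing here is the MQ API: `SimulatedQueue`, `browse` and `purge_through` are hypothetical stand-ins for non-destructive reads, restarting from a known message, and the end-of-day destructive cleanup. The generated message IDs are also invented for the example.

```python
from typing import Dict, List, Optional, Tuple


class SimulatedQueue:
    """In-memory stand-in for a queue that can be browsed without removing messages."""

    def __init__(self) -> None:
        self._messages: List[Tuple[str, str]] = []  # (msg_id, payload)
        self._next_id = 0

    def put(self, payload: str) -> str:
        msg_id = f"ID{self._next_id:06d}"
        self._next_id += 1
        self._messages.append((msg_id, payload))
        return msg_id

    def browse(self, start_msg_id: Optional[str] = None) -> List[Tuple[str, str]]:
        """Return messages from start_msg_id onward, leaving them on the queue."""
        if start_msg_id is None:
            return list(self._messages)
        ids = [msg_id for msg_id, _ in self._messages]
        return list(self._messages[ids.index(start_msg_id):])

    def purge_through(self, msg_id: str) -> int:
        """Destructively remove every message up to and including msg_id."""
        ids = [m_id for m_id, _ in self._messages]
        count = ids.index(msg_id) + 1
        del self._messages[:count]
        return count


q = SimulatedQueue()
for n in range(1, 31):
    q.put(f"payload-{n}")

# While sending, remember which message ID each record number was built from.
record_to_msg_id: Dict[int, str] = {}
for record_no, (msg_id, _payload) in enumerate(q.browse(), start=1):
    record_to_msg_id[record_no] = msg_id

# The receiver restarts and reports record 12 as the last one it got,
# so sending resumes from record 13 without anything having been lost.
replay = q.browse(record_to_msg_id[13])

# Later cleanup: drop everything up to record 10.
removed = q.purge_through(record_to_msg_id[10])
remaining = len(q.browse())
print(len(replay), replay[0][1], removed, remaining)
```

The record-to-message-ID map is the piece that has to survive a restart of the sending program; how to persist it is left open here.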
manjus
Posted: Tue Aug 16, 2005 8:21 am
Newbie
Joined: 16 Aug 2005 Posts: 6
I actually started off with an in-memory DB:
-- TimesTen: costs were high, and the functionality I would exploit in this system is not worth the $ (Oracle support..)
-- HSQL: open source, not proven(?)
Then I came to an RDBMS:
-- slow?
-- we don't want the DBAs involved in a real-time system
And then I came to flat files. There won't be queries other than "give me records starting from ...", and the data is persisted in the DB by the receiver.
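If the only query is "give me records starting from a given record number", a flat file with fixed-size records can answer it with a single seek. The sketch below is one possible layout, not the existing system's format: it assumes the 1K maximum message size mentioned earlier in the thread, 1-based consecutive record numbers, and a 2-byte length prefix, and all names in it are hypothetical.

```python
import os
import struct
import tempfile
from typing import BinaryIO, List

MAX_PAYLOAD = 1024              # assumed from the "1K max" message size
RECORD_SIZE = 2 + MAX_PAYLOAD   # 2-byte length prefix + padded payload


def append_record(f: BinaryIO, payload: bytes) -> None:
    """Append one fixed-size record: big-endian length, then zero-padded payload."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds the fixed record size")
    f.write(struct.pack(">H", len(payload)) + payload.ljust(MAX_PAYLOAD, b"\0"))


def read_from(path: str, first_record: int) -> List[bytes]:
    """Return the payloads of record `first_record` (1-based) and all later records."""
    payloads: List[bytes] = []
    with open(path, "rb") as f:
        # Fixed-size records let the offset be computed instead of searched for.
        f.seek((first_record - 1) * RECORD_SIZE)
        while True:
            chunk = f.read(RECORD_SIZE)
            if len(chunk) < RECORD_SIZE:
                break  # end of file (or a partially written last record)
            (length,) = struct.unpack(">H", chunk[:2])
            payloads.append(chunk[2:2 + length])
    return payloads


path = os.path.join(tempfile.mkdtemp(), "records.dat")
with open(path, "ab") as f:
    for n in range(1, 21):
        append_record(f, f"record-{n}".encode())

replay = read_from(path, 15)
print(len(replay), replay[0])
```

The trade-off is wasted space for short messages in exchange for not needing an index.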
jefflowrey
Posted: Tue Aug 16, 2005 8:25 am
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
Well, like I said, you can do this with browse.
I think you would be significantly better off if you can clear the messages from the queue more often than once a day - particularly at the volumes you're talking about.
It may complicate the browse cursors, though, if messages are being taken out behind them. But I don't think so - I seem to remember some in-depth discussions of this a while ago that indicated it would be okay... but my memory is fallible.
Particularly if the receiver is persisting into a database, I would expect that it would not need to roll back a large number of messages during a failure. But you know what the current pattern of behavior is, and I don't.
manjus
Posted: Tue Aug 16, 2005 10:00 am
Newbie
Joined: 16 Aug 2005 Posts: 6
Thank you very much for your replies.
They have helped me understand the queue design issues. The only options for intermediate storage seem to be an in-memory DB or a flat file if I do not use browse.