MQSeries.net :: View topic - Transform 10MB messages from CSV to Fixed Width.

Transform 10MB messages from CSV to Fixed Width.

Author

Message

vallu

Posted: Thu Mar 11, 2004 6:45 pm Post subject: Transform 10MB messages from CSV to Fixed Width.

Apprentice

Joined: 29 Jun 2002
Posts: 31

We have a requirement to transform 10MB CSV files into fixed width. But a 1MB message takes 50 minutes to transform, while a 50K message takes 1 second. What options do I have for transforming a 10MB file?
We are seriously considering using 50K messages (splitting larger messages into logical groups of 50K each).
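
As a rough illustration of the splitting idea, here is a minimal Java sketch that cuts CSV text into chunks of whole records so that no record is divided between two messages. The class name CsvChunker, the method split and the UTF-8 byte counting are assumptions made for this example; none of it is MQSI API, and the code would run outside the broker (for example in the sending application).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of MQSI: splits CSV text into chunks of whole records.
class CsvChunker {

    // Returns chunks whose encoded size stays at or under maxBytes,
    // unless a single record is itself larger than maxBytes.
    static List<String> split(String csv, int maxBytes) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int currentBytes = 0;
        for (String record : csv.split("\r?\n")) {
            if (record.isEmpty()) {
                continue; // skip blank lines
            }
            // +1 accounts for the newline appended after each record
            int recordBytes = record.getBytes(StandardCharsets.UTF_8).length + 1;
            if (currentBytes > 0 && currentBytes + recordBytes > maxBytes) {
                chunks.add(current.toString()); // close the chunk before it overflows
                current.setLength(0);
                currentBytes = 0;
            }
            current.append(record).append('\n');
            currentBytes += recordBytes;
        }
        if (currentBytes > 0) {
            chunks.add(current.toString()); // final partial chunk
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Tiny limit so the split is visible; the thread talks about roughly 50K.
        List<String> chunks = split("a,1\nb,2\nc,3\n", 8);
        System.out.println(chunks.size() + " chunks");
    }
}
```

With a limit of about 50 * 1024 bytes, each chunk could then be sent as its own message. How the chunks are tied together again (grouping, sequence numbers) is a separate decision.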

We are at CSD5 of MQSI 2.1 on a Win 2K machine (2GB RAM).

Please suggest.

jefflowrey

Posted: Thu Mar 11, 2004 7:08 pm Post subject:

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

There's a support pack, IP04, about how to design your message flows for performance.

Start by reviewing that to see if there are ways you can improve your code or your model.
_________________
I am *not* the model of the modern major general.

kirani

Posted: Thu Mar 11, 2004 11:14 pm Post subject:

Jedi Knight

Joined: 05 Sep 2001
Posts: 3779
Location: Torrance, CA, USA

Can you tell us more about your input message layout? Are you trying to send a complete file as 1 message? I believe your file consists of multiple records. What is your output message format?
You can significantly improve the performance by using the REFERENCE data type within your ESQL. Please take a look at the ESQL reference manual along with the material suggested by Jeff.
_________________
Kiran

IBM Cert. Solution Designer & System Administrator - WBIMB V5
IBM Cert. Solutions Expert - WMQI
IBM Cert. Specialist - WMQI, MQSeries
IBM Cert. Developer - MQSeries

vallu

Posted: Fri Mar 12, 2004 3:30 am Post subject:

Apprentice

Joined: 29 Jun 2002
Posts: 31

Thanks for the reply.
We have an input file which is CSV. We want an output file in fixed width. We have multiple records.
40,000 repeating records making up to 10MB of data. I would think WMQI cannot transform any message bigger than 50KB efficiently (in a short time). I had a look at the material suggested by Jeff some time back. There was not much in it for me. The performance figures from IBM do not show any figure above 60KB (for obvious reasons..)

We are planning to approach the problem using logical grouping of messages. Message sequence is not important for us. But all messages should reach the destination. We can handle this programmatically and WMQI supports grouping.
Has somebody done this before?

Sample CSV Data:
,0280,021860,0,,0,,,, , , , ,0,0,0,0,0, ,1,

Sample fixed width
002 B024895020 0 1 0 2004 1 310A7DB 205/55 R16 90Q DRICE TL PC/LT 0 0 0 0 0YES 1 9.74
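
To make the per-record transformation concrete, here is a small Java sketch that turns one CSV record into a fixed-width record. The field widths are invented for the example, since the thread does not give the real layout, and the class and method names are hypothetical. It also assumes that values never contain quoted commas.

```java
// Hypothetical helper: the field widths are illustrative assumptions,
// not the layout used in the thread.
class FixedWidthFormatter {

    static String toFixedWidth(String csvRecord, int[] widths) {
        // limit -1 keeps trailing empty fields, which plain split(",") would drop
        String[] fields = csvRecord.split(",", -1);
        if (fields.length != widths.length) {
            throw new IllegalArgumentException(
                "expected " + widths.length + " fields, got " + fields.length);
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            String value = fields[i];
            if (value.length() > widths[i]) {
                throw new IllegalArgumentException(
                    "field " + i + " exceeds width " + widths[i]);
            }
            out.append(value);
            // left-justify and pad with spaces up to the field width
            for (int pad = value.length(); pad < widths[i]; pad++) {
                out.append(' ');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        int[] widths = {3, 6, 8, 2}; // assumed widths, 19 characters in total
        System.out.println("[" + toFixedWidth(",0280,021860,0", widths) + "]");
    }
}
```

Empty CSV fields simply become runs of spaces, so every output record has the same length regardless of how many input fields were blank.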

kirani

Posted: Fri Mar 12, 2004 2:19 pm Post subject:

Jedi Knight

Joined: 05 Sep 2001
Posts: 3779
Location: Torrance, CA, USA

You can still work with a single input message. Model your input message using TDS and set the record element to Repeating. Within your message flow, loop through the input records and transform each one to the output CWF format.
Please note that CWF allows only a fixed number of occurrences, or a number of occurrences that depends on a numeric field, so model your message accordingly.

Here is the sample code (not tested).

Code:

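-- Walk the repeating input records with a reference instead of an index.
DECLARE inref REFERENCE TO InputRoot.MRM.MyRecord[1];

-- TotalRecs is created first, so it precedes the records in the output tree.
SET OutputRoot.MRM.TotalRecs = 0;
CREATE FIELD OutputRoot.MRM.OUTREC;
DECLARE outref REFERENCE TO OutputRoot.MRM.OUTREC;

DECLARE CNT INT 1;

WHILE LASTMOVE(inref) = TRUE DO
    SET outref.Fld1 = inref.Fld1;
    SET outref.Fld2 = inref.Fld2;
    ...

    MOVE inref NEXTSIBLING;

    -- Append the next output record and point outref at it.
    CREATE NEXTSIBLING OF outref AS outref NAME 'OUTREC';
    SET CNT = CNT + 1;
END WHILE;

-- CNT started at 1, so it is one more than the number of records copied.
SET OutputRoot.MRM.TotalRecs = CNT - 1;

-- The last loop pass created one empty OUTREC; remove it.
DETACH outref;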

If your machine does not have enough memory, then it's a good idea to split the input message into smaller parts and then do the transformation. The point I am trying to make here is to use references to loop through the input/output tree to get better performance.
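
To see why walking with a reference can matter, here is a small analogy in Java rather than ESQL. It assumes that an indexed lookup such as MyRecord[i] has to walk the sibling chain from the first record every time; that is an assumption about the broker's behaviour, not something stated in this thread. The Node class and the method names are made up for the illustration, and the code only counts steps, it does not measure broker performance.

```java
// Analogy only: a hand-rolled sibling chain standing in for a message tree.
class SiblingWalk {

    static final class Node {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    // Builds a chain of n nodes (n must be at least 1).
    static Node build(int n) {
        Node head = new Node(0);
        Node tail = head;
        for (int i = 1; i < n; i++) {
            tail.next = new Node(i);
            tail = tail.next;
        }
        return head;
    }

    // Index-style access: every lookup restarts from the first sibling.
    static long stepsByIndex(Node head, int n) {
        long steps = 0;
        for (int i = 0; i < n; i++) {
            Node cur = head;
            for (int j = 0; j < i; j++) {
                cur = cur.next;
                steps++;
            }
        }
        return steps;
    }

    // Reference-style access: keep a cursor and move it one sibling at a time.
    static long stepsByCursor(Node head) {
        long steps = 0;
        for (Node cur = head; cur.next != null; cur = cur.next) {
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        int n = 40_000; // record count mentioned in the thread
        Node head = build(n);
        System.out.println("index-style steps:  " + stepsByIndex(head, n));
        System.out.println("cursor-style steps: " + stepsByCursor(head));
    }
}
```

For n records the index-style walk takes n(n-1)/2 steps while the cursor takes n-1, so under this assumption the gap widens quickly as messages grow.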

Hope this helps.
_________________
Kiran

IBM Cert. Solution Designer & System Administrator - WBIMB V5
IBM Cert. Solutions Expert - WMQI
IBM Cert. Specialist - WMQI, MQSeries
IBM Cert. Developer - MQSeries

fitzcaraldo

Posted: Sat Mar 13, 2004 3:58 am Post subject:

Voyager

Joined: 05 May 2003
Posts: 98

This really comes down to whether the message must be processed as a single unit of work (i.e. all 40000 records or none).

If not, you may be able to split the message into 40000 separate ones and process them individually. Does the target application require one large message or can it handle 40000 small ones?

If it requires one large message in a single unit of work, you can split it and add an RFH2 to each message with a sequence number, and then have a flow that reassembles them into one big message and checks for omissions. Or use the MQ grouping you mention.
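
The reassembly step with a check for omissions could look roughly like the Java sketch below. It assumes each part carries a sequence number from 1 to a known total; how that number travels (RFH2 field, MQ group sequence) is left open, and the Reassembler class is hypothetical rather than any MQ or broker API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical reassembler: collects parts keyed by sequence number (1..total).
class Reassembler {
    private final int total;
    // TreeMap keeps the parts sorted by sequence number
    private final Map<Integer, String> parts = new TreeMap<>();

    Reassembler(int total) {
        this.total = total;
    }

    // Parts may arrive in any order; a repeated sequence number overwrites the earlier part.
    void accept(int sequence, String payload) {
        if (sequence < 1 || sequence > total) {
            throw new IllegalArgumentException("sequence out of range: " + sequence);
        }
        parts.put(sequence, payload);
    }

    // Sequence numbers that have not arrived yet.
    List<Integer> missing() {
        List<Integer> absent = new ArrayList<>();
        for (int seq = 1; seq <= total; seq++) {
            if (!parts.containsKey(seq)) {
                absent.add(seq);
            }
        }
        return absent;
    }

    // Concatenates the parts in sequence order; refuses if anything is missing.
    String assemble() {
        List<Integer> absent = missing();
        if (!absent.isEmpty()) {
            throw new IllegalStateException("missing parts: " + absent);
        }
        StringBuilder whole = new StringBuilder();
        for (String payload : parts.values()) {
            whole.append(payload);
        }
        return whole.toString();
    }

    public static void main(String[] args) {
        Reassembler r = new Reassembler(3);
        r.accept(3, "c\n");
        r.accept(1, "a\n");
        System.out.println("missing: " + r.missing());
        r.accept(2, "b\n");
        System.out.print(r.assemble());
    }
}
```

Refusing to assemble while parts are missing is what gives the all-or-nothing behaviour; a real flow would also need a timeout or similar policy for parts that never arrive.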

To handle a single 10MB message you would want to keep the parsing and number of compute nodes to a minimum.

vallu

Posted: Sun Mar 14, 2004 4:41 pm Post subject:

Apprentice

Joined: 29 Jun 2002
Posts: 31

Thanks all of you.
I shall try using REFERENCE. But I do not think our machine has enough memory to handle a 10MB message. In that case, I shall try splitting.

Thanks again

surenat

Posted: Mon Mar 15, 2004 2:31 pm Post subject:

Apprentice

Joined: 01 Jan 2002
Posts: 32

Hi Vallu:
I experienced the same problem. I had a situation where I needed to convert a 1MB CSV message to XML. Using plain ESQL coding, it took 10 mins to process one message. After I altered the code to use REFERENCE and removed CARDINALITY from the WHILE loop condition, it processed in 2 mins. But 2 mins is still too much time. I opened a PMR with IBM... nothing helpful. I ended up writing a Java plugin, and the processing time dropped to 10 secs per message!
_________________
IBM Certified Specialist MQSeries
IBM Certified Specialist - Websphere MQ Integrator

vallu

Posted: Mon Mar 15, 2004 7:32 pm Post subject:

Apprentice

Joined: 29 Jun 2002
Posts: 31

I have noticed that using REFERENCE is much faster. So, you wrote your own plugin. Does it transform CSV to XML? How does it do this?

surenat

Posted: Tue Mar 16, 2004 7:34 am Post subject:

Apprentice

Joined: 01 Jan 2002
Posts: 32

In summary, I wrote some Java classes to load the CSV into a data structure (object), and then mapped the data to XML (DOM parser for Java). I integrated these parsing classes with the main Java plugin class.

Let me know if you need the detailed framework. First make sure your client agrees to use Java plug-ins inside MQSI.
_________________
IBM Certified Specialist MQSeries
IBM Certified Specialist - Websphere MQ Integrator

vallu

Posted: Tue Mar 16, 2004 11:33 pm Post subject:

Apprentice

Joined: 29 Jun 2002
Posts: 31

Hi Surenat,
Please let us know the framework. I am curious to know, as to how DOM parsing could be faster than MQ compute nodes.

surenat

Posted: Wed Mar 17, 2004 7:35 am Post subject:

Apprentice

Joined: 01 Jan 2002
Posts: 32

Hi Vallu:
I do not think the DOM parser is faster than the WMQI XML parser. The only difference I made in the plugin was that I did not load the incoming CSV message as an MRM tree; instead I used Java string tokenization to parse the CSV and then mapped the tokenized string data to a DOM tree.
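
A minimal sketch of that tokenize-then-build approach, using only the standard Java XML APIs, might look like this. It is not the actual plugin code: the element names Records and Record, the fieldNames parameter and the class name are assumptions, and the broker plug-in wiring is left out entirely. The post does not say which tokenization API was used; String.split with a limit of -1 is used here because java.util.StringTokenizer skips empty fields.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical converter: tokenizes CSV text and maps it to a DOM tree.
class CsvToDom {

    static Document convert(String csv, String[] fieldNames) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }
        Element root = doc.createElement("Records"); // assumed root element name
        doc.appendChild(root);
        for (String line : csv.split("\r?\n")) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            // limit -1 keeps empty fields so positions still line up with fieldNames
            String[] tokens = line.split(",", -1);
            Element record = doc.createElement("Record"); // assumed record element name
            root.appendChild(record);
            for (int i = 0; i < fieldNames.length && i < tokens.length; i++) {
                Element field = doc.createElement(fieldNames[i]);
                field.setTextContent(tokens[i]);
                record.appendChild(field);
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        Document doc = convert("a,,1\nb,x,2\n", new String[] {"Id", "Note", "Qty"});
        int count = doc.getDocumentElement().getElementsByTagName("Record").getLength();
        System.out.println(count + " records");
    }
}
```

The point of the design is the one made above: the CSV is never parsed into an MRM tree, only split into strings and written straight into the output structure.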
_________________
IBM Certified Specialist MQSeries
IBM Certified Specialist - Websphere MQ Integrator

JLRowe

Posted: Fri Mar 19, 2004 2:47 am Post subject:

Yatiri

Joined: 25 May 2002
Posts: 664
Location: South East London

I would warrant that the DOM parser is much faster than the WMQI one, especially for large messages.

There have been lots of posts in the past about WMQI performance problems with large messages. Part of the problem must be that the message tree is copied over for every node in the flow. The WMQI parser probably only has an advantage when you partially parse towards the head of the message and you do not update the message.
