Author |
Message
|
cwazpitt3 |
Posted: Wed Mar 21, 2012 6:44 am Post subject: |
|
|
Acolyte
Joined: 31 Aug 2011 Posts: 61
|
marko.pitkanen wrote: |
Hi cwazpitt3,
Just to check: did you set up your FileInput node to read the file line by line?
Code: |
Records and Elements
Record detection = Delimited
Delimiter = DOS or UNIX Line End |
--
Marko |
Yes |
|
|
 |
marko.pitkanen |
Posted: Wed Mar 21, 2012 6:45 am Post subject: |
|
|
 Chevalier
Joined: 23 Jul 2008 Posts: 440 Location: Jamsa, Finland
|
Hi,
I ran a quick test:
FileInput -> Trace node
FileInput
Code: |
Records and Elements
Record detection = Delimited
Delimiter = DOS or UNIX Line End |
Trace node
Code: |
Destination = File
Pattern = ${CAST(CURRENT_TIMESTAMP AS CHARACTER FORMAT 'yyyyMMdd-HHmmss')}
${CAST(Root.BLOB.BLOB AS CHAR CCSID Root.Properties.CodedCharSetId)} |
It took 1163 seconds to process 1 485 301 lines with this scenario, so about 1277 lines per second.
--
Marko |
|
|
 |
mqsiuser |
Posted: Wed Mar 21, 2012 7:10 am Post subject: |
|
|
 Yatiri
Joined: 15 Apr 2008 Posts: 637 Location: Germany
|
marko.pitkanen wrote: |
Trace node |
How fast is it if you use the local environment (or a shared row) to keep a single variable for counting the lines, with no Trace node (and no trace enabled at all)?
Wikipedia says: "Some published tests demonstrate message rates in excess of 10,000 per second in particular configurations."
We could measure against that  _________________ Just use REFERENCEs |
|
|
 |
Vitor |
Posted: Wed Mar 21, 2012 7:36 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
marko.pitkanen wrote: |
So 1277 lines per second. |
OS, WMB version & platform details?
I did functionally the same thing on my sad little Solaris sandbox (2 GB, 1 core of a Blade T6320 as a whole root zone) and got 850 lines per second. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
Vitor |
Posted: Wed Mar 21, 2012 7:39 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
mqsiuser wrote: |
Wikipedia says: "Some published tests demonstrate message rates in excess of 10,000 per second in particular configurations." |
Yes. If the OP had the kind of configuration that could manage that, he wouldn't be posting about memory usage.... _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
cwazpitt3 |
Posted: Wed Mar 21, 2012 7:51 am Post subject: |
|
|
Acolyte
Joined: 31 Aug 2011 Posts: 61
|
But I have to also write the contents to an outbound file (essentially copying the file from one location to another). Is that where my flow is spending all its time?
I did FileInput --> FileOutput and for 39,974 lines it took 504 seconds, for a whopping 79 lines per second! At that rate, my 250,000 line file would take a really long time. Am I doing something wrong?
If I could get to 800 lines per second or so it would compare to the 6 minutes the Java approach takes. I might just stick with what I have. |
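The rates quoted so far can be cross-checked with a few lines of arithmetic. This is only a sketch in Python: the function names are made up for illustration, and the figures are the ones posted above.

```python
def lines_per_second(lines: int, seconds: float) -> float:
    """Throughput of a test run."""
    return lines / seconds


def estimated_seconds(lines: int, rate: float) -> float:
    """Time needed to process `lines` at `rate` lines per second."""
    return lines / rate


# Figures quoted in this thread
trace_rate = lines_per_second(1_485_301, 1163)  # FileInput -> Trace test
copy_rate = lines_per_second(39_974, 504)       # FileInput -> FileOutput test

for label, rate in [("FileInput -> Trace", trace_rate),
                    ("FileInput -> FileOutput", copy_rate),
                    ("target rate", 800.0)]:
    minutes = estimated_seconds(250_000, rate) / 60
    print(f"{label}: {rate:.0f} lines/s, 250,000 lines in about {minutes:.1f} min")
```

At roughly 79 lines per second the 250,000-line file would need on the order of 50 minutes, while 800 lines per second would bring it down to a little over 5 minutes.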
|
|
 |
mqjeff |
Posted: Wed Mar 21, 2012 8:01 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
cwazpitt3 wrote: |
is that where my flow is spending all its time? |
The user trace will show you where your flow is spending all its time.
Did you deploy any additional instances.... ? |
|
|
 |
cwazpitt3 |
Posted: Wed Mar 21, 2012 8:03 am Post subject: |
|
|
Acolyte
Joined: 31 Aug 2011 Posts: 61
|
mqjeff wrote: |
The user trace will show you where your flow is spending all its time.
Did you deploy any additional instances.... ? |
I am working on the user trace. I don't think just putting a Trace node in will give me any other information about the flow.
No additional instances. |
|
|
 |
marko.pitkanen |
Posted: Wed Mar 21, 2012 8:20 am Post subject: |
|
|
 Chevalier
Joined: 23 Jul 2008 Posts: 440 Location: Jamsa, Finland
|
Hi,
I made a quick review of the user trace and found that most of the time was spent in the Trace node. I changed the test flow to use Compute and FileOutput nodes to produce the same functionality, and I am now running the same test again.
--
Marko |
|
|
 |
marko.pitkanen |
Posted: Wed Mar 21, 2012 11:22 am Post subject: |
|
|
 Chevalier
Joined: 23 Jul 2008 Posts: 440 Location: Jamsa, Finland
|
Hi,
Somehow I messed up the Compute and FileOutput nodes and made the test flow a lot slower. I need to find out why. But with a Compute node and a SHARED INT I managed to count 1 485 300 lines in 979 seconds.
--
Marko |
|
|
 |
marko.pitkanen |
Posted: Wed Mar 21, 2012 12:22 pm Post subject: |
|
|
 Chevalier
Joined: 23 Jul 2008 Posts: 440 Location: Jamsa, Finland
|
Hi,
From my quick test, it seems that most of the time is spent between creating the new parser and reading the record. This happens for every line when reading a local file line by line and counting the lines.
Code: |
2012-03-21 21:34:51.216648 12 UserTrace BIP6064I: A parser of type ''BLOB'' was created on behalf of node 'readFileLineByLine.File Input' to handle the input stream, beginning at offset '0'. The parser type was selected based on value ''NONE'' from the previous parser.
2012-03-21 21:34:51.217308 = 660 12 UserTrace BIP3352I: ''FileInput'' node ''File Input'' in message flow ''readFileLineByLine'' is propagating record ''5'' obtained from file ''/home/xxxxx/mqsitransitin/yyy-readFileLineByLine.in'' at offset ''132'' to terminal ''out''.
The FileInput node read a record from the file, and will propagate it to the named terminal.
No action is required.
2012-03-21 21:34:51.217356 = 708 12 UserTrace BIP3907I: Message received and propagated to 'out' terminal of input node 'readFileLineByLine.File Input'.
2012-03-21 21:34:51.217404 = 756 12 UserTrace BIP6063I: A parser of type ''Properties'' was created on behalf of node 'readFileLineByLine.File Input' to handle the input stream, beginning at offset '0'.
2012-03-21 21:34:51.217444 = 796 12 UserTrace BIP6069W: The broker is not capable of handling a message of data type ''BLOB''.
The message broker received a message that requires the handling of data of type ''BLOB'', but the broker does not have the capability to handle data of this type.
Check both the message being sent to the message broker and the configuration data for the node. References to the unsupported data type must be removed if the message is to be processed by the broker.
2012-03-21 21:34:51.217520 = 872 12 UserTrace BIP6064I: A parser of type ''BLOB'' was created on behalf of node 'readFileLineByLine.File Input' to handle the input stream, beginning at offset '0'. The
2012-03-21 21:34:51.218148 = 628 12 UserTrace BIP3352I: ''FileInput'' node ''File Input'' in message flow ''readFileLineByLine'' is propagating record ''6'' obtained from file ''/home/xxxxx/mqsitransitin/
|
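The gap between parser creation (BIP6064I) and record propagation (BIP3352I) can be read off the timestamps in the trace above. A small Python sketch of that subtraction, where the helper name is hypothetical and only the two timestamp pairs visible in the excerpt are used:

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S.%f"


def gap_microseconds(start: str, end: str) -> int:
    """Elapsed microseconds between two user-trace timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta // timedelta(microseconds=1)


# BIP6064I (parser created) -> BIP3352I (record propagated), from the trace above
gaps = [
    gap_microseconds("2012-03-21 21:34:51.216648", "2012-03-21 21:34:51.217308"),
    gap_microseconds("2012-03-21 21:34:51.217520", "2012-03-21 21:34:51.218148"),
]
for gap in gaps:
    print(f"{gap} us per record -> at most {1_000_000 / gap:.0f} records/s")
```

If that step alone costs around 0.6 ms per record, it would by itself cap the flow at roughly 1,500 to 1,600 lines per second, which is in the same range as the measured rates.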
--
Marko |
|
|
 |
marko.pitkanen |
Posted: Wed Mar 21, 2012 11:08 pm Post subject: |
|
|
 Chevalier
Joined: 23 Jul 2008 Posts: 440 Location: Jamsa, Finland
|
Hi,
From the debug trace I can see that it takes some time to finalise the current flow execution after the "RETURN FALSE" statement, and perhaps a little longer to initialise the FileInput node to read the next record.
Code: |
2012-03-22 08:46:18.180320 12 UserTrace BIP2539I: Node 'readFileLineByLine.Compute': Evaluating expression ''iRows = 1 OR MOD(iRows, 100000) = 0'' at ('.readFileLineByLine_Compute.Main', '6.18'). This resolved to ''FALSE OR FALSE''. The result was ''FALSE''.
2012-03-22 08:46:18.180342 12 UserTrace BIP2537I: Node 'readFileLineByLine.Compute': Executing statement ''RETURN FALSE;'' at ('.readFileLineByLine_Compute.Main', '16.3').
2012-03-22 08:46:18.180744 12 UserTrace BIP4144I: Entered function 'cniCreateElementAsLastChild'(42396e5c, acbc160, 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A').
.
.
.
2012-03-22 08:46:18.181174 12 UserTrace BIP4142I: Evaluating cniElementSet'Name'. Changing value from '''' to ''Wildcard''.
Element ''Name'' has been changed to ''Wildcard''.
No user action required.
2012-03-22 08:46:18.181304 12 UserTrace BIP3352I: ''FileInput'' node ''File Input'' in message flow ''readFileLineByLine'' is propagating record ''4'' obtained from file ''/home/xxx/mqsitransitin/yyy-readFileLineByLine.in'' at offset ''131'' to terminal ''out''.
The FileInput node read a record from the file, and will propagate it to the named terminal.
No action is required.
2012-03-22 08:46:18.181336 12 UserTrace BIP3907I: Message received and propagated to 'out' terminal of input node 'readFileLineByLine.File Input'. |
--
Marko |
|
|
 |
mqsiuser |
Posted: Thu Mar 22, 2012 12:35 am Post subject: |
|
|
 Yatiri
Joined: 15 Apr 2008 Posts: 637 Location: Germany
|
Code: |
The parser type was selected based on value ''NONE'' from the previous parser. |
I'd try to change the "NONE".
Currently I assume you are creating about 1.4853 million parsers (parser objects)... that is probably where your performance goes down the drain?! _________________ Just use REFERENCEs |
|
|
 |
Esa |
Posted: Thu Mar 22, 2012 12:55 am Post subject: |
|
|
 Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
|
I guess you are using 'Parsed Record Sequence' or 'Delimited' for record detection in the FileInput node. That guarantees a huge overhead.
I tested counting records for a whole file using the techniques from the Large Messaging sample. Processing an 18 MB file of 198,313 records took 10.2 seconds. The actual counting took less than 1 ms. The rest went to allocating memory for the BLOB and to read/write operations. I must admit that I used MQInput and MQOutput, because I happened to have a test setup at hand.
That makes almost 20 000 records per second.
So I would say that you get the best performance by using the Record detection settings 'Whole File' or 'Fixed Length' (of 5-10 MB perhaps), a simple message set and large message processing techniques. |
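To illustrate the idea of reading large blocks and counting line ends, rather than propagating one message per line, here is a minimal sketch. It is plain Python, not the ESQL from the Large Messaging sample, and the function name and chunk size are assumptions chosen for illustration.

```python
import io
from typing import BinaryIO

CHUNK_SIZE = 5 * 1024 * 1024  # 5 MB blocks, within the 5-10 MB range suggested above


def count_records(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> int:
    """Count line-delimited records in a binary stream, one chunk at a time.

    DOS (CRLF) and UNIX (LF) line ends both finish with LF, so counting LF
    bytes covers both. A final record without a trailing line end is
    counted as well.
    """
    records = 0
    last_byte = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        records += chunk.count(b"\n")
        last_byte = chunk[-1:]
    if last_byte and last_byte != b"\n":
        records += 1  # unterminated last record
    return records


if __name__ == "__main__":
    sample = io.BytesIO(b"first\r\nsecond\nthird")
    print(count_records(sample, chunk_size=4))
```

Because only the LF byte is counted, a line end that straddles two chunks is still counted exactly once.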
|
|
 |
rramasu |
Posted: Tue May 13, 2014 10:20 am Post subject: |
|
|
Newbie
Joined: 23 Sep 2007 Posts: 9 Location: india
|
Hi Esa,
Could you provide the pieces of code you used to count the lines in the BLOB character data?
Thanks,  _________________ Rajamani |
|
|
 |
|