EnthusiasticSatya
Posted: Thu Aug 11, 2011 12:39 am    Post subject: How to read a huge file in smaller parts via File Input Node
Apprentice
Joined: 10 Aug 2011    Posts: 26
Generally we read the whole file via the File Input node (the file is placed on the file server). My requirement now is to read a huge text file from the file server in smaller parts or blocks, so that I do not read the whole file in one go (which crashes the server, as the file is more than 1 MB).
Example:
The input file is something like this:
1020000121914110000064285
3020000121914110000064285
3020000121914110000064285
3020000121914110000064285
1020000121914110000064285
3020000121914110000064285
3020000121914110000064285
3020000121914110000064285
1020000121914110000064285
3020000121914110000064285
3020000121914110000064285
3020000121914110000064285
I should read the file in blocks:
Block 1
1020000121914110000064285
3020000121914110000064285
3020000121914110000064285
3020000121914110000064285
Block 2
1020000121914110000064285
3020000121914110000064285
3020000121914110000064285
3020000121914110000064285
Block 3
1020000121914110000064285
3020000121914110000064285
3020000121914110000064285
3020000121914110000064285
I have to check the first character of each line; here it is '1'.
Does somebody have an idea on this?
zpat
Posted: Thu Aug 11, 2011 1:14 am
Jedi Council
Joined: 19 May 2001    Posts: 5866    Location: UK
Read it record by record (using the appropriate FileInput node options).
Then the message flow can look for particular column data values to process chunks.
Alternatively, you can also set delimiter characters in the FileInput node; I'm not sure it can do exactly what you need, though.
Last edited by zpat on Thu Aug 11, 2011 2:04 am; edited 1 time in total
EnthusiasticSatya
Posted: Thu Aug 11, 2011 1:42 am
Apprentice
Joined: 10 Aug 2011    Posts: 26
In the FileInput node, under the Records and Elements properties, we need to select Parsed Record Sequence in the Record Detection dropdown to read the file record by record.
As given in the example, I should read the file based on record type "1", which is the first character of every line.
Where in the FileInput node properties should I specify this, so that the node detects the first character and reads the file accordingly?
zpat
Posted: Thu Aug 11, 2011 2:07 am
Jedi Council
Joined: 19 May 2001    Posts: 5866    Location: UK
You mean process the record, not "read the file".
As I perhaps did not make clear, the built-in record parsing may not do what you want. So you can process each record individually in the flow and build up your output data in ESQL variables, until you encounter the delimiter sequence that prompts you to generate an output.
No doubt Vitor would say it's easier to do this using the Aggregation node. So you now have enough clues to get on with it.
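[Editor's sketch] The accumulate-until-delimiter pattern described above can be illustrated in plain Java. The class and method names below are made up for this sketch, not part of the WMB API; in a real flow this state would live in ESQL shared variables or a Java Compute node, with one call per record read by the FileInput node.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: buffer records until the next '1' record arrives,
// then emit the buffered lines as one completed block.
public class BlockAccumulator {
    private final List<String> buffer = new ArrayList<>();
    private final List<List<String>> blocks = new ArrayList<>();

    // Called once per record, in file order.
    public void onRecord(String record) {
        if (record.startsWith("1") && !buffer.isEmpty()) {
            blocks.add(new ArrayList<>(buffer)); // previous block is complete
            buffer.clear();
        }
        buffer.add(record);
    }

    // Called after the last record (e.g. on the FileInput end-of-data event),
    // so the final block is not lost.
    public void onEndOfFile() {
        if (!buffer.isEmpty()) {
            blocks.add(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    public List<List<String>> getBlocks() { return blocks; }
}
```

Each element of `getBlocks()` corresponds to one "generate an output" event in the pattern above.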
EnthusiasticSatya
Posted: Thu Aug 11, 2011 2:19 am
Apprentice
Joined: 10 Aug 2011    Posts: 26
I need to group records and propagate them in one REXI until I get the next "1" as the first character of a line.
EnthusiasticSatya
Posted: Thu Aug 11, 2011 3:08 am
Apprentice
Joined: 10 Aug 2011    Posts: 26
First I need to stop the FileInput node from reading the whole file, because by default the FileInput node reads the whole file.
I need to read the file in blocks based on record type "1".
As suggested above, I can process the records in ESQL variables, but by the time I come to the next node, the FileInput node would have read the whole file, which I do not want.
"process each record individually in the flow and build up your output data in ESQL variables, until you encounter the delimiter sequence that prompts you to generate an output."
So this will not work.
zpat
Posted: Thu Aug 11, 2011 3:42 am
Jedi Council
Joined: 19 May 2001    Posts: 5866    Location: UK
Set the FileInput node to record mode, not whole-file mode.
See Record Detection; use Fixed Length, or Delimited with a DOS/UNIX end of line.
This reads each record; you then do the rest.
EnthusiasticSatya
Posted: Thu Aug 11, 2011 5:36 am
Apprentice
Joined: 10 Aug 2011    Posts: 26
I know about Parsed Record Sequence in Record Detection.
I cannot use either Delimited or Fixed Length.
I need to split the file based on the first character (in this case "1").
So at one time, I should read a block which might contain 3 to 4 lines, e.g.:
Block 1
1020000121914110000064285
3020000121914110000064285
3020000121914110000064285
3020000121914110000064285
I am not able to find where I should specify that the file be read until the next row whose first character is "1".
smdavies99
Posted: Thu Aug 11, 2011 5:47 am
Jedi Council
Joined: 10 Feb 2003    Posts: 6076    Location: Somewhere over the Rainbow this side of Never-never land.
EnthusiasticSatya wrote:
    I cannot use either Delimited or Fixed Length.
Why? Why can't you do that?
Why can't you read each record and save them temporarily until you reach the first record of the next block?
Then bundle the records of the block together and send it on its merry way.
I have a flow that regularly reads files greater than 1 MB record by record.
You could save them in a DB table with a unique key per block, and when the whole file has been read, do whatever you want with each block.
_________________
WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995
Every time you reinvent the wheel the more square it gets (anon). If in doubt, think and investigate before you ask silly questions.
Last edited by smdavies99 on Thu Aug 11, 2011 5:51 am; edited 1 time in total
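[Editor's sketch] The table-per-block idea above can be mocked up with an in-memory map standing in for the DB table. The class name, key scheme, and the assumption that the file starts with a "1" record are all illustrative, not taken from the thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in for a DB table keyed by block number:
// each '1' record opens a new key; subsequent records attach to it.
public class BlockStore {
    private final Map<Integer, List<String>> table = new LinkedHashMap<>();
    private int currentBlock = 0;

    public void store(String record) {
        if (record.startsWith("1")) {
            currentBlock++;                          // new unique block key
            table.put(currentBlock, new ArrayList<>());
        }
        // Assumes the file begins with a '1' record, as in the example above.
        table.get(currentBlock).add(record);
    }

    // Once the whole file has been read, iterate the map (or SELECT by key
    // in the real DB version) and process each block.
    public Map<Integer, List<String>> allBlocks() { return table; }
}
```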
Vitor
Posted: Thu Aug 11, 2011 5:48 am
Grand High Poobah
Joined: 11 Nov 2005    Posts: 26093    Location: Texas, USA
EnthusiasticSatya wrote:
    I cannot use either Delimited or Fixed Length.
Why not?
EnthusiasticSatya wrote:
    I need to split the file based on the first character (in this case "1").
No you don't.
EnthusiasticSatya wrote:
    I am not able to find where I should specify that the file be read until the next row whose first character is "1".
That's because you can't. But it's OK, because you don't need to. You need to read the file one record at a time, as previously suggested.
_________________
Honesty is the best policy.
Insanity is the best defence.
Vitor
Posted: Thu Aug 11, 2011 5:49 am
Grand High Poobah
Joined: 11 Nov 2005    Posts: 26093    Location: Texas, USA
smdavies99 wrote:
    You could save them in a DB table with a unique key per block, and when the whole file has been read, do whatever you want with each block.
Or, if the number is known to be small and/or DB use is infeasible, collect them within WMB.
whydieanut
Posted: Tue Aug 16, 2011 10:09 pm
Disciple
Joined: 02 Apr 2010    Posts: 186
What about using Delimited with Infix? The delimiter could be:
Isn't it also possible by specifying a Tag Delimited message set and using Parsed Record Sequence?
mqjeff
Posted: Wed Aug 17, 2011 3:49 am
Grand Master
Joined: 25 Jun 2008    Posts: 17447
You can *easily* accomplish this using a properly constructed TDS model that uses '1' and '3' as tags.
TDS will then identify that you have reached another '1' record and stop accumulating repeats of the '3' records.
If you are careful with your model construction, the FileInput node can emit the whole set (a '1' record followed by its repeating '3' records) as a single message.
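[Editor's sketch] For illustration only, the logical structure of such a tagged model might look like the outline below. The element names are invented for this sketch, not taken from an actual message set, and the exact TDS properties (tag data separator, group indicators, and so on) still have to be worked out in the message set editor.

```
Message: TransferFile
  Block          (repeating group; each Block can be propagated as one message)
    Header       tag = '1', occurs exactly once
    Detail       tag = '3', occurs 1..n times
```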
smdavies99
Posted: Wed Aug 17, 2011 4:04 am
Jedi Council
Joined: 10 Feb 2003    Posts: 6076    Location: Somewhere over the Rainbow this side of Never-never land.
mqjeff wrote:
    If you are careful with your model construction, the FileInput node can emit the whole set (a '1' record followed by its repeating '3' records) as a single message.
It seems to me that therein lies your problem. It takes time, patience and persistence to get the model you describe correct.
If it does not come with a configuration wizard that only accepts Yes as an answer, then crafting something by hand is a dying art form.
EnthusiasticSatya
Posted: Wed Aug 17, 2011 11:24 pm
Apprentice
Joined: 10 Aug 2011    Posts: 26
Hi,
Since it was a bit difficult to read the file block-wise, as suggested by Vitor I am trying to read the file record by record.
So here is how I am dealing with it:
1. Read the file line by line and then handle each record in a Java Compute node (my main key here is the dealer code present in the record).
2. Store the whole record in a global variable.
3. Come back to the FileInput node and read the second line.
4. Compare the dealer code with that of the first record stored in the global variable; if it is the same, store this record in the global variable too, and if it is different, propagate to the FileOutput node.
The problem I am facing now is this:
I store the first record in a global variable and then read the next record from the FileInput node, but when I come to the Java Compute node, I do not see anything stored in the global variable.
The global variables are getting refreshed, so I do not get any previous data, and hence I am not able to compare the second record with the first record.
Can anyone give me any idea on this?
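[Editor's sketch] A common cause of "refreshed" state is declaring the variables inside the per-message method, so they are recreated on every call; state that must survive between records has to live on the long-lived object (or in a static field) instead. The plain-Java sketch below illustrates the distinction outside the broker; the class name and the dealer-code position (`substring(2, 12)`) are assumptions for this sketch, not taken from the thread, and the real MbJavaComputeNode plumbing is not shown.

```java
// Illustrative sketch of the dealer-code grouping described above.
// The fields survive across calls because they belong to the one
// long-lived object, mirroring instance state in a Java Compute node.
public class DealerGrouper {
    private String currentDealer = null;           // persists between records
    private final StringBuilder block = new StringBuilder();

    // Called once per record. Returns a completed block (to propagate to
    // FileOutput) when the dealer code changes, or null while accumulating.
    public String onRecord(String record) {
        String dealer = record.substring(2, 12);   // assumed key position
        String completed = null;
        if (currentDealer != null && !dealer.equals(currentDealer)) {
            completed = block.toString();
            block.setLength(0);
        }
        currentDealer = dealer;
        block.append(record).append('\n');
        return completed;
    }
}
```

Had `currentDealer` and `block` been local variables of `onRecord`, they would be re-initialised on every call, which is exactly the symptom described in the post above.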