danferry |
Posted: Mon Sep 19, 2011 6:23 am Post subject: [FileRead node] Read multiple files |
|
|
Novice
Joined: 11 Jul 2011 Posts: 12
|
Hi
Is it possible to read multiple files with a FileRead node?
I know it's possible when I get back to the In-terminal of the FileRead node - but what I actually want is something like this:
FileInput (Control-File to start) => FileRead (Read xy*.txt) => Out: process data [serial] / No Match: process 'No file found'
Could anyone help me please?
Thanks a lot
Daniel |
|
mqjeff |
Posted: Mon Sep 19, 2011 6:38 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
The FileRead node won't do this work for you. You must code a loop.
You might want to strongly consider converting whatever process this is to not use files in the first place.
You never want to code a loop in Message Broker by wiring connections from out terminals to in terminals; you always want to use PROPAGATE in some form or another. |
|
zpat |
Posted: Mon Sep 19, 2011 6:58 am Post subject: |
|
|
 Jedi Council
Joined: 19 May 2001 Posts: 5866 Location: UK
|
You could perhaps convert all files to messages, using a simpler flow with a FileInput node and an MQOutput node. You can also choose to create one message per record (rather than per file), which improves atomicity (remember the old ACID adage...).
Then you process the messages once you have your trigger condition (e.g., using a Collector node).
As Jeff says, files tend to be an inferior form of messaging, and it's better to go back to the origin and use MQ. Only MQ can offer transactional integrity between platforms.
Using files runs the risk of processing data twice or not at all, having to remove the processed files, not being able to cope with partially processed files (e.g. rollback issues), reading data out of sequence, not detecting missing files, problems with multi-threaded flows, and any number of other problems that the use of files brings. |
|
danferry |
Posted: Mon Sep 19, 2011 7:05 am Post subject: |
|
|
Novice
Joined: 11 Jul 2011 Posts: 12
|
mqjeff wrote: |
The FileRead node won't do this work for you. You must code a loop.
You might want to strongly consider converting whatever process this is to not use files in the first place.
You never want to code a loop in Message Broker by wiring connections from out terminals to in terminals; you always want to use PROPAGATE in some form or another. |
Thanks for your fast answer.
What 'loop' do you mean? I cannot read a file in a Compute node, can I? I also cannot use PROPAGATE in a FileRead node.
I am limited to receiving files, and their contents have to be inserted into a database.
Let's say: test*.xml and othertest*.xml
I don't know how many test-files and othertest-files there are, but before I process the othertest-files I must have processed the test-files.
Until now, I solved that with a Collector node that gathers all files, and I process each of them with PROPAGATE TO TERMINAL commands. But this approach requires configuring a timeout on the Collector node, which costs time.
I just want to keep it as simple as possible.
If you have any good advice I would be very happy.
Regards,
Daniel
edit: No need to roll back - the data is written into a database and if that fails, the DB will do the rollback. Loss of files is not a problem. The solution should be as simple as possible, and using MQ makes it more complicated. |
|
zpat |
Posted: Mon Sep 19, 2011 7:09 am Post subject: |
|
|
 Jedi Council
Joined: 19 May 2001 Posts: 5866 Location: UK
|
You loop in a compute node, with repeated PROPAGATE to a terminal. That terminal is connected to a FileRead node (and/or other nodes).
Your downstream flow (on this terminal) may need to signal back to the compute node when to stop looping, in which case you can set values in local environment variables to communicate.
Last edited by zpat on Mon Sep 19, 2011 7:10 am; edited 1 time in total |
|
danferry |
Posted: Mon Sep 19, 2011 7:10 am Post subject: |
|
|
Novice
Joined: 11 Jul 2011 Posts: 12
|
zpat wrote: |
You loop in a compute node, with PROPAGATE to a terminal. That terminal is connected to a FileRead node (and/or other nodes). |
Ok, but how do I know how many times I have to loop? |
|
zpat |
Posted: Mon Sep 19, 2011 7:12 am Post subject: |
|
|
 Jedi Council
Joined: 19 May 2001 Posts: 5866 Location: UK
|
How many files do you want to read?
This is all a bit of a distorted solution. Using a FileInput node is better since it will drive the flow for each file matching the pattern. |
|
mqjeff |
Posted: Mon Sep 19, 2011 7:13 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
danferry wrote: |
I just want to keep it as simple as possible. |
Your current attempt to do this without using a Compute node (of some kind) is *too* simple.
danferry wrote: |
If you have a good advice I would be very happy |
I gave you the best advice I have.
Write a loop. You can't do this with just the nodes you have created. |
|
danferry |
Posted: Mon Sep 19, 2011 7:26 am Post subject: |
|
|
Novice
Joined: 11 Jul 2011 Posts: 12
|
zpat wrote: |
How many files do you want to read?
This is all a bit of a distorted solution. Using a FileInput node is better since it will drive the flow for each file matching the pattern. |
I don't know. There could be just one file or 20. The FileInput with a Collector node is what I have, but the timeout of the Collector node costs time.
By the way: the flow runs once a day, started by a control file in a directory. There is a time window from 3am to 4am in which the files must be imported into the database.
If it is possible to count the number of files that match the filename pattern, I completely agree: write a loop in a Compute node. If this is not possible I cannot loop (OK, infinitely, but that's not good at all).
OK, I could run the loop a fixed number of times and just ignore the 'No Match' terminal. That way all files will be read. |
|
smdavies99 |
Posted: Mon Sep 19, 2011 8:10 am Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
Split it into two operations.
1) A FileInputnode-->ComputeNode (Inserts into DB temp table) Runs every minute or so
2) MQInputNode-->ComputeNode (moves data into Target Table, triggered by message on queue, triggered by Cron job or similar)
KISS applies. |
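The two-step split suggested above could look roughly like the following in ESQL. This is only a sketch under assumptions: the table names STAGING_IMPORT and TARGET_IMPORT, their columns, and the message field InputRoot.XMLNSC.Record.Id are hypothetical and do not come from the thread, and each Compute node is assumed to have its Data source property set. The test rows are moved before the othertest rows to respect the ordering requirement mentioned earlier in the thread.

```esql
-- Flow 1: FileInput -> Compute. Stage every incoming file in a temp table.
-- STAGING_IMPORT and its columns are hypothetical names.
CREATE COMPUTE MODULE Stage_File_Record
    CREATE FUNCTION Main() RETURNS BOOLEAN
    BEGIN
        -- The FileInput node records the file name in LocalEnvironment.File.Name
        INSERT INTO Database.STAGING_IMPORT (FILE_NAME, RECORD_ID)
            VALUES (InputLocalEnvironment.File.Name, InputRoot.XMLNSC.Record.Id);
        -- Nothing to propagate; the staged row is the result
        RETURN FALSE;
    END;
END MODULE;

-- Flow 2: MQInput -> Compute. Triggered once by a message on a queue.
-- TARGET_IMPORT is a hypothetical name.
CREATE COMPUTE MODULE Move_Staged_Rows
    CREATE FUNCTION Main() RETURNS BOOLEAN
    BEGIN
        -- Move the rows that came from test*.xml first
        PASSTHRU('INSERT INTO TARGET_IMPORT (FILE_NAME, RECORD_ID) SELECT FILE_NAME, RECORD_ID FROM STAGING_IMPORT WHERE FILE_NAME LIKE ''test%''');
        -- Then the rows that came from othertest*.xml
        PASSTHRU('INSERT INTO TARGET_IMPORT (FILE_NAME, RECORD_ID) SELECT FILE_NAME, RECORD_ID FROM STAGING_IMPORT WHERE FILE_NAME LIKE ''othertest%''');
        -- Clear the staging table for the next run
        PASSTHRU('DELETE FROM STAGING_IMPORT');
        RETURN FALSE;
    END;
END MODULE;
```

With this layout neither flow needs a loop or a Collector timeout: the first flow is driven once per file by the FileInput node, and the second runs once when the trigger message arrives.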
|
zpat |
Posted: Mon Sep 19, 2011 8:37 am Post subject: |
|
|
 Jedi Council
Joined: 19 May 2001 Posts: 5866 Location: UK
|
As I have already stated, you can communicate between the compute node loop and the fileread node using local environment variables (connect a compute node to the other side of a file read node to set the variable). |
|
danferry |
Posted: Mon Sep 19, 2011 10:42 pm Post subject: |
|
|
Novice
Joined: 11 Jul 2011 Posts: 12
|
zpat wrote: |
As I have already stated, you can communicate between the compute node loop and the fileread node using local environment variables (connect a compute node to the other side of a file read node to set the variable). |
I'm sorry. It was good to leave the problem overnight; I got what you meant, and I found the solution.
Two compute nodes, one file read node.
Compute Node 1: Loops as long as an env-variable is 'true'
File Read Node: Reads the file, No Match goes to Compute Node 2
Compute Node 2: Sets the env-variable to 'false'
That works so far.
Thanks to all of you who helped me!
Regards,
Daniel |
|
danferry |
Posted: Thu Sep 29, 2011 10:53 pm Post subject: |
|
|
Novice
Joined: 11 Jul 2011 Posts: 12
|
srikanth_n wrote: |
Hello Daniel,
Can you please explain how you implemented this solution, as I have a similar kind of requirement.
Thanks |
Sure, I have two screenshots where you can see the nodes:
[screenshots not preserved]
And here is the ESQL for the first picture:
Code: |
CREATE COMPUTE MODULE ReadSubf_Loop_Files
    CREATE FUNCTION Main() RETURNS BOOLEAN
    BEGIN
        DECLARE I INTEGER 0;
        SET Environment.FileFound = true;
        -- Read as long as there are files / trigger after 20 files -> no endless loop on misconfiguration/error
        WHILE Environment.FileFound AND I < 20 DO
            -- Go to read the file
            CALL CopyEntireMessage();
            PROPAGATE TO TERMINAL 'out1';
            SET I = I + 1;
        END WHILE;
        -- If FileRead fails => failure-state
        IF Environment.FileFailure THEN
            CALL CopyEntireMessage();
            RETURN FALSE;
        END IF;
        -- End => propagate to Out-terminal
        CALL CopyEntireMessage();
        RETURN TRUE;
    END;

    CREATE PROCEDURE CopyEntireMessage() BEGIN
        SET OutputRoot = InputRoot;
    END;
END MODULE;

CREATE COMPUTE MODULE ReadSubf_No_Match_State
    CREATE FUNCTION Main() RETURNS BOOLEAN
    BEGIN
        -- No more progressing
        SET Environment.FileFound = false;
        RETURN TRUE;
    END;
END MODULE;

CREATE COMPUTE MODULE ReadSubf_Error_State
    CREATE FUNCTION Main() RETURNS BOOLEAN
    BEGIN
        -- No more progressing & failure state
        SET Environment.FileFound = false;
        SET Environment.FileFailure = true;
        RETURN TRUE;
    END;
END MODULE; |
Hope that's enough
Dan |
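The thread also mentions that all test*.xml files must be processed before any othertest*.xml file. One possible way to extend the loop above for that is sketched below; it is not part of Dan's posted solution. It assumes that the Compute node's Compute mode includes LocalEnvironment, that the FileRead node's request file name location is left at $LocalEnvironment/Destination/File/Name, that this override accepts a wildcard pattern, and that the FileRead node deletes or archives each file after reading it so the next iteration finds the next file. The module name ReadSubf_Loop_Patterns is hypothetical; the No Match terminal is assumed to be wired to ReadSubf_No_Match_State as in the post.

```esql
CREATE COMPUTE MODULE ReadSubf_Loop_Patterns
    CREATE FUNCTION Main() RETURNS BOOLEAN
    BEGIN
        DECLARE P INTEGER 1;            -- index of the current pattern
        DECLARE I INTEGER;              -- safety counter per pattern
        DECLARE filePattern CHARACTER;

        WHILE P <= 2 DO
            -- First pass: test*.xml, second pass: othertest*.xml
            SET filePattern = CASE P WHEN 1 THEN 'test*.xml' ELSE 'othertest*.xml' END;
            SET I = 0;
            SET Environment.FileFound = TRUE;

            -- Same guarded loop as in the post: stop on No Match or after 20 files
            WHILE Environment.FileFound AND I < 20 DO
                -- PROPAGATE clears the output trees, so rebuild them every time
                SET OutputRoot = InputRoot;
                SET OutputLocalEnvironment = InputLocalEnvironment;
                -- Assumed override location read by the FileRead node
                SET OutputLocalEnvironment.Destination.File.Name = filePattern;
                PROPAGATE TO TERMINAL 'out1';
                SET I = I + 1;
            END WHILE;

            SET P = P + 1;
        END WHILE;

        -- End => propagate the original message to the Out terminal
        SET OutputRoot = InputRoot;
        RETURN TRUE;
    END;
END MODULE;
```

The stop flag lives in the Environment tree, as in the posted modules, because that tree is shared by all nodes handling the same message; the failure handling from ReadSubf_Loop_Files is left out here to keep the sketch short.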
|
zpat |
Posted: Fri Sep 30, 2011 2:06 am Post subject: |
|
|
 Jedi Council
Joined: 19 May 2001 Posts: 5866 Location: UK
|
Congratulations on posting the first graphic of WMB flows that I have seen on here. So much easier than trying to describe it in words!!  |
|
danferry |
Posted: Sun Oct 02, 2011 10:53 pm Post subject: |
|
|
Novice
Joined: 11 Jul 2011 Posts: 12
|
Hi Srikanth
You may have to check the 'Records and Elements' section. You should have 'Whole File' selected. The expression=true() should work.
Daniel |
|