Author |
Message
|
akil |
Posted: Sat Mar 07, 2015 6:05 pm Post subject: IIB9: FileInput / Record Detection : Delimited, CROnly |
|
|
 Partisan
Joined: 27 May 2014 Posts: 338 Location: Mumbai
|
Hi,
The "DOS or UNIX Line End" option of the Record Detection property understands <CR><LF> as well as <LF>, but it seems to ignore (or fail to detect) <CR>-only files.
<CR>-only files are not common; I think they are generated by some Mac versions (http://www.editpadpro.com/manual/convertmac.html).
Is there a way around this?
We get various files: some with <CR><LF>, some with <CR>, some with <LF>. Short of preprocessing the files with sed (the broker is on Linux), is there any other option?
The files are large, so I have to use the FileInput node to read them in chunks, and they aren't fixed-length records, so I'm left with Delimited as the only viable option.
Suggestions, please?
_________________
Regards
|
nelson |
Posted: Sat Mar 07, 2015 9:41 pm Post subject: |
|
|
 Partisan
Joined: 02 Oct 2012 Posts: 313
|
The KC says this:
Quote: |
DOS or UNIX Line End, which, on UNIX systems, specifies the line feed character (<LF>, X'0A'), and, on Windows systems, specifies a carriage return character followed by a line feed character (<CR><LF>, X'0D0A'). The node treats both of these strings as delimiters, irrespective of the system on which the broker is running. If they are both in the same file, the node recognizes both as delimiters. |
So in fact <CR>-delimited records will not be detected by the "DOS or UNIX Line End" option. You will need to use '0D' as a custom delimiter.
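To make the difference concrete, here is a minimal Python sketch (not broker code) that mimics the two behaviours on an in-memory buffer. The helper names `split_dos_or_unix` and `split_custom` are hypothetical and only approximate what the node does as described in the quote above; the custom delimiter is passed as a string of hex digits such as '0D'.

```python
import re


def _drop_trailing_empty(records: list[bytes]) -> list[bytes]:
    # A delimiter at the very end of the data leaves one empty piece; drop it.
    return records[:-1] if records and records[-1] == b"" else records


def split_dos_or_unix(data: bytes) -> list[bytes]:
    """Approximate 'DOS or UNIX Line End': only CRLF and LF act as delimiters."""
    return _drop_trailing_empty(re.split(rb"\r\n|\n", data))


def split_custom(data: bytes, delimiter_hex: str) -> list[bytes]:
    """Approximate a custom delimiter given as an even-length hex string."""
    delimiter = bytes.fromhex(delimiter_hex)
    return _drop_trailing_empty(data.split(delimiter))


# Sample buffers (made up for illustration).
cr_only = b"rec1\rrec2\rrec3\r"
mixed = b"rec1\r\nrec2\nrec3\n"

dos_unix_on_cr_only = split_dos_or_unix(cr_only)  # a single undivided record
custom_on_cr_only = split_custom(cr_only, "0D")   # three records
dos_unix_on_mixed = split_dos_or_unix(mixed)      # three records
```

With the CR-only buffer, the CRLF-or-LF rule never finds a delimiter, so the whole buffer comes back as one record, while the '0D' custom delimiter splits it into three.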
|
akil |
Posted: Sun Mar 08, 2015 3:19 am Post subject: |
|
|
 Partisan
Joined: 27 May 2014 Posts: 338 Location: Mumbai
|
How does one specify an OR condition in the custom delimiter?
Need to specify CR or CRLF or LF..
_________________
Regards
|
nelson |
Posted: Sun Mar 08, 2015 6:10 am Post subject: |
|
|
 Partisan
Joined: 02 Oct 2012 Posts: 313
|
akil wrote: |
How does one specify an OR condition in the custom delimiter?
Need to specify CR or CRLF or LF.. |
I'm afraid this is not possible. I'm assuming your files all have the same name/pattern. So you may want to handle the CR files in another flow (with a different file name/pattern), or change your record detection to "Whole File" and do the record detection yourself.
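As a rough idea of what doing the record detection yourself involves, here is a minimal Python sketch of the scanning logic. It is not broker code: inside a flow the same logic would have to be written in ESQL or a JavaCompute node, and the function name `iter_records` and the `chunk_size` parameter are hypothetical. The one subtle case is a CRLF pair that straddles two chunks, so a trailing CR is held back until the next chunk arrives.

```python
import io
import re
from typing import BinaryIO, Iterator


def iter_records(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield records delimited by CRLF, CR, or LF, reading the stream in chunks."""
    pending = b""  # bytes of the record that is still being assembled
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # A trailing CR might be the first half of a CRLF split across chunks,
        # so keep it out of this scan and decide once more data has arrived.
        held_cr = pending.endswith(b"\r")
        work = pending[:-1] if held_cr else pending
        parts = re.split(rb"\r\n|\r|\n", work)
        # Every part except the last is a complete record.
        for record in parts[:-1]:
            yield record
        pending = parts[-1] + (b"\r" if held_cr else b"")
    if pending:
        # At end of input, a held-back CR is a delimiter in its own right.
        yield pending[:-1] if pending.endswith(b"\r") else pending


# Made-up sample mixing all three line endings, with no final delimiter.
sample = b"a\r\nb\rc\nd"
records_small_chunks = list(iter_records(io.BytesIO(sample), chunk_size=1))
records_one_chunk = list(iter_records(io.BytesIO(sample)))
```

Reading one byte at a time and reading everything in a single chunk should give the same four records, which is a quick way to check the chunk-boundary handling. Note that this sketch rescans the pending bytes on every chunk, so a real implementation for very long records would want to remember how far it has already scanned.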
|
mqjeff |
Posted: Sun Mar 08, 2015 11:39 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
akil wrote: |
How does one specify an OR condition in the custom delimiter?
Need to specify CR or CRLF or LF.. |
You need to use the appropriate special character for DFDL that indicates a newline, rather than a specific CR or CRLF or LF. |
|
akil |
Posted: Sun Mar 08, 2015 9:52 pm Post subject: |
|
|
 Partisan
Joined: 27 May 2014 Posts: 338 Location: Mumbai
|
Hi
The FileInput node documentation says that the custom delimiter has to be specified as hex.
Quote: |
In Custom delimiter, specify the delimiter byte or bytes to be used when Custom delimiter is set in the Delimiter property. Specify this value as an even-numbered string of hexadecimal digits. The default is X'0A' and the maximum length of the string is 16 bytes (represented by 32 hexadecimal digits).
|
The documentation does not mention whether there's a way to specify multiple delimiters. I would need X'0D0A' or X'0D' or X'0A'.
The other option, 'DOS or UNIX Line End', checks for X'0D0A' or X'0A'. It seems to ignore X'0D'.
I could use 'Parsed Record Sequence' and let the DFDL model detect the record, but I was hoping to avoid this (as the inputs are large files).
_________________
Regards
|
shanson |
Posted: Mon Mar 09, 2015 1:58 am Post subject: |
|
|
 Partisan
Joined: 17 Oct 2003 Posts: 344 Location: IBM Hursley
|
If Record Detection 'Delimited' does not give you the flexibility you need then, as mqjeff says, you can use Record Detection 'Parsed Record Sequence' with a DFDL model that uses %NL; as the delimiter (which accepts all new line variants).
|
akil |
Posted: Mon Mar 09, 2015 6:21 am Post subject: |
|
|
 Partisan
Joined: 27 May 2014 Posts: 338 Location: Mumbai
|
Thanks, will use the Parsed Record Sequence.
I asked the question to find out if there's a way to handle DOS, UNIX, or Mac format in the FileInput node itself. It looks like there isn't.
_________________
Regards
|