dfdl parser reading whole file vs records |
jb3
Posted: Fri Aug 18, 2017 7:01 am Post subject: dfdl parser reading whole file vs records
Apprentice
Joined: 18 Aug 2017 Posts: 26
Hi everyone,
I have previously worked with XML schemas (SOAP/files), but I am new to DFDL schemas. I am currently working on a requirement that involves reading from a file and, after some transformation, building an XML message for each record.
What I am stuck on: DFDL parsing succeeds (in the Toolkit) for the whole file. But when I configure the FileInput node to read one record at a time, the first record is parsed, but at the end I also get DFDL parsing errors:
Text:CHARACTER:An error occurred whilst parsing with DFDL
Insert Type:INTEGER:5
Text:CHARACTER:CTDP3058E: Separator '%CR;%LF;%WSP*;' not found at offset '282' for sequence or choice within element '/schema[1]'.
The file is tab-delimited and its structure is:
Column1<tab>Column2<tab>Column...40
Data1<tab>Date2<tab>Data...40
Data1<tab>Date2<tab>Data...40
Data1<tab>Date2<tab>Data...40
Data1<tab>Date2<tab>Data...40
<emptyline>
Some of the columns are optional.
The schema looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/" xmlns:ibmDfdlExtn="http://www.ibm.com/dfdl/extensions" xmlns:ibmSchExtn="http://www.ibm.com/schema/extensions" xmlns:recSepFieldsFmt="http://www.ibm.com/dfdl/RecordSeparatedFieldFormat">
<xsd:import namespace="http://www.ibm.com/dfdl/RecordSeparatedFieldFormat" schemaLocation="IBMdefined/RecordSeparatedFieldFormat.xsd"/>
<xsd:annotation>
<xsd:appinfo source="http://www.ogf.org/dfdl/">
<dfdl:format encoding="{$dfdl:encoding}" escapeSchemeRef="" occursCountKind="implicit" ref="recSepFieldsFmt:RecordSeparatedFieldsFormat"/>
</xsd:appinfo>
</xsd:annotation>
<xsd:element ibmSchExtn:docRoot="true" name="TESTRecord">
<xsd:complexType>
<xsd:sequence dfdl:initiatedContent="no" dfdl:separator="%CR;%LF;%WSP*;" dfdl:separatorPosition="postfix" dfdl:separatorSuppressionPolicy="anyEmpty">
<xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:initiator="" maxOccurs="unbounded" name="body">
<xsd:complexType>
<xsd:sequence dfdl:separator="%HT;">
<xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data1" type="xsd:string"/>
<xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data2" type="xsd:string"/>
<xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data3" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>
This should be straightforward, but the more I look at it, the more confusing it gets.
Vitor
Posted: Fri Aug 18, 2017 8:12 am Post subject: Re: dfdl parser reading whole file vs records
Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
jb3 wrote:
This should be straightforward
It is.
If the DFDL you have works to parse the whole file, use whole file passing in the FileInput node and iterate through the records to produce the XML you want.
If you don't want to do that, adapt the DFDL to model a single record and use record based parsing, producing one XML for each record.
But don't expect a DFDL schema that describes the entire file to work when you feed it a single record.
timber
Posted: Fri Aug 18, 2017 10:59 am
Grand Master
Joined: 25 Aug 2015 Posts: 1292
Any reason why you did not use [c o d e] tags when quoting your XSD? It makes it much easier to read...
Code:
<xsd:element ibmSchExtn:docRoot="true" name="TESTRecord">
  <xsd:complexType>
    <xsd:sequence dfdl:initiatedContent="no" dfdl:separator="%CR;%LF;%WSP*;" dfdl:separatorPosition="postfix" dfdl:separatorSuppressionPolicy="anyEmpty">
      <xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:initiator="" maxOccurs="unbounded" name="body">
        <xsd:complexType>
          <xsd:sequence dfdl:separator="%HT;">
            <xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data1" type="xsd:string"/>
            <xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data2" type="xsd:string"/>
            <xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data3" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
timber
Posted: Fri Aug 18, 2017 11:06 am
Grand Master
Joined: 25 Aug 2015 Posts: 1292
You have to think about what the parser can see. I expect you have set the input node to split the file on line breaks, right? In which case, you will be getting individual lines propagated into your message flow, including the blank line at the end.
My theory is that the DFDL parser is happily dealing with everything until the final blank line. It then complains because there is no content (no tab separators at all).
One more thing, and it's quite important. You have set separatorSuppressionPolicy to 'anyEmpty'. I hope you know what you're doing, because I wouldn't dare to use that setting. I recommend that you set it to 'trailingEmpty' and then set minOccurs to 0 on any optional columns. Otherwise the DFDL parser will expect any missing column to be *completely* omitted (not even a tab separator to mark its passing). I cannot imagine that you want that behaviour.
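As a rough sketch of that suggestion (not taken from the posted schema), the tab-separated sequence might look something like the fragment below. It assumes that Data3 is one of the optional columns, which the thread does not actually say, and that the property is set on the inner tab-separated sequence. The xsd: and dfdl: prefixes are the ones declared in the schema posted earlier.

```xml
<!-- Sketch only. Assumption: Data3 is an optional column (the thread does not say which ones are). -->
<!-- Fragment of the "body" complex type; the xsd: and dfdl: prefixes come from the enclosing schema. -->
<xsd:sequence dfdl:separator="%HT;" dfdl:separatorSuppressionPolicy="trailingEmpty">
  <xsd:element name="Data1" type="xsd:string"/>
  <xsd:element name="Data2" type="xsd:string"/>
  <!-- minOccurs="0" marks the column as optional -->
  <xsd:element name="Data3" type="xsd:string" minOccurs="0"/>
</xsd:sequence>
```

With a model along these lines, only the tabs for missing columns at the end of a record should be droppable; an optional column that is empty in the middle of a record would still need its tab so that the later columns stay in position.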
jb3
Posted: Mon Aug 21, 2017 7:04 am
Apprentice
Joined: 18 Aug 2017 Posts: 26
Thanks Timber and Vitor,
I modified the schema as below, and now I am able to read 1 record at a time.
Code:
<xsd:element ibmSchExtn:docRoot="true" name="TESTRecord">
  <xsd:complexType>
    <xsd:sequence dfdl:initiatedContent="no" dfdl:separatorPosition="postfix" dfdl:separatorSuppressionPolicy="anyEmpty">
      <xsd:element dfdl:emptyValueDelimiterPolicy="none" name="body">
        <xsd:complexType>
          <xsd:sequence dfdl:separator="%HT;">
            <xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data1" type="xsd:string"/>
            <xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data2" type="xsd:string"/>
            <xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data3" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>