MQSeries.net Forum Index :: WebSphere Message Broker Support

dfdl parser reading whole file vs records
jb3
Posted: Fri Aug 18, 2017 7:01 am    Post subject: dfdl parser reading whole file vs records

Newbie

Joined: 18 Aug 2017
Posts: 4

Hi everyone,

I have previously worked with XML Schema (SOAP/files), but I am new to DFDL schemas. I am currently working on a requirement that involves reading from a file and building an XML message for each record after some transformation.

I am stuck on the following: DFDL parsing succeeds (in the Toolkit) for the whole file. But when I configure the FileInput node to read each record, the first record is parsed, but at the end I also get DFDL parsing errors:

Text:CHARACTER:An error occurred whilst parsing with DFDL
Insert Type:INTEGER:5
Text:CHARACTER:CTDP3058E: Separator '%CR;%LF;%WSP*;' not found at offset '282' for sequence or choice within element '/schema[1]'.


The File is TAB delimited and the structure is:
Column1<tab>Column2<tab>Column...40
Data1<tab>Date2<tab>Data...40
Data1<tab>Date2<tab>Data...40
Data1<tab>Date2<tab>Data...40
Data1<tab>Date2<tab>Data...40
<emptyline>


Some of the columns are optional.

The schema looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/" xmlns:ibmDfdlExtn="http://www.ibm.com/dfdl/extensions" xmlns:ibmSchExtn="http://www.ibm.com/schema/extensions" xmlns:recSepFieldsFmt="http://www.ibm.com/dfdl/RecordSeparatedFieldFormat">
<xsd:import namespace="http://www.ibm.com/dfdl/RecordSeparatedFieldFormat" schemaLocation="IBMdefined/RecordSeparatedFieldFormat.xsd"/>
<xsd:annotation>
<xsd:appinfo source="http://www.ogf.org/dfdl/">
<dfdl:format encoding="{$dfdl:encoding}" escapeSchemeRef="" occursCountKind="implicit" ref="recSepFieldsFmt:RecordSeparatedFieldsFormat"/>
</xsd:appinfo>
</xsd:annotation>

<xsd:element ibmSchExtn:docRoot="true" name="TESTRecord">
<xsd:complexType>
<xsd:sequence dfdl:initiatedContent="no" dfdl:separator="%CR;%LF;%WSP*;" dfdl:separatorPosition="postfix" dfdl:separatorSuppressionPolicy="anyEmpty">
<xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:initiator="" maxOccurs="unbounded" name="body">
<xsd:complexType>
<xsd:sequence dfdl:separator="%HT;">
<xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data1" type="xsd:string"/>
<xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data2" type="xsd:string"/>
<xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data3" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
</xsd:element>

</xsd:schema>

This should be straightforward, but the more I look at it, the more confusing it gets.
Vitor
Posted: Fri Aug 18, 2017 8:12 am    Post subject: Re: dfdl parser reading whole file vs records

Grand High Poobah

Joined: 11 Nov 2005
Posts: 24614
Location: Ohio, USA

jb3 wrote:
This should be straightforward


It is.

If the DFDL you have works to parse the whole file, use whole-file parsing in the FileInput node and iterate through the records to produce the XML you want.

If you don't want to do that, adapt the DFDL to model a single record and use record based parsing, producing one XML for each record.

But don't expect a DFDL schema that describes the entire file to work when you feed it a single record.
_________________
Honesty is the best policy.
Insanity is the best defence.
timber
Posted: Fri Aug 18, 2017 10:59 am

Shaman

Joined: 25 Aug 2015
Posts: 701

Any reason why you did not use [c o d e] tags when quoting your XSD? It makes it much easier to read...
Code:
<xsd:element ibmSchExtn:docRoot="true" name="TESTRecord">
  <xsd:complexType>
    <xsd:sequence dfdl:initiatedContent="no" dfdl:separator="%CR;%LF;%WSP*;" dfdl:separatorPosition="postfix" dfdl:separatorSuppressionPolicy="anyEmpty">
      <xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:initiator="" maxOccurs="unbounded" name="body">
        <xsd:complexType>
          <xsd:sequence dfdl:separator="%HT;">
            <xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data1" type="xsd:string"/>
            <xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data2" type="xsd:string"/>
            <xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data3" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
timber
Posted: Fri Aug 18, 2017 11:06 am

Shaman

Joined: 25 Aug 2015
Posts: 701

You have to think about what the parser can see. I expect you have set the input node to split the file on line breaks, right? In which case, you will be getting individual lines propagated into your message flow. Including the blank line at the end.
My theory is that the DFDL parser is happily dealing with everything until the final blank line. It then complains because there is no content (no tab separators at all).

One more thing, and it's quite important. You have set separatorSuppressionPolicy to 'anyEmpty'. I hope you know what you're doing, because I wouldn't dare to use that setting. I recommend that you set it to 'trailingEmpty' and then set minOccurs to 0 on any optional columns. Otherwise the DFDL parser will expect any missing column to be *completely* omitted (not even a tab separator to mark its passing). I cannot imagine that you want that behaviour.
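
As a rough sketch of that suggestion, the inner tab-separated sequence might look something like the fragment below. The thread does not say which columns are optional, so marking Data2 and Data3 with minOccurs="0" is purely an assumption for illustration; the remaining attributes are copied from the posted schema.

```xml
<!-- Sketch only: treating Data2 and Data3 as the optional columns is an assumption. -->
<xsd:sequence dfdl:separator="%HT;" dfdl:separatorSuppressionPolicy="trailingEmpty">
  <!-- Mandatory first column -->
  <xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data1" type="xsd:string"/>
  <!-- Optional column in the middle of the record (assumed) -->
  <xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" minOccurs="0" name="Data2" type="xsd:string"/>
  <!-- Optional column at the end of the record (assumed) -->
  <xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" minOccurs="0" name="Data3" type="xsd:string"/>
</xsd:sequence>
```

The intent is that a missing column in the middle of a line still leaves its tab in place, while missing columns at the end of a line may be dropped together with their tabs. It would be worth confirming this against sample records in the toolkit before relying on it.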
jb3
Posted: Mon Aug 21, 2017 7:04 am

Newbie

Joined: 18 Aug 2017
Posts: 4

Thanks Timber and Vitor,

I modified the schema as shown below, and now I am able to read one record at a time.

Code:
<xsd:element ibmSchExtn:docRoot="true" name="TESTRecord">
  <xsd:complexType>
    <xsd:sequence dfdl:initiatedContent="no" dfdl:separatorPosition="postfix" dfdl:separatorSuppressionPolicy="anyEmpty">
      <xsd:element dfdl:emptyValueDelimiterPolicy="none" name="body">
        <xsd:complexType>
          <xsd:sequence dfdl:separator="%HT;">
            <xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data1" type="xsd:string"/>
            <xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data2" type="xsd:string"/>
            <xsd:element dfdl:emptyValueDelimiterPolicy="none" dfdl:nilValueDelimiterPolicy="none" name="Data3" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
Copyright MQSeries.net. All rights reserved.