MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » DFDL parsing csv file error

DFDL parsing csv file error
icyblue7
Posted: Fri Sep 02, 2016 5:28 am    Post subject: DFDL parsing csv file error

Newbie

Joined: 02 Sep 2016
Posts: 3

Hello,

I created a DFDL model with CommaSeparatedFormat to parse the sample CSV file below:

Code:
"1467 27497 89062","A",116274,8,"17380923"
"1467 27497 89062","F","3","231333302 863000","","2"
"1467 27497 89062","F","2","231333325 051000","","2"
"1467 27497 89062","B","3","231333302 865000","","2","1"
"1467 27497 89062","Z","9360722 16233",6



Here is the XSD file:

Code:
<xsd:import namespace=".../CommaSeparatedFormat" schemaLocation="IBMdefined/CommaSeparatedFormat.xsd"/>
<xsd:annotation>
  <xsd:appinfo source=".../dfdl/">
    <dfdl:format documentFinalTerminatorCanBeMissing="yes" encoding="{$dfdl:encoding}" escapeSchemeRef="csv:CSVEscapeScheme" ref="csv:CommaSeparatedFormat"/>
  </xsd:appinfo>
</xsd:annotation>
<xsd:element dfdl:outputNewLine="%LF;" ibmSchExtn:docRoot="true" name="SAMPLE">
  <xsd:complexType>
    <xsd:sequence dfdl:separator="%LF;%WSP*;" dfdl:separatorSuppressionPolicy="anyEmpty">
      <xsd:element name="A_Line">
        <xsd:complexType>
          <xsd:sequence dfdl:outputNewLine="%LF;" dfdl:separator=",">
            <xsd:element default="" minOccurs="0" name="ACCT" type="xsd:string"/>
            <xsd:element default="" minOccurs="0" name="FILLER1" type="xsd:string"/>
            <xsd:element default="" minOccurs="0" name="FILLER2" type="xsd:string"/>
            <xsd:element default="" minOccurs="0" name="FILLER3" type="xsd:string"/>
            <xsd:element default="" minOccurs="0" name="FILLER4" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      <xsd:element dfdl:terminator="" maxOccurs="unbounded" minOccurs="0" name="idoc">
        <xsd:complexType>
          <xsd:sequence dfdl:separator="%LF;%WSP*;">
            <xsd:element maxOccurs="unbounded" minOccurs="0" name="lines">
              <xsd:complexType>
                <xsd:sequence dfdl:separator="%LF;%WSP*;">
                  <xsd:choice dfdl:terminator="">
                    <xsd:element dfdl:initiator="{fn:concat(/SAMPLE/A_Line/ACCT,',F,')}" dfdl:outputNewLine="%LF;" dfdl:terminator="" name="F_Line">
                      <xsd:complexType>
                        <xsd:sequence>
                          <xsd:element minOccurs="0" name="FILLER1" type="xsd:string"/>
                          <xsd:element minOccurs="0" name="FILLER2" type="xsd:string"/>
                          <xsd:element minOccurs="0" name="FILLER3" type="xsd:string"/>
                        </xsd:sequence>
                      </xsd:complexType>
                    </xsd:element>
                    <xsd:element dfdl:initiator="{fn:concat(/SAMPLE/A_Line/ACCT,',Y,')}" dfdl:outputNewLine="%LF;" dfdl:terminator="" name="Y_Line">
                      <xsd:complexType>
                        <xsd:sequence>
                          <xsd:element minOccurs="0" name="FILLER1" type="xsd:string"/>
                          <xsd:element minOccurs="0" name="FILLER2" type="xsd:string"/>
                          <xsd:element minOccurs="0" name="FILLER3" type="xsd:string"/>
                          <xsd:element minOccurs="0" name="FILLER4" type="xsd:string"/>
                          <xsd:element minOccurs="0" name="FILLER5" type="xsd:string"/>
                        </xsd:sequence>
                      </xsd:complexType>
                    </xsd:element>
                  </xsd:choice>
                </xsd:sequence>
              </xsd:complexType>
            </xsd:element>
            <xsd:element dfdl:initiator="{fn:concat(/SAMPLE/A_Line/ACCT,',B,')}" name="B_Line">
              <xsd:complexType>
                <xsd:sequence>
                  <xsd:element minOccurs="0" name="FILLER1" type="xsd:string"/>
                  <xsd:element minOccurs="0" name="FILLER2" type="xsd:string"/>
                  <xsd:element minOccurs="0" name="FILLER3" type="xsd:string"/>
                  <xsd:element minOccurs="0" name="FILLER4" type="xsd:string"/>
                  <xsd:element minOccurs="0" name="FILLER5" type="xsd:string"/>
                </xsd:sequence>
              </xsd:complexType>
            </xsd:element>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      <xsd:element name="Footer">
        <xsd:complexType>
          <xsd:sequence dfdl:separator="%LF;%WSP*;">
            <xsd:choice>
              <xsd:element dfdl:initiator="{fn:concat(/SAMPLE/A_Line/ACCT,',Y,')}" name="Y_Line">
                <xsd:complexType>
                  <xsd:sequence dfdl:separator="%LF;%WSP*;">
                    <xsd:element minOccurs="0" name="FILLER1" type="xsd:string"/>
                    <xsd:element minOccurs="0" name="FILLER2" type="xsd:string"/>
                    <xsd:element minOccurs="0" name="FILLER3" type="xsd:string"/>
                    <xsd:element minOccurs="0" name="FILLER4" type="xsd:string"/>
                    <xsd:element minOccurs="0" name="FILLER5" type="xsd:string"/>
                  </xsd:sequence>
                </xsd:complexType>
              </xsd:element>
              <xsd:element dfdl:initiator="{fn:concat(/SAMPLE/A_Line/ACCT,',Z,')}" dfdl:outputNewLine="%LF;" name="Z_Line">
                <xsd:complexType>
                  <xsd:sequence dfdl:separator="%LF;%WSP*;">
                    <xsd:element minOccurs="0" name="FILLER1" type="xsd:string"/>
                    <xsd:element minOccurs="0" name="FILLER2" type="xsd:string"/>
                  </xsd:sequence>
                </xsd:complexType>
              </xsd:element>
            </xsd:choice>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>


And here is the exception:

Code:
info: Calculating value of DFDL property 'initiator' using DFDL expression '{fn:concat(/SAMPLE/A_Line/ACCT,',F,')}'. The calculated value was '1467 27497 89062,F,'
[dfdl = /LIBRARY/SAMPLE.xsd, scd = #xscd(/schemaElement::SAMPLE/type::0/model::sequence/schemaElement::idoc/type::0/model::sequence/schemaElement::lines/type::0/model::sequence/model::choice/schemaElement::F_Line), 166]

error: CTDP3041E: Initiator '1467' not found at offset '43'  for element '/SAMPLE[1]/idoc[1]/lines[1]/F_Line[1]'.

info: Offset: 43. Parser was unable to resolve data on the current branch and will evaluate the next available branch beginning at offset '43' owned by the 'choice' group contained within element 'sequence'.
[dfdl = /LIBRARY/SAMPLE.xsd, scd = #xscd(/schemaElement::SAMPLE/type::0/model::sequence/schemaElement::idoc/type::0/model::sequence/schemaElement::lines/type::0/model::sequence/model::choice), 209]
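
A possible reading of the CTDP3041E message, offered as a sketch rather than a confirmed diagnosis: the calculated initiator contains spaces, and DFDL treats an initiator value as a whitespace-separated list of alternative literals, which would explain why the error names only '1467'. In addition, the raw data at offset 43 starts with a double quote, which the parsed ACCT value no longer carries. The Python below only imitates that matching outside the broker. The names `candidates`, `rest` and `matched` are illustrative, and LF line endings are assumed.

```python
# Illustration only: imitate how the calculated initiator is compared with the raw data.
data = (
    '"1467 27497 89062","A",116274,8,"17380923"\n'
    '"1467 27497 89062","F","3","231333302 863000","","2"\n'
)

# ACCT as reported in the trace, i.e. without the surrounding quotes.
acct = "1467 27497 89062"
initiator = acct + ",F,"  # value of fn:concat(/SAMPLE/A_Line/ACCT,',F,')

# Assumption: whitespace in an initiator separates alternative literals.
candidates = initiator.split()

# With LF line endings the second record starts right after the first newline.
offset = data.index("\n") + 1
rest = data[offset:]

# None of the alternatives can match, because the record starts with a quote.
matched = [c for c in candidates if rest.startswith(c)]
print(offset, candidates, matched)
```

Under these assumptions the second record starts at offset 43, the same offset the trace reports, and none of the three candidate literals matches the quoted data.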
Vitor
Posted: Fri Sep 02, 2016 5:37 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

Why are you using initiators?
_________________
Honesty is the best policy.
Insanity is the best defence.
icyblue7
Posted: Fri Sep 02, 2016 6:28 am

This is the structure of the input:
- first line is the header (record "A")
- detail contains 1..* line items (record "F") and 1..* additional Record "Y"
- summary record "B"
- last line is the trailer (record "Z")
mqjeff
Posted: Fri Sep 02, 2016 6:39 am

Grand Master

Joined: 25 Jun 2008
Posts: 17447

You should construct your model this way:
  • Acct#
    • Choice Group
      • 'A' header record
      • 'F' record
        • 'Y' optional record
      • 'B' record
      • 'Z' record

The F record should be a complex type that repeats an unlimited # of times.
_________________
chmod -R ugo-wx /
icyblue7
Posted: Fri Sep 02, 2016 6:45 am

mqjeff wrote:
You should construct your model this way:
  • Acct#
    • Choice Group
      • 'A' header record
      • 'F' record
        • 'Y' optional record
      • 'B' record
      • 'Z' record

The F record should be a complex type that repeats an unlimited # of times.


but my input is like this

"1467 27497 89062","A",116274,8,"17380923"
"1467 27497 89062","F","3","231333302 863000","","2"
"1467 27497 89062","F","2","231333325 051000","","2"
"1467 27497 89062","B","3","231333302 865000","","2","1"
"1467 27497 89062","Z","9360722 16233",6

So I tried adding initiators like this: fn:concat(/SAMPLE/A_Line/ACCT,',Y,')
mqjeff
Posted: Fri Sep 02, 2016 6:53 am

icyblue7 wrote:
but my input is like this

"1467 27497 89062","A",116274,8,"17380923"


So that's
"1467 27497 89062" == Acct #
"A" == record discriminator
116274,8,"17380923" == fields in Record Type A

The structure I showed always has an Acct # as the first field.

I forgot to indicate that the entire structure repeats an unlimited # of times (until end of file)
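
The breakdown above (Acct #, then a record discriminator, then the type-specific fields) can be sketched outside the broker in Python. This is only an illustration of the record layout, not DFDL; `split_record` is a made-up helper name.

```python
import csv
import io

# The sample data from the first post.
sample = '''"1467 27497 89062","A",116274,8,"17380923"
"1467 27497 89062","F","3","231333302 863000","","2"
"1467 27497 89062","F","2","231333325 051000","","2"
"1467 27497 89062","B","3","231333302 865000","","2","1"
"1467 27497 89062","Z","9360722 16233",6
'''

def split_record(row):
    """Return (acct, record_type, remaining_fields) for one CSV row."""
    acct, record_type, *fields = row
    return acct, record_type, fields

# csv.reader removes the quotes, much like the CSV escape scheme does in the model.
records = [split_record(row) for row in csv.reader(io.StringIO(sample)) if row]

for acct, record_type, fields in records:
    print(acct, record_type, fields)
```

Every row yields the same Acct # in first position, and the second value alone decides how the remaining fields should be interpreted.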
Vitor
Posted: Fri Sep 02, 2016 7:09 am

mqjeff wrote:
icyblue7 wrote:
but my input is like this

"1467 27497 89062","A",116274,8,"17380923"


So that's
"1467 27497 89062" == Acct #
"A" == record discriminator
116274,8,"17380923" == fields in Record Type A

The structure I showed always has an Acct # as the first field.

I forgot to indicate that the entire structure repeats an unlimited # of times (until end of file)

You don't need initiators to model this.
fjb_saper
Posted: Fri Sep 02, 2016 8:16 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

What you want to use (and mqjeff has given you a huge hint) are discriminators...
_________________
MQ & Broker admin
timber
Posted: Sat Sep 03, 2016 2:17 am

Grand Master

Joined: 25 Aug 2015
Posts: 1292

There are two ways to solve this problem:
1. Organise the CSV records into structures using DFDL. To do this you need to use discriminators in the model because the field that indicates the record type is not the first field of the record.
2. Parse the CSV as an unbounded list of records and do the record type recognition in the message flow logic.

Personally I would probably choose 2, unless
a) the input CSV is large, so that Record detection (on the input node's Records and Elements tab) needs to be set to 'Parsed Record Sequence'
and
b) there might be a need to run more than one instance of the message flow
(because then it would not be possible to use a SHARED variable to hold the record type).
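
As a rough illustration of option 2, here is a Python stand-in for the message flow logic (in a real flow this would be ESQL or Java). The CSV is read as a flat list of records and the structure is rebuilt from the record type in the second field. The function name `group_records` and the keys `header`, `details`, `summary` and `trailer` are invented for this sketch.

```python
import csv
import io

# The sample data from the first post.
sample = '''"1467 27497 89062","A",116274,8,"17380923"
"1467 27497 89062","F","3","231333302 863000","","2"
"1467 27497 89062","F","2","231333325 051000","","2"
"1467 27497 89062","B","3","231333302 865000","","2","1"
"1467 27497 89062","Z","9360722 16233",6
'''

def group_records(text):
    """Rebuild header / details / summary / trailer from a flat list of CSV rows."""
    doc = {"header": None, "details": [], "summary": None, "trailer": None}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        record_type, fields = row[1], row[2:]
        if record_type == "A":
            doc["header"] = fields
        elif record_type in ("F", "Y"):
            doc["details"].append((record_type, fields))
        elif record_type == "B":
            doc["summary"] = fields
        elif record_type == "Z":
            doc["trailer"] = fields
        else:
            raise ValueError(f"unexpected record type {record_type!r}")
    return doc

doc = group_records(sample)
print(doc)
```

Here the DFDL model stays as simple as the CSV wizard output, and all knowledge of record types lives in the flow logic.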
mqjeff
Posted: Tue Sep 06, 2016 4:36 am

timber wrote:
There are two ways to solve this problem:
1. Organise the CSV records into structures using DFDL. To do this you need to use discriminators in the model because the field that indicates the record type is not the first field of the record.

I believe that my suggested model structure addressed this? The acct # is a fixed field and always occurs. The choice structure then resolves based on the first field of the remaining line?
timber
Posted: Tue Sep 06, 2016 6:55 am

@mqjeff: Consider these options:
a) Use the built-in CSV wizard. Accept that the message tree will be an unbounded list of records. The record type of each record must be detected by inspecting the value of its second child.
b) Use your suggested model structure. Accept that the message tree will be an unbounded list of records. The record type of each record must be detected by inspecting the name of its second child.
c) Use discriminators so that DFDL builds a structured message tree. Accept that the DFDL model generated by the CSV wizard will need to be heavily customized. On the other hand, Parsed Record Sequence can now be used to process a huge input file without requiring the message flow to hold the record type in a SHARED variable.

I think b) requires adjustments to the DFDL and produces a more complex message tree without offering any compensations. But I may be missing something - it wouldn't be the first time.
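
To make the difference between a) and b) concrete, here is a small Python sketch of the two message tree shapes, using ordered dicts in place of the tree. The element names (`field1`, `Acct`, `F_Record`, `FILLER1`) are hypothetical and not taken from any generated model.

```python
# Option a): generic CSV record. The type is the VALUE of the second child.
record_a = {"field1": "1467 27497 89062", "field2": "F", "field3": "3"}

# Option b): Acct # followed by a choice branch. The type is the NAME of the second child.
record_b = {"Acct": "1467 27497 89062", "F_Record": {"FILLER1": "3"}}

def type_by_value(record):
    """Option a): read the record type from the second child's value."""
    return list(record.values())[1]

def type_by_name(record):
    """Option b): read the record type from the second child's name.

    Assumes a hypothetical naming convention of "<type>_Record".
    """
    return list(record)[1].split("_")[0]

print(type_by_value(record_a), type_by_name(record_b))
```

Both shapes are still flat lists of records, so in either case the flow logic has to work out which records belong together.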
mqjeff
Posted: Tue Sep 06, 2016 7:31 am

Perhaps my structure is a bit too nested.

My intent is that the Acct # is a peer of the choice structure, as part of an ordered sequence. The ordered sequence would repeat until the end of file.

The choice structure would be resolved by using the first field of the remaining record (everything *after* the acct#) as a discriminator/indicator/whatever the correct DFDL name is.

Perhaps the A, B, and Z records could be removed from the choice, and modeled as records that do not repeat.
shanson
Posted: Wed Sep 07, 2016 9:32 am

Partisan

Joined: 17 Oct 2003
Posts: 344
Location: IBM Hursley

Quote:
Perhaps the A, B, and Z records could be removed from the choice, and modeled as records that do not repeat


That's how I would do it. The choice model is flexible but if you want strict validation then it won't detect out-of-order records.

Quote:
- detail contains 1..* line items (record "F") and 1..* additional Record "Y"


Do all the Fs appear before all the Ys or are they interleaved? If they all appear before, you don't need a choice at all.
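
The ordering point can be illustrated with a small Python check on the sequence of record types. It assumes single-character type codes and the non-interleaved case: one A, one or more F, then any Y records, then one B and one Z. The pattern and the function name `in_expected_order` are illustrative only.

```python
import re

# Assumption: A, then one or more F, then zero or more Y, then B, then Z.
STRICT_ORDER = re.compile(r"AF+Y*BZ")

def in_expected_order(record_types):
    """Return True if the record types appear in the assumed strict order."""
    return STRICT_ORDER.fullmatch("".join(record_types)) is not None

print(in_expected_order(["A", "F", "F", "B", "Z"]))  # sample file from the first post
print(in_expected_order(["A", "F", "B", "F", "Z"]))  # an F after the summary record
```

A model built as a plain sequence enforces this order while parsing. A repeating choice would accept both inputs and leave the order check to the flow logic.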