MQSeries.net Forum Index > WebSphere Message Broker Support

DFDL parser for unbounded records with delimited fields

rahulk01
Posted: Thu Dec 26, 2019 11:53 am

Novice

Joined: 26 Dec 2019
Posts: 13

Hi,
I have to generate a DFDL parser for a format that contains some unbounded records within a sequence, like the following:
Transaction
sequence
Record1 1,1
Record2 0,unbounded
Record3 1,2
Record4 1,1
end of sequence
end of Transaction

The parser that I have created looks like this:
<xsd:element ibmSchExtn:docRoot="true" name="Message">
  <xsd:complexType>
    <xsd:sequence dfdl:separator="">
      <xsd:element dfdl:outputNewLine="{$dfdl:outputNewLine}" dfdl:terminator="%CR;%LF;" name="FileHeader" type="HeaderRecord"/>
      <xsd:element name="Bundle">
        <xsd:complexType>
          <xsd:sequence dfdl:separator="">
            <xsd:element dfdl:terminator="%CR;%LF;" name="BundleHeader" type="BundleHeaderRecord"/>
            <xsd:element dfdl:occursCountKind="implicit" maxOccurs="unbounded" name="Transactions" type="Transaction"/>
            <xsd:element dfdl:terminator="%CR;%LF;" name="BundleTrailer" type="BundleTrailerRecord"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      <xsd:element dfdl:initiator="" name="FileTrailer" type="TrailerRecord"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
<xsd:complexType name="Transaction">
  <xsd:sequence dfdl:separator="">
    <xsd:element dfdl:terminator="%CR;%LF;" name="BelopInfo" type="BelopRecord"/>
    <xsd:element dfdl:occursCountKind="implicit" dfdl:terminator="%CR;%LF;" maxOccurs="2" minOccurs="0" name="AccountPrintInfo" type="AccountPrintRecord"/>
    <xsd:element dfdl:occursCountKind="implicit" dfdl:terminator="" maxOccurs="unbounded" minOccurs="0" name="KidInfo" type="KIDRecord"/>
    <xsd:element dfdl:occursCountKind="implicit" dfdl:terminator="" maxOccurs="2" name="AddressInfo" type="AddressRecord"/>
    <xsd:element dfdl:occursCountKind="implicit" dfdl:terminator="" maxOccurs="2" name="MessageInfo" type="MessageRecord"/>
  </xsd:sequence>
</xsd:complexType>

But when I try to parse a message that has multiple KidInfo records, it parses the first occurrence as KidInfo and the next one as AddressInfo, and then fails (since the structures are different).

I am not sure whether I need to add a descriptor to each record to identify it while parsing. If so, I am not sure how to do it.
Actually, the 8th field in each record (called RECORD-IDENTIFIER) holds the value that determines which record it is, but I am not sure how to use it, as this is my first DFDL parser.

Any help would be greatly appreciated.

BR
Rahul

fjb_saper
Posted: Fri Dec 27, 2019 7:04 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20160
Location: LI,NY

You need to add a DFDL discriminator for each record type, and link that to the record identifier you described. Have fun
_________________
MQ & Broker admin

rahulk01
Posted: Fri Dec 27, 2019 10:18 am

Novice

Joined: 26 Dec 2019
Posts: 13

fjb_saper wrote:
You need to add a DFDL discriminator for each record type, and link that to the record identifier you described. Have fun

Thanks for your reply.
I added the DFDL discriminator, initially for 2 records, AccountPrintInfo and KidInfo.
See below the schema
Code:

<xsd:complexType name="Transaction">
  <xsd:sequence dfdl:separator="">
    <xsd:element dfdl:terminator="%CR;%LF;" name="BelopInfo" type="BelopRecord"/>
    <xsd:element dfdl:occursCountKind="implicit" dfdl:terminator="%CR;%LF;" maxOccurs="2" minOccurs="0" name="AccountPrintInfo" type="AccountPrintRecord">
      <xsd:annotation>
        <xsd:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:discriminator>{fn:contains(/Message/Bundle/Transactions/AccountPrintInfo/RECORD_IDENTIFIER , 'X')}</dfdl:discriminator>
        </xsd:appinfo>
      </xsd:annotation>
    </xsd:element>
    <xsd:element dfdl:occursCountKind="implicit" dfdl:terminator="" maxOccurs="unbounded" minOccurs="0" name="KidInfo" type="KIDRecord">
      <xsd:annotation>
        <xsd:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:discriminator>{/Message/Bundle/Transactions/KidInfo/RECORD_IDENTIFIER eq 'U'}</dfdl:discriminator>
        </xsd:appinfo>
      </xsd:annotation>
    </xsd:element>
    <xsd:element dfdl:occursCountKind="implicit" dfdl:terminator="" maxOccurs="2" name="AddressInfo" type="AddressRecord"/>
    <xsd:element dfdl:occursCountKind="implicit" dfdl:terminator="" maxOccurs="2" name="MessageInfo" type="MessageRecord"/>
  </xsd:sequence>
</xsd:complexType>


AccountPrintInfo has been defined to have 0-2 occurrences, and the next record, KidInfo, can have 0-unbounded occurrences.
The message that I used had 1 AccountPrintInfo followed by a KidInfo, but during parsing the model created 1 AccountPrintInfo successfully and then parsed the next record (which was a KidInfo) as AccountPrintInfo again, even though its RECORD_IDENTIFIER field was 'U'. I am not sure what I am missing.
Thanks in advance for any help.

fjb_saper
Posted: Fri Dec 27, 2019 11:43 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20160
Location: LI,NY

Look at the tutorial for discriminators. RECORD_IDENTIFIER isn't defined anywhere!!! How do you expect the system to recognize the record if the identifier field for the record isn't defined anywhere on the record!!!
_________________
MQ & Broker admin

rahulk01
Posted: Sat Dec 28, 2019 3:13 am

Novice

Joined: 26 Dec 2019
Posts: 13

fjb_saper wrote:
Look at the tutorial for discriminators. RECORD_IDENTIFIER isn't defined anywhere!!! How do you expect the system to recognize the record if the identifier field for the record isn't defined anywhere on the record!!!


Thanks for your lead. I was able to achieve what I was trying to do. The XPath to RECORD_IDENTIFIER in the discriminator was the issue; when I used ./RECORD_IDENTIFIER instead, it worked for me.
And by the way, I did not post the complete schema in my posts, to save space. RECORD_IDENTIFIER is defined as a fixed-length field inside the complex types for ACCOUNTINFO, KIDINFO and many others.
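
For readers with the same problem, the fix described above might look roughly like the fragment below. It is a sketch built only from the element and type names posted earlier in this thread, not the poster's actual schema. The namespace declarations are added just to keep the fragment well-formed, and the 'X' and 'U' values are taken from the earlier attempt. A relative path such as ./RECORD_IDENTIFIER refers to the child of the occurrence being parsed, which is presumably why it behaves differently from the absolute path.

```xml
<!-- Sketch only: these elements would sit inside the Transaction sequence shown earlier. -->
<xsd:sequence xmlns:xsd="http://www.w3.org/2001/XMLSchema"
              xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
              dfdl:separator="">
  <xsd:element name="AccountPrintInfo" type="AccountPrintRecord"
               minOccurs="0" maxOccurs="2"
               dfdl:occursCountKind="implicit" dfdl:terminator="%CR;%LF;">
    <xsd:annotation>
      <xsd:appinfo source="http://www.ogf.org/dfdl/">
        <!-- Relative path: tests the RECORD_IDENTIFIER of this occurrence. -->
        <dfdl:discriminator>{fn:contains(./RECORD_IDENTIFIER, 'X')}</dfdl:discriminator>
      </xsd:appinfo>
    </xsd:annotation>
  </xsd:element>
  <xsd:element name="KidInfo" type="KIDRecord"
               minOccurs="0" maxOccurs="unbounded"
               dfdl:occursCountKind="implicit" dfdl:terminator="">
    <xsd:annotation>
      <xsd:appinfo source="http://www.ogf.org/dfdl/">
        <!-- If this is false, the parser stops treating the data as a KidInfo. -->
        <dfdl:discriminator>{./RECORD_IDENTIFIER eq 'U'}</dfdl:discriminator>
      </xsd:appinfo>
    </xsd:annotation>
  </xsd:element>
</xsd:sequence>
```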

fjb_saper
Posted: Mon Dec 30, 2019 6:02 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20160
Location: LI,NY

Glad I could help and thanks for sharing the solution.
_________________
MQ & Broker admin

rahulk01
Posted: Mon Dec 30, 2019 10:11 am

Novice

Joined: 26 Dec 2019
Posts: 13

Thought I was done, but then ran into another problem. I am building a DFDL schema where some elements are delimited with '%'. But when I use % in the terminator for the element, I get an error saying 'CTDV1438E : DFDL property 'terminator' contains an invalid entity '%'. A valid entity must obey pattern ['%#' [0-9]+ ';' | '%#x' [0-9a-fA-F]+ ';' | '%#r' [0-9a-fA-F] (2)';' | '%' <name> ';']. '.
Please help me work out how to set % as a delimiter, and which escape I should use. I have tried setting the delimiter to "%", {%, and "{%", but none of them worked.

timber
Posted: Mon Dec 30, 2019 3:31 pm

Grand Master

Joined: 25 Aug 2015
Posts: 1070

If you need to use the character % in a delimiter or initiator string, just use the string %%.

In case it helps, the DFDL specification is here: https://www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
The section that describes String Literals is 6.3.1.2. You may find it useful in future if you need to represent control characters or raw byte values in your DFDL models.
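
As a small illustration of the escape described above, a delimited field terminated by a literal percent sign could be declared roughly as follows. The element name DelimitedField and its string type are hypothetical placeholders, not names from the poster's schema.

```xml
<!-- Sketch only: DelimitedField is a hypothetical element name. -->
<xsd:element xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
             name="DelimitedField" type="xsd:string"
             dfdl:lengthKind="delimited"
             dfdl:terminator="%%"/>
<!-- "%%" stands for one literal '%' character in a DFDL string literal.
     The decimal character entity form "%#37;" (code point 37 is '%') should
     match the same character, going by the entity pattern in the error message. -->
```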

rahulk01
Posted: Tue Dec 31, 2019 2:16 am

Novice

Joined: 26 Dec 2019
Posts: 13

timber wrote:
If you need to use the character % in a delimiter or initiator string, just use the string %%.

In case it helps, the DFDL specification is here: https://www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
The section that describes String Literals is 6.3.1.2. You may find it useful in future if you need to represent control characters or raw byte values in your DFDL models.


Thanks a lot for your input. It works with %%.

rahulk01
Posted: Tue Dec 31, 2019 3:53 am

Novice

Joined: 26 Dec 2019
Posts: 13

Hey, just when I think my DFDL is complete, I get stuck with something else.
I am defining a variable length Amount field, which will be terminated by either '+' or '-'. I am able to use one of them at a time and it works, but how do I use both like an enumeration?

timber
Posted: Wed Jan 01, 2020 4:51 am

Grand Master

Joined: 25 Aug 2015
Posts: 1070

That's an easy one. All DFDL delimiters (initiators, separators, terminators) are a whitespace-separated list of alternatives.

It is unusual for a data format to allow alternative delimiters with exactly the same meaning. Is there a specification for this format that you are trying to model? If so, you should check whether there are rules about when a + or a - are used. If you don't check, there is a risk that your DFDL model will fail to parse valid documents.
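
For illustration, a whitespace-separated list of alternative terminators for the amount field might be written as below. This is a sketch under assumptions: the element name Amount and the string type are placeholders, since the real field definition was not posted.

```xml
<!-- Sketch only: Amount and its type are placeholders for the real field definition. -->
<xsd:element xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
             name="Amount" type="xsd:string"
             dfdl:lengthKind="delimited"
             dfdl:terminator="+ -"/>
<!-- The two alternatives are separated by a single space: the field ends at
     whichever of '+' or '-' is found first. -->
```

One thing to check when trying this: a terminator is normally treated as markup rather than data, so the parsed Amount value would probably not record which sign was matched.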

rahulk01
Posted: Thu Jan 02, 2020 5:37 am

Novice

Joined: 26 Dec 2019
Posts: 13

timber wrote:
That's an easy one. All DFDL delimiters (initiators, separators, terminators) are a whitespace-separated list of alternatives.

It is unusual for a data format to allow alternative delimiters with exactly the same meaning. Is there a specification for this format that you are trying to model? If so, you should check whether there are rules about when a + or a - are used. If you don't check, there is a risk that your DFDL model will fail to parse valid documents.


I agree with your point, but my requirement is to define a variable-length amount field that is followed by the amount sign, i.e. + or -. Padding with 0s at the front is also not accepted, so the only way to identify that the amount has ended is by its sign.

I have another issue now. I have been using field 8 of my records as the record identifier. But now there are 2 different records which have the same value in field 8. So I need an additional check on the first record that the value of its first field is not 082 (this is a hardcoded value in the 1st field of the 2nd record).
I have used the check as
{./RECORD_IDENTIFIER eq 'B' AND fn:contains(./ANTALL_BYTES, '082') ne TRUE}
but I am getting an error saying XPath expression ... contains a path location that does not resolve to an element in the schema.
When I use the check as {./RECORD_IDENTIFIER eq 'B'} it works, but this check is just not enough for me.
I am trying to build a model for a very old Mainframe application consumption being used in a bank. They do not have a copybook for it, and as such does not follow much standards in their message.

fjb_saper
Posted: Thu Jan 02, 2020 6:23 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20160
Location: LI,NY

Trust me if it is a mainframe application there is a copybook somewhere... you just have to unearth it...
_________________
MQ & Broker admin

Vitor
Posted: Thu Jan 02, 2020 6:37 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 25885
Location: Texas, USA

fjb_saper wrote:
Trust me if it is a mainframe application there is a copybook somewhere... you just have to unearth it...





Especially if it's that old an application, no one back in the day wrote the kind of complex parsing code you're having to build in DFDL out of original OS COBOL. I speak as someone who was writing COBOL code back in the day.

There is a copybook or copybooks. Old COBOL code "does not follow much standards in their messages" - HA!

(I'll just repeat that - HA!)

Do you have any idea how hard it is to get OS/COBOL to write out free form text? What the COBOL equivalent of
Code:
Console.WriteLine("Hello World!")
looks like? Not following standards in ancient COBOL is like a vampire installing a sun bed.

They may not know where the copybook(s) is(are) but they exist.
_________________
Honesty is the best policy.
Insanity is the best defence.

timber
Posted: Thu Jan 02, 2020 8:26 am

Grand Master

Joined: 25 Aug 2015
Posts: 1070

Quote:
I have used the check as
{./RECORD_IDENTIFIER eq 'B' AND fn:contains(./ANTALL_BYTES, '082') ne TRUE}
but I am getting an error saying XPath expression ... contains a path location that does not resolve to an element in the schema.

I think it is risky to write that expression without parentheses to force your intended evaluation order. I would write it thus:
Code:
{./RECORD_IDENTIFIER eq 'B' AND (fn:contains(./ANTALL_BYTES, '082') ne TRUE)}


Also, most programmers would use the NOT function instead of writing ' ne TRUE'.
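
A version of the check using the not function might look like the sketch below. Two points are assumptions rather than anything confirmed in this thread. First, the element and type names RecordB and RecordBType are hypothetical. Second, in the XPath 2.0 syntax that the DFDL expression language is based on, the logical operator is written in lowercase as `and`, boolean constants are written `fn:true()` and `fn:false()`, and a bare name such as TRUE is read as a child element path step. That may be what the "path location that does not resolve" error is pointing at, but it could equally be that ANTALL_BYTES is not a child of the record being tested.

```xml
<!-- Sketch only: RecordB and RecordBType are hypothetical names. -->
<xsd:element xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
             name="RecordB" type="RecordBType">
  <xsd:annotation>
    <xsd:appinfo source="http://www.ogf.org/dfdl/">
      <!-- True only when field 8 is 'B' and ANTALL_BYTES does not contain '082'. -->
      <dfdl:discriminator>{./RECORD_IDENTIFIER eq 'B' and fn:not(fn:contains(./ANTALL_BYTES, '082'))}</dfdl:discriminator>
    </xsd:appinfo>
  </xsd:annotation>
</xsd:element>
```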