XPath syntax in DFDL expression
petervh1
Posted: Wed Dec 19, 2018 7:04 am    Post subject: XPath syntax in DFDL expression

Centurion

Joined: 19 Apr 2010
Posts: 122

I'm trying to code a dynamic length value of element "Field4" as follows, but I'm unsure of the correct syntax. I can't find enough detailed info on the syntax.

Record layout:

Code:
Field1 name = recordlength type = nonNegativeInteger etc.
Field2 name = xyz type = string
Field3 name = abc type = string
Field4 name = payload type = hexBinary



The entire record length is given in Field1. I need the length for Field4 only, so I need to set up a dfdl:length XPath statement for Field4 that has this type of pseudocode:


Code:
<xsd:element dfdl:initiator="nnn:" dfdl:length="{(../recordlength - (length(recordlength) + length(xyz) + length(abc)))}"

I've got the entire length returned correctly in ../recordlength. Can someone tell me how to code the XPath for the
Code:


(length(recordlength) + length(xyz) + length(abc))


part?

Thanks
timber
Posted: Wed Dec 19, 2018 8:51 am

Grand Master

Joined: 25 Aug 2015
Posts: 1280

You have the right idea, and the DFDL specification does allow that kind of thing to be done. However, according to https://www.ibm.com/support/knowledgecenter/en/SSMKHH_10.0.0/com.ibm.etools.mft.doc/df00150_.htm the DFDL functions dfdl:contentLength() and dfdl:valueLength() are not supported by IBM DFDL.

I think your best bet is to precalculate (if possible) the lengths of the 'extra' fields and subtract that constant value from the value of recordLength.
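To make the arithmetic concrete outside DFDL, here is a minimal Python sketch. It assumes that Field1 is an ASCII decimal count covering the whole record (including itself) and that the 'extra' fields have fixed widths. The widths used and the name split_record are hypothetical and are not taken from the real format.

```python
def split_record(record: bytes, len_width: int, xyz_width: int, abc_width: int):
    """Split a record into (xyz, abc, payload) using a constant overhead.

    Assumption: the length field counts every byte of the record,
    including the length field itself.
    """
    record_length = int(record[:len_width].decode("ascii"))
    overhead = len_width + xyz_width + abc_width   # the precalculated constant
    payload_length = record_length - overhead      # what dfdl:length would need to yield
    if payload_length < 0:
        raise ValueError("record length is smaller than the fixed overhead")
    xyz = record[len_width:len_width + xyz_width]
    abc = record[len_width + xyz_width:overhead]
    payload = record[overhead:overhead + payload_length]
    return xyz, abc, payload


# Hypothetical record: 4-digit length, 2-byte xyz, 3-byte abc, 3-byte payload.
sample = b"0012" + b"AB" + b"CDE" + b"\x01\x02\x03"
print(split_record(sample, 4, 2, 3))
```

This only works while the overhead really is constant; with variable-width fields the subtraction has no fixed value to use.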
petervh1
Posted: Thu Dec 20, 2018 3:03 am

Centurion

Joined: 19 Apr 2010
Posts: 122

I've discovered that unfortunately the 'extra' fields are not always the same length.
timber
Posted: Thu Dec 20, 2018 9:18 am

Grand Master

Joined: 25 Aug 2015
Posts: 1280

I was afraid you might say that. The next thing that I would try is:
- create a sub-element 'payloadContainer' after recordLength with lengthKind=explicit, length=./recordLength
- on element payloadContainer/payload, set lengthKind=endOfParent
Worth a try, anyway.
petervh1
Posted: Thu Dec 20, 2018 11:33 pm

Centurion

Joined: 19 Apr 2010
Posts: 122

I'm a little confused - did you mean modify Field4 to include

lengthKind=explicit, length=./recordLength?

Where do I add

lengthKind=endOfParent?

As I understand it, IBM DFDL does not support lengthKind=endOfParent

Thanks again for your assistance
petervh1
Posted: Tue Jan 08, 2019 12:09 am

Centurion

Joined: 19 Apr 2010
Posts: 122

I'm still unable to parse this.

I saw timber's suggestion in
Quote:
http://www.mqseries.net/phpBB2/viewtopic.php?t=74722


and have tried using

Code:
dfdl:lengthKind="pattern" dfdl:lengthPattern="\x1C"


to find the last occurrence of hex 1C (i.e. ASCII FS) in the string.

This gives me an error:

Quote:
Element xxx with lengthKind='pattern' could not be found using the pattern '\x1C'.


Am I missing something in my regex here?
timber
Posted: Tue Jan 08, 2019 11:23 am

Grand Master

Joined: 25 Aug 2015
Posts: 1280

Quote:
Am I missing something in my regex here?
Yes - your regex only describes the terminator. It should match *all* of the content of the element. You need something like this (not tested):
Code:
dfdl:lengthKind="pattern" dfdl:lengthPattern="[^\x1C]*\x1C"
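The difference between the two patterns can be seen with Python's re module. This is only an analogy: it assumes the pattern is applied at the current parse position, which is how re.match behaves, and the sample bytes are made up.

```python
import re

# Hypothetical sample: element content, the 0x1C terminator, then the next record.
data = b"element content\x1c14.001:next"

terminator_only = re.compile(rb"\x1C")
content_then_terminator = re.compile(rb"[^\x1C]*\x1C")


def matched_length(pattern, buf):
    """Return the number of bytes matched at the start of buf, or None."""
    m = pattern.match(buf)  # match() only succeeds at position 0
    return None if m is None else m.end()


# The terminator-only pattern fails because the data does not start with 0x1C.
print(matched_length(terminator_only, data))
# The second pattern consumes the content and then the terminator.
print(matched_length(content_then_terminator, data))
```

The second pattern still stops at the first 0x1C it meets, so it only helps when that byte cannot occur inside the element.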
petervh1
Posted: Tue Jan 08, 2019 10:10 pm

Centurion

Joined: 19 Apr 2010
Posts: 122

Thanks for the update.

When I use the following element definition

Code:

<xsd:element dfdl:initiator="14.999:" dfdl:lengthKind="pattern" dfdl:lengthPattern="[^\x1C]*\x1C" name="T14.DAT" type="xsd:string"/>


I get the same error:

Quote:
Element T14.DAT with lengthKind='pattern' could not be found using the pattern '[^\x1C]*\x1C'.


The data I am trying to model has multiple records, each starting with an initiator = 14.001 and terminated with x1C. The element T14.DAT described above is the last one in the record. The element contains binary data.
timber
Posted: Wed Jan 09, 2019 2:15 pm

Grand Master

Joined: 25 Aug 2015
Posts: 1280

Quote:
The data I am trying to model has multiple records, each starting with an initiator = 14.001 and terminated with x1C. The element T14.DAT described above is the last one in the record. The element contains binary data.
You have not said so (yet), but I assume that
a) the 0x1C cannot appear in the binary value
b) all of the other elements have dfdl:representation="text", and so you are able to define a DFDL terminator to represent the 0x1C byte.

You might want to consider this option...
- define the binary element with representation="text" and
- set dfdl:encoding to ISO8859-1 for this element only (unless the entire file happens to use that encoding, of course)
- set the terminator to %#x1C; (as I assume you have done for the preceding elements)

This will parse the binary as a string of characters. Normally, this would not be safe because random binary data does not translate safely into most encodings. But ISO8859-1 is a single-byte encoding with 256 defined characters. So you can never get an 'illegal character' error. Just a rather unreadable string value for your element.
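The round-trip property of ISO8859-1 is easy to check in Python. This is a minimal sketch using Python's own codecs, not the DFDL processor; the helper name round_trips is hypothetical.

```python
def round_trips(raw: bytes, encoding: str) -> bool:
    """True if raw decodes in the given encoding and re-encodes to the same bytes."""
    try:
        return raw.decode(encoding).encode(encoding) == raw
    except UnicodeDecodeError:
        return False


every_byte = bytes(range(256))  # all 256 possible byte values

# Every byte value maps to exactly one character in ISO-8859-1.
print(round_trips(every_byte, "iso-8859-1"))
# Multi-byte or 7-bit encodings reject some byte sequences.
print(round_trips(every_byte, "utf-8"))
print(round_trips(every_byte, "ascii"))
```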
petervh1
Posted: Wed Jan 09, 2019 11:35 pm

Centurion

Joined: 19 Apr 2010
Posts: 122

Looking through more data samples, I've established that there are 0x1C sequences appearing in the final, binary field. This means that the parser stops when it thinks it recognises this sequence as the end of the record.

Yes, all of the other fields have dfdl:representation="text".

Before I try and code this, would this work:

1 Define a separator for the end of the binary field as "0x1C" followed by "14.001" in hex (this is the initiator for a subsequent record as stated in my earlier post)

The problem that I see with this is - how do I parse the data if there is no second 14.001 record, i.e. the file contains:

Code:
14.001abcdef14.999binarystuff[0x1C] - end of file


as opposed to:

Code:
14.001abcdef14.999binarystuff[0x1C]14.001abcdef14.999binarystuff[0x1C] - end of file


Once again, your assistance is much appreciated.
petervh1
Posted: Thu Jan 10, 2019 12:00 am

Centurion

Joined: 19 Apr 2010
Posts: 122

Update:

There is another record in the file that contains a field as follows:

1.003nnn14n01n14n02 etc. This gives a count of the number of type 14 records appearing later in the file. Can I use this count somehow to determine how to parse for the end of 14.999 as stated earlier?
timber
Posted: Thu Jan 10, 2019 1:04 am

Grand Master

Joined: 25 Aug 2015
Posts: 1280

Some of your problems are not DFDL problems, they are problems with your understanding of the data format. You will struggle to parse a complex format like this one without understanding exactly what the format specification says.

Quote:
Define a separator for the end of the binary field as "0x1C" followed by "14.001"
That will not work. The binary could still contain your separator value (it is less likely but still possible). Is there a name for this format that you are parsing? What does the format specification say about the length of these binary fields?
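A small Python sketch of why a longer delimiter does not remove the risk. The stream and payload bytes are invented for the illustration, and split_on_delimiter is a hypothetical helper.

```python
# The proposed composite delimiter: 0x1C followed by the next record's initiator.
DELIMITER = b"\x1c14.001"


def split_on_delimiter(stream: bytes):
    """Naively split a stream wherever the composite delimiter occurs."""
    return stream.split(DELIMITER)


# One single record whose binary payload happens to contain the delimiter bytes.
payload = b"\x00\x1c14.001\xff"
stream = b"14.001abc14.999" + payload + b"\x1c"

parts = split_on_delimiter(stream)
print(len(parts))  # more than one part, although the stream holds one record
```

Arbitrary binary can contain any byte sequence, so only a length taken from the format itself locates the end of the field reliably.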
petervh1
Posted: Thu Jan 10, 2019 2:37 am

Centurion

Joined: 19 Apr 2010
Posts: 122

The format of the data supplies the following, amongst others:

Record type 1.003 - this indicates the number of records of other types, e.g. 1.003nnn03n01n14n01n14n02 shows that there is 1 type 3 record and 2 type 14 records in the file.

Question: Can I use an fn:count or XPath statement to count the number of type 14 records as indicated by this type 1.003 record?

As I said earlier in this thread, Field1 gives the length of the entire record, so the length of the binary field (Field4) has to be derived from it. The problem with doing this is that dfdl:contentLength() and dfdl:valueLength() are not supported by IBM DFDL, as you've already said.

This means that I can't use the record length field to know where to delimit the binary field.

The binary field is terminated by 0x1C but also can contain values that equate to 0x1C (in binary), so I can't use a terminator of 0x1C.

The name of the format is NIST (Data Format for the Interchange of Fingerprint, Facial & Other Biometric Information).

Any assistance would be appreciated.
timber
Posted: Fri Jan 11, 2019 1:22 am

Grand Master

Joined: 25 Aug 2015
Posts: 1280

Thanks - that's very helpful. We can forget about using separators/terminators to find the end of these binary fields. I think this will probably work:
Code:

message
  complexField
    lengthField
    complexElement length={../lengthField - 4}
      otherField1
      otherField2
      binaryField lengthKind='delimited'
The use of lengthKind='delimited' on the binary element is deliberate. If the binary field is the final field in the complex element, then it works in the same way as lengthKind='endOfParent' (which is one reason why IBM DFDL does not yet support lengthKind='endOfParent').
Do take care to avoid defining any separators/terminators except for members of complexField. Otherwise the lengthKind='delimited' on binaryField will attempt to scan for them!
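The same idea expressed in Python, as a rough sketch rather than a model of the real format. It assumes a 4-digit ASCII length that counts itself (hence the subtraction of 4), two text fields each terminated by 0x1D, and a final binary field that takes whatever is left in the container. All names and widths are hypothetical.

```python
GS = b"\x1d"  # assumed terminator of the two text fields


def parse_record(data: bytes, pos: int = 0):
    """Parse one record starting at pos; return (fields, position of next record)."""
    total = int(data[pos:pos + 4].decode("ascii"))  # lengthField, counts itself
    container = data[pos + 4:pos + total]           # complexElement: total - 4 bytes
    if len(container) != total - 4:
        raise ValueError("record is truncated")
    other1, rest = container.split(GS, 1)
    # Split once only, so any 0x1D inside the binary field is left alone.
    other2, binary = rest.split(GS, 1)
    fields = {"otherField1": other1, "otherField2": other2, "binaryField": binary}
    return fields, pos + total


def parse_all(data: bytes):
    """Parse consecutive records until the data is exhausted."""
    pos, records = 0, []
    while pos < len(data):
        fields, pos = parse_record(data, pos)
        records.append(fields)
    return records


# Two hypothetical records; the binary field contains 0x1C and 0x1D bytes.
body = b"abc" + GS + b"de" + GS + b"\x00\x1c\x1d\xff"
record = b"%04d" % (4 + len(body)) + body
print(parse_all(record + record))
```

Because the end of the binary field comes from the container length, no scan for 0x1C is needed and the byte can appear freely in the payload.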

Please give it a try and let me know.
petervh1
Posted: Sun Jan 13, 2019 11:31 pm

Centurion

Joined: 19 Apr 2010
Posts: 122

I have tried what I think is what you are suggesting. This is what I coded:

Code:

Type 1 records (successfully parsed)
Type 2 records (successfully parsed)
.
.
<xsd:element dfdl:initiator="" maxOccurs="unbounded" name="Type14" dfdl:length="{../Type14-4}">
    <xsd:complexType>
              <xsd:sequence dfdl:separator="" >
        <xsd:element dfdl:initiator="14.001:" dfdl:terminator="%#x1D;" name="T14.LEN" type="xsd:string"/>
        <xsd:element dfdl:initiator="14.002:" dfdl:terminator="%#x1D;" name="T14.IDC" type="xsd:string"/>       
        <xsd:element dfdl:initiator="14.999:" dfdl:lengthKind="delimited" name="T14.DAT" type="xsd:hexBinary"/>
            </xsd:sequence>
    </xsd:complexType>
    </xsd:element>



The result of this is that a message with 2 type 14 records parses successfully according to the DFDL Test Parse Model. However, it appears that only the first of the 2 type 14 records is parsed.

Questions:

1) I assume I have correctly followed your instruction "avoid defining any separators/terminators except for members of complexField"

2) I'm a bit confused about the placement of the "dfdl:length="{../Type14-4}" - it appears to make no difference whether this is coded or not (only the first type 14 record is parsed in both cases). Is this in the right place, as the DFDL trace does not show this calculation being executed?