elvis_gn |
Posted: Tue Apr 24, 2018 11:00 pm Post subject: DFDL Parse data containing delimiter |
|
|
 Padawan
Joined: 08 Oct 2004 Posts: 1905 Location: Dubai
|
Hi guys,
I have an issue parsing a record delimited message. The high-level structure is as follows:
Quote: |
"HEADER", "value1", "value2"
"BODY","value3","value4"
"FOOTER","value5","value6_contains_comma(,)_inside"
|
Note: each value is in double quotes, and I need to get rid of them in the parsed data.
I've successfully parsed FOOTER up to 'value5', stripping the double quotes, by setting the FOOTER sequence separator to comma (,) and the child element initiator and terminator to double quote (").
The issue lies where FOOTER contains a value6 with a comma within the data. From other posts, I understood I could try a 'pattern=.*', and this worked, but only without the quote (") initiator and terminator on the field itself, in which case the value was captured with the quotes still attached.
If I set the initiator and terminator to double quote ("), then I get a parser error at the initiator (although the error icon appears on the sequence) saying:
Quote: |
Unexpected data found at offset 'XYZ' |
XYZ is the offset of the opening quote.
Any ideas on how to get rid of the quotes? |
 |
timber |
Posted: Wed Apr 25, 2018 1:11 am Post subject: |
|
|
 Grand Master
Joined: 25 Aug 2015 Posts: 1292
|
I assume that you are using DFDL. Did you create your model using the 'New/Message Model...' wizard? |
 |
elvis_gn |
Posted: Wed Apr 25, 2018 3:38 am Post subject: |
I started with the wizard, but my body record is more complicated than what I posted above. The body has child segments that should be parsed as child elements of the body, but in the data they arrive at the same level as the body.
That is, it is supposed to look like this:
Quote: |
HEADER
BODY
---TYPE_1
---TYPE_2
------TYPE_3
BODY
---TYPE_2
------TYPE_1
FOOTER
|
But comes in as
Quote: |
HEADER
BODY
TYPE_1
TYPE_2
TYPE_3
BODY
TYPE_2
TYPE_1
FOOTER
|
Hence the default wizard-generated model has been tweaked a lot.
Is that a problem? |
 |
fjb_saper |
Posted: Wed Apr 25, 2018 4:06 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
There should be no problem with the data coming in the way you describe.
When building your record, all you need to do is define something like:
Code: |
Header
Body occurs 1 to x
    choice occurs 1 to x
        Type 1
        Type 2
        Type 3
    -- end choice
Footer |
Have fun |
 |
elvis_gn |
Posted: Wed Apr 25, 2018 5:11 am Post subject: |
Not really...
Body first has simple fields that are comma separated. Then there are unordered complex elements that could be 0 to unbounded. There is no comma between the simple and complex elements, only CRLF.
Anyway, I crossed that bridge and reached the last element. The only problem right now is to escape the sequence separator comma inside an element value.
How can I do that while also getting rid of the quotes? |
 |
timber |
Posted: Wed Apr 25, 2018 6:15 am Post subject: |
The DFDL default escape scheme (which will have been set up by the wizard) will automatically remove the quotes. Do some experiments and check the DFDL docs. |
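For anyone whose model has drifted from what the wizard generated, here is a minimal sketch of what a quote-based escape scheme might look like. The scheme name `QuoteEscapeScheme` and the element name `value6` are hypothetical and not taken from the model in this thread; the property names are those defined by the DFDL 1.0 specification. It assumes the field uses `dfdl:lengthKind="delimited"` and has no initiator or terminator of its own, so that the quotes are consumed by the escape block rather than treated as field delimiters. It is a fragment to merge into an existing schema that already supplies the remaining format properties, not a complete model.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <!-- Hypothetical scheme name. An escape block that starts and ends
           with a double quote hides any separator found inside it. -->
      <dfdl:defineEscapeScheme name="QuoteEscapeScheme">
        <dfdl:escapeScheme escapeKind="escapeBlock"
                           escapeBlockStart="&quot;"
                           escapeBlockEnd="&quot;"
                           escapeEscapeCharacter="&quot;"
                           extraEscapedCharacters=""
                           generateEscapeBlock="whenNeeded"/>
      </dfdl:defineEscapeScheme>
    </xs:appinfo>
  </xs:annotation>

  <!-- Hypothetical field. Assumption: the quotes are NOT also declared as
       initiator/terminator, and the length is found by scanning for the
       enclosing sequence's separator. -->
  <xs:element name="value6" type="xs:string"
              dfdl:lengthKind="delimited"
              dfdl:initiator=""
              dfdl:terminator=""
              dfdl:escapeSchemeRef="QuoteEscapeScheme"/>

</xs:schema>
```

With a scheme like this referenced from the field, a comma inside the quoted block should no longer be taken as the sequence separator, and the enclosing quotes should be dropped from the parsed value. This has not been tested against the model discussed here, so treat it as a starting point for the experiments suggested above.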