Author |
Message
|
Frik |
Posted: Mon Jan 14, 2013 7:47 am Post subject: DFDL doesn't support base64Binary? |
|
|
Acolyte
Joined: 25 Nov 2009 Posts: 69
|
Hey,
Again, I have a message set with a message that contains a field of type base64Binary.
Trying to migrate to V8 with DFDL, I see that there is no such field type.
Any suggestions?
Thanks,
Erez |
|
kimbert |
Posted: Mon Jan 14, 2013 8:32 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
DFDL cannot decode a base64Binary field into a BLOB - it is not a very common requirement for non-XML messages. But nor can the MRM parser - unless your message set is using an XML physical format.
So the key question is: what is the physical format of this 'binary' field? |
|
Frik |
Posted: Mon Jan 14, 2013 9:45 pm Post subject: physical format |
|
|
Acolyte
Joined: 25 Nov 2009 Posts: 69
|
The physical format of this type is a BLOB. |
|
kimbert |
Posted: Tue Jan 15, 2013 1:46 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Quote: |
The physical format of this type is a BLOB |
What does it look like on the wire? Is it
a) a raw byte array ( one byte on the wire per byte in the value )
b) a byte array encoded as text using two characters per byte?
c) a byte array encoded as text using the base64Binary encoding?
d) some other custom encoding of a BLOB ( if so, please describe it ) |
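To make options a) to c) concrete, here is a small Python sketch (Python is used only for illustration and is not part of Broker; the four payload bytes are an invented example). It shows the same value in each of the three wire representations.

```python
import base64
import binascii

# Invented example value: four arbitrary bytes.
payload = bytes([0x00, 0x01, 0xFE, 0xFF])

# a) raw byte array: one byte on the wire per byte in the value
raw = payload

# b) text using two characters per byte (hexadecimal digits)
hex_text = binascii.hexlify(payload).decode("ascii")

# c) text using the base64 encoding (four characters per three bytes, padded with '=')
b64_text = base64.b64encode(payload).decode("ascii")

print(len(raw), "raw bytes")
print(hex_text)
print(b64_text)
```

Only a) is binary on the wire. Both b) and c) are text, so their on-the-wire bytes depend on the code page of the message.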
|
Frik |
Posted: Wed Feb 20, 2013 2:28 am Post subject: hmm |
|
|
Acolyte
Joined: 25 Nov 2009 Posts: 69
|
kimbert wrote: |
Quote: |
The physical format of this type is a BLOB |
What does it look like on the wire? Is it
a) a raw byte array ( one byte on the wire per byte in the value )
b) a byte array encoded as text using two characters per byte?
c) a byte array encoded as text using the base64Binary encoding?
d) some other custom encoding of a BLOB ( if so, please describe it ) |
It's c): a byte array encoded as text using the base64Binary encoding.
Any ideas? |
|
kimbert |
Posted: Wed Feb 20, 2013 4:10 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
That should be easy, then.
1. Parse it using the BLOB domain
2. Convert the BLOB to text using ESQL CAST
3. Convert the base64 string to a BLOB using ESQL's BASE64DECODE function
4. Parse the resulting BLOB - probably using DFDL. |
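Steps 2 and 3 can be sketched outside Broker as follows. This is a Python illustration only, not ESQL: `decode_base64_blob` is a hypothetical helper name, `bytes.decode` stands in for the ESQL CAST to character, and `base64.b64decode` stands in for the ESQL decode function. The sample data is invented.

```python
import base64

def decode_base64_blob(wire_bytes: bytes, code_page: str = "ascii") -> bytes:
    """Turn base64 text received as raw bytes back into the original bytes."""
    # Step 2: interpret the raw bytes as text in the message's code page.
    text = wire_bytes.decode(code_page).strip()
    # Step 3: base64-decode the text; validate=True rejects stray characters.
    return base64.b64decode(text, validate=True)

# The same base64 string arriving in ASCII and in EBCDIC (cp037).
ascii_wire = "SGVsbG8=".encode("ascii")
ebcdic_wire = "SGVsbG8=".encode("cp037")

print(decode_base64_blob(ascii_wire))
print(decode_base64_blob(ebcdic_wire, "cp037"))
```

The bytes returned by the helper are what step 4 would hand to the second parse.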
|
Frik |
Posted: Wed Feb 20, 2013 4:31 am Post subject: it's |
|
|
Acolyte
Joined: 25 Nov 2009 Posts: 69
|
It's more complicated.
There are some text and integer fields,
and the last one is hexBinary with unknown size.
I want to do it in the parser, in a single pass.
I can do it when I define the last field as hexBinary with a specific size,
but I don't actually know its size, and since hexBinary supports neither "delimited" nor "endOfParent",
I can't see what the alternative is. |
|
kimbert |
Posted: Wed Feb 20, 2013 5:00 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
So your input data looks something like this
Code: |
textField binaryInteger textCalendar textNumber base64data |
The key point is this: that final field is not binary. It is text. The base64 encoding was invented for the specific purpose of encoding BLOB data as 7-bit ASCII text ( for emails ). So you can set dfdl:representation='text' and type='xs:string' on that final field. Then you should be able to set dfdl:lengthKind='delimited', which will work exactly like 'endOfParent'. |
|
Frik |
Posted: Wed Feb 20, 2013 5:10 am Post subject: hmm |
|
|
Acolyte
Joined: 25 Nov 2009 Posts: 69
|
OK.
I will explain more - this field is actually a string, but a string that contains integer, short and string fields.
If I parse it as a string, it will lose the code page of this string (in this case it is an MF string), and the short and integer fields will be corrupted. |
|
kimbert |
Posted: Wed Feb 20, 2013 5:20 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
OK - this is difficult to explain in words. Let's try it another way...
Please post two things:
1. An example input message, properly formatted to make it readable.
2. The exact message tree that you want to end up with
If the link between 1. and 2. is not obvious, please supply details of how the parsing needs to work. This is particularly important for the final field ( you do not seem to have a problem with the other fields ). |
|
mqjeff |
Posted: Wed Feb 20, 2013 5:22 am Post subject: Re: hmm |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
Frik wrote: |
OK.
I will explain more - this field is actually a string, but a string that contains integer, short and string fields.
If I parse it as a string, it will lose the code page of this string (in this case it is an MF string), and the short and integer fields will be corrupted. |
Something has taken a set of fields and turned them into a base64 encoded data structure.
That base64 encoded data structure is a string. It is not a data structure any more.
When handling that string, you must indeed make efforts to ensure that you keep the correct code page associated with it so that you can properly decode the correct characters into the correct bytes.
But that's just a matter of making sure that the DFDL parser knows what codepage the string is in. It's not any more complicated than that.
This is almost always accomplished by knowing what codepage the entire message is in, which is usually very easy with Broker.
What kimbert is telling you is that you CANNOT automatically base64 decode the field using the DFDL parser.
What you can do is use the DFDL parser to extract the specific portion of the input message that is the base64 encoded string, and then use any number of methods to base64 decode that string into the relevant bytes.
You can then parse those bytes with the DFDL parser back into a data structure.
If you wish to do all of those steps at once, with a "single parse", then you have to write your own parser to do so. |
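The extract, decode and re-parse sequence can be sketched end to end in Python. Everything about the layout here is an assumption made for illustration: the outer message is ASCII text with a comma before the final base64 field, and the inner structure is a 4-byte big-endian integer, a 2-byte big-endian short and an 8-character EBCDIC (cp037) string. `parse_message` and `INNER_LAYOUT` are hypothetical names, and `struct` merely stands in for the second DFDL parse.

```python
import base64
import struct

# Assumed inner layout: big-endian int, big-endian short, 8-byte EBCDIC string.
INNER_LAYOUT = struct.Struct(">ih8s")

def parse_message(message: bytes) -> dict:
    # First parse: the final field is plain text running to the end of the message.
    # The base64 alphabet contains no comma, so splitting on the last comma is safe.
    header, _, b64_field = message.rpartition(b",")
    # Decode: base64 text back to the original bytes, untouched by any code page.
    inner = base64.b64decode(b64_field.decode("ascii"), validate=True)
    # Second parse: interpret the bytes as a data structure again.
    number, code, name = INNER_LAYOUT.unpack(inner)
    return {
        "header": header.decode("ascii"),
        "number": number,
        "code": code,
        "name": name.decode("cp037").rstrip(),
    }

# Build an invented sample message and parse it.
inner_bytes = INNER_LAYOUT.pack(123456, 7, "FRIK".ljust(8).encode("cp037"))
message = b"ORDER1," + base64.b64encode(inner_bytes)
result = parse_message(message)
print(result)
```

Treating the final field as text does not damage the integer, short or EBCDIC content: those bytes only reappear after the base64 decode, and only then does the code page of the inner string matter.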
|
|