Author |
Message
|
angka |
Posted: Wed Dec 07, 2011 1:30 am Post subject: XMLNSC parser removed x'0D' from character element |
|
|
Chevalier
Joined: 20 Sep 2005 Posts: 406
|
Hi,
This is the input data in blob:
3C4D6573 73616765 3E0D0A3C 4F757470
75744461 74613E0D 0A3C4F75 74707574
5061796C 6F61643E 0D0A0D0A 3C2F4F75
74707574 5061796C 6F61643E 0D0A3C2F
4F757470 75744461 74613E0D 0A3C2F4D
65737361 67653E0D 0A
This is the input data in char:
<Message>
<OutputData>
<OutputPayload>....</OutputPayload>
</OutputData>
</Message>
Where the "...." is x'0D0A0D0A'
This is my code:
CREATE LASTCHILD OF Environment.Variables DOMAIN 'XMLNSC' PARSE(InputRoot.BLOB.BLOB CCSID InputRoot.Properties.CodedCharSetId ENCODING InputRoot.Properties.Encoding);
SET OutputRoot.BLOB.BLOB = CAST(Environment.Variables.XMLNSC.Message.OutputData.[1] AS BLOB CCSID InputRoot.Properties.CodedCharSetId ENCODING InputRoot.Properties.Encoding);
My output becomes only X'0A0A'. Why is this so?
However, if I code it this way:
CREATE LASTCHILD OF Environment.Variables DOMAIN 'XMLNSC' PARSE(InputRoot.BLOB.BLOB CCSID InputRoot.Properties.CodedCharSetId ENCODING InputRoot.Properties.Encoding);
SET OutputRoot.BLOB.BLOB = ASBITSTREAM(Environment.Variables.XMLNSC CCSID InputRoot.Properties.CodedCharSetId ENCODING InputRoot.Properties.Encoding);
This way the X'0D0A' is preserved. However, this is not what I want.
I just want the output to be the PCDATA inside the OutputPayload tag. Is there a workaround?
Thanks |
|
Esa |
Posted: Wed Dec 07, 2011 2:14 am Post subject: |
|
|
 Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
|
Your post lacks some vital information, e.g. Broker version. My workaround assumes you are working with version 6.1 or higher.
Instead of creating an XMLNSC tree in Environment, let the <protocol>Input node parse the message with XMLNSC, but declare OutputPayload as an opaque element on the Parser Options tab of the <protocol>Input node. That should leave the field unparsed and thus preserve extra whitespace.
I didn't have time to test this, so you will have to check for yourself what happens when there is real data in OutputPayload. |
|
kimbert |
Posted: Wed Dec 07, 2011 3:11 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
I'm very glad to hear it. Please read this for the reason: http://www.w3.org/TR/2006/REC-xml-20060816/#sec-line-ends
If you want to protect the consecutive linefeed characters then you will need to hide them in a hexBinary or base64-encoded BLOB.
I can believe that XMLNSC does not remove consecutive line feeds on output - if so, then technically that is a defect in XMLNSC. |
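The end-of-line rule in the linked section of the XML specification can be mimicked at byte level. This is only a sketch, assuming a single-byte ASCII-compatible code page; the module name and variable names are hypothetical, and it imitates the rule rather than showing what XMLNSC does internally. Every CR LF pair becomes LF, and any remaining lone CR also becomes LF.

```esql
CREATE COMPUTE MODULE ShowLineEndNormalization
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    -- Payload bytes from the original post: CR LF CR LF
    DECLARE rawPayload BLOB X'0D0A0D0A';

    -- Step 1: translate each CR LF pair to a single LF.
    -- Step 2: translate any CR that is still left to LF.
    DECLARE normalized BLOB
      REPLACE(REPLACE(rawPayload, X'0D0A', X'0A'), X'0D', X'0A');

    -- normalized is X'0A0A', the value reported in the original post
    SET OutputRoot.BLOB.BLOB = normalized;
    RETURN TRUE;
  END;
END MODULE;
```

This is why the CAST of the parsed element yields X'0A0A': the carriage returns are gone before the element value is ever built.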
|
Esa |
Posted: Wed Dec 07, 2011 3:45 am Post subject: |
|
|
 Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
|
I had a feeling that even opaque parsing would not help you. What kimbert says confirms it, I think.
Even opaque elements are parsed, although the InfoCenter says the opposite here: http://publib.boulder.ibm.com/infocenter/wmbhelp/v7r0m0/index.jsp?topic=%2Fcom.ibm.etools.mft.doc%2Fac20990_.htm&resultof=%22%6f%70%61%71%75%65%22%20%22%6f%70%61%71%75%22%20 (the second-to-last paragraph). If you have invalid XML in your opaque element, you will get an exception.
Opaque parsing does not build an element tree under the opaque element; it stores the element's content in a single element as a string value. However, the content is still parsed, and I strongly suspect that the line breaks will be modified as well.
So, it seems the only thing you can do is to cast the input blob as character, locate the positions of '<OutputPayload>' and '</OutputPayload>', use SUBSTRING to extract the string between and cast it back to blob. Perhaps you could do it even without the casts (depends on the codepages).
If you already tried the opaque element approach, please tell us what it did with the linebreaks. |
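A minimal sketch of the extraction without any XML parsing, working directly on the BLOB. It assumes the tag names contain no attributes or namespace prefixes, that OutputPayload occurs once, and that the input arrives in the BLOB domain as in the original post. The module name, variable names, and the exception text are hypothetical.

```esql
CREATE COMPUTE MODULE ExtractOutputPayload
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    DECLARE inBlob BLOB InputRoot.BLOB.BLOB;
    DECLARE ccsid INTEGER InputRoot.Properties.CodedCharSetId;

    -- Build the tags as bytes in the message's own code page,
    -- so the input never has to be cast to CHARACTER.
    DECLARE startTag BLOB CAST('<OutputPayload>' AS BLOB CCSID ccsid);
    DECLARE endTag BLOB CAST('</OutputPayload>' AS BLOB CCSID ccsid);

    -- POSITION returns 0 when the search bytes are not found.
    DECLARE startPos INTEGER POSITION(startTag IN inBlob);
    DECLARE endPos INTEGER POSITION(endTag IN inBlob);

    SET OutputRoot.Properties = InputRoot.Properties;

    IF startPos > 0 AND endPos > startPos THEN
      -- Move past the start tag, then copy everything up to the end tag.
      SET startPos = startPos + LENGTH(startTag);
      SET OutputRoot.BLOB.BLOB =
        SUBSTRING(inBlob FROM startPos FOR endPos - startPos);
    ELSE
      THROW USER EXCEPTION MESSAGE 2951
        VALUES('OutputPayload element not found in input BLOB');
    END IF;

    RETURN TRUE;
  END;
END MODULE;
```

Because nothing is parsed, the bytes between the tags, including every X'0D', are copied unchanged. The trade-off is that entity references or CDATA markers inside the element are copied literally as well.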
|
angka |
Posted: Wed Dec 07, 2011 3:45 am Post subject: |
|
|
Chevalier
Joined: 20 Sep 2005 Posts: 406
|
Hi,
Okay, noted. But I tried CDATA and the XMLNSC parser also removes the X'0D'. Why?
Anyway, the reason I need the x'0D0A' is that the XML is actually an output card from a WTX map. I need to preserve all the bytes of this element. Is there any workaround to instruct XMLNSC not to remove them?
Btw, if I set it as an opaque element, I will not be able to access it.
Thank you |
|
kimbert |
Posted: Wed Dec 07, 2011 3:55 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
It's true that opaque elements do not allow badly-formed XML - but consecutive line feeds do not make the XML badly-formed. I honestly don't know, off the cuff, whether an opaque element will collapse the line feeds but it's worth a try.
Quote: |
I tried CDATA and the XMLNSC parser also removes the X'0D'. Why? |
Because that's what the XML specification requires XML parsers to do. If XMLNSC did not do this (even within a CDATA section) it would be non-compliant.
btw, you seem to be suffering from a common misunderstanding about CDATA. A CDATA section still needs to obey certain XML rules (e.g. no illegal characters), and line feed collapsing does apply to its content. You may think that's unhelpful, and it may well be, but it's what the XML specification says. |
|
mqjeff |
Posted: Wed Dec 07, 2011 3:57 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
It's not clear that you understand what you have been told.
XML strictly does not support preservation of line breaks within element content.
You need to base64 encode or otherwise transform the contents of this element into something that does not allow the XMLNSC Parser to recognize the line breaks. |
|
Esa |
Posted: Wed Dec 07, 2011 4:03 am Post subject: |
|
|
 Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
|
angka wrote: |
Okay, noted. But I tried CDATA and the XMLNSC parser also removes the X'0D'. Why?
|
If you produced the CDATA element in WTX, it might have worked. I think WTX does not follow the XML spec as tightly as XMLNSC. But I am not telling you to try it.
angka wrote: |
Btw, if I set it as an opaque element, I will not be able to access it.
Thank you |
Yes, you will. Remember it is a character value. |
|
kimbert |
Posted: Wed Dec 07, 2011 4:53 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Quote: |
the XML is actually an output card from a WTX map. I need to preserve all the bytes of this element. |
Please explain why it is important to preserve the line feeds? The next well-behaved XML parser in the processing pipeline will collapse those line feeds anyway so I'm struggling to see why it matters. |
|
smdavies99 |
Posted: Wed Dec 07, 2011 5:15 am Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
kimbert wrote: |
Please explain why it is important to preserve the line feeds? The next well-behaved XML parser in the processing pipeline will collapse those line feeds anyway so I'm struggling to see why it matters. |
Perhaps there is a PHB that does not understand that fact?
_________________
WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995
Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions. |
|
rekarm01 |
Posted: Thu Dec 08, 2011 2:54 am Post subject: Re: XMLNSC parser removed x'0D' from character element |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 1415
|
kimbert wrote: |
I can believe that XMLNSC does not remove consecutive line feeds on output - if so, then technically that is a defect in XMLNSC. |
Why would that be a defect? The consecutive line feeds occur as character data, (not as mixed data). The XMLNSC parser should (and does) remove the carriage returns from the input message, but why should it collapse the remaining linefeeds?
In any event, the given code seems to be using the BLOB parser, rather than the XMLNSC parser, to generate the output message.
Esa wrote: |
I had a feeling that even opaque parsing would not help you. |
The XML specification requires that the parser normalizes the end-of-lines on input, before parsing, so opaque parsing probably won't help.
angka wrote: |
... but I tried with CDATA ... |
... encoded as HexBinary or Base64?
Esa wrote: |
If you produced the CDATA element in WTX, it might have worked. I think WTX does not follow the XML spec as tightly as XMLNSC. |
The XML spec does not prohibit an XML processor from adding carriage returns to an output message.
angka wrote: |
I need to preserve all the bytes of this element. Is there any workaround to instruct XMLNSC not to remove them? |
If that's really the case, then, as already suggested, either use the BLOB parser, or use hexBinary- or Base64-encoded CDATA. Another option is for WTX to use character references to represent carriage returns. |
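For the hexBinary option, a rough sketch of the broker side. It assumes the WTX map has been changed to write the payload as plain hex digits, for example <OutputPayload>0D0A0D0A</OutputPayload>, with no surrounding whitespace, and that no schema validation is converting the value already. The module name and the hexText variable are hypothetical. A CAST from CHARACTER to BLOB without a CCSID clause treats the string as hexadecimal digits rather than as text to be encoded.

```esql
CREATE COMPUTE MODULE DecodeHexPayload
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    -- Parse the incoming BLOB as in the original post.
    CREATE LASTCHILD OF Environment.Variables DOMAIN 'XMLNSC' PARSE(InputRoot.BLOB.BLOB CCSID InputRoot.Properties.CodedCharSetId ENCODING InputRoot.Properties.Encoding);

    -- The element value is a string of hex digits, so end-of-line
    -- normalization has nothing to change.
    DECLARE hexText CHARACTER Environment.Variables.XMLNSC.Message.OutputData.OutputPayload;

    SET OutputRoot.Properties = InputRoot.Properties;
    -- No CCSID clause: the hex digits are turned back into the original bytes.
    SET OutputRoot.BLOB.BLOB = CAST(hexText AS BLOB);
    RETURN TRUE;
  END;
END MODULE;
```

The character-reference option needs no decoding step: if WTX writes each carriage return as &#13; the parser should deliver it as a real X'0D' in the element value, because character references are resolved after line ends are normalized.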
|
Esa |
Posted: Thu Dec 08, 2011 6:21 am Post subject: |
|
|
 Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
|
If the XML parser does not work the way you like, don't use it:
Esa wrote: |
So, it seems the only thing you can do is to cast the input blob as character, locate the positions of '<OutputPayload>' and '</OutputPayload>', use SUBSTRING to extract the string between and cast it back to blob. Perhaps you could do it even without the casts (depends on the codepages).
|
If you want to convert a blob to XML just to select one part of it and convert that part back to a blob exactly as it was, it is simpler and performs better to extract the byte sequence from the blob without parsing it at all.
This is the humble solution to the OP's little problem. Unfortunately I had hidden it in the middle of more interesting stuff on opaque parsing. Nobody reads longer posts? Or maybe it is my (well deserved) reputation as a poster of nonsense |
|
mqjeff |
Posted: Thu Dec 08, 2011 6:29 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
Esa wrote: |
Or maybe it is my (well deserved) reputation as a poster of nonsense  |
Don't be ridiculous. |
|
Esa |
Posted: Thu Dec 08, 2011 7:03 am Post subject: |
|
|
 Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
|
mqjeff wrote: |
Don't be ridiculous. |
I'm sorry. Being ridiculous is our national sport. |
|