Author |
Message
|
jharringa |
Posted: Thu Nov 13, 2008 10:13 am Post subject: CData within CData question |
|
|
Acolyte
Joined: 24 Aug 2007 Posts: 70
|
I am currently receiving an XML message like this:
<?xml version="1.0" encoding="UTF-8"?>
<SomeXML>
<MyElement><![CDATA[<?xml version="1.0" encoding="utf-8"?><Document><XMLVersion Version="1.0" /><SomeTag><![CDATA[http://someurl]]></SomeTag></Document>]]></MyElement>
</SomeXML>
I need to get the xml from within the CDATA tag in order to look at some data within it.
Am I going to have to convert the character entities manually or is there a PARSE option (or some other option) that I'm missing? I've been able to successfully test this out without the CDATA enclosure in MyElement and I've also been able to test this successfully with the non-translated xml within the CDATA (but without the nested CDATA).
The main problem here was that through the whole chain of events one application sends a CDATA with a URL and then sends the enclosed message to another application. That application, in turn, encloses the whole message in a CDATA tag and converts the special xml characters to character entities.
Any help would be much appreciated. I have a feeling that I may smack myself on the head when the answer comes across.
 |
|
|
 |
m4c0 |
Posted: Thu Nov 13, 2008 10:34 am Post subject: |
|
|
 Novice
Joined: 07 Nov 2008 Posts: 17
|
Did you try mapping the first CDATA as a BLOB and then reparsing it as XML to access the second CDATA?
I never tested with CDATA, but, in XML, CDATA is only an easy way to embed invalid XML text inside a valid XML. MQSI should read the CDATA as any other string. |
|
|
 |
jharringa |
Posted: Thu Nov 13, 2008 10:49 am Post subject: |
|
|
Acolyte
Joined: 24 Aug 2007 Posts: 70
|
Yes. I've tried that, and I think the issue is that I have character entities within the CDATA tag. The parser error is that no root tag is found at that location in the message tree. That leads me to believe that the parser is not translating the character entities into their XML special-character counterparts (which doesn't seem out of line).
While I'm certainly not opposed to translating the entities with additional code, I think it would be cleaner if there were an ESQL option for this.
By the way, I'm running Broker 6.1.0.1 on AIX. |
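For illustration, here is a minimal sketch of the "translate the entities with additional code" route, written in Python rather than ESQL. It assumes the outer message carries the inner document entity-escaped inside a CDATA section, as described above; the sample message and the helper name `extract_inner` are hypothetical. A parser returns CDATA content verbatim, so the entities have to be unescaped in a separate step before the inner document can be parsed.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import unescape

# Hypothetical outer message: the inner XML is entity-escaped, then wrapped in CDATA.
OUTER = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<SomeXML><MyElement><![CDATA['
    b'&lt;?xml version="1.0" encoding="utf-8"?&gt;'
    b'&lt;Document&gt;&lt;SomeTag&gt;&lt;![CDATA[http://someurl]]&gt;'
    b'&lt;/SomeTag&gt;&lt;/Document&gt;'
    b']]></MyElement></SomeXML>'
)


def extract_inner(outer_bytes):
    """Return the raw CDATA text and the parsed inner document."""
    root = ET.fromstring(outer_bytes)
    # Inside CDATA the parser does not expand entities, so this is still escaped.
    raw = root.findtext("MyElement")
    # Undo the five predefined XML entities (&amp; is handled last by unescape).
    inner_text = unescape(raw, {"&quot;": '"', "&apos;": "'"})
    return raw, ET.fromstring(inner_text.encode("utf-8"))


raw, inner = extract_inner(OUTER)
url = inner.findtext("SomeTag")
```

Here `raw` still begins with `&lt;?xml`, which is consistent with the "no root tag" error when that text is handed straight to a second parse.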
|
|
 |
mqjeff |
Posted: Thu Nov 13, 2008 10:52 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
Nothing in the XML specification allows you to nest CDATA tags. The definition of CDATA is clear on this.
The XML message you are receiving is illegal according to the standard. If they base64 encoded the inner XML message and then included it in the CDATA tag, it would then be legal. |
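The point can be checked with a short Python sketch (not ESQL). It assumes a literal CDATA-in-CDATA message like the one in the first post; the names `INNER`, `nested`, `wrapped` and `is_well_formed` are made up for the example. The outer CDATA section ends at the first `]]>`, so the remainder of the inner document is read as markup and the parse fails, whereas a base64-encoded payload is ordinary character data.

```python
import base64
import xml.etree.ElementTree as ET

# Inner document with its own CDATA section, as in the first post.
INNER = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<Document><SomeTag><![CDATA[http://someurl]]></SomeTag></Document>'
)


def is_well_formed(xml_text):
    """True if the text parses as XML, False on a parse error."""
    try:
        ET.fromstring(xml_text.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


# Literal nesting: the outer section is closed by the inner "]]>".
nested = "<SomeXML><MyElement><![CDATA[" + INNER + "]]></MyElement></SomeXML>"

# Base64 wrapping: the payload contains only A-Z, a-z, 0-9, '+', '/' and '='.
encoded = base64.b64encode(INNER.encode("utf-8")).decode("ascii")
wrapped = "<SomeXML><MyElement>" + encoded + "</MyElement></SomeXML>"

nested_ok = is_well_formed(nested)
wrapped_ok = is_well_formed(wrapped)
```

With base64 the CDATA wrapper becomes optional, since the encoded text contains no characters that need escaping.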
|
|
 |
jharringa |
Posted: Thu Nov 13, 2008 11:09 am Post subject: |
|
|
Acolyte
Joined: 24 Aug 2007 Posts: 70
|
Quote: |
Nothing in the XML specification allows you to nest CDATA tags. The definition of CDATA is clear on this. |
This is understood. I believe that is why the other application is converting the xml characters to character entities within the parent CDATA.
Quote: |
The XML message you are receiving is illegal according to the standard. If they base64 encoded the inner XML message and then included it in the CDATA tag, it would then be legal. |
I guess I'm not sure I understand why you are saying that this message is illegal. At this point, there is no longer a CDATA tag within a CDATA tag since they're using character entities to escape the xml special characters. The XMLNSC parser is alright with this. I just can't get the data out and parsed into a separate xml message. |
|
|
 |
kimbert |
Posted: Thu Nov 13, 2008 12:15 pm Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Quote: |
one application sends a CDATA with a URL and then sends the enclosed message to another application. That application, in turn, encloses the whole message in a CDATA tag and converts the special xml characters to character entities. |
The 2nd application in the chain is being lazy. It is sending out an XML message which is technically legal, but practically impossible to process safely.
mqjeff is correct. The 2nd application should be base64-encoding the wrapped message. If base64 is too hard, it can encode it as hexBinary. |
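As a rough comparison of the two encodings, here is a small Python sketch; the sample payload is hypothetical. Both round-trip the bytes exactly. hexBinary doubles the size, while base64 grows it by about a third.

```python
import base64

# Hypothetical inner message, treated as opaque bytes by the wrapping application.
INNER = b"<Document><SomeTag>http://someurl</SomeTag></Document>"

# base64: 4 output characters for every 3 input bytes (padded with '=').
as_base64 = base64.b64encode(INNER).decode("ascii")

# hexBinary: 2 hex digits per input byte.
as_hex = INNER.hex().upper()

# Either form decodes back to the original bytes.
from_base64 = base64.b64decode(as_base64)
from_hex = bytes.fromhex(as_hex)
```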
|
|
 |
jharringa |
Posted: Thu Nov 13, 2008 12:21 pm Post subject: |
|
|
Acolyte
Joined: 24 Aug 2007 Posts: 70
|
That makes sense.
I'm guessing that there are unsafe scenarios that I'm not thinking of for this. Just out of curiosity, would you be able to give me an example?
In the meantime, they are OK with simply dropping the outside CDATA tag. Any thoughts on whether that is safe or is Base64 the best way to go in this case?
Thank you all for your responses. |
|
|
 |
kimbert |
Posted: Thu Nov 13, 2008 4:02 pm Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Quote: |
I'm guessing that there are unsafe scenarios that I'm not thinking of for this. Just out of curiosity, would you be able to give me an example? |
Have you read this?
http://www.w3.org/TR/2006/REC-xml-20060816/#sec-cdata-sect
I was wrong. The XML which you are receiving is not even 'technically legal'.
Quote: |
they are OK with simply dropping the outside CDATA tag. Any thoughts on whether that is safe or is Base64 the best way to go in this case? |
It depends on how much you trust that XML.
If it ever contains an illegal XML character, then including it as-is within the outer message will make the outer message invalid as well. By encoding the wrapped XML, you protect the wrapper message, and ensure that any errors will affect only the system which processes the inner message. |
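A minimal Python sketch of that failure mode, under the assumption that the inner document contains a control character (U+0001) that XML 1.0 does not allow. The payload and the helper `parses` are hypothetical. Embedded as-is, the bad character breaks the whole outer message. Base64-encoded, the outer message still parses and the error only appears when the inner document is parsed.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical inner payload containing U+0001, which is illegal in XML 1.0.
BAD_INNER = "<Document><SomeTag>http://someurl\x01</SomeTag></Document>"


def parses(xml_bytes):
    """True if the bytes parse as XML, False on a parse error."""
    try:
        ET.fromstring(xml_bytes)
        return True
    except ET.ParseError:
        return False


# Outer CDATA dropped, inner XML embedded directly as child elements.
as_is = ("<SomeXML><MyElement>" + BAD_INNER + "</MyElement></SomeXML>").encode("utf-8")

# Inner XML base64-encoded before wrapping.
payload = base64.b64encode(BAD_INNER.encode("utf-8"))
encoded = b"<SomeXML><MyElement>" + payload + b"</MyElement></SomeXML>"

outer_as_is_ok = parses(as_is)        # wrapper is broken by the inner error
outer_encoded_ok = parses(encoded)    # wrapper is protected
inner_ok = parses(base64.b64decode(ET.fromstring(encoded).findtext("MyElement")))
```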
|
|
 |
jharringa |
Posted: Thu Nov 13, 2008 4:11 pm Post subject: |
|
|
Acolyte
Joined: 24 Aug 2007 Posts: 70
|
Interesting. I must not have dug deep enough.
All very great recommendations! I think that they may be able to base64 encode the message and I'll just need to call a static Java method to decode and then parse it.
Thank you everyone! |
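The decode-then-parse step might look roughly like the following. This is a Python sketch rather than the static Java method mentioned above, and it assumes the sender places the base64 text directly in `MyElement`; the function name `read_inner_url` and the sample message are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET


def read_inner_url(outer_bytes):
    """Decode the base64 payload in MyElement and return the text of SomeTag."""
    outer = ET.fromstring(outer_bytes)
    payload = outer.findtext("MyElement")
    # The decoded bytes are a complete XML document in their own right.
    inner = ET.fromstring(base64.b64decode(payload))
    return inner.findtext("SomeTag")


# Hypothetical sample built the way the sending application would build it.
inner_doc = (
    b'<?xml version="1.0" encoding="utf-8"?>'
    b'<Document><XMLVersion Version="1.0" />'
    b'<SomeTag><![CDATA[http://someurl]]></SomeTag></Document>'
)
outer_doc = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b"<SomeXML><MyElement>" + base64.b64encode(inner_doc) + b"</MyElement></SomeXML>"
)

url = read_inner_url(outer_doc)
```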
|
|
 |
kimbert |
Posted: Fri Nov 14, 2008 1:26 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Quote: |
I'll just need to call a static Java method to decode and then parse it. |
That's a perfectly good solution. Another approach would be to
- create a message set to describe the incoming message.
- set Validation to 'Content and Value'
- Enable 'Build Tree using XML Schema types'
...and XMLNSC will decode the base64 for you. And if you reference that message set from your message flow project, you will also get help with constructing your ESQL paths via the ctrl-space auto-complete feature of Eclipse. |
|
|
 |
|