Author |
Message
|
jeasterl |
Posted: Tue Jan 25, 2011 11:23 am Post subject: Need to parse an XML message containing a CDATA tag |
|
|
 Acolyte
Joined: 24 Jun 2001 Posts: 65
|
I am looking for the best method for parsing an incoming XML message that contains a CDATA tag. For example, if an incoming message looks like this:
<?xml version="1.0" ?>
<data psnonxml="Yes">
<![CDATA[ <Message><SubElement1>Hello World</SubElement1><SubElement2>abc123</SubElement2></Message>]]>
</data>
And I wanted the outbound message to look like this:
<Message>
<SubElement1>Hello World</SubElement1>
<SubElement2>abc123</SubElement2>
</Message>
I have looked at the InfoCenter, and can only find references to how to create the CDATA tag on the outbound side.
Thanks for any help you can provide! |
|
|
 |
mqjeff |
Posted: Tue Jan 25, 2011 11:48 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
So what you're saying is, you want to extract the value of a CDATASection element, and then parse it using some known parser.
I'd use CREATE FIELD.
I'd also stop trying to wrap un-encoded XML data inside a CDATA Section in a vain attempt to prevent it from being mangled by a valid XML parser, since I know the rules for handling CDATASections: what they protect against and what they DO NOT protect against.
If you want to preserve any random string of data inside an XML document, you should at a minimum base64 encode it. CDataSections are not production-caliber protection. |
|
|
 |
lancelotlinc |
Posted: Tue Jan 25, 2011 12:10 pm Post subject: |
|
|
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
|
mqjeff has a very valid point. If possible, have the program that creates this data base64 encode the Hello World part (everything in your CDATA section).
Is there a business reason you are attempting to handle the data this way? Please explain your design.
Humor: "With your payload and his routing information, you can really go places@!" |
|
|
 |
jeasterl |
Posted: Tue Jan 25, 2011 12:43 pm Post subject: |
|
|
 Acolyte
Joined: 24 Jun 2001 Posts: 65
|
Thank you both for your suggestions / feedback. To answer the question as to "why" we are even doing this, we are receiving the XML in this manner and really have no control over how it is being sent. Powers far above "little ole me" are trying to get them to do right. But, in the meantime, we are stuck with this.
I will read up on the CREATE function and give it a go.
Thanks again... |
|
|
 |
kimbert |
Posted: Tue Jan 25, 2011 12:49 pm Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
CDATA has two advantages:
a) it leaves the embedded XML human-readable ( because you don't have to replace < with &lt; everywhere ).
b) it protects against most types of badly-formed XML document
The catch, as mqjeff says, is that it only protects against 'most' badly-formed documents. If the embedded XML contains an illegal character then the whole document will be badly-formed. The only way to avoid that is to base64 encode the embedded XML.
If you're sure that CDATA is the right choice for you, then you need to do something like this:
Code: |
CREATE LASTCHILD OF OutputRoot.XMLNSC PARSE InputRoot.XMLNSC.(XMLNSC.CDataField)data; |
The above code is not tested, but something very similar will work. |
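The two-step idea behind CREATE ... PARSE (take the text held in the CDATA section, then hand it to a parser a second time) can be sketched outside the broker. This is a minimal Python illustration, not ESQL and not the code above; the function name `extract_embedded` is hypothetical, and the sample message uses `SubElement2` for both the opening and closing tag on the assumption that the mismatch in the original post is a typo.

```python
import xml.etree.ElementTree as ET

# Sample inbound message, modelled on the one in the original post.
OUTER = (
    '<?xml version="1.0" ?>\n'
    '<data psnonxml="Yes">\n'
    '<![CDATA[ <Message><SubElement1>Hello World</SubElement1>'
    '<SubElement2>abc123</SubElement2></Message>]]>\n'
    '</data>'
)


def extract_embedded(outer_xml: str) -> ET.Element:
    """Parse the outer document, then parse the CDATA text as XML again."""
    outer_root = ET.fromstring(outer_xml)
    # The parser delivers CDATA content as ordinary element text; strip the
    # whitespace that surrounds the embedded document.
    inner_text = (outer_root.text or "").strip()
    return ET.fromstring(inner_text)


message = extract_embedded(OUTER)
print(ET.tostring(message, encoding="unicode"))
```

The second parse fails if the embedded text is not itself well-formed XML, so the caller should expect and handle a parse error there.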
|
|
 |
mqjeff |
Posted: Tue Jan 25, 2011 12:57 pm Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
The problem is, the minute the XML data that's embedded in the CDATA section itself contains another CDATA section, the inner "]]>" ends the outer section early, and the XML document is irreparably broken and unparseable.
It's entirely possible that some kind soul sending this data will have done nothing to try and validate their output data, and thus you will be left with a timebomb on your input node.
But as kimbert says... (XMLNSC.CDataField) and CREATE ... PARSE. |
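The nested-CDATA failure is easy to reproduce. The sketch below is a Python illustration only, with hypothetical helper names `wrap_in_cdata` and `is_well_formed`; it wraps two payloads the way the sender appears to, and checks whether the result still parses.

```python
import xml.etree.ElementTree as ET


def wrap_in_cdata(inner_xml: str) -> str:
    """Wrap text in a CDATA section without any escaping or validation."""
    return f"<data><![CDATA[{inner_xml}]]></data>"


def is_well_formed(document: str) -> bool:
    """Return True if the document parses, False if the parser rejects it."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# A payload with no CDATA section of its own parses fine once wrapped.
plain = wrap_in_cdata("<Message>ok</Message>")

# A payload with its own CDATA section: the first "]]>" closes the outer
# section, so "</Message>" is then read as markup that does not match <data>.
nested = wrap_in_cdata("<Message><![CDATA[ok]]></Message>")

plain_ok = is_well_formed(plain)
nested_ok = is_well_formed(nested)
```

A check like `is_well_formed` on the sending side would catch the problem before the message ever reaches the input node.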
|
|
 |
jeasterl |
Posted: Wed Jan 26, 2011 12:44 pm Post subject: |
|
|
 Acolyte
Joined: 24 Jun 2001 Posts: 65
|
Okay, I am having some trouble traversing the XML tree embedded within the CDATA tag.
I wanted to see if I could first access fields and data within the CDATA tag, so I attempted this statement:
CREATE LASTCHILD OF OutputRoot.XMLNSC PARSE (InputRoot.XMLNSC.(XMLNSC.CDataField).(XMLNSC.CDataValue));
Below is the output from the trace:
Executing statement ''CREATE LASTCHILD OF OutputRoot.XMLNSC PARSE(InputRoot.XMLNSC.(XMLNSC.CDataField)*:*.(XMLNSC.CDataValue)*:*);'' at ('', '12.3').
('', '12.48') : Failed to navigate to path element number '2' because it does not exist.
Evaluating expression ''XMLNSC.CDataField'' at ('', '12.66'). This resolved to ''XMLNSC.CDataField''. The result was ''50331649''.
Evaluating expression ''XMLNSC.CDataValue'' at ('', '12.86'). This resolved to ''XMLNSC.CDataValue''. The result was ''33554433''.
Evaluating expression ''InputRoot.XMLNSC.(XMLNSC.CDataField)*:*.(XMLNSC.CDataValue)*:*'' at ('', '12.48'). This resolved to ''InputRoot.XMLNSC.(50331649)*:*.(33554433)*:*''. The result was ''NULL''.
Any suggestions? |
|
|
 |
Vitor |
Posted: Wed Jan 26, 2011 1:12 pm Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
jeasterl wrote: |
Any suggestions? |
Use something closer to what kimbert suggested?
Re-evaluate what your ESQL is doing, which I suspect is not what you think it's doing?
(I think you're trying to create a message tree in the output with a root tag of <Message>, which you can't do that way)
Run a user trace to see what the input message tree looks like? And therefore what path elements are present? |
|
|
 |
|