Author |
Message
|
mikepham |
Posted: Wed Mar 17, 2010 8:00 am Post subject: Need help on traverse xml message to remove cdata |
|
|
 Novice
Joined: 17 Mar 2010 Posts: 20
|
Hi everyone
I'm developing a Java compute node that traverses an XML message and removes the CDATA markers from element values.
For example:
Input message: <A><![CDATA[123456]]></A>
Expected output message: <A>123456</A>
CDATA can appear in many elements.
The compute node should scan the XML message and remove all the CDATA markers (not the values).
However, my code below does not work as I expected.
The CDATA markers surrounding the element values are still there and are not removed.
Could you help me figure out the mistake in my code?
I'm a newbie at this, so my code may not be good. If you have a different solution to my problem, I would really appreciate it.
I'm using WMB 6.1.
Thank you
Code: |
public void evaluate(MbMessageAssembly assembly) throws MbException {
MbOutputTerminal out = getOutputTerminal("out");
MbOutputTerminal alt = getOutputTerminal("alternate");
MbMessage inMessage = assembly.getMessage();
// create new message
MbMessage outMessage = new MbMessage(inMessage);
MbMessageAssembly outAssembly = new MbMessageAssembly(assembly,
outMessage);
try {
// ----------------------------------------------------------
// Add user code below
//Get root element of the whole mq message
MbElement root = outAssembly.getMessage().getRootElement();
//Get body of the message (XML)
MbElement body = root.getLastChild();
traverseAndRemoveCDATA(body);
// End of user code
// ----------------------------------------------------------
// The following should only be changed
// if not propagating message to the 'out' terminal
out.propagate(outAssembly);
} finally {
// clear the outMessage
outMessage.clearMessage();
}
}
public void traverseAndRemoveCDATA(MbElement node) throws MbException {
if (node == null){
return;
}
//Current node has one or more children
if (node.getFirstChild() != null){
MbElement childNode = node.getFirstChild();
traverseAndRemoveCDATA(childNode);
while (childNode.getNextSibling()!= null) {
traverseAndRemoveCDATA(childNode);
childNode = childNode.getNextSibling();
}
//Current node has no children
} else {
//check and remove cdata in this node
String value = node.getValue().toString();
String tempValue = value;
if (node.getType() == MbElement.TYPE_VALUE && value.startsWith("<![CDATA[") ){
//remove cdata surrounding the value
tempValue = tempValue.substring(9,value.indexOf("]]>"));
}
node.setValue(tempValue);
}
} |
Last edited by mikepham on Mon Mar 22, 2010 1:59 am; edited 1 time in total |
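One likely reason the `startsWith("<![CDATA[")` test never matches can be sketched with the standard Java DOM API instead of the broker's MbElement classes (an assumption made so the example runs outside the broker). A parser typically returns the content of a CDATA section without the `<![CDATA[` and `]]>` markers and records the CDATA-ness as a property of the node. The class and method names below are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; standard DOM is used as a stand-in for the broker tree.
class CdataValueSketch {

    /** Parses the XML and returns the first child node of its root element. */
    static Node firstChildOfRoot(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getFirstChild();
    }

    public static void main(String[] args) throws Exception {
        Node content = firstChildOfRoot("<A><![CDATA[123456]]></A>");
        // The value holds only the character data, not the markers.
        System.out.println(content.getNodeValue());
        // So a string test for the marker does not match.
        System.out.println(content.getNodeValue().startsWith("<![CDATA["));
        // The CDATA-ness is carried by the node type instead.
        System.out.println(content.getNodeType() == Node.CDATA_SECTION_NODE);
    }
}
```

If the broker tree behaves the same way, the thing to change is the type of the node rather than its string value.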
|
kimbert |
Posted: Wed Mar 17, 2010 9:27 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
|
mikepham |
Posted: Wed Mar 17, 2010 8:23 pm Post subject: |
|
|
 Novice
Joined: 17 Mar 2010 Posts: 20
|
Thank you for your reply. I know it may be a strange requirement ... but is there any way to solve my current issue, for example with an ESQL option?
The input XML (which has CDATA) comes from another system, and the goal is that I need an output XML without any CDATA.
I searched the forum and found suggestions such as:
- copy the current value of the element (without CDATA), set the element value to NULL, then put the original value back
However, I don't know how to determine which elements contain CDATA so that I can do this.
I'm using the XMLNS domain |
|
kimbert |
Posted: Thu Mar 18, 2010 2:09 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Quote: |
the goal is I need an output xml without any cdata |
Presumably the downstream application cannot handle CDATA sections. I think that's a poor design decision, but I accept that you're probably not in a position to change it.
Quote: |
I'm using xmlns domain |
You're on v6.1, so that decision needs to be justified. Please explain exactly why you have decided not to use XMLNSC, as recommended by the infocenter.
Quote: |
is there any way to solve my current issue, by esql option ? |
You could change the field type from XMLNSC.CDataField to XMLNSC.PCDataField. But you'd better hope that the not-quite-XML parser at the other end can handle XML character entities, because if the CDATA sections were *needed* then the output XML will end up containing a lot of &lt; and &gt; character entities. In other words this:
Code: |
<outerTag><![CDATA[<innerTag/>]]></outerTag> |
will become this:
Code: |
<outerTag>&lt;innerTag/&gt;</outerTag> |
Alternatively, you may want this output:
Code: |
<outerTag><innerTag/></outerTag> |
...in which case the correct solution is to parse the XML that was contained in the CDATA section using a CREATE...PARSE statement, and then copy the parsed subtree under the tag that used to contain the CDATA section.
Please reply with an indication of which type of output you need. |
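The second option, parsing the embedded XML and copying the parsed subtree under the enclosing tag, can be sketched outside the broker with the standard Java DOM API. This is only an analogy to CREATE...PARSE, not broker code. The class and method names are hypothetical, and the sketch assumes the CDATA content is a single well-formed XML element.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; standard DOM stands in for the broker message tree.
class CdataReparseSketch {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    static String serialize(Document doc) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    /** Replaces the root element's character content with the parsed form of that content. */
    static String reparseContent(String xml) throws Exception {
        Document outer = parse(xml);
        Element root = outer.getDocumentElement();
        // The text content of the CDATA section is itself XML, so parse it.
        Document inner = parse(root.getTextContent());
        // Drop the old CDATA child(ren) of the enclosing tag.
        while (root.hasChildNodes()) {
            root.removeChild(root.getFirstChild());
        }
        // Copy the parsed subtree into the outer document and attach it.
        Node imported = outer.importNode(inner.getDocumentElement(), true);
        root.appendChild(imported);
        return serialize(outer);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(reparseContent("<outerTag><![CDATA[<innerTag/>]]></outerTag>"));
    }
}
```

With this approach the inner tag becomes real markup in the output instead of escaped text, which only makes sense when the receiving application expects that structure.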
|
mikepham |
Posted: Fri Mar 19, 2010 12:49 am Post subject: |
|
|
 Novice
Joined: 17 Mar 2010 Posts: 20
|
Hi
Quote: |
Please explain exactly why you have decided not to use XMLNSC, as recommended by the infocenter |
Because almost all of the existing code uses XMLNS. My additional code should follow it rather than changing all of it.
I prefer this
Code: |
<outerTag><innerTag/></outerTag> |
Is it possible to traverse every element in the input XML (using ESQL/Java), apply the approach above, and get an output XML without any CDATA?
If yes, it would be nice if you could give me an example.
Thanks  |
|
kimbert |
Posted: Fri Mar 19, 2010 1:34 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Quote: |
It would be nice if you give me example |
Just do a recursive walk of the tree starting at the root. Use a reference, of course. For each node, check its field type using the FIELDTYPE function. If the field type is XML.CDataSection then change it to Value.
Code: |
IF FIELDTYPE(nodeRef) = XML.CDataSection THEN
-- change the field type of the content to 'Value'
SET nodeRef TYPE Value;
ENDIF |
Note that I am now using the field type constant XML.CDataSection *not* XMLNSC.CDataSection. I hope you understand why  |
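The recursive walk described above can be sketched in Java with the standard DOM API, as an analogy for readers who prefer a Java compute node. It is not broker code: replacing a CDATA node with a text node stands in for changing the field type, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; standard DOM stands in for the broker message tree.
class CdataWalkSketch {

    /** Visits every descendant of parent and turns CDATA sections into plain text nodes. */
    static void stripCdata(Document doc, Node parent) {
        Node child = parent.getFirstChild();
        while (child != null) {
            // Remember the next sibling first, because the child may be replaced.
            Node next = child.getNextSibling();
            if (child.getNodeType() == Node.CDATA_SECTION_NODE) {
                parent.replaceChild(doc.createTextNode(child.getNodeValue()), child);
            } else {
                stripCdata(doc, child); // recurse into elements
            }
            child = next; // every sibling, including the last, is visited exactly once
        }
    }

    /** Parses the XML, strips all CDATA sections, and serializes the result. */
    static String withoutCdata(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        stripCdata(doc, doc.getDocumentElement());
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(withoutCdata("<A><![CDATA[123456]]></A>"));
    }
}
```

Capturing the next sibling before touching the current child keeps the loop from visiting the first child twice or skipping the last one.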
|
mikepham |
Posted: Mon Mar 22, 2010 1:55 am Post subject: |
|
|
 Novice
Joined: 17 Mar 2010 Posts: 20
|
Hi kimbert
Thank you for your example
It worked for me with a small change
Code: |
IF FIELDTYPE(nodeRef) = XML.CDataSection THEN
-- change the field type of the content to 'Value'
SET nodeRef TYPE = Value;
END IF; |
The compiler said it was expecting "=" after TYPE.
Here is the traversal code in ESQL as well. I combined it with the piece of code above, and it seems to work fine for me.
So, if I ever get the chance to meet you in real life, I would like to buy you some beers to thank you, my friend
Code: |
CREATE PROCEDURE Traverse (IN REF_Cursor REFERENCE) BEGIN
IF (FIELDTYPE(REF_Cursor) = XML.CDataSection) THEN
-- change the field type of the content to 'Value'
SET REF_Cursor TYPE = Value;
END IF;
MOVE REF_Cursor FIRSTCHILD ;
IF LASTMOVE(REF_Cursor) THEN
CALL Traverse(REF_Cursor);
MOVE REF_Cursor PARENT;
END IF;
MOVE REF_Cursor NEXTSIBLING ;
IF LASTMOVE(REF_Cursor) THEN
CALL Traverse(REF_Cursor);
MOVE REF_Cursor PREVIOUSSIBLING;
END IF;
END;
|
|
|