Author |
Message
|
woofbert |
Posted: Thu Oct 18, 2007 4:40 pm Post subject: howto prevent CDATA content from being parsed |
|
|
Novice
Joined: 18 Oct 2007 Posts: 10
|
Hello,
I need to process a Message consisting of two parts, for example
Code: |
<message>
<partA>
<tag1>data</tag1>
</partA>
<partB>
<tagx>some data</tagx>
<tagy><![CDATA[some silly data including < & > & ! etc..
]]></tagy>
</partB>
</message> |
I get the message over an HTTP Input node using XMLNSC parser.
If I pass the message through to the HTTP Reply node, everything works fine: my reply message looks exactly like the input message.
If I change the tree in a Compute node, using
Code: |
SET OutputRoot = InputRoot;
DELETE FIELD OutputRoot.XMLNSC.message.partA.tag1; |
then my reply message looks like
Code: |
<message>
<partA/>
<partB>
<tagx>some data</tagx>
<tagy>some silly data including &lt; &amp; &gt; &amp; ! etc..</tagy>
</partB>
</message>
|
My problem is that the contents in <partB> are generic, they may or may not include fields with CDATA tags.
What can I do to preserve the <partB> content of the message, so that it is written to the HTTP reply exactly as it was sent to the broker flow?
Many thanks for your suggestions.
woofbert. |
|
kimbert |
Posted: Fri Oct 19, 2007 1:17 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
For some reason, element 'OutputRoot.XMLNSC.message.partB.tagy' is not marked as being XMLNSC.CDataField. The XMLNSC parser will have marked 'InputRoot.XMLNSC.message.partB.tagy' as XMLNSC.CDataField, so something in your message flow must have reset the field type. Did you by any chance copy it to an Environment tree? |
|
shalabh1976 |
Posted: Fri Oct 19, 2007 4:55 am Post subject: |
|
|
 Partisan
Joined: 18 Jul 2002 Posts: 381 Location: Gurgaon, India
|
I am not sure but will the XML.AsisElementContent help........ _________________ Shalabh
IBM Cert. WMB V6.0
IBM Cert. MQ V5.3 App. Prog.
IBM Cert. DB2 9 DB Associate |
|
kimbert |
Posted: Fri Oct 19, 2007 5:11 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Quote: |
I am not sure but will the XML.AsisElementContent help |
Two problems with that:
1. XMLNSC does not have a constant called 'XMLNSC.AsIsElementContent' ( and you wouldn't try to use 'XML.AsIsElementContent' with XMLNSC, would you )
2. Using AsIsElementContent is a brute force solution, and should be a last resort. But there is a clear problem in woofbert's message flow, and if he solves that, he will get the behaviour he wants. So no need to use brute force.
Before you ask, XMLNSC has a field type 'XMLNSC.Bitstream' which does the same as XML.Bitstream ( outputs a BLOB as-is ). You just need to CAST your character data to BLOB before using it.
Last edited by kimbert on Sat Oct 20, 2007 2:58 am; edited 1 time in total |
|
shalabh1976 |
Posted: Fri Oct 19, 2007 5:19 am Post subject: |
|
|
 Partisan
Joined: 18 Jul 2002 Posts: 381 Location: Gurgaon, India
|
kimbert,
Quote: |
XMLNSC does have a constant called 'XMLNSC.AsIsElementContent' ( and you wouldn't try to use 'XML.AsIsElementContent' with XMLNSC, would you ) |
It should have been XMLNSC all along.
However my point was more of a pointer rather than a definite answer. |
|
kimbert |
Posted: Sat Oct 20, 2007 2:57 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Quote: |
XMLNSC does have a constant called 'XMLNSC.AsIsElementContent' |
Sorry - that should have read 'XMLNSC does not have a constant called 'XMLNSC.AsIsElementContent'. Hence my point at the end of that post. |
|
woofbert |
Posted: Sat Oct 20, 2007 9:02 am Post subject: |
|
|
Novice
Joined: 18 Oct 2007 Posts: 10
|
kimbert wrote: |
2. Using AsIsElementContent is a brute force solution, and should be a last resort. But there is a clear problem in woofbert's message flow, and if he solves that, he will get the behaviour he wants. So no need to use brute force. |
Sorry, I was offline for one day...
To narrow down the problem I created a really simple flow, did some testing, and found that the first information I gave was not quite accurate.
The parsing problem only occurs if there is a character (whitespace or <CRLF>) between the CDATA section and the enclosing tags (either the opening or the closing tag).
So using the input message
Code: |
<message>
<partA>
<tag1>data</tag1>
</partA>
<partB>
<tagx>some data</tagx>
<tagy><![CDATA[some silly data including < & > & ! etc..]]></tagy>
</partB>
</message> |
will result in
Code: |
<message><partA/><partB><tagx>some data</tagx><tagy><![CDATA[some silly data including < & > & ! etc..]]></tagy></partB></message> |
which is absolutely correct,
whereas the message (the only difference is the <CRLF> before </tagy>)
Code: |
<message>
<partA>
<tag1>data</tag1>
</partA>
<partB>
<tagx>some data</tagx>
<tagy><![CDATA[some silly data including < & > & ! etc..]]>
</tagy>
</partB>
</message> |
will become
Code: |
<message><partA/><partB><tagx>some data</tagx><tagy>some silly data including &lt; &amp; &gt; &amp; ! etc..
</tagy></partB></message> |
This is what the flow looks like:
HTTP Input -> Compute -> HTTP Reply
with Input Message Parsing = 'XMLNSC'. Everything else is left default.
The ESQL code looks exactly like:
Code: |
CREATE COMPUTE MODULE DummyService_Compute
CREATE FUNCTION Main() RETURNS BOOLEAN
BEGIN
SET OutputRoot = InputRoot;
DELETE FIELD OutputRoot.XMLNSC.message.partA.tag1;
RETURN TRUE;
END;
END MODULE; |
Is this behavior as it should be, or is it time to open a bug against WMB?
Is it brute force time?
What would the brute force code for copying message.partB from InputRoot to OutputRoot look like?
Thank you for your suggestions
woofbert |
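For anyone checking the two replies outside the broker, here is a minimal Python sketch. It assumes the second reply carries the special characters as entity references, and it uses the standard-library `xml.etree.ElementTree` parser purely as a stand-in for a downstream XML consumer; nothing in it is broker code. It suggests that the two forms differ only in how they are written, not in the character data an XML parser hands to the application.

```python
import xml.etree.ElementTree as ET

# Input as sent: a CDATA section followed by a line break before </tagy>.
sent = "<tagy><![CDATA[some silly data including < & > & ! etc..]]>\n</tagy>"

# Reply (assumed form): the same characters written with entity references.
received = "<tagy>some silly data including &lt; &amp; &gt; &amp; ! etc..\n</tagy>"

# A parser reports the element's character data, not how it was written.
text_sent = ET.fromstring(sent).text
text_received = ET.fromstring(received).text

print(text_sent == text_received)  # prints True
```

So the reply is equivalent for any consumer that parses it as XML. The difference only matters when the receiver compares the message byte for byte, which is the requirement in this thread.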
|
kimbert |
Posted: Wed Oct 24, 2007 4:01 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Can you please insert a Trace node and post the results for both message styles. |
|
woofbert |
Posted: Fri Oct 26, 2007 4:24 am Post subject: |
|
|
Novice
Joined: 18 Oct 2007 Posts: 10
|
kimbert wrote: |
Can you please insert a Trace node and post the results for both message styles. |
Hello kimbert,
I took the messages and ESQL code as posted earlier in this thread and added two trace nodes. So the flow looks like
MQ Input -> Trace A -> Compute -> Trace B -> MQ Output
Here is the result of the traces:
1) Message style 1 (when everything is fine)
Code: |
Trace A:
(0x01000000):XMLNSC = (
(0x01000000):message = (
(0x01000000):partA = (
(0x03000000):tag1 = 'data'
)
(0x01000000):partB = (
(0x03000000):tagx = 'some data'
(0x03000001):tagy = 'some silly data including < & > & ! etc..'
)
)
)
Trace B:
(0x01000000):XMLNSC = (
(0x01000000):message = (
(0x01000000):partA =
(0x01000000):partB = (
(0x03000000):tagx = 'some data'
(0x03000001):tagy = 'some silly data including < & > & ! etc..'
)
)
)
|
2) Message style 2, with <CRLF> and some whitespace between "]]>" and "</tagy>" (which afaik is o.k. in terms of XML)
Code: |
Trace A:
(0x01000000):XMLNSC = (
(0x01000000):message = (
(0x01000000):partA = (
(0x03000000):tag1 = 'data'
)
(0x01000000):partB = (
(0x03000000):tagx = 'some data'
(0x03000002):tagy = 'some silly data including < & > & ! etc..
'
)
)
)
Trace B:
(0x01000000):XMLNSC = (
(0x01000000):message = (
(0x01000000):partA =
(0x01000000):partB = (
(0x03000000):tagx = 'some data'
(0x03000002):tagy = 'some silly data including < & > & ! etc..
'
)
)
)
|
What I see is that the types are different (CData vs. Hybrid). But why does this happen?
Even if this is working as designed, how can I preserve the whole partB so that it looks identical in the incoming and outgoing message?
woofbert |
|
kimbert |
Posted: Fri Oct 26, 2007 5:15 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Thanks woofbert,
I do have an explanation, but you may not like it.
The XMLNSC parser is a 'compact' parser, but in this case it is being a little too 'compact'. When it sees Message Style 2, it concatenates the CData with the PCData ( the <CRLF> ) and sets the field type of tagy to 'CHybridField'. When it outputs tagy, it ignores the fact that some of it started out as CData. It just outputs it as if it were normal PCData. That triggers the usual non-CData entity substitutions which you are seeing.
The XMLNS parser would do it differently. It would create two child elements of tagy. The first would hold the CData content, and its field type would be 'XMLNS.CDataSection'. The second would hold the <CRLF> and its field type would be 'XMLNS.Content'. I think the XMLNSC parser should do the same as XMLNS ( watch this space )
For now, your workarounds are:
a) use XMLNSC.Bitstream to output the exact data that you want to see
b) switch to XMLNS
b) has the disadvantage that you are moving away from XMLNSC just before it gets a big improvement in performance and capability ( in v6.1 ) |
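The two strategies described above can be seen side by side outside the broker. The following is only an analogy in Python, not broker code: `xml.etree.ElementTree` stands in for a parser that merges CDATA and ordinary text into one value, and `xml.dom.minidom` stands in for a parser that keeps them as separate child nodes. The sample element and its content are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# CDATA section followed by a line break before the closing tag.
doc = "<tagy><![CDATA[a < b & c]]>\n</tagy>"

# Merged view: CDATA and trailing text become a single string, so the
# serializer can only write it back as escaped character data.
merged = ET.fromstring(doc)
merged_out = ET.tostring(merged, encoding="unicode")

# Separate-node view: one CDATASection node plus one Text node, so the
# CDATA markup is still known when the element is written out again.
dom = minidom.parseString(doc)
kinds = [type(node).__name__ for node in dom.documentElement.childNodes]
dom_out = dom.documentElement.toxml()

print(merged_out)  # escaped form, no CDATA markup
print(kinds)       # node class names of the two children
print(dom_out)     # CDATA markup preserved
```

In the merged view the information "this part came from a CDATA section" is gone by the time the element is serialized, which matches the symptom in Message Style 2. In the separate-node view it survives, which is why switching parsers works as a workaround.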
|
mgk |
Posted: Fri Oct 26, 2007 11:55 am Post subject: |
|
|
 Padawan
Joined: 31 Jul 2003 Posts: 1642
|
Hi.
I think you should go for option C: Raise a PMR and request a fix for this  _________________ MGK
The postings I make on this site are my own and don't necessarily represent IBM's positions, strategies or opinions. |
|
woofbert |
Posted: Sun Oct 28, 2007 3:42 pm Post subject: |
|
|
Novice
Joined: 18 Oct 2007 Posts: 10
|
Thank you for all your input.
I think I'll take option b), because it works and throughput is not a real concern right now. In parallel I'll pursue option c) and open a PMR.
Thanks,
woofbert |
|
|