Author |
Message
|
phoebs |
Posted: Mon Nov 21, 2011 9:49 pm Post subject: Getting inconsistent results with XMLNSC parser in broker |
|
|
Newbie
Joined: 18 Aug 2011 Posts: 5
|
Below is the message flow -
RCD (change domain to XMLNSC) --> Compute node (to access incoming XMLNSC message) --> MQOutput
The input message is XML with a CDATA section that holds a large fixed-length message (FLM, 40000 bytes), and this FLM contains a closing square bracket. The problem shows up in the Compute node when I modify an attribute value on one of the elements of the input message. Whenever an attribute value is modified using XMLNSC.Attribute, the CDATA is altered and two CDATA sections appear in OutputRoot.XMLNSC. That is, the XMLNSC parser seems to treat the closing square bracket inside the CDATA as the end of the CDATA section, and it adds a new CDATA section for the rest of the message.
It works fine when the XMLNS domain is used instead of XMLNSC. Has anyone faced a similar issue? |
|
|
 |
fjb_saper |
Posted: Mon Nov 21, 2011 10:25 pm Post subject: Re: Getting inconsistent results with XMLNSC parser in broke |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
phoebs wrote: |
Below is the message flow -
RCD (change domain to XMLNSC) --> Compute node (to access incoming XMLNSC message) --> MQOutput
......
That is, XMLNSC parser is considering the closing square bracket within CDATA FLM message as end of the CDATA section. And adding new CDATA section for rest of the message.
Whereas, it is working fine when XMLNS domain is used instead of XMLNSC. Please suggest if anyone has faced the similar issue. |
If the XMLNS domain works for you, by all means use it. XMLNSC will not work for EVERYTHING, but it will work in most cases... _________________ MQ & Broker admin |
|
|
 |
kimbert |
Posted: Tue Nov 22, 2011 1:28 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
More information please; your description of the problem leaves a lot of questions in my mind.
Quote: |
And this FLM message within CDATA is having closing square bracket. |
Just one closing square bracket? Or do you mean the end-of-CDATA sequence ']]>'?
Quote: |
Issue is coming up in compute node while manipulating the XMLNSC attribute value of one of the tags within input message. When attribute value of any tag is being modified using XMLNSC.Attribute; CDATA is getting altered. |
Be specific. Quote the input XML, and quote the ESQL. And please use [c o d e] tags when quoting your ESQL/XML.
Quote: |
That is, XMLNSC parser is considering the closing square bracket within CDATA FLM message as end of the CDATA section. And adding new CDATA section for rest of the message. |
I don't understand what you mean. Are you claiming that XMLNSC is doing the parsing wrongly? Or are you claiming that XMLNSC is writing an invalid XML document after you have changed the message tree? |
|
|
 |
phoebs |
Posted: Tue Nov 22, 2011 7:06 am Post subject: |
|
|
Newbie
Joined: 18 Aug 2011 Posts: 5
|
Kimbert, please find the details below -
1)
Quote: |
Just one closing square bracket? Or do you mean the end-of-CDATA sequence ']]>'? |
Just one closing square bracket. Test message is provided below.
2) Input XML -
Code: |
<?xml version="1.0" encoding="UTF-8"?>
<Envelope>
<o:Orchestration xmlns:o="http://www.w3.org/2001/XMLSchema"
name="Test 1" trace="2011-11-21T09:34:54.667516">
</o:Orchestration>
<Payload>
<![CDATA[ 00000 000000 01393
I11E850020111121009345400020111121009345400MULWTX 122000030
3550000609602 020110720C00191USD100512000876
0512001234 000000000000125000S014
00000000000000000012500000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000
00000000MULWTX 100512000027191 00000WIRE TYPE:BOOK IN
DATE:060110 TIME:1851 ET +-=ABC LINE2- test after ~()
<>$#%+-=ABCD LINE3-[test aft] ~(ERVICE REF:
ELATED REF: RIG:WESTLAKE CHEMICAL
CORPORATION FBO BANK OF AMERICA .A. 2801 POST OAK BLVD STE
600 HOUSTON TX 77056 D:003750232879
RG BK: ID: NS BK: ID:
ND BK: ID: NF:MONITRONICS FUNDING
LP CLEARING ACCT ATTN: STEVE EDRICK PO BOX 814530 DALLAS TX
75381-4530 ID:005801014738 NF BK: ID:
AYMENT DETAILS: ]]>
</Payload>
</Envelope> |
ESQL -
Code: |
SET OutputRoot = InputRoot;
DECLARE root REFERENCE TO OutputRoot.XMLNSC.Envelope.*:Orchestration;
SET root.(XMLNSC.Attribute)trace = 'TEST_XMLNSC';
PROPAGATE TO TERMINAL 'out1' DELETE NONE;
|
Output XML with two CDATA sections -
Code: |
<?xml version="1.0" encoding="UTF-8"?>
<Envelope>
<o:Orchestration xmlns:o="http://www.w3.org/2001/XMLSchema"
name="Test 1" trace="TEST_XMLNSC">
</o:Orchestration>
<Payload><![CDATA[ 00000 000000
01393
I11E850020111121009345400020111121009345400MULWTX 122000030
3550000609602 020110720C00191USD100512000876
0512001234 000000000000125000S014
00000000000000000012500000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000
00000000MULWTX 100512000027191 00000WIRE TYPE:BOOK IN
DATE:060110 TIME:1851 ET +-=ABC LINE2- test after ~()
<>$#%+-=ABCD LINE3-[test aft]]><![CDATA[] ~(ERVICE REF:
ELATED REF: RIG:WESTLAKE CHEMICAL
CORPORATION FBO BANK OF AMERICA .A. 2801 POST OAK BLVD STE
600 HOUSTON TX 77056 D:003750232879
RG BK: ID: NS BK: ID:
ND BK: ID: NF:MONITRONICS FUNDING
LP CLEARING ACCT ATTN: STEVE EDRICK PO BOX 814530 DALLAS TX
75381-4530 ID:005801014738 NF BK: ID:
AYMENT DETAILS:
]]>
</Payload>
</Envelope> |
3) The input message is well-formed XML, but the output XML has two CDATA sections when the message tree is modified as shown above. |
|
|
 |
smdavies99 |
Posted: Tue Nov 22, 2011 7:37 am Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
To quote section 2.7 of the w3c document http://www.w3.org/TR/REC-xml/
Quote: |
[Definition: CDATA sections may occur anywhere character data may occur; they are used to escape blocks of text containing characters which would otherwise be recognized as markup. CDATA sections begin with the string " <![CDATA[ " and end with the string " ]]> ":]
|
By markup, I understand this to mean XML. The data in the CDATA section of your post is not XML. Actually, I think you should be using a BLOB for this part,
viz. a base64Binary section, NOT CDATA,
as described here
http://books.xmlschemata.org/relaxng/ch19-77017.html _________________ WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995
Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions. |
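
A minimal sketch of the base64 suggestion above, written in Python rather than ESQL so it can be run standalone. The `Envelope` and `Payload` element names are taken from the message posted earlier in the thread; the helper names `wrap_payload` and `unwrap_payload` are hypothetical and not part of any broker API. It only illustrates that a base64-encoded payload contains no `]` or markup characters, so no CDATA section is needed.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_payload(flat_record: bytes) -> bytes:
    """Embed a fixed-length record as base64 text (hypothetical helper)."""
    envelope = ET.Element("Envelope")
    payload = ET.SubElement(envelope, "Payload")
    # The base64 alphabet is A-Z, a-z, 0-9, '+', '/', '=' only.
    payload.text = base64.b64encode(flat_record).decode("ascii")
    return ET.tostring(envelope, encoding="utf-8")


def unwrap_payload(document: bytes) -> bytes:
    """Recover the original record bytes from the Payload element."""
    return base64.b64decode(ET.fromstring(document).findtext("Payload"))


# A fragment of the record from the thread, including the square brackets.
record = b"LINE3-[test aft] ~(ERVICE REF:"
document = wrap_payload(record)
print(unwrap_payload(document) == record)
```

The trade-off is that the payload is no longer human-readable in the XML and grows by roughly a third.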
|
|
 |
kimbert |
Posted: Tue Nov 22, 2011 7:42 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
No need to post the entire message. A very small CDATA section would presumably exhibit the same behaviour, and would be easier to read and discuss.
I can confirm that XMLNSC does not split up CDATA sections when it writes out the value. So it would be interesting to see the message tree after the input node. Please do this:
- Add a Trace node with pattern ${Root} immediately after the input node
- Send in a small XML message, with a CDATA section that contains the text 'start]end'.
- Make sure that the flow alters the message tree in some way - otherwise the input document will get copied directly to the output.
- Post the output of the Trace node, in [c o d e] tags, of course. |
|
|
 |
phoebs |
Posted: Tue Nov 22, 2011 9:13 am Post subject: |
|
|
Newbie
Joined: 18 Aug 2011 Posts: 5
|
Kimbert,
I posted the entire message in my earlier post because the output differs depending on whether the CDATA section in the input message is small or large. The issue only occurs when the CDATA contains a large FLM message.
Below are the two scenarios -
1) Small CDATA -
As suggested, I used a small CDATA value: <![CDATA[start]end]]>
No issues in the output XML. The output has only one CDATA section even when the message tree is modified in ESQL. The CDATA section is also displayed correctly by the Trace node attached right after the MQInput node -
Code: |
(0x03000001:CDataField ):Payload = 'start]end' (CHARACTER) |
2) Large CDATA -
The value is <![CDATA[start]end..... (followed by 40000 bytes of spaces)]]>
The output XML has two CDATA sections.
The Trace node attached after the MQInput node also shows two CDATA sections -
Code: |
)
(0x01000000:Folder ):Payload = (
(0x02000001:CDataValue):CDATA = 'start' (CHARACTER)
(0x02000001:CDataValue):CDATA = ']end
) |
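
The two scenarios above can be reproduced with generated test messages. The following is only a sketch: the function name `build_test_message` is hypothetical, the element names come from the original input message, and the padding of 40000 spaces follows the description in this post. How the message is put on the queue is left out.

```python
import xml.etree.ElementTree as ET


def build_test_message(padding: int = 0) -> bytes:
    """Return an Envelope document whose CDATA holds 'start]end' plus spaces."""
    body = "start]end" + " " * padding
    document = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<Envelope><Payload><![CDATA[" + body + "]]></Payload></Envelope>"
    )
    return document.encode("utf-8")


small = build_test_message()               # scenario 1: small CDATA
large = build_test_message(padding=40000)  # scenario 2: large CDATA

# Sanity check with a generic XML parser: both messages are well-formed
# and carry the same text apart from the trailing padding.
for name, message in (("small", small), ("large", large)):
    payload = ET.fromstring(message).findtext("Payload")
    print(name, len(message), repr(payload.rstrip()))
```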
|
|
|
 |
kimbert |
Posted: Wed Nov 23, 2011 1:57 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Quote: |
I had posted the entire message in earlier post because there is discrepancy in output results received with small and large CDATA section being used in input message. |
...but you decided not to mention it!!
Quote: |
Also, two Cdata sections are published in trace node attached after MQInput node |
That is an important finding. Clearly the XML parser is *purposely* reporting the CDATA section in two parts. There is nothing wrong with that, technically. The resulting message trees are equivalent in XML terms, and the output XML document means the same whether or not the CDATA section is in one part or two parts.
I don't know why the split is happening after the first ']' though - I may do some investigation on that. |
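
The equivalence described above can be checked outside the broker. This is a small sketch using Python's standard `xml.etree.ElementTree`, not the XMLNSC parser, so it says nothing about why the broker splits the section; it only shows that a generic XML parser reports the same character data for one CDATA section or two adjacent ones.

```python
import xml.etree.ElementTree as ET

# The same payload written as one CDATA section and as two adjacent ones.
one_section = "<Payload><![CDATA[start]end]]></Payload>"
two_sections = "<Payload><![CDATA[start]]><![CDATA[]end]]></Payload>"

text_one = ET.fromstring(one_section).text
text_two = ET.fromstring(two_sections).text

# Both documents carry identical character data for the element.
print(text_one == text_two)
```

A receiving application that reads the element value through an XML parser should therefore see no difference. Only code that inspects the raw bytes or counts CDATA nodes would notice the split.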
|
|
 |
sirsi |
Posted: Mon Dec 17, 2012 7:33 am Post subject: |
|
|
Disciple
Joined: 11 Mar 2005 Posts: 177
|
hi guys, this is an old thread but I would like to know if there is an explanation for this behaviour? |
|
|
 |
kimbert |
Posted: Mon Dec 17, 2012 7:46 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Before I comment, I would like somebody to take the time to explain carefully and accurately what 'this behaviour' is. The OP never got around to doing that.
Are you seeing something that looks similar, or are you just curious? |
|
|
 |
alastair |
Posted: Fri Feb 22, 2013 8:35 pm Post subject: |
|
|
 Novice
Joined: 08 Feb 2012 Posts: 18 Location: Sydney
|
Hi guys. I am seeing this exact same behaviour.
I have a message coming in to an MQInput node and being parsed with the XMLNSC parser. The input message is XML, with one of the elements containing a CDATA string. This CDATA string contains a large XML structure which happens to contain a few instances of the character ] apart from the end of the CDATA section. The XML is valid. The parser creates a sibling CDATA every time it encounters a ].
The interesting thing is that when I tried to recreate this with a small XML message, the parser correctly created a single CDATA field. Very strange.
I am happy to provide an RFHUTIL loadable file with the culprit message to help.
Here is a snippet of the trace on this field:
Code: |
<Items><Item><Description><Value>~!@#$%^&*()_+<,>.:;"'?/{[}' (CHARACTER)
(0x02000001:CDataValue):CDATA = ']\|122ABCD 456789</Value><Control/></Description><Class><Value>Other</Value><Control/></Class><SumInsured><Value>25000</Value><Control/></SumInsured></Item><Item><Description><Value>Home 123456</Value><Control/></Description><Class><Value>Other</Value><Control/></Class><SumInsured><Value>3000</Value><Control/></SumInsured></Item><Item><Description><Value>Home 987654</Value><Control/></Description><Class><Value>Electronic</Value><Control/></Class><SumInsured><Value>25000</Value><Control/></SumInsured></Item><Item><Description><Value>~!@#$%^&*()_+<,>.:;"'?/{[}' (CHARACTER)
(0x02000001:CDataValue):CDATA = ']\|122ABCD 456789</Value><Control/></Description><Class><Value>Other</Value><Control/></Class><SumInsured><Value>25000</Value><Control/></SumInsured></Item><Item><Description><Value>Home 123456</Value><Control/></Description><Class><Value>Other</Value><Control/></Class><SumInsured><Value>3000</Value><Control/></SumInsured></Item><Item><Description><Value>Home 987654</Value><Control/></Description><Class><Value>Electronic</Value><Control/></Class><SumInsured><Value>25000</Value><Control/></SumInsured></Item><Item><Description><Value>Home 123456</Value><Control/></Description><Class><Value>Other</Value><Control/></Class><SumInsured><Value>3000</Value><Control/></SumInsured></Item><Item><Description><Value>Home 987654</Value><Control/></Description><Class><Value>Electronic</Value><Control/></Class><SumInsured><Value>25000</Value><Control/></SumInsured></Item><Item><Description><Value>~!@#$%^&*()_+<,>.:;"'?/{[}' (CHARACTER)
(0x02000001:CDataValue):CDATA = ']\|122ABCD 456789</Value><Control/></Description><Class><Value>Other</Value><Control/></Class><SumInsured><Value>25000</Value><Control/></SumInsured></Item><Item><Description><Value>Home 123456</Value><Control/></Description><Class><Value>Other</Value><Control/></Class><SumInsured><Value>3000</Value><Control/></SumInsured></Item><Item><Description><Value>Home 987654</Value><Control/></Description><Class><Value>Electronic</Value><Control/></Class><SumInsured><Value>25000</Value><Control/></SumInsured></Item><Item><Description><Value>~!@#$%^&*()_+<,>.:;"'?/{[}' (CHARACTER)
(0x02000001:CDataValue):CDATA = ']\|122ABCD 456789</Value><Control/></Description><Class><Value>Other</Value><Control/></Class><SumInsured><Value>25000</Value><Control/></SumInsured></Item><Item><Description><Value>Home 123456</Value><Control/></Description><Class><Value>Other</Value><Control/></Class><SumInsured><Value>3000</Value><Control/></SumInsured></Item><Item><Description><Value>~!@#$%^&*()_+<,>.:;"'?/{[}' (CHARACTER)
(0x02000001:CDataValue):CDATA = ']\|122ABCD 456789</Value><Control/></Description><Class><Value>Other</Value><Control/></Class><SumInsured><Value>25000</Value><Control/></SumInsured></Item></Items> |
|
|
|
 |
adubya |
Posted: Sat Feb 23, 2013 12:54 am Post subject: |
|
|
Partisan
Joined: 25 Aug 2011 Posts: 377 Location: GU12, UK
|
|
|
 |
alastair |
Posted: Sat Feb 23, 2013 3:58 am Post subject: |
|
|
 Novice
Joined: 08 Feb 2012 Posts: 18 Location: Sydney
|
Running version 7004.
I'm also noticing that the content between the XML tags is being un-escaped in the same area where the CDATA is split.
So
<Value>~!@#$%^&amp;*()_+&lt;,&gt;.:;"'?/{[}]\|122ABCD 456789</Value>
Is being changed by a PARSE command to:
<Value>~!@#$%^&*()_+<,>.:;"'?/{[}]\|122ABCD 456789</Value>
Not sure what to do about this. |
|
|
 |
adubya |
Posted: Sun Feb 24, 2013 12:49 am Post subject: |
|
|
Partisan
Joined: 25 Aug 2011 Posts: 377 Location: GU12, UK
|
Unescaping the content when parsing is the correct behaviour though.
Looking at your input message, you've got character data and CDATA content under the <Value> element. Have you tried enabling the "Retain Mixed Content" property of the MQInput node?
Last edited by adubya on Sun Feb 24, 2013 1:03 am; edited 1 time in total |
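
To illustrate the point about un-escaping, here is a sketch with Python's standard `xml.etree.ElementTree` (an assumption for illustration, not the broker's parser). Parsing turns entity references into the characters they stand for, and a conforming serializer escapes them again on output, so the logical value survives a round trip.

```python
import xml.etree.ElementTree as ET

escaped = "<Value>a &amp; b &lt; c</Value>"

# Parsing replaces the entity references with the characters they stand for.
element = ET.fromstring(escaped)
print(element.text)  # a & b < c

# Serializing escapes the special characters again, giving well-formed XML.
serialized = ET.tostring(element, encoding="unicode")
print(serialized)  # <Value>a &amp; b &lt; c</Value>
```

Un-escaped text only produces invalid XML if it is later written out as raw character data without escaping, for example by copying the parsed value into a string and treating that string as a document.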
|
|
 |
alastair |
Posted: Sun Feb 24, 2013 12:53 am Post subject: |
|
|
 Novice
Joined: 08 Feb 2012 Posts: 18 Location: Sydney
|
Unescaping tags I have no problem with. Unescaping element values generates invalid XML. See my example above. |
|
|
 |
|