Author |
Message
|
TonyD |
Posted: Thu Oct 16, 2003 10:40 pm Post subject: UCS-2 XML Parsing Question |
|
|
Knight
Joined: 15 May 2001 Posts: 540 Location: New Zealand
|
We have a .NET application that puts a UCS-2 (CCSID '1200') XML message on to a WIN2k WBI 5 broker input queue. The MQInput node specifies Message Domain 'XML'. There is no byte order mark (x'FFFE') preceding the data. The WBI XML parser rejects the message:
Quote: |
ParserException BIP5004E: XML parsing error (Invalid document structure) encountered on line 1 column 1 while parsing element .
|
When I insert the byte order mark, the message is parsed correctly.
Is there any other way, besides preceding the message data with a byte order mark, to get the WBI parser to correctly handle the message? I wondered about specifying an encoding declaration, but it did not have any effect. |
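For reference, the workaround described above (prefixing the data with a byte order mark) can be sketched as follows. This is only an illustration in Python, not the .NET code from the original application; the helper name `with_bom_utf16le` and the sample payload are made up for the example. It assumes the sender writes little-endian UTF-16, which is why the mark appears on the wire as the bytes x'FFFE'.

```python
import codecs

def with_bom_utf16le(xml_text: str) -> bytes:
    """Encode text as little-endian UTF-16 and prepend the byte order mark.

    Hypothetical helper for illustration only.
    """
    # codecs.BOM_UTF16_LE is the two bytes FF FE (U+FEFF in little-endian order).
    return codecs.BOM_UTF16_LE + xml_text.encode("utf-16-le")

# Sample payload (invented for this sketch).
payload = with_bom_utf16le("<msg>hello</msg>")

# Show the first two bytes that a receiving parser would see.
print(payload[:2].hex())
```

The explicit `utf-16-le` codec is used because it never writes a mark on its own, so the prefix is added exactly once.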
|
jfluitsm |
Posted: Sun Oct 19, 2003 5:50 am Post subject: |
|
|
Disciple
Joined: 24 Feb 2002 Posts: 160 Location: The Netherlands
|
When UCS-2 (a subset of UTF-16) is used, there should be a byte order mark; otherwise the XML is not valid (as the error tells you).
See http://www.w3.org/TR/REC-xml#charencoding.
_________________
Jan Fluitsma
IBM Certified Solution Designer WebSphere MQ V6
IBM Certified Solution Developer WebSphere Message Broker V6 |
|
TonyD |
Posted: Sun Oct 19, 2003 3:07 pm Post subject: |
|
|
Knight
Joined: 15 May 2001 Posts: 540 Location: New Zealand
|
Thank you for that reference. It does, however, appear to indicate that an encoding declaration is an acceptable alternative. I had already tried my message with 'encoding=ISO-10646-UCS2' and no byte order mark, but parsing failed. Should the WBI parser have been able to handle this alternative? |
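One way to see why the encoding declaration alone may not help: the declaration is itself stored in the document's encoding, so a parser has to guess the encoding before it can read the declaration at all. The Python sketch below only illustrates the byte layout; the declaration string reuses the encoding name quoted in the post, and the little-endian byte order is an assumption based on the sender being a Windows application.

```python
# Declaration text, using the encoding name quoted in the post above.
decl = "<?xml version='1.0' encoding='ISO-10646-UCS2'?>"

ascii_bytes = decl.encode("ascii")
utf16le_bytes = decl.encode("utf-16-le")  # assumed byte order, no BOM

# In a single-byte encoding the document starts with the literal bytes "<?xml".
print(ascii_bytes[:5])

# In UTF-16LE every character is followed by a zero byte, so a parser that
# expects "<?xml" as five consecutive bytes does not find it.
print(utf16le_bytes[:10])
print(utf16le_bytes.startswith(b"<?xml"))
```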
|
jfluitsm |
Posted: Mon Oct 20, 2003 11:32 am Post subject: |
|
|
Disciple
Joined: 24 Feb 2002 Posts: 160 Location: The Netherlands
|
Although it is not clear from the XML reference, I think the declaration always has to be in UTF-8 or UTF-16; otherwise there is no way to interpret the encoding. Without the BOM the parser can't be sure whether big-endian or little-endian UTF-16 is used. |
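As a rough illustration of the byte-order question, the sketch below checks the first bytes of a message the way the non-normative autodetection appendix of the XML 1.0 recommendation describes: a BOM settles the byte order directly, and without one a parser may still guess from the bytes of a leading "<?". Whether the WBI parser attempts such a guess is not known from this thread, and the function name `sniff_utf16` is hypothetical.

```python
def sniff_utf16(data: bytes) -> str:
    """Guess the UTF-16 byte order from the first bytes of an XML document.

    Simplified sketch: it ignores UTF-32, whose little-endian BOM also
    starts with FF FE, and it can only guess when the document begins
    with an XML declaration ("<?").
    """
    if data[:2] == b"\xff\xfe":
        return "utf-16-le (BOM)"
    if data[:2] == b"\xfe\xff":
        return "utf-16-be (BOM)"
    if data[:4] == b"<\x00?\x00":
        return "utf-16-le (guessed)"
    if data[:4] == b"\x00<\x00?":
        return "utf-16-be (guessed)"
    return "unknown"

# A document without a BOM and without a declaration gives no reliable hint.
print(sniff_utf16("<msg/>".encode("utf-16-le")))
```

A message that starts directly with the root element, as in the last line, falls through to "unknown", which matches the point that only the BOM makes the byte order unambiguous.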
|
TonyD |
Posted: Mon Oct 20, 2003 3:07 pm Post subject: |
|
|
Knight
Joined: 15 May 2001 Posts: 540 Location: New Zealand
|
That makes sense...thanks Jan. |
|