MQSeries.net :: View topic - WMB V6.1 - Invalid Character in XML Message Issue

WMB V6.1 - Invalid Character in XML Message Issue


satya2481

Posted: Thu Dec 08, 2011 3:41 am Post subject: WMB V6.1 - Invalid Character in XML Message Issue

Disciple

Joined: 26 Apr 2007
Posts: 170
Location: Bengaluru

Hi All,
I am back again with another issue...

Background: There is a flow running in the Production environment on a V5 Broker. This flow has now been migrated and deployed to a V6.1 Broker.

Issue: An XML message sent to the V5 flow works fine. The same message sent to the V6.1 flow throws "Invalid character (Unicode: 0x1A)".

After checking which character is causing the problem, it turns out to be a "->" mark in one of the field values in the XML message. I think it is a bullet.

Is there any difference between the way the V5 broker and the V6.1 broker parse a message in the XML domain?

How can we fix this issue? We have to upgrade the flow, and it should work the same way it did on the V5 broker in the Production environment.

Thanks
Satya

mqjeff

Posted: Thu Dec 08, 2011 3:45 am Post subject:

Grand Master

Joined: 25 Jun 2008
Posts: 17447

The parser in v5 was not as correct as the parser in v6.1 and later.

You are actually using the XMLNSC parser in v6.1?

What I'm saying is that it's likely that the only reason it "worked just fine" in v5 is because it incorrectly accepted your incorrect XML documents.

kimbert

Posted: Thu Dec 08, 2011 4:07 am Post subject:

Jedi Council

Joined: 29 Jul 2003
Posts: 5543
Location: Southampton

mqjeff is correct - this looks very much like a defect that has been fixed. According to the XML specification (http://www.w3.org/TR/2006/REC-xml-20060816/#charsets), the allowed characters are:

Code:

[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */

So 0x1A is not allowed. v5 was non-compliant.
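
As an illustration of the Char production quoted above, here is a minimal Python sketch that tests a code point against those ranges. The function names `is_xml10_char` and `find_invalid_chars` are made up for this example; they are not part of the broker or of any XML library.

```python
def is_xml10_char(code_point):
    """Return True if code_point matches production [2] Char of XML 1.0."""
    return (
        code_point in (0x9, 0xA, 0xD)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


def find_invalid_chars(text):
    """Return (offset, 'U+XXXX') for every character the production rejects."""
    return [
        (offset, "U+{:04X}".format(ord(ch)))
        for offset, ch in enumerate(text)
        if not is_xml10_char(ord(ch))
    ]


if __name__ == "__main__":
    # 0x1A falls below #x20 and is not one of #x9, #xA, #xD.
    print(is_xml10_char(0x1A))
    print(find_invalid_chars("<a>x\x1ay</a>"))
```

Running a check like this over a decoded sample message is one way to see exactly where the rejected character sits before the message reaches the parser.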

satya2481

Posted: Thu Dec 08, 2011 10:04 pm Post subject:

Disciple

Joined: 26 Apr 2007
Posts: 170
Location: Bengaluru

Thank you very much for the information...

So is there any solution to resolve this kind of issue?

Should we replace such characters in the code? Or is there any other alternative?

Thank You
Satya

fjb_saper

Posted: Thu Dec 08, 2011 10:09 pm Post subject:

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20772
Location: LI,NY

satya2481 wrote:

Thank you very much for the information...

So is there any solution to resolve this kind of issue?

Should we replace such characters in the code? Or is there any other alternative?

Thank You
Satya

You need to find the producer of the message and determine the CCSID in which it is sent. This CCSID needs to support all the characters the producer is likely to send. (Hopefully you'll find the CCSID to be 1208 (UTF-8).)
Then make sure no conversion happens until the broker reads the message. (DO NOT set the conversion flag on the MQInput node.)
Verify also that on the outbound side there is no CCSID specified that would force a substitution character for a non-mapped character. (To make it easy, set the output CCSID to 1208.)

Have fun

_________________
MQ & Broker admin

Last edited by fjb_saper on Thu Dec 08, 2011 10:11 pm; edited 1 time in total

smdavies99

Posted: Thu Dec 08, 2011 10:10 pm Post subject:

Jedi Council

Joined: 10 Feb 2003
Posts: 6076
Location: Somewhere over the Rainbow this side of Never-never land.

I've seen this character appear when people have cut/pasted from a Windows screen.
As you say, the character seems to represent a bullet point.

The only way (apart from stopping this appearing in the first place) is to scan the message before it is parsed and replace the offending character with something that does not fall foul of the XMLNSC parser (other parsers are available).
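
A minimal sketch of that scan-and-replace step, written in Python purely for illustration (in a flow it would have to run on the raw payload before the XML parser is invoked, for example on a message received as a BLOB). The name `scrub_xml_text` and the `?` replacement are assumptions of this example, and it assumes the payload has already been decoded to text.

```python
import re

# Matches anything outside the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def scrub_xml_text(text, replacement="?"):
    """Replace every character the XML 1.0 parser would reject (hypothetical helper)."""
    return _INVALID_XML_CHARS.sub(replacement, text)


if __name__ == "__main__":
    print(scrub_xml_text("<item>\x1a First point</item>"))
```

Note that this only makes the message parseable. Whatever character was originally intended has already been lost by the time 0x1A appears, so the replacement is a placeholder rather than a recovery.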
_________________
WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995

Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions.

kimbert

Posted: Fri Dec 09, 2011 2:03 am Post subject:

Jedi Council

Joined: 29 Jul 2003
Posts: 5543
Location: Southampton

0x1A is often used as a 'substitution character' by character encoders (ICU being the most common one). In other words, 0x1A is the character that is output when the source string contains a Unicode character for which the output CCSID does not have a mapping.
If that guess is correct, the ideal fix would be to change the upstream (sending) application to use UTF-8 instead of whatever they're currently using. UTF-8 has a mapping for every Unicode character, so it will never get into this hole. It depends on whether the upstream application can be changed, of course.
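
The substitution behaviour described above can be imitated in Python as a rough sketch. Python's built-in `replace` error handler emits `?` rather than 0x1A, so this example registers its own handler (the name `sub_0x1a` and the function `encode_like_converter` are invented here) to mimic a converter that outputs 0x1A. ISO 8859-1 stands in for "a CCSID with no mapping for the character"; the sender's real CCSID is not known from this thread.

```python
import codecs

SUB = "\x1a"  # the ASCII SUB control character


def _substitute_with_sub(error):
    """Encoding error handler: emit one SUB per unencodable character."""
    if not isinstance(error, UnicodeEncodeError):
        raise error
    return (SUB * (error.end - error.start), error.end)


# Hypothetical handler name, registered only for this illustration.
codecs.register_error("sub_0x1a", _substitute_with_sub)


def encode_like_converter(text, codec):
    """Encode text, substituting 0x1A where the target code page has no mapping."""
    return text.encode(codec, errors="sub_0x1a")


if __name__ == "__main__":
    bullet = "a\u2022b"  # U+2022 BULLET has no mapping in ISO 8859-1
    print(encode_like_converter(bullet, "latin-1"))
    print(encode_like_converter(bullet, "utf-8"))
```

With the single-byte target the bullet collapses to 0x1A, which the XML parser then rejects; with UTF-8 the bullet survives as a three-byte sequence and the problem never arises.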
