Testo |
Posted: Tue Feb 01, 2005 12:13 pm Post subject: SOLVED - XML message: UTF-8 and invalid character |
|
|
 Centurion
Joined: 26 Feb 2003 Posts: 120 Location: Italy - Milan
|
I'm receiving a large XML file (2.3 MB) and I have to wrap it in a SOAP envelope and pass it to a .NET web service.
The message is declared as UTF-8, but it contains some UTF-16 characters.
The .NET environment does not mind much, but the WBIMB CSD 4 parser does... so I'm wondering what the best approach is:
- transform it into a UTF-16 message with the XML Transformation node? This would probably double its size...
- parse it as a BLOB, then navigate within it using SUBSTRING (I just need to look for a certain tag to get an ID to be passed to the web service together with the whole message)?
- escape it? I would not actually know how...
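For the BLOB option, here is a minimal sketch of the idea in Python rather than ESQL. The tag name `OrderId` and the sample payload are hypothetical, not taken from the real message. It searches the raw bytes for the opening and closing tag and slices out what lies between, without parsing the document. It assumes the tag name is plain ASCII and has no attributes or namespace prefix.

```python
from typing import Optional


def extract_tag_bytes(payload: bytes, tag: str) -> Optional[bytes]:
    """Return the raw bytes between <tag> and </tag>, or None if absent.

    Assumes an ASCII tag name with no attributes and no namespace prefix.
    """
    open_tag = f"<{tag}>".encode("ascii")
    close_tag = f"</{tag}>".encode("ascii")
    start = payload.find(open_tag)
    if start == -1:
        return None
    start += len(open_tag)
    end = payload.find(close_tag, start)
    if end == -1:
        return None
    return payload[start:end]


# Hypothetical payload: the lone 0xA3 byte makes it invalid UTF-8,
# yet the byte-level search still finds the ID.
sample = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b"<Doc><OrderId>A-42</OrderId><Price>\xa3 10</Price></Doc>"
)
order_id = extract_tag_bytes(sample, "OrderId")
```

Because the search never decodes the payload, a stray byte elsewhere in the document does not stop the ID from being found.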
Any architectural hint would be more than appreciated.
Thanks in advance,
Andrea Tedone
IBM IT Specialist
Last edited by Testo on Wed Feb 02, 2005 8:06 am; edited 1 time in total |
|
 |
JLRowe |
Posted: Tue Feb 01, 2005 2:44 pm Post subject: |
|
|
 Yatiri
Joined: 25 May 2002 Posts: 664 Location: South East London
|
Please detail the problem the parser is having with UTF-16 documents. How is the encoding specified in the XML declaration? |
|
 |
Testo |
Posted: Tue Feb 01, 2005 2:49 pm Post subject: details |
|
|
 Centurion
Joined: 26 Feb 2003 Posts: 120 Location: Italy - Milan
|
Hi timna.
The broker is not able to parse the XML message because it contains an invalid character (i.e. '£'). The encoding in the XML declaration is UTF-8, even though '£' is a character valid in the UTF-16 encoding domain. For the same reason, if you save the XML tree to a file and open it with, for instance, Internet Explorer, this character is shown as a little box, because the browser is not able to interpret it.
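A quick check of that assumption, as a Python sketch rather than anything taken from the broker: '£' (U+00A3) is representable in UTF-8, as the two bytes C2 A3. What is not valid UTF-8 is a lone A3 byte, which is how single-byte code pages such as ISO-8859-1 store the same character. The thread does not show the actual bytes on the wire, so which case applied here is an assumption.

```python
pound = "\u00a3"  # the '£' character

utf8_bytes = pound.encode("utf-8")      # two bytes: C2 A3
latin1_bytes = pound.encode("latin-1")  # one byte: A3


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(utf8_bytes.hex(), is_valid_utf8(utf8_bytes))      # c2a3 True
print(latin1_bytes.hex(), is_valid_utf8(latin1_bytes))  # a3 False
```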
Cheers,
Andrea |
|
 |
JLRowe |
Posted: Wed Feb 02, 2005 4:31 am Post subject: |
|
|
 Yatiri
Joined: 25 May 2002 Posts: 664 Location: South East London
|
Is the declaration right or wrong then? If it says UTF-8 and there is UTF-16 encoding within the document then perhaps it is wrong. |
|
 |
Testo |
Posted: Wed Feb 02, 2005 7:16 am Post subject: Additional info |
|
|
 Centurion
Joined: 26 Feb 2003 Posts: 120 Location: Italy - Milan
|
Some additional info.
The invalid character in the XML message seems to be introduced by WBIMB itself.
Let me explain the scenario better: message flow A calls a .NET web service and receives a 2.3 MB response, encoded in UTF-8 and containing only valid characters.
This message is then put on a queue and picked up by message flow B, which wraps it in a SOAP envelope and calls another .NET web service.
So the WBIMB parser, while moving this large XMLNS message from one flow to the other, seems not to respect the UTF-8 encoding and ends up modifying one single character.
Any similar experience?!
Cheers,
Andrea |
|
 |
jefflowrey |
Posted: Wed Feb 02, 2005 7:57 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
I would look at how message flow A is writing the data to the queue, and how message flow B is reading the data from the queue. _________________ I am *not* the model of the modern major general. |
|
 |
Testo |
Posted: Wed Feb 02, 2005 8:06 am Post subject: SOLVED! |
|
|
 Centurion
Joined: 26 Feb 2003 Posts: 120 Location: Italy - Milan
|
It was simply a question of CCSID.
As the XML response from the .NET WS is UTF-8, we forced the CCSID to 1208 (UTF-8) instead of 437, the default on our Windows 2003 Server.
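To illustrate why the CCSID matters, here is a small Python sketch, not broker code. It assumes Python's `utf-8` and `cp437` codecs as stand-ins for CCSID 1208 and CCSID 437, and it models only the effect of reading UTF-8 bytes under the wrong code page, not what WBIMB does internally.

```python
original = "Price: \u00a3 10"          # contains '£'
wire_bytes = original.encode("utf-8")  # what a UTF-8 producer puts on the queue

# Wrong: the bytes are labelled as code page 437, so each byte of the
# two-byte UTF-8 sequence is read as a separate character.
misread = wire_bytes.decode("cp437")
# Converting that text "to UTF-8" now yields different bytes than were sent.
reencoded = misread.encode("utf-8")

# Right: the bytes are labelled as UTF-8 (CCSID 1208).
correct = wire_bytes.decode("utf-8")

print(repr(misread))
print(reencoded != wire_bytes, correct == original)
```

Only the non-ASCII character is affected, which would be consistent with a single character changing in an otherwise intact 2.3 MB message.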
Thanks to a couple of my colleagues who put me on the right track...
Cheers
Andrea |
|
 |
kirani |
Posted: Fri Feb 04, 2005 12:09 am Post subject: |
|
|
Jedi Knight
Joined: 05 Sep 2001 Posts: 3779 Location: Torrance, CA, USA
|
We had similar problems when working with .NET and WBIMB. The solution was similar to yours, Andrea. _________________ Kiran
IBM Cert. Solution Designer & System Administrator - WBIMB V5
IBM Cert. Solutions Expert - WMQI
IBM Cert. Specialist - WMQI, MQSeries
IBM Cert. Developer - MQSeries
|
|
 |
Testo |
Posted: Fri Feb 04, 2005 12:23 am Post subject: Hope |
|
|
 Centurion
Joined: 26 Feb 2003 Posts: 120 Location: Italy - Milan
|
Kiran, I hope you didn't spend a whole working day solving it as we did!!!
Cheers,
Andrea |
|
 |
kirani |
Posted: Fri Feb 04, 2005 12:25 am Post subject: |
|
|
Jedi Knight
Joined: 05 Sep 2001 Posts: 3779 Location: Torrance, CA, USA
|
Yeah, but the folks over here spent their time blaming MQ. _________________ Kiran
|
|
 |
martinrydman |
Posted: Fri Feb 04, 2005 1:11 am Post subject: |
|
|
 Centurion
Joined: 30 Jan 2004 Posts: 139 Location: Gothenburg, Sweden
|
Hi,
I'm just glad to hear that even grand masters struggle with these darn code page issues. No matter how long I do this work, I feel like I'll never stop tripping over one CCSID issue or another.
Why can't everybody use Swedish?
/Martin |
|
 |
Testo |
Posted: Fri Feb 04, 2005 1:22 am Post subject: Common problems over IT generations... |
|
|
 Centurion
Joined: 26 Feb 2003 Posts: 120 Location: Italy - Milan
|
The IBM IT Architect Carlo Randone (.NET/WS/Interoperability guru), who is working with me on the project, says that once he retires he will write a book about the common problems that affect the IT population over and over and over: date fields, encoding, and CCSID issues.
Cheers,
Andrea |
|