Author |
Message
|
bhausaheb |
Posted: Mon Oct 01, 2012 11:26 am Post subject: Non Ascii Data Handle |
|
|
Newbie
Joined: 01 Oct 2012 Posts: 1
|
Hi All,
I am working on Message Broker 6.1.
We send data to the back end in MRM BLOB format.
The user sends data in SOAP XML format, and the response is also parsed and sent back to the user in XML format.
In one of the fields we are now receiving embedded non-ASCII data from the back end. We are facing an issue when parsing it.
1) How can we display the non-ASCII data in the XML response?
2) We have to pass exactly the same data to the back end.
Has anyone faced such an issue in the past?
Please guide us to resolve this. |
|
lancelotlinc |
Posted: Mon Oct 01, 2012 11:31 am Post subject: Re: Non Ascii Data Handle |
|
|
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
|
bhausaheb wrote: |
I am working on Message Broker 6.1. |
WMB 6.1 is End-Of-Life in eleven months. Update your WMB ASAP.
bhausaheb wrote: |
We send data to the back end in MRM BLOB format. The user sends data in SOAP XML format, and the response is also parsed and sent back to the user in XML format. In one of the fields we are now receiving embedded non-ASCII data from the back end. |
Excellent use case for WMB.
bhausaheb wrote: |
We are facing an issue when parsing it.
1) How can we display the non-ASCII data in the XML response?
2) We have to pass exactly the same data to the back end.
Has anyone faced such an issue in the past?
Please guide us to resolve this. |
1. Use CDATA field.
2. Use BLOB. |
|
NealM |
Posted: Mon Oct 01, 2012 3:34 pm Post subject: |
|
|
 Master
Joined: 22 Feb 2011 Posts: 230 Location: NC or Utah (depends)
|
A valid XML document doesn't have to be just ASCII, but it does have to be Unicode, whether or not the data is in a CDATA section. Otherwise you would have to get into MIME parts.
If you needed to pass something like EBCDIC chars or signed/packed numeric data in an ASCII/Unicode XML doc element, you would need to send its hex representation, not its real value. So, for instance, if you needed to send an element that contained EBCDIC 'Account 20' in ASCII XML element myStuff, you could send it as
<myStuff>X'C1838396A495A340F2F0'</myStuff> (with or without the X', ' - and notice that it is twice the length of the actual data) and the app at the other end would have to convert it back from pseudo-hex to EBCDIC.
However, is that really what you are after, or do you just need to do something as simple as converting a mainframe signed decimal field to an XML element value? If that is the case, maybe your backend's MRM field declarations just need a little work. Or maybe it is your XML schema. |
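The hex round trip described in this post can be sketched in Python. This is only an illustration: it assumes code page 037 (`cp037`) as the EBCDIC variant, which the thread does not specify, and the helper names `to_hex_element` and `from_hex_digits` are made up for the example.

```python
import binascii

# Assumption: the back end uses EBCDIC code page 037; substitute the real one.
EBCDIC_CODEC = "cp037"


def to_hex_element(text: str, tag: str = "myStuff") -> str:
    """Encode text as EBCDIC bytes and wrap the hex digits in an XML element."""
    raw = text.encode(EBCDIC_CODEC)
    hex_digits = binascii.hexlify(raw).decode("ascii").upper()
    return f"<{tag}>{hex_digits}</{tag}>"


def from_hex_digits(hex_digits: str) -> bytes:
    """Convert the hex digits back into the original EBCDIC bytes."""
    return binascii.unhexlify(hex_digits)


element = to_hex_element("Account 20")
print(element)  # <myStuff>C1838396A495A340F2F0</myStuff>

original = from_hex_digits("C1838396A495A340F2F0")
print(original.decode(EBCDIC_CODEC))  # Account 20
```

The element content is plain ASCII hex digits, so it survives any XML encoding unchanged; the cost is that it is twice as long as the data it carries.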
|
smdavies99 |
Posted: Mon Oct 01, 2012 8:39 pm Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
NealM wrote: |
A valid XML document doesn't have to be just ASCII, but it does have to be Unicode. |
AFAIK, it does not have to be unicode.
Code: |
<?xml version="1.0" encoding="UTF-8"?> |
could be
Code: |
<?xml version="1.0" encoding="ISO8859-15"?>
|
{apologies if this is not the correct string format}
Getting back on topic,
There is an XML datatype for BLOB (Base64binary). I'd use that if the parser at the other end (and the schema especially as you are using SOAP) can handle it. _________________ WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995
Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions. |
|
NealM |
Posted: Tue Oct 02, 2012 12:54 am Post subject: |
|
|
 Master
Joined: 22 Feb 2011 Posts: 230 Location: NC or Utah (depends)
|
Quote: |
There is an XML datatype for BLOB (Base64binary). |
I stand (somewhat) corrected. There are actually two XML datatypes for binary or BLOB data: base64Binary and hexBinary. They are two different ways to turn non-text data into a text representation that makes for legal XML. Technically, the hexBinary datatype is the BLOB as we know it in WMB; base64Binary, though, is more compact (four characters for every three bytes, versus two characters per byte for hexBinary) and a little more work, so it is better for larger chunks of data and is seen mostly in MIME documents.
For a better explanation and how to use them in the v6.1 Broker, see http://publib.boulder.ibm.com/infocenter/wmbhelp/v6r1m0/index.jsp?topic=%2Fcom.ibm.etools.mft.doc%2Fac67173_.htm
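To make the size difference between the two datatypes concrete, here is a small Python sketch. It is not Broker code; the payload is a hypothetical stand-in for a binary field, and only the standard `binascii` and `base64` modules are used.

```python
import base64
import binascii

# Hypothetical payload: every possible byte value, so it is clearly not text.
payload = bytes(range(256))

# Text forms as they could appear in hexBinary and base64Binary elements.
hex_text = binascii.hexlify(payload).decode("ascii")
b64_text = base64.b64encode(payload).decode("ascii")

print(len(payload), len(hex_text), len(b64_text))  # 256 512 344

# Both representations decode back to exactly the same bytes.
assert binascii.unhexlify(hex_text) == payload
assert base64.b64decode(b64_text) == payload
```

Either form carries the data through XML unchanged, which is what the original question needs; the choice mostly comes down to size and to what the schema and the receiving parser accept.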
Quote: |
AFAIK, it does not have to be unicode. |
Ah, the Euro sign. OK, I was unclear. (Update: the Unicode assertion that follows proved to be false; see the follow-on posts.) XML does have to be Unicode (AKA UTF-8), or a subset of Unicode, which is what ASCII is, as well as Latin-9 (AKA ISO8859-15). For the briefest explanation, see http://en.wikipedia.org/wiki/ISO/IEC_8859-15
Last edited by NealM on Tue Oct 02, 2012 9:19 am; edited 1 time in total |
|
kimbert |
Posted: Tue Oct 02, 2012 3:55 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Quote: |
XML does have to be Unicode (AKA UTF-8), or a subset of Unicode. |
Quote: |
AFAIK, it does not have to be unicode. |
The great thing about XML and XML Schema is that they are rigorously defined languages. Their specifications are hosted and managed by the W3C consortium, and they are the only authoritative source for questions like this. So...I refer you to http://www.w3.org/TR/2006/REC-xml-20060816/#charencoding for the definitive answer:
Although an XML processor is required to read only entities in the UTF-8 and UTF-16 encodings, it is recognized that other encodings are used around the world, and it may be desired for XML processors to read entities that use them. In the absence of external character encoding information (such as MIME headers), parsed entities which are stored in an encoding other than UTF-8 or UTF-16 must begin with a text declaration (see 4.3.1 The Text Declaration) containing an encoding declaration: |
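As an illustration of the encoding declaration at work, the following Python sketch parses the same content from an ISO-8859-15 document and from a UTF-8 document. The `price` element and its value are invented for the example, and it relies on Python's bundled parser accepting ISO-8859-15, which the spec permits but does not require of every XML processor.

```python
import xml.etree.ElementTree as ET

# The euro sign is the single byte 0xA4 in ISO-8859-15 and three bytes in UTF-8.
latin9_doc = b'<?xml version="1.0" encoding="ISO-8859-15"?><price>\xa4 10</price>'
utf8_doc = '<?xml version="1.0" encoding="UTF-8"?><price>\u20ac 10</price>'.encode("utf-8")

# The parser reads the encoding declaration and decodes the bytes accordingly.
latin9_text = ET.fromstring(latin9_doc).text
utf8_text = ET.fromstring(utf8_doc).text

# Different bytes on the wire, identical characters after parsing.
print(latin9_doc == utf8_doc, latin9_text == utf8_text)
```

The point is that the declared encoding only governs how characters are stored as bytes; once parsed, both documents contain the same character data.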
|
NealM |
Posted: Tue Oct 02, 2012 7:48 am Post subject: |
|
|
 Master
Joined: 22 Feb 2011 Posts: 230 Location: NC or Utah (depends)
|
Ouch! The point on my sword sure does hurt. I went back to the W3C original 1996 XML working draft, and even then the encoding declaration was specifically put in to handle non-Unicode character sets, and the paragraph lancelot quoted has been unchanged since 1997. I'm not sure when/where I went astray, and my apologies for posting a falsehood.
In my defense (and a very poor and typically Americanized defense it is), any globalization projects that I had been involved with in the past that dealt with language issues were usually solved by changing an XML doc's encoding from UCS-2, etc. to UTF-8.
Anyway, I've taken this topic off course with this encoding discussion. Sorry.  |
|
mqjeff |
Posted: Tue Oct 02, 2012 7:52 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
wrong attribution to the quote. |
|
lancelotlinc |
Posted: Tue Oct 02, 2012 7:53 am Post subject: |
|
|
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
|
|
mqjeff |
Posted: Tue Oct 02, 2012 8:00 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
lancelotlinc wrote: |
mqjeff wrote: |
wrong attribution to the quote. |
I'll take any I can get. |
that explains the bud light. |
|
NealM |
Posted: Tue Oct 02, 2012 8:10 am Post subject: |
|
|
 Master
Joined: 22 Feb 2011 Posts: 230 Location: NC or Utah (depends)
|
Oops! Of course it would be kimbert. Well, maybe that partially explains how I went astray.....
PS, hasn't bud light been globalized now also? |
|
smdavies99 |
Posted: Tue Oct 02, 2012 9:09 am Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
NealM wrote: |
PS, hasn't bud light been globalized now also? |
Ha ha.
This very day the hotel I'm staying in 25km from Amman has a special on for 'Bud Light' 3 for 2.
guess what, there are no takers even at $7.50 a pint(US). They are all going for the Becks ($9.00 US a Pint). Ouch! |
|
kimbert |
Posted: Tue Oct 02, 2012 9:15 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Quote: |
any globalization projects that I had been involved with in the past that dealt with language issues were usually solved by changing an XML doc's encoding from UCS-2, etc. to UTF-8. |
I'm not surprised. I would choose UTF-8 over any non-Unicode (or obsolete, as in UCS-2) encoding any day.
I must admit I had to look for a while to find the correct quote, and earlier sections of the spec only talk about Unicode. |
|
sunny_30 |
Posted: Tue Oct 02, 2012 1:20 pm Post subject: |
|
|
 Master
Joined: 03 Oct 2005 Posts: 258
|
mqjeff wrote: |
lancelotlinc wrote: |
mqjeff wrote: |
wrong attribution to the quote. |
I'll take any I can get. |
that explains the bud light. |
But he'd like it if it comes with Lime ..  |
|
rekarm01 |
Posted: Tue Oct 02, 2012 6:54 pm Post subject: Re: Non Ascii Data Handle |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 1415
|
NealM wrote: |
There are actually two XML datatypes for binary or BLOB data: base64Binary and hexBinary. |
More precisely, these are XML Schema datatypes. An XML parser may or may not convert to/from the specified datatype when parsing/writing an XML document, but the XML document itself can only contain character data, using the specified (or default) character encoding.
kimbert wrote: |
I would choose UTF-8 over any non-Unicode (or obsolete, as in UCS-2) encoding any day. |
Hmmm ... the latest documentation for both WMQ and WMB indicates that they use UCS-2 internally (not UTF-16). Is that still true, or does the documentation need updating? |
|