alechko |
Posted: Sun Mar 06, 2011 8:02 am Post subject: Hebrew encoding CCSID mismatch |
|
|
Apprentice
Joined: 12 Jan 2005 Posts: 37
|
Hi,
I've developed a web service that receives messages encoded in iso-8859-8 (Hebrew).
When the flow tries to send the SOAP Reply, the following exception is thrown:
Code: |
BIP3701E: A Java exception was thrown whilst calling the Java JNI method 'method_com_ibm_broker_axis2_Axis2Invoker_prepareToSendReplyNonSOAP'. The Java exception was 'javax.xml.stream.XMLStreamException: java.io.UnsupportedEncodingException: ibm-5012'. The Java stack trace was 'Frame : 0 javax.xml.stream.XMLStreamException: java.io.UnsupportedEncodingException: ibm-5012| @: com.ibm.xml.xlxp2.api.stax.msg.StAXMessageProvider.throwXMLStreamException(StAXMessageProvider.java:67)| @: com.ibm.xml.xlxp2.api.stax.XMLStreamReaderImpl.setDocumentEntity(XMLStreamReaderImpl.java:394)| @: com.ibm.xml.xlxp2.api.stax.XMLInputFactoryImpl.setDocumentEntity(XMLInputFactoryImpl.java:1440)| @: com.ibm.xml.xlxp2.api.stax.XMLInputFactoryImpl.createXMLStreamReader(XMLInputFactoryImpl.java:1455)| @: com.ibm.xml.xlxp2.api.stax.XMLInputFactoryImpl.createXMLStreamReaderInternal(XMLInputFactoryImpl.java:1555)| @: com.ibm.xml.xlxp2.api.stax.XMLInputFactoryIsite_1_1
|
I've debugged the service and I've noticed that the message properties contain CodedCharSetId=5012.
I have no idea why the Message Broker decided to parse the message with this CCSID.
If I update it manually to 916 (ISO-8859-8), the SOAP Reply works alright.
Does anybody know why WMB uses this CCSID?
I couldn't find any reference to this CCSID.
Thanks,
Alik |
|
smdavies99 |
Posted: Sun Mar 06, 2011 10:09 am Post subject: Re: Hebrew encoding CCSID mismatch |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
Have you tried Mr Google? There are plenty of results there. |
|
rekarm01 |
Posted: Sun Mar 06, 2011 1:10 pm Post subject: Re: Hebrew encoding CCSID mismatch |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 1415
|
alechko wrote: |
Does anybody know why WMB uses this CCSID? |
There are two different versions of the underlying standard for the "iso-8859-8" encoding:
- ISO/IEC 8859-8:1988 (ccsid=916)
- ISO/IEC 8859-8:1999 (ccsid=5012) -- adds LRM and RLM characters
The broker uses ICU (International Components for Unicode) libraries to manage character data. Assuming it uses the ICU alias table to map the input message encoding declaration to the equivalent input ccsid, then "iso-8859-8" would map to "ibm-5012" (ccsid=5012). |
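The mapping described above, and the workaround of resetting the CCSID to 916, might be sketched in Python as follows. This is only an illustration: the alias table is a small hand-written stand-in rather than real ICU data, and `ccsid_for`, `reply_ccsid`, and the `supported` set are hypothetical names, not broker APIs. The last lines show the practical difference between the two editions, using Python's own `iso8859_8` codec, which decodes bytes 0xFD and 0xFE as LRM and RLM.

```python
# Illustrative stand-in for an ICU-style alias table (hand-written, not ICU data).
ALIAS_TO_CCSID = {
    "iso-8859-8": 5012,  # ISO/IEC 8859-8:1999, as described in the post above
    "ibm-5012": 5012,
    "ibm-916": 916,      # ISO/IEC 8859-8:1988
}

# Hypothetical fallback for a runtime that does not recognise "ibm-5012".
FALLBACK_CCSID = {5012: 916}


def ccsid_for(encoding_name: str) -> int:
    """Map an encoding declaration to a CCSID via the alias table."""
    return ALIAS_TO_CCSID[encoding_name.strip().lower()]


def reply_ccsid(ccsid: int, supported: set) -> int:
    """Keep the CCSID if the output side supports it, else use the fallback."""
    if ccsid in supported:
        return ccsid
    return FALLBACK_CCSID.get(ccsid, ccsid)


declared = ccsid_for("ISO-8859-8")
chosen = reply_ccsid(declared, supported={916, 1208})
print(declared, chosen)

# The two bytes that the 1999 edition assigns to LRM and RLM.
marks = bytes([0xFD, 0xFE]).decode("iso8859_8")
print([hex(ord(c)) for c in marks])
```

Falling back from 5012 to 916 is lossless as long as the payload contains no LRM or RLM characters, since those are the only additions in the 1999 edition.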
|
kimbert |
Posted: Mon Mar 07, 2011 2:52 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Let's get a couple of facts clear:
- Message broker *always* uses UTF-16 (code page 1200) internally for all character data.
- When parsing an input message, if the incoming character data is not in UTF-16, then it is converted from the input code page to UTF-16 using ICU. So the input message *must* declare its input code page accurately, or else this conversion step will fail. Afterwards, the input code page is stored in InputRoot.Properties.CodedCharSetId.
- When writing an output message, message broker converts all character data from UTF-16 to the output code page. The output code page is taken from OutputRoot.Properties.CodedCharSetId. |
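The three points above amount to a decode-then-encode pipeline. A rough Python analogy, under stated assumptions: a Python `str` stands in for the broker's internal UTF-16 data, Python codecs stand in for ICU, and `CCSID_TO_CODEC` and `convert` are hypothetical names made up for this sketch.

```python
# Hypothetical CCSID-to-codec table; only the entries needed for the example.
CCSID_TO_CODEC = {
    916: "iso8859_8",   # ISO-8859-8 (Hebrew)
    1208: "utf_8",      # UTF-8
}


def convert(payload: bytes, input_ccsid: int, output_ccsid: int) -> bytes:
    """Decode with the declared input code page, then encode with the output one."""
    text = payload.decode(CCSID_TO_CODEC[input_ccsid])   # parse: bytes -> internal text
    return text.encode(CCSID_TO_CODEC[output_ccsid])     # write: internal text -> bytes


hebrew = "\u05e9\u05dc\u05d5\u05dd"      # a short Hebrew word
wire_in = hebrew.encode("iso8859_8")     # what arrives on the wire

# Correct declaration: the text survives the round trip into UTF-8.
wire_out = convert(wire_in, input_ccsid=916, output_ccsid=1208)
print(wire_out.decode("utf_8") == hebrew)

# Inaccurate declaration: the same bytes are not valid UTF-8, so decoding fails.
try:
    convert(wire_in, input_ccsid=1208, output_ccsid=1208)
except UnicodeDecodeError as exc:
    print("conversion failed:", exc.reason)
```

The second call illustrates why the input code page must be declared accurately: the bytes are unchanged, but the wrong declaration makes the conversion step fail.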
|
fjb_saper |
Posted: Mon Mar 07, 2011 3:46 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
kimbert wrote: |
Let's get a couple of facts clear:
- Message broker *always* uses UTF-16 (code page 1200) internally for all character data.
- When parsing an input message, if the incoming character data is not in UTF-16, then it is converted from the input code page to UTF-16 using ICU. So the input message *must* declare its input code page accurately, or else this conversion step will fail. Afterwards, the input code page is stored in InputRoot.Properties.CodedCharSetId.
- When writing an output message, message broker converts all character data from UTF-16 to the output code page. The output code page is taken from OutputRoot.Properties.CodedCharSetId. |
Kimbert, does the override on the MQMD-CCSID alone still work in V7, or do you have to set the OutputRoot.Properties.CodedCharSetId even though you did set OutputRoot.MQMD.CodedCharSetId? |
|
kimbert |
Posted: Mon Mar 07, 2011 6:47 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
As you know, the short answer is that if it worked in previous versions, then it should work in v7. Upgrading WMB should not break existing message flows.
I suspect you are asking about the automatic fixing up of the header chain that is done for MQ and some other transports. Yes - the properties folder is automatically populated from the MQMD header if the MQMD header is present. |
|
rekarm01 |
Posted: Tue Mar 08, 2011 7:58 pm Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 1415
|
kimbert wrote: |
Let's get a couple of facts clear:
- Message broker *always* uses UTF-16 (code page 1200) internally for all character data.
- When parsing an input message, if the incoming character data is not in UTF-16, then it is converted from the input code page to UTF-16 using ICU. So the input message *must* declare its input code page accurately, or else this conversion step will fail. Afterwards, the input code page is stored in InputRoot.Properties.CodedCharSetId.
- When writing an output message, message broker converts all character data from UTF-16 to the output code page. The output code page is taken from OutputRoot.Properties.CodedCharSetId. |
This is generally useful information, and mostly correct.
But how does it apply to the OP's issue? |
|
kimbert |
Posted: Wed Mar 09, 2011 2:02 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
People get confused about the way in which message broker treats character data. Some of the posts in this thread, although technically accurate, could have been misunderstood by a reader who was not clear about the basics. So I stated the basics. |
|