WBIMB v6 java input node : setting xml charset encoding |
powerlord |
Posted: Fri Sep 02, 2005 12:56 am Post subject: WBIMB v6 java input node : setting xml charset encoding |
|
|
Novice
Joined: 02 Sep 2005 Posts: 19
|
I've had a look through the forum and tried a few things without success, so...
I've got a custom Java input node which gets an XML message from somewhere (UTF-8).
Originally I was just getting the bytes for this (UTF-8 encoded) and calling createMessage(msgBytes);
this does parse into XML in MB, but it does not cope with non-ASCII UTF-8 characters.
For example if I have an XML message which has a pound sign (£), this gets mangled when I look at it in the debug flow. If I have an HTTPRequest in the flow I can see that the HTTP request sends out a mangled message too.
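This kind of mangling is what you get when UTF-8 bytes are decoded with a single-byte character set. A minimal standalone sketch (plain Java, not broker code; the class and method names are made up for illustration, and ISO-8859-1 only stands in for whatever code page is actually being applied, which is not established here):

```java
import java.nio.charset.StandardCharsets;

class PoundSignDemo {
    // UTF-8 bytes for "£100": the pound sign U+00A3 encodes as the two bytes C2 A3.
    static final byte[] UTF8_BYTES = "\u00A3100".getBytes(StandardCharsets.UTF_8);

    // Decode with the charset the bytes were actually written in.
    static String decodedCorrectly() {
        return new String(UTF8_BYTES, StandardCharsets.UTF_8);
    }

    // Decode with a single-byte charset: each byte becomes its own character.
    static String decodedWrongly() {
        return new String(UTF8_BYTES, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(decodedCorrectly().length()); // 4
        System.out.println(decodedWrongly().length());   // 5
    }
}
```

The wrongly decoded string is one character longer because the two-byte pound sign has been split into two separate characters.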
Looking at the properties of the message when it comes out of my custom input node, I see that the CodedCharSetId is 0.
So I set this in the java node to 1208 (UTF-8):
msg.getRootElement().getFirstChild().getFirstElementByPath("CodedCharSetId").setValue(new Integer(1208));
it now has the proper value set for this property.
However it still doesn't parse it properly.
I even tried putting a compute node after the input node and changing the encoding there:
SET OutputRoot.XML.(XML.XMLDecl).(XML.Encoding)Encoding = 'UTF-8';
Still mangled.
I then added a ResetContentDescriptor node to force a reparse after the Compute node, but it was still mangled.
So the question is:
What can I set in my Java node to force MB to parse the UTF-8 bytes I'm giving it into a UTF-8 XML message?
stu |
|
powerlord |
Posted: Mon Sep 05, 2005 7:42 am Post subject: can noone help ?? |
|
|
Novice
Joined: 02 Sep 2005 Posts: 19
|
Still got this problem.
MB is definitely not parsing the UTF-8 bytes correctly.
Simple input message:
<value>£100</value>
saved to a file as a valid UTF-8 format file (via Textpad)... checked in binary mode to confirm "£" is saved as C2A3.
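The same hex check can be done from Java rather than an editor. A small sketch, assuming nothing about the broker; the class and helper names are hypothetical:

```java
import java.nio.charset.StandardCharsets;

class HexCheck {
    // Render bytes as upper-case hex pairs, e.g. {0xC2, 0xA3} -> "C2A3".
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] pound = "\u00A3".getBytes(StandardCharsets.UTF_8);
        System.out.println(toHex(pound)); // C2A3
    }
}
```

Passing the byte array read from the file to toHex() and looking for C2A3 confirms what the node is really handing to createMessage().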
In input node, read bytes out of file...call createMessage(bytes).
and the £ appears garbled in the parsed Message.
Setting CodedCharSetId after the createMessage has no effect. Setting it before is not possible (as I don't have a message to get to the properties).
arg.
help.
stu |
|
fjb_saper |
Posted: Mon Sep 05, 2005 11:12 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
Well UTF-8 and a number of other CCSIDs will garble some stuff in XML.
Like the currency sign, and other stuff. In fact it uses some control char and passes the char value...
You have to parse from XML to text using the correct CCSID to see that stuff in clear text. Or use something like XML Spy...
See as well java.nio.* classes for reading from one CCSID and writing in another.
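A rough sketch of that java.nio approach, using java.nio.charset.Charset to decode from one character set and encode into another. The class name and the choice of ISO-8859-1 as the target are assumptions for illustration only:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Transcode {
    // Decode bytes from one charset, then re-encode the characters in another.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        CharBuffer chars = from.decode(ByteBuffer.wrap(input));
        ByteBuffer out = to.encode(chars);
        byte[] result = new byte[out.remaining()];
        out.get(result);
        return result;
    }

    public static void main(String[] args) {
        byte[] utf8 = {(byte) 0xC2, (byte) 0xA3}; // the pound sign in UTF-8
        byte[] latin1 = transcode(utf8, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(latin1.length); // 1
    }
}
```

Note that Charset.decode() and Charset.encode() silently substitute a replacement for input they cannot map, so a CharsetDecoder configured to report errors may be a better fit where bad input must be detected.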
Enjoy  |
|
powerlord |
Posted: Tue Sep 06, 2005 12:27 am Post subject: thanks |
|
|
Novice
Joined: 02 Sep 2005 Posts: 19
|
thanks, but I don't think that is the problem.
Here are some code snippets.
I have a UTF-8 saved simple XML file which looks like:
<?xml version="1.0" encoding="UTF-8"?>
<Value>£1234</Value>
I've checked this in hex to confirm that '£' is saved as C2A3. This is definitely a valid UTF-8 bitstream.
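As a reference point outside the broker, the same bytes can be run through the standard JAXP parser that ships with Java. This is only a sketch of what a conforming XML parser makes of the bitstream (it takes the encoding from the XML declaration); it says nothing about how the broker's own parser behaves, and the class name is hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class JaxpReference {
    // Parse raw bytes and return the text content of the root element.
    static String rootText(byte[] xmlBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML bytes", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Value>\u00A31234</Value>";
        String text = rootText(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(text.equals("\u00A31234")); // true
    }
}
```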
Some code from my AFInput node:
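Code: |
....
try{
    // Read the whole file into msgBytes as raw bytes (no charset conversion)
    File f = new File ("c:\\in.txt");
    DataInputStream bis = new DataInputStream(new FileInputStream(f));
    msgBytes = new byte[(int)f.length()];
    bis.readFully(msgBytes);
    bis.close();
}catch(Exception e){
    e.printStackTrace(); // report read failures instead of swallowing them
}
// createMessage() takes only the bytes; there is no CCSID argument
MbMessage msg = createMessage(msgBytes);
// Properties is the first child of the root; CodedCharSetId is 0 at this point
MbElement props = msg.getRootElement().getFirstChild();
MbElement ccsid = props.getFirstElementByPath("CodedCharSetId");
Object o = ccsid.getValue();
ccsid.setValue(new Integer(1208)); // 1208 = UTF-8
MbMessageAssembly newAssembly = new MbMessageAssembly(ma, msg);
msg.finalizeMessage(MbMessage.FINALIZE_VALIDATE);
.....
|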
A breakpoint after the node shows that a CodedCharSetId of 1208 HAS been set, but the XML has been parsed as:
XML
Value
-ú1234
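The 'ú' in that output may be a clue to which code page is being applied. In ISO-8859-1 and windows-1252 the byte A3 is '£', while in the OEM code pages 437 and 850 it is 'ú'. The sketch below checks a list of candidate charsets against the observed character; the class name and candidate list are assumptions, and a match here is only a hint, not proof of what the broker is doing:

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

class MisdecodeGuess {
    // Names of the candidate charsets under which `bytes` decode to `observed`.
    static List<String> matching(byte[] bytes, String observed, String... candidates) {
        List<String> hits = new ArrayList<>();
        for (String name : candidates) {
            // Some JREs ship without the extended (OEM) code pages, so check first.
            if (Charset.isSupported(name)
                    && new String(bytes, Charset.forName(name)).equals(observed)) {
                hits.add(name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        byte[] a3 = {(byte) 0xA3}; // second byte of the UTF-8 pound sign
        System.out.println(
                matching(a3, "\u00FA", "ISO-8859-1", "windows-1252", "IBM437", "IBM850"));
    }
}
```

Which names are printed depends on the charsets installed in the JRE, so no fixed output is shown.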
So now I try the createElementAsLastChildFromBitstream method, which lets me specify the CCSID explicitly, by simply detaching the XML root I've just created and creating a new one:
Code: |
....
try{
    // Read the whole file into msgBytes as raw bytes (no charset conversion)
    File f = new File ("c:\\in.txt");
    DataInputStream bis = new DataInputStream(new FileInputStream(f));
    msgBytes = new byte[(int)f.length()];
    bis.readFully(msgBytes);
    bis.close();
}catch(Exception e){
    e.printStackTrace(); // report read failures instead of swallowing them
}
// createMessage() takes only the bytes; there is no CCSID argument
MbMessage msg = createMessage(msgBytes);
MbElement props = msg.getRootElement().getFirstChild();
MbElement ccsid = props.getFirstElementByPath("CodedCharSetId");
Object o = ccsid.getValue();
ccsid.setValue(new Integer(1208)); // 1208 = UTF-8
// OK... now scrub the XML tree I've just created, now that I've got a message object
MbElement newXmlElement = msg.getRootElement().getLastChild();
newXmlElement.detach();
// Re-create the body from the same bytes, this time passing CCSID 1208 explicitly
msg.getRootElement().createElementAsLastChildFromBitstream(msgBytes, "xml", null, null, null, 0, 1208, 0);
MbMessageAssembly newAssembly = new MbMessageAssembly(ma, msg);
msg.finalizeMessage(MbMessage.FINALIZE_VALIDATE);
|
A breakpoint after the node shows that my bitstream has now been parsed properly as UTF-8/1208:
XML
Value
£1234
*******************
So, it seems pretty clear to me that createMessage(), then setting CodedCharSetId has no effect, whereas createElementAsLastChildFromBitstream with its explicit parameter for CCSID works.
However, this is clearly not an efficient approach: creating a 'dummy' message just to delete the XML root and then create it again properly. Surely there is a better way? |
|
jefflowrey |
Posted: Tue Sep 06, 2005 7:04 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
Have you opened a PMR yet?
I am assuming, also, that you really aren't using WBIMB v6, but are really using v5.
If you are using v6, then you should instead report this to the beta program that you are participating in. |
|
powerlord |
Posted: Tue Sep 06, 2005 11:21 pm Post subject: |
|
|
Novice
Joined: 02 Sep 2005 Posts: 19
|
Yeah, MB5. CSD4 and CSD6 display the same behaviour.
I wanted to check it was a bug before going further. If you reckon it is one I'll raise a PMR.
stu |
|