MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » Character set trouble.

Character set trouble.
matoh
Posted: Fri Mar 04, 2005 6:32 am    Post subject: Character set trouble.

Apprentice

Joined: 04 Mar 2005
Posts: 26

I am trying to inject an XML message into a message flow using rfhutilc. I want to inject a UTF-8 message, but I only manage to inject it as ANSI. When I try UTF-8, I get an 'Invalid document structure' error.

Code:

Setup:
Flow consisting of:
       Input node
           without the conversion flag set
           using MRM message definition with XML physical.
       Trace Node
       Mapping Node
          mapping from one XML schema to another
       Trace Node
       Compute Node
       Trace Node
       Output Node

Attempt 1:
Document saved as ANSI from notepad.
Read into rfhutilc.
Under MQMD tag:
   MQ Message Format set to 'MQHRF2'
Under RFH tag:
   RFH Type set to 'Version 2'
   Data Format to 'MQSTR'
   Code Page to 437
   CCSID to 1208

Result:
    Goes into the flow without a hitch.

Attempt 2:
Document saved as UTF-8 from notepad.
Read into rfhutilc.
Under MQMD tag:
   MQ Message Format set to 'MQHRF2'
Under RFH tag:
   RFH Type set to 'Version 2'
   Data Format to 'MQSTR'
   Code Page to 1208
   CCSID to 1208

Result:
   BIPv500 5117
   XMLHandler::error reported from the Xerces parser
   Null pointer
   Invalid document structure
   
Attempt 3:
Document saved as UTF-16 from notepad.
Read into rfhutilc.
Under MQMD tag:
   MQ Message Format set to 'MQHRF2'
Under RFH tag:
   RFH Type set to 'Version 2'
   Data Format to 'MQSTR'
   Code Page to 1208
   CCSID to 1208

Result:
   BIPv500 5117
   XMLHandler::error reported from the Xerces parser
   Null pointer
   Invalid document structure
   
Attempt 4:
Document saved as ANSI from notepad. (Same as attempt 1)
Read into rfhutilc.
Under MQMD tag:
   MQ Message Format set to 'MQHRF2'
Under RFH tag:
   RFH Type set to 'Version 2'
   Data Format to 'MQSTR'
   Code Page to 1208
   CCSID to 1208

Result:
    Goes into the flow without a hitch.



How do I solve this?
matoh
Posted: Fri Mar 04, 2005 7:51 am

Apprentice

Joined: 04 Mar 2005
Posts: 26

Problem solved. Of course, it was something silly.

Notepad adds three invisible characters at the very start of the file when you save it as UTF-8 (i.e., in my case, before '<?xml').

These characters do not show up in rfhutilc when looking at the data as XML in the data pad. In fact, the data was displayed incorrectly there after I removed them. But removing them causes the message to enter the flow correctly...

(Does anybody know whether these characters serve any purpose other than marking the file as UTF-8? I.e., will removing them have any consequences for my mapping?)
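
For anyone who wants to strip those leading bytes without hand-editing the file, here is a minimal sketch in Python. It assumes the three bytes are the UTF-8 byte order mark (EF BB BF); the function name `strip_utf8_bom` and the sample document are made up for illustration and have nothing to do with rfhutilc or the broker.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


if __name__ == "__main__":
    # Hypothetical sample: what Notepad's "UTF-8" save option would put on disk.
    sample = codecs.BOM_UTF8 + b'<?xml version="1.0" encoding="UTF-8"?><doc/>'
    cleaned = strip_utf8_bom(sample)
    print(sample[:8])   # starts with the three BOM bytes
    print(cleaned[:8])  # starts directly with '<?xml'
```

To use it on a real file, read the file in binary mode, pass the bytes through the function, and write the result back out before loading it into rfhutilc.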
jfluitsm
Posted: Sat Mar 05, 2005 9:13 am

Disciple

Joined: 24 Feb 2002
Posts: 160
Location: The Netherlands

These 2 (UTF-16) or 3 (UTF-8) bytes are a byte order mark (BOM). The BOM has no meaning for UTF-8, but for UTF-16 it states whether big-endian or little-endian UTF-16 is used. For UTF-16 XML messages the BOM is mandatory according to the XML standard.
The broker, however, looks at the encoding in the header to detect BE or LE. The broker should have skipped this character, as it is white space ('ZERO WIDTH NO-BREAK SPACE' is the official Unicode name of the BOM character, U+FEFF).
For UTF-16 you used the wrong code page; you should have used 1200 (UCS-2, which is about the same as UTF-16).
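
To check which BOM, if any, a file starts with before setting the code page, something like the following Python sketch could help. The CCSID numbers are just the ones mentioned in this thread (1208 for UTF-8, 1200 for UTF-16); the function name `sniff_bom` and the table layout are illustrative assumptions, and UTF-32 is deliberately ignored.

```python
import codecs

# (BOM bytes, encoding name, CCSID as mentioned in this thread).
# Assumption: UTF-32 is not considered; its little-endian BOM begins with the
# same two bytes as UTF-16 LE and would be reported as UTF-16 here.
BOMS = [
    (codecs.BOM_UTF8, "UTF-8", 1208),
    (codecs.BOM_UTF16_LE, "UTF-16 little-endian", 1200),
    (codecs.BOM_UTF16_BE, "UTF-16 big-endian", 1200),
]


def sniff_bom(data: bytes):
    """Return (encoding name, CCSID, BOM length) or None if no BOM is found."""
    for bom, name, ccsid in BOMS:
        if data.startswith(bom):
            return name, ccsid, len(bom)
    return None


if __name__ == "__main__":
    # "utf-8-sig" makes Python prepend the UTF-8 BOM, as Notepad does.
    print(sniff_bom("<?xml".encode("utf-8-sig")))
    print(sniff_bom(codecs.BOM_UTF16_LE + "<?xml".encode("utf-16-le")))
    print(sniff_bom(b"<?xml"))  # plain ANSI/ASCII bytes, no BOM
```

A result of None only means no BOM was found; the file may still be UTF-8 without a BOM or plain ANSI.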
_________________
Jan Fluitsma

IBM Certified Solution Designer WebSphere MQ V6
IBM Certified Solution Developer WebSphere Message Broker V6
matoh
Posted: Thu Mar 17, 2005 6:11 am

Apprentice

Joined: 04 Mar 2005
Posts: 26

jfluitsm wrote:

For utf-16 you used the wrong code page, you should have used 1200 (ucs-2, is about the same as utf-16).


Yes, I detected that after I had posted here.
Copyright © MQSeries.net. All rights reserved.