
MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » One of the Japanese characters is not being converted to UTF

One of the Japanese characters is not being converted to UTF
ragi_kmm
Posted: Thu May 21, 2009 12:52 am    Post subject: One of the Japanese characters is not being converted to UTF

Novice

Joined: 19 Sep 2006
Posts: 13

Hi Everyone,

I have a requirement to convert Japanese characters to UTF-8.

We run adapters that read the data from the source database and put the messages on an MQ queue.

I have two options for converting the data to UTF-8:
1) Run the adapters on the database so that they put the messages in UTF-8 format.
2) Run the adapters on the database so that they put the messages in SJIS format, and convert to UTF-8 in MB.

In either case, all the Japanese characters are converted to UTF-8 except one Japanese character (the double hyphen).

But if I run the adapters in SJIS format, all the characters come through properly.

I am assuming that one Japanese character(Double Hyphen) is not part of the UTF8 code page.

Can anyone advise me on this?

Rgds,
Giri
kimbert
Posted: Thu May 21, 2009 3:19 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Quote:
I am assuming that one Japanese character(Double Hyphen) is not part of the UTF8 code page
I can answer that for you. UTF8 is not a code page. It can represent *any* Unicode code point.
It would probably be a good idea to read this: http://icu-project.org/docs/papers/codepages_and_unicode.html

I strongly recommend that anyone who has a problem with code pages reads articles like this. Anybody who can post on this forum can also access articles like this one.
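To make the point about UTF-8 concrete, here is a minimal Python sketch. It assumes the character in question is U+30A0 (KATAKANA-HIRAGANA DOUBLE HYPHEN), which the original post does not confirm, and the helper name `can_encode` is made up for illustration. It only asks Python's own codecs whether they can encode the character; it says nothing about how the adapters or the broker convert data.

```python
# Assumption: the "double hyphen" is U+30A0. The thread never gives the code point.
DOUBLE_HYPHEN = "\u30a0"


def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of text is representable in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # UTF-8 is an encoding of all of Unicode; legacy code pages cover a subset.
    for codec in ("utf-8", "shift_jis", "cp932", "shift_jis_2004"):
        print(codec, can_encode(DOUBLE_HYPHEN, codec))
    # The three bytes UTF-8 uses for U+30A0.
    print(DOUBLE_HYPHEN.encode("utf-8").hex())
```

If UTF-8 reports True here, the character is not "missing from UTF-8"; the more useful question is which bytes the source system actually sends for it and which code page the conversion step believes it is reading.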
paranoid221
Posted: Thu May 21, 2009 10:27 pm

Centurion

Joined: 03 Apr 2006
Posts: 101
Location: USA

If this double hyphen happens to be part of a comment section of an XML document, I can see why it is complaining. XML forbids using a double hyphen in a comment section.
Even otherwise, a quick search turned up the following from Wikipedia, and I thought this might be relevant to you.
http://en.wikipedia.org/wiki/Double_hyphen
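For anyone who wants to check the XML side of this, a small sketch using Python's standard `xml.etree.ElementTree` parser follows. The XML rule is about two consecutive ASCII hyphens (`--`) inside a comment; the sketch compares that with the single character U+30A0, which is assumed here to be the "double hyphen" from the original post. The helper name `is_well_formed` is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the standard-library parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    # Two consecutive ASCII hyphens inside a comment.
    print(is_well_formed("<a><!-- x -- y --></a>"))
    # The single character U+30A0 inside a comment.
    print(is_well_formed("<a><!-- x \u30a0 y --></a>"))
    # Two ASCII hyphens in ordinary element content.
    print(is_well_formed("<a>x -- y</a>"))
```

The comment restriction only matters if the data really contains `--` inside an XML comment, which nothing in the original post suggests.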
_________________
LIFE is a series of complex calculations, somewhere multiplied by ZERO.
rekarm01
Posted: Mon May 25, 2009 1:54 pm    Post subject: Re: One of the Japanese characters is not being converted to

Grand Master

Joined: 25 Jun 2008
Posts: 1415

paranoid221 wrote:
If this double hyphen happens to be part of a comment section of an XML document, I can see why it is complaining. XML forbids using a double hyphen in a comment section.

No, that rule is about two consecutive hyphens (--), not the double hyphen character. Re-read this:
paranoid221 wrote:
... I thought this might be relevant to you. http://en.wikipedia.org/wiki/Double_hyphen

... which starts out with:
Quote:
The double hyphen (=, ゠, or ⸗) is a punctuation mark that consists of two parallel hyphens. It is not to be confused with two consecutive hyphens (--), ...

... and follows up with a more likely explanation, and common work-around:
Quote:
However, a distinct double hyphen character is not supported in the main Japanese character encodings such like Shift-JIS for Windows and DOS, and EUC-JP for Unix and Unix-like systems even though a newer standard JIS X 0213 includes the double hyphen. In practice, the equal sign are more frequently used instead. And middle dot (・) is more popular for the same purpose in traditional Japanese printings.
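The substitution described in that quote can be sketched in a few lines of Python. This assumes the character is U+30A0 and that the target is Python's `shift_jis` codec; the choice of the fullwidth equals sign (U+FF1D) as the replacement, and the function name `to_legacy_sjis`, are illustrative assumptions rather than anything the adapters or the broker do. Note that it goes in the Unicode-to-SJIS direction, which is where a missing character has to be replaced.

```python
# Assumptions: U+30A0 is the character at issue, and U+FF1D is an acceptable stand-in.
DOUBLE_HYPHEN = "\u30a0"
FULLWIDTH_EQUALS = "\uff1d"


def to_legacy_sjis(text: str) -> bytes:
    """Swap the double hyphen for a fullwidth equals sign, then encode as Shift-JIS."""
    return text.replace(DOUBLE_HYPHEN, FULLWIDTH_EQUALS).encode("shift_jis")


if __name__ == "__main__":
    sample = "ジョン" + DOUBLE_HYPHEN + "スミス"
    encoded = to_legacy_sjis(sample)
    # Decoding gives back the substituted text, not the original character.
    print(encoded.decode("shift_jis"))
```

The substitution is lossy: once the data is in SJIS there is no way to tell a substituted equals sign from a real one, so a later conversion to UTF-8 will not restore the double hyphen.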