Handling UTF- characters in message flow
saurabh867
Posted: Fri Aug 06, 2010 1:07 am

Voyager

Joined: 13 Jun 2010
Posts: 78

Hi,
I have a message flow which reads a file and then parses it against a certain message set. The file contains some UTF-8 characters such as the ® symbol.
After all my processing, the output message contains a special character before the ® character I put in the input. I am also storing the entire message in the database as part of the requirement, and there too I can see a junk value preceding the ® character.
I have used CCSID 1208 and encoding 437. Is there any way to handle this scenario?

Regards,
Saurabh
fjb_saper
Posted: Fri Aug 06, 2010 1:15 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

Depends. Is UTF-8 available as a character set on your platform (see DB requirements for sender)? Is the file data defined to the broker in the correct CCSID before parsing? And please do not use read with Convert, as this will potentially downgrade the CCSID to the CCSID of the qmgr. Use the CCSID on InputRoot.Properties.CodedCharSet...

And remember, in MQ the encoding has to do with the endianness of binary numbers and has nothing to do with the character set.
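
To illustrate the endianness point, here is a minimal Python sketch, independent of MQ and the broker. It only shows that the same integer has two possible byte layouts, which is what a byte-order setting describes; the variable names are made up for the illustration.

```python
import struct

value = 1

big_endian = struct.pack(">i", value)     # most significant byte first
little_endian = struct.pack("<i", value)  # least significant byte first

print(big_endian.hex())     # 00000001
print(little_endian.hex())  # 01000000

# Reading the bytes with the wrong byte order yields a different number.
misread = struct.unpack("<i", big_endian)[0]
print(misread)  # 16777216
```

None of this affects how a character such as ® is encoded; that is governed by the CCSID alone.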
_________________
MQ & Broker admin
saurabh867
Posted: Fri Aug 06, 2010 1:33 am

Voyager

Joined: 13 Jun 2010
Posts: 78

Yes,
I have used the CCSID for setting Properties only. I agree that the data in the DB may depend on the platform, but the data in the output queue should show the correct character. Why is each such character preceded by the same character, Â, every time?
fjb_saper
Posted: Fri Aug 06, 2010 1:47 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

saurabh867 wrote:
Yes,
I have used the CCSID for setting Properties only. I agree that the data in the DB may depend on the platform, but the data in the output queue should show the correct character. Why is each such character preceded by the same character, Â, every time?

Did you look at the message (hex data + ccsid) on the queue before it gets consumed by the broker? Does the data match the ccsid?
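
As a rough aid for that check, the following Python sketch prints the bytes you would expect to see for ® under two character sets. It assumes you can copy the hex of the message payload from whatever queue browser you use; `hex_dump` is a hypothetical helper, not a broker or MQ function.

```python
def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)

symbol = "®"  # U+00AE

as_utf8 = symbol.encode("utf-8")         # what CCSID 1208 (UTF-8) data should contain
as_latin1 = symbol.encode("iso-8859-1")  # what CCSID 819 (ISO-8859-1) data should contain

print(hex_dump(as_utf8))    # C2 AE
print(hex_dump(as_latin1))  # AE
```

If the message claims CCSID 1208 but the payload holds a lone AE, or claims a single-byte CCSID but holds C2 AE, the data and the CCSID do not match.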
_________________
MQ & Broker admin
saurabh867
Posted: Fri Aug 06, 2010 4:04 am

Voyager

Joined: 13 Jun 2010
Posts: 78

Actually, the issue comes when I send my data to the WTX node from the broker: the output of the TX node contains an extra junk character for every special character.
But the same input works fine when I run the file independently with the TX map.
Any idea how this could happen?
mqjeff
Posted: Fri Aug 06, 2010 4:39 am

Grand Master

Joined: 25 Jun 2008
Posts: 17447

It's almost certainly not a junk character.

It's almost certainly one half of a double-byte UTF-8 character. You see that word "double"? It means "two". So certain UTF-8 characters take *two* bytes, not *one*, to define. So if you are looking at the data using something that presents every byte as a single character, you will see this result.
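
A small Python sketch of that effect, assuming the viewer interprets each byte as an ISO-8859-1 character (the actual tool and its code page are not stated in this thread):

```python
raw = "®".encode("utf-8")  # b'\xc2\xae': two bytes for one character

# A viewer that assumes one byte per character shows two characters.
wrong_view = raw.decode("iso-8859-1")
# A viewer that knows the data is UTF-8 shows one.
right_view = raw.decode("utf-8")

print(len(raw))     # 2
print(wrong_view)   # Â®
print(right_view)   # ®
```

The bytes are identical in both cases; only the interpretation differs, which is why the first byte (C2) shows up as Â.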

If you are only having issues because you are *seeing* this character, but you are able to *process* the file correctly, then there is no bug at all.

If you are having issues PROCESSING this file, then it is because something in the message flow is not properly indicating the correct CCSID for the data, and so the data is being serialized incorrectly.
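
One way such incorrect serialization can happen is double encoding: UTF-8 bytes are labelled with a single-byte CCSID and then converted to UTF-8 a second time. The Python sketch below is only an illustration of that failure mode, assuming ISO-8859-1 as the wrong label; it does not claim this is what the flow or WTX actually did.

```python
original = "®".encode("utf-8")  # C2 AE, correct UTF-8

# A component that believes the bytes are ISO-8859-1 converts them to UTF-8 again.
double_encoded = original.decode("iso-8859-1").encode("utf-8")
print(double_encoded.hex())  # c382c2ae

# The data now really contains two characters, so comparisons fail.
print(double_encoded.decode("utf-8") == "®")  # False

# Reversing the mistaken conversion recovers the original bytes.
repaired = double_encoded.decode("utf-8").encode("iso-8859-1")
print(repaired == original)  # True
```

If the hex on the output shows C3 82 C2 AE rather than C2 AE, the data itself has been changed and the problem is more than a display issue.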

Take a user trace. Pay very close attention to everything you do with CCSID, including what the CCSID on the message is when it goes into WTX and when it comes out of WTX.
saurabh867
Posted: Fri Aug 06, 2010 5:18 am

Voyager

Joined: 13 Jun 2010
Posts: 78

You have a point, because I tried comparing the value of the element containing the extra character (the "one half") against the original value without the extra character, and the comparison passed.
So does that mean the data is correct and it is just the representation of the data that is not displayed properly? In that case, is there any way to see the correct data, and do I need to contact my DBA? I am putting my message in the database and it has that extra character, which does not look good from the end user's perspective.

Regards,
Saurabh
kimbert
Posted: Fri Aug 06, 2010 7:24 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Quote:
So does that mean the data is correct and it is just the representation of the data that is not displayed properly?
You need to answer that question for yourself. If you cannot, then you need to learn how. Please see this article for reasons why I believe this:
http://www.joelonsoftware.com/articles/Unicode.html
saurabh867
Posted: Sun Aug 08, 2010 8:53 pm

Voyager

Joined: 13 Jun 2010
Posts: 78

Thanks Kimbert,
I did answer the question, and yes, this article is a must-read for an understanding of encoding.

Regards,
Saurabh

Copyright © MQSeries.net. All rights reserved.