Reading japanese character through MQ

bruce2359

Posted: Thu Aug 11, 2011 7:59 am Post subject:

Poobah

Joined: 05 Jan 2008
Posts: 9475
Location: US: west coast, almost. Otherwise, enroute.

This is MQ101 stuff.

The MQMD describes the application data.

The qmgr will set some of the MQMD fields, but the application is responsible for setting all of the MQMD fields to accurately describe the application data.

Clearly, Japanese is not ccsid 500, it is some other ccsid. The application that created the message blew it.

Smack the app developer with something attention-getting.
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.

bruce2359

Posted: Thu Aug 11, 2011 8:10 am Post subject:

Poobah

Joined: 05 Jan 2008
Posts: 9475
Location: US: west coast, almost. Otherwise, enroute.

You cannot successfully convert to ccsid 500 until you know exactly what is the ccsid of the data the app developer put into the message.

If the application data is in DBCS, you cannot convert it to ccsid 500, which is SBCS.

CCSID 500 in the MQMD appears to be incorrect (a lie) for the app data. Ask the developers EXACTLY what kind of Japanese (ccsid) the app data is.

Read this very carefully: the CCSID field in the MQMD may not accurately identify the ccsid of the application payload. It is the responsibility of the application to set the ccsid in the MQMD to the ccsid of the application payload. The qmgr cannot know what the ccsid of the app data is.
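To make the "describes, does not determine" point concrete, here is a minimal sketch in Python. It is not MQ code: Python codec names stand in for CCSIDs (`cp500` for CCSID 500, `cp932` for a Japanese PC code page), and the `Message` class is made up purely for illustration. The same payload bytes come out as different text depending on which label you trust.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical stand-in for an MQ message: a code page label plus raw bytes."""
    labelled_codec: str  # what the header claims (the role the MQMD CCSID field plays)
    payload: bytes       # what the application actually wrote


# The sender wrote halfwidth katakana in a Japanese PC code page (assumed: cp932)...
actual_text = "ﾄｳｷｮｳﾄ"
payload = actual_text.encode("cp932")

# ...but left the label at 500.
msg = Message(labelled_codec="cp500", payload=payload)

# Trusting the label decodes without any error, yet yields the wrong characters.
trusting_label = msg.payload.decode(msg.labelled_codec)

# Decoding with the code page the data was really written in recovers the text.
using_real_codec = msg.payload.decode("cp932")

print(repr(trusting_label))
print(using_real_codec)
```

Nothing in the bytes themselves says which decode is right; only the sender knows, which is why the label has to be set correctly at put time.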

vinothkhannas

Posted: Thu Aug 11, 2011 9:14 am Post subject:

Novice

Joined: 09 Aug 2011
Posts: 13

Hi All,

I just discovered that the byte content differs between the queue and my end. What I mean is: I took the entire byte content from the queue using queue commands, and separately I read all the bytes from the queue at my end using .NET and wrote them to a file. When I compared the hex values, I found that the hex values for the kana in the file I wrote differ from those on the queue.

So from the queue itself I am getting junk values for the Japanese characters, and I somehow need to convert this junk into Japanese.

Will this information help in anyway?

Vitor

Posted: Thu Aug 11, 2011 9:25 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

contact admin wrote:

Will this information help in anyway?

Yes. It confirms everything that's been said so far in this thread.

You either:

a) Need to have the message sent such that the CCSID in the MQMD matches the actual CCSID of the message payload, not just the default 500 that WMQ sticks in the CCSID field when the sender doesn't bother to make it match, so the payload doesn't turn to junk

or

b) Treat the message as a stream of bytes and write code yourself to convert it to text on a byte by byte basis.
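For what option b) might look like, here is a rough Python sketch. The table is deliberately partial: it holds only the four byte values quoted later in this thread for the string ﾄｳｷｮｳﾄ (8B 63 67 56 63 8B). The names `PARTIAL_TABLE` and `convert_bytes` are invented for this example, and a real table would need every entry from the code page definition for whatever CCSID the sender confirms.

```python
# Partial byte -> character table, built ONLY from the bytes quoted in this thread.
# It is an illustration, not a complete code page.
PARTIAL_TABLE = {
    0x8B: "ﾄ",
    0x63: "ｳ",
    0x67: "ｷ",
    0x56: "ｮ",
}


def convert_bytes(payload: bytes, table: dict) -> str:
    """Convert payload one byte at a time; fail loudly on bytes the table lacks."""
    chars = []
    for offset, byte in enumerate(payload):
        if byte not in table:
            raise ValueError(f"no mapping for byte 0x{byte:02X} at offset {offset}")
        chars.append(table[byte])
    return "".join(chars)


payload = bytes.fromhex("8B 63 67 56 63 8B")
print(convert_bytes(payload, PARTIAL_TABLE))
```

Raising on an unknown byte, rather than substituting a placeholder, keeps a wrong or incomplete table from silently producing junk.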
_________________
Honesty is the best policy.
Insanity is the best defence.

bruce2359

Posted: Thu Aug 11, 2011 10:19 am Post subject:

Poobah

Joined: 05 Jan 2008
Posts: 9475
Location: US: west coast, almost. Otherwise, enroute.

Vitor wrote:

contact admin wrote:

Will this information help in anyway?

Option a) is the appropriate fix: have the app developer correct the application so that it sets the CCSID field of the MQMD to correctly specify the ccsid of the app data. Down-stream applications can then MQGET the message with the MQGMO_CONVERT option, which will have the qmgr drive conversion of the application data.

Option b) is needless manual work that should be accomplished through existing wmq facilities (see above).

fjb_saper

Posted: Thu Aug 11, 2011 1:44 pm Post subject:

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20763
Location: LI,NY

contact admin wrote:

Hi All,

I just discovered that the byte content differs between the queue and my end. What I mean is: I took the entire byte content from the queue using queue commands, and separately I read all the bytes from the queue at my end using .NET and wrote them to a file. When I compared the hex values, I found that the hex values for the kana in the file I wrote differ from those on the queue.

So from the queue itself I am getting junk values for the Japanese characters, and I somehow need to convert this junk into Japanese.

Will this information help in anyway?

So what gave you the conclusion that the content of the queue is wrong when obviously you don't have the same content in the file? Could it not be that the content of the file is wrong?

So you need to check carefully...

What is the CCSID of the message on the queue?
What is the format of the message on the queue (MQFMT_STRING)?
What is the CCSID on the MQMD when you retrieve (read/browse) the message from the queue (hopefully 1200 or 1208)?
Does the output of your .NET program correspond to the correct CCSID translation (check the bytes)?
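The "check bytes" step can be done mechanically. A small Python sketch, assuming you have both byte strings in hand (the helper names `hex_dump` and `first_difference` are made up here, and the second sample is just the 6F run reported elsewhere in this thread, used as a stand-in for a lossy file):

```python
def hex_dump(data: bytes) -> str:
    """Render bytes as space-separated upper-case hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


def first_difference(a: bytes, b: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        # One input is a prefix of the other; they diverge where the shorter ends.
        return min(len(a), len(b))
    return None


on_queue = bytes.fromhex("8B 63 67 56 63 8B")  # payload as browsed on the queue
in_file = bytes.fromhex("6F 6F 6F 6F 6F 6F")   # stand-in for what a lossy read wrote out

print("queue:", hex_dump(on_queue))
print("file: ", hex_dump(in_file))
print("first difference at offset:", first_difference(on_queue, in_file))
```

If the two dumps differ, something between the queue and the file converted the data, and the next question is which CCSID it converted from and to.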

Back to work now...

_________________
MQ & Broker admin

bruce2359

Posted: Thu Aug 11, 2011 6:55 pm Post subject:

Poobah

Joined: 05 Jan 2008
Posts: 9475
Location: US: west coast, almost. Otherwise, enroute.

fjb_saper wrote:

What is the CCSID of the message on the queue?

This means: what is the ccsid of the application data payload of the message on the queue?

You have been asked several times to find out from the developer exactly what is the ccsid (if any) of the application data he/she put into the message that was put to the queue. Once again, the CCSID field of the MQMD does not determine what ccsid the application data is. The purpose of the ccsid field in the MQMD is to identify the ccsid of the application data.

And, once again, you cannot develop a conversion program to convert an unknown ccsid to ccsid(500) or anything else.

fjb_saper

Posted: Thu Aug 11, 2011 7:46 pm Post subject:

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20763
Location: LI,NY

bruce2359 wrote:

fjb_saper wrote:

What is the CCSID of the message on the queue?

This means: what is the ccsid of the application data payload of the message

Let's be really precise here.
There are 2 things I want to know:

CCSID of the payload
Value of the CCSID field on the different headers on the message

Use RFHUtil(c) without conversion to make sure you can identify everything right. You will need to display the data in HEX or mixed hex + ascii to correctly identify the CCSID of the data. Do use the information from the developer as a guide, but in view of the history there, you should really verify it against the hex view of the payload.

Food for thought: see IBM's table of CCSIDs.


Last edited by fjb_saper on Fri Aug 12, 2011 12:35 am; edited 3 times in total

bruce2359

Posted: Thu Aug 11, 2011 8:30 pm Post subject:

Poobah

Joined: 05 Jan 2008
Posts: 9475
Location: US: west coast, almost. Otherwise, enroute.

fjb_saper wrote:

You will need to display the data in HEX or mixed hex + ascii to correctly identify the CCSID of the data.

Presuming that you understand what the payload data content should look like. Quick quiz: what codepage is x'4e'?
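One way to see why a hex view alone cannot name the code page: a single byte value is valid in many of them. A quick Python sketch (Python codec names stand in for CCSIDs; `cp500` and `cp037` are its EBCDIC codecs):

```python
byte = bytes([0x4E])

# Decode the same byte under an ASCII-family codec and two EBCDIC codecs.
decoded = {codec: byte.decode(codec) for codec in ("ascii", "cp500", "cp037")}

for codec, char in decoded.items():
    print(f"{codec:6} x'4e' -> {char!r}")
```

The byte is legal in all three, so you need to know what the text is supposed to say before the hex tells you anything.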

fjb_saper wrote:

Do use the information from the developer as a guide, but in view of the history there, you should really verify it against the hex view of the payload.

The originator of the payload would seem to be the best source of the ccsid identity. If the developer won't tell you, then go over his/her head, and find out. This should not be a secret.

vinothkhannas

Posted: Sat Aug 13, 2011 12:00 am Post subject:

Novice

Joined: 09 Aug 2011
Posts: 13

Hi All,

Thanks for your replies. The sender had given me the wrong values, and I took an old value for comparison.
So the scenario now is that I am getting the same hex values that the sender is using. The CCSID at my end and at the sender's end are now the same (i.e. 500). So when I read the bytes I am getting the same hex values.

I'll be processing this information as bytes, so I guess this is enough for me. I learnt a lot in this thread.
Btw, just in case anyone knows how to make those Japanese characters viewable in IE, please help me. I don't exactly have to do this, but I'm just curious why it's not displaying in hex though I have given the header as UTF-8 at the top of the XML.

Thanks all:)

Vitor

Posted: Sat Aug 13, 2011 3:51 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

contact admin wrote:

but I'm just curious why it's not displaying in hex though I have given the header as UTF-8 at the top of the XML.

Putting UTF-8 in the XML declaration is like setting a CCSID; it doesn't set the code page, it describes it. So your document still has whatever content you've written out.

And UTF-8 doesn't display as hex. Ever.

vinothkhannas

Posted: Sat Aug 13, 2011 4:09 am Post subject:

Novice

Joined: 09 Aug 2011
Posts: 13

Vitor wrote:

And UTF-8 doesn't display as hex. Ever.

Hey, I guess I stated that wrongly. The idea is: I have the proper hex values, but in IE it turns out as the following.
Â»Ã„Ã…Ã®Ã„Â»

But the hex values of the above string after CP500 (CCSID 500) encoding correspond to the original Japanese values, so I was wondering how to bring back the original Japanese values from the above string.
Original content : ﾄｳｷｮｳﾄ

Is there a way? Or should I do a series of encodings in .NET after retrieving it?
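A possible "series of encodings", sketched in Python rather than .NET. It rests on two assumptions that are guesses, not facts from the sender: that the string as it appears in this post is UTF-8 text that got displayed as Windows-1252, and that underneath it is the payload decoded as CCSID 500 (`cp500` in Python). Undoing both layers gets back to raw bytes.

```python
displayed = "Â»Ã„Ã…Ã®Ã„Â»"  # the string exactly as it appears in this post

# Assumption 1: UTF-8 bytes were shown as Windows-1252. Undo that display layer.
step1 = displayed.encode("cp1252").decode("utf-8")

# Assumption 2: step1 is the payload decoded as code page 500.
# Re-encoding with the same code page gives the original bytes back.
original_bytes = step1.encode("cp500")

print(step1)
print(" ".join(f"{b:02X}" for b in original_bytes))
```

Turning those bytes into katakana is a separate step that needs the sender's real CCSID. As far as I know Python ships no codec for the Japanese EBCDIC code pages mentioned in this thread, so that step is not shown.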

fjb_saper

Posted: Sat Aug 13, 2011 5:04 am Post subject:

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20763
Location: LI,NY

Well, to display it with IE you may want to transform it to CCSID 1208 (UTF-8) before you serve it up to IE. And set the content encoding tag in IE to UTF-8.

Have fun


rekarm01

Posted: Mon Aug 22, 2011 3:56 am Post subject:

Grand Master

Joined: 25 Jun 2008
Posts: 1415

bruce2359 wrote:

This is from the WMQ Application Programming Reference, which identifies WMQ-supported ccsids, AND to which other ccsids they can be converted by WMQ:
Japanese Latin SBCS ccsid 1027 (converts to 500)
Japanese Katakana SBCS ccsid 290 (converts to 500)
Japanese Kanji/ Latin Mixed ccsids 1399, 5035 (do not convert to 500)
Japanese Kanji/ Katakana Mixed ccsids 1390, 5026 (do not convert to 500)

Vitor wrote:

This demonstrates how much direct experience I have of Japanese...

This list just identifies whether or not MQGET will return an MQCC_WARNING / MQRC_NOT_CONVERTED reason code for unsupported conversions.

It does not imply that ccsid=500 can represent Japanese characters.

For supported conversions, MQGET could still substitute all Japanese characters with '?', and happily return an MQCC_OK / MQRC_NONE.
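The same "succeeds but destroys the data" behaviour is easy to reproduce outside MQ. A Python analogy only (this is Python's codec machinery, not MQGET; `cp500` stands in for ccsid 500):

```python
text = "ﾄｳｷｮｳﾄ"

# Lenient conversion: every unmappable character silently becomes '?',
# which is byte 0x6F in code page 500. No error is raised.
lossy = text.encode("cp500", errors="replace")
print(lossy.hex())

# Strict conversion (the default) refuses instead of inventing data.
try:
    text.encode("cp500")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
print("strict conversion failed:", strict_failed)
```

A clean return code from a lenient conversion therefore says nothing about whether the characters survived.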

rekarm01

Posted: Mon Aug 22, 2011 4:04 am Post subject:

Grand Master

Joined: 25 Jun 2008
Posts: 1415

contact admin wrote:

I checked with amqsbcg and the hex values in my queue are correct.

Code:

ﾄｳｷｮｳﾄ
8B 63 67 56 63 8B

Whether these hex values are correct depends on the given ccsid in the MQ header:

Code:

ccsid=290,930: X'95 83 87 55 83 95' (Japanese EBCDIC)
ccsid=939,1027: X'8b 63 67 56 63 8b' (Japanese Latin EBCDIC)
ccsid=897,932,943,1041: X'c4 b3 b7 ae b3 c4' (Japanese PC-DATA)
ccsid=1200: X'ff84 ff73 ff77 ff6e ff73 ff84' (UTF-16BE)
ccsid=1208: X'efbe84 efbdb3 efbdb7 efbdae efbdb3 efbe84' (UTF-8)

ccsid=500: X'?? ... ??' (International EBCDIC)

Converting from one ccsid to another should actually change the hex values to match the new ccsid.
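Three of those rows can be checked with nothing but Python's built-in codecs. This is a sketch under the assumption that `cp932` is a fair stand-in for the Japanese PC-DATA row, `utf-16-be` for ccsid 1200 and `utf-8` for ccsid 1208; the EBCDIC rows are left out because I know of no standard Python codec for them.

```python
text = "ﾄｳｷｮｳﾄ"

# Hex values copied from the table above, spaces removed.
expected = {
    "cp932": "c4b3b7aeb3c4",
    "utf-16-be": "ff84ff73ff77ff6eff73ff84",
    "utf-8": "efbe84efbdb3efbdb7efbdaeefbdb3efbe84",
}

for codec, hex_value in expected.items():
    actual = text.encode(codec).hex()
    print(f"{codec:10} {actual}  matches table: {actual == hex_value}")
```

Same six characters, three different byte sequences: a real conversion changes the hex, and identical hex on both sides means no conversion took place.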

As others have stated, ccsid=500 cannot represent Japanese characters. That explains why:

contact admin wrote:

If i convert the above string to 500 am getting 6F 6F...6F (six times)

So, stop doing that.

contact admin wrote:

The CCSID at my end and at the sender's end are now the same (i.e. 500). So when I read the bytes I am getting the same hex values.

I'll be processing this information as bytes, so I guess this is enough for me.

Ok, that's another way to go. Just treat the data as bytes, and ignore the bad ccsid. Maybe that's enough?

contact admin wrote:

Btw, just in case anyone knows how to make those Japanese characters viewable in IE, please help me. I don't exactly have to do this, but I'm just curious why it's not displaying in hex though I have given the header as UTF-8 at the top of the XML.

Then again, maybe that's not enough? This is the same problem. The XML declaration needs to describe the actual byte encoding. If the byte encoding isn't UTF-8, then either fix the declaration, or convert the bytes to UTF-8.
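A small Python sketch of the "convert the bytes to UTF-8" route, using the standard library XML parser. The `<city>` element is invented for the example, and `cp932` is only an assumed source encoding.

```python
import xml.etree.ElementTree as ET

text = "ﾄｳｷｮｳﾄ"
declaration = b'<?xml version="1.0" encoding="UTF-8"?>'

# Declaration says UTF-8, but the bytes are really cp932.
# With these particular bytes the parser rejects the document.
mislabelled = declaration + b"<city>" + text.encode("cp932") + b"</city>"
try:
    ET.fromstring(mislabelled)
    mislabelled_ok = True
except ET.ParseError:
    mislabelled_ok = False

# Fix: convert the bytes so they match what the declaration claims.
converted = declaration + b"<city>" + text.encode("utf-8") + b"</city>"
parsed = ET.fromstring(converted).text

print("mislabelled document parsed:", mislabelled_ok)
print("converted document text:", parsed)
```

Other mislabelled byte sequences may happen to be valid UTF-8 and parse into the wrong characters instead of failing, which is the harder case to spot.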
