Author |
Message
|
siljcjose |
Posted: Tue Mar 10, 2009 3:03 am Post subject: ANSI Supporting CCSID values |
|
|
Apprentice
Joined: 18 Aug 2005 Posts: 27
|
Hi,
I have a requirement wherein the data needs to be sent to the destination in ANSI format. The data in the message includes Korean and English characters. Currently I am using CCSID 1208, and the data reaches the destination with both Korean and English characters intact, but it is in UTF-8. The destination application needs it in ANSI format. Can anyone please help me find a CCSID that supports Korean and English and is ANSI?
Thanks |
|
|
 |
Vitor |
Posted: Tue Mar 10, 2009 3:07 am Post subject: Re: ANSI Supporting CCSID values |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
siljcjose wrote: |
Currently I am using CCSID 1208, and the data reaches the destination with both Korean and English characters intact, but it is in UTF-8. The destination application needs it in ANSI format. Can anyone please help me find a CCSID that supports Korean and English and is ANSI? |
You could sensibly say that the best practice model is "receiver makes good" and that the destination should convert into whatever CCSID they desire. You could facilitate this by using UTF-16 rather than UTF-8 and giving them the widest possible choice.
Failing that, ask them what ANSI CCSID they'd like.
_________________
Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
kimbert |
Posted: Tue Mar 10, 2009 5:15 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Quote: |
You could facilitate this by using UTF-16 rather than UTF-8 and giving them the widest possible choice |
Not sure what you mean there. Can you elaborate please? |
|
|
 |
Vitor |
Posted: Tue Mar 10, 2009 5:20 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
kimbert wrote: |
Quote: |
You could facilitate this by using UTF-16 rather than UTF-8 and giving them the widest possible choice |
Not sure what you mean there. Can you elaborate please? |
It could mean I've had too much coffee this morning, but I was indicating that, if there's any question about the destination site's requirements, you could send it as double-byte Unicode and be sure that every possible character type is represented. Overkill when you're going to translate it into 8-bit ASCII, but why kill a little when you can kill a lot? |
|
|
 |
kimbert |
Posted: Tue Mar 10, 2009 7:54 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
UTF-8 can represent any Unicode character. So can UTF-32, UCS and UTF-16. They are all encodings of Unicode.
UTF-8 is a multi-byte encoding, and handles non-ASCII characters by using 2, 3 or 4 bytes.
UTF-16 can handle almost any character using 16 bits. For the rest, it uses a 'surrogate pair' consisting of two 16-bit code units.
UTF-32 is fixed-width, but very wasteful of space. |
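The size differences described above can be checked with a minimal Python sketch. The sample characters and the `encoded_sizes` helper are illustrative choices, not anything from the thread; the `-be` codec variants are used only so that a byte-order mark is not counted.

```python
# Compare how many bytes each Unicode encoding needs for one character.
# The helper name and the sample characters are illustrative assumptions.

def encoded_sizes(ch):
    """Return the byte count of a single character in each encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),  # -be: no BOM is prepended
        "utf-32": len(ch.encode("utf-32-be")),
    }

samples = {
    "ASCII letter": "A",                 # U+0041
    "Hangul syllable": "\ud55c",         # U+D55C, inside the Basic Multilingual Plane
    "Supplementary": "\U0001F600",       # U+1F600, needs a UTF-16 surrogate pair
}

for label, ch in samples.items():
    print(label, encoded_sizes(ch))
```

A Hangul syllable takes 3 bytes in UTF-8 but only 2 in UTF-16, which is why UTF-16 is sometimes preferred for mostly-Korean text; neither choice affects which characters can be represented.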
|
|
 |
siljcjose |
Posted: Tue Mar 10, 2009 8:00 pm Post subject: |
|
|
Apprentice
Joined: 18 Aug 2005 Posts: 27
|
Thanks for your answers.
When I use UTF-8, all the characters appear fine, but the client has a strange requirement that the file be ANSI. We are FTPing the file, and when they open it they see it as UTF-8, but they want it as ANSI.
When I use a CCSID such as 819, I can see that the file is saved as ANSI, but the Korean characters are lost.
So I was wondering whether there is a CCSID value that makes the file an ANSI file and also preserves the English and Korean characters. |
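The loss of Korean characters under CCSID 819 can be reproduced outside the broker with a small Python sketch. It assumes that Python's `latin-1` codec stands in for CCSID 819 (ISO 8859-1) and that `euc_kr` roughly corresponds to CCSID 970; the helper names and the sample text are hypothetical, and the actual conversion tables used by the broker may differ in detail.

```python
# Sketch: why a single-byte Latin code page drops Korean text.
# Assumptions: "latin-1" ~ CCSID 819, "euc_kr" ~ CCSID 970 (approximate mappings).

TEXT = "Hello 안녕하세요"

def encode_lossy(text):
    """Encode to ISO 8859-1, substituting '?' for unrepresentable characters."""
    return text.encode("latin-1", errors="replace")

def round_trips(text, codec):
    """Return True if the text survives an encode/decode cycle in this codec."""
    try:
        return text.encode(codec).decode(codec) == text
    except UnicodeError:
        return False

print(encode_lossy(TEXT))            # the Hangul is replaced by question marks
print(round_trips(TEXT, "latin-1"))  # False: ISO 8859-1 has no Hangul
print(round_trips(TEXT, "euc_kr"))   # True: a Korean mixed-byte code page keeps both
```

The point of the sketch is only that a single-byte Western code page has no room for Hangul, whereas a Korean mixed-byte code page carries both scripts; which Korean CCSID the destination actually expects still has to come from the client.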
|
|
 |
smdavies99 |
Posted: Tue Mar 10, 2009 10:59 pm Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
A file encoded as ANSI will not show any Korean characters, as they are not defined in the ANSI spec. ANSI = American National Standards Institute. This was (back in the days when I did some DEC VT102 custom character set microcoding) a 7-bit character set, so you were even more limited in what you could send. We switched to 8-bit so we could at least draw lines on the screen with FMS-11. (Forms on a dumb terminal etc.)
There is no way (AFAIK) that you can satisfy the requirement to store the data with both languages using ANSI.
The ONLY way (IMHO) is to get the requirement changed in some way or the other.
There are two choices.
a) Drop the Korean
b) Change to UTF-8
You could be sly and make your flow do either, with the choice made using a promoted property.... _________________ WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995
Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions. |
|
|
 |
Vitor |
Posted: Wed Mar 11, 2009 12:23 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
siljcjose wrote: |
So I was wondering whether there is a CCSID value that makes the file an ANSI file and also preserves the English and Korean characters. |
Whoever's receiving the file must. Again I say ask them what CCSID they're using. |
|
|
 |
rekarm01 |
Posted: Wed Mar 11, 2009 2:35 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 1415
|
siljcjose wrote: |
... but the client has a strange requirement that they need the file as ANSI. |
It's not clear what "ANSI" is supposed to mean here, but IBM defines several dozen ASCII-based Korean CCSIDs, that make use of one ANSI-based standard or another. It's really a bad idea to guess here; so many CCSIDs are similar, but might have subtle differences in how they behave. The client ought to be much more specific about which CCSID it requires.
What is the source application's CCSID? Is it 1208 (UTF-8), or something else (which would limit which CCSIDs are available to the destination)? What application does the destination use to "open the file", and what can the client tell you about how it's configured? Does the destination expect just the English and Korean letters, or can it accept hangul syllables or hanja characters (or user-defined characters) too?
Can the client provide a sample file that meets its requirements, amenable to a hex-dump? Here are some common Korean CCSIDs to compare against:
- single-byte: 891, 1040
- mixed-byte: 970, 949, 1363
And here are some more. |
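As a rough illustration of the hex-dump comparison suggested above, the following Python sketch dumps a sample's bytes and tries a few candidate decodings. The CCSID-to-codec mapping is an assumption made for illustration only: Python codec names are not IBM CCSIDs, similar-looking names should not be taken as equivalent, and the mapping should be verified against IBM's conversion tables. The helper names and the sample bytes are hypothetical.

```python
# Sketch: inspect a client-supplied sample and see which encodings can decode it.
# ASSUMPTION: the mapping below is approximate and for illustration only.
CANDIDATES = {
    "CCSID 1208 (UTF-8)": "utf-8",
    "CCSID 970 (EUC-KR)": "euc_kr",
    "CCSID 1363 (Windows Korean)": "cp949",
}

def hex_dump(data, width=16):
    """Return the bytes as lines of space-separated hex pairs."""
    return [
        " ".join(f"{b:02x}" for b in data[i:i + width])
        for i in range(0, len(data), width)
    ]

def try_decodings(data):
    """Map each candidate label to the decoded text, or None if decoding fails."""
    results = {}
    for label, codec in CANDIDATES.items():
        try:
            results[label] = data.decode(codec)
        except UnicodeDecodeError:
            results[label] = None
    return results

if __name__ == "__main__":
    # Stand-in for the client's sample file; in practice read it with open(path, "rb").
    sample = "\ud55c\uae00 test".encode("euc_kr")
    print("\n".join(hex_dump(sample)))
    for label, text in try_decodings(sample).items():
        print(label, "->", text)
```

A successful decode does not prove that the guess is right, since several Korean code pages share most of their byte assignments; it only narrows the list to take back to the client.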
|
|
 |
|