UTF-8 to ISO-8859-8 EBCDIC Conversion |
Author |
Message
|
18tillidie |
Posted: Thu Sep 03, 2020 2:12 pm Post subject: UTF-8 to ISO-8859-8 EBCDIC Conversion |
|
|
Newbie
Joined: 04 May 2020 Posts: 9
|
Hi,
I need to convert a file on Unix UTF-8 to ebcdic ISO-8859-8 format. Need help to identify the exact ccsid and encoding for conversion.
I am using SFTP mechanism of the file input( pull utf-8 ) and output node(Push EBCDIC). SFTP uses binary transfer by default.
Earlier I was able to convert EBCDIC file to readable UTF-8 using below CCSID and encoding.
DECLARE fileData CHARACTER CAST(InputRoot.BLOB.BLOB AS CHARACTER CCSID 37 ENCODING 546);
Now I need help to do the conversion the other way round.
Any help will be much appreciated.
Thanks in advance. |
|
fjb_saper |
Posted: Thu Sep 03, 2020 6:46 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
AFAIK you don't get full UTF-8 to EBCDIC conversion.
You will get a subset of UTF-8 to EBCDIC, depending on the EBCDIC code page (Arabic, Hebrew, Hindi, etc.).
So be prepared that some characters in your UTF-8 file might not be convertible into your EBCDIC target code page (37 is standard US and 500 is International English). |
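One way to find out ahead of time which characters would not survive is to test the text against the target code page outside the broker. The following is only a sketch using Python's built-in codecs. The function name `unconvertible_chars` is made up for illustration, and it assumes that Python's `cp424` and `cp037` codecs use the same tables as the broker's CCSID 424 and 37, which nothing in this thread confirms.

```python
def unconvertible_chars(text: str, codec: str) -> list:
    """Return the distinct characters of `text` that `codec` cannot encode.

    `codec` is a Python codec name, e.g. "cp424" (assumed here to
    correspond to CCSID 424) or "cp037" (assumed to correspond to CCSID 37).
    """
    missing = []
    for ch in text:
        try:
            ch.encode(codec)  # strict mode: raises if ch has no mapping
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


if __name__ == "__main__":
    # Latin letters, Hebrew letters, and one CJK character.
    sample = "ABC \u05e9\u05dc\u05d5\u05dd \u4e2d"
    for codec in ("cp424", "cp037"):
        print(codec, [hex(ord(c)) for c in unconvertible_chars(sample, codec)])
```

An empty list means every character has a mapping in that code page. Anything listed would need to be replaced or removed before the conversion.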
|
rekarm01 |
Posted: Fri Sep 04, 2020 11:28 am Post subject: Re: UTF-8 to ISO-8859-8 EBCDIC Conversion |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 1415
|
18tillidie wrote: |
I need to convert a file on Unix UTF-8 to ebcdic ISO-8859-8 format. Need help to identify the exact ccsid and encoding for conversion. |
ISO-8859-8 is ASCII-based, not EBCDIC. It uses a Latin/Hebrew character set. Some possibly relevant ccsids are:
- ccsid=819, Latin-1 USA/Canada/W.Europe (ISO 8859-1)
- ccsid=037, Latin-1 USA/Canada/W.Europe (EBCDIC)
- ccsid=916, Latin/Hebrew (ISO 8859-8)
- ccsid=424, Latin/Hebrew (EBCDIC)
- ccsid=1208, UTF-8
Converting between UTF-8 and single-byte character sets does not require an Encoding, but it is typically 273 for big-endian platforms (like Unix), 546 for little-endian platforms (like Windows), or 785 for z/OS.
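To make the difference between these ccsids concrete, here is a small sketch that converts the same UTF-8 bytes to the ISO 8859-8 and EBCDIC Hebrew code pages outside the broker. The `CCSID_TO_CODEC` table and the `convert` function are hypothetical names, and the pairing of each ccsid with a Python codec is an assumption rather than something IBM documents. No Encoding value appears because only single-byte targets are involved.

```python
# Assumed pairing of the ccsids listed above with Python codec names.
CCSID_TO_CODEC = {
    819: "iso8859_1",   # Latin-1 (ISO 8859-1)
    37: "cp037",        # Latin-1 (EBCDIC)
    916: "iso8859_8",   # Latin/Hebrew (ISO 8859-8)
    424: "cp424",       # Latin/Hebrew (EBCDIC)
    1208: "utf_8",      # UTF-8
}


def convert(data: bytes, from_ccsid: int, to_ccsid: int) -> bytes:
    """Decode `data` from one ccsid and re-encode it in another.

    Raises UnicodeDecodeError or UnicodeEncodeError when a byte or
    character has no mapping in the source or target code page.
    """
    text = data.decode(CCSID_TO_CODEC[from_ccsid])
    return text.encode(CCSID_TO_CODEC[to_ccsid])


if __name__ == "__main__":
    utf8 = "A\u05d0".encode("utf-8")  # Latin 'A' followed by Hebrew alef
    print("916:", convert(utf8, 1208, 916).hex())
    print("424:", convert(utf8, 1208, 424).hex())
```

The two outputs differ for the same text, which is why the target has to be chosen as either 916 (ASCII-based) or 424 (EBCDIC), not both.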
18tillidie wrote: |
I am using SFTP mechanism of the file input( pull utf-8 ) and output node(Push EBCDIC). SFTP uses binary transfer by default. |
When using character-based parsers (like XMLNSC or JSON), it's usually sufficient for input to specify the input ccsid on the FileInput node, or for output to set OutputRoot.Properties.CodedCharSetId, and let the parser perform the conversion automatically. But BLOB requires an explicit CAST. For input, something like:
Code: |
DECLARE fileData CHARACTER CAST(InputRoot.BLOB.BLOB AS CHARACTER CCSID InputRoot.Properties.CodedCharSetId); |
Or for output, something like:
Code: |
SET OutputRoot.Properties.CodedCharSetId = 37; -- Latin-1, EBCDIC; use 424 for Latin/Hebrew EBCDIC
SET OutputRoot.BLOB.BLOB = CAST(fileData AS BLOB CCSID OutputRoot.Properties.CodedCharSetId); |
And, as fjb_saper mentioned, if the message contains characters outside of the source/target character set, then the CAST will throw an exception. |
|