Character Conversion Problem
jonew |
Posted: Tue Oct 17, 2006 5:03 am Post subject: Character Conversion Problem |
|
|
Newbie
Joined: 29 May 2003 Posts: 6
|
Environment details:
Source System : SAP R/3 4.6c, AIX, MQ Series 5.3 CSD 8
Broker System : AIX, MQ Series 5.3 CSD 8, MQSI 2.1
Destination : Windows Server 2003, MQ Series 5.3 CSD 8
The data coming from SAP contains product descriptions in Russian text. The AIX systems are set up in English (code page 819). The destination is set up for English and Russian and uses code page 866 (or 1251).
The flow makes no assumptions about the data and simply maps input to output. With a trace node placed after the compute node but before the MQOutput node, the product description field still holds the original Russian bytes. Once the message is on the Xmit queue, however, those bytes have been converted to 0x1A. This happens before the message crosses the channel (the channel is stopped), and the channel has Data Conversion (No). We set the MQMD CodedCharSetId to 1251 in the flow to try to prevent conversion on transfer to the destination, but on the Xmit queue it shows as 850.
Below is an extract from the trace file (trace node after Process node, before MQOutput):
BIP4060I: Data '(
(0x1000000)Properties = (
(0x3000000)MessageSet = 'DVF3T7S0LM001'
(0x3000000)MessageType = 'R1.EWR5.BASE_UNITS.01'
(0x3000000)MessageFormat = 'TDS'
(0x3000000)Encoding = 273
(0x3000000)CodedCharSetId = 1251
(0x3000000)Transactional = TRUE
(0x3000000)Persistence = TRUE
(0x3000000)CreationTime = GMTTIMESTAMP '2006-10-17 10:48:09.650'
(0x3000000)ExpirationTime = -1
(0x3000000)Priority = 0
(0x3000000)ReplyIdentifier = X'000000000000000000000000000000000000000000000000'
(0x3000000)ReplyProtocol = 'MQ'
(0x3000000)Topic = NULL
)
(0x1000000)MQMD = (
(0x3000000)SourceQueue = 'MQSI.INBOUND.IFF3088'
(0x3000000)Transactional = TRUE
(0x3000000)Encoding = 273
(0x3000000)CodedCharSetId = 1251
(0x3000000)Format = ' '
(0x3000000)Version = 2
(0x3000000)Report = 0
(0x3000000)MsgType = 8
(0x3000000)Expiry = -1
(0x3000000)Feedback = 0
(0x3000000)Priority = 0
(0x3000000)Persistence = 1
(0x3000000)MsgId = X'414d5120512e45313130444145302e32450134dc2001ccbf'
(0x3000000)CorrelId = X'000000000000000000000000000000000000000000000000'
(0x3000000)BackoutCount = 0
(0x3000000)ReplyToQ = ' '
(0x3000000)ReplyToQMgr = 'Q.E110DAE0.2 '
(0x3000000)UserIdentifier = 'mqm '
(0x3000000)AccountingToken = X'0532393030300000000000000000000000000000000000000000000000000006'
(0x3000000)ApplIdentityData = ' '
(0x3000000)PutApplType = 6
(0x3000000)PutApplName = ' '
(0x3000000)PutDate = DATE '2006-10-17'
(0x3000000)PutTime = GMTTIME '10:48:09.650'
(0x3000000)ApplOriginData = ' '
(0x3000000)GroupId = X'000000000000000000000000000000000000000000000000'
(0x3000000)MsgSeqNumber = 1
(0x3000000)Offset = 0
(0x3000000)MsgFlags = 0
(0x3000000)OriginalLength = -1
)
(0x1000021)MRM = (
(0x1000000)R1.EWR5.BASE_UNITS.DETAIL.01 = (
(0x3000000)R1.EWR5.Product_ID = 'A001110101122147256'
(0x3000000)R1.EWR5.Promo_Value = '00'
(0x3000000)R1.EWR5.Loading_Point = '2801'
(0x3000000)R1.EWR5.Material_Description = '100³ Å 12 º¾¼¿»¸¼µ½Â ³ÞÀÌÚØÙ '
(0x3000000)R1.EWR5.Material_Number = '323'
(0x3000000)R1.EWR5.Product_Hierarchy = '001110'
(0x3000000)R1.EWR5.Width = ''
(0x3000000)R1.EWR5.Depth = ''
(0x3000000)R1.EWR5.Height = ''
(0x3000000)R1.EWR5.Gross_Weight = ''
(0x3000000)R1.EWR5.Net_Weight = ''
(0x3000000)R1.EWR5.Net_Measure = 0
(0x3000000)R1.EWR5.Unit_Measure = ''
(0x3000000)R1.EWR5.Product_Code_Type = '02'
)
)
)
' from trace node 'IFF3088.SAP.EWR5.BASE_UNITS_AND_LOCATIONS.02.Trace1'.
The Material_Description field shows the text (my machine does not display Russian, but you can see the characters are not ordinary Latin ones). The hex string looks like:
BA BE BC BF BB B8 BC B5 BD C2 20 B3 DE C0 CC DA D8 D9
On the Xmit queue, these characters have been converted, so the Hex string now looks like:
1A 1A 1A 1A BB 1A 1A B5 1A 1A 20 1A 1A 1A 1A 1A 1A 1A
Also the MQMD code page has changed to 850.
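For anyone comparing the two dumps: 0x1A is the ASCII SUB control character, which converters commonly emit when a character has no counterpart in the target code page. Below is a minimal sketch for counting such substitutes in a captured payload. The function name `count_sub_bytes` is hypothetical and not part of MQ or the broker, and a payload that legitimately contains 0x1A would be counted as well.

```c
#include <stddef.h>

/* Count the bytes equal to 0x1A (ASCII SUB) in a buffer.
 * A high count after a code page conversion suggests that many
 * characters could not be represented in the target code page. */
size_t count_sub_bytes(const unsigned char *buf, size_t len)
{
    size_t i;
    size_t n = 0;

    for (i = 0; i < len; i++) {
        if (buf[i] == 0x1A) {
            n++;
        }
    }
    return n;
}
```

Run against the two 18-byte hex strings above, this gives 0 for the data seen at the trace node and 15 for the data on the Xmit queue; only BB, B5 and the space survive.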
We have also tried setting the MQMD Format to MQFMT_NONE, as shown in the trace above ((0x3000000)Format = ' ').
How do we stop the data being converted?
Note: the output format is an MRM message set, and this field is defined as a STRING. Could this be the problem? If I change it to BINARY the bytes are not converted, but each one becomes double-byte, i.e. BABEBC becomes BA00BE00BC00.
|
jonew |
Posted: Tue Oct 17, 2006 6:42 am Post subject: |
|
|
Newbie
Joined: 29 May 2003 Posts: 6
|
We managed to fix it after some trial and error. Basically, as follows:
-- In the flow when setting up the output message
SET OutputRoot.MQMD.Format = MQFMT_NONE;
SET OutputRoot.MQMD.Encoding = MQENC_NATIVE;
-- and do not set any CCSID on the MQMD
The culprit seems to be the Format value, which we had originally set to MQFMT_STRING with the CCSID set to native Russian (Cyrillic 1251).
I thought we had already tested the MQFMT_NONE scenario, but I may still have been setting the CCSID to 1251 at the time.
Hope this is of use to others having trouble with code pages. Happy coding!
|
fjb_saper |
Posted: Tue Oct 17, 2006 2:20 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
We solved it differently, with UTF-8 data (CCSID 1208).
Apparently, when you copy the headers it treats the Properties CCSID as the default, so it may get changed.
Just make sure you set it explicitly (from memory):
SET OutputRoot.Properties.CodedCharSetId = InputRoot.Properties.CodedCharSetId;
The message's CCSID was then passed on with format MQSTR.
Enjoy  _________________ MQ & Broker admin |
|
jonew |
Posted: Thu Oct 19, 2006 4:20 am Post subject: |
|
|
Newbie
Joined: 29 May 2003 Posts: 6
|
Well, according to my colleagues in Russia, what we have is not Russian. The hex values at the destination are the same as at the source, but the result is still not correct. I am no longer sure the source is right!
I tried fjb_saper's method and it did the same thing: same hex at the destination as at the source. So either method does what I set out to do, but that now appears not to be what is required.
Back to the drawing board for me.
|
jefflowrey |
Posted: Thu Oct 19, 2006 4:24 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
If you're duplicating the source data exactly, and your output data is wrong, then either the source data is wrong, the receiving program is wrong, or you're still not properly identifying which pieces of data belong to which character sets. _________________ I am *not* the model of the modern major general. |
|
jonew |
Posted: Thu Oct 19, 2006 10:27 am Post subject: |
|
|
Newbie
Joined: 29 May 2003 Posts: 6
|
OK, the source data uses character set ISO8859-5, and it is not being converted to the native Windows code page 1251.
The Cyrillic code page entry for MQ suggests that CCSID 915 is the correct one for AIX. If I set MQMD.CCSID to 915 and Format to MQSTR, the data should be converted as required when it passes over the channel to the Windows system running 1251. Instead, most of the characters are converted to 0x1A.
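One way to check which character set the bytes really are is to decode them to Unicode code points and look at the result. The following is a minimal sketch, assuming the data is ISO8859-5; the function name `iso8859_5_to_unicode` is hypothetical, and the mapping follows the published ISO 8859-5 chart rather than anything MQ provides.

```c
#include <stdint.h>

/* Map one ISO 8859-5 byte to its Unicode code point.
 * Apart from four exceptions, the upper half (0xA1-0xFF) maps
 * linearly onto the Cyrillic block starting at U+0400. */
uint32_t iso8859_5_to_unicode(unsigned char b)
{
    if (b <= 0xA0) {
        return b;              /* ASCII, C1 controls, no-break space */
    }
    if (b == 0xAD) {
        return 0x00AD;         /* soft hyphen */
    }
    if (b == 0xF0) {
        return 0x2116;         /* numero sign */
    }
    if (b == 0xFD) {
        return 0x00A7;         /* section sign */
    }
    return 0x0400u + (uint32_t)(b - 0xA0);
}
```

Applied to the hex string from the first post, BA BE BC BF BB B8 BC B5 BD C2 decodes to U+041A U+041E U+041C U+041F U+041B U+0418 U+041C U+0415 U+041D U+0422, i.e. КОМПЛИМЕНТ, which is consistent with the source being ISO8859-5.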
|
fjb_saper |
Posted: Thu Oct 19, 2006 1:32 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
Is Windows 1251 the right code page for Cyrillic?
What does that translate to in terms of CCSID? _________________ MQ & Broker admin |
|
jonew |
Posted: Fri Oct 20, 2006 12:49 am Post subject: |
|
|
Newbie
Joined: 29 May 2003 Posts: 6
|
Sorry, 1251 is the CCSID for the Windows Cyrillic code page. On AIX it is 915, but the two refer to different character sets: 915 is the ISO8859-5 code page and 1251 is a Windows code page. Both place the uppercase Cyrillic letters before the lowercase ones, but at different numeric positions: the basic alphabet starts at 0xB0 in ISO8859-5 and at 0xC0 in 1251, so the same byte means a different letter in each.
I had thought that by setting the CCSID to 915 on AIX, the data would be converted to 1251 when it crossed the channel. However, it appears to be converted to code page 850 BEFORE it reaches the channel: stopping the channel and capturing the message on the Xmit queue showed 850 as the CCSID, and the characters had been replaced by 0x1A. 819/850 is the native code page for our broker server.
|
jonew |
Posted: Fri Oct 20, 2006 9:27 am Post subject: |
|
|
Newbie
Joined: 29 May 2003 Posts: 6
|
Decided to pass the data through unchanged by setting MQFMT_NONE on the MQMD.Format, and updated the MQ extraction program (in C) to do the conversion for me:
/*****************************************************************************
FUNCTION:    XECUIF_ConvertRussian
PARAMETERS:  pMessage        The message retrieved from the queue, to be
                             written to the file within the UIF.
             nMessageLength  The length of the message in bytes.
DESCRIPTION: Converts, in place, any Cyrillic (Russian) characters presented
             from SAP in the ISO8859-5 character set to the character set
             used by Windows, known as Cyrillic 1251.
             Bytes 0x00-0x7F are unchanged; bytes 0x80-0x9F all map to 0x98.
*****************************************************************************/
void XECUIF_ConvertRussian(char* pMessage, int nMessageLength)
{
    int i;
    /* Lookup table indexed by the ISO8859-5 byte, giving the 1251 byte. */
    static const unsigned char sConvTab[256] = {
        0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0A,0x0B,0x0C,0x0D,0x0E,0x0F,
        0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19,0x1A,0x1B,0x1C,0x1D,0x1E,0x1F,
        0x20,0x21,0x22,0x23,0x24,0x25,0x26,0x27,0x28,0x29,0x2A,0x2B,0x2C,0x2D,0x2E,0x2F,
        0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,0x38,0x39,0x3A,0x3B,0x3C,0x3D,0x3E,0x3F,
        0x40,0x41,0x42,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0x4A,0x4B,0x4C,0x4D,0x4E,0x4F,
        0x50,0x51,0x52,0x53,0x54,0x55,0x56,0x57,0x58,0x59,0x5A,0x5B,0x5C,0x5D,0x5E,0x5F,
        0x60,0x61,0x62,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0x6A,0x6B,0x6C,0x6D,0x6E,0x6F,
        0x70,0x71,0x72,0x73,0x74,0x75,0x76,0x77,0x78,0x79,0x7A,0x7B,0x7C,0x7D,0x7E,0x7F,
        0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,
        0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,0x98,
        0xA0,0xA8,0x80,0x81,0xAA,0xBD,0xB2,0xAF,0xA3,0x8A,0x8C,0x8E,0x8D,0xAD,0xA1,0x8F,
        /* 0xB0-0xEF: basic Russian alphabet, shifted up by 0x10 in 1251 */
        0xC0,0xC1,0xC2,0xC3,0xC4,0xC5,0xC6,0xC7,0xC8,0xC9,0xCA,0xCB,0xCC,0xCD,0xCE,0xCF,
        0xD0,0xD1,0xD2,0xD3,0xD4,0xD5,0xD6,0xD7,0xD8,0xD9,0xDA,0xDB,0xDC,0xDD,0xDE,0xDF,
        0xE0,0xE1,0xE2,0xE3,0xE4,0xE5,0xE6,0xE7,0xE8,0xE9,0xEA,0xEB,0xEC,0xED,0xEE,0xEF,
        0xF0,0xF1,0xF2,0xF3,0xF4,0xF5,0xF6,0xF7,0xF8,0xF9,0xFA,0xFB,0xFC,0xFD,0xFE,0xFF,
        /* 0xF3 maps to 0x83, the 1251 position of the letter at ISO8859-5 0xF3 */
        0xB9,0xB8,0x90,0x83,0xBA,0xBE,0xB3,0xBF,0xBC,0x9A,0x9C,0x9E,0x9D,0xA7,0xA2,0x9F};

    for (i = 0; i < nMessageLength; i++)
    {
        /* Cast to unsigned char so bytes >= 0x80 index the table correctly. */
        *pMessage = (char)sConvTab[(unsigned char)*pMessage];
        pMessage++;
    }
}
Seems to work fine. |
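For readers who want to check the mapping against the hex dump from the first post, here is a compact sketch of the same ISO8859-5 to 1251 conversion. It relies on the observation that the 64 basic Russian letters (0xB0-0xEF) sit exactly 0x10 higher in 1251, so only the two irregular rows need explicit tables. The names `iso8859_5_to_cp1251` and `convert_buffer` are hypothetical, the 0x98 placeholder for 0x80-0x9F simply follows the function above, and the 0xF3 entry uses 0x83, where 1251 places ѓ.

```c
#include <stddef.h>

/* ISO 8859-5 rows 0xA0-0xAF and 0xF0-0xFF have no regular offset to 1251,
 * so they are listed explicitly. */
static const unsigned char ROW_A0[16] = {
    0xA0,0xA8,0x80,0x81,0xAA,0xBD,0xB2,0xAF,
    0xA3,0x8A,0x8C,0x8E,0x8D,0xAD,0xA1,0x8F
};
static const unsigned char ROW_F0[16] = {
    0xB9,0xB8,0x90,0x83,0xBA,0xBE,0xB3,0xBF,
    0xBC,0x9A,0x9C,0x9E,0x9D,0xA7,0xA2,0x9F
};

/* Convert a single ISO 8859-5 byte to its Windows-1251 byte. */
unsigned char iso8859_5_to_cp1251(unsigned char b)
{
    if (b < 0x80) {
        return b;                          /* ASCII is identical */
    }
    if (b < 0xA0) {
        return 0x98;                       /* placeholder for C1 controls */
    }
    if (b < 0xB0) {
        return ROW_A0[b - 0xA0];           /* irregular row */
    }
    if (b < 0xF0) {
        return (unsigned char)(b + 0x10);  /* basic Russian alphabet */
    }
    return ROW_F0[b - 0xF0];               /* irregular row */
}

/* Convert a whole buffer in place. */
void convert_buffer(unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        buf[i] = iso8859_5_to_cp1251(buf[i]);
    }
}
```

With the 18 bytes from the first post (BA BE BC BF BB B8 BC B5 BD C2 20 B3 DE C0 CC DA D8 D9), `convert_buffer` yields CA CE CC CF CB C8 CC C5 CD D2 20 C3 EE D0 DC EA E8 E9. Computing the regular range instead of tabulating it keeps the irregular entries, which are the easiest to get wrong, short enough to review by eye.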
|