MQSeries.net Forum Index » IBM MQ Installation/Configuration Support » Characters modified between two Queue Managers

Characters modified between two Queue Managers

atedone
Posted: Wed Nov 08, 2017 2:19 am    Post subject: Characters modified between two Queue Managers

Newbie

Joined: 31 Oct 2017
Posts: 5

Hi there!

I tried to search for CCSID/Encoding posts (as I suppose the issue is there), but I didn't find any useful hint.

Scenario: we are putting a message that starts with the "£" pound symbol on a remote queue definition on a Unix machine. The message is then routed to another queue manager (I don't know its OS, but I can find out), which receives a different character, an "É", instead.

What should we do to avoid this happening?

We know that once in production, the messages will be transferred in binary mode (like many other flows already in production) with no issues. The problem now is that we are putting these messages manually, as we are in early test mode.

Thanks in advance for any hint from you gurus out there

Cheers
Testo

zpat
Posted: Wed Nov 08, 2017 3:04 am

Jedi Council

Joined: 19 May 2001
Posts: 5849
Location: UK

MQ does not convert characters unless you ask it to with CONVERT(YES) on a sender channel or MQGMO_CONVERT on an MQGET.

However, what you perceive a character to be depends on how you view it.

What is the hex representation?

What is the CCSID of the QMs?

What is the CCSID of the message?

How are you viewing this message?

If you unload the message to a file with MO71 unload (or dmpmqmsg or qload) you can see the original hex.
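
A minimal Python sketch of the kind of check these questions lead to. It prints the hex bytes of one character under a few encodings, then shows what a viewer would display if it read those bytes with a different code page. The codec names are Python's own and were picked for illustration; they are not taken from the poster's environment, and treating a Python codec as equivalent to an MQ CCSID is an assumption.

```python
def hex_bytes(text: str, codec: str) -> str:
    """Return the bytes of `text` in `codec` as a lowercase hex string."""
    return text.encode(codec).hex()


def viewed_as(text: str, put_codec: str, view_codec: str) -> str:
    """Encode with one code page, then decode the same bytes with another."""
    return text.encode(put_codec).decode(view_codec, errors="replace")


if __name__ == "__main__":
    # The same character has different byte values in different code pages.
    for codec in ("latin-1", "cp1252", "utf-8", "cp037", "cp500"):
        print(f"{codec:8} {hex_bytes('£', codec)}")
    # The same byte shows up as a different character in a different code page.
    print(viewed_as("£", "latin-1", "cp437"))
```

If the hex is identical on both queue managers, nothing was modified in transit and the difference lies in how the message is being viewed.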
_________________
Well, I don't think there is any question about it. It can only be attributable to human error. This sort of thing has cropped up before, and it has always been due to human error.

atedone
Posted: Wed Nov 08, 2017 3:51 am    Post subject: I will dig into based on your questions...

Newbie

Joined: 31 Oct 2017
Posts: 5

Thanks zpat for your reply and your questions, which are triggering some focused investigation.

Cheers
T

gbaddeley
Posted: Wed Nov 08, 2017 3:18 pm

Jedi

Joined: 25 Mar 2003
Posts: 2491
Location: Melbourne, Australia

A common technique is to leave a message sitting on a local queue, and then use amqsbcg to browse the message. This will show the CCSID and Format in the MQMD, and the hex code representation of the message data. Check that each character's hex code (in that CCSID) corresponds to the character you are expecting. If it's not, it's an issue with the app that put the message.
_________________
Glenn

tczielke
Posted: Thu Nov 09, 2017 6:20 am

Guardian

Joined: 08 Jul 2010
Posts: 939
Location: Illinois, USA

Data conversion issues can be tricky to debug. It helps to run traces, track the CCSID/Encoding being used, and get to the byte level of the message data in the trace. However, I understand that this is an advanced thing to do. The MQ session linked below tries to provide some guidance on how to do that.

http://www.mqtechconference.com/sessions_v2016/MQTC_v2016_DataConversion.pdf
_________________
Working with MQ since 2010.

gbaddeley
Posted: Thu Nov 09, 2017 3:49 pm

Jedi

Joined: 25 Mar 2003
Posts: 2491
Location: Melbourne, Australia

Yes, trace can help. Using high-level tools to view message data as characters can be misleading.
1) The tool may or may not be converting the message.
2) The tool / terminal emulator / window might be displaying character glyphs in its own character set, which is different to the MQMD CCSID or the converted data.
Hex is best!
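
As a rough illustration of why hex is the safer view, here is a small Python sketch of a hex dump that prints the raw bytes next to the text a viewer would show for a given code page. The function name and layout are made up for this example and do not reproduce amqsbcg output. It assumes a single-byte code page, since it decodes one row at a time.

```python
def hex_dump(data: bytes, codec: str, width: int = 16) -> str:
    """Format bytes as offset, hex, and the text a viewer using `codec` would show."""
    pad = width * 3 - 1  # width of the hex column for a full row
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        text = chunk.decode(codec, errors="replace")
        # Replace non-printable characters so the layout stays readable.
        shown = "".join(c if c.isprintable() else "." for c in text)
        lines.append(f"{offset:08X}  {hex_part:<{pad}}  |{shown}|")
    return "\n".join(lines)


if __name__ == "__main__":
    payload = b"\xa3100"
    print(hex_dump(payload, "latin-1"))  # hex column is the same in both dumps
    print(hex_dump(payload, "cp437"))    # only the character column changes
```

The hex column does not depend on the viewer's character set; the character column does.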
_________________
Glenn

atedone
Posted: Fri Nov 10, 2017 2:19 am    Post subject: Thanks a lot to all of you!

Newbie

Joined: 31 Oct 2017
Posts: 5

I recently started browsing this forum again (the last time was in 2005), and it's a great pleasure to see that it is still full of kind, collaborative and competent human beings.

Thanks a lot for all your hints, they are proving to be useful!

Have a great day
T

rekarm01
Posted: Wed Nov 15, 2017 8:48 pm

Grand Master

Joined: 25 Jun 2008
Posts: 1415

tczielke wrote:
Data conversion issues can be tricky to debug. It helps to run traces, track the CCSID/Encoding being used, and get to the byte level of the message data in the trace...

I had not thought to use traces for that (possibly because the applications in question were often running on remote servers I didn't have access to). I have mostly relied on message browsing tools, like "amqsbcg0", and other Unix utilities like "od" or "iconv".

tczielke wrote:
http://www.mqtechconference.com/sessions_v2016/MQTC_v2016_DataConversion.pdf

IBM's "Data Conversion Under WebSphere MQ" document has a lot of useful information, but a few parts can be misleading, or inaccurate. The IBM Character Data Representation Architecture (CDRA) is another useful resource.

tczielke wrote:
The Coded Character Set ID (CCSID) or Code Page is a table of assigning glyphs to a number

No glyphs, just abstract characters, identified by label, either an IBM-specific "Graphic Character Global Id" (GCGID) with a short description, or a Unicode-based "Graphic Character UCS Id" (GCUID), or both. Glyphs, graphemes, character shapes, physical representations, or implied meanings of graphic characters are either non-normative, or outside the scope of the IBM CDRA.

And a Coded Character Set ID (CCSID) is more than just a code page; it describes (among other things) one or more character set (CS) / code page (CP) pairs, to map between characters and non-negative integers (code points), and an encoding scheme (ES), to map between code points and physical bytes.

For example, Unicode has multiple transformation formats, so IBM provides multiple CCSIDs, with the same character set / code page pairs, but different encoding schemes. And there are two versions of the windows-1252 character encoding (one with an added Euro character), so IBM provides two different CCSIDs, with the same code page, but different character sets:

Code:
CCSID=1200 (UTF-16BE)                       CCSID=1208 (UTF-8)
  - ES=7200 (UTF-16BE CES)                    - ES=7807 (UTF-8 CES)
  - CS=65535 / CP=1400 (Plane  0: BMP)        - CS=65535 / CP=1400 (Plane  0: BMP)
  - CS=65535 / CP=1401 (Plane  1: SMP)        - CS=65535 / CP=1401 (Plane  1: SMP)
  - CS=65535 / CP=1402 (Plane  2: SIP)        - CS=65535 / CP=1402 (Plane  2: SIP)
  - CS=65535 / CP=1414 (Plane 14: SSP)        - CS=65535 / CP=1414 (Plane 14: SSP)
  - ...                                       - ...

CCSID=1252 (MS Windows, Latin-1)            CCSID=5348 (MS Windows, Latin-1, Version 2)
  - ES=4105                                   - ES=4105
  - CS=1402 / CP=1252 (Windows, Latin-1)      - CS=1412 / CP=1252 (Windows, Latin-1 + euro)

(Side note: IBM MQ handles UTF-16 endianness differently from the example above.)
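
The split between code points and encoding schemes described above can be sketched in Python. The pairing of Python's "utf-16-be" and "utf-8" codecs with CCSIDs 1200 and 1208 is an assumption made for illustration, and Python ships only one cp1252 codec (with the euro), so the 1252 versus 5348 distinction cannot be reproduced here.

```python
def code_points(text: str) -> list:
    """Return the Unicode code point of each character, e.g. 'U+00A3'."""
    return [f"U+{ord(ch):04X}" for ch in text]


def encoded(text: str, codec: str) -> str:
    """Return the physical bytes of `text` in `codec` as a hex string."""
    return text.encode(codec).hex()


if __name__ == "__main__":
    for sample in ("£", "\U0001F600"):
        # One code point, two encoding schemes, two different byte sequences.
        print(code_points(sample), encoded(sample, "utf-16-be"), encoded(sample, "utf-8"))
    # The euro sign occupies byte 0x80 in Python's cp1252 codec.
    print(encoded("€", "cp1252"))
```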

tczielke wrote:
Java PUT of String Message: For using IBM MQ Classes for Java, a Java String is encoded in UTF-16. Since the String has an [inherent] CCSID, ...

Java Strings do not have an inherent CCSID. CCSIDs describe physical bytes, but they don't describe the hidden representation of abstract characters.

tczielke wrote:
Java GET of String Message: //Unconverted GET and then Java converts from EBCDIC to UTF-8 and then from UTF-8 to UTF-16

The given example converts directly from EBCDIC to UTF-16; it does not convert to or from UTF-8.

tczielke
Posted: Thu Nov 16, 2017 6:04 am

Guardian

Joined: 08 Jul 2010
Posts: 939
Location: Illinois, USA

Thank you for the feedback! I will review this and adjust the presentation where appropriate, the next time I give it. For the "Java GET of String Message" issue that you pointed out, I coincidentally caught that earlier this week and it was corrected yesterday on the MQTC website.

Out of curiosity, did you help write the "Data Conversion Under WebSphere MQ" document?
_________________
Working with MQ since 2010.

gbaddeley
Posted: Thu Nov 16, 2017 2:55 pm

Jedi

Joined: 25 Mar 2003
Posts: 2491
Location: Melbourne, Australia

Another common issue is that the app constructs message data using a particular CCSID (usually the compiler / runtime native CCSID), but the queued message has a different effective value for CCSID in its MQMD.

I encountered this issue regularly when I used to support z/OS MQ. The app was using CCSID 37 internally, but messages were queued as CCSID 500 (the qmgr's default CCSID). There are a number of hex codes that have different character representations in these EBCDIC code sets.
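
To see which hex codes differ, here is a short Python sketch that compares two single-byte code pages byte by byte. It assumes Python's "cp037" and "cp500" codecs are acceptable stand-ins for CCSID 37 and CCSID 500.

```python
def differing_bytes(codec_a: str, codec_b: str) -> dict:
    """Map each byte value to its (codec_a, codec_b) characters where they differ."""
    diffs = {}
    for value in range(256):
        raw = bytes([value])
        char_a = raw.decode(codec_a)
        char_b = raw.decode(codec_b)
        if char_a != char_b:
            diffs[value] = (char_a, char_b)
    return diffs


if __name__ == "__main__":
    for value, (in_037, in_500) in sorted(differing_bytes("cp037", "cp500").items()):
        print(f"0x{value:02X}  cp037={in_037!r}  cp500={in_500!r}")
```

Letters and digits are unaffected, which is why this kind of mismatch tends to go unnoticed until a message contains punctuation.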
_________________
Glenn

PeterPotkay
Posted: Thu Nov 16, 2017 5:13 pm

Poobah

Joined: 15 May 2001
Posts: 7717

gbaddeley wrote:

I encountered this issue regularly when I used to support z/OS MQ. The app was using CCSID 37 internally, but messages were queued as CCSID 500 (the qmgrs default CCSID). There are a number of hex codes that have different character representations in these EBCDIC code sets.


Like ! and |

And Kia decided to name one of the trim levels for their Kia Soul cars "!".

Why did you send KiaSoul|? We didn't, we sent KiaSoul!. No you didn't. Yes we did. No you didn't. Yes we did. Normally I advise letting the CCSID default on the MQPUT, but when dealing with mainframe apps running in an environment where there might be a mix of 037 and 500, I tell 'em to learn what code page their app runs as, and specify that in the MQMD CCSID when putting the message.
_________________
Peter Potkay
Keep Calm and MQ On

rekarm01
Posted: Tue Nov 21, 2017 6:42 pm

Grand Master

Joined: 25 Jun 2008
Posts: 1415

tczielke wrote:
Out of curiosity, did you help write the "Data Conversion Under WebSphere MQ" document?

Sorry, no. If I did help, I would have recommended rewriting the misleading or inaccurate bits.

gbaddeley wrote:
Another common issue is that the app constructs message data using a particular CCSID (usually the compiler / runtime native CCSID), but the queued message has a different effective value for CCSID in its MQMD.

For example, how many apps set the MQMD.CodedCharSetId to MQCCSI_Q_MGR for an MQPUT, rather than whichever CCSID represents the string data in the message, because that's what the "Data Conversion" doc instructed them to do?

fjb_saper
Posted: Wed Nov 22, 2017 5:57 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20695
Location: LI,NY

rekarm01 wrote:
For example, how many apps set the MQMD.CodedCharSetId to MQCCSI_Q_MGR for an MQPUT, rather than whichever CCSID represents the string data in the message, because that's what the "Data Conversion" doc instructed them to do?


That's because there are a number of assumptions behind that recommendation, none of which is guaranteed to be true:
  • The queue manager runs with a CCSID representing the default CCSID of the platform.
  • The compiler uses the default CCSID of the platform.
  • The program creates the message in the default CCSID of the platform.
  • The program writes the message in the default CCSID of the platform.

If any of those is false, you are probably better off explicitly setting the CCSID of the message instead of using the qmgr default.
Have fun
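
The advice above can be illustrated with a toy model in Python. This is not the MQ API: `Message`, `build_message`, `read_message` and the `CCSID_TO_CODEC` table are hypothetical names invented for this sketch, and the CCSID-to-codec mapping is an assumption. It only shows what happens when the label on a message does not match the code page the bytes were written in.

```python
from dataclasses import dataclass

# Assumed mapping from CCSID to Python codec names, for illustration only.
CCSID_TO_CODEC = {37: "cp037", 500: "cp500", 819: "latin-1", 1208: "utf-8"}


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for an MQ message: a CCSID label plus payload bytes."""
    ccsid: int
    payload: bytes


def build_message(text: str, app_ccsid: int, declared_ccsid: int) -> Message:
    """Encode text in the application's code page and label it with declared_ccsid."""
    return Message(declared_ccsid, text.encode(CCSID_TO_CODEC[app_ccsid]))


def read_message(msg: Message) -> str:
    """Decode the payload the way a receiver that trusts the label would."""
    return msg.payload.decode(CCSID_TO_CODEC[msg.ccsid])


if __name__ == "__main__":
    # Label matches the app's code page: the text survives.
    print(read_message(build_message("KiaSoul!", 500, 500)))
    # App writes in 500 but the message is labelled 37: the "!" changes.
    print(read_message(build_message("KiaSoul!", 500, 37)))
```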
_________________
MQ & Broker admin

tczielke
Posted: Mon Nov 27, 2017 3:51 pm

Guardian

Joined: 08 Jul 2010
Posts: 939
Location: Illinois, USA

rekarm01 wrote:
IBM's "Data Conversion Under WebSphere MQ" document has a lot of useful information, but a few parts can be misleading, or inaccurate. The IBM Character Data Representation Architecture (CDRA) is another useful resource.

A lot of very helpful information here, especially pointing me to the CDRA. That was the piece I was missing to better understand IBM CCSIDs. I am still digesting all of this, but I would say that the MQ CCSID (found in the message descriptor) is not the same thing as the IBM CCSID that is pointed to in those links. MQ seems to have a much looser interpretation of the IBM CCSID (e.g. MQ CCSID 1200, which collapses several IBM CCSIDs into it). You almost need two different terms here, IBM CCSID and MQ CCSID. It makes me question how well you can go to the IBM CCSID definition and then expect the MQ CCSID equivalent to completely follow it. You can't for at least 1200.
_________________
Working with MQ since 2010.

fjb_saper
Posted: Mon Nov 27, 2017 7:49 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20695
Location: LI,NY

tczielke wrote:
MQ seems to have a much looser interpretation of the IBM CCSID (e.g. MQ CCSID 1200, which collapses several IBM CCSIDs into it). You almost need two different terms here, IBM CCSID and MQ CCSID. It makes me question how well you can go to the IBM CCSID definition and then expect the MQ CCSID equivalent to completely follow it. You can't for at least 1200.

When talking about CCSID 1200 here, are you talking loosely (including CCSID 1201 and 1202) or strictly (CCSID 1200 only)?
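
For readers wondering why the distinction matters, here is a small Python sketch of UTF-16 byte order. It assumes that CCSID 1201 and CCSID 1202 correspond to big-endian and little-endian UTF-16 respectively, which is my reading of IBM's CCSID registry and is not stated in this thread. How MQ itself resolves byte order for CCSID 1200 is not modelled here.

```python
import codecs


def utf16_variants(text: str) -> dict:
    """Return the hex bytes of `text` in big-endian and little-endian UTF-16."""
    return {
        "utf-16-be": text.encode("utf-16-be").hex(),
        "utf-16-le": text.encode("utf-16-le").hex(),
    }


def has_bom(data: bytes) -> bool:
    """Report whether the data starts with a UTF-16 byte order mark."""
    return data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE))


if __name__ == "__main__":
    print(utf16_variants("£"))
    # Big-endian bytes read as little-endian yield an unrelated character.
    print(repr(b"\x00\xa3".decode("utf-16-le")))
    # Python's generic "utf-16" codec prepends a BOM; the explicit variants do not.
    print(has_bom("£".encode("utf-16")), has_bom("£".encode("utf-16-be")))
```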
_________________
MQ & Broker admin