Author |
Message
|
sunny_30 |
Posted: Wed Feb 16, 2011 5:42 pm Post subject: Incorrect ccsid question |
|
|
 Master
Joined: 03 Oct 2005 Posts: 258
|
Hi
I have a Broker message flow (on AIX) receiving UTF-8 data from SAP (on Windows) with an incorrect CCSID of 437 in the MQMD.
The MQInput node has the convert option set to "off" (not selected).
The CCSID set by the sender app performing the MQPUT cannot be changed.
The Compute node before the MQOutput node has the following ESQL code and no CAST statements anywhere:
Code: |
SET OutputRoot.Properties.CodedCharSetId = 1208;
SET OutputRoot.MQMD.Format = MQFMT_STRING;
SET OutputRoot.MQMD.CodedCharSetId = 1208; |
I'm trying to avoid any conversion 'attempt' inside Broker from 437 to 1208; I just want to specify the correct CCSID in the MQMD for the data going out of the flow.
My worry is that, as the output and input CCSIDs differ in my case, Broker will unnecessarily (as it's asked to do so) attempt to convert data that is already UTF-8 (1208) from CCSID 437 to 1208.
Could a conversion attempt using the incorrect code page corrupt the data bytes? How can I avoid this?
Please help |
|
|
 |
fjb_saper |
Posted: Wed Feb 16, 2011 9:29 pm Post subject: Re: Incorrect ccsid question |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
I would seriously challenge the statement that the sender-app performing MQPut cannot be changed...
Pending that you could always accept the message as a BLOB and parse it manually with CCSID 437.
Have fun  _________________ MQ & Broker admin |
|
|
 |
sunny_30 |
Posted: Wed Feb 16, 2011 9:47 pm Post subject: |
|
|
 Master
Joined: 03 Oct 2005 Posts: 258
|
fjb_saper wrote: |
I would seriously challenge the statement that the sender-app performing MQPut cannot be changed... |
Of course it should be possible, but my case is a little different: it's an external vendor program. The sender app is a file-to-queue connector which sets the queue manager's CCSID on all MQ messages. In this case the file data happens to be generated by SAP in UTF-8. What I mean is that the file data can vary, but messages put to the queue always carry the static code page 437, because the connector runs on a Windows queue manager whose CCSID is 437.
fjb_saper wrote: |
Pending that you could always accept the message as a BLOB and parse it manually with CCSID 437. |
Do you mean manually parsing using 1208?
InputRoot.MQMD.CodedCharSetId is 437, but we know the data we are dealing with is already 1208.
I'm thinking of using two Compute nodes: parse manually using 1208 in the first, and do the rest of the processing in the second, so that the flow doesn't try to convert the message again at the MQOutput node, because for the second Compute node the input and output CCSIDs will be the same (1208).
Also, do data conversions apply to BLOB data as well? I thought they were only relevant to XML or string types. Or do you mean casting the BLOB to CHARACTER? |
|
|
 |
smdavies99 |
Posted: Wed Feb 16, 2011 10:07 pm Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
How about doing a simple change to the windows QMGR?
Change it so that it uses 1208 as its default CCSID.
Something like
Code: |
$> runmqsc FRED
ALTER QMGR CCSID(1208)
end
|
Restart the QMGR and you are all done.
Matching QMGR CCSID's is not rocket science... _________________ WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995
Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions. |
|
|
 |
mqjeff |
Posted: Thu Feb 17, 2011 3:00 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
The issue you will run into is that if the message contains data that is in the wrong CCSID AND marked MQFMT_STRING, you can't guarantee that it won't get "converted" before it ever gets to your Broker.
For what you are trying to do - read the message data without any conversion - then make sure you don't set the MQInput node convert option and read the message in the BLOB Domain.
Then if you want to perform a manual "conversion" of the data, CAST it to CHARACTER and specify the correct CCSID on the CAST. CAST(... AS BLOB CCSID xyz) uses the CCSID you specify as the DESTINATION CCSID, not the SOURCE CCSID. |
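The source/destination distinction can be seen at the byte level with a minimal Python sketch. This is not Broker code: `bytes.decode` stands in for a cast from BLOB to CHARACTER (the code page describes the bytes you already have), `str.encode` stands in for a cast from CHARACTER to BLOB (the code page describes the bytes you want to produce), and the payload "Grüße" is invented for illustration.

```python
# Hypothetical payload: UTF-8 bytes, as they would sit untouched in the BLOB domain.
raw = "Grüße".encode("utf-8")

# Bytes -> characters: the code page names the SOURCE encoding of the bytes.
text_ok = raw.decode("utf-8")      # correct: the bytes really are UTF-8
text_wrong = raw.decode("cp437")   # wrong: trusts the 437 label in the header

# Characters -> bytes: the code page names the DESTINATION encoding.
out_bytes = text_ok.encode("utf-8")

print(text_ok)
print(text_wrong)          # each UTF-8 byte is shown as a separate cp437 character
print(out_bytes == raw)    # decoding and re-encoding with the right code page is lossless
```

Decoding with the code page the data is really in (1208 here) and ignoring the label is what keeps the characters intact.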
|
|
 |
sunny_30 |
Posted: Fri Feb 18, 2011 11:09 am Post subject: |
|
|
 Master
Joined: 03 Oct 2005 Posts: 258
|
Thank you for your responses
smdavies99 wrote: |
How about doing a simple change to the windows QMGR?
Change it so that it uses 1208 as its default CCSID. |
Not all flat files processed by the sender app are UTF-8, so we would like to keep the queue manager's CCSID at the Windows default, 437.
mqjeff wrote: |
CAST(... AS BLOB CCSID xyz) uses the CCSID you specify as the DESTINATION CCSID, not the SOURCE CCSID. |
Thank you for clarifying! I was a bit confused about how the flow handles data conversion.
The MQInput node is already set to BLOB with convert off.
The MQOutput data is also BLOB, but Broker correctly replaces the CCSID with 1208 and copies the format from the sender as-is (MQFMT_STRING) into the output MQMD.
There are no explicit CAST statements from BLOB to CHARACTER in my flow's ESQL.
Can you please confirm the statements below for my understanding:
1) Code page (CCSID) conversions never apply to BLOB data inside a flow; they apply only when casting from BLOB (hex) to CHAR or vice versa.
In my scenario:
2) Even though the sender sets the format to STRING, because the MQInput node is set to BLOB with the convert option not selected, we can guarantee 100% that the data is never converted. Please note the sender app writes directly to the MQInput queue using a local bindings connection; the data doesn't travel through any hop to reach the flow's source queue (e.g. via MCA channels set to convert!).
3) the following esql-
Code: |
SET OutputRoot.Properties.CodedCharSetId = 1208;
SET OutputRoot.MQMD.Format = MQFMT_STRING;
SET OutputRoot.MQMD.CodedCharSetId = 1208; |
only sets the MQMD to the correct CCSID, and no conversion can happen, for the reason given in 1 above.
4) In general: using the same ESQL, if the data is output as CHAR, shouldn't it be converted automatically to the CCSID specified before the MQOutput node, without requiring any manual or explicit CAST?
5) Even if the data is output as CHAR by a manual CAST from BLOB with the CCSID set to 1208, it does no harm because the data is already 1208.
Thanks again
Last edited by sunny_30 on Fri Feb 18, 2011 11:40 am; edited 3 times in total |
|
|
 |
Vitor |
Posted: Fri Feb 18, 2011 11:35 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
sunny_30 wrote: |
1) Code page CCSID conversions never apply to BLOB data inside a flow but only for conversions from BLOB (hex) to CHAR or viceversa |
Yes
sunny_30 wrote: |
2) even though sender sets the format to STRING, but MQinput is set to BLOB & convert option not selected, we can 100% guarantee the data can never convert. Please note sender app directly writes to MQinput Q using local binding connection, data doesnt travel thru any hop to reach to the flow's source Q. (eg- via MCA channels set to convert !) |
Provided all that is true. If in 6 months the sending application is moved to a different server for political or performance reasons, you're potentially toast.
sunny_30 wrote: |
3) the following esql-
Code: |
SET OutputRoot.Properties.CodedCharSetId = 1208;
SET OutputRoot.MQMD.Format = MQFMT_STRING;
SET OutputRoot.MQMD.CodedCharSetId = 1208; |
only sets the MQMD to correct CCSID but conversion can never happen due to reason 1 above |
Only because it's a BLOB. If OutputRoot contains CHARACTER data it gets converted to 1208 (which is the internal CCSID of the broker). So the question becomes do you want the output in 1208 or don't you?
sunny_30 wrote: |
4) In general- Using the same esql, if the data is outputted as CHAR, doesnt it result in automatic conversion to CCSID specified before MQouput w/o needing any manual CAST ? |
Yes. This is part of the parameters WMB uses to construct the output wire format. Hence you use the CCSID you want the output to be in
sunny_30 wrote: |
5) Even if data is output as CHAR by manual CAST from BLOB & ccsid set to 1208, it doesnt harm coz the data is already 1208.. |
I don't understand the "harm" in outputting data in a different CCSID. Why are you so concerned about the output being (or not being) in WMB's internal code page?
Perhaps if you explained the problem rather than banging on about how to avoid the symptom a bit more? _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
sunny_30 |
Posted: Fri Feb 18, 2011 12:04 pm Post subject: |
|
|
 Master
Joined: 03 Oct 2005 Posts: 258
|
Sorry, I was still editing my post when you were posting.
Thanks for clarifying.
I wasn't sure whether WMB has its internal code page set to UTF-8 (1208).
Usually in a flow the CCSID gets copied over from the input node headers, or ESQL code overrides it. Maybe WMB uses the internal code page if the MQMD headers are generated by the flow itself.
Anyway, what I meant by harm is that if the data's real code page doesn't match the MQMD CCSID, and the flow or the receiving app tries to convert to a non-matching CCSID, special language characters may be lost. That's all.
I'm good now. Thanks all |
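The kind of damage described above can be reproduced outside Broker. A small Python sketch, assuming a made-up one-character payload ("é") that is really UTF-8 but labelled 437: a converter that trusts the label reads each byte as a separate cp437 character and writes different, longer bytes.

```python
# Hypothetical payload: really UTF-8, but the header claims code page 437.
raw = "é".encode("utf-8")

# A "437 to 1208" conversion that believes the label:
# read the bytes as cp437 characters, then write those characters as UTF-8.
converted = raw.decode("cp437").encode("utf-8")

print(raw.hex())        # the original two bytes
print(converted.hex())  # no longer the same bytes: the text is now double-encoded

# Undoing it means applying the same wrong mapping in reverse.
restored = converted.decode("utf-8").encode("cp437")
print(restored == raw)
```

The damage is reversible in this sketch only because Python's cp437 codec maps every byte value to a character; that should not be assumed for other code pages or other converters.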
|
|
 |
Vitor |
Posted: Fri Feb 18, 2011 12:18 pm Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
sunny_30 wrote: |
I was'nt sure WMB has internal codepage set to UTF-8 ->1208 |
My bad. WMB's internal code page is UTF-16, which IIRC is 1200 not 1208. Fairly similar characters though!
sunny_30 wrote: |
May be WMB uses the internal code page if MQMD headers are generated by the flow itself.. |
No. The default for a WMQ put is the CCSID of the queue manager; the default for a file is the code page of the server.
sunny_30 wrote: |
any ways- what I meant by harm is that the if data's real code-page doesnt match to the mqmd.ccsid, and if flow or the receiving app try to convert to a non-matching ccsid the special language characters may be lost..thts all |
A couple of points:
The CCSID of the MQMD should always match the data. It's a basic principle / best practice of WMQ that "receiver makes good". So any message you receive into WMB (which may contain special language characters) should be in a code page that can contain all the characters in the message and the CCSID reflects that. The reason WMB uses UTF-16 internally is because that can contain without further conversion every possible character. If WMB receives a message where the actual code page doesn't match the CCSID it's a fault with the sending application & there have been many discussions on this subject in here.
If you try and output a message from WMB that contains special language characters in a code page that doesn't support them the output node will throw an exception. So when the app that receives the message from WMB reads or converts it, it can be confident the CCSID is correct. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
smdavies99 |
Posted: Fri Feb 18, 2011 12:58 pm Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
Let me see if I have got this right?
1) The files are written using UTF-8.
2) The Queue Manager where the app that reads them runs has a default CCSID of 437
3) The majority of people don't know how to override the default CCSID when sending an MQ message. (Remember that the default CCSID is to use the same CCSID as the qmgr)
4) For some strange reason you don't want to change the QMGR's default CCSID from 437 to 1208.
Sigh, Sigh, Sigh
Oh well. _________________ WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995
Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions. |
|
|
 |
sunny_30 |
Posted: Fri Feb 18, 2011 1:13 pm Post subject: |
|
|
 Master
Joined: 03 Oct 2005 Posts: 258
|
Does WMB use UTF-16, ccsid-1200 if the flow itself is generating the message-data ?
Also you have mentioned that incorrect code-page conversion results in a flow exception.. but wont improper conversions some times result in replacement characters such as - "'�' (<U+FFFD>) " & no failure? |
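Both outcomes exist in general; which one you get depends on the converter's error policy. A Python sketch of the two policies follows. It only illustrates the general behaviour and says nothing about which policy WMB applies; the sample strings are invented.

```python
text = "€"  # the euro sign has no mapping in code page 437

# Strict policy: an unmappable character is an error.
try:
    text.encode("cp437")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Lenient policy: the character is silently replaced by a substitute.
substituted = text.encode("cp437", errors="replace")

# The decoding direction behaves the same way: these Latin-1 bytes are not valid UTF-8.
bad_utf8 = "café".encode("latin-1")
decoded = bad_utf8.decode("utf-8", errors="replace")

print(strict_failed)
print(substituted)
print(decoded)  # ends with the U+FFFD replacement character
```

With a lenient policy nothing fails at conversion time, which is why a wrong CCSID can go unnoticed until someone reads the data.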
|
|
 |
Vitor |
Posted: Fri Feb 18, 2011 1:18 pm Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
sunny_30 wrote: |
Does WMB use UTF-16, ccsid-1200 if the flow itself is generating the message-data ? |
No. As I said before, WMB uses the default CCSID correct for the output message.
sunny_30 wrote: |
Also you have mentioned that incorrect code-page conversion results in a flow exception.. but wont improper conversions some times result in replacement characters such as - "'�' (<U+FFFD>) " & no failure? |
I'll accept that if the target code page has a mapping for that character position that isn't a printable character then yes it will map to that & not throw an error.
The bottom line is as my associate says. If your input data is UTF-8 and is received in WMB with a CCSID of 437 (and I'm not clear if WMB gets your data as a WMQ message, http put or a file) then this is a problem with the input. Payload code page should match the MQMD CCSID. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
|
 |
sunny_30 |
Posted: Fri Feb 18, 2011 2:16 pm Post subject: |
|
|
 Master
Joined: 03 Oct 2005 Posts: 258
|
smdavies99 wrote: |
1) The files are written using UTF-8.
2) The Queue Manager where the app that reads them had a default CCSID of 437
3) The majority of people don't know how to override the deault CCSID when sending an MQ message. (Remember that the default CCSID is to use the same CCSID as the qmgr)
4) For some strange reason you don't want to change the QMGR's default CCSID from 437 to 1208. |
I have mentioned this at start in my question-
Quote: |
The CCSID on sender-app performing MQPut cannot be changed |
There is a reason why I said that & I also explained it to you later thats its one place we dont intend to make the change. Doesnt my question already imply that I knew that a change in ccsid at the sender Qmgr would have solved my specific problem I posed at the first place?
Anyways- your answer to my question only fits if my question is suited to accept your answer. How abt you try some crossword puzzles?
my answer doesnt fit
may be u grab a triple espresso & try again
Here is the explanation of why it's not feasible for us.
I really didn't want to get into this unnecessary level of detail, as it's irrelevant to my question, but you forced me to. My questions have already been clarified, and many thanks to all who helped.
We have F2Q 'flat-file' connector senders and Q2F connector receivers deployed all over our MQ infrastructure: Linux, Windows, AIX etc.
Files move from sender connector to receiver connector, folder to folder.
Flat files move over MQ, routed via a central hub Broker (which does the routing using a back-end database). Connectors use the RFH2 header to communicate with the central Broker. The transfers are supposed to be binary, so we requested a change in the vendor connector code to always set the sender message format to MQFMT_NONE instead of MQFMT_STRING in the RFH2 Format field. The data could be anything. Connectors have no control over, and are not interested in, the code page of the data; they always set the default CCSID of the queue manager where they are running. All channels have conversion disabled. Receivers had MQGMO_CONVERT enabled on MQGET; that has also been turned off. The central Message Broker runs on AIX. All data in Broker is processed as BLOB. So we are good for our file transfer needs.
We also have some F2Q messages that land on queues to be read by other MQ apps, as grouped or single messages stripped of the RFH2. In these situations Broker keeps the CCSID of the source file data in the back-end database and sets the actual CCSID on the outgoing messages inside the flow. We have situations where the Broker flow writes data to the queue as STRING or BYTES. Either Broker converts the messages flowing out, or the receiving apps convert on their end. We feel that in these situations the CCSID has to be stated correctly, whereas for the connectors we run for our F2F MQ transfers we don't think the CCSID is important, as they are always binary transfers.
Hope this explains |
|
|
 |
fjb_saper |
Posted: Fri Feb 18, 2011 2:35 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
sunny_30 wrote: |
We feel in these situations the CCSID has to be mentioned correctly. Where as the connectors we run for our F2F MQ transfer needs, we dont think CCSID is important as they are always Binary transfers.
|
Allow me to respectfully disagree. CCSID and endianness become a matter of worry especially when you are switching platforms...
Imagine a file being transferred binary between MSDOS and UNIX:
CRLF does not auto-magically morph into LF
Numbers saved in binary format big endian do not magically morph to little endian or the other way round...
Now just think about text when moving from EBCDIC to ANSI...
Your file will stay whole as it was originally created. But will it be usable??  _________________ MQ & Broker admin |
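The endianness and line-ending points can be shown in a few lines of Python. The value 258 and the two-line file content are arbitrary examples.

```python
import struct

# A 32-bit integer written big-endian, then read by a little-endian reader.
value = 258
big_endian = struct.pack(">I", value)
misread = struct.unpack("<I", big_endian)[0]
print(big_endian.hex(), misread)  # same four bytes, a very different number

# A DOS-style text file moved as binary: every byte, including CR, arrives as sent.
dos_text = b"line1\r\nline2\r\n"
transferred = bytes(dos_text)         # a binary transfer changes nothing
print(transferred.split(b"\n"))       # a reader splitting on LF still sees the CRs

# Converting line endings is a separate, explicit step on the receiving side.
unix_text = transferred.replace(b"\r\n", b"\n")
print(unix_text)
```

A binary transfer guarantees only that the bytes arrive unchanged; making them meaningful on the target platform is still up to the receiving application.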
|
|
 |
sunny_30 |
Posted: Fri Feb 18, 2011 2:50 pm Post subject: |
|
|
 Master
Joined: 03 Oct 2005 Posts: 258
|
saper, correct!
Our file transfer needs over MQ strictly need to be no different from FTP binary behavior. Destination apps write their own bit of code to do the rest of the processing after the file is picked up from the folder.
If the data needs to be inspected or transformed, that's where the CCSID stored in the database against the source file name and data comes in. For example, all data from SAP is UTF-8; we had to process it through the deprecated IDoc parser in MB v6, which needed all the trimming of extra CR characters and padded spaces, mapping of segments to XML fields, etc. |
|
|
 |
|