ASG
IBM
Zystems
Cressida
Icon
Netflexity
 
  MQSeries.net
Search  Search       Tech Exchange      Education      Certifications      Library      Info Center      SupportPacs      LinkedIn  Search  Search                                                                   FAQ  FAQ   Usergroups  Usergroups
 
Register  ::  Log in Log in to check your private messages
 
RSS Feed - WebSphere MQ Support RSS Feed - Message Broker Support

MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » Issue in parsing characters such as °, ³ (superscript)

Post new topic  Reply to topic
 Issue in parsing characters such as °, ³ (superscript) « View previous topic :: View next topic » 
Author Message
Avocado's number
PostPosted: Fri Apr 12, 2013 4:00 am    Post subject: Issue in parsing characters such as °, ³ (superscript) Reply with quote

Newbie

Joined: 12 Apr 2013
Posts: 4

I am using a CSV file as an input to a FileInput node in my flow. The records present in the file are inserted into the database (DB2 9.7).

One of the records present in the CSV carries an entry °C. Initially, I used the MRM parser at the FileInput node and the record inserted into the database looked like -C without the °. Later, I switched to the DFDL parser for the same file and used a FileOutput node to verify the output. This time, the output generated was °C, however, an insert into the database changed to C.
Is there a way to represent special characters such as ° or ³, as is, in the database via a message flow?
Thanks for your help!
Back to top
View user's profile Send private message
McueMart
PostPosted: Fri Apr 12, 2013 4:28 am    Post subject: Reply with quote

Chevalier

Joined: 29 Nov 2011
Posts: 490
Location: UK...somewhere

Sounds like your database can't deal with the ° character to me! What Character set do you have set for the field which you are storing the data in?
Back to top
View user's profile Send private message
fjb_saper
PostPosted: Fri Apr 12, 2013 4:30 am    Post subject: Reply with quote

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

you'd have to store the message in UTF-8 to be sure that all special characters are conserved...

Have fun
_________________
MQ & Broker admin
Back to top
View user's profile Send private message Send e-mail
rekarm01
PostPosted: Sun Apr 14, 2013 11:26 am    Post subject: Re: Issue in parsing characters such as °, ³ (superscript) Reply with quote

Grand Master

Joined: 25 Jun 2008
Posts: 1415

Avocado's number wrote:
Is there a way to represent special characters such as ° or ³, as is, in the database via a message flow?

That depends on the DB2 code set used when creating the database, and the database column type. Check the DB2 documentation for more information.

fjb_saper wrote:
you'd have to store the message in UTF-8 to be sure that all special characters are conserved

It's true that UTF-8 will preserve non-ASCII characters, but DB2 string functions use byte indexing, not character indexing, which can affect position and length calculations in certain circumstances. The InfoCenter has more information, here and here.
Back to top
View user's profile Send private message
fjb_saper
PostPosted: Sun Apr 14, 2013 8:56 pm    Post subject: Reply with quote

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

I was being purposefully being generic there. In the particular case of DB2 you can store the text as BLOB in UTF-8. This way no risk to do a positional byte instead of character as you'd have to import to WMB before going positional on it (DFDL?)

Have fun
_________________
MQ & Broker admin
Back to top
View user's profile Send private message Send e-mail
Avocado's number
PostPosted: Sun Apr 14, 2013 10:25 pm    Post subject: Reply with quote

Newbie

Joined: 12 Apr 2013
Posts: 4

The database was using a code page of 1252. This has been changed to 1208 and now the representation of °C has become ┬░C. Also, the data type of the column in DB2 is CHARACTER. Any pointers, please?
Back to top
View user's profile Send private message
smdavies99
PostPosted: Sun Apr 14, 2013 11:31 pm    Post subject: Reply with quote

Jedi Council

Joined: 10 Feb 2003
Posts: 6076
Location: Somewhere over the Rainbow this side of Never-never land.

Did you delete the row and re-insert it with the new DB CCSID?
Just changing the DB CCSID won't fix the problem. (AFAIK)
_________________
WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995

Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions.
Back to top
View user's profile Send private message
Avocado's number
PostPosted: Mon Apr 15, 2013 1:34 am    Post subject: Reply with quote

Newbie

Joined: 12 Apr 2013
Posts: 4

I dropped and re-created the database to change the codepage, then created the tables and tried the insert via my flow.

Here is a summary of what I have tried so far-
1. Parsed the input file correctly against the DFDL test client parser.
2. Attached trace node after the FileInput node- Does not show the °C character correctly.
3. Flow wired to a FileOutput node- The output file has the correct character for °C.
4. Flow wired to a Compute node for database insert- Does not show the °C character correctly.
5. Changed the database codepage to 1208- Does not show the °C character correctly.
6. Changed ODBC settings to configure 'ConnectCodepage' for DSN to use 1208- Does not show the °C character correctly.
7. A direct insert of values into DB2 from control centre- Value is represented correctly in database.
8. Hard- coded '°C' against the column value in insert statement in ESQL- Value is represented correctly in database.
9. Passed '°C' in a character variable to the insert statement in ESQL- Value is represented correctly in database.
Back to top
View user's profile Send private message
kimbert
PostPosted: Mon Apr 15, 2013 2:20 am    Post subject: Reply with quote

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Quote:
1. Parsed the input file correctly against the DFDL test client parser.
2. Attached trace node after the FileInput node- Does not show the °C character correctly.
There's your problem. It looks as if the DFDL parser has been told to use the wrong text encoding when parsing this field.
That's one theory, anyway. A debug-level user trace should reveal what the DFDL parser is doing.
The most important question at this point is: what is the text encoding of the input file?
Back to top
View user's profile Send private message
Avocado's number
PostPosted: Mon Apr 15, 2013 3:58 am    Post subject: Reply with quote

Newbie

Joined: 12 Apr 2013
Posts: 4

The text encoding for the input file is UTF-8. However, the text encoding for this particular field was set to <dynamically set>. I changed this to UTF-8 and I am able to receive a correct representation of all special characters now (both in the flow and in the database).
Thanks much for your inputs.
Back to top
View user's profile Send private message
kimbert
PostPosted: Mon Apr 15, 2013 4:55 am    Post subject: Reply with quote

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

I'm glad you got it working.

'Dynamically set' causes the DFDL property 'encoding' to be set to '$dfdl:encoding'. That is a reference to a predefined variable. In WMB, the predefined variable $dfdl:encoding is set to the 'transport encoding'. In MQ that means the CCSID from the MQMD header. For a FileInput node it is probably the platform default encoding.
Back to top
View user's profile Send private message
Display posts from previous:   
Post new topic  Reply to topic Page 1 of 1

MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » Issue in parsing characters such as °, ³ (superscript)
Jump to:  



You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
Protected by Anti-Spam ACP
 
 


Theme by Dustin Baccetti
Powered by phpBB © 2001, 2002 phpBB Group

Copyright © MQSeries.net. All rights reserved.