ASG
IBM
Zystems
Cressida
Icon
Netflexity
 
  MQSeries.net
Search  Search       Tech Exchange      Education      Certifications      Library      Info Center      SupportPacs      LinkedIn  Search  Search                                                                   FAQ  FAQ   Usergroups  Usergroups
 
Register  ::  Log in Log in to check your private messages
 
RSS Feed - WebSphere MQ Support RSS Feed - Message Broker Support

MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » Broker Unicode support for Oracle database

Post new topic  Reply to topic Goto page 1, 2  Next
 Broker Unicode support for Oracle database « View previous topic :: View next topic » 
Author Message
golam1983
PostPosted: Sat Mar 06, 2010 6:29 pm    Post subject: Broker Unicode support for Oracle database Reply with quote

Apprentice

Joined: 27 Jun 2008
Posts: 35

Currently I have requirement to handle/manipulate unicode data(UTF8) from oracle database. Here is my current environments

Broker: v6.0.0.3 in Linux X86 platform
Oracle: 9i nls_characterset UTF8

However the "Support for Unicode and DBCS data in databases" topic in the broker documentation says that -

Code:
If you are using Oracle:
v Your database must be created with NLS_CHARACTERSET of AL32UTF8.
v Your ODBC data source definition must include the setting
ColumnSizeAsCharacter=1.
On UNIX and Linux platforms, this setting must be included in the appropriate
stanza in the ODBC ini files.
On Windows platforms, this string value must be added to the ODBC data
source key in the registry.
See Enabling ODBC connections to the databases for further information.
v For 32-bit connections, you must set the variable NLS_LANG in the broker
environment to the value <yourlanguage>.<yourterritory>.AL32UTF8.


My oracle database have NLS_CHARACTERSET UTF8 and my DBA team is not agreed at all to shift this to AL32UTF8.

My locale(LC_ALL) set to en_us.UTF-8
NLG_LANG=american_america.UTF8/AL32UTF8 (tried both of these)
Set ColumnSizeAsCharacter=1 in proper ODBC datasource stanza.
Datadirect version is 5.0

These settings are not working properly while inserting/selecting data from database.

Would be interested to know if anybody handled this kind of situation or any idea/insight to enable unicode support.
[/code]
Back to top
View user's profile Send private message
Vitor
PostPosted: Sat Mar 06, 2010 7:45 pm    Post subject: Re: Broker Unicode support for Oracle database Reply with quote

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

golam1983 wrote:
These settings are not working properly while inserting/selecting data from database.


Given that these are not the settings given in the documentation it's not all that surprising.

golam1983 wrote:
Would be interested to know if anybody handled this kind of situation or any idea/insight to enable unicode support.


Perhaps get the DBA team to shift their position, or at least justify it?
_________________
Honesty is the best policy.
Insanity is the best defence.
Back to top
View user's profile Send private message
golam1983
PostPosted: Sat Mar 06, 2010 9:30 pm    Post subject: Reply with quote

Apprentice

Joined: 27 Jun 2008
Posts: 35

Thanks Vitor.

Just to repeat, the only difference between my database setting and ibm documented database setting is utf8(my current database charset) and al32utf8(ibm must condition charset)

I have checked the hex stream in broker before insert. They are in proper utf8 (ccsid 1208) format. As per oracle documentation if the client's NLS_LANG set to same charset like database (here utf8), there will be no data conversion will be happen. Thats what i want, no conversion. But in my case my utf8 data is getting further converted. For an example "ª" (C2B6 in hex UTF8) got further converted into some (xxxxxxxx). I just do not want this conversion.

So I am checking here if somebody have faced same type of scenario before taking any further step toward this issue(IBM support)

This clearly a incapability of broker to do a simple task simply(just forwarding utf8 data)
Back to top
View user's profile Send private message
Vitor
PostPosted: Sat Mar 06, 2010 9:43 pm    Post subject: Reply with quote

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

golam1983 wrote:
This clearly a incapability of broker to do a simple task simply(just forwarding utf8 data)


There's nothing simple about code pages, as the number of posts on the subject here will attest to.

I suspect something interesting is happening on the interface between Oracle & WMB. This was the basis of my question about why the DBA team won't change the Oracle setting; if both code pages are UTF8 why not alter the database? Clearly it's not that simple or a) your DBAs would alter the database & b) there's would be those instructions in the broker documentation. So that's one point.

I'm also interested by your statement:

golam1983 wrote:
I have checked the hex stream in broker before insert. They are in proper utf8 (ccsid 1208) format


and

golam1983 wrote:
But in my case my utf8 data is getting further converted


How have you determined that WMB is doing the additional conversion? If the stream in broker is correct immediately prior to insert then why couldn't Oracle be remapping the characters? Likewise how are you determining this additional conversion (the 'xxxxxx' you talk about) is happening? If my assumption is correct and you're doing this with a SQL SELECT statement, could not the conversion be happening at this point?

Just 2 points that might help shine a light on this.
_________________
Honesty is the best policy.
Insanity is the best defence.
Back to top
View user's profile Send private message
fjb_saper
PostPosted: Sun Mar 07, 2010 12:24 pm    Post subject: Reply with quote

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

As well you have to realize that UTF-8 is not the same as Unicode which the broker uses internally. As you did not provide us with any code, how can we know that you are not doing something funky in the code?

Have fun
_________________
MQ & Broker admin
Back to top
View user's profile Send private message Send e-mail
rekarm01
PostPosted: Sun Mar 07, 2010 5:45 pm    Post subject: Re: Broker Unicode support for Oracle database Reply with quote

Grand Master

Joined: 25 Jun 2008
Posts: 1415

golam1983 wrote:
... the "Support for Unicode and DBCS data in databases" topic in the broker documentation says that -

Quote:
If you are using Oracle:
- Your database must be created with NLS_CHARACTERSET of AL32UTF8.
- For 32-bit connections, you must set the variable NLS_LANG in the broker environment to the value <yourlanguage>.<yourterritory>.AL32UTF8.

That seems fairly self-explanatory.

golam1983 wrote:
My oracle database have NLS_CHARACTERSET UTF8 and my DBA team is not agreed at all to shift this to AL32UTF8.

Oracle also recommends switching from "UTF8" to "AL32UTF8", as of Oracle 9i. Push back on the DBA team to switch to "AL32UTF8".

golam1983 wrote:
NLG_LANG=american_america.UTF8/AL32UTF8 (tried both of these)

"NLG_LANG" should be "NLS_LANG". "american_america" should be "AMERICAN_AMERICA".

golam1983 wrote:
I have checked the hex stream in broker before insert. ... As per oracle documentation if the client's NLS_LANG set to same charset like database (here utf8), there will be no data conversion will be happen.

There are at least three possible opportunities for conversion:
  1. from bitstream to ESQL CHARACTER (UCS-2)
  2. from ESQL CHARACTER (UCS-2) to ODBC client (NLS_LANG)
  3. from ODBC client (NLS_LANG) to Oracle server (NLS_CHARACTERSET)
The Oracle documentation only addresses the third conversion, (ODBC client->Oracle server).

It's the second one (ESQL->ODBC client) that requires a supported NLS_LANG characterset, (such as "ALT32UTF8", but not "UTF8" - nor "utf8").

golam1983 wrote:
For an example "ª" (C2B6 in hex UTF8) got further converted into some (xxxxxxxx).

"Some (xxxxxxxx)"? Which (xxxxxxxx)?

golam1983 wrote:
Would be interested to know if anybody handled this kind of situation or any idea/insight to enable unicode support.

Push back on the DBA team to switch to "AL32UTF8".

Or, change the relevant database columns from character types to BLOB, and insert bytes instead of characters.

Or, add a JavaCompute node, and insert using JDBC, instead of ODBC.

Or, ... push back on the DBA team to switch to "AL32UTF8".
Back to top
View user's profile Send private message
golam1983
PostPosted: Mon Mar 08, 2010 1:07 pm    Post subject: Reply with quote

Apprentice

Joined: 27 Jun 2008
Posts: 35

Thanks rekarm01...I will check with my DBA team, if anyway the database could be switched to AL32UTF8.

I have set the following -
LANG=en_us.UTF-8
NLS_LANG=AMERICAN_AMERICA.UTF8

This seems like working when I am Inserting data.

But its failing when I am trying to select 4 byte charecter data from oracle.

Here is the data stream I am using for testing - ed9fbfee8080efbfbdf48fbfbff48fbfbe

The last 8 byte represents 2 charecter each 4 bytes.

Insert/Update is working fine.

But when I am selecting the same data, the return is -

ed9fbfee8080efbfbdefbfbdefbfbdefbfbdefbfbdefbfbdefbfbdefbfbdefbfbd

Both 4 bytes char seems to be screwed while selecting from oracle. Any possible solution for this(Except switching the database to AL32UTF8)
Back to top
View user's profile Send private message
rekarm01
PostPosted: Tue Mar 09, 2010 8:52 pm    Post subject: Re: Broker Unicode support for Oracle database Reply with quote

Grand Master

Joined: 25 Jun 2008
Posts: 1415

golam1983 wrote:
I have set the following -
LANG=en_us.UTF-8
NLS_LANG=AMERICAN_AMERICA.UTF8

But its failing when I am trying to select 4 byte character data from Oracle.

supplementary characters require more than 3 bytes for UTF-8.
For "AL32UTF8", Oracle stores and retrieves a supplementary character as a 4-byte UTF-8 character.
For "UTF8", Oracle stores and retrieves a supplementary character as two 3-byte CESU-8 characters.
That's the critical difference between them.

Check the Oracle9i Database Globalization Support Guide or the Oracle Unicode database support white paper for more details.

golam1983 wrote:
Here is the data stream I am using for testing - ed9fbf ee8080 efbfbd f48fbfbf f48fbfbe

The last 8 byte represents 2 charecter each 4 bytes.

Insert/Update is working fine.

Really? That's an unusual UTF-8 data stream:
    ed9fbf = U+D7FF = contact admin code point (reserved for future use)
    ee8080 = U+EE00 = private-use character
    efbfbd = U+FFFD = <REPLACEMENT_CHARACTER> = '�'
    f48fbfbf = U+10FFFF = private-use character
    f48fbfbe = U+10FFFE = private-use character
Is that what the data stream looks like before or after inserting into the database?

golam1983 wrote:
But when I am selecting the same data, the return is -

ed9fbf ee8080 efbfbd efbfbd efbfbd efbfbd efbfbd efbfbd efbfbd efbfbd efbfbd

Both 4 bytes char seems to be screwed while selecting from oracle.

efbfbd = U+FFFD = <REPLACEMENT_CHARACTER> = '�'; these indicate byte sequences that don't map to Unicode characters, (probably due to the incompatibility between UTF-8 and CESU-8 formats).

Is the standard Oracle client any better at selecting the data with the same configuration?

golam1983 wrote:
Any possible solution for this (Except switching the database to AL32UTF8)

In the interim, Oracle's suggested work-around is to set the client-side NLS_LANG character set to "AL32UTF8", and let Oracle convert data between "AL32UTF8" client and "UTF8" database.
Back to top
View user's profile Send private message
golam1983
PostPosted: Sun Mar 14, 2010 10:01 am    Post subject: Reply with quote

Apprentice

Joined: 27 Jun 2008
Posts: 35

Thanks rekarm01, for your valuable comments on this.

Yes the Insert/Update look perfect. I have used rawhex function for oracle to verify it.

You are correct, I will use AL32UTF8 and Let oracle to convert it in UTF8.

Again, thank you very much
Back to top
View user's profile Send private message
shatvani
PostPosted: Tue Sep 07, 2010 6:29 am    Post subject: Reply with quote

Newbie

Joined: 08 Mar 2010
Posts: 6

Could you help me in the same issue.
My NLS_CHARACTERSET=UTF8,
NLS_NCHAR_CHARACTERSET=AL16UTF16.
I call a stored procedure which returns a CLOB,
I'd like to put it into an XMLNSC domain but insted of my accentuated character I get mismatch character:
Quote:
<KEP_NEV>"A" kategóriás hajóvezető</KEP_NEV>

=>
Quote:
&lt;KEP_NEV&gt;&amp;quot;A&amp;quot; kategĂƒÃ‚łriĂƒÃ‚ˇs hajĂƒÃ‚łvezetĂ…â€˜&lt;/KEP_NEV&gt;


I tried smdavies99 suggestion:
Code:
CREATE PROCEDURE DoStuff() BEGIN
      SET OutputRoot.MQMD = NULL;
      SET OutputRoot.MQRFH2 = NULL;
      CREATE NEXTSIBLING of OutputRoot.Properties DOMAIN 'MQMD';
      set OutputRoot.MQMD.Version = MQMD_CURRENT_VERSION;
      SET OutputRoot.MQMD.CodedCharSetId = 1208;
      SET OutputRoot.MQMD.Format='MQSTR';

but unfortunatelly it was useless.
Thank you in anticipation.
Back to top
View user's profile Send private message
Vitor
PostPosted: Tue Sep 07, 2010 6:44 am    Post subject: Reply with quote

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

shatvani wrote:
Could you help me in the same issue.


How do you know it's the same issue? The same problem, on the same software level, and the same platform? This is why we prefer people to open new threads and refer to possibly connected earlier ones.

shatvani wrote:
My NLS_CHARACTERSET=UTF8,
NLS_NCHAR_CHARACTERSET=AL16UTF16.


Much of this thread discusses setting these to values you're not using. Why have you chosen these values in the face of direct instructions from the broker documentation and the advice on this thread, then been surprised when it doesn't work properly?

shatvani wrote:
I tried smdavies99 suggestion:


What suggestion? Where? I don't see it in this thread.

shatvani wrote:
but unfortunatelly it was useless.


Of course it was. If you look at what the code is doing, then trying reading both this thread and the broker documentation you'll see why it won't help with this problem. Which is why I suspect it wasn't suggested on this thread.
_________________
Honesty is the best policy.
Insanity is the best defence.
Back to top
View user's profile Send private message
shatvani
PostPosted: Tue Sep 07, 2010 10:31 am    Post subject: Reply with quote

Newbie

Joined: 08 Mar 2010
Posts: 6

Dear Vitor,

NLS_NCHAR_CHARACTERSET is used by NCLOB, NCHAR, etc. The procedure what I use returns with CLOB, so I don't understand why it would be effective to convert AL16UTF16 to UTF8?

I have read through a lot of posts, for example smdavies99 's, and I think so the problem is around the character encoding, but I don't know where.

Thank you anyway
Back to top
View user's profile Send private message
Vitor
PostPosted: Tue Sep 07, 2010 10:51 am    Post subject: Reply with quote

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

shatvani wrote:
I don't understand why it would be effective to convert AL16UTF16 to UTF8?


I don't think I or anyone in this thread has suggested that. Rather the reverse in fact.

shatvani wrote:
I have read through a lot of posts, for example smdavies99 's


I think there's more than one by that poster, I hold by my assertion that there's no such post on this thread for the good & sufficient reason that it won't help with this problem.

shatvani wrote:
I think so the problem is around the character encoding


You think?

shatvani wrote:
I don't know where.


Try re-reading this post & using some of the suggestions.
_________________
Honesty is the best policy.
Insanity is the best defence.
Back to top
View user's profile Send private message
shatvani
PostPosted: Tue Sep 07, 2010 10:57 am    Post subject: Reply with quote

Newbie

Joined: 08 Mar 2010
Posts: 6

Thank you.
Back to top
View user's profile Send private message
smdavies99
PostPosted: Tue Sep 07, 2010 9:23 pm    Post subject: Reply with quote

Jedi Council

Joined: 10 Feb 2003
Posts: 6076
Location: Somewhere over the Rainbow this side of Never-never land.

Regular posters here (Like Vitor, myself and the IBM'ers) have commented on literally thousands of topics.
If you reference a post by someone it really would help to put a link to in when you reference it.
This way we can see if the thing you have tried is actually relevant to your problem or has been taken by mistake out of context.

As the person's post that is being referenced even I can't remember the exact relationship of my suggestion to this one.

Add to this, Vitors suggestion to start a new thread and reference the old one rather than re-opening it, I hope this might help you in the future.
_________________
WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995

Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions.
Back to top
View user's profile Send private message
Display posts from previous:   
Post new topic  Reply to topic Goto page 1, 2  Next Page 1 of 2

MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » Broker Unicode support for Oracle database
Jump to:  



You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
Protected by Anti-Spam ACP
 
 


Theme by Dustin Baccetti
Powered by phpBB © 2001, 2002 phpBB Group

Copyright © MQSeries.net. All rights reserved.