Author |
Message
|
Vitor |
Posted: Wed Jul 14, 2010 5:00 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
hari.krish085 wrote: |
Sending application is RIB and RIB is publishing the JMS messages to topics |
This has nothing to do with the question at hand, i.e. the validity of the XML document. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
Back to top |
|
 |
hari.krish085 |
Posted: Wed Jul 14, 2010 10:36 pm Post subject: |
|
|
Novice
Joined: 13 Jul 2010 Posts: 12
|
I have checked with the RIB team; they don't have any idea which CCSID they were sending. Is there any alternative I can work on from the message flow side?
I have tried all the options: 1200, 1381, 1386 and 1204 with UTF-16, but it didn't work. |
|
|
 |
smdavies99 |
Posted: Wed Jul 14, 2010 11:31 pm Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
As Vitor (& others) have said, they are sending you an invalid XML document. End-Of.
If you search this forum, you will see a number of similar posts to yours.
Apart from writing some C (or similar) code that will munge the bitstream in the message before it gets to broker, you have to go back to the senders and get them to sort their errors out.
It is in the best interests of everyone on your project that they do that.
The code option is, IMHO, very much a last resort.
You might like to help the senders by formatting a proper message and letting them see 'the error of their ways'. _________________ WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995
Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions. |
|
|
 |
kimbert |
Posted: Wed Jul 14, 2010 11:41 pm Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Quote: |
I have checked with RIB team, they don't have any idea on which CCSID they were sending |
I agree with smdavies99. The RIB team need to learn about character encodings - this stuff matters in the modern world of application integration. Wikipedia has tons of good information on the subject, so there is no excuse for ignorance.
Quote: |
Any alternative way i can work on message flow side. |
No. But you could at least find out where the RIB team need to set the CCSID, so that you can explain to them what they have to do. Where does the message flow obtain the CCSID for the XML? From the XML declaration? Or from a JMS header?
If you don't know the answer yet, you should take a user trace and find out. |
|
|
 |
hari.krish085 |
Posted: Thu Jul 15, 2010 12:10 am Post subject: |
|
|
Novice
Joined: 13 Jul 2010 Posts: 12
|
Thank you so much for all the suggestions. I will go back to the sender team and work with them. |
|
|
 |
rekarm01 |
Posted: Thu Jul 15, 2010 2:29 am Post subject: Re: RecoverableException BIP2136E: Source character ''4e0a' |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 1415
|
hari.krish085 wrote: |
My flow has a JMSInput Node --> ComputeNode --> JMSMQTransformation node --> ComputeNode and MQOutput Node
The JMS input message has Chinese character "件数" . While publishing the output xml, it tries to parse the xml and since it is not able to parse this Chinese characters, it throws an error.
I tried using code page 1208, 819, 437, in both properties and MQMD header of the output message. It didn't work. |
Within a message flow, the source ccsid describes the input message encoding (before the JMSInput node).
The target ccsid describes the output message encoding (after the MQOutput node).
To be more precise, you tried changing the target ccsid. That won't work if it's the source ccsid that's wrong.
hari.krish085 wrote: |
The user trace is pasted below
Code: |
RecoverableException BIP2136E: Source character ''4e0a'' in field ''003c002f006500730062003a0……………'' cannot be converted from unicode to codepage '819'.
The source character is an invalid code point within the given codepage.
Correct the application or message flow that generated the message to ensure that all data within the message
can be represented in the target codepage. |
|
It looks like the source ccsid ought to be 1200 (UTF-16BE), but only the sender knows for sure:
- the byte sequence for '</esb:...' would be ''003c 002f 0065 0073 0062 003a ...''
- the byte sequence for '上' would be ''4e0a''; that's the character that caused the exception
- if '件数' is part of the input message, the message flow didn't get that far
What happens when the message flow doesn't change the target ccsid?
hari.krish085 wrote: |
What is the *real* encoding of the XML? --> the real encoding of xml is *UTF-8* . |
No. ''4e0a'' and ''003c002f006500730062003a...'' are definitely not UTF-8.
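One way to check a claim like this outside the broker is to decode the hex from the trace under each candidate encoding. The following is a minimal Python sketch, not broker code. It assumes the hex shown in the trace is the raw field content, and it uses only the six code units that are visible before the trace truncates.

```python
# Hex copied from the BIP2136E trace; only the visible prefix is used.
field_prefix = bytes.fromhex("003c002f006500730062003a")
bad_char = bytes.fromhex("4e0a")

# Read as UTF-16 big-endian, the prefix is the start of a closing tag
# and the offending code unit is a single CJK character.
prefix_text = field_prefix.decode("utf-16-be")
bad_text = bad_char.decode("utf-16-be")
print(prefix_text, bad_text)

# Read as UTF-8, the same two bytes are just 'N' followed by a line feed,
# so they cannot be the UTF-8 form of a CJK character.
as_utf8 = bad_char.decode("utf-8")

# ISO-8859-1 (ccsid 819) has no code point for U+4E0A, which matches
# the conversion failure the broker reports.
try:
    bad_text.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
print(repr(as_utf8), latin1_ok)
```

The UTF-8 form of the same character would be three bytes long, so a two-byte ''4e0a'' in the trace points at a UTF-16 style representation.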
hari.krish085 wrote: |
Is there is any other alternative we have on message flow side? |
Not before the JMSInput node.
hari.krish085 wrote: |
I have checked with RIB team, they don't have any idea on which CCSID they were sending |
It might be useful to add a Trace node after the JMSInput node, to display the contents of ${Root}. |
|
|
 |
fjb_saper |
Posted: Thu Jul 15, 2010 7:28 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
As you are using a JMS Input node, can you please provide the makeup of the connection factory?
Also, what type of message is being sent: BytesMessage or TextMessage? _________________ MQ & Broker admin |
|
|
 |
AndreasMartens |
Posted: Thu Jul 15, 2010 7:41 am Post subject: Bit more information |
|
|
 Acolyte
Joined: 30 Jan 2006 Posts: 65 Location: Hursley, UK
|
OK, let's try to explain a couple of things with an example...
件数 are *two* Chinese characters (well, they're valid in Japanese and Korean too, so let's call them CJK characters):
件 ("jian") has several representations; let's cover the main ones:
UTF-16: 0x4EF6
UTF-8: 0xE4 0xBB 0xB6
XML decimal entity: & # 2 0 2 1 4 ; (spaces added to prevent interpretation)
Ditto 数 ("shu"):
UTF-16: 0x6570
UTF-8: 0xE6 0x95 0xB0
XML decimal entity: & # 2 5 9 6 8 ; (spaces added to prevent interpretation)
So those are the main candidates to look for in your bitstreams.
Your error is quite explicit in telling you what's going wrong. It is saying that it can't convert "4e0a" *from* Unicode to 819. As someone has pointed out, 0x4e0a in Unicode is 上, so we can already determine that the Unicode representation has been created wrongly. (Conclusion 1, in accordance with previous posts: your input has been misinterpreted.)
As an aside, are you sure you're looking for 件数? 上 is romanized as "shang", as in the first character of Shanghai...
So a bit of info on 上 :
UTF-16: 0x4E0A
UTF-8: 0xE4 0xB8 0x8A
XML decimal entity: & # 1 9 9 7 8 ; (spaces added to prevent interpretation)
As you see, it can perfectly easily be represented in UTF-8 (ccsid 1208).
In your posted user-trace you're trying to convert to ccsid 819, Latin-1, which cannot represent any of the above characters, hence the errors.
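The byte values listed above can be reproduced with a few lines of Python, which is a handy way to generate the patterns to search for in a bitstream. This is only an illustrative sketch; the helper names `describe` and `fits_in_latin1` are made up for this example and have nothing to do with the broker.

```python
# The three characters discussed in this post.
chars = ["件", "数", "上"]

def describe(ch):
    """Return (UTF-16BE hex, UTF-8 hex, XML decimal character reference)."""
    utf16 = ch.encode("utf-16-be").hex().upper()
    utf8 = ch.encode("utf-8").hex(" ").upper()
    entity = "&#{};".format(ord(ch))
    return utf16, utf8, entity

def fits_in_latin1(ch):
    """True if the character exists in ISO-8859-1 (ccsid 819)."""
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False

for ch in chars:
    print(ch, *describe(ch), fits_in_latin1(ch))
```

All three characters encode cleanly to UTF-8 and UTF-16, and none of them fits in Latin-1, which is consistent with the conversion to 819 being the step that fails.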
So what I'm saying is that the input may not be invalid; the error you see is not from parsing XML, but from writing it. The best next steps are:
1. Place a Trace node after your input with ${Root} and look at the CCSID in the Properties folder.
2. Setting the CCSID in the MQMD is a little specific; the Properties folder should take care of this for you, so you might as well set Properties.CodedCharSetId.
3. Give us more details of the user trace; you don't explicitly tell us which node threw the error.
Hope this helps,
Andreas |
|
|
 |
rekarm01 |
Posted: Fri Jul 16, 2010 1:28 am Post subject: Re: Bit more information |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 1415
|
AndreasMartens wrote: |
2. Setting the CCSID in MQMD is a little specific, the Properties folder should take care of this for you, so you might as well set Properties.CodedCharSetId |
... assuming the message flow needs to set it at all. It hasn't been established whether the receiving application requires the message flow to convert the message from one ccsid to another. |
|
|
 |
hari.krish085 |
Posted: Sun Jul 18, 2010 8:56 pm Post subject: |
|
|
Novice
Joined: 13 Jul 2010 Posts: 12
|
The issue was resolved... thank you so much for all the comments.
I used the CCSID, ENCODING from the JMS headers and used them to parse.
1) Read the input message as BLOB.
2) Parse using InputRoot.Properties.Encoding, InputRoot.Properties.CodedCharSetId
CREATE LASTCHILD OF OutputRoot
    DOMAIN('XMLNS')
    PARSE (InputRoot.BLOB.BLOB
           ENCODING InputRoot.Properties.Encoding
           CCSID InputRoot.Properties.CodedCharSetId);
Thanks... |
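For readers who want to see the same idea outside ESQL, here is a rough Python analogy: treat the payload as raw bytes, decode it with the character set the message header declares, and only then parse it as XML. The CCSID-to-codec table, the function name and the sample message are all assumptions made up for this sketch; it does not model what the broker's parsers actually do internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping for illustration only: a few CCSIDs mentioned in
# this thread and the Python codec assumed to correspond to each.
CCSID_TO_CODEC = {
    1208: "utf-8",
    1200: "utf-16-be",  # assumes big-endian data, no reliance on a BOM
    819: "latin-1",
    437: "cp437",
}

def parse_with_declared_ccsid(payload: bytes, ccsid: int) -> ET.Element:
    """Decode the raw bytes with the declared CCSID, then parse the XML."""
    text = payload.decode(CCSID_TO_CODEC[ccsid])
    return ET.fromstring(text)

# Made-up sample message, encoded the way the trace suggests the sender did.
sample = "<esb:msg xmlns:esb='urn:example'>件数</esb:msg>"
payload = sample.encode("utf-16-be")

root = parse_with_declared_ccsid(payload, 1200)
print(root.text)

# Declaring the wrong CCSID for the same bytes fails at the decode step.
try:
    parse_with_declared_ccsid(payload, 1208)
    wrong_ccsid_failed = False
except UnicodeDecodeError:
    wrong_ccsid_failed = True
print(wrong_ccsid_failed)
```

The point of the sketch is that the parse can only succeed when the declared CCSID matches the bytes actually sent, which is why the sender's header values matter so much here.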
|
|
 |
smdavies99 |
Posted: Sun Jul 18, 2010 9:18 pm Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
I'm glad you got the issue solved.
However, is there any reason you are using the XMLNS rather than the XMLNSC domain? |
|
|
 |
hari.krish085 |
Posted: Wed Jul 21, 2010 9:57 pm Post subject: |
|
|
Novice
Joined: 13 Jul 2010 Posts: 12
|
Hi,
No, but now we have changed xmlns to BLOB domain. |
|
|
 |
smdavies99 |
Posted: Wed Jul 21, 2010 10:02 pm Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
hari.krish085 wrote: |
CREATE LASTCHILD OF OutputRoot
DOMAIN('XMLNS')
PARSE (InputRoot.BLOB.BLOB
ENCODING InputRoot.Properties.Encoding
CCSID InputRoot.Properties.CodedCharSetId);
|
The DOMAIN('XMLNS') line above is what I meant. Why not XMLNSC? |
|
|
 |
hari.krish085 |
Posted: Wed Jul 21, 2010 10:12 pm Post subject: |
|
|
Novice
Joined: 13 Jul 2010 Posts: 12
|
The connecting flows use the XMLNS domain, so we are parsing with the same domain here. The connecting flows start with an MQInput node. |
|
|
 |
rekarm01 |
Posted: Thu Jul 22, 2010 11:41 am Post subject: Re: RecoverableException BIP2136E: Source character ''4e0a'' |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 1415
|
hari.krish085 wrote: |
The issue was resolved... :) thank you so much for all the comments.
I used the CCSID, ENCODING from the JMS headers and used them to parse.
1) Read the input message as BLOB.
2) Parse using InputRoot.Properties.Encoding, InputRoot.Properties.CodedCharSetId
Code: |
CREATE LASTCHILD OF OutputRoot
DOMAIN('XMLNS')
PARSE (InputRoot.BLOB.BLOB
ENCODING InputRoot.Properties.Encoding
CCSID InputRoot.Properties.CodedCharSetId); |
|
This is identical to simply reading the input message as XMLNS. There's no obvious reason why one would work and the other wouldn't ... unless something else changed, and that's what really resolved the issue.
hari.krish085 wrote: |
The connecting flows used xmlns domains. we are parsing the same here. and the connecting flows starts with MQInput node. |
That's not a compelling reason to choose XMLNS over XMLNSC. Check the InfoCenter for better reasons. |
|
|
 |
|