MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » Error Parsing BLOB to JSON

Error Parsing BLOB to JSON
maus
Posted: Fri Oct 26, 2018 11:54 am    Post subject: Error Parsing BLOB to JSON

Newbie

Joined: 26 Oct 2018
Posts: 4

Hello, I'm having a tough time converting an HTTP input to JSON. My flow has an HTTP Input node with the Message domain set to BLOB. If I put a breakpoint after the input node, I can inspect the bytes and everything looks normal:
Quote:

20202020202020202022627573696e6573734e616d65223a202257617272696f72204c61f1657320426f776c696e67204c65616775e9222c0a
ñ é " ,


I understand it to mean "f1" and "e9" are the UTF-16 hex representations of the special characters "ñ" and "é", respectively. On the HTTP header of the test request, I'm setting the Content-Type to utf-16, so the InputRoot.Properties.CodedCharSetId = 1204, which is what I would expect.

Later in the flow, I'm trying to parse the input BLOB to JSON via a Reset Content Descriptor (RCD) node, and I'm getting "JSON parsing errors have occurred". The error appears in the message root under the JSON folder; there is nothing in the ExceptionList. The flow proceeds and fails later because of this. This fails whether or not the special characters are included in the request, and I'm not sure why. If I do not set the charset on the input request, Broker sets the char set ID to 1208. Then, when I try to run the message through the RCD node, I get the unconvertable character error. With the char set ID set to 1208, if there are no special characters in the request, it parses just fine.

I haven't done any special handling of the bitstream at all, I'm just taking the message in and running it through the RCD node; am I missing something somewhere? Is there something I should be checking?
maus
Posted: Fri Oct 26, 2018 12:12 pm

Newbie

Joined: 26 Oct 2018
Posts: 4

Just to add, the JSON being passed into the flow is valid:
Code:
{
   "client": {
      "names": [{
         "businessName": "Warrior Lañes Bowling Leagué",
         "updateDate": "2017-09-29T15:45:50.082-04:00"
      }]
   }
}
fjb_saper
Posted: Fri Oct 26, 2018 8:53 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20696
Location: LI,NY

CCSID 1204 is a new one to me for UTF-16. I thought the ones to use were 1200, 1201, and 1202, depending on whether or not you also use a byte order mark...
_________________
MQ & Broker admin
timber
Posted: Sat Oct 27, 2018 12:46 pm

Grand Master

Joined: 25 Aug 2015
Posts: 1280

That is not a UTF-16 character stream. I know that because most of the characters occupy 8 bits whereas in UTF-16 all characters are at least 16-bit.
It looks more like a UTF-8 character stream, but you claim that it doesn't work when you set the encoding to UTF-8 either. So it might be a badly-constructed UTF-8 character stream. Either way, it's definitely not UTF-16.
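To make the size argument concrete, here is a minimal Python sketch. Python is used only for illustration and is not part of the message flow, and the variable names are made up for this example. It encodes one word from the sample payload three ways and compares each result with the corresponding fragment of the posted hex dump.

```python
# One word from the posted payload, containing the special character.
word = "La\u00f1es"  # "Lañes"

# Hex of the same text under three candidate encodings.
as_utf16 = word.encode("utf-16-be").hex()
as_utf8 = word.encode("utf-8").hex()
as_latin1 = word.encode("iso-8859-1").hex()

print("utf-16-be :", as_utf16)
print("utf-8     :", as_utf8)
print("iso-8859-1:", as_latin1)

# The posted dump contains "4c61f16573" for this word.
dump_fragment = "4c61f16573"
candidates = [("utf-16-be", as_utf16), ("utf-8", as_utf8), ("iso-8859-1", as_latin1)]
print("matches dump:", [name for name, hex_text in candidates if hex_text == dump_fragment])
```

In UTF-16 every ASCII letter is accompanied by a 00 byte, and in UTF-8 the ñ becomes the two bytes c3 b1. The dump shows neither pattern: it has a lone f1 byte.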
rekarm01
Posted: Sun Oct 28, 2018 2:08 pm    Post subject: Re: Error Parsing BLOB to JSON

Grand Master

Joined: 25 Jun 2008
Posts: 1415

maus wrote:
I understand it to mean "f1" and "e9" are the UTF-16 hex representations of the special characters "ñ" and "é", respectively.

That's neither UTF-16, nor UTF-8. It could be "iso-8859-1" or "windows-1252", but only the sender knows for sure.

maus wrote:
On the HTTP header of the test request, I'm setting the Content-Type to utf-16

The HTTP header does not convert the data; it only describes it. If the description doesn't match the data, that can cause problems for the receiver, including JSON parser errors and unconvertable characters.
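A mismatch between the description and the data can be reproduced outside the broker. The following is a small Python sketch, not IIB code. The bytes are copied from the hex dump in the first post with the nine leading 0x20 (space) bytes dropped, and `try_decode` is a hypothetical helper written for this example.

```python
# Bytes from the posted hex dump (leading spaces trimmed).
raw = bytes.fromhex(
    "22627573696e6573734e616d65223a20"
    "2257617272696f72204c61f16573"
    "20426f776c696e67204c65616775e9222c0a"
)


def try_decode(data: bytes, encoding: str) -> str:
    """Return the decoded text, or a short marker if the bytes are invalid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        bad = data[exc.start]
        return f"<invalid {encoding}: byte 0x{bad:02x} at offset {exc.start}>"


# Same bytes, four different descriptions of them.
for enc in ("utf-8", "utf-16-be", "iso-8859-1", "windows-1252"):
    print(f"{enc:>12}: {try_decode(raw, enc)!r}")
```

Decoding as UTF-8 is rejected at the f1 byte, and decoding as UTF-16 succeeds only in the sense that it pairs up the bytes into unrelated characters. The two single-byte encodings both recover the expected text, which is why the bytes alone cannot say which of those two the sender used.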
maus
Posted: Tue Oct 30, 2018 4:48 am

Newbie

Joined: 26 Oct 2018
Posts: 4

The request is coming from a Java application, so I assumed it was UTF-16, but setting OutputRoot.Properties.CodedCharSetId = 819 allowed the Reset Content Descriptor node to parse the BLOB as JSON, and I'm now seeing the special characters in the message. It must have been ISO-8859-1.

The RCD node must be setting OutputRoot.Properties.CodedCharSetId = 1208 during the translation. Does that mean my message has been encoded as UTF-8? The reason I think this is because if I put a breakpoint directly after the RCD node, the CodedCharSetId is set to 1208.

The flow is converting the JSON into a SOAP request for the target application, and the target application is expecting UTF-8 anyway, so if I don't need to do any further processing of the message, that would be ideal.
timber
Posted: Tue Oct 30, 2018 7:55 am

Grand Master

Joined: 25 Aug 2015
Posts: 1280

Quote:
The request is coming from a Java application, so I assumed it was UTF-16
That was not a valid assumption. Java strings are UTF-16 internally, but Java can read and write character streams in any valid encoding.
In any case, you should never try to guess the encoding because there is no reliable algorithm for doing that in most cases. A sample document can look identical in multiple different encodings. The sender should always specify the encoding - preferably along with the message but if not then it should be specified in the design document.
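A short Python sketch of why guessing is unreliable (illustration only; the choice of code pages here is arbitrary and the variable names are made up). An ASCII-only fragment of the payload produces the same bytes under several encodings, and the two non-ASCII bytes from the dump are accepted by several single-byte code pages that each read them as different characters.

```python
# An ASCII-only fragment of the payload: the bytes are the same in all three.
ascii_only = '"updateDate": "2017-09-29T15:45:50.082-04:00"'
ascii_encodings = ("utf-8", "iso-8859-1", "windows-1252")
ascii_bytes = {enc: ascii_only.encode(enc) for enc in ascii_encodings}
all_identical = len(set(ascii_bytes.values())) == 1
print("ASCII-only bytes identical:", all_identical)

# The two non-ASCII bytes from the dump. These three single-byte code pages
# all accept them, but each one maps them to different characters.
special = bytes.fromhex("f1e9")
single_byte_encodings = ("iso-8859-1", "cp437", "iso-8859-5")
readings = {enc: special.decode(enc) for enc in single_byte_encodings}
for enc, text in readings.items():
    print(f"{enc:>10}: {text!r}")
```

None of the three decodes raises an error, so a receiver has no technical signal that it picked the wrong one; only the sender's stated encoding settles it.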
Quote:
The RCD node must be setting OutputRoot.Properties.CodedCharSetId = 1208 during the translation. Does that mean my message has been encoded as UTF-8?
When IIB writes a message, it selects the output encoding based on OutputRoot.Properties.CodedCharSetId. So yes, if it is still set to 1208 when the message tree reaches the SOAPRequest node then the output XML will be encoded in UTF-8.
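The effect on the bytes can be sketched in Python as an analogy: read with one CCSID, write with another. This is not how IIB is implemented. `CCSID_TO_CODEC` and `transcode` are hypothetical names for this example, and the table covers only the two CCSIDs discussed in this thread (819 for ISO-8859-1 and 1208 for UTF-8).

```python
# Illustrative mapping for the two CCSIDs mentioned in this thread.
CCSID_TO_CODEC = {
    819: "iso-8859-1",
    1208: "utf-8",
}


def transcode(data: bytes, from_ccsid: int, to_ccsid: int) -> bytes:
    """Decode bytes using the input CCSID, then re-encode using the output CCSID."""
    text = data.decode(CCSID_TO_CODEC[from_ccsid])
    return text.encode(CCSID_TO_CODEC[to_ccsid])


# "Lañes" as it arrived (one byte for ñ) and as it would leave in UTF-8.
incoming = bytes.fromhex("4c61f16573")
outgoing = transcode(incoming, 819, 1208)
print("in :", incoming.hex())
print("out:", outgoing.hex())
```

The single f1 byte on input becomes the two-byte sequence c3 b1 on output, which is what a UTF-8 consumer such as the target application would expect.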
rekarm01
Posted: Tue Oct 30, 2018 5:14 pm

Grand Master

Joined: 25 Jun 2008
Posts: 1415

maus wrote:
The RCD node must be setting OutputRoot.Properties.CodedCharSetId = 1208 ...

The RCD node does not directly set OutputRoot.Properties.CodedCharSetId. The message flow must have set it some other way.
maus
Posted: Tue Oct 30, 2018 5:43 pm

Newbie

Joined: 26 Oct 2018
Posts: 4

Thank you guys for the great information!