Encoding problems in HTTPRequest Node
elenzo
Posted: Tue Oct 29, 2013 11:20 am

Hi, I have a simple flow that makes an HTTP request. The web service's response is encoded in ISO-8859-1, but the message flow tries to parse it as UTF-8.
I have two Trace nodes. The first one is before the HTTPRequest node, and the properties look like this:

Code:
( ['MQROOT' : 0x6568b50]
                                         (0x01000000:Name  ):Properties        = ( ['MQPROPERTYPARSER' : 0x656b650]
                                           (0x03000000:NameValue):MessageSet             = '' (CHARACTER)
                                           (0x03000000:NameValue):MessageType            = '' (CHARACTER)
                                           (0x03000000:NameValue):MessageFormat          = '' (CHARACTER)
                                           (0x03000000:NameValue):Encoding               = 0 (INTEGER)
                                           (0x03000000:NameValue):CodedCharSetId         = 819 (INTEGER)


The second Trace node is immediately after the HTTPRequest node, and there the CodedCharSetId has a new value, 1208.

This CodedCharSetId=1208 causes a parsing error. The only way I found to solve this was:
Set the HTTPRequest node's response message parsing to BLOB, add a Compute node to set CodedCharSetId=819, and add a ResetContentDescriptor node to parse the message as XMLNSC.

The question is: why is the HTTPRequest node changing the CodedCharSetId value to 1208? How can I avoid this? I've been reading the Info Center but didn't find the answers to my questions...

Any help would be much appreciated!
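
The symptom and the workaround can be reproduced outside the broker. The following is a minimal Python sketch, not broker code: the payload is made up, `parse_with_ccsid` is a hypothetical helper, and the CCSID-to-codec table covers only the two values mentioned in this post. Forcing CCSID 1208 on ISO-8859-1 bytes fails to parse, while forcing 819 succeeds, which is roughly what the BLOB, Compute and ResetContentDescriptor combination achieves.

```python
import xml.etree.ElementTree as ET

# Only the two CCSIDs mentioned in the post; extend as needed.
CCSID_TO_CODEC = {819: "iso-8859-1", 1208: "utf-8"}


def parse_with_ccsid(blob: bytes, ccsid: int) -> ET.Element:
    """Parse raw bytes as XML, forcing the character set given by ccsid."""
    # The encoding argument overrides whatever the XML declaration says.
    parser = ET.XMLParser(encoding=CCSID_TO_CODEC[ccsid])
    parser.feed(blob)
    return parser.close()


# Hypothetical response body, encoded the way the remote service encodes it.
blob = '<?xml version="1.0" encoding="ISO-8859-1"?><name>José</name>'.encode("iso-8859-1")

try:
    parse_with_ccsid(blob, 1208)
    parsed_as_utf8 = True
except ET.ParseError:
    # The single byte 0xE9 followed by '<' is not a valid UTF-8 sequence.
    parsed_as_utf8 = False

root = parse_with_ccsid(blob, 819)
print(parsed_as_utf8, root.text)
```

The same bytes are used in both calls; only the assumed character set changes.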
gs
Posted: Wed Oct 30, 2013 2:35 am

Have you verified that the web service HTTP response headers match the response? Could it be that Content-Type charset is set to utf-8?
elenzo
Posted: Wed Oct 30, 2013 9:20 am

This is the web service's response:

Code:
)
                                         (0x01000000:Name):HTTPResponseHeader = ( ['WSRSPHDR' : 0x654f1f0]
                                           (0x03000000:NameValue):X-Original-HTTP-Status-Line = 'HTTP/1.1 200 OK' (CHARACTER)
                                           (0x03000000:NameValue):X-Original-HTTP-Status-Code = 200 (INTEGER)
                                           (0x03000000:NameValue):Date                        = 'Wed, 30 Oct 2013 17:16:36 GMT' (CHARACTER)
                                           (0x03000000:NameValue):Server                      = 'Microsoft-IIS/6.0' (CHARACTER)
                                           (0x03000000:NameValue):Content-Length              = '381' (CHARACTER)
                                           (0x03000000:NameValue):Content-Type                = 'text/xml' (CHARACTER)
                                           (0x03000000:NameValue):Set-Cookie                  = 'ASPSESSIONIDCQSDSBSR=OIHLAPJADDBAJCCILOILOKJN; path=/' (CHARACTER)
                                           (0x03000000:NameValue):Cache-control               = 'private' (CHARACTER)
                                         )
                                         (0x01000000:Name):BLOB               = ( ['none' : 0x654eee0]
                                           (0x03000000:NameValue):UnknownParserName = '' (CHARACTER)
                                           (0x03000000:NameValue):BLOB              = X'3c3f786d6c207665727


There is no charset definition in the Content-Type header.
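
To check a header value like the one in the trace programmatically, a small Python sketch can help. It uses the standard library's `email.message.Message` to read the `charset` parameter; `declared_charset` is a hypothetical helper name, and the second header value is an invented example of what the service could send.

```python
from email.message import Message
from typing import Optional


def declared_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lower-cases the value and returns None when missing.
    return msg.get_content_charset()


print(declared_charset("text/xml"))                      # header value from the trace
print(declared_charset("text/xml; charset=ISO-8859-1"))  # invented example
```

For the header in the trace the helper returns `None`, so the receiver has to fall back on some default.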
mgk
Posted: Wed Oct 30, 2013 4:09 pm

So a charset of utf-8 will be assumed if the remote server does not set the charset correctly.

Kind regards,
_________________
MGK
The postings I make on this site are my own and don't necessarily represent IBM's positions, strategies or opinions.
gs
Posted: Thu Oct 31, 2013 2:45 am

mgk wrote:
So a charset of utf-8 will be assumed if the remote server does not set the charset correctly.


My thought too, but shouldn't ISO-8859-1 be the default when charset is missing from the header?

http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.4
mqjeff
Posted: Thu Oct 31, 2013 2:52 am

gs wrote:
mgk wrote:
So a charset of utf-8 will be assumed if the remote server does not set the charset correctly.


My thought too, but shouldn't ISO-8859-1 be the default when charset is missing from the header?

http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.4


Section 3.7.1 says that media types of the "text" type have a default char set of ISO-8859-1, but your media is "text/xml", and I think you'll find that XML has a default charset of utf-8. Although in theory, if the MIME part doesn't include a charset, then the charset in the xml declaration should be used.

I bet you'll find that the doc is being sent in ISO-8859-1, but the xml declaration says "utf-8"....
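
One way to test that guess is to read the encoding straight out of the XML declaration in the raw bytes. The Python sketch below is a simplification: it assumes an ASCII-compatible encoding and no byte order mark, `xml_declared_encoding` is a hypothetical helper, and the sample documents are invented (the BLOB in the trace above is truncated, so it is not used here).

```python
import re
from typing import Optional

# Matches the encoding pseudo-attribute of a leading XML declaration.
_XML_DECL = re.compile(
    rb"""^<\?xml\s+version\s*=\s*["'][^"']*["']\s+encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def xml_declared_encoding(blob: bytes) -> Optional[str]:
    """Return the encoding named in the XML declaration, or None if there is none."""
    match = _XML_DECL.match(blob)
    return match.group(1).decode("ascii") if match else None


print(xml_declared_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'))
print(xml_declared_encoding(b'<?xml version="1.0"?><a/>'))
```

Comparing this value with the charset from the Content-Type header shows whether the two disagree.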
gs
Posted: Thu Oct 31, 2013 3:06 am

mqjeff wrote:

Section 3.7.1 says that media types of the "text" type have a default char set of ISO-8859-1, but your media is "text/xml", and I think you'll find that XML has a default charset of utf-8. Although in theory, if the MIME part doesn't include a charset, then the charset in the xml declaration should be used


text/xml is a subtype of text, meaning it should default to ISO-8859-1.

Quote:
When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined to have a default charset value of "ISO-8859-1" when received via HTTP.
elenzo
Posted: Thu Oct 31, 2013 9:42 am

Thanks for the replies. Indeed, the HTTPResponseHeader has text/xml as Content-Type, and according to what you said the CCSID should be set to ISO-8859-1 (819), but that's not what it is doing.
mqjeff
Posted: Thu Oct 31, 2013 9:51 am

Ok... again, I thought the rules were different for text/xml, but they're not. From RFC 3023 (http://www.ietf.org/rfc/rfc3023.txt):

8.5 Text/xml with Omitted Charset

Content-type: text/xml

{BOM}<?xml version="1.0" encoding="utf-16"?>

or

{BOM}<?xml version="1.0"?>

This example shows text/xml with the charset parameter omitted. In this case, MIME and XML processors MUST assume the charset is "us-ascii", the default charset value for text media types specified in [RFC2046]. The default of "us-ascii" holds even if the text/xml entity is transported using HTTP.

Omitting the charset parameter is NOT RECOMMENDED for text/xml. For example, even if the contents of the XML MIME entity are UTF-16 or UTF-8, or the XML MIME entity has an explicit encoding declaration, XML and MIME processors MUST assume the charset is "us-ascii".

So I'd pursue a PMR.
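
The three defaults discussed in this thread can be put side by side. This Python sketch only restates what has been quoted or reported here: the policy names, the `DEFAULTS` table and `effective_charset` are hypothetical, and the "broker" entry reflects the behaviour described earlier in the thread rather than any documented setting.

```python
from typing import Optional

# Default charset for text/xml when the Content-Type header carries no charset.
DEFAULTS = {
    "rfc2616": "iso-8859-1",  # HTTP/1.1 default for text/* (section 3.7.1)
    "rfc3023": "us-ascii",    # text/xml with omitted charset (section 8.5)
    "broker": "utf-8",        # behaviour reported in this thread (CCSID 1208)
}


def effective_charset(declared: Optional[str], policy: str) -> str:
    """Pick the charset a receiver following `policy` would use."""
    # An explicit charset always wins; the default applies only when it is missing.
    return declared.lower() if declared else DEFAULTS[policy]


for policy in DEFAULTS:
    print(policy, effective_charset(None, policy))
```

With an explicit charset in the header all three policies agree, which is why getting the sender to set it removes the ambiguity.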
mgk
Posted: Thu Oct 31, 2013 11:18 am

Can you post more of the BLOB response message - the XML-decl may say UTF-8 which would influence the parsing...

Kind regards,
_________________
MGK
The postings I make on this site are my own and don't necessarily represent IBM's positions, strategies or opinions.
elenzo
Posted: Thu Oct 31, 2013 11:20 am

mgk wrote:
Can you post more of the BLOB response message - the XML-decl may say UTF-8 which would influence the parsing...


The XML declaration of the response says ISO-8859-1; I've already checked.
mgk
Posted: Thu Oct 31, 2013 12:18 pm

So, I've checked the code and this behaviour is by design, since pragmatically the vast majority of customers who omit a charset are actually using utf-8. In fact, this case here is the first one I've seen that does seem to require ISO-8859-1. Is it possible to get the remote end to send the charset with the response?

Kind regards,
_________________
MGK
The postings I make on this site are my own and don't necessarily represent IBM's positions, strategies or opinions.
elenzo
Posted: Thu Oct 31, 2013 12:22 pm

mgk wrote:
So, I've checked the code and this behaviour is by design, since pragmatically the vast majority of customers who omit a charset are actually using utf-8. In fact, this case here is the first one I've seen that does seem to require ISO-8859-1. Is it possible to get the remote end to send the charset with the response?
Kind regards,


Thanks for the information. It's not possible to get the remote end to change; it is an external web service. The important thing is that you confirmed how WMB works, and that is enough for me. I'll fix it with my workaround.