
MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » Problem parsing XML with FileInputNode.

Problem parsing XML with FileInputNode. (page 2 of 2)
jborella
Posted: Tue Nov 02, 2010 12:56 am

Apprentice

Joined: 04 Jun 2009
Posts: 26

kimbert wrote:
If I have understood this correctly:
- the XML has an XML declaration which accurately describes the encoding of the XML document.
- the FileInput node is not reading the XML declaration to determine the encoding.

Exactly.
kimbert wrote:
That may be a deliberate design decision aimed at making the FileInput node behave consistently with other nodes.

Can you elaborate on that? I'm not sure I understand what you mean.
kimbert wrote:
I think we should wait and see what the response to the PMR is.

Good idea, though I'm interested in your thoughts about consistency with other nodes. I've opened a PMR and am in the process of providing details of the problems I'm experiencing.
kimbert
Posted: Tue Nov 02, 2010 1:21 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

I mean that the MQInput node takes the encoding from the transport (the MQMD header) and not from the XML declaration. The FileInput node also takes the encoding from the 'transport' (I understand your reservations about that use of the term).

The FileInput node is intended to read any type of data. Non-XML formats typically do not carry the encoding in the file so a node property is a good way to specify the encoding.
jborella
Posted: Tue Nov 02, 2010 1:31 am

Apprentice

Joined: 04 Jun 2009
Posts: 26

kimbert wrote:
I mean that the MQInput node takes the encoding from the transport (the MQMD header) and not from the XML declaration. The FileInput node also takes the encoding from the 'transport' (I understand your reservations about that use of the term).

I think that's exactly what this discussion is about. In MQ, the MQMD header is external to the message body, so it feels natural to use the MQMD.CodedCharSetId field to determine the character encoding. But how can a value hardcoded on the node be considered an external header?

kimbert wrote:
The FileInput node is intended to read any type of data. Non-XML formats typically do not carry the encoding in the file so a node property is a good way to specify the encoding.

Yes and no. I've configured the FileInput node to use the XMLNSC parser, and that should make it pretty clear that I want to parse XML.
fjb_saper
Posted: Tue Nov 02, 2010 11:52 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

jborella wrote:
kimbert wrote:
The FileInput node is intended to read any type of data. Non-XML formats typically do not carry the encoding in the file so a node property is a good way to specify the encoding.

Yes and no. I've configured the FileInput node to use the XMLNSC parser, and that should indicate pretty much that I want to parse XML.


You are right, of course, but the XMLNS / XMLNSC parsers can also parse XML data that has no encoding declaration at all, as long as the CCSID of the data is provided to the parser.

The problem for you, as I see it, is that you want the node to behave as follows:
  1. Global default: the CCSID of the box.
  2. First override: the CCSID set on the node.
  3. Second override: the CCSID given by the encoding in the XML declaration of the file.
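
The precedence described above can be sketched outside the broker. This is only an illustration in Python, not broker behaviour; the function name effective_ccsid and its parameters are made up for the example.

```python
from typing import Optional


def effective_ccsid(box_default: int,
                    node_ccsid: Optional[int] = None,
                    declared_ccsid: Optional[int] = None) -> int:
    """Pick the CCSID to parse with, most specific source first.

    Hypothetical helper: the names are illustrative and do not
    correspond to any broker API.
    """
    # 2nd override: encoding named in the XML declaration of the file
    if declared_ccsid is not None:
        return declared_ccsid
    # 1st override: CCSID configured on the node
    if node_ccsid is not None:
        return node_ccsid
    # Global default: CCSID of the box
    return box_default
```

With a box default of 850, a node setting of 1208 and a file declaring windows-1252 (CCSID 1252), the function returns 1252. Leave out the declaration and it returns 1208. Leave out both and it returns 850.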


Have fun
_________________
MQ & Broker admin
jborella
Posted: Tue Nov 23, 2010 12:34 am

Apprentice

Joined: 04 Jun 2009
Posts: 26

As expected, I got a reply from IBM stating:

"The CCSID is mandatory property on File Input node and defaults to broker system default value. FileInput node uses stream based parser which interprets the physical bit stream of an incoming message based on the CCSID configured on the node and hence there is a need to know the CCSID before the node parsers the message data. There is currently no functionality in the broker product to allow the XML parsers to change CCSID and Encoding based on the contents of the XML prolog. This is a current limitation in the product."
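
For readers wondering how the encoding could be known before parsing at all: for ASCII-compatible encodings (UTF-8, ISO-8859-1, windows-1252, CP850 and so on) the XML declaration itself consists of plain ASCII bytes, so it can be read straight from the raw bytes. A rough sketch in Python, outside the broker; the function name declared_encoding is made up for the example, and this will not work for UTF-16 or EBCDIC input.

```python
import re
from typing import Optional

# Matches encoding="..." or encoding='...' inside a leading XML declaration.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(data: bytes) -> Optional[str]:
    """Return the encoding named in the XML declaration, or None.

    Assumes an ASCII-compatible encoding; hypothetical helper,
    not part of any broker API.
    """
    head = data[:200]  # the declaration, if present, is at the very start
    if head.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 byte order mark
        head = head[3:]
    match = _DECL.match(head)
    return match.group(1).decode("ascii") if match else None
```

This only reads the label. It does not check that the rest of the file really is in that encoding.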

The only good thing is that they admit it's a limitation in the product, but that doesn't help me much.

We now use the following workaround. We read the XML in with the FileInput node as a BLOB, and then run this ESQL in a Compute node:

Code:
CREATE COMPUTE MODULE LOF_Mainflow_mf_v1_SetCharacterSet
    CREATE FUNCTION Main() RETURNS BOOLEAN
    BEGIN
        CALL CopyMessageHeaders();
        DECLARE encoding INTEGER 546;  -- little-endian
        DECLARE ccsid INTEGER 850;     -- CP850, used only for the first pass
        DECLARE blobData BLOB InputRoot.BLOB.BLOB;

        -- Remove a Byte Order Mark (or anything else) in front of the first '<'
        DECLARE tmpStr CHAR LTRIM(CAST(blobData AS CHAR CCSID ccsid ENCODING encoding));
        IF NOT STARTSWITH(tmpStr, '<') THEN
            SET tmpStr = SUBSTRING(tmpStr FROM POSITION('<' IN tmpStr));
            SET blobData = CAST(tmpStr AS BLOB CCSID ccsid ENCODING encoding);
        END IF;

        -- First pass: parse with CP850 just to read the encoding in the XML declaration
        CREATE LASTCHILD OF OutputRoot DOMAIN('XMLNSC') PARSE(blobData, encoding, ccsid);

        -- Only windows-1252 is recognised here.
        -- Any other declared encoding, or no declaration at all, is treated as UTF-8.
        IF UPPER(OutputRoot.XMLNSC.(XMLNSC.XmlDeclaration)*.Encoding) = 'WINDOWS-1252' THEN
            SET ccsid = 1252;
        ELSE
            SET ccsid = 1208;
        END IF;

        -- Second pass: discard the first tree and reparse the BLOB with the chosen CCSID
        SET OutputRoot.XMLNSC = NULL;
        CREATE LASTCHILD OF OutputRoot DOMAIN('XMLNSC') PARSE(blobData, encoding, ccsid);
        SET OutputRoot.Properties.Encoding = encoding;
        SET OutputRoot.Properties.CodedCharSetId = ccsid;

        RETURN TRUE;
    END;

    -- Copies every child of InputRoot except the last one (the message body)
    CREATE PROCEDURE CopyMessageHeaders() BEGIN
        DECLARE I INTEGER 1;
        DECLARE J INTEGER;
        SET J = CARDINALITY(InputRoot.*[]);
        WHILE I < J DO
            SET OutputRoot.*[I] = InputRoot.*[I];
            SET I = I + 1;
        END WHILE;
    END;

    CREATE PROCEDURE CopyEntireMessage() BEGIN
        SET OutputRoot = InputRoot;
    END;
END MODULE;
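
If more encodings than windows-1252 and UTF-8 need to be handled, the IF/ELSE in the middle could become a lookup table. The idea is sketched here in Python rather than ESQL, purely as an illustration. The table is a small assumed subset, and the names CCSID_BY_ENCODING, strip_utf8_bom and ccsid_for are made up for the example.

```python
import codecs
from typing import Optional

# Illustrative subset only; extend it for the encodings you actually receive.
CCSID_BY_ENCODING = {
    "UTF-8": 1208,
    "ISO-8859-1": 819,
    "WINDOWS-1252": 1252,
    "IBM850": 850,
}
DEFAULT_CCSID = 1208  # same fallback as the ESQL above: treat as UTF-8


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 byte order mark, if there is one."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def ccsid_for(encoding_name: Optional[str]) -> int:
    """Map an XML encoding label to a CCSID, falling back to UTF-8."""
    if encoding_name is None:
        return DEFAULT_CCSID
    return CCSID_BY_ENCODING.get(encoding_name.strip().upper(), DEFAULT_CCSID)
```

Like the ESQL, this silently falls back to UTF-8 for a label it does not know. Depending on the data, raising an error for unknown labels might be the safer choice.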
kimbert
Posted: Tue Nov 23, 2010 3:35 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Glad you got it working. Thank you for being a good citizen and posting the solution.
jborella
Posted: Wed Dec 01, 2010 1:04 am

Apprentice

Joined: 04 Jun 2009
Posts: 26

kimbert wrote:
Glad you got it working. Thank you for being a good citizen and posting the solution.

You are welcome. I always find it frustrating when a thread isn't closed with some kind of solution or conclusion. Thank you all for the prompt and insightful responses.