Author |
Message
|
jborella |
Posted: Tue Nov 02, 2010 12:56 am Post subject: |
|
|
Apprentice
Joined: 04 Jun 2009 Posts: 26
|
kimbert wrote: |
If I have understood this correctly:
- the XML has an XML declaration which accurately describes the encoding of the XML document.
- the FileInput node is not reading the XML declaration to determine the encoding. |
Exactly.
kimbert wrote: |
That may be a deliberate design decision aimed at making the FileInput node behave consistently with other nodes. |
Can you elaborate on that? I'm not sure I understand what you mean.
kimbert wrote: |
I think we should wait and see what the response to the PMR is. |
Good idea, though I'm interested in your thoughts on consistency with other nodes. I've opened a PMR and am in the process of providing details of the problems I'm experiencing. |
|
kimbert |
Posted: Tue Nov 02, 2010 1:21 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
I mean that the MQInput node takes the encoding from the transport ( the MQMD header ) and not from the XML declaration. The FileInput node also takes the encoding from the 'transport' ( I understand your reservations about that use of the term).
The FileInput node is intended to read any type of data. Non-XML formats typically do not carry the encoding in the file so a node property is a good way to specify the encoding. |
|
jborella |
Posted: Tue Nov 02, 2010 1:31 am Post subject: |
|
|
Apprentice
Joined: 04 Jun 2009 Posts: 26
|
kimbert wrote: |
I mean that the MQInput node takes the encoding from the transport ( the MQMD header ) and not from the XML declaration. The FileInput node also takes the encoding from the 'transport' ( I understand your reservations about that use of the term). |
I think that's exactly what this discussion is about. In MQ, the MQMD header is an external header, so it feels natural to use the MQMD.CodedCharSetId field to decide the character encoding. How can an internal hardcoded value be considered an external header?
kimbert wrote: |
The FileInput node is intended to read any type of data. Non-XML formats typically do not carry the encoding in the file so a node property is a good way to specify the encoding. |
Yes and no. I've configured the FileInput node to use the XMLNSC parser, and that should indicate pretty clearly that I want to parse XML. |
|
fjb_saper |
Posted: Tue Nov 02, 2010 11:52 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
jborella wrote: |
kimbert wrote: |
The FileInput node is intended to read any type of data. Non-XML formats typically do not carry the encoding in the file so a node property is a good way to specify the encoding. |
Yes and no. I've configured the FileInput node to use the XMLNSC parser, and that should indicate pretty much that I want to parse XML. |
You are right, of course, but the XMLNS / XMLNSC parsers can also parse XML data without any encoding declaration, as long as the CCSID of the data is provided to the parser.
The problem for you, as I see it, is that you would want the node to behave in the following fashion:
- global default: the CCSID of the box
- 1st override: the CCSID on the node
- 2nd override: the encoding in the XML declaration of the file
Have fun  _________________ MQ & Broker admin |
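To make that precedence concrete, here is a minimal ESQL sketch. It is not something the broker does today: `ResolveCcsid` and its parameter names are hypothetical, and it assumes each source is passed as NULL when it was not supplied.

```esql
-- Hypothetical helper, not part of the broker product.
-- Assumption: a CCSID that is not available is passed as NULL.
CREATE FUNCTION ResolveCcsid(IN declaredCcsid INTEGER,
                             IN nodeCcsid INTEGER,
                             IN systemCcsid INTEGER) RETURNS INTEGER
BEGIN
    -- COALESCE returns its first non-NULL argument, so the XML declaration
    -- wins over the node property, which wins over the system default.
    RETURN COALESCE(declaredCcsid, nodeCcsid, systemCcsid);
END;
```

The strongest override goes first in the argument list, so the box default is only used when neither the file nor the node says anything.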
|
jborella |
Posted: Tue Nov 23, 2010 12:34 am Post subject: |
|
|
Apprentice
Joined: 04 Jun 2009 Posts: 26
|
As expected I got a reply from IBM stating:
"The CCSID is mandatory property on File Input node and defaults to broker system default value. FileInput node uses stream based parser which interprets the physical bit stream of an incoming message based on the CCSID configured on the node and hence there is a need to know the CCSID before the node parsers the message data. There is currently no functionality in the broker product to allow the XML parsers to change CCSID and Encoding based on the contents of the XML prolog. This is a current limitation in the product."
The only good thing is that they admit it's a limitation in the product, but that doesn't help me much.
We now use the following workaround. We read in the XML with the FileInput node as a BLOB, and then use the ESQL:
Code: |
CREATE COMPUTE MODULE LOF_Mainflow_mf_v1_SetCharacterSet
    CREATE FUNCTION Main() RETURNS BOOLEAN
    BEGIN
        CALL CopyMessageHeaders();
        -- 546 = little-endian encoding; 850 = CP850, used only for the first pass
        DECLARE encoding INTEGER 546;
        DECLARE ccsid INTEGER 850;
        DECLARE blobData BLOB InputRoot.BLOB.BLOB;
        -- Remove Byte Order Mark (anything before the first '<')
        DECLARE tmpStr CHAR LTRIM(CAST(blobData AS CHAR CCSID ccsid ENCODING encoding));
        IF NOT STARTSWITH(tmpStr, '<') THEN
            SET tmpStr = SUBSTRING(tmpStr FROM POSITION('<' IN tmpStr));
            SET blobData = CAST(tmpStr AS BLOB CCSID ccsid ENCODING encoding);
        END IF;
        -- Parse with CP850 to find the encoding in the XML declaration
        CREATE LASTCHILD OF OutputRoot DOMAIN('XMLNSC') PARSE(blobData, encoding, ccsid);
        -- If the XML declares windows-1252, use that
        -- Otherwise (other or missing encoding), assume UTF-8
        IF UPPER(OutputRoot.XMLNSC.(XMLNSC.XmlDeclaration)*.Encoding) = 'WINDOWS-1252' THEN
            SET ccsid = 1252;
        ELSE
            SET ccsid = 1208;
        END IF;
        -- Discard the first parse and reparse the BLOB with the detected CCSID
        SET OutputRoot.XMLNSC = NULL;
        CREATE LASTCHILD OF OutputRoot DOMAIN('XMLNSC') PARSE(blobData, encoding, ccsid);
        SET OutputRoot.Properties.Encoding = encoding;
        SET OutputRoot.Properties.CodedCharSetId = ccsid;
        RETURN TRUE;
    END;

    -- Copies every child of InputRoot except the last one (the message body)
    CREATE PROCEDURE CopyMessageHeaders() BEGIN
        DECLARE I INTEGER 1;
        DECLARE J INTEGER;
        SET J = CARDINALITY(InputRoot.*[]);
        WHILE I < J DO
            SET OutputRoot.*[I] = InputRoot.*[I];
            SET I = I + 1;
        END WHILE;
    END;

    CREATE PROCEDURE CopyEntireMessage() BEGIN
        SET OutputRoot = InputRoot;
    END;
END MODULE; |
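The module above only recognises windows-1252 and treats everything else as UTF-8. If more encodings turn up, the IF/ELSE could be swapped for a small lookup along these lines. This is only a sketch: `CcsidForXmlEncoding` is a hypothetical name, and the list is limited to ASCII-compatible encodings because the first pass reads the declaration as CP850 (a UTF-16 file would not get that far).

```esql
-- Hypothetical helper: maps the encoding name from the XML declaration
-- to a CCSID. Unknown or missing names fall back to UTF-8 (1208).
CREATE FUNCTION CcsidForXmlEncoding(IN encodingName CHARACTER) RETURNS INTEGER
BEGIN
    RETURN CASE UPPER(encodingName)
        WHEN 'UTF-8'        THEN 1208
        WHEN 'ISO-8859-1'   THEN 819
        WHEN 'WINDOWS-1252' THEN 1252
        ELSE 1208
    END;
END;
```

A NULL name (no encoding in the declaration) matches none of the WHEN branches and so also ends up as 1208, which is the same default the original workaround uses.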
|
|
kimbert |
Posted: Tue Nov 23, 2010 3:35 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Glad you got it working. Thank you for being a good citizen and posting the solution. |
|
jborella |
Posted: Wed Dec 01, 2010 1:04 am Post subject: |
|
|
Apprentice
Joined: 04 Jun 2009 Posts: 26
|
kimbert wrote: |
Glad you got it working. Thank you for being a good citizen and posting the solution. |
You are welcome. I always find it frustrating when a thread isn't closed with some kind of solution or conclusion. Thank you all for the prompt and insightful responses. |
|