Author |
Message
|
mattfarney |
Posted: Thu Jan 12, 2017 11:20 am Post subject: Trailing space causes XMLNS parser to fail to parse. |
|
|
 Disciple
Joined: 17 Jan 2006 Posts: 167 Location: Ohio
|
I'm noticing that a trailing space at the end of an XML message causes the XMLNS parser to fail to parse.
These are contrived, simple examples, but they illustrate my point better.
This works:
Code: |
<?xml version="1.0" encoding="UTF-8" ?><GetDateInfo lang="en-US" environment="Production" revision="1.0" xmlns:oa="http://www.openapplications.org/oagis" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><oa:ApplicationArea></oa:ApplicationArea><DataArea><GetDate><Date>January 12, 2017</Date></GetDate></DataArea></GetDateInfo> |
This does not work.
Code: |
<?xml version="1.0" encoding="UTF-8" ?><GetDateInfo lang="en-US" environment="Production" revision="1.0" xmlns:oa="http://www.openapplications.org/oagis" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><oa:ApplicationArea></oa:ApplicationArea><DataArea><GetDate><Date>January 12, 2017</Date></GetDate></DataArea></GetDateInfo>* |
where * stands for a single whitespace character.
The code uses fieldname(Root.XMLNS.*[<]) to find the BOD name.
In the first example, it would return GetDateInfo. In the second, it returns NULL.
I have two questions.
1. Is this a known thing? I would have expected the trailing whitespace to simply have been ignored.
2. I can certainly detect this and handle it, but since I would have to do this before the parser is called, I would have to do it for a significantly larger set of transactions than I would like to. I didn't see any options at the parser level that look like they would help.
-mf |
mqjeff |
Posted: Thu Jan 12, 2017 11:26 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
Does it fail with the XMLNSC parser? _________________ chmod -R ugo-wx / |
mattfarney |
Posted: Thu Jan 12, 2017 11:36 am Post subject: |
|
|
 Disciple
Joined: 17 Jan 2006 Posts: 167 Location: Ohio
|
Based on a quick test, it looks like the XMLNSC parser does not have that issue.
It correctly reports the BOD name.
-mf |
Vitor |
Posted: Thu Jan 12, 2017 12:56 pm Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
mattfarney wrote: |
Based on a quick test, it looks like the XMLNSC parser does not have that issue. |
So given that there's no apparent reason to use XMLNS rather than XMLNSC, you're good.
Is there a not-apparent reason? _________________ Honesty is the best policy.
Insanity is the best defence. |
mattfarney |
Posted: Thu Jan 12, 2017 1:17 pm Post subject: |
|
|
 Disciple
Joined: 17 Jan 2006 Posts: 167 Location: Ohio
|
When I see something like this (trailing space causes parse failure) and the behavior isn't what I expect, I begin to doubt some of my assumptions - hence my question.
So basically, this can be summed up as:
1. XMLNS has an issue with a trailing space.
2. XMLNSC is better than XMLNS except in narrow circumstances and does not have the problem.
Solution: move to XMLNSC if possible.
After reading some of the other threads on the parser topic, this line came up:
Quote: |
Tip: If you are using XPath to access the message tree, and you require the message tree to conform as closely as possible to the XML data model, use the XMLNS domain. |
Can anyone give me an example where this would be true?
-mf |
Vitor |
Posted: Thu Jan 12, 2017 1:23 pm Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
mattfarney wrote: |
Can anyone give me an example where this would be true? |
Where the XPath relies on whitespace or something else that would be eliminated by the XMLNSC domain. _________________ Honesty is the best policy.
Insanity is the best defence. |
mattfarney |
Posted: Thu Jan 12, 2017 1:30 pm Post subject: |
|
|
 Disciple
Joined: 17 Jan 2006 Posts: 167 Location: Ohio
|
So an XPath like
Employee/Name/First Name
vs
Employee/Name/FirstName
-mf |
mqjeff |
Posted: Thu Jan 12, 2017 1:32 pm Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
There should be parser options for how the XMLNSC parser handles whitespace...
 _________________ chmod -R ugo-wx / |
timber |
Posted: Thu Jan 12, 2017 2:54 pm Post subject: |
|
|
 Grand Master
Joined: 25 Aug 2015 Posts: 1292
|
That example is not a good one. XMLNSC will handle it in exactly the same way as XMLNS.
There are no options for controlling how XMLNSC handles white space. The only option is 'Suppress mixed content', which does exactly what it says on the tin. Mixed content is text (not just white space) that occurs between the closing tag of one element and the opening tag of the next.
e.g.
Code: |
<parent>
<child1> This is child1 content, including the white space </child1>
This is mixed content
<child2>This is child2 content </child2>
</parent> |
timber |
Posted: Thu Jan 12, 2017 2:57 pm Post subject: |
|
|
 Grand Master
Joined: 25 Aug 2015 Posts: 1292
|
I remember that line being added to the info center just before XMLNSC was released in v6.1. Here are the reasons:
XMLNSC builds a different tree structure than XMLNS. Apart from the obvious stuff like omitting mixed content, XMLNSC represents a simple-valued tag like <number>1</number> using a single NameValue node instead of a Name node with a child Value node.
Normally this does not affect the ESQL, Java or XPath that you write. But if you try very hard you can write ESQL, Java or XPath that will notice the difference. IBM is very conservative (cautious) about technical edge cases like this, so the warning was put into the Knowledge Center topic to warn any customers who use large and complex XPaths in their mapping code. But simple queries of the kind that everybody uses in normal mapping code will not be affected. In fact, I would be hard pushed to come up with a good example of a real-world piece of XSLT or XPath that would execute incorrectly against an XMLNSC message tree.
There is an obvious case: look at the XML in my previous post. If the XPath uses position(/parent/child2) then XMLNS will return 3 and XMLNSC will return 2. That is expected, and obvious - XMLNSC has discarded the mixed content, which changes the position of child2 within its parent. Also, the person who wrote the XPath should be taken out and shot because they should not be relying on positional indexing in an XML document.
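The 3-versus-2 difference can be reproduced outside the broker. The following is only a rough analogy, assuming Python's standard xml.dom.minidom as a stand-in for a tree that keeps mixed content; it is not the XMLNS or XMLNSC parser, and the helper name positions_of is made up for illustration. The XML is a compact version of the earlier example so that the mixed content is the only extra node.

```python
from xml.dom import minidom

# Compact variant of the <parent> example: no whitespace between tags,
# so the only non-element child is the mixed content text.
DOC = "<parent><child1>one</child1>mixed content<child2>two</child2></parent>"


def positions_of(tag, xml_text):
    """Return 1-based positions of <tag> under the root element:
    (position among all child nodes, position among element children only)."""
    parent = minidom.parseString(xml_text).documentElement
    all_nodes = list(parent.childNodes)
    elements = [n for n in all_nodes if n.nodeType == n.ELEMENT_NODE]
    target = next(n for n in elements if n.tagName == tag)
    return all_nodes.index(target) + 1, elements.index(target) + 1


keeping_mixed, dropping_mixed = positions_of("child2", DOC)
print(keeping_mixed, dropping_mixed)  # prints "3 2"
```

Counting every child node corresponds to a tree that retains mixed content; counting only elements corresponds to a tree that has discarded it.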
If anybody reading this thread wants to come up with other examples where you get different results from XMLNS and XMLNSC then please post them here. |
rekarm01 |
Posted: Thu Jan 12, 2017 5:25 pm Post subject: Re: Trailing space causes XMLNS parser to fail to parse. |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 1415
|
mattfarney wrote: |
I'm noticing that a trailing space at the end of an XML message causes the XMLNS parser to fail to parse. |
"Fail to parse" suggests that some sort of exception occurs. That's not the case; it parses just fine, although maybe not as expected. Use a Trace node to see how the XMLNS parser handles the trailing space.
mattfarney wrote: |
The code uses fieldname(Root.XMLNS.*[<]) to find the BOD name.
In the first example, it would return GetDateInfo. In the second, it returns NULL. |
In the second example, the last child of Root.XMLNS is an unnamed field of type (XML.Whitespace), for which FIELDNAME() should return NULL.
mattfarney wrote: |
I have two questions.
1. Is this a known thing? |
Yes, this is a known and documented thing. The XMLNS parser does not discard such white space. The XMLNSC parser discards it by default, but can be configured to keep it, with the "retain mixed content" option.
mattfarney wrote: |
2. I can certainly detect this and handle it, but since I would have to do this before the parser is called, I would have to do it for a significantly larger set of transactions than I would like to. I didn't see any options at the parser level that look like they would help. |
That's not really a question, but another option is to qualify the ESQL reference with a field type; for example: FIELDNAME(Root.XMLNS.(XML.Element)*)
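To see why the last-child lookup returns NULL while a type-qualified lookup does not, here is a toy model in Python. It is a sketch under stated assumptions, not broker code: the Field class, both helper functions, and the exact field-type strings in the lists are hypothetical stand-ins for what a Trace node would show under Root.XMLNS.

```python
from typing import List, NamedTuple, Optional


class Field(NamedTuple):
    """Toy stand-in for one child of Root.XMLNS (not a broker API)."""
    field_type: str       # illustrative type label, e.g. "XML.Element"
    name: Optional[str]   # None models an unnamed field


def last_child_name(children: List[Field]) -> Optional[str]:
    """Mimics taking the name of the last child, whatever its type."""
    return children[-1].name if children else None


def first_element_name(children: List[Field]) -> Optional[str]:
    """Mimics qualifying the lookup with an element field type."""
    for field in children:
        if field.field_type == "XML.Element":
            return field.name
    return None


# Assumed shape of the tree without and with the trailing space.
clean = [Field("XML.XmlDecl", "XmlDecl"), Field("XML.Element", "GetDateInfo")]
trailing = clean + [Field("XML.Whitespace", None)]

print(last_child_name(clean))        # GetDateInfo
print(last_child_name(trailing))     # None
print(first_element_name(trailing))  # GetDateInfo
```

The type-qualified lookup is insensitive to whatever non-element fields surround the root element, which is why it avoids having to strip the message before parsing.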
mattfarney wrote: |
After reading some of the other threads on the parser topic, this line came up:
Quote: |
Tip: If you are using XPath to access the message tree, and you require the message tree to conform as closely as possible to the XML data model, use the XMLNS domain. |
|
There are also other tips here, to help decide which parser to use where.
mattfarney wrote: |
Can anyone give me an example where this would be true? |
From the same page, "if you are using certain XPath expressions to access the message tree, and the relative position of parent and child nodes is important, or if you are accessing text nodes directly." For example, XPath models a simple element as a parent element node with child text node; XMLNS models it similarly, as a parent Name field with child Value field, whereas XMLNSC models it more compactly, as a single NameValue field.
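The two shapes can be put side by side with a small Python sketch. It assumes xml.dom.minidom as an example of the "element node with child text node" model; the dictionaries and the function names two_node_model and name_value_model are invented for illustration and do not reflect how either broker parser stores its tree.

```python
from xml.dom import minidom


def two_node_model(xml_text):
    """Element plus a separate child text node, as in the XPath data model."""
    elem = minidom.parseString(xml_text).documentElement
    text = elem.firstChild  # the text node is a child of the element
    return {"name": elem.tagName, "children": [{"value": text.data}]}


def name_value_model(xml_text):
    """Hypothetical compact form: one record holding both name and value."""
    elem = minidom.parseString(xml_text).documentElement
    return {"name": elem.tagName, "value": elem.firstChild.data}


print(two_node_model("<number>1</number>"))
print(name_value_model("<number>1</number>"))
```

An expression that navigates to the text node as a child only has something to land on in the first shape, which is the kind of case the quoted tip is warning about.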
mattfarney wrote: |
So an XPath like
Employee/Name/First Name |
No. XML does not allow white space as part of the element name. |
Vitor |
Posted: Fri Jan 13, 2017 5:34 am Post subject: Re: Trailing space causes XMLNS parser to fail to parse. |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
rekarm01 wrote: |
mattfarney wrote: |
So an XPath like
Employee/Name/First Name |
No. XML does not allow white space as part of the element name. |
I echo the comments of my associates, and for the record that's not what I meant by whitespace; I meant what's been referred to as "mixed content" _________________ Honesty is the best policy.
Insanity is the best defence. |
mattfarney |
Posted: Fri Jan 13, 2017 6:37 am Post subject: |
|
|
 Disciple
Joined: 17 Jan 2006 Posts: 167 Location: Ohio
|
Thanks for all of the information, esp. rekarm01.
It was very helpful.
-mf |