Flexible validation using XML Schema |
Author |
Message
|
nize |
Posted: Mon Oct 05, 2009 3:40 am Post subject: Flexible validation using XML Schema |
|
|
Voyager
Joined: 02 Sep 2009 Posts: 90
|
Hi!
BACKGROUND
I have an mxsd file looking as follows:
Code: |
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema attributeFormDefault="unqualified"
elementFormDefault="unqualified"
targetNamespace=".. some text.. " version="0.2"
xmlns:tns="... some text ..." xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="credR" type="tns:credR">
<xs:annotation>
<xs:appinfo source="WMQI_APPINFO">
<MRMessage messageDefinition="/credR;XSDElementDeclaration/"/>
</xs:appinfo>
</xs:annotation>
</xs:element>
<xs:complexType name="credR">
<xs:annotation>
<xs:documentation>Mortgage status</xs:documentation>
</xs:annotation>
<xs:sequence>
<xs:element maxOccurs="unbounded" minOccurs="1" name="credRRec">
<xs:complexType>
<xs:annotation>
<xs:appinfo source="WMQI_APPINFO">
<MRComplexType composition="unorderedSet" content="open"/>
</xs:appinfo>
</xs:annotation>
<xs:sequence maxOccurs="1" minOccurs="1">
<xs:element name="distrNationCode">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:length value="2"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:element name="distrNo">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:length value="5"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:schema> |
It is used to validate XML messages read by a WMB5 flow. The messages may contain additional siblings of the elements (distrNo and distrNationCode) defined in the mxsd above. However, because I have set content="open", the parser throws an exception IF AND ONLY IF distrNo and/or distrNationCode are missing. This meets the requirements, since these are the only elements the flow needs.
The mxsd file above was generated from a modified subset of the original XML Schema that describes the messages the flow is expected to read.
The solution is working as planned.
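To make the behaviour described above concrete, here is a sketch of one credRRec record that should pass validation with content="open". The sibling names branchName and comment, and all the values, are made up for illustration; only distrNo and distrNationCode come from the mxsd.

```xml
<!-- Hypothetical record: extra siblings in any position, required elements in either order -->
<credRRec>
  <branchName>North</branchName>          <!-- not modelled; tolerated because content="open" -->
  <distrNo>12345</distrNo>                <!-- required, exactly 5 characters -->
  <comment>first</comment>
  <comment>second</comment>               <!-- unmodelled elements may repeat -->
  <distrNationCode>SE</distrNationCode>   <!-- required, exactly 2 characters -->
</credRRec>
```

Removing distrNo or distrNationCode from this record is what should trigger the parser exception.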
QUESTION
OK, so if the message set works as intended, what is the problem? The problem is that I didn't know how to model this flexibility using XML Schema alone. Instead I had to use MRM functionality (the parameter content="open"). Is there a way to model, in an XML Schema, that any elements (possibly with different names, each with any number of repeats and possibly children) may be siblings of the two elements I require, distrNo and distrNationCode, and that the required elements may appear in any order relative to the other siblings and to each other? Why would I want this? Because the solution would be easier to maintain if the validation logic could be understood just by reading the XML Schema. |
mqjeff |
Posted: Mon Oct 05, 2009 4:13 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
kimbert |
Posted: Mon Oct 05, 2009 4:24 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Requirement:
Validate that complex type 'credR' contains both 'distrNationCode' and 'distrNo'. Any other elements are allowed in any position, with any multiplicity, whether or not they are described in the message set.
XML Schema:
You would model this with wildcards ( xs:any )
Code: |
<xs:element maxOccurs="unbounded" minOccurs="1" name="credRRec">
<xs:complexType>
<xs:sequence maxOccurs="1" minOccurs="1">
<xs:any processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="distrNationCode">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:length value="2"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:any processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="distrNo">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:length value="5"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:any processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element> |
However, this will not work because the XML Schema specification requires your message definition to be unambiguous. The above definition is ambiguous.
If <distrNationCode> occurs first, it could match the first xs:any OR the explicit element definition which follows it. That violates the Unique Particle Attribution rule.
The usual way to get around this is to add namespace constraints to the xs:any. In effect, you are saying 'allow items from any of the following namespaces'. But your message doesn't use namespaces, so you can't use that workaround.
Options:
a) Stick with the MRM XML parser. Since your message definition is truly ambiguous, there's just a chance that a future change might break your scenario. However, that's unlikely, because MRM XML is not being enhanced.
b) Bite the bullet and actually model all of the elements which can occur in your input message. Then you can use xs:choice instead of xs:any, and avoid the ambiguity. If you really want to allow unknown elements in any namespace, you need to move your message definition into a target namespace, and then use an xs:any with a namespace constraint of '##other', which means 'allow any element, as long as it is not in the target namespace of this xsd'.
In my opinion, b) is the more expensive choice, but will probably repay the investment.
If you choose b), then you must switch to XMLNSC - the MRM parser is not standards-compliant when it processes xs:any elements. |
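A minimal sketch of the namespace-based variant of option b), assuming the message is moved into a target namespace. The namespace URI urn:example:credR and the type names nationCodeType and distrNoType are placeholders, not part of the original message set. The xs:choice of two sequences is one possible way to accept the two required elements in either order; this has not been tested against the broker.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the namespace URI and the named types are hypothetical -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:tns="urn:example:credR"
           targetNamespace="urn:example:credR"
           elementFormDefault="qualified">
  <xs:simpleType name="nationCodeType">
    <xs:restriction base="xs:string"><xs:length value="2"/></xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="distrNoType">
    <xs:restriction base="xs:string"><xs:length value="5"/></xs:restriction>
  </xs:simpleType>
  <xs:element name="credRRec">
    <xs:complexType>
      <xs:sequence>
        <!-- Wildcards only match elements outside the target namespace, -->
        <!-- so they can never compete with the two named elements. -->
        <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
        <xs:choice>
          <xs:sequence>
            <xs:element name="distrNationCode" type="tns:nationCodeType"/>
            <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
            <xs:element name="distrNo" type="tns:distrNoType"/>
          </xs:sequence>
          <xs:sequence>
            <xs:element name="distrNo" type="tns:distrNoType"/>
            <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
            <xs:element name="distrNationCode" type="tns:nationCodeType"/>
          </xs:sequence>
        </xs:choice>
        <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Because each branch of the xs:choice starts with a different named element and every wildcard is restricted to ##other, a validator can always tell which particle an incoming element belongs to. The cost is that unknown elements must live in a namespace other than the target namespace.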
nize |
Posted: Tue Oct 06, 2009 12:40 am Post subject: |
|
|
Voyager
Joined: 02 Sep 2009 Posts: 90
|
Quote: |
Requirement:
Validate that complex type 'credR' contains both 'distrNationCode' and 'distrNo'. Any other elements are allowed in any position, with any multiplicity, whether or not they are described in the message set. |
Well put! I am impressed by your patience in working out my wording, and by the concise and precise way you express it. One detail, though: the parent of 'distrNationCode' and 'distrNo' is called 'credRRec'.
Quote: |
However, this will not work because the XML Schema specification requires your message definition to be unambiguous. The above definition is ambiguous.
If <distrNationCode> occurs first, it could match the first xs:any OR the explicit element definition which follows it. That violates the Unique Particle Attribution rule. |
Yes, I discovered that when trying.
Quote: |
But your message doesn't use namespaces, so you can't use that workaround. |
I read up on processContents="lax" at http://www.zvon.org/xxl/XMLSchemaTutorial/Output/ser_import_st1.html, which was interesting. However, as I understand it, you still need to KNOW that the "anonymous" elements belong to namespaces separate from those of the known, "identified" elements. This is a limitation.
Quote: |
a) Stick with the MRM XML parser. Since your message definition is truly ambiguous, there's just a chance that a future change might break your scenario. However, that's unlikely, because MRM XML is not being enhanced.
|
Yes, I agree that the definition is ambiguous, so changes in the incoming message structures will probably not cause problems for my flow. The real issue is that I would like, if possible, to solve this kind of requirement in a prettier way: one in which the architects only need to study the XML Schema to understand the validation logic. That is the core of my goal.
Quote: |
b) Bite the bullet and actually model all of the elements which can occur in your input message. Then you can use xs:choice instead of xs:any, and avoid the ambiguity. If you really want to allow unknown elements in any namespace, you need to move your message definition into a target namespace, and then use an xs:any with a namespace constraint of '##other', which means 'allow any element, as long as it is not in the target namespace of this xsd'.
|
If I understood your suggestion correctly, I hope you see why it would not fulfil my purposes: if I model the elements that can occur in the incoming message today, the sending application cannot add new elements without my having to change the flow. I guess one way around it would be to ask them to put all elements added in new releases into separate namespaces (perhaps denoting the release version).
From the information I got here, it still appears to me that XML Schema validation has rather big limitations! I believe that the requirement (which you express so nicely) is not too much to ask (even with the added context that the other elements could be in the same namespace). If you coded your own validator (in any language), I believe this would not be too difficult to achieve. |
kimbert |
Posted: Tue Oct 06, 2009 2:45 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
I agree with everything that you said apart from the last paragraph. I don't agree that your requirement is a simple one. XML Schema does allow you to do something similar, but only if you have defined a namespace for the elements in your message. That's best practice anyway.
In general, your requirement leads to ambiguous message definitions, and that is something that the W3C XML Schema panel made strenuous efforts to avoid - quite rightly, in my opinion.
You might want to consider this as an alternative to my previous suggestions:
- use XMLNSC, switch off validation, validate in the message flow ( it's only two fields, after all).
- for the future, ask the sender to add a namespace to the message. That would allow you to use xs:any with ##other to allow any element from any *other* namespace. Process the new messages in a different flow ( or a different part of the same flow ) which has validation enabled. |
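As an illustration of the second suggestion, a namespaced message might look like the sketch below. Both namespace URIs and the extra element names (branchName, note) are invented for the example; it assumes a schema whose target namespace is urn:example:credR and whose wildcards are declared with namespace="##other".

```xml
<!-- Hypothetical instance: required elements in the target namespace, extras elsewhere -->
<cr:credRRec xmlns:cr="urn:example:credR"
             xmlns:v2="urn:example:credR:v2">
  <v2:branchName>North</v2:branchName>         <!-- matched by an xs:any ##other wildcard -->
  <cr:distrNo>12345</cr:distrNo>               <!-- named element, validated strictly -->
  <v2:note>added in a later release</v2:note>  <!-- new elements need no schema change -->
  <cr:distrNationCode>SE</cr:distrNationCode>  <!-- named element, validated strictly -->
</cr:credRRec>
```

Note that in XML Schema 1.0 a ##other wildcard does not match elements that have no namespace at all, so the sender's new elements would need a namespace of their own.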
nize |
Posted: Tue Oct 06, 2009 3:44 am Post subject: |
|
|
Voyager
Joined: 02 Sep 2009 Posts: 90
|
Quote: |
XML Schema does allow you to do something similar, but only if you have defined a namespace for the elements in your message. That's best practice anyway.
...
- for the future, ask the sender to add a namespace to the message. That would allow you to use xs:any with ##other to allow any element from any *other* namespace. Process the new messages in a different flow ( or a different part of the same flow ) which has validation enabled.
|
Yes, I agree that namespaces are advisable, but even if they were used, the sender would need to use separate namespaces for the new "unknown" elements, and then we lose the point.
Quote: |
use XMLNSC, switch off validation, validate in the message flow ( it's only two fields, after all).
|
Yeah, that is an alternative, but really there is no difference between that approach and the one I am currently using. Maybe the one you suggest is somewhat simpler, but you would still lose the nice separation of data description from processing logic (which I think is "pretty" when using schemas and message sets).
Thanks for your help! I will still keep my eyes open for a flexible (yet easy-to-understand) standard for data validation. |