Thursday, March 23, 2017

What's new in XSD 1.1

The XSD 1.1 standard builds on XSD 1.0, adding a number of features that were conspicuous by their absence. It solves many of the problems that keep coming up in forums and discussion groups, and it remains compatible with the previous standard, so there is no need to upgrade existing schemas.

Liquid Studio 2017 supports XSD 1.1 as part of the graphical XML Schema editor.

Summary of what’s new in XSD 1.1

  • xs:alternative – provides value-driven typing: if X=true then the element is of type Y.
  • xs:assert element – allows cross-field validation via XPath predicate expressions.
  • xs:assertion facet – allows value-level validation via XPath predicate expressions.
  • xs:openContent – enables extensibility by allowing additional elements throughout an element.
  • defaultAttributes (the default attribute group) – adds a group of attributes to every complex type in the schema.
  • xs:defaultOpenContent – makes every element extensible.

Compatibility – using XSD 1.0 schemas with an XSD 1.1 validator

The XSD 1.1 validator is compatible with the XSD 1.0 standard, meaning you can validate against your XSD 1.0 schemas using an XSD 1.1 validator and not notice any difference in the result.

xs:alternative

Provides a mechanism for refining the type of an element based on an XPath expression. In XSD 1.0 this had to be done explicitly using the xsi:type attribute in the XML document, but now it can be done implicitly via the schema.

It sounds more complex than it is; an example will make it clearer.

xs:alternative Sample
<?xml version="1.0" encoding="utf-8" ?>
<!--Created with Liquid Studio 2017 (https://www.liquid-technologies.com)-->
<xs:schema elementFormDefault="qualified" 
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="Example">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Address" type="AddressType" maxOccurs="unbounded">
                    <xs:alternative test="@country = 'UK'" type="UKAddressType" />
                    <xs:alternative test="@country = 'US'" type="USAddressType" />
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    <xs:complexType name="AddressType">
        <xs:sequence>
            <xs:element name="Line1" type="xs:string" />
            <xs:element name="Line2" type="xs:string" />
        </xs:sequence>
        <xs:attribute name="country">
            <xs:simpleType>
                <xs:restriction base="xs:string">
                    <xs:enumeration value="UK" />
                    <xs:enumeration value="US" />
                </xs:restriction>
            </xs:simpleType>
        </xs:attribute>
    </xs:complexType>
    <xs:complexType name="UKAddressType">
        <xs:complexContent>
            <xs:extension base="AddressType">
                <xs:sequence>
                    <xs:element name="County" type="xs:string" />
                    <xs:element name="Postcode" type="xs:string" />
                </xs:sequence>
            </xs:extension>
        </xs:complexContent>
    </xs:complexType>
    <xs:complexType name="USAddressType">
        <xs:complexContent>
            <xs:extension base="AddressType">
                <xs:sequence>
                    <xs:element name="State" type="xs:string" />
                    <xs:element name="ZipCode" type="xs:string" />
                </xs:sequence>
            </xs:extension>
        </xs:complexContent>
    </xs:complexType>
</xs:schema>

When the following XML document is validated, the test in each xs:alternative is evaluated in order, and the first one to pass is used. The first Address element has country set to 'UK', which matches the test @country = 'UK' in the first xs:alternative, so validation continues using the type UKAddressType. The second has country set to 'US', so USAddressType is used.

The third Address also has country = 'UK', so UKAddressType is used; however, it does not contain the correct child elements, so validation fails.

<?xml version="1.0" encoding="utf-8"?>
<!-- Created with Liquid Studio 2017 (https://www.liquid-technologies.com) -->
<Example xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="AlternativeExample.xsd">
    <Address country="UK">
        <Line1>string</Line1>
        <Line2>string</Line2>
        <County>string</County>
        <Postcode>string</Postcode>
    </Address>
    <Address country="US">
        <Line1>string</Line1>
        <Line2>string</Line2>
        <State>string</State>
        <ZipCode>string</ZipCode>
    </Address>
    <Address country="UK">
        <Line1>string</Line1>
        <Line2>string</Line2>
        <State>INVALID</State>
        <ZipCode>INVALID</ZipCode>
    </Address>
</Example>

Properties
  • test : The XPath expression which must evaluate to true in order for this alternative definition to be used. Note that the XPath expression is limited to selecting attributes only, i.e. you can only base the expression on attribute values; you cannot use it to access child elements.
  • xpathDefaultNamespace : The default namespace used for unprefixed element names within the expression. If not specified, the value is inherited from the xpathDefaultNamespace attribute on xs:schema; if that is also absent, unprefixed names are treated as being in no namespace.
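
One detail worth sketching: the test attribute is optional, so an xs:alternative without it acts as a fallback when none of the earlier tests match. The fragment below is a hypothetical variation on the Address element from the sample above (it is not part of the original schema). It uses the built-in XSD 1.1 type xs:error, which has no valid instances, to reject any Address whose country matches neither test.

```xml
<xs:element name="Address" type="AddressType" maxOccurs="unbounded">
    <xs:alternative test="@country = 'UK'" type="UKAddressType" />
    <xs:alternative test="@country = 'US'" type="USAddressType" />
    <!-- No test attribute: selected only when neither test above is true.
         xs:error has no valid instances, so such an Address is rejected. -->
    <xs:alternative type="xs:error" />
</xs:element>
```

Without the final alternative, an Address with no country attribute would fall back to the declared type, AddressType.
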
xs:assert

The new xs:assert provides a mechanism for cross-field validation, which makes it possible to implement complex business rules in an XSD schema. Previously this kind of validation could only be accomplished in a limited sense with keyrefs, or by other tools (e.g. Schematron).

Although this new feature is very powerful, compromises have had to be made between performance, ease of implementation and functionality. To simplify implementations and limit resource use, the XPath expression can only access nodes within the element it is attached to. In the example below, because the assertion is attached to the paragraph element, it can only access nodes within the paragraph (table and title); it can NOT access the parent element Example.

The xs:assert has an attribute called test, which contains an XPath expression. If the expression does not evaluate to true, a validation error is raised.

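In the schema document, an xs:assert is written as a child of the following XSD entities
  • xs:complexType
  • xs:extension and xs:restriction (inside a complex type definition)
An xs:element or xs:alternative picks up assertions through the complex type it references or defines inline.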
Properties
  • test : The XPath expression which must evaluate to true for validation to be successful.
  • xpathDefaultNamespace : The default namespace used for unprefixed element names within the expression. If not specified, the value is inherited from the xpathDefaultNamespace attribute on xs:schema; if that is also absent, unprefixed names are treated as being in no namespace.
Example xs:assert schema

The following schema defines an assert on the type of the paragraph element. It is basically saying that if you have a table element, the element before it must be a title.

<?xml version="1.0" encoding="utf-8" ?>
<!--Created with Liquid Studio 2017 (https://www.liquid-technologies.com)-->
<xs:schema elementFormDefault="qualified" 
            xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="Example">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="paragraph" type="paragraphType"
                            maxOccurs="unbounded" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    <xs:complexType mixed="true" name="paragraphType">
        <xs:sequence>
            <xs:element name="title" type="xs:string" minOccurs="0" />
            <xs:element name="table" type="xs:string" minOccurs="0" />
        </xs:sequence>
        <xs:assert test="if (table) then table/preceding-sibling::title else true()" />
    </xs:complexType>
</xs:schema>

XSD 1.1 xs:assert sample

Example xs:assert sample XML document

The following sample XML shows 3 paragraphs. The first 2 are valid; the last one has a table element, but the element before it is not a title (there is no element before it), so validation fails.

<?xml version="1.0" encoding="utf-8"?>
<!-- Created with Liquid Studio 2017 (https://www.liquid-technologies.com) -->
<Example xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="AssertSchema.xsd">
    <paragraph>
        <title>VALID</title>
        <table>VALID</table>
    </paragraph>

    <paragraph>
        <title>VALID</title>
    </paragraph>

    <paragraph>
        <table>NOT OK - Missing title</table>
    </paragraph>
</Example>
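
As a second, smaller illustration of cross-field validation, the sketch below compares two attributes on the same element. The Range element, the RangeType name and the min and max attributes are hypothetical and do not appear in the samples above.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<xs:schema elementFormDefault="qualified"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="Range" type="RangeType" />
    <xs:complexType name="RangeType">
        <xs:attribute name="min" type="xs:int" use="required" />
        <xs:attribute name="max" type="xs:int" use="required" />
        <!-- Both attributes belong to the element the assert is attached to,
             so the expression is allowed to read them -->
        <xs:assert test="@min le @max" />
    </xs:complexType>
</xs:schema>
```

With this schema, an element such as Range with min="1" and max="10" would be expected to pass, while min="10" and max="1" would be expected to raise a validation error.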

xs:assertion facet
Assertions are a new type of facet that can be applied to an xs:simpleType. They contain an XPath expression which must evaluate to true for the value to be valid. They can only operate on the value of the type (they can't access other nodes in the XML document).

Properties
  • test : The XPath expression which must evaluate to true for validation to be successful.
  • xpathDefaultNamespace : The default namespace used for unprefixed element names within the expression. If not specified, the value is inherited from the xpathDefaultNamespace attribute on xs:schema; if that is also absent, unprefixed names are treated as being in no namespace.
  • $value : Within the XPath 'test' expression, the built-in variable $value is defined, which represents the value to be tested.

Sample xs:assertion
In the sample below, the xs:assertion ensures that only even values are allowed.

<?xml version="1.0" encoding="utf-8" ?>
<!--Created with Liquid Studio 2017 (https://www.liquid-technologies.com)-->
<xs:schema elementFormDefault="qualified" 
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:simpleType name="EvenNumbers">
        <xs:restriction base="xs:int">
            <xs:assertion test="$value mod 2 = 0" />
        </xs:restriction>
    </xs:simpleType>
    <xs:element name="Numbers">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Number" type="EvenNumbers" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>
XSD 1.1 xs:assertion sample
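
For illustration, an instance document for the schema above might look like the following sketch. The value 4 is an arbitrary choice made here, not something taken from the original sample.

```xml
<?xml version="1.0" encoding="utf-8"?>
<Numbers>
    <!-- 4 mod 2 = 0, so the assertion holds and the value is accepted -->
    <Number>4</Number>
</Numbers>
```

Replacing 4 with 3 would be expected to fail validation, since 3 mod 2 = 1.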

xs:openContent
Open Content allows you to design schemas that are extensible in a way that was simply not possible in XSD 1.0. It provides a mechanism that allows additional elements to be either interleaved with existing elements or placed at the end.

xs:openContent Sample
It is not unusual to want to create a schema that contains a set of defined elements, but can also contain anything else. The consuming application would then process what it understands and ignore anything else. This seemingly simple goal is very awkward to achieve in XSD 1.0, but with the introduction of open content it's simple.

<?xml version="1.0" encoding="utf-8" ?>
<!--Created with Liquid Studio 2017 (https://www.liquid-technologies.com)-->
<xs:schema elementFormDefault="qualified" 
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="Book">
        <xs:complexType>
            <xs:openContent mode="interleave">
                <xs:any namespace="##any" processContents="lax" />
            </xs:openContent>
            <xs:sequence>
                <xs:element name="Title" type="xs:string" />
                <xs:element name="Author" type="xs:string" />
                <xs:element name="Date" type="xs:gYear" />
                <xs:element name="ISBN" type="xs:string" />
                <xs:element name="Publisher" type="xs:string" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

The schema now allows any elements to be included before, between and after the ones defined in the XSD. In this example the element 'NewElement' is added in the middle.

<?xml version="1.0" encoding="utf-8"?>
<!-- Created with Liquid Studio 2017 (https://www.liquid-technologies.com) -->
<Book xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="OpenContentExample.xsd">
    <Title>string</Title>
    <Author>string</Author>
    <NewElement/>
    <Date>2016</Date>
    <ISBN>string</ISBN>
    <Publisher>string</Publisher>
</Book>

The mode Attribute
  • none : Stops any open content rules being applied to this element (only relevant if a defaultOpenContent element has been applied to the whole schema).
  • interleave : Allows additional elements to be placed before, between and after the elements defined in the schema.
  • suffix : Allows additional elements to only be placed after the elements defined in the schema.
Unless the mode is none, the xs:openContent element also requires a child xs:any, which defines the elements that can be added (more about the changes to xs:any later).
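
To contrast with the interleave sample above, the sketch below shows the same idea with mode set to suffix. The NoteType name and its Subject and Body elements are hypothetical.

```xml
<xs:complexType name="NoteType">
    <!-- suffix: additional elements are only accepted after Body -->
    <xs:openContent mode="suffix">
        <xs:any namespace="##any" processContents="lax" />
    </xs:openContent>
    <xs:sequence>
        <xs:element name="Subject" type="xs:string" />
        <xs:element name="Body" type="xs:string" />
    </xs:sequence>
</xs:complexType>
```

Under this definition an extra element following Body should be accepted, whereas the same element placed between Subject and Body should be rejected.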

defaultAttributes (default attribute group)
This is a new attribute that can be attached to xs:schema; in the XSD 1.1 specification it is named defaultAttributes. When set, it references an xs:attributeGroup which will be added to all the complex types defined in this schema. It can be turned off on a specific complex type by setting the attribute defaultAttributesApply='false' on the given xs:complexType.
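
A minimal sketch of how this might look is shown below. The CommonAttributes group, the id attribute and the two type names are hypothetical, chosen only for illustration.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<xs:schema elementFormDefault="qualified"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           defaultAttributes="CommonAttributes">
    <xs:attributeGroup name="CommonAttributes">
        <xs:attribute name="id" type="xs:ID" />
    </xs:attributeGroup>
    <!-- Receives the id attribute from the default attribute group -->
    <xs:complexType name="ItemType">
        <xs:sequence>
            <xs:element name="Name" type="xs:string" />
        </xs:sequence>
    </xs:complexType>
    <!-- Opts out, so no id attribute is added here -->
    <xs:complexType name="PlainType" defaultAttributesApply="false">
        <xs:sequence>
            <xs:element name="Name" type="xs:string" />
        </xs:sequence>
    </xs:complexType>
</xs:schema>
```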

xs:defaultOpenContent
This is a root-level open content element that is applied to all the elements defined within this schema. It can be turned off on specific elements by adding an xs:openContent element with the attribute mode='none'.
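
The sketch below shows one way this could be laid out, with a schema-wide default and a single type that opts out. The OpenType and StrictType names and their child elements are hypothetical.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<xs:schema elementFormDefault="qualified"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <!-- Applies to the complex types in this schema unless they opt out -->
    <xs:defaultOpenContent mode="interleave">
        <xs:any namespace="##any" processContents="lax" />
    </xs:defaultOpenContent>
    <!-- Inherits the default: extra elements are allowed around Name -->
    <xs:complexType name="OpenType">
        <xs:sequence>
            <xs:element name="Name" type="xs:string" />
        </xs:sequence>
    </xs:complexType>
    <!-- mode="none" switches the default off for this type only -->
    <xs:complexType name="StrictType">
        <xs:openContent mode="none" />
        <xs:sequence>
            <xs:element name="Name" type="xs:string" />
        </xs:sequence>
    </xs:complexType>
</xs:schema>
```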

Changes to xs:any
The xs:any construct is a little more user friendly in XSD 1.1. It has gained a couple of attributes to better control what is allowed, and it can be used a little more freely than before.
  • notNamespace - this is a list of all the namespaces that are NOT permitted.
  • notQName - this is a list of all the qualified element names that are NOT permitted.
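
As a rough sketch of the two attributes together, the fragment below accepts any additional element except those in one namespace and one specific name. The ExtensibleType name, the http://example.com/legacy namespace and the Secret element are hypothetical, and the fragment assumes the schema document declares no default namespace, so the unprefixed name Secret refers to an element in no namespace.

```xml
<xs:complexType name="ExtensibleType">
    <xs:sequence>
        <xs:element name="Name" type="xs:string" />
        <!-- Anything may follow Name, except elements in the legacy
             namespace and the unqualified element called Secret -->
        <xs:any minOccurs="0" maxOccurs="unbounded"
                notNamespace="http://example.com/legacy"
                notQName="Secret"
                processContents="lax" />
    </xs:sequence>
</xs:complexType>
```

Note that notNamespace is used here in place of the namespace attribute; the two cannot be specified on the same xs:any.
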
The use of xs:any is now a little less arduous. In XSD 1.0 the following is invalid.

<xs:element name="Example">
    <xs:complexType>
        <xs:sequence>
            <xs:any minOccurs="0" namespace="##any" />
            <xs:element name="A" type="xs:string" minOccurs="0" />
        </xs:sequence>
    </xs:complexType>
</xs:element>

This is because the validator cannot tell which rule should be applied to an element 'A' in the instance document. Both the xs:any and the definition for 'A' would be acceptable. In order to prevent this ambiguity, the XSD 1.0 rules make this construct invalid.
However, in XSD 1.1 the construct is allowed, as priority is given to formally defined elements over those that would match an xs:any.

Summary
The enhancements made to the XSD standard seem fairly minor, but they now provide the ability to create truly extensible schemas and support the implementation of complex business rules. For these reasons alone the XSD 1.1 standard is worth embracing; the other features are just a bonus.