4. Validating XML Documents

4.1. The Validator Class

The Validator class encapsulates XMLUnit's validation support. It will use the SAXParser configured in XMLUnit (see Section 2.4.1, “JAXP”).

The piece of XML to validate is specified in the constructor. The constructors that take more than a single argument are relevant only if you want to validate against a DTD and need to provide the location of the DTD itself; for details, see the next section.

By default, Validator will validate against a DTD, but it is also possible to validate against one or more XML Schemas. Schema validation requires an XML parser that supports it.

4.1.1. DTD Validation

Validating against a DTD is straightforward if the piece of XML contains a DOCTYPE declaration with a SYSTEM identifier that can be resolved at validation time. Simply create a Validator object using one of the single argument constructors.

Example 24. Validating Against the DTD Defined in DOCTYPE

InputSource is = new InputSource(new FileInputStream(myXmlDocument));
Validator v = new Validator(is);
boolean isValid = v.isValid();
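
For orientation, the check performed here corresponds roughly to a validating SAX parse. The following is a minimal sketch that uses only JDK classes rather than XMLUnit; the class DtdValidationSketch, its methods and the sample greeting DTD are hypothetical names introduced for illustration and are not part of XMLUnit.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper (not part of XMLUnit): DTD validation with plain JAXP. */
final class DtdValidationSketch {

    /** Parses xml with DTD validation enabled and returns every reported problem. */
    static List<String> validate(String xml) {
        final List<String> problems = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // validate against the DTD named in the DOCTYPE
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void error(SAXParseException e) {
                        problems.add(e.getMessage()); // validity errors are recoverable
                    }
                });
        } catch (SAXException e) {
            // Fatal error, typically a document that is not well-formed.
            problems.add(e.getMessage());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    /** Writes a one-element sample DTD to a temporary file and returns its URL. */
    static String writeSampleDtd() {
        try {
            Path dtd = Files.createTempFile("greeting", ".dtd");
            dtd.toFile().deleteOnExit();
            Files.write(dtd, "<!ELEMENT greeting (#PCDATA)>".getBytes(StandardCharsets.UTF_8));
            return dtd.toUri().toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The SYSTEM identifier points at the temporary DTD, so it can be resolved.
        String doctype = "<!DOCTYPE greeting SYSTEM \"" + writeSampleDtd() + "\">";
        System.out.println(validate(doctype + "<greeting>hello</greeting>"));
        System.out.println(validate(doctype + "<greeting><b/></greeting>"));
    }
}
```

An empty list plays the role of isValid() returning true; the second document in main violates the DTD and yields at least one message.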

If the piece of XML doesn't contain a DOCTYPE declaration at all, or it contains one but you want to validate against a different DTD, use one of the three argument versions of Validator's constructors. In this case the publicId argument becomes the PUBLIC identifier and systemId the SYSTEM identifier of the DOCTYPE that is implicitly added to the piece of XML. Any existing DOCTYPE will be removed. The systemId should be a URL that can be resolved by your parser.

Example 25. Validating a Piece of XML that doesn't Contain a DOCTYPE

InputSource is = new InputSource(new FileInputStream(myXmlDocument));
Validator v = new Validator(is,
                            (new File(myDTD)).toURI().toURL().toString(),
                            myPublicId);
boolean isValid = v.isValid();

If the piece of XML already has the correct DOCTYPE declaration, but the declaration doesn't specify a SYSTEM identifier at all or you want the SYSTEM identifier to resolve to a different location, you have two options:

  • Use one of the two argument constructors and specify the alternative URL as systemId.

    Example 26. Validating Against a Local DTD

    InputSource is = new InputSource(new FileInputStream(myXmlDocument));
    Validator v = new Validator(is,
                                (new File(myDTD)).toURI().toURL().toString());
    boolean isValid = v.isValid();
    

  • Use a custom EntityResolver via XMLUnit.setControlEntityResolver together with one of the single argument constructor overloads of Validator.

    This approach would allow you to use an OASIS catalog[8] in conjunction with the Apache XML Resolver library[9] to resolve the DTD location as well as the location of any other entity in your piece of XML, for example.

    Example 27. Validating Against a DTD Using Apache's XML Resolver and an XML Catalog

    InputSource is = new InputSource(new FileInputStream(myXmlDocument));
    XMLUnit.setControlEntityResolver(new CatalogResolver());
    Validator v = new Validator(is);
    boolean isValid = v.isValid();
    
    #CatalogManager.properties
    
    verbosity=1
    relative-catalogs=yes
    catalogs=/some/path/to/catalog
    prefer=public
    static-catalog=yes
    catalog-class-name=org.apache.xml.resolver.Resolver
    
    <!-- catalog file -->
    
    <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
      <public publicId="-//Some//DTD V 1.1//EN"
              uri="mydtd.dtd"/>
    </catalog>
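
Where a full catalog is more than you need, a small hand-written EntityResolver can serve the DTD directly. The sketch below uses plain JAXP classes from the JDK rather than XMLUnit or the Apache XML Resolver library; LocalDtdResolverSketch and its validate method are hypothetical names. Within XMLUnit, an EntityResolver of this kind is the sort of object you would pass to XMLUnit.setControlEntityResolver.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper (not part of XMLUnit): resolves a PUBLIC identifier locally. */
final class LocalDtdResolverSketch {

    /** Validates xml, serving dtdText whenever the parser asks for publicId. */
    static List<String> validate(String xml, String publicId, String dtdText) {
        final List<String> problems = new ArrayList<>();
        EntityResolver resolver = (pub, sys) -> {
            if (!publicId.equals(pub)) {
                return null; // not ours: the parser falls back to the SYSTEM identifier
            }
            InputSource dtd = new InputSource(new StringReader(dtdText));
            dtd.setPublicId(pub);
            dtd.setSystemId(sys);
            return dtd;
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // DTD validation
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setEntityResolver(resolver);
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    problems.add(e.getMessage()); // collect validity errors
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            problems.add(e.getMessage()); // fatal error reported by the parser
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        String publicId = "-//Some//DTD V 1.1//EN";
        // The SYSTEM identifier is a placeholder; the resolver answers before it is used.
        String xml = "<!DOCTYPE greeting PUBLIC \"" + publicId
            + "\" \"http://example.invalid/mydtd.dtd\"><greeting>hello</greeting>";
        System.out.println(validate(xml, publicId, "<!ELEMENT greeting (#PCDATA)>"));
    }
}
```

Because the resolver is consulted before the parser tries the SYSTEM identifier, the placeholder URL in main is never opened.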
    

4.1.2. XML Schema Validation

In order to validate against the XML Schema language, Schema validation has to be enabled via the useXMLSchema method of Validator.

By default, the parser will try to resolve the location of Schema definition files via a schemaLocation attribute if it is present in the piece of XML, or it will try to open the Schema's URI as a URL and read from it.

The setJAXP12SchemaSource method of Validator allows you to override this behavior as long as the parser supports the http://java.sun.com/xml/jaxp/properties/schemaSource property in the way described in "JAXP 1.2 Approved CHANGES"[10].

setJAXP12SchemaSource's argument can be one of

  • A String which contains a URI.
  • An InputStream the Schema can be read from.
  • An InputSource the Schema can be read from.
  • A File the Schema can be read from.
  • An array containing any of the above.

If the property has been set using a String, the Validator class will provide its systemId as specified in the constructor when asked to resolve it. To avoid this behavior, use only the single argument constructors. If no systemId has been specified, the configured EntityResolver may still be used.
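
To make the JAXP 1.2 properties concrete, here is a minimal sketch of setting them directly on a SAXParser, using only JDK classes. It assumes a parser that supports both properties, as the JDK's built-in parser does; Jaxp12SchemaSourceSketch and its validate method are hypothetical names, and the sketch is not a description of XMLUnit's internals.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper (not part of XMLUnit): JAXP 1.2 style Schema validation. */
final class Jaxp12SchemaSourceSketch {

    static final String SCHEMA_LANGUAGE =
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String SCHEMA_SOURCE =
        "http://java.sun.com/xml/jaxp/properties/schemaSource";

    /** Validates xml against the schema text in xsd and returns reported problems. */
    static List<String> validate(String xml, String xsd) {
        final List<String> problems = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // XML Schema validation needs namespaces
            factory.setValidating(true);
            SAXParser parser = factory.newSAXParser();
            // The schema language has to be set before the schema source.
            parser.setProperty(SCHEMA_LANGUAGE, XMLConstants.W3C_XML_SCHEMA_NS_URI);
            // An InputSource is one of the accepted argument types listed above.
            parser.setProperty(SCHEMA_SOURCE, new InputSource(new StringReader(xsd)));
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    problems.add(e.getMessage()); // collect validity errors
                }
            });
        } catch (SAXException e) {
            // Fatal parse error, or a parser that rejects the properties.
            problems.add(e.getMessage());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        String xsd = "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:element name=\"count\" type=\"xs:int\"/></xs:schema>";
        System.out.println(validate("<count>3</count>", xsd));
        System.out.println(validate("<count>three</count>", xsd));
    }
}
```

The schema source overrides any location the document itself might name, which is the effect setJAXP12SchemaSource is described as having.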

Example 28. Validating Against a Local XML Schema

InputSource is = new InputSource(new FileInputStream(myXmlDocument));
Validator v = new Validator(is);
v.useXMLSchema(true);
v.setJAXP12SchemaSource(new File(myXmlSchemaFile));
boolean isValid = v.isValid();

4.2. JUnit 3.x Convenience Methods

Both XMLAssert and XMLTestCase provide an assertXMLValid(Validator) method that will fail if Validator's isValid method returns false.

In addition, several overloads of the assertXMLValid method are provided that directly correspond to similar overloads of Validator's constructor. These overloads don't support XML Schema validation at all.

Validator itself provides an assertIsValid method that will throw an AssertionFailedError if validation fails.

Neither method provides any control over the message of the AssertionFailedError in case of a failure.

4.3. Configuration Options

4.4. JAXP 1.3 Validation

JAXP 1.3, which ships with Java 5 or later and is available as a separate product for earlier Java VMs, introduces a new package javax.xml.validation designed for validating snippets of XML against different schema languages. Any compliant implementation must support the W3C XML Schema language, but other languages like RELAX NG or Schematron may be supported as well.

The class org.custommonkey.xmlunit.jaxp13.Validator can be used to validate a piece of XML against a schema definition but also to validate the schema definition itself. By default Validator will assume your definition uses the W3C XML Schema language, but it provides a constructor that can be used to specify a different language via a URI supported by the SchemaFactory class. Alternatively you can specify the schema factory itself.

The schema definition itself is given as one or more Source instances, just as the pieces of XML to validate are.

Note that the Validator class of javax.xml.validation will ignore all xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes of the XML document you want to validate if you specify at least one schema source.

The following example uses org.custommonkey.xmlunit.jaxp13.Validator to perform the same type of validation shown in Example 28, “Validating Against a Local XML Schema”.

Example 29. Validating Against a Local XML Schema

Validator v = new Validator();
v.addSchemaSource(new StreamSource(new File(myXmlSchemaFile)));
StreamSource is = new StreamSource(new File(myXmlDocument));
boolean isValid = v.isInstanceValid(is);

Validating a schema definition is shown in the next example.

Example 30. Validating an XML Schema Definition

Validator v = new Validator();
v.addSchemaSource(new StreamSource(new File(myXmlSchemaFile)));
boolean isValid = v.isSchemaValid();
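
For comparison, both kinds of check can also be written directly against javax.xml.validation. The sketch below uses only JDK classes; Jaxp13ValidationSketch, schemaProblems and instanceProblems are hypothetical names, and the code is an illustration under that assumption rather than the way the XMLUnit class is implemented.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper (not part of XMLUnit): plain javax.xml.validation calls. */
final class Jaxp13ValidationSketch {

    /** Returns the problems found in the schema definition itself. */
    static List<String> schemaProblems(String xsd) {
        List<String> problems = new ArrayList<>();
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            factory.setErrorHandler(collector(problems));
            factory.newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            problems.add(e.getMessage()); // fatal problem while reading the schema
        }
        return problems;
    }

    /** Returns the problems found when validating xml against the schema in xsd. */
    static List<String> instanceProblems(String xsd, String xml) {
        List<String> problems = new ArrayList<>();
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(collector(problems));
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Fatal problem in the instance, or a schema that could not be compiled.
            problems.add(e.getMessage());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return problems;
    }

    /** Error handler that records recoverable errors instead of throwing. */
    private static ErrorHandler collector(List<String> problems) {
        return new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                problems.add(e.getMessage());
            }
        };
    }

    public static void main(String[] args) {
        String xsd = "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:element name=\"count\" type=\"xs:int\"/></xs:schema>";
        System.out.println(schemaProblems(xsd));
        System.out.println(instanceProblems(xsd, "<count>three</count>"));
    }
}
```

In both methods an empty list corresponds to a result of true from isSchemaValid or isInstanceValid.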

There is no explicit JUnit 3 support for org.custommonkey.xmlunit.jaxp13.Validator.