2. Using XMLUnit

2.1. Requirements

XMLUnit requires a JAXP-compliant XML parser virtually everywhere. Several features of XMLUnit also require a JAXP-compliant XSLT transformer. If a JAXP-compliant XPath engine is available, it will be used for XPath tests.

Building XMLUnit requires at least JAXP 1.2, the version provided by the Java class library in JDK 1.4. The JAXP 1.3 XPath engine can only be built when JAXP 1.3 (i.e. Java 5 and above) is available.

As long as you don't require support for XML Namespaces or XML Schema, any JAXP 1.1 compliant implementation should work at runtime. For namespace and schema support you will need a parser that complies with JAXP 1.2 and supports the required feature. The XML parser shipping with JDK 1.4 (a version of Apache Crimson), for example, is compliant with JAXP 1.2 but doesn't support Schema validation.

XMLUnit is supposed to build and run on any Java version after 1.3 (at least no new hard JDK 1.4 dependencies have been added in XMLUnit 1.1), but it has only been tested on JDK 1.4.2 and above.

To build XMLUnit, JUnit 3.x (only tested with JUnit 3.8.x) is required. It is not required at runtime unless you intend to use the XMLTestCase or XMLAssert classes.

2.2. Basic Usage

XMLUnit consists of a few classes all living in the org.custommonkey.xmlunit package. You can use these classes directly from your code, no matter whether you are writing a unit test or want to use XMLUnit's features for any other purpose.

This section provides a few hints on where to start if you want to use a certain feature of XMLUnit. More details can be found in the more specific sections later in this document.

2.2.1. Comparing Pieces of XML

The heart and soul of XMLUnit's comparison engine is DifferenceEngine, but most of the time you will use it indirectly via the Diff class.

You can influence the engine by providing (custom) implementations for various interfaces and by setting a couple of options on the XMLUnit class.

More information is available in Section 3, “Comparing Pieces of XML”.

2.2.2. Validating

All validation happens in the Validator class. The default is to validate against a DTD, but XML Schema validation can be enabled by an option (see Validator.useXMLSchema).

Several options of the XMLUnit class affect validation.

More information is available in Section 4, “Validating XML Documents”.

2.2.3. XSLT Transformations

The Transform class provides an easy-to-use layer on top of JAXP's transformations. An instance of this class is initialized with the source document and a stylesheet, and the result of the transformation can be retrieved as a String or DOM Document.

The output of Transform can be used as input to comparisons, validations, XPath tests and so on. There is no detailed section on transformations since they are really only a different way to create input for the rest of XMLUnit's machinery. Examples can be found in Section 1.6, “Comparing XML Transformations”.

It is possible to provide a custom javax.xml.transform.URIResolver via the XMLUnit.setURIResolver method.

You can access the underlying XSLT transformer via XMLUnit.getTransformerFactory.
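As a rough illustration of the JAXP layer that Transform is described as wrapping, the following sketch runs a stylesheet with plain javax.xml.transform classes and returns the result as a String. It deliberately does not use XMLUnit's own classes; the class name JaxpTransformSketch, the method name and the sample stylesheet are assumptions made up for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of XMLUnit: plain JAXP transformation to a String.
class JaxpTransformSketch {

    // Sample XSLT 1.0 stylesheet that emits the value of /greeting/@to as text.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:value-of select=\"/greeting/@to\"/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String transform(String xml, String xslt) throws TransformerException {
        // The factory is whatever JAXP implementation is configured for the JVM.
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform("<greeting to=\"world\"/>", STYLESHEET));
    }
}
```

The String returned here plays the same role as the String result of Transform: it can be fed to whatever consumes XML or text next.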

2.2.4. XPath Engine

The central piece of XMLUnit's XPath support is the XpathEngine interface. Currently two implementations of the interface exist: SimpleXpathEngine and org.custommonkey.xmlunit.jaxp13.Jaxp13XpathEngine.

SimpleXpathEngine is a very basic implementation that uses your XSLT transformer under the covers. This also means it will expose you to the bugs found in your transformer, such as the transformer claiming that a stylesheet couldn't be compiled for very basic XPath expressions. This has been reported to be the case for JDK 1.5.

org.custommonkey.xmlunit.jaxp13.Jaxp13XpathEngine uses JAXP 1.3's javax.xml.xpath package and seems to be more reliable, more stable and faster than SimpleXpathEngine.

You use the XMLUnit.newXpathEngine method to obtain an instance of the XpathEngine. As of XMLUnit 1.1 this will try to use JAXP 1.3 if it is available and fall back to SimpleXpathEngine.

Instances of XpathEngine can return the results of XPath queries either as DOM NodeList or plain Strings.
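The two result shapes mentioned above (a DOM NodeList or a plain String) correspond to what the javax.xml.xpath package offers directly. The sketch below uses only that JAXP 1.3 package, not XMLUnit's XpathEngine; the class and method names are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XMLUnit: XPath queries via javax.xml.xpath.
class JaxpXPathSketch {

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xml)));
    }

    // Returns all nodes selected by the expression.
    static NodeList matchingNodes(String expression, Document doc)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
    }

    // Returns the string value of the expression's result.
    static String asString(String expression, Document doc)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<list><item>a</item><item>b</item></list>");
        System.out.println(matchingNodes("/list/item", doc).getLength());
        System.out.println(asString("/list/item[2]", doc));
    }
}
```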

More information is available in Section 5, “XPath Tests”.

2.2.5. DOM Tree Walking

To test pieces of XML by traversing the DOM tree you use the NodeTest class. Each DOM Node will be passed to a NodeTester implementation you provide. The AbstractNodeTester class is provided as a NullObject Pattern base class for implementations of your own.

More information is available in Section 6, “DOM Tree Walking”.

2.3. Using XMLUnit With JUnit 3.x

Initially XMLUnit was tightly coupled to JUnit and the recommended approach was to write unit tests by inheriting from the XMLTestCase class. XMLTestCase provides a pretty long list of assert... methods that may simplify your interaction with XMLUnit's internals in many common cases.

The XMLAssert class provides the same set of assert...s as static methods. Use XMLAssert instead of XMLTestCase for your unit tests if you can't or don't want to inherit from XMLTestCase.

The full power of XMLUnit is available whether you use XMLTestCase and/or XMLAssert or the underlying API directly. If you are using JUnit 3.x, the JUnit-specific classes may prove more convenient.

2.4. Common Configuration Options

2.4.1. JAXP

If you are using JDK 1.4 or later, your Java class library already contains the required XML parsers and XSLT transformers. Still, you may want to use a different parser/transformer than the one of your JDK - in particular since the versions shipping with some JDKs are known to contain serious bugs.

As described in Section 1.4, “Configuring XMLUnit”, there are two main approaches to choosing the XML parser or XSLT transformer: system properties and setters in the XMLUnit class.

If you use system properties you have the advantage that your choice affects the whole JAXP system, whether it is used inside of XMLUnit or not. If you are using JDK 1.4 or later you may also want to review the Endorsed Standards Override Mechanism to use a different parser/transformer than the one shipping with your JDK.
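For orientation, the standard JAXP lookup uses one system property per factory type. The sketch below lists those three property names together with the implementation class that the running JVM currently resolves for each; the class and method names are made up for this example, and no XMLUnit API is involved.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

// Hypothetical helper: shows which JAXP implementations are currently active.
class JaxpFactoriesSketch {

    // Maps each JAXP system property name to the implementation class in use.
    static Map<String, String> activeImplementations() {
        Map<String, String> result = new LinkedHashMap<>();
        result.put("javax.xml.parsers.DocumentBuilderFactory",
                   DocumentBuilderFactory.newInstance().getClass().getName());
        result.put("javax.xml.parsers.SAXParserFactory",
                   SAXParserFactory.newInstance().getClass().getName());
        result.put("javax.xml.transform.TransformerFactory",
                   TransformerFactory.newInstance().getClass().getName());
        return result;
    }

    public static void main(String[] args) {
        // Setting one of these properties (for example with -Dname=class on the
        // command line) changes what newInstance() returns for the whole JVM.
        for (Map.Entry<String, String> e : activeImplementations().entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Running this before and after changing a property is a quick way to confirm that the override actually took effect.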

The second option - using the XMLUnit class - allows you to use different parsers for control and test documents. It even allows you to use different parsers for different test cases, if you really want to stretch it that far. It may also work for JDK 1.4 and above, even if you don't override the endorsed standards libraries.

You can access the underlying JAXP parser via XMLUnit.newControlParser, XMLUnit.newTestParser, XMLUnit.getControlDocumentBuilderFactory, XMLUnit.getTestDocumentBuilderFactory and XMLUnit.getSAXParserFactory (used by Validator). Note that all these methods return factories or parsers that are namespace aware.
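What "namespace aware" means at the JAXP level can be seen with a plain DocumentBuilderFactory, which is not namespace aware unless told so. The following sketch (class and method names are assumptions for illustration, XMLUnit is not used) parses the same document both ways and reports the namespace URI of the root element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: effect of namespace awareness on the resulting DOM.
class NamespaceAwareSketch {

    static String rootNamespace(String xml, boolean namespaceAware) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        Document doc = factory.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(xml)));
        // A parser that is not namespace aware leaves the namespace URI unset.
        return doc.getDocumentElement().getNamespaceURI();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<p:foo xmlns:p=\"urn:example\"/>";
        System.out.println(rootNamespace(xml, true));
        System.out.println(rootNamespace(xml, false));
    }
}
```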

The various build... methods in XMLUnit provide convenience layers for building DOM Documents using the configured parsers.

You can also set the class name for the XPathFactory to use when using JAXP 1.3 by passing the class name to XMLUnit.setXPathFactory.

2.4.2. EntityResolver

You can provide a custom org.xml.sax.EntityResolver for the control and test parsers via XMLUnit.setControlEntityResolver and XMLUnit.setTestEntityResolver. Validator uses the resolver set via setControlEntityResolver as well.
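A common reason for a custom EntityResolver is to keep the parser from fetching a DTD over the network. The sketch below shows such a resolver on a plain JAXP DocumentBuilder; it answers every request with an empty document and records the system IDs it was asked for. The class name, the recording list and the example URL are assumptions for this illustration. An object implementing the same interface is what you would pass to the XMLUnit setters named above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical helper: resolves every external entity to an empty stream.
class EntityResolverSketch {

    // System IDs the parser asked for, kept only to make the behavior visible.
    static final List<String> requested = new ArrayList<>();

    static final EntityResolver EMPTY_RESOLVER = (publicId, systemId) -> {
        requested.add(systemId);
        return new InputSource(new StringReader(""));
    };

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver(EMPTY_RESOLVER);
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE doc SYSTEM \"http://example.invalid/doc.dtd\"><doc/>";
        Document doc = parse(xml);
        System.out.println(doc.getDocumentElement().getNodeName());
        System.out.println(requested);
    }
}
```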

2.4.3. Element Content Whitespace

Element content whitespace - also known as ignorable whitespace - is whitespace contained in elements whose content model doesn't allow text content. For example, the newline and space characters between <foo> and <bar> in the following example could belong to this category.

<foo>
  <bar/></foo>

Using XMLUnit.setIgnoreWhitespace it is possible to make the test and control parser ignore this kind of whitespace.

Note that setting this property to true usually doesn't have any effect since it only works on validating parsers and XMLUnit doesn't enable validation by default. It does have an effect when comparing pieces of XML, though, since the same flag is used for a different purpose as well in that case. See Section 3.8.1, “Whitespace Handling” for more details.
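The dependency on validation can be reproduced with JAXP alone. In the sketch below (class and method names are invented, XMLUnit is not involved) the parser is always asked to ignore element content whitespace, but the whitespace text node only disappears when a DTD declares the content model and validation is switched on.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: counts the children of the root element after parsing.
class ElementContentWhitespaceSketch {

    // The example from the text, without any DTD.
    static final String PLAIN = "<foo>\n  <bar/></foo>";

    // The same document with a DTD saying that foo may only contain bar.
    static final String WITH_DTD =
        "<!DOCTYPE foo [<!ELEMENT foo (bar)><!ELEMENT bar EMPTY>]>" + PLAIN;

    static int rootChildCount(String xml, boolean validating) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(validating);
        // Requested in both cases; it needs a content model to have any effect.
        factory.setIgnoringElementContentWhitespace(true);
        Document doc = factory.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        // Whitespace text node plus <bar/>.
        System.out.println(rootChildCount(PLAIN, false));
        // Only <bar/> remains.
        System.out.println(rootChildCount(WITH_DTD, true));
    }
}
```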

2.4.4. XSLT Stylesheet Version

Some features of XMLUnit use XSLT stylesheets under the covers. In particular, XSLT is used to strip element content whitespace or comments, and it is used by SimpleXpathEngine. These stylesheets only require an XSLT transformer that supports XSLT 1.0 and will say so in the stylesheet element.

If your XSLT transformer supports XSLT 2.0 or newer it may[6] issue a warning for these stylesheets which can be annoying. You can use XMLUnit.setXSLTVersion to make XMLUnit change the version attribute to a different value. Note that XMLUnit hasn't been tested with a value other than "1.0".

2.5. Providing Input to XMLUnit

Most methods in XMLUnit that expect a piece of XML as input provide several overloads that obtain their input from different sources. The most common options are:

  • A DOM Document.

    Here you have all control over the document's creation. Such a Document could as well be the result of an XSLT transformation via the Transform class.

  • A SAX InputSource.

    This is the most generic way since InputSource allows you to read from arbitrary InputStreams or Readers. Use an InputStream wrapped by an InputSource if you want the XML parser to pick up the proper encoding from the XML declaration.

  • A String.

    Here a DOM Document is built from the input String using the JAXP parser specified for control or test documents - depending on whether the input is a control or test piece of XML.

    Note that using a String assumes that your XML has already been converted from its XML encoding to a Java String upfront.

  • A Reader.

    Here a DOM Document is built from the input Reader using the JAXP parser specified for control or test documents - depending on whether the input is a control or test piece of XML.

    Note that using a Reader is a bad choice if your XML encoding is different from your platform's default encoding since Java's IO system won't read your XML declaration. It is a good practice to use one of the other overloads rather than the Reader version to ensure encoding has been dealt with properly.
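The encoding pitfall described for Readers can be demonstrated with a small JAXP-only sketch. The document below declares ISO-8859-1 in its XML declaration; parsing the raw bytes through an InputSource honors that declaration, while wrapping the bytes in a Reader with a different charset decodes them before the parser ever sees the declaration. The class and method names are assumptions for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: byte stream input versus Reader input.
class EncodingSketch {

    // An ISO-8859-1 encoded document whose text content is a single e-acute.
    static byte[] latin1Document() {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a>\u00e9</a>";
        return xml.getBytes(StandardCharsets.ISO_8859_1);
    }

    static String rootText(InputSource source) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(source).getDocumentElement().getTextContent();
    }

    // The parser sees the bytes and picks the encoding from the XML declaration.
    static String textViaStream(byte[] bytes) throws Exception {
        return rootText(new InputSource(new ByteArrayInputStream(bytes)));
    }

    // The Reader decodes the bytes itself; the declared encoding is ignored.
    static String textViaReader(byte[] bytes, Charset assumed) throws Exception {
        return rootText(new InputSource(
            new InputStreamReader(new ByteArrayInputStream(bytes), assumed)));
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = latin1Document();
        System.out.println("\u00e9".equals(textViaStream(bytes)));
        System.out.println("\u00e9".equals(textViaReader(bytes, StandardCharsets.UTF_8)));
    }
}
```

A Reader is only safe when you construct it with the charset the document was actually written in, which is exactly the knowledge the XML declaration was meant to carry.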



[6] The W3C recommendation says it SHOULD.