6. DOM Tree Walking

Sometimes it is easier to test a piece of XML's validity by traversing the whole document node by node and testing each node individually. For example, there may be no control XML to validate against, or the expected value of an element's content may have to be calculated.

XMLUnit supports this approach of testing via the NodeTest class. In order to use it, you need a DOM implementation that generates Document instances that implement the optional org.w3c.dom.traversal.DocumentTraversal interface, which is not part of JAXP's standardized DOM support.

6.1. DocumentTraversal

As of the release of XMLUnit 1.1, the Document instances created by most parsers implement DocumentTraversal. This includes, but is not limited to, Apache Xerces, the parser shipping with Sun's JDK 5 and later, and GNU JAXP. One notable exception is Apache Crimson, which means the parser shipping with Sun's JDK 1.4 does not support traversal; you need to specify a different parser when using JDK 1.4 (see Section 2.4.1, “JAXP”).

You can test whether your XML parser supports DocumentTraversal by invoking org.w3c.dom.DOMImplementation's hasFeature method with the feature "Traversal".
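A minimal sketch of such a check, using only the JAXP and DOM classes that ship with the JDK. The class and method names (TraversalSupport, parserClaimsTraversal, documentIsTraversable) are made up for this illustration, and the version string "2.0" is an assumption; passing null instead asks for the feature in any version.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.traversal.DocumentTraversal;

// Hypothetical helper, not part of XMLUnit.
class TraversalSupport {

    private static DocumentBuilder newBuilder() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no usable JAXP parser", e);
        }
    }

    /** Asks the parser's DOMImplementation whether it reports the "Traversal" feature. */
    static boolean parserClaimsTraversal() {
        DOMImplementation impl = newBuilder().getDOMImplementation();
        // Assumption: "2.0" (DOM Level 2) is the version of interest.
        return impl.hasFeature("Traversal", "2.0");
    }

    /** Cross-check: does a Document created by the parser implement DocumentTraversal? */
    static boolean documentIsTraversable() {
        Document doc = newBuilder().newDocument();
        return doc instanceof DocumentTraversal;
    }
}
```

Both methods inspect whichever parser JAXP selects by default, so the result depends on the parser configuration of the running JVM.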

6.2. NodeTest

The NodeTest is instantiated with a piece of XML to traverse. It offers two performTest methods:

    /**
     * Does this NodeTest pass using the specified NodeTester instance?
     * @param tester
     * @param singleNodeType note <code>Node.ATTRIBUTE_NODE</code> is not
     *  exposed by the DocumentTraversal node iterator unless the root node
     *  is itself an attribute - so a NodeTester that needs to test attributes
     *  should obtain those attributes from <code>Node.ELEMENT_NODE</code>
     *  nodes
     * @exception NodeTestException if test fails
     */
    public void performTest(NodeTester tester, short singleNodeType);

    /**
     * Does this NodeTest pass using the specified NodeTester instance?
     * @param tester
     * @param nodeTypes note <code>Node.ATTRIBUTE_NODE</code> is not
     *  exposed by the DocumentTraversal node iterator unless the root node
     *  is itself an attribute - so a NodeTester that needs to test attributes
     *  should obtain those attributes from <code>Node.ELEMENT_NODE</code>
     *  nodes instead
     * @exception NodeTestException if test fails
     */
    public void performTest(NodeTester tester, short[] nodeTypes);

NodeTester is the interface whose implementations test each node; it is described in the next section.

The second argument restricts the test to DOM Nodes of one or more specific types. Node types are specified via the static fields of the Node interface. Any Node of a type not specified as the second argument to performTest will be ignored.

Unfortunately XML attributes are not exposed as Nodes during traversal. If you need access to attributes you must add Node.ELEMENT_NODE to the second argument of performTest and access the attributes from their parent Element.

Example 33. Accessing Attributes in a NodeTest

    ...
    NodeTest nt = new NodeTest(myXML);
    NodeTester tester = new MyNodeTester();
    nt.performTest(tester, Node.ELEMENT_NODE);
    ...

    class MyNodeTester implements NodeTester {
        public void testNode(Node aNode, NodeTest test) {
            Element anElement = (Element) aNode;
            Attr attributeToTest = anElement.getAttributeNode(ATTRIBUTE_NAME);
            ...
        }
        ...
    }
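The reason for going through Node.ELEMENT_NODE can be reproduced without XMLUnit. The following sketch drives a DOM NodeIterator directly with the JDK's default parser; it is an illustration of the traversal behaviour, not NodeTest's actual implementation, and the names AttributeWalk, attributesSeenByIterator and attributesViaElements are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XMLUnit.
class AttributeWalk {

    private static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Counts the nodes an iterator rooted at the root element returns when asked for attributes only. */
    static int attributesSeenByIterator(String xml) {
        Document doc = parse(xml);
        NodeIterator it = ((DocumentTraversal) doc).createNodeIterator(
                doc.getDocumentElement(), NodeFilter.SHOW_ATTRIBUTE, null, true);
        int count = 0;
        while (it.nextNode() != null) {
            count++;
        }
        return count;
    }

    /** Visits elements only and reads the attributes from each element. */
    static List<String> attributesViaElements(String xml) {
        Document doc = parse(xml);
        NodeIterator it = ((DocumentTraversal) doc).createNodeIterator(
                doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, null, true);
        List<String> found = new ArrayList<>();
        for (Node n = it.nextNode(); n != null; n = it.nextNode()) {
            NamedNodeMap attrs = ((Element) n).getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Attr a = (Attr) attrs.item(i);
                found.add(n.getNodeName() + "/@" + a.getName() + "=" + a.getValue());
            }
        }
        return found;
    }
}
```

For the input `<a id="1"><b id="2"/></a>` the first method returns 0, while the second returns the two entries a/@id=1 and b/@id=2.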

Any entities that appear as part of the Document are expanded before the traversal starts.

6.3. NodeTester

Implementations of the NodeTester interface are responsible for the actual test:

    /**
     * Validate a single Node
     * @param aNode
     * @param forTest
     * @exception NodeTestException if the node fails the test
     */
    void testNode(Node aNode, NodeTest forTest) throws NodeTestException ;

    /**
     * Validate that the Nodes passed one-by-one to the <code>testNode</code>
     * method were all the Nodes expected.
     * @param forTest
     * @exception NodeTestException if this instance was expecting more nodes
     */
    void noMoreNodes(NodeTest forTest) throws NodeTestException ;

NodeTest invokes testNode for each Node as soon as it is reached on the traversal. This means NodeTester "sees" the Nodes in the same order they appear within the tree.

noMoreNodes is invoked when the traversal is finished. It will also be invoked if the tree didn't contain any matched Nodes at all.

Implementations of NodeTester are expected to throw a NodeTestException if the current node doesn't match the test's expectations or if more nodes were expected when noMoreNodes is called.
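The calling sequence described above can be sketched with plain DOM traversal. The Visitor interface below is a hypothetical stand-in that only mirrors the shape of the two callbacks; it is not XMLUnit's NodeTester, it omits the NodeTest argument and the NodeTestException, and the loop is an assumption about how such a driver might look rather than NodeTest's real code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

// Hypothetical sketch, not part of XMLUnit.
class CallbackOrderSketch {

    /** Stand-in for the two-callback contract. */
    interface Visitor {
        void testNode(Node node);
        void noMoreNodes();
    }

    /** Records every callback it receives, in order. */
    static class Recorder implements Visitor {
        final List<String> calls = new ArrayList<>();
        public void testNode(Node node) { calls.add(node.getNodeName()); }
        public void noMoreNodes() { calls.add("<done>"); }
    }

    /** Walks the tree from the root element and reports nodes of a single type. */
    static List<String> walk(String xml, short nodeType) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        Recorder recorder = new Recorder();
        NodeIterator it = ((DocumentTraversal) doc).createNodeIterator(
                doc.getDocumentElement(), NodeFilter.SHOW_ALL, null, true);
        for (Node n = it.nextNode(); n != null; n = it.nextNode()) {
            if (n.getNodeType() == nodeType) {
                recorder.testNode(n);   // reported as soon as the node is reached
            }
        }
        recorder.noMoreNodes();         // called once at the end, even if nothing matched
        return recorder.calls;
    }
}
```

Walking `<a><b><c/></b><d/></a>` for Node.ELEMENT_NODE records a, b, c, d and then the end marker, i.e. document order. Asking for Node.COMMENT_NODE on the same input records only the end marker.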

XMLUnit ships with two implementations of NodeTester that are described in the following two sections.

6.3.1. AbstractNodeTester

AbstractNodeTester implements testNode by testing the passed-in Node for its type and delegating to one of the more specific test... methods it adds. By default the new test... methods all throw a NodeTestException because of an unexpected Node.

It further implements noMoreNodes with an empty method - i.e. it does nothing.

If you are only testing for specific types of Node it may be more convenient to subclass AbstractNodeTester. For example, Example 33, “Accessing Attributes in a NodeTest” could be rewritten as:

Example 34. Accessing Attributes in a NodeTest - AbstractNodeTester version

    ...
    NodeTest nt = new NodeTest(myXML);
    NodeTester tester = new AbstractNodeTester() {
        public void testElement(Element element) throws NodeTestException {
            Attr attributeToTest = element.getAttributeNode(ATTRIBUTE_NAME);
            ...
        }
    };
    nt.performTest(tester, Node.ELEMENT_NODE);
    ...

Note that even though AbstractNodeTester contains a testAttribute method it will never be called by default and you still need to access attributes via their parent elements.

Note also that the root of the test is the document's root element, so any Nodes preceding the document's root Element won't be visited either. For this reason the testDocumentType, testEntity and testNotation methods are probably never called either.

Finally, all entity references have been expanded before the traversal started. EntityReferences will have been replaced by their replacement text if it is available, which means testEntityReference will not be called for them either. Instead the replacement text will show up as (part of) a Text node or as Element node, depending on the entity's definition.
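The effect of entity expansion can be observed with plain DOM code. The sketch below uses the JDK's default parser with entity expansion switched on explicitly; it does not involve NodeTest, and the names EntityExpansionSketch, rootText and entityReferenceNodes are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XMLUnit.
class EntityExpansionSketch {

    private static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // This is the JAXP default; set here only to make the assumption visible.
            factory.setExpandEntityReferences(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Text content of the root element after parsing. */
    static String rootText(String xml) {
        return parse(xml).getDocumentElement().getTextContent();
    }

    /** Number of EntityReference nodes a traversal from the root element encounters. */
    static int entityReferenceNodes(String xml) {
        Document doc = parse(xml);
        NodeIterator it = ((DocumentTraversal) doc).createNodeIterator(
                doc.getDocumentElement(), NodeFilter.SHOW_ENTITY_REFERENCE, null, true);
        int count = 0;
        while (it.nextNode() != null) {
            count++;
        }
        return count;
    }
}
```

For a document that declares `<!ENTITY who "world">` and contains `<r>hello &who;</r>`, rootText yields the string hello world and the traversal finds no EntityReference nodes, so a tester would only ever see the replacement text.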

6.3.2. CountingNodeTester

org.custommonkey.xmlunit.examples.CountingNodeTester is a simple example NodeTester that asserts that a given number of Nodes have been traversed. It will throw a NodeTestException when noMoreNodes is called before the expected number of Nodes has been visited or when the actual number of Nodes exceeds the expected count.

6.4. JUnit 3.x Convenience Methods

XMLAssert and XMLTestCase contain several overloads of the assertNodeTestPasses method.

The most general form expects you to create a NodeTest instance yourself and lets you specify whether you expect the test to fail or to pass.

The other two overloads create a NodeTest instance from either a String or a SAX InputSource and are specialized for the case where you are only interested in a single Node type and expect the test to pass.

None of these methods provides any control over the message of the AssertionFailedError in case of a failure.

6.5. Configuration Options

The only configurable option for NodeTest is the XML parser used if the piece of XML is not specified as a Document or DocumentTraversal. NodeTest will use the "control" parser that has been configured - see Section 2.4.1, “JAXP” for details.

It will also use the EntityResolver configured for the control parser if one has been set - see Section 2.4.2, “EntityResolver”.