At the center of XMLUnit's support for comparisons is the DifferenceEngine class. In practice you rarely deal with it directly but rather use it via instances of the Diff or DetailedDiff classes (see Section 3.5, “Diff and DetailedDiff”).
The DifferenceEngine walks two trees of DOM Nodes, the control and the test tree, and compares the nodes. Whenever it detects a difference, it sends a message to a configured DifferenceListener (see Section 3.3, “DifferenceListener”) and asks a ComparisonController (see Section 3.2, “ComparisonController”) whether the current comparison should be halted.
In some cases the order of elements in two pieces of XML may not be significant. If this is true, the DifferenceEngine needs help to determine which Elements to compare. This is the job of an ElementQualifier (see Section 3.4, “ElementQualifier”).
The types of differences DifferenceEngine can detect are enumerated in the DifferenceConstants interface and represented by instances of the Difference class.
A Difference can be recoverable; recoverable Differences make the Diff class consider two pieces of XML similar while non-recoverable Differences render the two pieces different.
The types of Differences that are currently detected are listed in Table 1, “Document level Differences detected by DifferenceEngine” to Table 4, “Other Differences detected by DifferenceEngine” (the first two columns refer to constants of the DifferenceConstants interface).
Table 1. Document level Differences detected by DifferenceEngine
ID | Constant | recoverable | Description |
---|---|---|---|
HAS_DOCTYPE_DECLARATION_ID | HAS_DOCTYPE_DECLARATION | true | One piece of XML has a DOCTYPE declaration while the other one has not. |
DOCTYPE_NAME_ID | DOCTYPE_NAME | false | Both pieces of XML contain a DOCTYPE declaration but the declarations specify different names for the root element. |
DOCTYPE_PUBLIC_ID_ID | DOCTYPE_PUBLIC_ID | false | Both pieces of XML contain a DOCTYPE declaration but the declarations specify different PUBLIC identifiers. |
DOCTYPE_SYSTEM_ID_ID | DOCTYPE_SYSTEM_ID | true | Both pieces of XML contain a DOCTYPE declaration but the declarations specify different SYSTEM identifiers. |
NODE_TYPE_ID | NODE_TYPE | false | The test piece of XML contains a different type of node than was expected. This type of difference will also occur if either the root control or test Node is null while the other is not. |
NAMESPACE_PREFIX_ID | NAMESPACE_PREFIX | true | Two nodes use different prefixes for the same XML Namespace URI in the two pieces of XML. |
NAMESPACE_URI_ID | NAMESPACE_URI | false | Two nodes in the two pieces of XML share the same local name but use different XML Namespace URIs. |
SCHEMA_LOCATION_ID | SCHEMA_LOCATION | true | Two nodes have different values for the schemaLocation attribute of the XMLSchema-Instance namespace. The attribute could be present on only one of the two nodes. |
NO_NAMESPACE_SCHEMA_LOCATION_ID | NO_NAMESPACE_SCHEMA_LOCATION | true | Two nodes have different values for the noNamespaceSchemaLocation attribute of the XMLSchema-Instance namespace. The attribute could be present on only one of the two nodes. |
Table 2. Element level Differences detected by DifferenceEngine
ID | Constant | recoverable | Description |
---|---|---|---|
ELEMENT_TAG_NAME_ID | ELEMENT_TAG_NAME | false | The two pieces of XML contain elements with different tag names. |
ELEMENT_NUM_ATTRIBUTES_ID | ELEMENT_NUM_ATTRIBUTES | false | The two pieces of XML contain a common element, but the number of attributes on the element is different. |
HAS_CHILD_NODES_ID | HAS_CHILD_NODES | false | An element in one piece of XML has child nodes while the corresponding one in the other has not. |
CHILD_NODELIST_LENGTH_ID | CHILD_NODELIST_LENGTH | false | Two elements in the two pieces of XML differ by their number of child nodes. |
CHILD_NODELIST_SEQUENCE_ID | CHILD_NODELIST_SEQUENCE | true | Two elements in the two pieces of XML contain the same child nodes but in a different order. |
CHILD_NODE_NOT_FOUND_ID | CHILD_NODE_NOT_FOUND | false | A child node in one piece of XML couldn't be matched against any node of the other piece. |
ATTR_SEQUENCE_ID | ATTR_SEQUENCE | true | The attributes on an element appear in different order[a] in the two pieces of XML. |
[a] Note that the order of attributes is not significant in XML; different parsers may return attributes in a different order even when parsing the same XML document. There is an option to turn this check off (see Section 3.8, “Configuration Options”), but it is on by default for backwards compatibility reasons.
Table 3. Attribute level Differences detected by DifferenceEngine
ID | Constant | recoverable | Description |
---|---|---|---|
ATTR_VALUE_EXPLICITLY_SPECIFIED_ID | ATTR_VALUE_EXPLICITLY_SPECIFIED | true | An attribute that has a default value according to the content model of the element in question has been specified explicitly in one piece of XML but not in the other.[a] |
ATTR_NAME_NOT_FOUND_ID | ATTR_NAME_NOT_FOUND | false | One piece of XML contains an attribute on an element that is missing in the other. |
ATTR_VALUE_ID | ATTR_VALUE | false | The value of an element's attribute is different in the two pieces of XML. |
[a] In order for this difference to be detected the parser must have been in validating mode when the piece of XML was parsed and the DTD or XML Schema must have been available.
Table 4. Other Differences detected by DifferenceEngine
ID | Constant | recoverable | Description |
---|---|---|---|
COMMENT_VALUE_ID | COMMENT_VALUE | false | The content of two comments is different in the two pieces of XML. |
PROCESSING_INSTRUCTION_TARGET_ID | PROCESSING_INSTRUCTION_TARGET | false | The target of two processing instructions is different in the two pieces of XML. |
PROCESSING_INSTRUCTION_DATA_ID | PROCESSING_INSTRUCTION_DATA | false | The data of two processing instructions is different in the two pieces of XML. |
CDATA_VALUE_ID | CDATA_VALUE | false | The content of two CDATA sections is different in the two pieces of XML. |
TEXT_VALUE_ID | TEXT_VALUE | false | The value of two texts is different in the two pieces of XML. |
Note that some of the differences listed may be ignored by the DifferenceEngine if certain configuration options have been specified. See Section 3.8, “Configuration Options” for details.
DifferenceEngine passes the differences it finds around as instances of the Difference class. In addition to the type of difference this class also holds information on the nodes that have been found to be different. The nodes are described by NodeDetail instances that encapsulate the DOM Node instance as well as the XPath expression that locates the Node inside the given piece of XML. NodeDetail also contains a "value" that provides more information on the actual values that have been found to be different; the concrete interpretation depends on the type of difference as can be seen in Table 5, “Contents of NodeDetail.getValue() for Differences”.
Table 5. Contents of NodeDetail.getValue() for Differences
Difference.getId() | NodeDetail.getValue() |
---|---|
HAS_DOCTYPE_DECLARATION_ID | "not null" if the document has a DOCTYPE declaration, "null" otherwise. |
DOCTYPE_NAME_ID | The name of the root element. |
DOCTYPE_PUBLIC_ID_ID | The PUBLIC identifier. |
DOCTYPE_SYSTEM_ID_ID | The SYSTEM identifier. |
NODE_TYPE_ID | If one node was absent: "not null" if the node exists, "null" otherwise. If the node types differ the value will be a string-ified version of org.w3c.dom.Node.getNodeType(). |
NAMESPACE_PREFIX_ID | The Namespace prefix. |
NAMESPACE_URI_ID | The Namespace URI. |
SCHEMA_LOCATION_ID | The attribute's value or "[attribute absent]" if it has not been specified. |
NO_NAMESPACE_SCHEMA_LOCATION_ID | The attribute's value or "[attribute absent]" if it has not been specified. |
ELEMENT_TAG_NAME_ID | The tag name with any Namespace information stripped. |
ELEMENT_NUM_ATTRIBUTES_ID | The number of attributes present turned into a String. |
HAS_CHILD_NODES_ID | "true" if the element has child nodes, "false" otherwise. |
CHILD_NODELIST_LENGTH_ID | The number of child nodes present turned into a String. |
CHILD_NODELIST_SEQUENCE_ID | The sequence number of this child node turned into a String. |
CHILD_NODE_NOT_FOUND_ID | The name of the unmatched node or "null". If the node is an element inside an XML namespace the name will be Java5-QName-like {NS-URI}LOCAL-NAME - in all other cases it is the node's local name. |
ATTR_SEQUENCE_ID | The attribute's name. |
ATTR_VALUE_EXPLICITLY_SPECIFIED_ID | "true" if the attribute has been specified, "false" otherwise. |
ATTR_NAME_NOT_FOUND_ID | The attribute's name or "null". If the attribute belongs to an XML namespace the name will be Java5-QName-like {NS-URI}LOCAL-NAME - in all other cases it is the attribute's local name. |
ATTR_VALUE_ID | The attribute's value. |
COMMENT_VALUE_ID | The actual comment. |
PROCESSING_INSTRUCTION_TARGET_ID | The processing instruction's target. |
PROCESSING_INSTRUCTION_DATA_ID | The processing instruction's data. |
CDATA_VALUE_ID | The content of the CDATA section. |
TEXT_VALUE_ID | The actual text. |
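The XPath expression held by a NodeDetail can be evaluated against the piece of XML it came from in order to find the Node again. The following is a minimal sketch that uses only the JDK's javax.xml.xpath API; the class name XPathLocateSketch and the positional expression /a[1]/c[1] are illustrative assumptions, not output captured from XMLUnit.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class XPathLocateSketch {
    /** Parses the XML and returns the node the XPath points to, or null if none. */
    static Node locate(String xml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return (Node) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // hypothetical XPath of the kind a NodeDetail might carry
        Node node = locate("<a><b/><c/></a>", "/a[1]/c[1]");
        System.out.println(node.getNodeName()); // prints "c"
    }
}
```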
As said in the first paragraph you won't deal with DifferenceEngine directly in most cases. In cases where Diff or DetailedDiff don't provide what you need you'd create an instance of DifferenceEngine, passing a ComparisonController in the constructor, and invoke compare with the DOM trees to compare as well as a DifferenceListener and an ElementQualifier. The listener will be called on any differences while the compare method is executing.
Example 16. Using DifferenceEngine Directly
    class MyDifferenceListener implements DifferenceListener {
        private boolean calledFlag = false;

        public boolean called() {
            return calledFlag;
        }

        public int differenceFound(Difference difference) {
            calledFlag = true;
            return RETURN_ACCEPT_DIFFERENCE;
        }

        public void skippedComparison(Node control, Node test) {
        }
    }

    DifferenceEngine engine = new DifferenceEngine(myComparisonController);
    MyDifferenceListener listener = new MyDifferenceListener();
    engine.compare(controlNode, testNode, listener, myElementQualifier);
    System.err.println("There have been "
                       + (listener.called() ? "" : "no ")
                       + "differences.");
The ComparisonController's job is to decide whether a comparison should be halted after a difference has been found. Its interface is:
    /**
     * Determine whether a Difference that the listener has been notified of
     * should halt further XML comparison. Default behaviour for a Diff
     * instance is to halt if the Difference is not recoverable.
     * @see Difference#isRecoverable
     * @param afterDifference the last Difference passed to <code>differenceFound</code>
     * @return true to halt further comparison, false otherwise
     */
    boolean haltComparison(Difference afterDifference);
Whenever a difference has been detected by the DifferenceEngine the haltComparison method will be called immediately after the DifferenceListener has been informed of the difference. This is true no matter what type of Difference has been found or which value the DifferenceListener has returned.
The only implementations of ComparisonController that ship with XMLUnit are Diff and DetailedDiff; see Section 3.5, “Diff and DetailedDiff” for details about them.
A ComparisonController that halted the comparison on any non-recoverable difference could be implemented as:
Example 17. A Simple ComparisonController
    public class HaltOnNonRecoverable implements ComparisonController {
        public boolean haltComparison(Difference afterDifference) {
            return !afterDifference.isRecoverable();
        }
    }
DifferenceListener contains two callback methods that are invoked by the DifferenceEngine when differences are detected:
    /**
     * Receive notification that 2 nodes are different.
     * @param difference a Difference instance as defined in {@link
     * DifferenceConstants DifferenceConstants} describing the cause
     * of the difference and containing the detail of the nodes that
     * differ
     * @return int one of the RETURN_... constants describing how this
     * difference was interpreted
     */
    int differenceFound(Difference difference);

    /**
     * Receive notification that a comparison between 2 nodes has been skipped
     * because the node types are not comparable by the DifferenceEngine
     * @param control the control node being compared
     * @param test the test node being compared
     * @see DifferenceEngine
     */
    void skippedComparison(Node control, Node test);
differenceFound is invoked by DifferenceEngine as soon as a difference has been detected. The return value of that method is completely ignored by DifferenceEngine; it becomes important when the listener is used together with Diff, though (see Section 3.5, “Diff and DetailedDiff”). The return value should be one of the four constants defined in the DifferenceListener interface:
    /**
     * Standard return value for the <code>differenceFound</code> method.
     * Indicates that the <code>Difference</code> is interpreted as defined
     * in {@link DifferenceConstants DifferenceConstants}.
     */
    int RETURN_ACCEPT_DIFFERENCE = 0;

    /**
     * Override return value for the <code>differenceFound</code> method.
     * Indicates that the nodes identified as being different should be
     * interpreted as being identical.
     */
    int RETURN_IGNORE_DIFFERENCE_NODES_IDENTICAL = 1;

    /**
     * Override return value for the <code>differenceFound</code> method.
     * Indicates that the nodes identified as being different should be
     * interpreted as being similar.
     */
    int RETURN_IGNORE_DIFFERENCE_NODES_SIMILAR = 2;

    /**
     * Override return value for the <code>differenceFound</code> method.
     * Indicates that the nodes identified as being similar should be
     * interpreted as being different.
     */
    int RETURN_UPGRADE_DIFFERENCE_NODES_DIFFERENT = 3;
The skippedComparison method is invoked if the DifferenceEngine encounters two Nodes it cannot compare. Before invoking skippedComparison, DifferenceEngine will have invoked differenceFound with a Difference of type NODE_TYPE.
A custom DifferenceListener that ignored any DOCTYPE related differences could be written as:
Example 18. A DifferenceListener that Ignores DOCTYPE Differences
    import java.util.Arrays;
    import org.custommonkey.xmlunit.Difference;
    import org.custommonkey.xmlunit.DifferenceConstants;
    import org.custommonkey.xmlunit.DifferenceListener;
    import org.w3c.dom.Node;

    public class IgnoreDoctype implements DifferenceListener {
        // IDs of all DOCTYPE related differences
        private static final int[] IGNORE = new int[] {
            DifferenceConstants.HAS_DOCTYPE_DECLARATION_ID,
            DifferenceConstants.DOCTYPE_NAME_ID,
            DifferenceConstants.DOCTYPE_PUBLIC_ID_ID,
            DifferenceConstants.DOCTYPE_SYSTEM_ID_ID
        };

        static {
            // binarySearch requires a sorted array
            Arrays.sort(IGNORE);
        }

        public int differenceFound(Difference difference) {
            return Arrays.binarySearch(IGNORE, difference.getId()) >= 0
                ? RETURN_IGNORE_DIFFERENCE_NODES_IDENTICAL
                : RETURN_ACCEPT_DIFFERENCE;
        }

        public void skippedComparison(Node control, Node test) {
        }
    }
Apart from Diff and DetailedDiff XMLUnit ships with an additional implementation of DifferenceListener.
IgnoreTextAndAttributeValuesDifferenceListener doesn't do anything in skippedComparison. It "downgrades" Differences of type ATTR_VALUE, ATTR_VALUE_EXPLICITLY_SPECIFIED and TEXT_VALUE to recoverable differences.
This means if instances of IgnoreTextAndAttributeValuesDifferenceListener are used together with Diff then two pieces of XML will be considered similar if they have the same basic structure. They are not considered identical, though.
Note that the list of ignored differences doesn't cover all textual differences. You should configure XMLUnit to ignore comments and whitespace and to consider CDATA sections and text nodes to be the same (see Section 3.8, “Configuration Options”) in order to cover COMMENT_VALUE and CDATA_VALUE as well.
When DifferenceEngine encounters a list of DOM Elements as children of another Element it will ask the configured ElementQualifier which Element of the control piece of XML should be compared to which of the test piece. Its contract is:
    /**
     * Determine whether two elements are comparable
     * @param control an Element from the control XML NodeList
     * @param test an Element from the test XML NodeList
     * @return true if the elements are comparable, false otherwise
     */
    boolean qualifyForComparison(Element control, Element test);
For any given Element in the control piece of XML DifferenceEngine will cycle through the corresponding list of Elements in the test piece of XML until qualifyForComparison has returned true or the list of test Elements is exhausted.
When using DifferenceEngine or Diff it is completely legal to set the ElementQualifier to null. In this case any kind of Node is compared to the test Node that appears at the same position in the sequence.
Example 19. Example Nodes for ElementQualifier (the comments are not part of the example)
    <!-- control piece of XML -->
    <parent>
      <child1/>                        <!-- control node 1 -->
      <child2/>                        <!-- control node 2 -->
      <child2 foo="bar">xyzzy</child2> <!-- control node 3 -->
      <child2 foo="baz"/>              <!-- control node 4 -->
    </parent>

    <!-- test piece of XML -->
    <parent>
      <child2 foo="baz"/>              <!-- test node 1 -->
      <child1/>                        <!-- test node 2 -->
      <child2>xyzzy</child2>           <!-- test node 3 -->
      <child2 foo="bar"/>              <!-- test node 4 -->
    </parent>
Taking Example 19, “Example Nodes for ElementQualifier (the comments are not part of the example)” without any ElementQualifier, DifferenceEngine will compare control node n to test node n for n between 1 and 4. In many cases this is exactly what is desired, but sometimes <a><b/><c/></a> should be similar to <a><c/><b/></a> because the order of elements doesn't matter - this is when you'd use a different ElementQualifier. XMLUnit ships with several implementations.
With ElementNameQualifier, only Elements with the same name - and Namespace URI if present - qualify.
In Example 19, “Example Nodes for ElementQualifier (the comments are not part of the example)” this means control node 1 will be compared to test node 2. Then control node 2 will be compared to test node 3 because DifferenceEngine will start to search for the matching test Element at the second test node, the same sequence number the control node is at. Control node 3 is compared to test node 3 as well and control node 4 to test node 4.
Only Elements with the same name - and Namespace URI if present - as well as the same values for all attributes given in ElementNameAndAttributeQualifier's constructor qualify.
Let's say "foo" has been passed to ElementNameAndAttributeQualifier's constructor when looking at Example 19, “Example Nodes for ElementQualifier (the comments are not part of the example)”. This again means control node 1 will be compared to test node 2 since they do have the same name and no value at all for attribute "foo". Then control node 2 will be compared to test node 3 - again, no value for "foo". Control node 3 is compared to test node 4 as they have the same value "bar". Finally control node 4 is compared to test node 1; here DifferenceEngine searches from the beginning of the test node list after test node 4 didn't match.
There are three constructors in ElementNameAndAttributeQualifier. The no-arg constructor creates an instance that compares all attributes while the others will compare a single attribute or a given subset of all attributes.
With ElementNameAndTextQualifier, only Elements with the same name - and Namespace URI if present - as well as the same text content nested into them qualify.
In Example 19, “Example Nodes for ElementQualifier (the comments are not part of the example)” this means control node 1 will be compared to test node 2 since neither of them has any nested text at all. Then control node 2 will be compared to test node 4. Control node 3 is compared to test node 3 since they have the same nested text and control node 4 to test node 4.
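The pairings described for the three qualifiers can be reproduced with a small simulation. The sketch below is not XMLUnit code: it uses only JDK classes, models a qualifier as a BiPredicate stand-in for qualifyForComparison, and assumes the search order described in the text (start at the test node with the control node's sequence number, wrap around to the beginning of the list). All class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class QualifierSearchSketch {
    // the control and test pieces of Example 19 without comments and whitespace
    static final String CONTROL = "<parent><child1/><child2/>"
        + "<child2 foo=\"bar\">xyzzy</child2><child2 foo=\"baz\"/></parent>";
    static final String TEST = "<parent><child2 foo=\"baz\"/><child1/>"
        + "<child2>xyzzy</child2><child2 foo=\"bar\"/></parent>";

    /** Parses the XML and returns the element children of its root. */
    static List<Element> childElements(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            List<Element> result = new ArrayList<>();
            for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE) {
                    result.add((Element) n);
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** For each control node, the 1-based number of the test node it is paired with (0 = none). */
    static int[] pair(BiPredicate<Element, Element> qualifier) {
        List<Element> control = childElements(CONTROL);
        List<Element> test = childElements(TEST);
        int[] paired = new int[control.size()];
        for (int i = 0; i < control.size(); i++) {
            for (int k = 0; k < test.size(); k++) {
                int j = (i + k) % test.size(); // start at position i, then wrap around
                if (qualifier.test(control.get(i), test.get(j))) {
                    paired[i] = j + 1;
                    break;
                }
            }
        }
        return paired;
    }

    static boolean sameName(Element c, Element t) {
        return c.getTagName().equals(t.getTagName());
    }

    static boolean sameNameAndFoo(Element c, Element t) {
        // getAttribute returns "" for an absent attribute
        return sameName(c, t) && c.getAttribute("foo").equals(t.getAttribute("foo"));
    }

    static boolean sameNameAndText(Element c, Element t) {
        return sameName(c, t) && c.getTextContent().equals(t.getTextContent());
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(pair(QualifierSearchSketch::sameName)));        // [2, 3, 3, 4]
        System.out.println(Arrays.toString(pair(QualifierSearchSketch::sameNameAndFoo)));  // [2, 3, 4, 1]
        System.out.println(Arrays.toString(pair(QualifierSearchSketch::sameNameAndText))); // [2, 4, 3, 4]
    }
}
```

The three result arrays correspond to the walk-throughs given for matching by name, by name and the "foo" attribute, and by name and nested text.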
All ElementQualifiers seen so far only looked at the Elements themselves and not at the structure nested into them at a deeper level. A frequent user question has been which ElementQualifier should be used if the pieces of XML in Example 20, “Example for RecursiveElementNameAndTextQualifier (the comments are not part of the example)” should be considered similar.
Example 20. Example for RecursiveElementNameAndTextQualifier (the comments are not part of the example)
    <!-- control -->
    <table>
      <tr>            <!-- control row 1 -->
        <td>foo</td>
      </tr>
      <tr>            <!-- control row 2 -->
        <td>bar</td>
      </tr>
    </table>

    <!-- test -->
    <table>
      <tr>            <!-- test row 1 -->
        <td>bar</td>
      </tr>
      <tr>            <!-- test row 2 -->
        <td>foo</td>
      </tr>
    </table>
At first glance ElementNameAndTextQualifier should work but it doesn't. When DifferenceEngine processed the children of table it would compare control row 1 to test row 1 since both tr elements have the same name and both have no textual content at all.
What is needed in this case is an ElementQualifier that looks at the element's name, as well as the name of the first child element and the text nested into that first child element. This is what RecursiveElementNameAndTextQualifier does.
RecursiveElementNameAndTextQualifier ignores whitespace between the elements leading up to the nested text.
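The idea of looking at the first child element and the text nested into it can be sketched without XMLUnit. The code below only illustrates the description above and is not the implementation of RecursiveElementNameAndTextQualifier; RecursiveKeySketch and its methods are hypothetical names. Two elements would qualify for comparison if their keys are equal.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class RecursiveKeySketch {
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** First child that is an element; whitespace text in front of it is skipped. */
    static Element firstChildElement(Element e) {
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return (Element) n;
            }
        }
        return null;
    }

    /** Element names along the chain of first child elements, followed by the innermost text. */
    static String key(Element e) {
        StringBuilder sb = new StringBuilder();
        Element current = e;
        while (current != null) {
            sb.append('/').append(current.getTagName());
            Element next = firstChildElement(current);
            if (next == null) {
                sb.append('=').append(current.getTextContent().trim());
            }
            current = next;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(key(parseRoot("<tr><td>foo</td></tr>")));       // /tr/td=foo
        System.out.println(key(parseRoot("<tr>\n  <td>bar</td>\n</tr>"))); // /tr/td=bar
    }
}
```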
MultiLevelElementNameAndTextQualifier has in a way been the predecessor of RecursiveElementNameAndTextQualifier (see Section 3.4.4, “org.custommonkey.xmlunit.examples.RecursiveElementNameAndTextQualifier”). It also matches element names and those of nested child elements until it finds matches, but unlike RecursiveElementNameAndTextQualifier, you must tell MultiLevelElementNameAndTextQualifier at which nesting level it should expect the nested text.
MultiLevelElementNameAndTextQualifier's constructor expects a single argument which is the nesting level of the expected text. If you use an argument of 1, MultiLevelElementNameAndTextQualifier is identical to ElementNameAndTextQualifier. In Example 20, “Example for RecursiveElementNameAndTextQualifier (the comments are not part of the example)” a value of 2 would be needed.
By default MultiLevelElementNameAndTextQualifier will not ignore whitespace between the elements leading up to the nested text. If your piece of XML contains this sort of whitespace (like Example 20, “Example for RecursiveElementNameAndTextQualifier (the comments are not part of the example)” which contains a newline and several space characters between <tr> and <td>) you can either instruct XMLUnit to ignore whitespace completely (see Section 3.8.1, “Whitespace Handling”) or use the two-arg constructor of MultiLevelElementNameAndTextQualifier introduced with XMLUnit 1.2 and set the ignoreEmptyTexts argument to true.
In general RecursiveElementNameAndTextQualifier requires less knowledge upfront and its whitespace-handling is more intuitive.
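What a fixed nesting level means, and why whitespace in front of the nested element gets in the way, can be shown with plain DOM code. This is only a sketch of the behaviour described above and makes no claim about how MultiLevelElementNameAndTextQualifier is implemented; NestingLevelSketch and textAtLevel are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class NestingLevelSketch {
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /**
     * Text found at the given nesting level (1 = the element itself).
     * Returns null if the first child on the way down is not an element,
     * for example because whitespace precedes the nested element.
     */
    static String textAtLevel(Element e, int level) {
        Node current = e;
        for (int i = 1; i < level; i++) {
            Node child = current.getFirstChild();
            if (!(child instanceof Element)) {
                return null;
            }
            current = child;
        }
        StringBuilder text = new StringBuilder();
        for (Node n = current.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.TEXT_NODE) {
                text.append(n.getNodeValue());
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        Element row = parseRoot("<tr><td>foo</td></tr>");
        System.out.println("[" + textAtLevel(row, 1) + "]"); // [] - level 1 cannot tell rows apart
        System.out.println("[" + textAtLevel(row, 2) + "]"); // [foo]
        // whitespace between <tr> and <td> hides the nested element
        System.out.println(textAtLevel(parseRoot("<tr>\n <td>foo</td></tr>"), 2)); // null
    }
}
```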
Diff and DetailedDiff provide simplified access to DifferenceEngine by implementing the ComparisonController and DifferenceListener interfaces themselves. They cover the two most common use cases for comparing two pieces of XML: checking whether the pieces are different (this is what Diff does) and finding all differences between them (this is what DetailedDiff does).
DetailedDiff is a subclass of Diff and can only be constructed by creating a Diff instance first.
The major difference between them is their implementation of the ComparisonController interface: DetailedDiff will never stop the comparison since it wants to collect all differences. Diff in turn will halt the comparison as soon as the first Difference is found that is not recoverable. In addition DetailedDiff collects all Differences in a list and provides access to it.
By default Diff will consider two pieces of XML as identical if no differences have been found at all, similar if all differences that have been found have been recoverable (see Table 1, “Document level Differences detected by DifferenceEngine” to Table 4, “Other Differences detected by DifferenceEngine”) and different as soon as any non-recoverable difference has been found.
It is possible to specify a DifferenceListener to Diff using the overrideDifferenceListener method. In this case each Difference will be evaluated by the passed in DifferenceListener. By returning RETURN_IGNORE_DIFFERENCE_NODES_IDENTICAL the custom listener can make Diff ignore the difference completely. Likewise any Difference for which the custom listener returns RETURN_IGNORE_DIFFERENCE_NODES_SIMILAR will be treated as if the Difference was recoverable.
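The rules above can be summarized as a small decision procedure. The following sketch models them with stand-in types: the Verdict enum mirrors the four RETURN_... constants of DifferenceListener, and Reported pairs a difference's default recoverability with the listener's answer. None of these types exist in XMLUnit; they are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class DiffOutcomeSketch {
    /** Stand-in for the four RETURN_... constants of DifferenceListener. */
    enum Verdict { ACCEPT, IGNORE_IDENTICAL, IGNORE_SIMILAR, UPGRADE_DIFFERENT }

    enum Outcome { IDENTICAL, SIMILAR, DIFFERENT }

    /** One reported difference: its default recoverability and the listener's verdict. */
    static final class Reported {
        final boolean recoverable;
        final Verdict verdict;

        Reported(boolean recoverable, Verdict verdict) {
            this.recoverable = recoverable;
            this.verdict = verdict;
        }
    }

    static Outcome classify(List<Reported> differences) {
        Outcome outcome = Outcome.IDENTICAL;
        for (Reported d : differences) {
            switch (d.verdict) {
                case IGNORE_IDENTICAL:
                    continue;                  // ignored completely
                case IGNORE_SIMILAR:
                    outcome = Outcome.SIMILAR; // treated as recoverable
                    break;
                case UPGRADE_DIFFERENT:
                    return Outcome.DIFFERENT;  // treated as non-recoverable
                default:
                    if (!d.recoverable) {
                        return Outcome.DIFFERENT; // this is where Diff would halt
                    }
                    outcome = Outcome.SIMILAR;
            }
        }
        return outcome;
    }

    public static void main(String[] args) {
        System.out.println(classify(Collections.<Reported>emptyList()));                     // IDENTICAL
        System.out.println(classify(Arrays.asList(new Reported(true, Verdict.ACCEPT))));     // SIMILAR
        System.out.println(classify(Arrays.asList(new Reported(false, Verdict.ACCEPT))));    // DIFFERENT
    }
}
```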
There are several overloads of the Diff constructor that allow you to specify your pieces of XML in many ways. There are overloads that accept additional DifferenceEngine and ElementQualifier arguments. Passing in a DifferenceEngine of your own is the only way to use a ComparisonController other than Diff.
Note that Diff and DetailedDiff use ElementNameQualifier as their default ElementQualifier. This is different from DifferenceEngine which defaults to no ElementQualifier at all.
To use a custom ElementQualifier you can also use the overrideElementQualifier method. Use this with an argument of null to unset the default ElementQualifier as well.
To compare two pieces of XML you'd create a Diff instance from those two pieces and invoke identical to check that there have been no differences at all and similar to check that all differences, if any, have been recoverable. If the pieces are identical they are also similar. Likewise if they are not similar they can't be identical either.
Example 21. Comparing Two Pieces of XML Using Diff
    Diff d = new Diff("<a><b/><c/></a>", "<a><c/><b/></a>");
    assertFalse(d.identical()); // CHILD_NODELIST_SEQUENCE Difference
    assertTrue(d.similar());
The result of the comparison is cached in Diff; repeated invocations of identical or similar will not reevaluate the pieces of XML.
Note: calling toString on an instance of Diff or DetailedDiff will perform the comparison and cache its result immediately. If you change the DifferenceListener or ElementQualifier after calling toString it won't have any effect.
DetailedDiff provides only a single constructor that expects a Diff as argument. Don't use DetailedDiff if all you need to know is whether two pieces of XML are identical/similar - use Diff directly since its short-cut ComparisonController implementation will save time in this case.
Example 22. Finding All Differences Using DetailedDiff
    Diff d = new Diff("<a><b/><c/></a>", "<a><c/><b/></a>");
    DetailedDiff dd = new DetailedDiff(d);
    dd.overrideElementQualifier(null);
    assertFalse(dd.similar());
    List l = dd.getAllDifferences();
    assertEquals(2, l.size()); // expected <b/> but was <c/> and vice versa
Sometimes you might be interested in any sort of comparison result and want to get notified of successful matches as well. Maybe you want to provide feedback on the number of differences and similarities between two documents, for example.
The interface MatchTracker can be implemented to get notified on each and every successful match. Note that there may be a lot more comparisons going on than you might expect, so your callback may be invoked very often.
Example 23. The MatchTracker interface
    package org.custommonkey.xmlunit;

    /**
     * Listener for callbacks from a {@link DifferenceEngine#compare
     * DifferenceEngine comparison} that is notified on each and every
     * comparison that resulted in a match.
     */
    public interface MatchTracker {
        /**
         * Receive notification that 2 nodes match.
         * @param difference a Difference instance as defined in {@link
         * DifferenceConstants DifferenceConstants} describing the test
         * that matched and containing the detail of the nodes that have
         * been compared
         */
        void matchFound(Difference difference);
    }
Despite its name the Difference instance passed into the matchFound method really describes a match and not a difference. You can expect that the getValue method on both the control and the test NodeDetail will be equal.
DifferenceEngine provides a constructor overload that allows you to pass in a MatchTracker instance and also provides a setMatchTracker method. Diff and DetailedDiff provide overrideMatchTracker methods that fill the same purpose.
Note that your MatchTracker won't receive any callbacks once the configured ComparisonController has decided that DifferenceEngine should halt the comparison.
XMLAssert and XMLTestCase contain quite a few overloads of methods for comparing two pieces of XML.
The methods' names use the word Equal to mean the same as similar in the Diff class (or throughout this guide). So assertXMLEqual will assert that only recoverable differences have been encountered whereas assertXMLNotEqual asserts that some differences have been non-recoverable. assertXMLIdentical asserts that there haven't been any differences at all while assertXMLNotIdentical asserts that there have been differences (recoverable or not).
Most of the overloads of assertXMLEqual just provide different means to specify the pieces of XML as Strings, InputSources, Readers[7] or Documents. For each method there is a version that takes an additional err argument which is used to create the message if the assertion fails.
If you don't need any control over the ElementQualifier or DifferenceListener used by Diff these methods will save some boilerplate code. If CONTROL and TEST are pieces of XML represented as one of the supported inputs then
    Diff d = new Diff(CONTROL, TEST);
    assertTrue("expected pieces to be similar, " + d.toString(),
               d.similar());
and
    assertXMLEqual("expected pieces to be similar", CONTROL, TEST);
are equivalent.
If you need more control over the Diff instance there is a version of assertXMLEqual (and assertXMLIdentical) that accepts a Diff instance as its argument as well as a boolean indicating whether you expect the Diff to be similar (identical) or not.
XMLTestCase contains a couple of compareXML methods that really are only shortcuts to Diff's constructors.
There is no way to use DifferenceEngine or DetailedDiff directly via the convenience methods.
Unless you are using the Document or DOMSource overloads when specifying your pieces of XML, XMLUnit will use the configured XML parsers (see Section 2.4.1, “JAXP”) and EntityResolvers (see Section 2.4.2, “EntityResolver”). There are configuration options to use different settings for the control and test pieces of XML.
In addition some of the other configuration settings may lead to XMLUnit using the configured XSLT transformer (see Section 2.4.1, “JAXP”) under the covers.
Two different configuration options affect how XMLUnit treats whitespace in comparisons:
If XMLUnit has been configured to ignore element content whitespace it will trim any text nodes found by the parser. This means that there won't appear to be any textual content in element <foo> for the following example. If you don't set XMLUnit.setIgnoreWhitespace there would be textual content consisting of a new line character.
    <foo>
    </foo>
At the same time the following two <foo> elements will be considered identical if the option has been enabled, though.
    <foo>bar</foo>
    <foo> bar </foo>
When this option is set to true, Diff will use the XSLT transformer under the covers.
If you set XMLUnit.setNormalizeWhitespace to true then XMLUnit will replace any kind of whitespace found in character content with a SPACE character and collapse consecutive whitespace characters to a single SPACE. It will also trim the resulting character content on both ends.
The following two <foo> elements will be considered identical if the option has been set:
    <foo>bar baz</foo>
    <foo> bar baz</foo>
Note that this is not related to "normalizing" the document as a whole (see Section 3.8.2, “"Normalizing" Documents”).
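The effect of the two whitespace options on a piece of character content can be approximated with plain string operations. This is a sketch only: it uses Java's own notion of whitespace (String.trim and the \s character class), which is an assumption and may not coincide exactly with what XMLUnit treats as whitespace. WhitespaceSketch and its methods are hypothetical names.

```java
final class WhitespaceSketch {
    /** Roughly what ignoring whitespace does to a text node: trim both ends. */
    static String trimmed(String text) {
        return text.trim();
    }

    /** Roughly what normalizing whitespace does: collapse runs to one SPACE, then trim. */
    static String normalized(String text) {
        return text.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        System.out.println(trimmed("\n").isEmpty());                   // true
        System.out.println(trimmed(" bar ").equals("bar"));            // true
        System.out.println(normalized(" bar \n\t baz").equals("bar baz")); // true
        // trimming alone does not touch whitespace inside the text
        System.out.println(trimmed("bar \n baz").equals("bar baz"));   // false
    }
}
```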
"Normalize" in this context corresponds to the
normalize method in DOM's Document class. It is the process of merging adjacent Text nodes and is not related to "normalizing whitespace" as described in the previous section.
Usually you don't need to care about this option since the XML parser is required to normalize the Document when creating it. The only reason you may want to change the option via XMLUnit.setNormalize is that your Document instances have not been created by an XML parser but rather been put together in memory using the DOM API directly.
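The situation in which normalization matters is easy to reproduce with the DOM API alone. The following sketch builds a Document in memory with two adjacent Text nodes and shows that normalize merges them; it does not involve XMLUnit, and NormalizeSketch is a hypothetical name.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class NormalizeSketch {
    /** Builds <foo> in memory with two adjacent Text nodes, "bar" and "baz". */
    static Element buildFoo() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element foo = doc.createElement("foo");
            doc.appendChild(foo);
            foo.appendChild(doc.createTextNode("bar"));
            foo.appendChild(doc.createTextNode("baz"));
            return foo;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element foo = buildFoo();
        System.out.println(foo.getChildNodes().getLength());   // 2
        foo.getOwnerDocument().normalize();                     // merges adjacent Text nodes
        System.out.println(foo.getChildNodes().getLength());   // 1
        System.out.println(foo.getFirstChild().getNodeValue()); // barbaz
    }
}
```

A parser would have produced the single Text node right away, which is why the option only matters for Documents assembled by hand.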
Using XMLUnit.setIgnoreComments you can make XMLUnit's difference engine ignore comments completely.
When this option is set to true, Diff will use the XSLT transformer under the covers.
It is not always necessary to know whether a text has been put into a CDATA section or not. Using XMLUnit.setIgnoreDiffBetweenTextAndCDATA you can make XMLUnit consider the following two pieces of XML identical:
<foo>&lt;bar&gt;</foo>
<foo><![CDATA[<bar>]]></foo>
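Why the two pieces differ at all becomes visible when they are parsed with a non-coalescing JAXP parser: the character content is the same, but the node types are not. The sketch below uses only JDK classes; TextVersusCdataSketch is a hypothetical name and the explicit setCoalescing(false) call merely spells out the JAXP default.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class TextVersusCdataSketch {
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setCoalescing(false); // keep CDATA sections as nodes of their own
            return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element text = parseRoot("<foo>&lt;bar&gt;</foo>");
        Element cdata = parseRoot("<foo><![CDATA[<bar>]]></foo>");
        // same character content ...
        System.out.println(text.getTextContent().equals(cdata.getTextContent()));
        // ... but a Text node on one side and a CDATASection node on the other
        System.out.println(text.getFirstChild().getNodeType() == Node.TEXT_NODE);
        System.out.println(cdata.getFirstChild().getNodeType() == Node.CDATA_SECTION_NODE);
    }
}
```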
Normally the XML parser will expand character references to their Unicode equivalents but for more complex entity definitions the parser may expand them or not. Using XMLUnit.setExpandEntityReferences you can control the parser's setting.
When XMLUnit cannot match a control Element to a test Element (the configured ElementQualifier - see Section 3.4, “ElementQualifier” - doesn't return true for any of the test Elements) it will try to compare it against the first unmatched test Element (if there is one). Starting with XMLUnit 1.3 one can use XMLUnit.setCompareUnmatched to disable this behavior and generate CHILD_NODE_NOT_FOUND differences instead.
If the control document is
    <root>
      <a/>
    </root>
and the test document is
    <root>
      <b/>
    </root>
the default setting will create a single ELEMENT_TAG_NAME Difference ("expected a but found b"). Setting XMLUnit.setCompareUnmatched to false will create two Differences of type CHILD_NODE_NOT_FOUND (one for "a" and one for "b") instead.
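The two behaviours can be mimicked with a simplified model that matches children by name only. This is a sketch of the description above under that assumption, not XMLUnit's algorithm; the class, the method and the format of the reported strings are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CompareUnmatchedSketch {
    /** Reports differences between two lists of child element names. */
    static List<String> report(List<String> control, List<String> test,
                               boolean compareUnmatched) {
        List<String> unmatchedTest = new ArrayList<>(test);
        List<String> unmatchedControl = new ArrayList<>();
        for (String name : control) {
            // remove(Object) drops the first test element of the same name, if any
            if (!unmatchedTest.remove(name)) {
                unmatchedControl.add(name);
            }
        }
        List<String> differences = new ArrayList<>();
        for (String name : unmatchedControl) {
            if (compareUnmatched && !unmatchedTest.isEmpty()) {
                // compare against the first unmatched test element
                differences.add("ELEMENT_TAG_NAME: expected " + name
                                + " but found " + unmatchedTest.remove(0));
            } else {
                differences.add("CHILD_NODE_NOT_FOUND: " + name);
            }
        }
        for (String name : unmatchedTest) {
            differences.add("CHILD_NODE_NOT_FOUND: " + name);
        }
        return differences;
    }

    public static void main(String[] args) {
        List<String> control = Arrays.asList("a");
        List<String> test = Arrays.asList("b");
        System.out.println(report(control, test, true));
        // [ELEMENT_TAG_NAME: expected a but found b]
        System.out.println(report(control, test, false));
        // [CHILD_NODE_NOT_FOUND: a, CHILD_NODE_NOT_FOUND: b]
    }
}
```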