Problem on XML

Consider two XML files:
File A:
<root>
 <title>1</title>
</root>

and
File B:
<root>
 <title>1<title>1</title></title>
 <title>1</title>
</root>

The difference between the two files is that a new branch, <title>1<title>1</title></title>, has been inserted before the existing <title>1</title>.

How do you find this out? What is the algorithm?

For those who haven't come across this before, tools that do this are called XML diffs.
Each organization seems to have its own program for it.

There are many algorithms.

Microsoft's XML Diff is impressive.
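
To make the first pair of files concrete, here is a minimal sketch of one simple approach, not how any particular tool does it: reduce each child subtree to a canonical string, then align the two child lists with a longest-matching-subsequence pass. The names signature and diff_children are made up for illustration. The sketch ignores attributes and tail text and only looks one level below the root.

```python
import difflib
import xml.etree.ElementTree as ET


def signature(elem):
    """Canonical string for a subtree: tag, stripped text, child signatures.

    Assumption: attributes and tail text are ignored in this sketch.
    """
    text = (elem.text or "").strip()
    children = "".join(signature(child) for child in elem)
    return f"<{elem.tag}>{text}{children}</{elem.tag}>"


def diff_children(a, b):
    """Align the direct children of a and b by signature and list the edits."""
    sigs_a = [signature(c) for c in a]
    sigs_b = [signature(c) for c in b]
    matcher = difflib.SequenceMatcher(None, sigs_a, sigs_b, autojunk=False)
    edits = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op in ("delete", "replace"):
            edits += [f"deleted under /{a.tag}: {s}" for s in sigs_a[i1:i2]]
        if op in ("insert", "replace"):
            edits += [f"inserted under /{b.tag}: {s}" for s in sigs_b[j1:j2]]
    return edits


file_a = ET.fromstring("<root>\n <title>1</title>\n</root>")
file_b = ET.fromstring(
    "<root>\n <title>1<title>1</title></title>\n <title>1</title>\n</root>"
)

first_example_edits = diff_children(file_a, file_b)
print(first_example_edits)
# ['inserted under /root: <title>1<title>1</title></title>']
```

Because the unchanged <title>1</title> is matched as a whole subtree, the sketch reports a single inserted branch, which is the answer given above. A real tool would also recurse into partially matching subtrees and handle moves.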

Yet another example:

Given:
File A:
<root>
 <title>1</title>
</root>

and
File B:
<root>
 <title>1<title>1</title></title>
 <titlee>1</titlee>
</root>

What is the difference between the two files?

Difference 1: A new branch <titlee>1</titlee> has been added.
Difference 2: <title>1</title> has been inserted into <title>1</title>.
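
The two differences above are what you get if you pair children by position and recurse whenever the tags match. Here is a rough sketch of that idea. The name positional_diff is made up for illustration, and attributes and tail text are again ignored.

```python
import xml.etree.ElementTree as ET


def positional_diff(a, b, path=""):
    """Compare two same-tag elements, pairing their children by position."""
    path = f"{path}/{a.tag}"
    edits = []
    if (a.text or "").strip() != (b.text or "").strip():
        edits.append(f"text changed at {path}")
    kids_a, kids_b = list(a), list(b)
    for child_a, child_b in zip(kids_a, kids_b):
        if child_a.tag == child_b.tag:
            # Same tag at the same position: look for changes inside it.
            edits += positional_diff(child_a, child_b, path)
        else:
            edits.append(f"replaced {path}/{child_a.tag} with {path}/{child_b.tag}")
    # Whatever is left over on either side has no partner.
    for extra in kids_a[len(kids_b):]:
        edits.append(f"deleted {path}/{extra.tag}")
    for extra in kids_b[len(kids_a):]:
        edits.append(f"inserted {path}/{extra.tag}")
    return edits


file_a = ET.fromstring("<root>\n <title>1</title>\n</root>")
file_b = ET.fromstring(
    "<root>\n <title>1<title>1</title></title>\n <titlee>1</titlee>\n</root>"
)

second_example_edits = positional_diff(file_a, file_b)
print(second_example_edits)
# ['inserted /root/title/title', 'inserted /root/titlee']
```

Positional pairing is cheap but fragile: run on the first pair of files, it would also report two insertions instead of one, because it never considers that the original <title>1</title> may simply have shifted down by one position.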

There are lots of open-source tools as well. Can you think of any good or interesting algorithms?