Michel de Rougemont

We consider the classical Tree-Edit-Distance, which gives a measure on trees and a distance between a given tree $T$ and a language $L$ defined by a DTD or a tree automaton. If $T$ is not far from $L$, we show how to find a modified tree $T'$ that is in $L$ and not too far from $T$ by applying local corrections. We first consider binary trees and then generalize the method to unranked labelled trees, i.e. XML documents.
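To make the idea of correcting a tree towards a language concrete, the following is a minimal sketch under strong simplifying assumptions; it is not the method of the paper. It allows only relabelings (no insertions or deletions), assumes every node has either zero or two children, and represents $L$ by a small deterministic bottom-up tree automaton. The names `Node`, `repair`, `leaf_rules`, `node_rules` and the toy automaton are all hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """A binary tree node; assumed to have either zero or two children."""
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def repair(tree, leaf_rules, node_rules, accepting):
    """Return (cost, corrected_tree) with the fewest relabelings, or None.

    leaf_rules: {label: state} for leaves.
    node_rules: {(label, left_state, right_state): state} for internal nodes.
    Only relabelings are considered, a restriction of the full edit distance.
    """
    def best(node):
        # For each reachable state, keep the cheapest (cost, corrected subtree).
        table = {}
        if node.left is None and node.right is None:
            for label, state in leaf_rules.items():
                cost = int(label != node.label)
                if state not in table or cost < table[state][0]:
                    table[state] = (cost, Node(label))
            return table
        left, right = best(node.left), best(node.right)
        for (label, ql, qr), state in node_rules.items():
            if ql in left and qr in right:
                cost = int(label != node.label) + left[ql][0] + right[qr][0]
                if state not in table or cost < table[state][0]:
                    table[state] = (cost, Node(label, left[ql][1], right[qr][1]))
        return table

    results = [v for q, v in best(tree).items() if q in accepting]
    return min(results, key=lambda r: r[0]) if results else None


# Toy language (an assumption): internal nodes are "f", leaves are "a".
leaf_rules = {"a": "q"}
node_rules = {("f", "q", "q"): "q"}
accepting = {"q"}

# T = f(a, g(b, a)) has two wrong labels: "g" and "b".
T = Node("f", Node("a"), Node("g", Node("b"), Node("a")))
cost, T_prime = repair(T, leaf_rules, node_rules, accepting)
```

In this toy run, `cost` is the number of relabelings needed and `T_prime` is a tree in the language obtained from `T` by local changes only. Handling insertions, deletions and unranked trees would require a richer dynamic program than this sketch.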

When an XML file does not parse correctly and contains only a few errors, we can generate a corrected file by applying a few local modifications. This technique makes it possible to extend HTML’s robustness to XML.