A similarity between probabilistic tree languages: application to XML document families

Authors:

Abstract

We describe a general approach to computing a similarity measure between distributions generated by probabilistic tree automata; the measure may be used in a number of pattern recognition applications. In particular, we show how this similarity can be computed for families of structured (XML) documents. In that case, the use of regular expressions to specify the right-hand side of the expansion rules adds some complexity to the task.

Keywords: Distance between tree languages, Similarity of structured documents

Review history: Received 11 September 2002, Accepted 3 October 2002, Available online 13 February 2003.

Paper URL: https://doi.org/10.1016/S0031-3203(02)00320-5