XML schema clustering with semantic and hierarchical similarity measures

Authors:

Highlights:

Abstract

With the growing popularity of XML as a data representation language, collections of XML data have exploded in number. Methods are required to manage these collections and to discover useful information in them for improved document handling. We present a schema clustering process that organises heterogeneous XML schemas into groups. The methodology considers not only the linguistic similarity and the context of the elements but also their hierarchical structural similarity. We support our findings with experiments and analysis.
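
To make the idea of combining element-level and hierarchical similarity concrete, the following is a minimal Python sketch and not the authors' method. Everything in it is an assumption made for illustration: schemas are reduced to hypothetical `(element name, depth)` pairs, string similarity from `difflib` stands in for a real linguistic measure, the structural term uses only the difference in depth, and the weight `w_name`, the threshold, and the greedy grouping rule are arbitrary choices.

```python
from difflib import SequenceMatcher

def name_similarity(a, b):
    """String-based stand-in for a linguistic measure (assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def element_similarity(e1, e2, w_name=0.7):
    """Weighted mix of name similarity and a depth-based structural term."""
    (n1, d1), (n2, d2) = e1, e2
    level_sim = 1.0 / (1.0 + abs(d1 - d2))  # 1.0 when both sit at the same depth
    return w_name * name_similarity(n1, n2) + (1 - w_name) * level_sim

def schema_similarity(s1, s2):
    """Symmetric average of each element's best match in the other schema."""
    def one_way(a, b):
        return sum(max(element_similarity(x, y) for y in b) for x in a) / len(a)
    return 0.5 * (one_way(s1, s2) + one_way(s2, s1))

def cluster(schemas, threshold=0.8):
    """Greedy grouping: a schema joins the first cluster that already
    holds a member at least `threshold` similar to it."""
    clusters = []
    for name, elems in schemas.items():
        for c in clusters:
            if any(schema_similarity(elems, schemas[m]) >= threshold for m in c):
                c.append(name)
                break
        else:
            clusters.append([name])
    return clusters

# Hypothetical toy schemas: (element name, depth in the schema tree).
SCHEMAS = {
    "book_a": [("book", 0), ("title", 1), ("author", 1)],
    "book_b": [("book", 0), ("title", 1), ("authors", 1)],
    "order": [("purchaseOrder", 0), ("item", 1), ("quantity", 2)],
}

if __name__ == "__main__":
    print(cluster(SCHEMAS))
```

On these toy inputs the two book schemas end up in one group and the order schema in another. A fuller treatment would replace the string measure with a thesaurus-based one and compare ancestor paths rather than depth alone.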

Keywords: Clustering, Data mining, Document mining, XML, Semi-structured data, Semantic similarity, Structural similarity, Schema matching

Review history: Received 3 April 2006, Revised 7 August 2006, Accepted 15 August 2006, Available online 26 September 2006.

Paper URL: https://doi.org/10.1016/j.knosys.2006.08.006