QMatch – Using paths to match XML schemas

Authors:

Abstract

Integrating multiple heterogeneous data sources remains a critical problem in many application domains and a challenge for researchers worldwide. With the growing popularity of the XML model and the proliferation of XML documents online, automated matching of XML documents and databases has become a critical issue. In this paper, we present QMatch, a hybrid schema matching algorithm that provides a unique path-based framework for harnessing traditional structural and semantic information while exploiting constraints inherent in XML documents, such as the order of XML elements, to improve the level of matching between two given XML schemata. QMatch is based on a unique quality-of-match metric, QoM, and a set of classifiers. Together these provide not only an effective basis for developing a new schema matching algorithm but also a useful tool for tuning existing schema matching algorithms to output at desired levels of matching. Through a set of experiments, we show the benefits of the path-based QMatch over existing structural, linguistic, and hybrid algorithms such as Cupid, and we provide an empirical measure of QMatch's accuracy in terms of the true matches discovered by the algorithm.
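
The abstract does not define QoM or the classifiers, so the following is only a toy illustration of what path-based matching with an element-order term might look like in general. It is not the QMatch algorithm. Every function name, the weight `w_order`, and the `threshold` are hypothetical choices made for this sketch: labels are compared with a generic string similarity, and sibling position stands in for the order constraint.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher


def paths(root):
    """Return (path, sibling_index) for every element; path is a tuple of tags from the root."""
    out = []

    def walk(node, prefix, idx):
        path = prefix + (node.tag,)
        out.append((path, idx))
        for i, child in enumerate(node):
            walk(child, path, i)

    walk(root, (), 0)
    return out


def label_sim(a, b):
    """Generic string similarity in [0, 1]; a stand-in for a linguistic matcher."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def path_sim(p, q):
    """Average label similarity over positions aligned from the leaf end of both paths."""
    total = sum(label_sim(a, b) for a, b in zip(reversed(p), reversed(q)))
    return total / max(len(p), len(q))


def toy_score(p, i, q, j, w_order=0.2):
    """Hypothetical score: weighted mix of path similarity and same-sibling-position bonus."""
    order = 1.0 if i == j else 0.0
    return (1 - w_order) * path_sim(p, q) + w_order * order


def best_matches(src_xml, tgt_xml, threshold=0.6):
    """Map each source path to its best-scoring target path, if the score reaches the threshold."""
    src = paths(ET.fromstring(src_xml))
    tgt = paths(ET.fromstring(tgt_xml))
    result = {}
    for p, i in src:
        score, q = max((toy_score(p, i, q, j), q) for q, j in tgt)
        if score >= threshold:
            result["/".join(p)] = ("/".join(q), round(score, 3))
    return result
```

Calling `best_matches(schema_a, schema_b)` on two small XML strings returns a dictionary from source paths to their best target path and score. Raising or lowering `threshold` loosely mirrors the idea of tuning a matcher to output at a desired level of matching, though the paper's actual mechanism may differ.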

Keywords: Schema matching, Schema integration, Hybrid schema matching, XML schema matching

Review history: Received 14 March 2006, Revised 14 March 2006, Accepted 14 March 2006, Available online 18 April 2006.

Official paper URL: https://doi.org/10.1016/j.datak.2006.03.002