Semantic integration in Xyleme: a uniform tree-based approach

作者:

Highlights:

摘要

Xyleme is a huge warehouse integrating XML data of the Web. Xyleme considers a simple data model with data trees and tree types for describing the data sources, and a simple query language based on tree queries with boolean conditions. The main components of the data model are a mediated schema modeled by an abstract tree type, as a view of a set of tree types associated with actual data trees, called concrete tree types, and a mapping expressing the connection between the mediated schema and the concrete tree types. The first contribution of this paper is formal: we provide a declarative model-theoretic semantics for Xyleme tree queries, a way of checking tree query containment, and a characterization of tree queries as a composition of branch queries. The other contributions are algorithmic and handle the potentially huge size of the mapping relation which is a crucial issue for semantic integration and query evaluation in Xyleme. First, we propose a method for pre-evaluating queries at compile time by storing some specific meta-information about the mapping into map translation tables. These map translation tables summarize the set of all the branch queries that can be generated from the mediated schema and the set of all the mappings. Then, we propose different operators and strategies for relaxing queries which, having an empty map translation table, will have no answer if they are evaluated against the data. Finally, we present a method for semi-automatically generating the mapping relation.

论文关键词:Xyleme,Warehouse,Tree queries,Map translation tables

论文评审过程:Received 3 June 2002, Revised 3 July 2002, Accepted 3 July 2002, Available online 11 December 2002.

论文官网地址:https://doi.org/10.1016/S0169-023X(02)00140-4