A framework for abstracting data sources having heterogeneous representation formats

Authors:

Abstract

This paper addresses the problem of abstracting a data source that may be encoded in any one of several representation formats. We first show that data source abstraction plays a central role in several important application problems in information system design. We then propose a new approach that semi-automatically abstracts a data source encoded in one of a variety of formats, such as structured databases, OEM graphs and XML documents. The ability to handle heterogeneous formats is obtained through a conceptual model, called SDR-Network, which uniformly represents and handles data sources with different formats. As a significant application of the presented data source abstraction algorithm, the construction of an Intensional Repository is also illustrated.

Keywords: Inter-scheme properties, Semi-structured information sources, Scheme abstraction, Source summarization

Article history: Received 29 January 2003, Revised 9 April 2003, Accepted 14 May 2003, Available online 6 June 2003.

Official paper URL: https://doi.org/10.1016/S0169-023X(03)00092-2