Optimal and efficient integration of heterogeneous summary tables in a distributed database

Authors:

Abstract

In any particular combination of domains, although a common understanding of the underlying low-level concepts concerning a domain attribute may exist, different concept hierarchies may have been built at different data-holding sites. A distributed database may therefore hold different views of the same data, or differently classified samples from the same population. Because of this, when statistical functions are applied to generate summary tables, the resulting summary-based partitions may be heterogeneous. In these situations, integration of such summary-based partitions can reveal latent information at a new, and finer, level of granularity. In this paper, the classification schemes are described using a matrix representation of the intersection hypergraph, and efficient numerical algorithms are proposed to determine the optimal granularity of the integrated summary data.
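
To make the idea of a finer level of granularity concrete, the following is a minimal sketch and not the paper's algorithm. It assumes two hypothetical sites that group the same low-level age bands into different classes. The names `site_a`, `site_b`, `intersection_matrix`, and `common_refinement` are illustrative assumptions. The 0/1 matrix records which classes from the two schemes overlap, and the non-empty overlaps form the finest partition that both schemes can be mapped onto.

```python
from itertools import product


def intersection_matrix(scheme_a, scheme_b):
    """Return a 0/1 matrix: entry [i][j] is 1 if class i of A overlaps class j of B."""
    return [
        [1 if set_a & set_b else 0 for set_b in scheme_b.values()]
        for set_a in scheme_a.values()
    ]


def common_refinement(scheme_a, scheme_b):
    """Return the non-empty pairwise intersections of two classification schemes."""
    cells = {}
    for (name_a, set_a), (name_b, set_b) in product(scheme_a.items(), scheme_b.items()):
        overlap = set_a & set_b
        if overlap:
            cells[(name_a, name_b)] = overlap
    return cells


# Hypothetical classification schemes over the same low-level age bands.
site_a = {
    "young": {"0-9", "10-19"},
    "adult": {"20-29", "30-39", "40-49"},
    "senior": {"50-59", "60+"},
}
site_b = {
    "under_30": {"0-9", "10-19", "20-29"},
    "30_to_59": {"30-39", "40-49", "50-59"},
    "60_plus": {"60+"},
}

matrix = intersection_matrix(site_a, site_b)
cells = common_refinement(site_a, site_b)

for row in matrix:
    print(row)
for names, members in cells.items():
    print(names, sorted(members))
```

In this toy case each scheme has three classes, while the integrated partition has five cells, so combining the two summary tables can constrain counts at a finer granularity than either site reports on its own. Determining the optimal granularity as the paper proposes would require the numerical methods described there, which this sketch does not attempt.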

Keywords: Distributed heterogeneous databases, Data integration, Knowledge discovery

Review history: Received 25 August 1997, Revised 14 November 1997, Accepted 10 August 1998, Available online 1 June 1999.

Paper URL: https://doi.org/10.1016/S0169-023X(98)00039-1