On discovery of functional dependencies from data

Authors:

Highlights:

Abstract

Discovering functional dependencies (FDs) from existing databases is important for knowledge discovery, machine learning, and data quality assessment. A number of discovery algorithms have been proposed in the literature. In this paper, we review and compare these algorithms to identify their relative advantages and differences. We then propose a simple hash-based algorithm for FD discovery that is efficient in both time and space. We compare the performance of three recently published algorithms with that of our hash-based algorithm. We show that the hash-based algorithm performs best among the four and analyze the reasons.
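
To make the notion of a functional dependency concrete, the following minimal sketch checks whether a single candidate FD X -> A holds in a small relation by hashing each tuple's X-values and recording the A-value seen with them. It is an illustration only and is not the algorithm proposed in the paper; the function name `fd_holds`, the row representation, and the sample data are assumptions made for this example.

```python
from typing import Hashable, Sequence

# Assumed representation: a relation is a list of rows, each row a dict
# mapping attribute name -> value. This is illustrative, not from the paper.
Row = dict[str, Hashable]


def fd_holds(rows: Sequence[Row], lhs: Sequence[str], rhs: str) -> bool:
    """Return True if the functional dependency lhs -> rhs holds in rows.

    The dependency holds when any two rows that agree on all attributes
    in lhs also agree on rhs. A hash table keyed by the lhs projection
    lets us check this in one pass over the data.
    """
    seen: dict[tuple, Hashable] = {}
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        value = row[rhs]
        # setdefault stores value on first sight of key, otherwise
        # returns the previously stored value for comparison.
        if seen.setdefault(key, value) != value:
            return False  # same lhs values, different rhs value
    return True


# Hypothetical sample relation.
rows = [
    {"emp": 1, "dept": "A", "city": "Oslo"},
    {"emp": 2, "dept": "A", "city": "Oslo"},
    {"emp": 3, "dept": "B", "city": "Rome"},
    {"emp": 4, "dept": "B", "city": "Rome"},
]

dept_determines_city = fd_holds(rows, ["dept"], "city")
city_determines_emp = fd_holds(rows, ["city"], "emp")
```

In this sample, `dept -> city` holds because every department maps to exactly one city, while `city -> emp` fails because two employees share each city. A full discovery algorithm would additionally have to search the space of candidate left-hand sides, which this sketch does not attempt.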

Keywords: Functional dependencies, Data mining, Knowledge discovery, Hash-based algorithm

Article history: Received 25 October 2011, Revised 30 January 2013, Accepted 30 January 2013, Available online 14 February 2013.

Official paper URL: https://doi.org/10.1016/j.datak.2013.01.008