A new approach to mine frequent patterns using item-transformation methods

作者:

Highlights:

摘要

Mining frequent patterns is a fundamental and crucial task in data-mining problems. The algorithms reported in the literature for mining frequent patterns can be classified into two approaches: the candidate generation-and-test approach (for example, the Apriori algorithm) and the pattern-growth approach (such as the FP-growth algorithm). The approaches both suffered from the problems that their speed is slow for large databases. This paper proposes a novel and simple approach, which does not belong to the above two approaches. This approach treats the database as a stream of data and finds frequent patterns by scanning the database only once. In addition, the approach can incrementally mine frequent patterns if the database is updated or inserted subsequently. Three versions of the approach (i.e., mapping-table, transformation-function, and logic-circuit) are provided. The logic-circuit version is the first one that mines frequent patterns by simple logic gates, and the modeling of this version shows its speed is thousands of times faster than that of the FP-growth algorithm. Analyses and simulations of the approach are also performed. Analyses show that the transformation-function version is much better than the Apriori and FP-growth ones in storage complexity. Simulation results show that the mapping-table version is comparable to the FP-growth algorithm in execution time.

论文关键词:Data mining,Frequent pattern,Apriori,Gray code,Item transformation

论文评审过程:Received 28 July 2005, Revised 15 September 2006, Accepted 3 January 2007, Available online 30 January 2007.

论文官网地址:https://doi.org/10.1016/j.is.2007.01.001