Discovering knowledge from large databases using prestored information

Authors:

Highlights:

Abstract

In this paper, we examine two problems: mining association rules and mining sequential patterns in a large database of sales transactions. These problems focus on discovering large itemsets and large sequences, respectively. We present two algorithms, PSI and PSI_seq, for the efficient generation of large itemsets and large sequences, respectively. The main idea of both algorithms is to use prestored information to minimize the number of candidate itemsets and candidate sequences counted in each database scan. The prestored information for PSI consists of the itemsets found in the last mining along with their support counts; for PSI_seq, it consists of the sequences found in the last mining along with their support counts. Typically, a user may need to tune the minimum support value many times before a useful set of association rules is obtained from the transaction database. Using prestored information effectively reduces the total computation time. Empirical results show that our approaches outperform previous methods by an order of magnitude while using little storage space for the prestored information.
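
The idea of reusing support counts from an earlier mining run can be illustrated with a minimal sketch. This is not the PSI algorithm itself, whose details the abstract does not give. It is a plain level-wise (Apriori-style) miner with a cache added, under the assumptions that the prestored information is a dictionary from itemset to support count, that it holds every candidate counted so far (including ones that were not large), and that the database has not changed between runs. The names `mine_with_cache`, `cache`, and `counted` are hypothetical.

```python
from itertools import combinations


def mine_with_cache(transactions, min_count, cache=None):
    """Apriori-style mining that skips candidates whose support is cached.

    cache maps frozenset(itemset) -> support count from earlier runs
    (an assumed layout for the prestored information).
    Returns (large_itemsets, cache, counted), where counted is the number
    of candidates that had to be counted against the database.
    """
    cache = {} if cache is None else cache
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    candidates = [frozenset([i]) for i in items]
    large, counted, k = {}, 0, 1
    while candidates:
        # Only candidates missing from the cache need a database scan.
        unknown = [c for c in candidates if c not in cache]
        if unknown:
            counts = dict.fromkeys(unknown, 0)
            for t in transactions:
                for c in unknown:
                    if c <= t:
                        counts[c] += 1
            cache.update(counts)
            counted += len(unknown)
        level = [c for c in candidates if cache[c] >= min_count]
        large.update((c, cache[c]) for c in level)
        # Join large k-itemsets into (k+1)-candidates, then prune any
        # candidate that has a k-subset which is not large.
        k += 1
        level_set = set(level)
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = [
            c for c in sorted(joined, key=sorted)
            if all(frozenset(s) in level_set for s in combinations(c, k - 1))
        ]
    return large, cache, counted


db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]

# First run: nothing is cached, so every candidate is counted.
large_first, cache, counted_first = mine_with_cache(db, min_count=3)

# Second run with a lower threshold: all needed counts are already cached.
large_second, cache, counted_second = mine_with_cache(db, min_count=2, cache=cache)

print(counted_first, counted_second)  # 7 0
```

In this toy database the second run counts no candidates at all, because the only itemset that becomes large at the lower threshold, {a, b, c}, was already counted as a candidate in the first run. With a real database, lowering the minimum support would usually produce some new candidates, and only those would need to be counted in each scan.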

Keywords: Knowledge Discovery, Data Mining, Association Rules, Sequential Patterns

Review history: Received 20 November 1999, Revised 16 October 2000, Available online 18 October 2001.

Paper URL: https://doi.org/10.1016/S0306-4379(01)00006-0