DBC: a condensed representation of frequent patterns for efficient mining

Abstract

Given a large set of data, a common data mining problem is to extract the frequent patterns occurring in this set. The idea presented in this paper is to extract a condensed representation of the frequent patterns, called disjunction-bordered condensation (DBC), instead of extracting the whole frequent pattern collection. We show that this condensed representation can be used to regenerate all frequent patterns and their exact frequencies. Moreover, this regeneration can be performed without any access to the original data. Practical experiments show that the DBC can be extracted very efficiently even in difficult cases, and that this extraction followed by the regeneration of the frequent patterns is much more efficient than the direct extraction of the frequent patterns themselves. We compared the DBC with another representation of frequent patterns previously investigated in the literature, called frequent closed sets. In nearly all the experiments we ran, the DBC was extracted much more efficiently than the frequent closed sets. In the other cases, the extraction times were very close.
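To make the baseline problem concrete, the following is a minimal sketch of the direct extraction of frequent patterns that the abstract contrasts with the DBC. It is not the DBC algorithm and is not taken from the paper: it assumes patterns are itemsets, that the data is a list of transactions, and that frequency is an absolute support count. The function name `frequent_patterns`, the parameter `min_support`, and the toy data are all hypothetical choices made for illustration.

```python
from itertools import combinations


def frequent_patterns(transactions, min_support):
    """Return {itemset: support count} for every itemset that occurs in at
    least `min_support` transactions, using a simple levelwise search.

    Illustrative baseline only (an assumption of this sketch), not the DBC.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})

    result = {}
    # Level 1: every single item is a candidate.
    candidates = [frozenset([item]) for item in items]

    while candidates:
        # Count, for each candidate, the transactions that contain it.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)

        # Build candidates one item larger by joining frequent sets that
        # differ in exactly one item, keeping a candidate only if all of
        # its subsets one item smaller are themselves frequent.
        next_candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == len(a) + 1 and all(
                union - {x} in frequent for x in union
            ):
                next_candidates.add(union)
        candidates = list(next_candidates)

    return result


# Toy data set (hypothetical): five transactions over the items a, b, c.
toy_data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
patterns = frequent_patterns(toy_data, min_support=3)
for itemset, support in sorted(patterns.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

The output of such a direct extraction is the full collection of frequent patterns with their exact frequencies. According to the abstract, a condensed representation such as the DBC stores less than this collection while still allowing it to be regenerated without going back to the data.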

Keywords: Data mining, Frequent patterns, Condensed representations

Review history: Received 26 February 2002, Revised 11 November 2002, Accepted 11 November 2002, Available online 15 February 2003.

Official paper URL: https://doi.org/10.1016/S0306-4379(03)00002-4