Sweeping the disjunctive search space towards mining new exact concise representations of frequent itemsets

Authors:

Abstract

Concise (or condensed) representations of frequent patterns follow the minimum description length (MDL) principle by providing the shortest description of the whole set of frequent patterns. In this work, we introduce a new exact concise representation of frequent itemsets. This representation is based on an exploration of the disjunctive search space. Disjunctive itemsets convey information about the complementary occurrence of items in a dataset. A novel closure operator is then devised to suit the characteristics of the explored search space. The proposed operator maps many disjunctive itemsets to a unique one, called a disjunctive closed itemset. Hence, it drastically reduces the number of itemsets handled within the targeted representation. Interestingly, the proposed representation offers direct access to the disjunctive and negative supports of frequent itemsets while ensuring the derivation of their exact conjunctive supports. The experimental results reported and discussed here show that our representation is effective and sound in comparison with several other concise representations.
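
To make the three kinds of support concrete, the following minimal Python sketch computes them on a toy dataset and recovers a conjunctive support from disjunctive supports alone, using the standard inclusion-exclusion identity. The dataset, the function names, and in particular `disjunctive_closure` are assumptions made for illustration: the closure shown is one natural reading of "mapping many disjunctive itemsets to a unique one" (the largest itemset covering exactly the same transactions) and is not claimed to be the operator defined in the paper.

```python
from itertools import combinations

# Hypothetical toy dataset: each transaction is a set of items.
TRANSACTIONS = [
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
    {"d"},
]

def conjunctive_support(itemset, transactions):
    """Number of transactions containing ALL items of the itemset."""
    return sum(1 for t in transactions if itemset <= t)

def disjunctive_support(itemset, transactions):
    """Number of transactions containing AT LEAST ONE item of the itemset."""
    return sum(1 for t in transactions if itemset & t)

def negative_support(itemset, transactions):
    """Number of transactions containing NONE of the items (De Morgan)."""
    return len(transactions) - disjunctive_support(itemset, transactions)

def conjunctive_from_disjunctive(itemset, transactions):
    """Derive the conjunctive support by inclusion-exclusion over the
    disjunctive supports of all non-empty subsets of the itemset."""
    total = 0
    for k in range(1, len(itemset) + 1):
        sign = 1 if k % 2 == 1 else -1
        for subset in combinations(sorted(itemset), k):
            total += sign * disjunctive_support(set(subset), transactions)
    return total

def disjunctive_closure(itemset, transactions):
    """Illustrative assumption, not the paper's definition: the set of items
    that never occur in a transaction left uncovered by the itemset, so the
    result has the same disjunctive support as the input."""
    uncovered = [t for t in transactions if not (itemset & t)]
    all_items = set().union(*transactions)
    return {i for i in all_items if not any(i in t for t in uncovered)}

if __name__ == "__main__":
    x = {"a", "b"}
    print(disjunctive_support(x, TRANSACTIONS))           # 4
    print(negative_support(x, TRANSACTIONS))              # 1
    print(conjunctive_from_disjunctive(x, TRANSACTIONS))  # 2
    print(sorted(disjunctive_closure(x, TRANSACTIONS)))   # ['a', 'b', 'c']
```

In this sketch, `{a, b}` and `{a, b, c}` cover the same four transactions, so a representation that stores only the closed itemset `{a, b, c}` with its disjunctive support loses nothing about `{a, b}`. This is the kind of reduction the abstract describes, shown here under the stated assumptions.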

Keywords: Data mining, Frequent itemset, Association rule, Concise representation, Complementary occurrence, Disjunctive support, Disjunctive search space, Closure operator, Equivalence class, Disjunctive closed itemset, Essential itemset, Generalized association rule, Minimum description length principle

Review history: Received 21 May 2008, Revised 1 May 2009, Accepted 1 May 2009, Available online 15 May 2009.

Official paper URL: https://doi.org/10.1016/j.datak.2009.05.001