Mining summarization of high utility itemsets

Authors:

Highlights:

Abstract

Mining interesting itemsets from transaction databases has attracted considerable research interest for decades. In recent years, high utility itemset (HUI) mining has emerged as a hot topic in this field. In real applications, the bottleneck of HUI mining is not efficiency but interpretability, because the mining process generates a huge number of itemsets. Since the downward closure property of itemsets no longer holds for HUIs, the compression and summarization methods developed for frequent itemsets cannot be applied. With this in mind, and considering both coverage and diversity, we introduce a novel, well-founded approach called SUIT-miner for succinctly summarizing HUIs with a small collection of itemsets. First, we define the condition under which one itemset can cover another. Then, to ensure diversity, we present a greedy algorithm that finds the fewest itemsets needed to cover all HUIs. To improve efficiency, the greedy algorithm employs several pruning strategies. To evaluate the performance of SUIT-miner, we conduct extensive experiments on real datasets. The experimental results show that SUIT-miner is effective and efficient.
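
The greedy covering step described in the abstract can be illustrated with a minimal sketch. This is not SUIT-miner itself. The abstract does not state the paper's cover condition or its pruning strategies, so the sketch assumes a placeholder condition (an itemset covers each of its subsets) and uses the textbook greedy set-cover heuristic, which gives a small cover but not necessarily the smallest one. The names `covers_by_superset` and `greedy_summary` are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, List

Itemset = FrozenSet[str]


def covers_by_superset(x: Itemset, y: Itemset) -> bool:
    """Placeholder cover condition (an assumption, not the paper's definition):
    x covers y when y is a subset of x."""
    return y <= x


def greedy_summary(
    huis: Iterable[Itemset],
    covers: Callable[[Itemset, Itemset], bool] = covers_by_superset,
) -> List[Itemset]:
    """Pick itemsets one at a time, each time taking the candidate that covers
    the most still-uncovered HUIs, until every HUI is covered."""
    # Sort for a deterministic result: longer itemsets first, then alphabetical.
    candidates = sorted(set(huis), key=lambda s: (-len(s), sorted(s)))
    uncovered = set(candidates)
    summary: List[Itemset] = []
    while uncovered:
        # max() keeps the first candidate among ties, so the sort order above
        # acts as the tie-breaker.
        best = max(
            candidates,
            key=lambda c: sum(1 for y in uncovered if covers(c, y)),
        )
        newly_covered = {y for y in uncovered if covers(best, y)}
        if not newly_covered:
            # Only reachable if the cover condition is not reflexive.
            raise ValueError("cover condition leaves some itemsets uncoverable")
        summary.append(best)
        uncovered -= newly_covered
    return summary


if __name__ == "__main__":
    # Toy input: these itemsets are made up for illustration, not real HUIs.
    demo = [frozenset(s) for s in ("ab", "a", "b", "abc", "cd", "d")]
    print([sorted(s) for s in greedy_summary(demo)])
    # prints [['a', 'b', 'c'], ['c', 'd']]
```

In this toy run six itemsets are summarized by two. A different cover condition can be passed as the `covers` argument without changing the greedy loop.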

Keywords: Data mining, High utility itemsets, Utility mining, Summarization

Review history: Received 8 November 2014, Revised 1 April 2015, Accepted 1 April 2015, Available online 6 April 2015, Version of Record 13 May 2015.

Paper URL: https://doi.org/10.1016/j.knosys.2015.04.004