Generalized maximal utility for mining high average-utility itemsets

作者:Wei Song, Lu Liu, Chaomin Huang

摘要

Mining high average-utility itemsets (HAUIs) is a promising research topic in data mining because, in contrast to high utility itemsets, they are not biased toward long itemsets. Regardless of what upper bounds and pruning strategies are used, most existing HAUI mining algorithms are founded on the concept of maximal utility, namely the highest utility of a single item in each transaction. In this paper, we study this problem by generalizing the typical maximal utility and average-utility upper bound from a single item to an itemset, and propose an efficient HAIU mining algorithm based on generalized maximal utility (HAUIM-GMU). For this algorithm, we first propose the concepts of generalized maximal utility and the generalized average-utility upper bound, and discuss how the proposed upper bound can be made tighter to generate fewer candidates. A new pruning strategy is then proposed based on the concept of support, and this is shown to be effective for filtering out unpromising itemsets. The final algorithm is described in detail. Extensive experimental results show that the HAUIM-GMU algorithm outperforms existing state-of-the-art algorithms.

论文关键词:Data mining, High average-utility itemset, Generalized maximal utility, Generalized average-utility upper bound, Critical support count

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10115-021-01614-z