An uncertainty-based approach: Frequent itemset mining from uncertain data with different item importance

Abstract

Since itemset mining was proposed, various approaches have been devised, ranging from processing simple item-based databases to handling more complex databases that include sequence, utility, or graph information. In particular, unlike mining approaches that process databases recording the exact presence or absence of items, uncertain pattern mining finds meaningful patterns in uncertain databases that carry existential probability information for items. However, traditional uncertain mining methods cannot incorporate the real-world importance of each item into the mining process. In this paper, to solve this problem and perform uncertain itemset mining more efficiently, we propose a new uncertain itemset mining algorithm that additionally considers the importance of items, such as weight constraints. The algorithm considers both the existential probabilities and the weight factors of items; as a result, it can selectively obtain more meaningful itemsets with high importance and high existential probabilities. In addition, the algorithm runs faster and uses less memory by efficiently reducing the number of calculations that generate useless itemsets. Experimental results show that the proposed algorithm is more efficient and scalable than state-of-the-art methods.
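
To make the idea of combining existential probabilities with item weights concrete, here is a minimal sketch. It is not the paper's algorithm: the abstract does not give the exact measure, so this sketch assumes that an itemset's score is its expected support (the sum, over transactions, of the product of its items' existential probabilities) multiplied by the average weight of its items. The function names, the toy database, and the weights are all hypothetical, and no pruning is attempted.

```python
from math import prod

def expected_support(itemset, db):
    """Expected support of an itemset in an uncertain database.

    Each transaction maps item -> existential probability. An item that is
    absent from a transaction contributes probability 0.
    """
    return sum(prod(t.get(item, 0.0) for item in itemset) for t in db)

def weighted_expected_support(itemset, db, weights):
    """ASSUMPTION: score = average item weight * expected support.

    This weighting rule is an illustrative choice, not taken from the abstract.
    """
    avg_weight = sum(weights[item] for item in itemset) / len(itemset)
    return avg_weight * expected_support(itemset, db)

# Hypothetical toy data: three uncertain transactions and per-item weights.
toy_db = [
    {"a": 0.9, "b": 0.5},
    {"a": 0.6, "c": 0.8},
    {"a": 0.5, "b": 0.4, "c": 0.7},
]
toy_weights = {"a": 0.8, "b": 0.4, "c": 0.6}

if __name__ == "__main__":
    for candidate in [("a",), ("a", "b"), ("a", "c")]:
        score = weighted_expected_support(candidate, toy_db, toy_weights)
        print(candidate, round(score, 4))
```

An itemset would then be reported when its score meets a user-chosen minimum threshold. Under this assumed measure, an itemset with low-weight items can be rejected even if its expected support is high, which matches the abstract's goal of selecting itemsets that are both important and probable.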

Keywords: Data mining, Existential probability, Frequent pattern mining, Uncertain pattern, Weight constraint

Review history: Received 29 January 2015, Revised 25 August 2015, Accepted 26 August 2015, Available online 29 August 2015, Version of Record 8 November 2015.

Official paper URL: https://doi.org/10.1016/j.knosys.2015.08.018