RP-Miner: a relaxed prune algorithm for frequent similar pattern mining

Authors: Ansel Yoan Rodríguez-González, José Francisco Martínez-Trinidad, Jesús Ariel Carrasco-Ochoa, José Ruiz-Shulcloper

Abstract

Most current frequent pattern mining algorithms assume that two object subdescriptions are similar only if they are equal, but many real-world problems evaluate similarity in other ways. Recently, three algorithms (ObjectMiner, STreeDC-Miner and STreeNDC-Miner) have been proposed for mining frequent patterns with similarity functions other than equality. To search for frequent patterns, ObjectMiner and STreeDC-Miner rely on a pruning property called the Downward Closure property, which the similarity function must satisfy. STreeNDC-Miner was proposed for similarity functions that do not satisfy this property; however, it explores all subsets of features, which can be very expensive. In this work, we propose a frequent similar pattern mining algorithm for similarity functions that do not satisfy the Downward Closure property. It is faster than STreeNDC-Miner and loses fewer frequent similar patterns than ObjectMiner and STreeDC-Miner. We also compare, in a supervised classification context, the quality of the set of frequent similar patterns computed by our algorithm with that of the sets computed by the other algorithms.
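
To make the Downward Closure property concrete, here is a minimal sketch in Python. It is not RP-Miner or any of the algorithms named above. The function names, the toy data, the averaged-difference similarity and the simplified definition of frequency (the fraction of objects whose subdescription is similar to a given one) are all assumptions chosen for illustration. The sketch looks for a subdescription that is frequent over a feature set while one of its sub-subdescriptions is not, which is exactly the situation in which pruning by Downward Closure would lose patterns.

```python
from itertools import combinations


def similar_avg(a, b, features, threshold=0.5):
    """Hypothetical similarity: mean absolute difference over `features` <= threshold."""
    diffs = [abs(a[f] - b[f]) for f in features]
    return sum(diffs) / len(diffs) <= threshold


def similar_eq(a, b, features):
    """Equality-based similarity, the assumption made by classical miners."""
    return all(a[f] == b[f] for f in features)


def frequency(obj, features, data, similar):
    """Fraction of objects whose subdescription over `features` is similar to obj's."""
    return sum(similar(obj, other, features) for other in data) / len(data)


def violates_downward_closure(data, features, similar, min_freq):
    """Return (object index, feature set, feature subset) for the first case where
    a subdescription is frequent but one of its sub-subdescriptions is not,
    or None if no such case exists in `data`."""
    for size in range(2, len(features) + 1):
        for superset in combinations(features, size):
            for i, obj in enumerate(data):
                if frequency(obj, superset, data, similar) < min_freq:
                    continue  # not frequent, so nothing to check
                for subset in combinations(superset, size - 1):
                    if frequency(obj, subset, data, similar) < min_freq:
                        return i, superset, subset
    return None


# Toy data (assumed): three objects described by two numeric features.
data = [
    {"x": 0.0, "y": 0.0},
    {"x": 1.0, "y": 0.0},
    {"x": 1.0, "y": 0.0},
]

witness_avg = violates_downward_closure(data, ("x", "y"), similar_avg, 0.6)
witness_eq = violates_downward_closure(data, ("x", "y"), similar_eq, 0.6)
```

On this toy data, the averaged similarity makes the first object's subdescription frequent over both features but infrequent over `x` alone, so `witness_avg` reports a violation. With equality, `witness_eq` is `None`, because equality on a feature set always implies equality on every subset.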

Keywords: Data mining, Frequent patterns, Mixed data, Similarity functions, Downward closure property

Paper URL: https://doi.org/10.1007/s10115-010-0309-9