Mining closed partially ordered patterns, a new optimized algorithm

Authors:

Abstract

Sequence databases are now available in many domains, and their sizes keep growing. Exploring them with new pattern mining approaches that rely on new data structures is therefore important. This paper addresses this data mining challenge by presenting OrderSpan, an algorithm that extracts a set of closed partially ordered patterns from a sequence database. It combines well-known properties of prefixes and suffixes. Furthermore, we extend OrderSpan by adapting efficient optimizations used in the sequential pattern mining domain. The proposed method is flexible and follows the sequential pattern paradigm. It explores the search space more efficiently because it skips redundant branches. Experiments were performed on several real datasets to show (1) the effectiveness of the optimized approach and (2) the benefit of closed partially ordered patterns with respect to closed sequential patterns.
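
To make the notion of a closed pattern concrete, the following minimal sketch illustrates closed sequential patterns, the baseline the abstract compares against. It is not OrderSpan and does not handle partially ordered patterns. It is a brute-force illustration that assumes each sequence is a tuple of single items, and all function names (`is_subsequence`, `support`, `frequent_patterns`, `closed_patterns`) are hypothetical choices made for this example. A pattern is treated as closed when no longer pattern that contains it has the same support.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    # `item in remaining` consumes the iterator up to the first match,
    # so the relative order of the pattern items is enforced.
    return all(item in remaining for item in pattern)


def support(pattern, database):
    """Number of sequences in `database` that contain `pattern`."""
    return sum(1 for seq in database if is_subsequence(pattern, seq))


def frequent_patterns(database, min_support):
    """Brute force: every subsequence of every input sequence is a candidate."""
    candidates = set()
    for seq in database:
        for size in range(1, len(seq) + 1):
            for positions in combinations(range(len(seq)), size):
                candidates.add(tuple(seq[i] for i in positions))
    result = {}
    for pattern in candidates:
        count = support(pattern, database)
        if count >= min_support:
            result[pattern] = count
    return result


def closed_patterns(frequent):
    """Keep patterns that have no longer super-pattern with equal support."""
    closed = {}
    for pattern, count in frequent.items():
        absorbed = any(
            len(other) > len(pattern)
            and other_count == count
            and is_subsequence(pattern, other)
            for other, other_count in frequent.items()
        )
        if not absorbed:
            closed[pattern] = count
    return closed


if __name__ == "__main__":
    # Toy database, invented for illustration only.
    db = [("a", "b", "c"), ("a", "c"), ("a", "b", "c"), ("b", "c")]
    for pattern, count in sorted(closed_patterns(frequent_patterns(db, 2)).items()):
        print(pattern, count)
```

On this toy database with a minimum support of 2, seven patterns are frequent but only four are closed: `("c",)`, `("a", "c")`, `("b", "c")` and `("a", "b", "c")`. For instance, `("a",)` is dropped because `("a", "c")` occurs in exactly the same three sequences. The enumeration is exponential in the sequence length, which is the kind of cost that pruning redundant branches aims to avoid.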

Keywords: Data mining, Sequential patterns, Partially ordered patterns

Article history: Received 24 April 2014, Revised 20 December 2014, Accepted 25 December 2014, Available online 17 January 2015.

Official URL: https://doi.org/10.1016/j.knosys.2014.12.027