Non-redundant sequential rules—Theory and algorithm

Authors:

Highlights:

Abstract

A sequential rule expresses a relationship between two series of events happening one after another. Sequential rules are potentially useful for analyzing data in sequential format, ranging from purchase histories and network logs to program execution traces.

In this work, we investigate and propose a syntactic characterization of a non-redundant set of sequential rules, building on past work on compact sets of representative patterns. A rule is redundant if it can be inferred from another rule having the same support and confidence. When the set of mined rules is used as a composite filter, replacing the full set of rules with a non-redundant subset does not affect the accuracy of the filter.

We consider several rule sets based on the composition of various types of pattern sets: generators, projected-database generators, closed patterns and projected-database closed patterns. We investigate the completeness and tightness of these rule sets. We characterize a tight and complete set of non-redundant rules by defining it as the composition of two pattern sets. Furthermore, we propose a compressed set of non-redundant rules, in a spirit similar to how closed patterns serve as a compressed representation of a full set of patterns. Lastly, we propose an algorithm to mine this compressed set of non-redundant rules. A performance study shows that the proposed algorithm significantly improves both the runtime and the compactness of the mined rules compared with mining a full set of sequential rules.
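To make the notion of redundancy in the abstract concrete, the following is a minimal Python sketch rather than the paper's algorithm. It assumes a commonly used formulation that the abstract itself does not spell out: the support of a rule pre -> post is the number of sequences containing pre followed by post as a subsequence, and its confidence is that count divided by the number of sequences containing pre. The helper names (is_subsequence, rule_stats, is_redundant), the subsequence-containment test for "can be inferred", and the toy database are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Rule = Tuple[Sequence[str], Sequence[str]]  # (premise, consequent)


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """True if the events of `pattern` occur in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(event in it for event in pattern)


def support(pattern: Sequence[str], db: List[Sequence[str]]) -> int:
    """Number of sequences in the database that contain `pattern`."""
    return sum(1 for seq in db if is_subsequence(pattern, seq))


def rule_stats(rule: Rule, db: List[Sequence[str]]) -> Tuple[int, float]:
    """Assumed definitions: support = sup(pre ++ post), confidence = sup(pre ++ post) / sup(pre)."""
    pre, post = rule
    sup_pre = support(pre, db)
    sup_rule = support(list(pre) + list(post), db)
    confidence = sup_rule / sup_pre if sup_pre else 0.0
    return sup_rule, confidence


def is_redundant(rule: Rule, other: Rule, db: List[Sequence[str]]) -> bool:
    """Simplified check (an assumption): `rule` is redundant with respect to `other`
    if its concatenated events are a subsequence of `other`'s and both rules
    have the same support and confidence."""
    if rule == other:
        return False
    contained = is_subsequence(list(rule[0]) + list(rule[1]),
                               list(other[0]) + list(other[1]))
    return contained and rule_stats(rule, db) == rule_stats(other, db)


# Toy sequence database (hypothetical data).
db = [
    ["a", "b", "c"],
    ["a", "b", "c", "d"],
    ["b", "d"],
]

short_rule: Rule = (["a"], ["c"])
long_rule: Rule = (["a"], ["b", "c"])

print(rule_stats(short_rule, db))
print(rule_stats(long_rule, db))
print(is_redundant(short_rule, long_rule, db))
```

In this toy database every occurrence of a is followed by b and then c, so the shorter rule a -> c carries no information beyond a -> b, c under the assumed definitions. This is the intuition behind keeping only a non-redundant subset of rules.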

Keywords: Theoretical data mining, Frequent pattern mining, Sequential pattern mining, Sequential rules, Non-redundant rules

Review history: Received 8 January 2009, Accepted 21 January 2009, Available online 4 February 2009.

Official paper URL: https://doi.org/10.1016/j.is.2009.01.002