Constraint-based sequential pattern mining: The consideration of recency and compactness

Authors:

Abstract

Sequential pattern mining is an important data-mining method for discovering time-related behavior in sequence databases. The information it yields can be used in marketing, medical record analysis, sales analysis, and other areas. Existing methods focus only on frequency because they assume that the behavior of sequences does not change over time. The environment that generates the data is often dynamic, however, so this behavior may change. To adapt the discovered patterns to such changes, two new concepts, recency and compactness, are incorporated into traditional sequential pattern mining. Recency causes patterns to adapt quickly to the latest behavior in the sequence database, while compactness ensures that the discovered patterns have reasonable time spans. We call the new patterns CFR-patterns because three concepts (compactness, frequency, and recency) are considered simultaneously. An efficient method is presented for finding CFR-patterns. Empirical evaluation shows that the proposed method is computationally efficient and that it has an advantage over traditional methods when the behavior of sequences changes over time.
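The abstract does not give formal definitions, so the following is only an illustrative sketch of how the three constraints might interact, not the authors' algorithm. It assumes that a sequence is a time-ordered list of (timestamp, item) events, that a pattern is compact when its first and last matched events lie within `max_span` time units, and that it is recent when its last matched event falls at or after `recent_after`. The names `has_cfr_occurrence`, `cfr_support`, `max_span`, and `recent_after` are hypothetical.

```python
from typing import List, Tuple

Event = Tuple[int, str]  # (timestamp, item); hypothetical representation


def has_cfr_occurrence(seq: List[Event], pattern: List[str],
                       max_span: int, recent_after: int) -> bool:
    """Return True if `seq` holds one occurrence of `pattern` that is both
    compact (span <= max_span) and recent (ends at or after recent_after).
    Both definitions are assumptions made for this sketch."""
    if not pattern:
        return False
    for end in range(len(seq) - 1, -1, -1):
        t_end, item = seq[end]
        if item != pattern[-1] or t_end < recent_after:
            continue
        # Match the remaining items backward, always taking the latest
        # possible event; this gives the smallest span for this end point.
        k = len(pattern) - 2
        pos = end - 1
        t_start = t_end
        while k >= 0 and pos >= 0:
            if seq[pos][1] == pattern[k]:
                t_start = seq[pos][0]
                k -= 1
            pos -= 1
        if k < 0 and t_end - t_start <= max_span:
            return True
    return False


def cfr_support(db: List[List[Event]], pattern: List[str],
                max_span: int, recent_after: int) -> float:
    """Fraction of sequences containing a compact, recent occurrence."""
    hits = sum(has_cfr_occurrence(s, pattern, max_span, recent_after)
               for s in db)
    return hits / len(db)


# Toy database: every sequence contains a -> b, but only two do so
# both recently and within a short time span.
db = [
    [(1, "a"), (2, "b"), (9, "a"), (10, "b")],  # recent and compact
    [(1, "a"), (2, "b")],                       # compact but old
    [(1, "a"), (10, "b")],                      # recent but spread out
    [(8, "a"), (9, "c"), (10, "b")],            # recent and compact
]
support = cfr_support(db, ["a", "b"], max_span=3, recent_after=8)
```

In this toy database a frequency-only count would credit all four sequences with the pattern a, b, whereas the assumed recency and compactness checks keep only the first and last, which is the kind of filtering the abstract describes.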

Keywords: Sequential pattern, Constraint-based mining, Temporal database

Review history: Received 26 July 2004, Revised 29 May 2005, Accepted 20 October 2005, Available online 30 November 2005.

Official paper URL: https://doi.org/10.1016/j.dss.2005.10.006