Mining high utility patterns in interval-based event sequences

Authors:

Highlights:

Abstract

Sequential pattern mining is an interesting research area with a broad range of applications. Most prior research on sequential pattern mining has considered point-based data, where events occur instantaneously. However, in many application domains, events persist over intervals of time of varying lengths. Furthermore, traditional frameworks for sequential pattern mining assume that all events have the same weight or utility. This simplifying assumption neglects the opportunity to find patterns that are informative in terms of utilities, such as profits. To address these issues, we incorporate the concept of utility into interval-based sequences and define a framework to mine high utility patterns in interval-based sequences, i.e., patterns whose utility meets or exceeds a minimum threshold. The proposed framework considers the utility of events while allowing multiple events to occur concurrently and persist over varying periods of time. An algorithm named High Utility Interval-based Pattern Miner (HUIPMiner) is proposed and applied to real datasets. To achieve an efficient solution, HUIPMiner is augmented with two effective pruning strategies. Experimental results show that HUIPMiner is an effective solution to the problem of mining high utility interval-based sequences. Moreover, the proposed pruning strategies reduce the execution time of the algorithm.
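
To make the threshold idea concrete, the following is a minimal sketch and not the HUIPMiner algorithm itself. It assumes a simplified utility model that the abstract does not specify: each event carries a per-time-unit utility, an event's utility is its duration times that rate, and a "pattern" is just a pair of distinct labels whose intervals overlap. The names `IntervalEvent`, `event_utility`, `overlaps`, and `high_utility_pairs` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class IntervalEvent:
    label: str
    start: int           # inclusive start time
    end: int             # exclusive end time
    unit_utility: float  # assumed utility per time unit (not from the paper)


def event_utility(event: IntervalEvent) -> float:
    """Assumed utility model: duration multiplied by per-unit utility."""
    return (event.end - event.start) * event.unit_utility


def overlaps(a: IntervalEvent, b: IntervalEvent) -> bool:
    """True when the two intervals share at least one time unit."""
    return a.start < b.end and b.start < a.end


def high_utility_pairs(sequence, min_utility):
    """Sum utilities of overlapping label pairs and keep those that
    meet or exceed the minimum utility threshold."""
    totals = {}
    for a, b in combinations(sequence, 2):
        if a.label != b.label and overlaps(a, b):
            key = tuple(sorted((a.label, b.label)))
            totals[key] = totals.get(key, 0.0) + event_utility(a) + event_utility(b)
    return {pair: u for pair, u in totals.items() if u >= min_utility}


# Toy sequence: A and B overlap, B and C overlap, A and C do not.
sequence = [
    IntervalEvent("A", 0, 4, 2.0),  # utility 8.0
    IntervalEvent("B", 2, 6, 1.0),  # utility 4.0
    IntervalEvent("C", 5, 7, 3.0),  # utility 6.0
]
result = high_utility_pairs(sequence, min_utility=11)
print(result)
```

With a threshold of 11, only the pair (A, B) with total utility 12.0 is kept; the pair (B, C) totals 10.0 and is filtered out. A real miner would enumerate richer temporal relations and longer patterns, which is where pruning strategies become important.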

Keywords: High utility interval-based, Utility mining, Sequential pattern mining, Temporal pattern, Event sequence

Publication history: Available online 27 August 2021, Version of Record 27 September 2021.

Paper URL: https://doi.org/10.1016/j.datak.2021.101924