Mining frequent itemsets in data streams within a time horizon

Authors:

Highlights:

Abstract

In this paper, we present an algorithm for mining frequent itemsets in a stream of transactions within a limited time horizon. In contrast to other approaches in the literature, the proposed algorithm uses a test window to discard non-frequent itemsets from a set of candidates. The efficiency of this approach relies on the property that the higher the support threshold, the smaller the test window. In addition to a sharp horizon, we consider a smooth window: in many applications of practical interest, not all time slots have the same relevance, e.g., more recent slots can be more interesting than older ones. Smoothness can be determined in both qualitative and quantitative terms. The proposed algorithm is compared with other algorithms. The experimental results show that the proposed solution is faster than other approaches, at a slightly higher cost in memory.
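
To make the difference between a sharp horizon and a smooth window concrete, the following is a minimal Python sketch of weighted support counting over time slots. It is not the paper's algorithm: it does not implement the test window, and it simply enumerates small itemsets by brute force. The function name `weighted_frequent_itemsets`, the per-slot `weights` list, and the `max_size` cap are all assumptions introduced for illustration.

```python
from collections import Counter
from itertools import combinations


def weighted_frequent_itemsets(slots, weights, min_support, max_size=2):
    """Return itemsets whose weighted support over the horizon meets min_support.

    slots       -- time slots, oldest first; each slot is a list of transactions (sets)
    weights     -- one non-negative weight per slot (hypothetical smoothing scheme;
                   all weights equal to 1.0 corresponds to a sharp horizon)
    min_support -- threshold on the weighted fraction of transactions
    max_size    -- largest itemset size to enumerate (brute force, illustration only)
    """
    if len(slots) != len(weights):
        raise ValueError("need exactly one weight per slot")

    counts = Counter()
    total = 0.0
    for slot, w in zip(slots, weights):
        # Each transaction in this slot contributes weight w to the denominator.
        total += w * len(slot)
        for transaction in slot:
            items = sorted(transaction)
            for size in range(1, max_size + 1):
                for itemset in combinations(items, size):
                    counts[itemset] += w

    if total == 0:
        return {}
    # Keep only itemsets whose weighted support reaches the threshold.
    return {s: c / total for s, c in counts.items() if c / total >= min_support}


if __name__ == "__main__":
    slots = [[{"a", "b"}, {"a"}], [{"b", "c"}, {"b"}]]
    print(weighted_frequent_itemsets(slots, [1.0, 1.0], 0.5))  # sharp horizon
    print(weighted_frequent_itemsets(slots, [0.5, 1.0], 0.5))  # older slot down-weighted
```

With equal weights, both `a` and `b` reach the 0.5 threshold in this toy stream; once the older slot is down-weighted, only `b` remains frequent, which illustrates how a smooth window shifts the result toward recent slots.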

Keywords: Data mining, Mining methods and algorithms, Frequent itemsets

Review history: Received 18 August 2011, Revised 22 October 2013, Accepted 22 October 2013, Available online 26 December 2013.

Official URL: https://doi.org/10.1016/j.datak.2013.10.002