A sliding window method for finding top-k path traversal patterns over streaming Web click-sequences

Abstract

Online mining of path traversal patterns from Web click-streams is one of the most important problems in Web usage mining. In this paper, we propose a sliding window-based Web data mining algorithm, called Top-SW (Top-k path traversal patterns of Stream sliding Window), to discover the set of top-k path traversal patterns from streaming maximal forward references, where k is the desired number of path traversal patterns to be mined. A new summary data structure, called Top-list (a list of Top-k path traversal patterns), is developed to maintain the essential information about the top-k path traversal patterns in the current stream of maximal forward references. Experimental studies show that the proposed Top-SW algorithm is an efficient, single-pass algorithm for mining the set of top-k path traversal patterns from a continuous stream of maximal forward references.
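
To make the problem setting concrete, the following is a minimal Python sketch of top-k path counting over a sliding window of maximal forward references. It is not the Top-SW algorithm or the Top-list structure, whose details the abstract does not give. It is a naive baseline under stated assumptions: a path traversal pattern is taken to be any contiguous sub-path of a reference, support is the number of references in the window that contain the pattern, and ties are broken in favor of longer paths. The names `SlidingTopK`, `sub_paths`, `window_size`, and `top_k` are hypothetical.

```python
from collections import Counter, deque
from typing import Deque, List, Sequence, Tuple

Path = Tuple[str, ...]


def sub_paths(reference: Sequence[str]) -> List[Path]:
    """Return the distinct contiguous sub-paths of one maximal forward reference."""
    n = len(reference)
    return list({tuple(reference[i:j]) for i in range(n) for j in range(i + 1, n + 1)})


class SlidingTopK:
    """Naive baseline (an assumption, not Top-SW): exact support counts per window."""

    def __init__(self, window_size: int, k: int) -> None:
        self.window_size = window_size  # number of references kept in the window
        self.k = k                      # number of patterns to report
        self.window: Deque[List[Path]] = deque()
        self.support: Counter = Counter()

    def add(self, reference: Sequence[str]) -> None:
        """Slide the window by one reference, updating support counts in place."""
        if len(self.window) == self.window_size:
            # Expire the oldest reference and retract its contribution.
            for path in self.window.popleft():
                self.support[path] -= 1
                if self.support[path] == 0:
                    del self.support[path]
        paths = sub_paths(reference)
        self.window.append(paths)
        self.support.update(paths)

    def top_k(self) -> List[Tuple[Path, int]]:
        """Return up to k patterns ordered by support, then length, then path."""
        ranked = sorted(
            self.support.items(),
            key=lambda item: (-item[1], -len(item[0]), item[0]),
        )
        return ranked[: self.k]
```

This baseline stores every sub-path of every reference in the window, so its memory grows quadratically with reference length. A compact summary structure such as the Top-list described in the abstract would presumably avoid that cost.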

Keywords: Data mining, Web usage mining, Data streams, Top-k pattern mining, Path traversal patterns, Stream sliding windows

Publication history: Available online 13 May 2008.

Official paper URL: https://doi.org/10.1016/j.eswa.2008.05.025