Mining web access patterns with super-pattern constraint

Authors: Trang Van, Atsuo Yoshitaka, Bac Le

Abstract

We consider the problem of mining web access patterns with a super-pattern constraint. This constraint requires that the sequential patterns mined from the sequence database contain patterns from a particular set as sub-patterns. One common application of this constraint is web usage mining, which mines user access behavior on the web. In this paper, we introduce an efficient strategy for mining web access patterns with a super-pattern constraint that requires only one database scan. First, we present the MWAPC (Mining Web Access Patterns based on super-pattern Constraint) algorithm, in which each frequent pattern is checked to determine whether it contains at least one pattern from a user-defined set of patterns. We then develop a more effective algorithm, called EMWAPC, that prunes the search space at the beginning of the mining process and, based on three proposed propositions, avoids checking the constraints one by one. We have conducted experiments on real web log databases. The experimental results show that the proposed algorithms outperform the previous methods.
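
The constraint check described above can be illustrated with a minimal sketch. It assumes that "contains as a sub-pattern" means an order-preserving, not necessarily contiguous, subsequence, and that a pattern is a sequence of page identifiers. The function names and sample data are hypothetical and are not taken from the paper. The sketch only filters an already-mined list of frequent patterns; it does not reproduce MWAPC, EMWAPC, or their data structures.

```python
from typing import Iterable, List, Sequence, Tuple

Pattern = Tuple[str, ...]


def is_subpattern(sub: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if `sub` occurs in `pattern` as an order-preserving subsequence."""
    remaining = iter(pattern)
    # `item in remaining` consumes the iterator up to the first match,
    # so each later item must be found after the previous one.
    return all(item in remaining for item in sub)


def satisfies_constraint(pattern: Sequence[str],
                         constraints: Iterable[Sequence[str]]) -> bool:
    """Assumed constraint: the pattern contains at least one constraint pattern."""
    return any(is_subpattern(c, pattern) for c in constraints)


def filter_patterns(frequent_patterns: Iterable[Pattern],
                    constraints: List[Pattern]) -> List[Pattern]:
    """Keep only the frequent patterns that satisfy the constraint."""
    return [p for p in frequent_patterns if satisfies_constraint(p, constraints)]


# Hypothetical frequent patterns and user-defined constraint set.
frequent = [
    ("home", "cart"),
    ("home", "search", "cart", "pay"),
    ("search", "home"),
]
constraints = [("home", "pay"), ("search", "cart")]

kept = filter_patterns(frequent, constraints)
print(kept)
```

Checking every frequent pattern against every constraint in this way is the one-by-one cost that, according to the abstract, EMWAPC avoids by pruning the search space at the start of mining.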

Keywords: Web access pattern mining, Super-pattern constraint, Dynamic bit vector, Prefix-web access pattern tree

Paper URL: https://doi.org/10.1007/s10489-018-1182-6