A MapReduce solution for incremental mining of sequential patterns from big data

Authors:

Highlights:

• A two-phase MapReduce algorithm is proposed for incremental mining of sequential patterns.

• Backward mining makes use of the knowledge obtained during the previous mining process.

• Co-occurrence reverse map data structure efficiently generates the candidate sequences.

• Candidate generation rules avoid the generation of too many false candidates.

• Three novel early prune properties are introduced based on the study of item co-occurrences.

Keywords: Big data, Data mining, Incremental mining, MapReduce framework, Sequential pattern mining

Review history: Received 11 August 2018, Revised 1 May 2019, Accepted 12 May 2019, Available online 14 May 2019, Version of Record 20 May 2019.

Official paper URL: https://doi.org/10.1016/j.eswa.2019.05.013