Damped sliding based utility oriented pattern mining over stream data
Abstract
High utility pattern mining (HUPM) discovers meaningful patterns in non-binary data by considering the features and utility of items. Data that is continually generated over time is called stream data, and various HUPM-based techniques have been proposed for processing it. Sliding-window-based HUPM mines patterns from the data held in a window, so only the most recent data is managed. However, in stream data, newly created data has more influence than relatively old data, so the data stored in a window should not all be given the same importance. In this paper, we propose an efficient sliding-window-based algorithm that mines high utility patterns from damped stream data, where new data is constantly inserted, while giving more weight to the latest data. Specifically, our technique divides the stream into multiple fixed-size batches and uses a decaying factor to weight each batch in the window according to its arrival time. We conduct experiments on real and synthetic datasets to compare our approach with state-of-the-art algorithms. The experimental results show that the proposed method outperforms the competitors in terms of run time, memory usage, and scalability.
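The batch-weighting idea described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm or data structure: the class `DampedWindow`, the function `itemset_utility`, the weighting rule `decay ** age` (with age counted in batches), and the brute-force candidate enumeration are all assumptions made for illustration. Each transaction is assumed to be a dict mapping an item to its utility in that transaction.

```python
from collections import deque
from itertools import combinations


def itemset_utility(transaction, itemset):
    """Utility of an itemset in one transaction (hypothetical helper).

    Returns the sum of the item utilities if every item of the itemset
    occurs in the transaction, otherwise 0.
    """
    if all(item in transaction for item in itemset):
        return float(sum(transaction[item] for item in itemset))
    return 0.0


class DampedWindow:
    """Sliding window of fixed-size batches with a decaying factor.

    Assumption: a batch that is `age` batches older than the newest one
    is weighted by decay ** age, so the newest batch has weight 1.
    """

    def __init__(self, window_size, decay):
        # deque(maxlen=...) drops the oldest batch when the window is full
        self.batches = deque(maxlen=window_size)
        self.decay = decay

    def add_batch(self, batch):
        self.batches.append(batch)

    def damped_utility(self, itemset):
        newest = len(self.batches) - 1
        total = 0.0
        for index, batch in enumerate(self.batches):
            weight = self.decay ** (newest - index)
            total += weight * sum(itemset_utility(t, itemset) for t in batch)
        return total

    def high_utility_patterns(self, min_util, max_len=2):
        # Brute-force enumeration for illustration only; an efficient
        # miner would prune candidates instead of testing all of them.
        items = sorted({i for b in self.batches for t in b for i in t})
        result = {}
        for size in range(1, max_len + 1):
            for itemset in combinations(items, size):
                utility = self.damped_utility(itemset)
                if utility >= min_util:
                    result[itemset] = utility
        return result


window = DampedWindow(window_size=2, decay=0.5)
window.add_batch([{"a": 4, "b": 2}])
window.add_batch([{"a": 2, "c": 6}])
window.add_batch([{"a": 1, "b": 3}])  # the first batch slides out here
patterns = window.high_utility_patterns(min_util=4.0)
```

In this toy run the window holds only the last two batches, and the older of the two contributes half of its utility, so an itemset that appears only in old data needs a larger raw utility to pass the threshold than one that appears in the newest batch.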
Keywords: Data mining, High utility patterns, Stream data, Sliding window, Decaying factor
Article history: Received 21 July 2020, Revised 9 November 2020, Accepted 2 December 2020, Available online 13 December 2020, Version of Record 24 December 2020.
Official article URL: https://doi.org/10.1016/j.knosys.2020.106653