Detecting concept change in dynamic data streams

Authors: Russel Pears, Sripirakas Sakthithasan, Yun Sing Koh

Abstract

In this research we present a novel approach to the concept change detection problem. Change detection is a fundamental issue in data stream mining, because classification models must be updated when the underlying data distribution changes significantly. A number of change detection approaches have been proposed, but each suffers from limitations in one or more key performance factors: high computational complexity, poor sensitivity to gradual change, or, conversely, a high false positive rate. Our approach uses reservoir sampling to build a sequential change detection model that offers statistically sound guarantees on false positive and false negative rates while having much lower computational complexity than the ADWIN concept drift detector. Extensive experimentation on a wide variety of datasets shows that the scheme also has a lower false detection rate than ADWIN while maintaining a competitive true detection rate.
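
The abstract names reservoir sampling as a building block but does not describe how the reservoir is maintained. For orientation only, the classic single-pass technique (often called Algorithm R) can be sketched as below. This is a generic illustration and not the authors' detector: the function name `reservoir_sample` and its parameters are assumptions made for this example, and the sequential hypothesis test that the paper builds on top of the sample is not shown.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from an iterable.

    Generic Algorithm R sketch (hypothetical helper, not from the paper).
    Uses O(k) memory and a single pass over the stream.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item is kept with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Keep a fixed-size sample of 10 values from a stream of 1000.
    print(reservoir_sample(range(1000), 10, random.Random(0)))
```

Because the memory footprint stays fixed at `k` items regardless of stream length, a sample of this kind is a natural fit for the low computational cost the abstract emphasises.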

Keywords: Concept drift detection, Data stream mining, Sequential hypothesis testing, Reservoir sampling

Official paper page: https://doi.org/10.1007/s10994-013-5433-9