An information entropy-based approach to outlier detection in rough sets

Authors:

Highlights:

Abstract

Information entropy, introduced by Shannon in information theory, provides an effective measure of the uncertainty of a given system. It also appears to be a competitive mechanism for measuring uncertainty in rough sets. Many researchers have applied information entropy to rough sets and proposed various entropy models for them. In particular, Düntsch et al. presented a well-justified information entropy model for measuring uncertainty in rough sets. In this paper, we demonstrate the application of this model to a specific data mining problem: outlier detection. Building on Düntsch’s information entropy model, we propose a novel definition of outliers in rough sets, namely IE (information entropy)-based outliers. We also give an algorithm for finding such outliers, and we demonstrate the effectiveness of the IE-based method for outlier detection on two publicly available data sets.
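As a rough illustration of the idea, the sketch below computes the Shannon entropy of the partition that a set of attributes induces on a small decision table, then scores each object by how much that entropy falls when the object is removed. This is a minimal sketch under stated assumptions, not the definition or algorithm from the paper: the leave-one-out score, the function names `partition_entropy` and `entropy_drop_scores`, and the toy table are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def partition_entropy(rows, attrs):
    """Shannon entropy of the partition induced by `attrs` on `rows`.

    Objects with identical values on `attrs` are indiscernible and fall
    into the same equivalence class.
    """
    n = len(rows)
    if n == 0:
        return 0.0
    class_sizes = Counter(tuple(row[a] for a in attrs) for row in rows)
    return -sum((size / n) * log2(size / n) for size in class_sizes.values())


def entropy_drop_scores(rows, attrs):
    """Hypothetical leave-one-out score (an assumption, not the paper's
    IE-based outlier factor): entropy of the full table minus the entropy
    of the table with object i removed. Larger values mean the object
    contributes more uncertainty."""
    base = partition_entropy(rows, attrs)
    return [
        base - partition_entropy(rows[:i] + rows[i + 1:], attrs)
        for i in range(len(rows))
    ]


# Toy information table (illustrative data only).
table = [
    {"color": "red", "size": "small"},
    {"color": "red", "size": "small"},
    {"color": "red", "size": "small"},
    {"color": "blue", "size": "large"},
]

scores = entropy_drop_scores(table, ["color"])
most_unusual = scores.index(max(scores))  # index of the highest-scoring object
```

In this toy table the lone "blue" object forms its own equivalence class, so removing it collapses the partition to a single class and produces the largest entropy drop. The paper's IE-based definition may weight attributes and objects differently.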

Keywords: Information entropy, Outlier detection, Rough sets, Data mining

Publication history: Available online 21 February 2010.

Official paper URL: https://doi.org/10.1016/j.eswa.2010.02.087