Enhancing data analysis: uncertainty-resistance method for handling incomplete data

作者:Javad Hamidzadeh, Mona Moradi

摘要

In data analysis, incomplete data commonly occurs and can have significant effects on the conclusions that can be drawn from the data. Incomplete data cause another problem, so-called uncertainty which leads to producing unreliable results. Hence, developing effective techniques to impute these missing values is crucial. Missing or incomplete data and noise are two common sources of uncertainty. In this paper, an effective method for imputing missing values is introduced which is robust to uncertainties that are arising from incompleteness and noise. A kernel-based method for removing the noise is designed. Using the belief function theory, the class of incomplete data is determined. Finally, every missing dimension is imputed considering the mean value of the same dimension of the members belonging to the determined class. The performance has been evaluated on real-world data sets from UCI repository. The results of the experiments have been compared with state-of-the-art methods, which show the superiority of the proposed method regarding classification accuracy.

论文关键词:Incomplete data, Missing values, Belief function theory, Mapped data, Classification

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10489-019-01514-4