A selective Bayes Classifier for classifying incomplete data based on gain ratio

Abstract

Real-world data sets are often incomplete for a variety of reasons. Although numerous classification algorithms have been proposed, most of them assume complete data, so methods for constructing classifiers from incomplete data deserve more attention. After analyzing the main methods for handling incomplete data in classification, this paper presents a selective Bayes classifier for incomplete data that uses a simpler formula for computing gain ratio. The proposed algorithm does not require the assumptions about the data set that previous methods for handling incomplete data in classification depend on. Experiments on 12 benchmark incomplete data sets show that the method can greatly improve classification accuracy. Furthermore, it can sharply reduce the number of attributes, which greatly simplifies both the data sets and the classifiers.
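The abstract does not state the paper's simplified gain-ratio formula, so it cannot be reproduced here. As a rough illustration of the general idea only, the sketch below computes the textbook gain ratio (information gain divided by split information) for a single discrete attribute, using only the rows in which that attribute is observed. Dropping missing rows per attribute is an assumption made for this example, and the names `entropy`, `gain_ratio`, and `missing` are hypothetical; none of this should be read as the authors' method.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy in bits of a list of hashable values."""
    n = len(items)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(values, classes, missing=None):
    """Textbook gain ratio of one attribute with respect to the class.

    Assumption for this sketch: rows whose attribute value is `missing`
    are simply skipped for this attribute.
    """
    pairs = [(v, c) for v, c in zip(values, classes) if v is not missing]
    if not pairs:
        return 0.0
    n = len(pairs)

    # Group the class labels by observed attribute value.
    groups = {}
    for v, c in pairs:
        groups.setdefault(v, []).append(c)

    # Information gain = H(C) - H(C | A), on observed rows only.
    class_entropy = entropy([c for _, c in pairs])
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = class_entropy - conditional

    # Split information = H(A); guard against a constant attribute.
    split_info = entropy([v for v, _ in pairs])
    return gain / split_info if split_info > 0 else 0.0
```

In a selective classifier of this kind, attributes could then be ranked by such a score and only the highest-ranked ones passed to a naive Bayes model; the selection rule the paper actually uses is not given in the abstract.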

Keywords: Bayesian classifiers, Feature selection, Incomplete data, Gain ratio

Review history: Received 18 April 2007, Accepted 21 March 2008, Available online 29 March 2008.

Official paper page: https://doi.org/10.1016/j.knosys.2008.03.013