A semi-hard voting combiner scheme to ensemble multi-class probabilistic classifiers

Author: Rosario Delgado

Abstract

Ensembling probabilistic classifiers is a widely applied technique that builds a new classifier by combining a set of base classifiers. Among the different schemes that can be used to construct the ensemble, we focus on the simple majority vote (MV), one of the most popular combiner schemes and the foundation of the bagging meta-algorithm. We propose a non-trainable weighted version of the simple majority vote rule that, instead of assigning a weight to each base classifier based on its estimated accuracy, uses the confidence level (CL), the standard measure of the degree of support that each base classifier gives to its prediction. In the binary case, we prove that if the number of base classifiers is odd, the accuracy of this scheme is greater than that of the majority vote. Moreover, through a sensitivity analysis, we show in the multi-class setting that its resilience to the estimation error of the probabilities assigned by the classifiers to each class is greater than that of the average scheme. We also consider another simple measure of the degree of support, the modified confidence level (MCL), which incorporates additional knowledge of the probability distribution over the classes. The usefulness for bagging of the proposed weighted majority vote based on CL or MCL is checked through a series of experiments with publicly available databases. The proposed scheme outperforms the simple majority vote, with statistically significant improvements in two performance measures, Accuracy and the Matthews Correlation Coefficient (MCC), while holding up against the average combiner (which the majority vote does not) and being less computationally demanding.
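The core idea of the abstract's semi-hard voting scheme can be sketched in a few lines: each base classifier casts a hard vote for its predicted class, but that vote is weighted by the classifier's confidence level (the maximum class probability it outputs) rather than counting as one. The following is a minimal illustrative sketch, not the paper's implementation; the function name and the toy probabilities are assumptions made for the example.

```python
import numpy as np

def cl_weighted_vote(probas):
    """Semi-hard voting sketch: each base classifier votes for its
    predicted class, weighted by its confidence level (CL), i.e. the
    maximum class probability it assigns. Illustrative only."""
    probas = np.asarray(probas)      # shape: (n_classifiers, n_classes)
    votes = probas.argmax(axis=1)    # hard prediction of each classifier
    weights = probas.max(axis=1)     # CL of each prediction
    scores = np.zeros(probas.shape[1])
    for cls, w in zip(votes, weights):
        scores[cls] += w             # accumulate CL-weighted votes
    return int(scores.argmax())

# Toy example with three classifiers over three classes (made up):
probas = [[0.40, 0.35, 0.25],   # votes class 0 with CL 0.40
          [0.40, 0.30, 0.30],   # votes class 0 with CL 0.40
          [0.05, 0.90, 0.05]]   # votes class 1 with CL 0.90
```

On this toy input, simple majority vote would pick class 0 (two votes to one), whereas the CL-weighted score of class 1 (0.90) exceeds that of class 0 (0.40 + 0.40 = 0.80), so the semi-hard scheme picks class 1, illustrating how confidence can overturn a low-confidence majority.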

Keywords: Ensemble classifier, Confidence Level, Majority vote, Average rule, Error sensitivity, Bagging

Paper URL: https://doi.org/10.1007/s10489-021-02447-7