Pruning of Error Correcting Output Codes by optimization of accuracy–diversity trade off

Authors: Süreyya Özöğür-Akyüz, Terry Windeatt, Raymond Smith

Abstract

Ensemble learning is a method of combining learners to obtain more reliable and accurate predictions in supervised and unsupervised learning. However, ensemble sizes are sometimes unnecessarily large, which leads to additional memory usage, computational overhead and decreased effectiveness. To overcome such side effects, pruning algorithms have been developed; since pruning is a combinatorial problem, finding the optimal subset of the ensemble exactly is computationally infeasible. Different types of heuristic algorithms have been developed to obtain an approximate solution, but they lack a theoretical guarantee. Error Correcting Output Codes (ECOC) is one of the well-known ensemble techniques for multiclass classification, which combines the outputs of binary base learners to predict the classes for multiclass data. In this paper, we propose a novel approach for pruning the ECOC matrix by utilizing accuracy and diversity information simultaneously. All existing pruning methods need the size of the ensemble as a parameter, so their performance depends on the size of the ensemble. Our unparametrized pruning method is novel in being independent of the size of the ensemble. Experimental results show that our pruning method is mostly better than other existing approaches.
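
As an illustration of the ECOC idea mentioned in the abstract, the following is a minimal sketch of Hamming-distance decoding over a code matrix, with an optional column subset standing in for a pruned ensemble. The matrix values, the function names `hamming` and `decode`, and the hand-picked `keep` columns are all hypothetical; this is not the paper's pruning method, which selects columns by optimizing an accuracy–diversity trade-off.

```python
# Hypothetical 4-class code matrix: one row (codeword) per class,
# one column per binary base learner. Values are illustrative only.
CODE_MATRIX = [
    [1, 1, 1, 0, 0, 0],  # class 0
    [1, 0, 0, 1, 1, 0],  # class 1
    [0, 1, 0, 1, 0, 1],  # class 2
    [0, 0, 1, 0, 1, 1],  # class 3
]


def hamming(a, b):
    """Count the positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(outputs, code_matrix, keep=None):
    """Return the class whose codeword is nearest to the learner outputs.

    `keep` lists the column indices (base learners) retained after pruning;
    None means the full ensemble is used. Ties go to the lowest class index.
    """
    columns = list(range(len(outputs))) if keep is None else list(keep)
    bits = [outputs[j] for j in columns]
    distances = [hamming(bits, [row[j] for j in columns]) for row in code_matrix]
    return distances.index(min(distances))


# Class 2's codeword with the first bit flipped by a faulty base learner.
noisy = [1, 1, 0, 1, 0, 1]
full_prediction = decode(noisy, CODE_MATRIX)
pruned_prediction = decode(noisy, CODE_MATRIX, keep=[1, 2, 3, 4])
```

In this toy matrix every pair of codewords differs in four positions, so the full ensemble tolerates one wrong base learner. Dropping columns shrinks that margin, which is why a pruning criterion has to weigh the accuracy of individual learners against the diversity they contribute.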

Keywords: Ensemble learning, Ensemble pruning, Error Correcting Output Codes, DC programming, Support vector machines, Integer programming

Paper URL: https://doi.org/10.1007/s10994-014-5477-5