Creating diversity in ensembles using synthetic neighborhoods of training samples

Authors: Zhi Chen, Tao Lin, Rui Chen, Yingtao Xie, Hongyan Xu

Abstract

Diversity among base classifiers is known to be a key driver in constructing an effective ensemble classifier. Several methods have been proposed to construct diverse base classifiers using artificially generated training samples. However, in these methods, diversity is often obtained at the expense of the accuracy of the base classifiers. Inspired by the localized generalization error model, this study proposes a new sample generation method. When preparing different training sets for base classifiers, the proposed method generates samples located within limited neighborhoods of the corresponding training samples. The generated samples differ from the original training samples, but they also expand different parts of the original training data. Learning from these datasets can yield a set of base classifiers that are accurate in different regions of the input space while maintaining appropriate diversity. Experiments performed on 26 benchmark datasets showed that (1) the proposed method significantly outperformed several state-of-the-art ensemble methods in terms of classification accuracy, and (2) the proposed method was significantly more efficient than other sample-generation-based ensemble methods.
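The neighborhood-based generation step described in the abstract can be illustrated with a minimal sketch. It is one plausible reading and not the authors' implementation. The abstract does not specify the details, so several things here are assumptions: the neighborhood is taken to be a hypercube of half-width `q` around each training sample, offsets are drawn uniformly, each generated sample inherits the label of its source sample, and each base classifier's training set is the original data plus one fresh batch of generated samples. The function names `neighborhood_samples` and `make_training_sets` are hypothetical.

```python
import numpy as np

def neighborhood_samples(X, q, rng):
    """Draw one synthetic point per training sample, uniformly within
    the hypercube of half-width q centred on that sample (assumed shape)."""
    offsets = rng.uniform(-q, q, size=X.shape)
    return X + offsets

def make_training_sets(X, y, n_sets, q, seed=0):
    """Build n_sets augmented training sets: the original data plus a
    fresh batch of neighborhood samples that inherit their source labels."""
    rng = np.random.default_rng(seed)
    sets = []
    for _ in range(n_sets):
        X_new = neighborhood_samples(X, q, rng)
        # Assumption: generated samples are appended to the original data.
        sets.append((np.vstack([X, X_new]), np.concatenate([y, y])))
    return sets

# Toy data: three 2-D samples with binary labels.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
y = np.array([0, 1, 0])
training_sets = make_training_sets(X, y, n_sets=3, q=0.1)
```

Each entry of `training_sets` could then be passed to a separate base learner. Because every batch is drawn independently, the learners see slightly different data, while a small `q` keeps the generated points close to real samples.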

Keywords: Classifier ensemble, Diversity, Generalization ability, Localized generalization error model, Sample generation

Paper URL: https://doi.org/10.1007/s10489-017-0922-3