A lazy bagging approach to classification

Authors:

Highlights:

Abstract

In this paper, we propose lazy bagging (LB), which builds bootstrap replicate bags based on the characteristics of test instances. Upon receiving a test instance xk, LB trims bootstrap bags by taking into consideration xk's nearest neighbors in the training data. Our hypothesis is that an unlabeled instance's nearest neighbors provide valuable information to enhance local learning and generate a classifier with refined decision boundaries emphasizing the test instance's surrounding region. In particular, by taking full advantage of xk's nearest neighbors, classifiers are able to reduce classification bias and variance when classifying xk. As a result, LB, which is built on these classifiers, can significantly reduce classification error, compared with the traditional bagging (TB) approach. To investigate LB's performance, we first use carefully designed synthetic data sets to gain insight into why LB works and under which conditions it can outperform TB. We then test LB against four rival algorithms on a large suite of 35 real-world benchmark data sets using a variety of statistical tests. Empirical results confirm that LB can statistically significantly outperform alternative methods in terms of reducing classification error.
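The core idea described above can be sketched in a few lines: for each test instance, locate its nearest neighbors in the training data and fold them into every bootstrap bag, so each classifier in the per-instance ensemble is biased toward the test instance's local region. This is a minimal illustrative sketch, not the authors' exact procedure; the neighbor count `k`, the bag composition (bootstrap sample of size n−k plus the k neighbors), and the use of a decision tree as the base learner are assumptions for demonstration.

```python
# Illustrative sketch of lazy bagging (LB); parameter choices and the exact
# way neighbors are mixed into each bag are assumptions, not the paper's spec.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def lazy_bagging_predict(X_train, y_train, x_test, n_bags=10, k=5, seed=0):
    """Predict the label of a single test instance with a lazily built ensemble."""
    rng = np.random.default_rng(seed)
    n = len(X_train)
    # Find x_test's k nearest neighbors in the training data (Euclidean distance).
    dists = np.linalg.norm(X_train - x_test, axis=1)
    nn_idx = np.argsort(dists)[:k]
    votes = []
    for _ in range(n_bags):
        # Bootstrap sample of the remaining capacity of the bag ...
        boot = rng.integers(0, n, size=n - k)
        # ... always augmented with x_test's nearest neighbors, refining the
        # decision boundary around the test instance's local region.
        bag = np.concatenate([boot, nn_idx])
        clf = DecisionTreeClassifier(random_state=0)
        clf.fit(X_train[bag], y_train[bag])
        votes.append(clf.predict(x_test.reshape(1, -1))[0])
    # Majority vote across the per-instance ensemble.
    vals, counts = np.unique(votes, return_counts=True)
    return vals[np.argmax(counts)]
```

Note that, unlike traditional bagging, the ensemble here cannot be trained once up front: the bags depend on the test instance, which is what makes the approach "lazy".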

Keywords: Classification, Classifier ensemble, Bagging, Lazy learning

Article history: Received 12 May 2007, Revised 5 March 2008, Accepted 8 March 2008, Available online 19 March 2008.

Official URL: https://doi.org/10.1016/j.patcog.2008.03.008