An incremental node embedding technique for error correcting output codes

Authors:

Abstract

The error correcting output codes (ECOC) technique is a useful way to extend any binary classifier to the multiclass case. The design of an ECOC matrix usually assumes a number of dichotomizers that is fixed a priori. We argue that the selection and number of dichotomizers must depend on the performance of the ensemble code in relation to the problem domain. In this paper, we present a novel approach that improves the performance of any initial output coding by extending it in a sub-optimal way. The proposed strategy creates the new dichotomizers by minimizing the confusion matrix among classes, guided by a validation subset. A weighted methodology is proposed to take into account the different relevance of each dichotomizer. As a result, overfitting is avoided and small codes with good generalization performance are obtained. In the decoding step, we introduce a new strategy that follows the principle that positions coded with the symbol zero should have little influence on the results. We compare our strategy with other well-known ECOC strategies on the UCI database, and the results show that it represents a significant improvement.
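To make the decoding principle concrete, the following is a minimal sketch of ternary ECOC decoding, not the method proposed in the paper. It assumes a standard one-versus-one coding matrix with symbols +1, -1 and 0, and a weighted Hamming-style distance that skips positions coded zero and normalizes by the total weight of the remaining positions. The function names `one_vs_one_matrix` and `decode`, and the `weights` parameter, are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def one_vs_one_matrix(n_classes):
    """Build a ternary one-versus-one coding matrix.

    Rows are classes, columns are dichotomizers. Each column trains one
    class (+1) against another (-1); all other classes are coded 0.
    """
    pairs = list(combinations(range(n_classes), 2))
    matrix = [[0] * len(pairs) for _ in range(n_classes)]
    for col, (pos_class, neg_class) in enumerate(pairs):
        matrix[pos_class][col] = +1
        matrix[neg_class][col] = -1
    return matrix


def decode(outputs, matrix, weights=None):
    """Return the class whose codeword is closest to the dichotomizer outputs.

    Illustrative assumption: positions coded 0 contribute nothing to the
    distance, and the distance is normalized by the total weight of the
    non-zero positions so that sparse codewords are not favored.
    """
    if weights is None:
        weights = [1.0] * len(outputs)
    best_class, best_dist = None, float("inf")
    for cls, codeword in enumerate(matrix):
        mismatch, total = 0.0, 0.0
        for out, code, w in zip(outputs, codeword, weights):
            if code == 0:
                continue  # zero-coded position: ignored for this class
            total += w
            if out != code:
                mismatch += w
        dist = mismatch / total if total > 0 else float("inf")
        if dist < best_dist:
            best_class, best_dist = cls, dist
    return best_class


if __name__ == "__main__":
    M = one_vs_one_matrix(3)
    # Outputs of the three dichotomizers (0 vs 1, 0 vs 2, 1 vs 2).
    print(decode([+1, +1, -1], M))
```

In this sketch a larger entry in `weights` makes the corresponding dichotomizer count more in the distance, which loosely mirrors the idea of giving each dichotomizer a different relevance; how the paper actually derives its weights is not stated in the abstract.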

Keywords: Multiclass classification, Error correcting output codes, Ensemble of dichotomizers, Codeword design, One-versus-one, One-versus-all

Article history: Received 20 June 2006, Revised 17 April 2007, Accepted 20 April 2007, Available online 3 May 2007.

Official paper URL: https://doi.org/10.1016/j.patcog.2007.04.008