Analyzing the presence of noise in multi-class problems: alleviating its influence with the One-vs-One decomposition

Authors: José A. Sáez, Mikel Galar, Julián Luengo, Francisco Herrera

Abstract

Noise in data is a common problem with several negative consequences for classification. In multi-class problems, these consequences are aggravated in terms of the accuracy, building time, and complexity of the classifiers. In these cases, an interesting approach to reducing the effect of noise is to decompose the problem into several binary subproblems, which reduces the complexity and, consequently, distributes the effects of noise across these subproblems. This paper analyzes the use of decomposition strategies, and more specifically the One-vs-One scheme, to deal with noisy multi-class datasets. To investigate whether decomposition is able to reduce the effect of noise, a large number of datasets are created by introducing different levels and types of noise, as suggested in the literature. Several well-known classification algorithms, with and without decomposition, are trained on them to check when decomposition is advantageous. The results show that methods using the One-vs-One strategy lead to better performance and more robust classifiers when dealing with noisy data, especially with the most disruptive noise schemes.
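The two ideas in the abstract, injecting label noise and splitting a multi-class problem into pairwise binary subproblems, can be illustrated with a minimal sketch. This is not the authors' experimental code: the function names (`inject_class_noise`, `ovo_subproblems`, `ovo_vote`), the uniform relabeling scheme, and the majority-vote aggregation are all assumptions chosen for illustration, since the abstract does not specify the noise schemes or the aggregation rule used.

```python
import random
from collections import Counter
from itertools import combinations


def inject_class_noise(labels, noise_level, classes, seed=0):
    """Relabel a fraction `noise_level` of the examples with a different class.

    Assumption: a simple uniform class-noise scheme; the paper's exact
    schemes are not given in the abstract.
    """
    rng = random.Random(seed)
    noisy = list(labels)
    n_noisy = int(round(noise_level * len(labels)))
    # Pick distinct positions and replace each label with another class.
    for i in rng.sample(range(len(labels)), n_noisy):
        noisy[i] = rng.choice([c for c in classes if c != labels[i]])
    return noisy


def ovo_subproblems(X, y):
    """Build one binary dataset per unordered pair of classes (One-vs-One)."""
    classes = sorted(set(y))
    subproblems = {}
    for a, b in combinations(classes, 2):
        # Keep only the examples that belong to one of the two classes.
        subproblems[(a, b)] = [(x, label) for x, label in zip(X, y) if label in (a, b)]
    return subproblems


def ovo_vote(pairwise_predictions):
    """Aggregate the pairwise classifiers' outputs by majority vote.

    Assumption: plain voting; other aggregation rules exist.
    """
    return Counter(pairwise_predictions.values()).most_common(1)[0][0]
```

With k classes, `ovo_subproblems` yields k(k-1)/2 binary datasets, and a mislabeled example only appears in the subproblems that involve its (noisy) label, which is one way to picture how the decomposition spreads the effect of noise.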

Keywords: Noisy data, Class noise, Attribute noise, One-vs-One, Decomposition strategies, Ensembles, Classification

Paper URL: https://doi.org/10.1007/s10115-012-0570-1