Data dependency in multiple classifier systems

Authors:

Highlights:

Abstract

This paper investigates the data dependency of aggregation modules in multiple classifier systems. We first propose a new categorization scheme in which combining methods are grouped into data-independent, implicitly data-dependent, and explicitly data-dependent approaches. It is argued that data-dependent approaches present the highest potential for improved performance. We provide a comprehensive investigation of this argument and explore the impact of data dependency on the performance of multiple classifier systems. We evaluate this impact on two criteria: prediction accuracy and stability. We also examine the effect of class imbalance and uneven data distribution on these two criteria. The paper reports the findings of an extensive set of comparative experiments. These findings indicate that data-dependent aggregation methods are generally more stable and less sensitive to class imbalance. In addition, data-dependent methods exhibited superior or identical generalization ability on most of the data sets.
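The abstract names three categories of combining methods but does not define them. The sketch below illustrates one plausible reading, and is an assumption rather than the paper's own formulation: a fixed rule with no trained parameters (data-independent), a combiner whose weights are fitted once and reused for every input (implicitly data-dependent), and a combiner whose weights are computed from the input pattern itself (explicitly data-dependent). All function names, the `gate` function, and the numeric values are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence

Probs = List[float]  # class-probability vector produced by one classifier


def average_rule(outputs: Sequence[Probs]) -> Probs:
    """Data-independent (assumed reading): a fixed rule with no trained parameters."""
    n = len(outputs)
    return [sum(col) / n for col in zip(*outputs)]


def weighted_average(outputs: Sequence[Probs], weights: Sequence[float]) -> Probs:
    """Implicitly data-dependent (assumed reading): the weights would be
    fitted on training data, then applied unchanged to every input."""
    total = sum(weights)
    return [sum(w * p for w, p in zip(weights, col)) / total
            for col in zip(*outputs)]


def input_gated_average(outputs: Sequence[Probs],
                        x: Sequence[float],
                        gate: Callable[[Sequence[float]], Sequence[float]]) -> Probs:
    """Explicitly data-dependent (assumed reading): the weights are a
    function of the input pattern x, so they can differ per sample."""
    return weighted_average(outputs, gate(x))


def argmax(v: Sequence[float]) -> int:
    """Index of the largest entry, used as the predicted class."""
    return max(range(len(v)), key=v.__getitem__)


# Hypothetical outputs of three base classifiers for one two-class sample.
outputs = [[0.6, 0.4], [0.3, 0.7], [0.55, 0.45]]

# Hypothetical static weights, standing in for values learned offline.
static_weights = [0.5, 0.1, 0.4]


def toy_gate(x: Sequence[float]) -> List[float]:
    """Hypothetical gate: trust classifier 1 when x[0] > 0, else classifier 0."""
    return [0.1, 0.8, 0.1] if x[0] > 0 else [0.8, 0.1, 0.1]


pred_fixed = argmax(average_rule(outputs))
pred_static = argmax(weighted_average(outputs, static_weights))
pred_gated_pos = argmax(input_gated_average(outputs, [1.0], toy_gate))
pred_gated_neg = argmax(input_gated_average(outputs, [-1.0], toy_gate))
```

With these toy numbers the three combiners can disagree on the same set of classifier outputs, and the gated combiner changes its decision with the input, which is the distinction the categorization appears to draw.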

Keywords: Multiple classifier systems, Data dependency, Aggregation methods, Stability, Class imbalance

Article history: Received 22 December 2006, Revised 29 September 2008, Accepted 29 November 2008, Available online 10 December 2008.

Official paper URL: https://doi.org/10.1016/j.patcog.2008.11.035