A systematic analysis of performance measures for classification tasks

Authors:

Abstract

This paper presents a systematic analysis of twenty-four performance measures used across the complete spectrum of Machine Learning classification tasks, i.e., binary, multi-class, multi-labelled, and hierarchical. For each classification task, the study relates a set of changes in a confusion matrix to specific characteristics of the data. The analysis then concentrates on the types of changes to a confusion matrix that do not change a measure and therefore preserve a classifier’s evaluation (measure invariance). The result is a measure invariance taxonomy with respect to all relevant label distribution changes in a classification problem. This formal analysis is supported by examples of applications where the invariance properties of measures lead to a more reliable evaluation of classifiers. Text classification supplements the discussion with several case studies.
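The notion of measure invariance can be illustrated with a small sketch for the binary case. The measures chosen here (accuracy, precision, recall), the particular change to the confusion matrix (adding true negatives), and all names such as `Confusion` and `is_invariant` are assumptions made for illustration; the abstract does not list the measures or transformations the paper actually analyses.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Confusion:
    """Binary confusion matrix counts (hypothetical helper, not from the paper)."""
    tp: int  # true positives
    fn: int  # false negatives
    fp: int  # false positives
    tn: int  # true negatives


def accuracy(c: Confusion) -> float:
    return (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn)


def precision(c: Confusion) -> float:
    return c.tp / (c.tp + c.fp)


def recall(c: Confusion) -> float:
    return c.tp / (c.tp + c.fn)


def is_invariant(measure, before: Confusion, after: Confusion, tol: float = 1e-12) -> bool:
    """A measure is invariant under a change if its value is the same before and after."""
    return abs(measure(before) - measure(after)) <= tol


# Illustrative counts, not data from the paper.
base = Confusion(tp=40, fn=10, fp=20, tn=30)

# Assumed change: 100 additional negative examples, all classified correctly.
more_negatives = replace(base, tn=base.tn + 100)

results = {
    m.__name__: is_invariant(m, base, more_negatives)
    for m in (accuracy, precision, recall)
}
print(results)
```

In this sketch precision and recall do not depend on the true-negative count, so they are unchanged by the added negatives, while accuracy moves from 0.7 to 0.85. A measure that stays fixed under such a change evaluates the classifier the same way regardless of how many easy negatives the data contains.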

Keywords: Performance evaluation, Machine Learning, Text classification

Review history: Received 14 February 2008, Revised 21 November 2008, Accepted 6 March 2009, Available online 8 May 2009.

Official paper URL: https://doi.org/10.1016/j.ipm.2009.03.002