On measuring the performance of binary classifiers

Author: Charles Parker

Abstract

Given two binary classifiers and a set of test data, it should be straightforward to determine which of the two is superior. Recent work, however, has called into question many of the methods heretofore accepted as standard for this task. In this paper, we analyze seven ways of determining whether one classifier is better than another, given the same test data. Five of these are long established, and two are relative newcomers. We review and extend work showing that one of these methods is clearly inappropriate and then conduct an empirical analysis with a large number of datasets to evaluate the real-world implications of our theoretical analysis. Both our empirical and theoretical results point strongly toward one of the newer methods.

Keywords: Performance measures, Binary classification, Supervised learning, Evaluation

Paper URL: https://doi.org/10.1007/s10115-012-0558-x