ConfusionVis: Comparative evaluation and selection of multi-class classifiers based on confusion matrices

Authors:

Highlights:

Abstract

In machine learning, the presumably best model is selected from a set of candidates generated by testing different model types, hyperparameters, or feature subsets. The advent of deep learning has made model selection even more challenging because of the huge parameter search space. Relying on a single metric to select the best model ignores class imbalances and the differing costs of misclassifications. We argue that incorporating human knowledge to interactively analyse the per-class errors and class confusions across all model candidates enables a more efficient training process and yields better models for a given application. This paper proposes ConfusionVis, a model-agnostic approach that allows users to comparatively evaluate and select multi-class classifiers based on their confusion matrices. This helps make the models' results understandable while treating the models as black boxes. To this end, we propose a novel method to measure and visualise distances between confusion matrices, together with an interactive query interface that incorporates all composition levels of class errors. The approach is evaluated in a user study, and its applicability is shown in a case study in which marine biologists investigate the conservation efforts of baleen whales by classifying whale species in acoustic recordings. ConfusionVis is available online: https://www.ml-and-vis.org/confusionvis.
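
To make the idea of comparing classifiers through their confusion matrices concrete, the following is a minimal sketch and not the method used in ConfusionVis. The abstract does not specify the distance measure, so the sketch assumes row-normalised matrices compared with the Frobenius norm. The function names (`row_normalise`, `pairwise_distances`, `per_class_recall`) and the three example matrices are hypothetical and serve only as illustration.

```python
import numpy as np


def row_normalise(cm):
    """Turn raw counts into per-class rates (each row sums to 1).

    Normalising by the true-class totals keeps large classes from
    dominating the comparison. This is an assumption of the sketch.
    """
    cm = np.asarray(cm, dtype=float)
    row_sums = cm.sum(axis=1, keepdims=True)
    # Rows without any samples stay all-zero instead of dividing by zero.
    return np.divide(cm, row_sums, out=np.zeros_like(cm), where=row_sums > 0)


def pairwise_distances(cms):
    """Frobenius distance between every pair of normalised matrices."""
    normed = np.stack([row_normalise(cm) for cm in cms])
    diff = normed[:, None, :, :] - normed[None, :, :, :]
    return np.sqrt((diff ** 2).sum(axis=(2, 3)))


def per_class_recall(cm):
    """Diagonal of the row-normalised matrix, i.e. recall per class."""
    return np.diag(row_normalise(cm))


# Hypothetical confusion matrices (rows: true class, columns: predicted).
cms = [
    [[50, 0, 0], [0, 45, 5], [0, 10, 40]],    # model A
    [[48, 2, 0], [1, 44, 5], [0, 9, 41]],     # model B
    [[30, 20, 0], [15, 30, 5], [0, 5, 45]],   # model C
]

D = pairwise_distances(cms)
recalls = np.array([per_class_recall(cm) for cm in cms])

# A simple per-class query: which models reach at least 0.85 recall on class 2?
good_on_class_2 = np.flatnonzero(recalls[:, 2] >= 0.85)
```

In this toy data, models A and B make similar errors and therefore lie close together in `D`, while model C confuses classes 0 and 1 far more often and lies further away. A distance matrix of this kind could be passed to any standard embedding technique to place the model candidates in a 2D view. The per-class query shows how a requirement on one class can filter candidates that a single aggregate metric would not distinguish.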

Keywords: Machine learning, Interpretable machine learning, Classification, Model selection, Species conservation

Review history: Received 2 February 2021, Revised 16 March 2022, Accepted 23 March 2022, Available online 14 April 2022, Version of Record 4 May 2022.

Paper URL: https://doi.org/10.1016/j.knosys.2022.108651