Threshold optimisation for multi-label classifiers

Authors:

Highlights:

Abstract

Many multi-label classifiers provide a real-valued score for each class. A well-known design approach is to tune the corresponding decision thresholds by optimising the performance measure of interest. We address two open issues related to the optimisation of the widely used F measure and precision–recall (P–R) curve, with respect to the class-related decision thresholds, on a given data set. (i) We derive properties of the micro-averaged F, which allow its global maximum to be found by an optimisation strategy with a low computational cost. So far, only a suboptimal threshold selection rule and a greedy algorithm with no optimality guarantee were known. (ii) We rigorously define the macro- and micro-P–R curves, analyse a previously suggested strategy for computing them, based on maximising F, and develop two possible implementations, which can also be exploited for optimising related performance measures. We evaluate our algorithms on five data sets related to three different application domains.
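
To make the setting concrete, the following is a minimal sketch of per-class threshold tuning against the micro-averaged F measure (with equal weight on precision and recall). It is not the optimisation strategy proposed in the paper: it is a simple coordinate-wise search, comparable in spirit to the greedy approach the abstract mentions, and it carries no optimality guarantee. The function names `micro_f1` and `greedy_thresholds`, the candidate-threshold choice, and the `max_sweeps` parameter are assumptions made for illustration.

```python
def micro_f1(scores, labels, thresholds):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1.

    scores[i][k] is the real-valued score of sample i for class k,
    labels[i][k] is 1 if sample i truly has label k, else 0,
    thresholds[k] is the decision threshold of class k (predict if score >= t).
    """
    tp = fp = fn = 0
    for row_s, row_y in zip(scores, labels):
        for s, y, t in zip(row_s, row_y, thresholds):
            predicted = s >= t
            if predicted and y:
                tp += 1
            elif predicted and not y:
                fp += 1
            elif y:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def greedy_thresholds(scores, labels, max_sweeps=10):
    """Coordinate-wise search over per-class thresholds (illustrative only).

    Each class threshold is tuned in turn while the others are held fixed;
    sweeps repeat until no single-class change improves micro-F1.
    """
    n_classes = len(scores[0])
    # Candidates per class: every observed score, plus one value above the
    # maximum so that "predict nothing for this class" is also reachable.
    candidates = []
    for k in range(n_classes):
        vals = sorted({row[k] for row in scores})
        candidates.append(vals + [vals[-1] + 1.0])

    # Start from the lowest candidate, i.e. every label is predicted.
    thresholds = [c[0] for c in candidates]
    best = micro_f1(scores, labels, thresholds)

    for _ in range(max_sweeps):
        improved = False
        for k in range(n_classes):
            for t in candidates[k]:
                trial = thresholds[:k] + [t] + thresholds[k + 1:]
                f = micro_f1(scores, labels, trial)
                if f > best:
                    best, thresholds, improved = f, trial, True
        if not improved:
            break
    return thresholds, best


if __name__ == "__main__":
    # Hypothetical toy data: 4 samples, 2 classes.
    toy_scores = [[0.9, 0.2], [0.8, 0.7], [0.3, 0.6], [0.1, 0.1]]
    toy_labels = [[1, 0], [1, 1], [0, 1], [0, 0]]
    print(greedy_thresholds(toy_scores, toy_labels))
```

Because the micro-averaged F pools true positives, false positives and false negatives across classes, changing one class threshold alters the value of every other class's best threshold. This coupling is why a per-class search such as the one sketched here can stall at a local maximum.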

Keywords: Multi-label classification, S-Cut thresholding, F measure, Precision–recall curve

Review history: Received 9 May 2012, Revised 11 October 2012, Accepted 9 January 2013, Available online 17 January 2013.

Paper URL: https://doi.org/10.1016/j.patcog.2013.01.012