One-pass AUC optimization

Authors:

Abstract

AUC is an important performance measure that has been used in diverse tasks, such as class-imbalanced learning, cost-sensitive learning, and learning to rank. In this work, we focus on one-pass AUC optimization, which requires going through the training data only once, without storing the entire training dataset. Conventional online learning algorithms cannot be applied directly to one-pass AUC optimization because AUC is measured by a sum of losses defined over pairs of instances from different classes. We develop a regression-based algorithm that needs to maintain only the first- and second-order statistics of the training data in memory, resulting in a storage requirement independent of the number of training instances. To efficiently handle high-dimensional data, we develop two deterministic algorithms that approximate the covariance matrices. We verify, both theoretically and empirically, the effectiveness of the proposed algorithms.
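
For intuition, here is a minimal, hedged sketch of the core idea: with a pairwise squared loss (1 − w⊤(x⁺ − x⁻))², the expected gradient at a newly arriving instance depends on the other class only through its mean and second moment, so a single pass with memory independent of the number of instances suffices. The class name, method names, and hyper-parameters (`eta`, `lam`) below are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

class OnePassAUCSketch:
    """Illustrative one-pass AUC learner with a pairwise squared loss.

    Keeps only per-class running means and second moments, so memory is
    O(d^2) regardless of how many training instances stream by.  This is
    a simplified sketch of the regression-based idea in the abstract.
    """

    def __init__(self, dim, eta=0.01, lam=1e-3):
        self.w = np.zeros(dim)
        self.eta, self.lam = eta, lam
        self.n = {+1: 0, -1: 0}                                     # class counts
        self.mean = {+1: np.zeros(dim), -1: np.zeros(dim)}          # first-order statistics
        self.m2 = {+1: np.zeros((dim, dim)),
                   -1: np.zeros((dim, dim))}                        # second moments E[x x^T]

    def partial_fit(self, x, y):
        """Consume one instance x with label y in {+1, -1}."""
        # Update the running statistics of the current class.
        self.n[y] += 1
        self.mean[y] += (x - self.mean[y]) / self.n[y]
        self.m2[y] += (np.outer(x, x) - self.m2[y]) / self.n[y]

        # Expected gradient of the squared pairwise loss against the
        # opposite class, using only its stored mean and second moment.
        opp = -y
        if self.n[opp] == 0:
            return
        c = self.mean[opp]
        cov = self.m2[opp] - np.outer(c, c)      # covariance of the opposite class
        diff = y * (x - c)                       # oriented positive-minus-negative mean difference
        grad = self.lam * self.w - diff + (np.outer(diff, diff) + cov) @ self.w
        self.w -= self.eta * grad

    def score(self, X):
        return X @ self.w
```

The two d × d second-moment matrices are the only storage that grows with the input dimension, which is exactly the term the paper's covariance-approximation algorithms target for high-dimensional data.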

Keywords: AUC, ROC curve, Online learning, Large-scale learning, Least square loss, Random projection

Article history: Received 29 June 2014, Revised 26 February 2016, Accepted 12 March 2016, Available online 17 March 2016, Version of Record 25 March 2016.

Official paper URL: https://doi.org/10.1016/j.artint.2016.03.003