Statistical comparisons of active learning strategies over multiple datasets

Abstract

Active learning has become an important area of research owing to the increasing number of real-world problems in which a huge amount of unlabelled data is available. Active learning strategies are commonly compared by visually inspecting their learning curves. However, when several strategies are assessed on multiple datasets, visual comparison of learning curves may not be the best way to conclude whether one strategy is significantly better than another. In this paper, two comparison approaches based on non-parametric statistical tests are proposed to statistically compare active learning strategies over multiple datasets. The application of the two approaches is illustrated through a thorough experimental study, which demonstrates the usefulness of the proposal for analysing active learning performance.
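To make the idea of a non-parametric comparison over multiple datasets concrete, the following is a minimal sketch of the Friedman rank test, a commonly used test for this setting. It is not taken from the paper, and the abstract does not say which tests the two proposed approaches use. The score table is hypothetical: it assumes each learning curve has already been reduced to a single number per strategy and dataset (for example, the area under the curve), and the names `rank_row` and `friedman_statistic` are illustrative only. No tie correction is applied to the statistic.

```python
from typing import List, Tuple


def rank_row(scores: List[float]) -> List[float]:
    """Rank the strategies on one dataset: rank 1 is the highest score,
    and tied scores share the average of the ranks they span."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with the one at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def friedman_statistic(table: List[List[float]]) -> Tuple[List[float], float]:
    """Return (average rank per strategy, Friedman chi-square statistic).

    table[d][s] is the score of strategy s on dataset d.
    """
    n = len(table)      # number of datasets
    k = len(table[0])   # number of strategies
    rank_rows = [rank_row(row) for row in table]
    avg_ranks = [sum(r[s] for r in rank_rows) / n for s in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in avg_ranks) - k * (k + 1) ** 2 / 4
    )
    return avg_ranks, chi2


# Hypothetical scores: 4 datasets (rows) x 3 strategies (columns).
scores = [
    [0.80, 0.75, 0.70],
    [0.82, 0.78, 0.74],
    [0.66, 0.70, 0.60],
    [0.91, 0.88, 0.85],
]
avg_ranks, chi2 = friedman_statistic(scores)
print(avg_ranks, chi2)
```

Under the null hypothesis that all strategies perform equally, the statistic is approximately chi-square distributed with k - 1 degrees of freedom. The approximation is only reasonable for a larger number of datasets than this toy table uses, and a significant result would normally be followed by a post-hoc test to identify which strategies differ.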

Keywords: Active learning, Statistical comparison, Non-parametric statistical tests

Review history: Received 21 September 2017, Revised 26 January 2018, Accepted 29 January 2018, Available online 31 January 2018, Version of Record 20 February 2018.

Official paper URL: https://doi.org/10.1016/j.knosys.2018.01.033