An effective framework based on local cores for self-labeled semi-supervised classification

Authors:

Highlights:

Abstract

Semi-supervised self-labeled methods use unlabeled data to improve classifiers that would otherwise be trained on labeled data alone. However, using unlabeled data can also degrade prediction accuracy. One cause is that too few labeled data are available to train the initial classifier. Existing remedies for this shortage still have technical defects: for example, they handle non-spherical data poorly and fail to augment the initial labeled set effectively when labeled data are extremely scarce. In this paper, we propose an effective semi-supervised self-labeled framework based on local cores that addresses the shortage of initial labeled data and overcomes these defects. The framework rests on two ideas: (a) the inadequate initial labeled set is augmented with predicted local cores, whose labels are predicted by active labeling or co-labeling; (b) any semi-supervised self-labeled method can then be used to train a given classifier on the augmented labeled data and the updated unlabeled data. Because local cores roughly reveal the data distribution, the framework works on both spherical and non-spherical data sets. Local cores also allow the framework to augment the initial labeled data effectively even when they are extremely scarce. Experiments show that the proposed framework is compatible with the tested self-labeled methods and helps them train a k-nearest-neighbor or support vector machine classifier when initial labeled data are insufficient.
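The two-step idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm. The abstract does not define how local cores are computed, so `find_core_points` uses a simple stand-in (points whose k-nearest-neighbor density is at least as high as that of all their neighbors) instead of the natural-neighbor construction named in the keywords. The `oracle` callback is a hypothetical placeholder for active labeling, co-labeling is not shown, and plain 1-NN self-training stands in for "any self-labeled method". All function names and parameters are assumptions made for illustration.

```python
import numpy as np

def find_core_points(X, k=5):
    """Stand-in for local cores (assumption, not the paper's definition):
    a point is a core if its k-NN density is >= that of all its k neighbors."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)                 # exclude self from neighbors
    nn = np.argsort(D, axis=1)[:, :k]           # indices of k nearest neighbors
    knn_dist = np.take_along_axis(D, nn, axis=1)
    density = 1.0 / (knn_dist.mean(axis=1) + 1e-12)
    is_core = density >= density[nn].max(axis=1)
    return np.flatnonzero(is_core)

def predict_1nn(X_train, y_train, X):
    """Label each row of X with the label of its nearest training point."""
    d = np.linalg.norm(X[:, None, :] - X_train[None, :, :], axis=2)
    return y_train[d.argmin(axis=1)]

def self_train_1nn(X_lab, y_lab, X_unl, batch=5):
    """Plain self-training: repeatedly move the unlabeled points closest
    to the labeled set into it, using their 1-NN predicted labels."""
    while len(X_unl) > 0:
        d = np.linalg.norm(X_unl[:, None, :] - X_lab[None, :, :], axis=2)
        nearest = d.argmin(axis=1)
        pick = np.argsort(d.min(axis=1))[:batch]    # most confident first
        new_labels = y_lab[nearest[pick]]
        X_lab = np.vstack([X_lab, X_unl[pick]])
        y_lab = np.concatenate([y_lab, new_labels])
        X_unl = np.delete(X_unl, pick, axis=0)
    return X_lab, y_lab

def local_core_framework(X_lab, y_lab, X_unl, oracle, k=5):
    """(a) add labeled core points to the labeled set, (b) run self-training.
    `oracle(idx)` is a hypothetical callback returning labels for rows
    X_unl[idx]; it plays the role of active labeling."""
    n_lab = len(X_lab)
    cores = find_core_points(np.vstack([X_lab, X_unl]), k)
    core_unl = cores[cores >= n_lab] - n_lab        # cores that are unlabeled
    core_labels = np.asarray(oracle(core_unl), dtype=y_lab.dtype)
    X_aug = np.vstack([X_lab, X_unl[core_unl]])     # improved labeled data
    y_aug = np.concatenate([y_lab, core_labels])
    X_rest = np.delete(X_unl, core_unl, axis=0)     # updated unlabeled data
    return self_train_1nn(X_aug, y_aug, X_rest)
```

In this sketch the core points are found on the labeled and unlabeled data together, so they are spread over dense regions of the whole data set. That is the property the abstract relies on when it says local cores roughly reveal the data distribution. The final labeled set returned by `local_core_framework` can be passed to `predict_1nn` or to any other classifier.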

Keywords: Semi-supervised learning (SSL), Semi-supervised classification (SSC), Self-labeled, Local cores, Natural neighbors

Article history: Received 10 November 2019, Revised 20 March 2020, Accepted 20 March 2020, Available online 31 March 2020, Version of Record 24 April 2020.

Official URL: https://doi.org/10.1016/j.knosys.2020.105804