Incremental semi-supervised learning on streaming data

Authors:

Highlights:

Abstract

In streaming data classification, most existing methods assume that all arriving, evolving data are completely labeled. In some applications, however, only a small number of labeled examples are available for training. Incremental semi-supervised learning algorithms have been proposed that regularize neural networks by incorporating various kinds of side information, such as pairwise constraints or user-provided labels. However, they are hard to put into practice, especially in non-stationary environments, because of concerns about their effectiveness and parameter sensitivity. In this paper, we propose a novel incremental semi-supervised learning framework for streaming data. Each layer of the model comprises a generative network, a discriminant structure, and a bridge between them. The generative network uses dynamic feature learning based on autoencoders, which has demonstrated its potential for learning latent feature representations, to learn generative features from streaming data. The discriminant structure regularizes the network construction by building pairwise similarity and dissimilarity constraints, and it also facilitates the parameter learning of the generative network. The network and the structure are integrated into a joint learning framework and bridged by enforcing the correlation of their parameters, which balances the flexible incorporation of supervision information against numerical tractability in non-stationary environments and also explores the intrinsic data structure. Moreover, we design an efficient algorithm to solve the proposed optimization problem and also give an ensemble method. In particular, performance is significantly boosted when multiple layers of the model are stacked. Finally, to validate the effectiveness of the proposed method, extensive experiments are conducted on synthetic and real-life datasets. The experimental results demonstrate that the proposed algorithms outperform some state-of-the-art approaches.
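
The abstract does not give the objective function, so the following is only an illustrative sketch of how the three parts of one layer (generative network, discriminant structure, bridge) might combine into a single loss evaluated on one chunk of the stream. Everything in it is an assumption made for illustration: the tied-weight tanh autoencoder, the contrastive form of the pairwise term, the squared-difference bridge penalty standing in for "enforcing the correlation of their parameters", and the hypothetical names `joint_objective`, `alpha`, `beta`, and `margin`. It is not the paper's formulation.

```python
import math

def sq_dist(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def encode(W, x):
    # Assumed encoder: h = tanh(W x).
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]

def decode(W, h):
    # Assumed tied-weight decoder: x_hat = W^T h.
    return [sum(W[k][j] * h[k] for k in range(len(W))) for j in range(len(W[0]))]

def project(V, x):
    # Assumed linear projection used by the discriminant structure: z = V x.
    return [sum(v * xi for v, xi in zip(row, x)) for row in V]

def joint_objective(W, V, batch, must_link, cannot_link,
                    margin=1.0, alpha=1.0, beta=0.1):
    """Hypothetical joint loss for one chunk of streaming data.

    W: autoencoder weights (generative part); V: projection weights
    (discriminant part). must_link / cannot_link are lists of index pairs
    into `batch` encoding similarity / dissimilarity constraints.
    """
    # Generative term: mean squared reconstruction error over the chunk.
    recon = sum(sq_dist(x, decode(W, encode(W, x))) for x in batch) / len(batch)

    # Discriminant term: pull similar pairs together, push dissimilar
    # pairs at least `margin` apart (contrastive-style hinge).
    z = [project(V, x) for x in batch]
    pair = sum(sq_dist(z[i], z[j]) for i, j in must_link)
    pair += sum(max(0.0, margin - math.sqrt(sq_dist(z[i], z[j]))) ** 2
                for i, j in cannot_link)
    pair /= max(1, len(must_link) + len(cannot_link))

    # Bridge term (assumption): penalize disagreement between W and V.
    bridge = sum(sq_dist(rw, rv) for rw, rv in zip(W, V))

    return recon + alpha * pair + beta * bridge
```

In a streaming setting, a loss of this kind would be re-evaluated and minimized on each newly arrived chunk, with the few available labels converted into `must_link` and `cannot_link` pairs; how the paper actually performs the update and the ensemble step is not stated in the abstract.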

Keywords: Semi-supervised learning, Dynamic feature learning, Streaming data, Classification

Review history: Received 19 February 2018, Revised 27 September 2018, Accepted 10 November 2018, Available online 16 November 2018, Version of Record 4 December 2018.

Official paper URL: https://doi.org/10.1016/j.patcog.2018.11.006