A hybrid generative/discriminative method for semi-supervised classification

Authors:

Abstract

Training methods for machine learning are often characterized as either generative or discriminative. We present a new co-training style algorithm that uses a generative classifier (Naive Bayes) and a discriminative classifier (Support Vector Machine) as base classifiers, in order to take advantage of both approaches. We also introduce a pair of weight parameters that balance the influence of labeled and pseudo-labeled data, and define a hybrid objective function to tune their values during co-training. The final prediction is a combination of the base classifiers' outputs, with the combination weights set on a pseudo-validation set. In addition, we present a strategy for selecting pseudo-labeled data that addresses the class imbalance problem. Experimental results on six datasets show that our method performs much better in practice, especially when the amount of labeled data is small.
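The abstract does not give the selection rule or the combination formula, so the following is only a minimal sketch of two of the ideas it mentions, under stated assumptions: pseudo-labeled examples are picked per class by confidence so that no class dominates, and the two base classifiers' probabilities are mixed with a single weight chosen by accuracy on a held-out (pseudo-)validation set. The function names `select_balanced`, `combine`, and `pick_alpha`, the fixed per-class quota, and the grid of candidate weights are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def select_balanced(pseudo, per_class):
    """Pick the `per_class` most confident pseudo-labeled items for each class.

    `pseudo` is a list of (item_id, predicted_label, confidence) tuples.
    Assumption: an equal quota per class stands in for the paper's
    class-imbalance strategy, which the abstract does not specify.
    """
    by_label = defaultdict(list)
    for item_id, label, conf in pseudo:
        by_label[label].append((conf, item_id))
    chosen = {}
    for label, scored in by_label.items():
        # Highest confidence first; the sort is stable, so ties keep input order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        chosen[label] = [item_id for _, item_id in scored[:per_class]]
    return chosen


def combine(p_gen, p_disc, alpha):
    """Convex combination of two positive-class probabilities.

    Assumption: a single weight `alpha` in [0, 1] mixes the generative
    and discriminative outputs.
    """
    return alpha * p_gen + (1.0 - alpha) * p_disc


def pick_alpha(p_gen, p_disc, labels, grid=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Return the first weight in `grid` with the best validation accuracy."""
    def accuracy(alpha):
        hits = 0
        for g, d, y in zip(p_gen, p_disc, labels):
            pred = 1 if combine(g, d, alpha) >= 0.5 else 0
            hits += int(pred == y)
        return hits / len(labels)

    return max(grid, key=accuracy)


if __name__ == "__main__":
    pseudo = [("a", 1, 0.9), ("b", 1, 0.8), ("c", 1, 0.7),
              ("d", 0, 0.6), ("e", 0, 0.95)]
    print(select_balanced(pseudo, per_class=2))
    print(pick_alpha([0.9, 0.2, 0.6], [0.4, 0.7, 0.3], [1, 0, 1]))
```

In a fuller implementation the probabilities would come from a trained Naive Bayes model and a trained SVM, and the selected items would be added to the training set on each co-training round; those steps are left out here because the abstract does not describe them in enough detail to reproduce.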

Keywords: Co-training, Hybrid generative/discriminative methods, Naive Bayes, Support vector machine, Classification, Class imbalance

Review history: Received 31 October 2011, Revised 3 May 2012, Accepted 25 July 2012, Available online 6 August 2012.

Official paper URL: https://doi.org/10.1016/j.knosys.2012.07.020