Semi-supervised clustering with discriminative random fields

Authors:

Highlights:

Abstract

Semi-supervised clustering exploits a small amount of supervised information to improve the accuracy of data clustering. This paper proposes a framework for semi-supervised clustering that integrates seamlessly with a traditional clustering algorithm and is particularly useful in applications where a specific traditional clustering algorithm is designated for use. In the proposed framework, discriminative random fields (DRFs) are employed to model the consistency between the result of a traditional clustering algorithm and the supervised information, under the assumption of semi-supervised learning. The semi-supervised clustering problem is thus formulated as finding the label configuration with the maximum a posteriori (MAP) probability under the DRF. A procedure based on the iterated conditional modes algorithm and a metric-learning algorithm is developed to find a suboptimal MAP solution of the DRF. The proposed approach has been tested on various data sets. Experimental results show that the approach can improve clustering accuracy, which supports its feasibility.
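To make the iterated conditional modes (ICM) step more concrete, the following is a minimal sketch, not the paper's actual formulation. It assumes a simplified energy with a per-point cost for each cluster (for example, distance to a centroid from the traditional clustering) and a Potts-style pairwise term on supervised pairs. The function name icm_refine, the unary and weights arrays, and the sign convention (positive weight acts like a must-link, negative like a cannot-link) are all illustrative assumptions. The DRF potentials and the metric-learning step described in the abstract are not reproduced here.

```python
import numpy as np


def icm_refine(unary, edges, weights, init_labels, max_iters=20):
    """Greedy ICM on a simplified pairwise energy (illustrative, not the paper's DRF).

    unary[i, k] : assumed cost of giving point i the label k
    edges       : list of (i, j) index pairs carrying supervised information
    weights     : per-edge cost paid when the two labels differ
                  (positive ~ must-link, negative ~ cannot-link)
    init_labels : labels from a traditional clustering algorithm
    """
    unary = np.asarray(unary, dtype=float)
    labels = np.array(init_labels, dtype=int)
    n, k = unary.shape

    # Build an adjacency list so each point can see its constrained partners.
    neighbors = [[] for _ in range(n)]
    for (i, j), w in zip(edges, weights):
        neighbors[i].append((j, w))
        neighbors[j].append((i, w))

    for _ in range(max_iters):
        changed = False
        for i in range(n):
            local = unary[i].copy()
            for j, w in neighbors[i]:
                # Potts term: pay w for every label that differs from neighbor j's.
                penalty = np.full(k, float(w))
                penalty[labels[j]] = 0.0
                local += penalty
            best = int(np.argmin(local))
            # Only move on a strict improvement, so the total energy never increases.
            if local[best] < local[labels[i]]:
                labels[i] = best
                changed = True
        if not changed:
            break  # local optimum reached (a suboptimal MAP estimate in general)
    return labels


# Toy usage: point 1 weakly prefers cluster 0 but is must-linked to point 2.
unary = [[0.0, 2.0], [0.4, 0.6], [3.0, 0.0]]
refined = icm_refine(unary, edges=[(1, 2)], weights=[1.0], init_labels=[0, 0, 1])
print(refined.tolist())  # [0, 1, 1]
```

Because each update only accepts a strictly lower local energy, the loop terminates at a local optimum, which is consistent with the abstract's description of the result as a suboptimal MAP solution.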

Keywords: Semi-supervised clustering, Discriminative random fields

Review history: Received 29 June 2011, Revised 22 March 2012, Accepted 30 May 2012, Available online 12 June 2012.

Official paper URL: https://doi.org/10.1016/j.patcog.2012.05.021