Semi-supervised fuzzy clustering with metric learning and entropy regularization

Authors:

Highlights:

Abstract

Existing semi-supervised fuzzy c-means (FCM) methods suffer from two issues: (1) the Euclidean distance tends to work poorly when the features of an instance have unequal variances and are correlated with one another, and (2) it is generally difficult to choose an appropriate value for the parameter m in their objective function. To address these problems, we develop a novel semi-supervised metric-based fuzzy clustering algorithm, called SMUC, by introducing metric learning and entropy regularization simultaneously into the conventional fuzzy clustering algorithm. More specifically, SMUC learns a Mahalanobis distance metric from side information given by the user to replace the Euclidean distance in FCM-based methods. It thus has the same flavor as typical supervised metric learning algorithms, which make the distance between instances within a cluster smaller than the distance between instances belonging to different clusters. Moreover, SMUC introduces maximum entropy as a regularization term in its objective function, so that its resulting formulas have a clear physical meaning compared with other semi-supervised FCM methods. In addition, this maximum-entropy regularizer naturally avoids the need to choose the parameter m. Experiments on real-world data sets show the feasibility and effectiveness of the proposed method, with encouraging results.
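
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the SMUC algorithm itself: the abstract does not give SMUC's update formulas, and the sketch leaves out the side information (prior membership degrees, pairwise constraints) and the learning of the metric. It only assumes a fixed positive semi-definite matrix `A` for the Mahalanobis distance and the generic maximum-entropy membership update, in which memberships minimize sum(U * D) + gamma * sum(U * log U) subject to each instance's memberships summing to one. The names `mahalanobis_sq`, `entropy_memberships`, `A`, and `gamma` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def mahalanobis_sq(X, V, A):
    """Squared distances D[i, k] = (x_k - v_i)^T A (x_k - v_i).

    X: (n, d) instances, V: (c, d) cluster centers,
    A: (d, d) positive semi-definite metric matrix (assumed given here).
    """
    diff = X[None, :, :] - V[:, None, :]               # shape (c, n, d)
    return np.einsum("cnd,de,cne->cn", diff, A, diff)  # shape (c, n)

def entropy_memberships(D, gamma):
    """Generic maximum-entropy memberships (not SMUC's exact formula).

    U[i, k] is proportional to exp(-D[i, k] / gamma), normalized so that
    every column (one instance) sums to 1. No exponent m appears.
    """
    logits = -D / gamma
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    U = np.exp(logits)
    return U / U.sum(axis=0, keepdims=True)

# Toy data: the second feature has a much larger scale than the first.
X = np.array([[0.0, 0.0], [0.0, 10.0], [1.0, 0.0]])
V = np.array([[0.0, 0.0], [1.0, 0.0]])   # two hypothetical cluster centers
A = np.diag([1.0, 0.01])                 # down-weights the high-variance feature

D = mahalanobis_sq(X, V, A)
U = entropy_memberships(D, gamma=0.5)
```

In this sketch `gamma` controls how soft the memberships are, which is the role that m plays in conventional FCM; a smaller `gamma` pushes each column of `U` toward a hard assignment.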

Keywords: Metric learning, Maximum entropy, Prior membership degree, Pairwise constraint, Semi-supervised clustering

Review history: Received 2 September 2011, Revised 15 May 2012, Accepted 25 May 2012, Available online 1 June 2012.

Paper URL: https://doi.org/10.1016/j.knosys.2012.05.016