Simultaneous clustering and classification over cluster structure representation

Authors:

Highlights:

Abstract

Clustering and classification are two main tasks in pattern recognition. Because their goals differ, the two tasks are traditionally treated separately. However, when label information is available, such separate treatment cannot fully exploit the information in the data. First, classification does not benefit from the cluster structure of the data. Second, clustering is not guided by the valuable label information. Third, the relationship between clusters and classes is not revealed. In contrast, learning clustering and classification simultaneously lets the two tasks benefit each other and overcomes these problems.

Recently, a simultaneous learning framework, SCC, was proposed. By modeling p(class|cluster), both the classification and clustering mechanisms in SCC depend only on the cluster centroids. However, SCC yields a severely nonlinear objective and therefore has to use a heuristic search method, a modified Particle Swarm Optimization, to search for the optimal solution, which is very slow. Furthermore, modeling p(class|cluster) makes SCC hard to extend to semi-supervised settings.

In this paper, we propose an alternative framework, SC3SR, for simultaneous learning. Besides a classifier derived on the original data, a second classifier is derived on the newly formed cluster structure representation. Through this second classifier, clustering is guided by the labels and classification benefits from the cluster structure of the data. The final objective is continuously differentiable, so principled optimization algorithms with guaranteed convergence exist for it. As a result, our algorithm is much faster than SCC. Further, we generalize the framework to the semi-supervised setting using the idea of manifold regularization and propose the SemiSC3SR algorithm. Our experiments demonstrate the effectiveness of both SC3SR and SemiSC3SR.
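The idea of a classifier built on a cluster structure representation can be illustrated with a small sketch. It is not the SC3SR algorithm: the abstract does not define the representation or the objective, so the sketch assumes the representation is a vector of fuzzy-c-means-style soft memberships to fixed centroids, and it uses a ridge least-squares classifier. The names `soft_memberships`, `fit_ridge_classifier`, `m`, and `lam` are hypothetical, and the centroids are fixed by hand rather than learned jointly with the classifier.

```python
import numpy as np

def soft_memberships(X, centroids, m=2.0):
    """Assumed representation: fuzzy-c-means-style memberships (rows sum to 1)."""
    # Squared Euclidean distances, shape (n_samples, n_clusters).
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)  # guard against division by zero
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fit_ridge_classifier(U, Y, lam=1e-3):
    """Ridge least-squares linear classifier on representation U, one-hot targets Y."""
    k = U.shape[1]
    return np.linalg.solve(U.T @ U + lam * np.eye(k), U.T @ Y)

rng = np.random.default_rng(0)
# Toy data: two well-separated Gaussian blobs, one per class.
X = np.vstack([
    rng.normal(-3.0, 0.5, size=(50, 2)),
    rng.normal(3.0, 0.5, size=(50, 2)),
])
y = np.repeat([0, 1], 50)

# Assumption: centroids are given; a joint method would learn them with the classifier.
centroids = np.array([[-3.0, -3.0], [3.0, 3.0]])

U = soft_memberships(X, centroids)        # cluster structure representation
W = fit_ridge_classifier(U, np.eye(2)[y]) # classifier over that representation
pred = (U @ W).argmax(axis=1)
accuracy = (pred == y).mean()
print(f"training accuracy on toy data: {accuracy:.2f}")
```

In a simultaneous-learning setting the centroids would also be updated from the classification loss, which is where the label information would guide clustering; the sketch keeps them fixed to stay short.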

Keywords: Structure in data, Clustering learning, Classification learning, Simultaneous classification and clustering learning

Review history: Received 24 February 2011, Revised 27 October 2011, Accepted 18 November 2011, Available online 13 December 2011.

Paper URL: https://doi.org/10.1016/j.patcog.2011.11.027