A new weighting k-means type clustering framework with an l2-norm regularization

Authors:

Highlights:

Abstract

The k-means algorithm has proven to be an effective technique for clustering large-scale data sets. However, traditional k-means type clustering algorithms cannot effectively distinguish the discriminative capabilities of features during the clustering process. In this paper, we present a new k-means type clustering framework that extends W-k-means with an l2-norm regularization on the feature weights. Based on this framework, we propose the l2-Wkmeans algorithm, which uses conventional means as centroids for clustering numerical data sets, and the l2-NOF and l2-NDM algorithms, which use two different smooth mode representatives for clustering categorical data sets. First, a new objective function is developed for the clustering framework. Then, the corresponding updating rules for the centroids, the membership matrix, and the feature weights are derived theoretically for the new algorithms. We conduct extensive experiments to evaluate the performance of the proposed algorithms on numerical and categorical data sets. The experimental studies demonstrate that the proposed algorithms deliver consistently promising results in comparison with other approaches, such as basic k-means, W-k-means, MKM_NOF, and MKM_NDM, with respect to four metrics: Accuracy, RandIndex, Fscore, and Normalized Mutual Information (NMI).
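The abstract does not state the objective function, so the following is only a minimal sketch of the general idea for numerical data, not the paper's l2-Wkmeans. It assumes the objective is the weighted within-cluster squared distance plus `(lam / 2) * ||w||^2`, with the weights constrained to be non-negative and to sum to 1. Under that assumption the weight update reduces to a Euclidean projection onto the probability simplex. The names `project_to_simplex`, `l2_weighted_kmeans`, `lam`, and `init` are hypothetical.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # sort in descending order
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, v.size + 1)
    rho = idx[u - css / idx > 0][-1]          # last index that stays positive
    theta = css[rho - 1] / rho                # shift applied to every entry
    return np.maximum(v - theta, 0.0)


def l2_weighted_kmeans(X, k, lam=1.0, n_iter=50, init=None, seed=0):
    """Feature-weighted k-means with an l2 penalty on the weights.

    ASSUMED objective (not taken from the paper):
        sum_i sum_j w_j * (x_ij - c_{label(i), j})^2 + (lam / 2) * ||w||^2
        subject to w_j >= 0 and sum_j w_j = 1.
    """
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    rng = np.random.default_rng(seed)
    if init is None:
        centroids = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centroids = np.asarray(init, dtype=float).copy()
    weights = np.full(m, 1.0 / m)             # start with equal weights

    for _ in range(n_iter):
        # Membership update: nearest centroid under the weighted distance.
        sq = (X[:, None, :] - centroids[None, :, :]) ** 2   # shape (n, k, m)
        labels = np.argmin(sq @ weights, axis=1)

        # Centroid update: per-cluster mean (empty clusters keep their centroid).
        for l in range(k):
            members = X[labels == l]
            if len(members) > 0:
                centroids[l] = members.mean(axis=0)

        # Weight update: minimizing sum_j w_j * D_j + (lam / 2) * ||w||^2 over
        # the simplex is the projection of -D / lam onto the simplex.
        D = ((X - centroids[labels]) ** 2).sum(axis=0)
        weights = project_to_simplex(-D / lam)

    return labels, centroids, weights
```

In this sketch a larger `lam` pushes the weights toward uniform, while a smaller `lam` concentrates weight on the features with the lowest within-cluster dispersion. Like plain k-means, the result depends on the initial centroids.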

Keywords: Clustering, k-means algorithm, Feature weighting, l2-norm regularization

Review history: Received 22 September 2017, Revised 17 March 2018, Accepted 21 March 2018, Available online 22 March 2018, Version of Record 11 May 2018.

Paper URL: https://doi.org/10.1016/j.knosys.2018.03.028