GBK-means clustering algorithm: An improvement to the K-means algorithm based on the bargaining game

作者:

Highlights:

摘要

Due to its simplicity, versatility and the diversity of applications to which it can be applied, k-means is one of the well-known algorithms for clustering data. The foundation of this algorithm is based on the distance measure. However, the traditional k-means has some weaknesses that appear in some data sets related to real applications, the most important of which is to consider only the distance criterion for clustering. Various studies have been conducted to address each of these weaknesses to achieve a balance between quality and efficiency. In this paper, a novel k-means variant of the original algorithm is proposed. This approach leverages the power of bargaining game modelling in the k-means algorithm for clustering data. In this novel setting, cluster centres compete with each other to attract the largest number of similar objectives or entities to their cluster. Thus, the centres keep changing their positions so that they have smaller distances with the maximum possible data than other cluster centres. We name this new algorithm the game-based k-means (GBK-means) algorithm. To show the superiority and efficiency of GBK-means over conventional clustering algorithms, namely, k-means and fuzzy k-means, we use the following syntactic and real-world data sets: (1) a series of two-dimensional syntactic data sets; and (2) ten benchmark data sets that are widely used in different clustering studies. The evaluation criteria show GBK-means is able to cluster data more accurately than classical algorithms based on eight evaluation metrics, namely F-measure, the Dunn index (DI), the rand index (RI), the Jaccard index (JI), normalized mutual information (NMI), normalized variation of information (NVI), the measure of concordance and error rate (ER).

论文关键词:K-means algorithm,Bargaining game,Cluster centre competition,Clustering improvement,Maximum data coverage

论文评审过程:Received 12 April 2020, Revised 25 October 2020, Accepted 9 December 2020, Available online 4 January 2021, Version of Record 4 January 2021.

论文官网地址:https://doi.org/10.1016/j.knosys.2020.106672