Estimating the Optimal Number of Clusters Via Internal Validity Index

Authors: Shibing Zhou, Fei Liu, Wei Song

Abstract

Estimating the optimal number of clusters (NC) is pivotal in cluster analysis. From the viewpoint of sample geometry, this paper designs a novel internal clustering validity index, termed the between-within cluster (BWC) index, and proposes a method to estimate the optimal NC. The BWC index improves on the well-known Silhouette index. BWC validates the clustering results produced by a given clustering algorithm (e.g., affinity propagation or hierarchical clustering) and estimates the optimal NC for many kinds of data sets, including synthetic data sets, benchmark data sets, UCI data sets, gene expression data sets, and images. Theoretical analysis and experimental studies demonstrate the effectiveness and high efficiency of the new index and method.
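To make the general idea concrete, the following minimal sketch estimates the NC by maximizing an internal validity index over candidate values of k. It uses the standard Silhouette index, which the abstract names as the baseline that BWC improves on; it is not the BWC index, whose definition is not given here. The function names (`agglomerative`, `silhouette`, `estimate_nc`), the choice of average linkage, and the toy data are all assumptions made for illustration.

```python
import math
import random


def agglomerative(points, k):
    """Average-linkage agglomerative clustering (illustrative, O(n^3)); returns one label per point."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                d = sum(math.dist(points[p], points[q])
                        for p in clusters[i] for q in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for p in members:
            labels[p] = c
    return labels


def silhouette(points, labels):
    """Mean Silhouette width over all samples; singleton clusters contribute 0."""
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    total = 0.0
    for i, lab in enumerate(labels):
        own = groups[lab]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(points[i], points[j]) for j in members) / len(members)
                for other, members in groups.items() if other != lab)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


def estimate_nc(points, k_max):
    """Return the k in [2, k_max] with the highest mean Silhouette, plus all scores."""
    scores = {k: silhouette(points, agglomerative(points, k))
              for k in range(2, k_max + 1)}
    return max(scores, key=scores.get), scores


# Toy data (assumed for illustration): three well-separated 2-D Gaussian blobs.
rng = random.Random(0)
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in centers for _ in range(15)]

best_k, scores = estimate_nc(points, 6)
print("estimated NC:", best_k)
```

The same loop structure would apply to any internal index: cluster the data for each candidate k, score each partition, and keep the best-scoring k. Swapping in another index only requires replacing `silhouette` with a function of the same signature.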

Keywords: Clustering validity index, Number of clusters, Affinity propagation, Hierarchical clustering

Review process:

Paper URL: https://doi.org/10.1007/s11063-021-10427-8