A parameter-free affinity based clustering

Authors: Bhaskar Mukhoty, Ruchir Gupta, Lakshmanan K., Mayank Kumar

Abstract

Several methods have been proposed to estimate the number of clusters in a dataset; the basic idea behind all of them has been to study an index that measures inter-cluster separation and intra-cluster cohesion over a range of cluster numbers and to report the number that gives an optimum value of the index. In this paper, we propose a simple parameter-free approach that more closely resembles human cognition of clusters: closely lying points are readily identified as forming a cluster, and the total number of clusters is revealed in the process. To identify closely lying points, the affinity of two points is defined as a function of their distance, and a threshold affinity is identified above which two points in a dataset are likely to be in the same cluster. Well-separated clusters are identified even in the presence of outliers, whereas for a dataset that is not well separated, the final number of clusters is estimated from the detected clusters, which are then merged to produce the final clusters. Experiments on several high-dimensional synthetic and real datasets show good results, with robustness to noise and to density variation within a dataset.
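The thresholding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the affinity function or say how the threshold is derived, so the affinity `1 / (1 + d)` and the caller-supplied `threshold` below are assumptions made purely for illustration (the paper itself is parameter-free). The helper names `affinity` and `cluster_by_affinity` are hypothetical. The sketch links every pair of points whose affinity exceeds the threshold and reports the connected components as clusters.

```python
import math


def affinity(p, q):
    # Assumed affinity (not from the paper): 1 at distance 0,
    # decreasing toward 0 as the Euclidean distance grows.
    return 1.0 / (1.0 + math.dist(p, q))


def cluster_by_affinity(points, threshold):
    # Link every pair whose affinity exceeds `threshold`, then return one
    # label per point identifying its connected component (union-find).
    # NOTE: `threshold` is supplied by the caller here only for illustration;
    # the paper identifies the threshold itself.
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if affinity(points[i], points[j]) > threshold:
                parent[find(i)] = find(j)

    # Relabel component roots as 0..k-1 in order of first appearance.
    labels, seen = [], {}
    for i in range(len(points)):
        root = find(i)
        labels.append(seen.setdefault(root, len(seen)))
    return labels


points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0),
          (20.0, 20.0)]
# With the assumed affinity, a threshold of 0.5 links points closer than 1.0.
labels = cluster_by_affinity(points, threshold=0.5)
num_clusters = len(set(labels))
```

In this toy input the two tight groups each form one component, and the isolated point ends up in a component of its own, which a later step could treat as an outlier. The merging of detected clusters for data that is not well separated, mentioned in the abstract, is not attempted here.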

Keywords: Clustering, Parameter-free methods

Paper URL: https://doi.org/10.1007/s10489-020-01812-2