Monte Carlo comparison of six hierarchical clustering methods on random data

Abstract

There is mounting evidence that the complete linkage method performs best among hierarchical agglomerative clustering techniques, particularly with respect to misclassification in samples from known multivariate normal distributions. However, clustering methods are also notorious for discovering clusters in random data sets. We compare six agglomerative hierarchical methods on univariate random data drawn from uniform and standard normal distributions and find that the complete linkage method is generally the best at not discovering false clusters. The criterion is the ratio of the number of within-cluster distances to the number of all distances that are at most equal to the maximum within-cluster distance.
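The criterion in the last sentence can be made concrete with a small sketch. This is only one reading of the abstract, not the authors' code. It counts the pairwise distances inside clusters and divides by the number of all pairwise distances (within or between clusters) that do not exceed the largest within-cluster distance. Under this reading, a value of 1 means no between-cluster pair is closer than the widest within-cluster pair. The function names `complete_linkage` and `separation_ratio`, the naive agglomeration loop, and the sample size and cluster count in the demo are all assumptions made for illustration.

```python
import random
from itertools import combinations


def complete_linkage(points, k):
    """Naive complete-linkage agglomeration of 1-D points down to k clusters.

    Hypothetical helper for illustration; fine for small samples only.
    """
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Complete linkage: cluster distance is the largest pairwise distance.
            d = max(abs(a - b) for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        # i < j always holds here, so deleting index j leaves index i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def separation_ratio(clusters):
    """Number of within-cluster pairs divided by the number of all pairs whose
    distance is at most the maximum within-cluster distance (assumed reading
    of the criterion stated in the abstract)."""
    within = [abs(a - b) for c in clusters for a, b in combinations(c, 2)]
    if not within:
        raise ValueError("need at least one cluster with two or more points")
    d_max = max(within)
    all_points = [p for c in clusters for p in c]
    n_close = sum(
        1 for a, b in combinations(all_points, 2) if abs(a - b) <= d_max
    )
    return len(within) / n_close


if __name__ == "__main__":
    # Assumed demo settings: 30 uniform random points, cut at 3 clusters.
    random.seed(0)
    sample = [random.random() for _ in range(30)]
    print(separation_ratio(complete_linkage(sample, 3)))
```

On data with no real cluster structure, a ratio close to 1 would suggest that the method has reported well-separated groups that are not really there. Repeating the demo over many random samples and several linkage rules would give a Monte Carlo comparison in the spirit of the paper.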

Keywords: Hierarchical clustering methods, Univariate random data, Within-cluster distance, Between-cluster distance, Monte Carlo simulation, Complete linkage method

Article history: Received 1 March 1985, Accepted 22 May 1985, Available online 19 May 2003.

Paper URL: https://doi.org/10.1016/0031-3203(86)90038-5