Fast and exact out-of-core and distributed k-means clustering

Authors: Ruoming Jin, Anjan Goswami, Gagan Agrawal

Abstract

Clustering is one of the most widely studied topics in data mining, and k-means is one of the most popular clustering algorithms. K-means requires several passes over the entire dataset, which can make it very expensive for large disk-resident datasets. For this reason, much work has gone into approximate versions of k-means that require only one or a small number of passes over the entire dataset.
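To make the cost of repeated passes concrete, the following is a minimal sketch of standard (Lloyd's) k-means run over chunked data, where every iteration re-reads the whole dataset. It is not the algorithm proposed in the paper. The function name kmeans_out_of_core, the read_chunks callback, and the max_passes and tol parameters are assumptions made for illustration only.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_out_of_core(
    read_chunks: Callable[[], Iterable[List[Point]]],
    initial_centers: Sequence[Point],
    max_passes: int = 100,
    tol: float = 1e-9,
) -> Tuple[List[List[float]], int]:
    """Standard k-means over chunked data (illustrative sketch).

    read_chunks() is a hypothetical callback that returns a fresh
    iterator over chunks of points; each call stands for one full
    scan of the disk-resident dataset.
    Returns the final centers and the number of full passes made.
    """
    centers = [list(c) for c in initial_centers]
    dim = len(centers[0])
    passes = 0
    for _ in range(max_passes):
        # Per-cluster running sums and counts; only these stay in memory.
        sums = [[0.0] * dim for _ in centers]
        counts = [0] * len(centers)
        for chunk in read_chunks():  # one full pass over the data
            for p in chunk:
                nearest = min(
                    range(len(centers)),
                    key=lambda i: squared_distance(p, centers[i]),
                )
                counts[nearest] += 1
                for d in range(dim):
                    sums[nearest][d] += p[d]
        passes += 1
        # Recompute each center as the mean of its assigned points;
        # an empty cluster keeps its previous center.
        new_centers = [
            [s / counts[j] for s in sums[j]] if counts[j] else centers[j]
            for j in range(len(centers))
        ]
        shift = max(squared_distance(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, passes
```

Even on well-separated data this loop needs one extra scan just to confirm that the centers have stopped moving, so the pass count grows with the number of iterations. That per-iteration scan is the expense the abstract refers to for disk-resident data.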

Keywords: k-means clustering, Out-of-core datasets, Distributed k-means, Confidence radius, Boundary points

Paper URL: https://doi.org/10.1007/s10115-005-0210-0