Two-level k-means clustering algorithm for k–τ relationship establishment and linear-time classification

Abstract

Partitional clustering algorithms, which partition the dataset into a pre-defined number of clusters, can be broadly classified into two types: algorithms which explicitly take the number of clusters as input and algorithms that take the expected size of a cluster as input. In this paper, we propose a variant of the k-means algorithm and prove that it is more efficient than standard k-means algorithms. An important contribution of this paper is the establishment of a relation between the number of clusters and the size of the clusters in a dataset through the analysis of our algorithm. We also demonstrate that the integration of this algorithm as a pre-processing step in classification algorithms reduces their running-time complexity.
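
The two input styles contrasted in the abstract can be illustrated with a minimal sketch. This is not the paper's two-level algorithm, whose details are not given here. It pairs plain Lloyd's k-means (the caller fixes the number of clusters k) with a simple leader-style pass (the caller fixes a threshold). Treating τ as a distance threshold on cluster radius is an assumption made only for this illustration, and the function names `kmeans` and `leader_clustering` as well as the toy data are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means: the caller fixes the number of clusters k."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # pick k distinct points as initial centres
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest current centre.
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)
        # Move each centre to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return centers


def leader_clustering(points, tau):
    """Leader-style pass: the caller fixes a threshold tau (assumed here to be
    a distance), and the number of clusters falls out of the data."""
    leaders = []
    for p in points:
        # A point farther than tau from every existing leader starts a new cluster.
        if all(math.dist(p, q) > tau for q in leaders):
            leaders.append(p)
    return leaders


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]

centers = kmeans(points, k=2)                 # k is given, cluster size is implied
leaders = leader_clustering(points, tau=1.0)  # tau is given, cluster count is implied
print(len(centers), len(leaders))
```

In the first call the number of clusters is an input and the cluster extent follows from the data. In the second the threshold is an input and the number of clusters follows. The relation between the two quantities is what the abstract says the paper establishes for its own algorithm.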

Keywords: Clustering, k-Means, Classification, Linear-time complexity, Support vector machines, k-Nearest neighbor classifier

Review history: Received 12 January 2009, Revised 25 August 2009, Accepted 22 September 2009, Available online 1 October 2009.

Official paper URL: https://doi.org/10.1016/j.patcog.2009.09.019