Clustering categorical data sets using tabu search techniques

Authors:

Highlights:

Abstract

Clustering methods partition a set of objects into clusters such that objects in the same cluster are more similar to each other than to objects in different clusters, according to some defined criteria. The fuzzy k-means-type algorithm is best suited for implementing this clustering operation because of its effectiveness in clustering data sets. However, it works only on numeric values, which limits its use because data sets often contain categorical values. In this paper, we present a tabu search-based clustering algorithm that extends the k-means paradigm to categorical domains and to domains with both numeric and categorical values. Using tabu search techniques, the algorithm can explore the solution space beyond local optimality, aiming to find a global solution of the fuzzy clustering problem. The clustering results produced by the proposed algorithm are found to be highly accurate.
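To make the idea concrete, the following is a minimal sketch of tabu search applied to categorical clustering. It is not the algorithm of the paper: it assumes hard (non-fuzzy) assignments, the simple matching dissimilarity commonly used with k-modes, and a single-object reassignment move. All names and parameters (`tabu_cluster`, `tenure`, `n_candidates`, and so on) are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter

def mismatch(a, b):
    """Simple matching dissimilarity: number of attributes on which a and b differ."""
    return sum(x != y for x, y in zip(a, b))

def modes_of(data, labels, k):
    """Mode of every attribute for each cluster; an empty cluster gets None."""
    modes = []
    for c in range(k):
        members = [row for row, lab in zip(data, labels) if lab == c]
        if not members:
            modes.append(None)
            continue
        modes.append(tuple(Counter(col).most_common(1)[0][0] for col in zip(*members)))
    return modes

def cost(data, labels, k):
    """Total dissimilarity between each object and the mode of its cluster."""
    modes = modes_of(data, labels, k)
    return sum(mismatch(row, modes[lab]) for row, lab in zip(data, labels))

def tabu_cluster(data, k, iters=100, tenure=5, n_candidates=20, seed=0):
    """Illustrative tabu search over hard cluster assignments (an assumption,
    not the fuzzy formulation described in the abstract)."""
    rng = random.Random(seed)
    current = [rng.randrange(k) for _ in data]
    best, best_cost = current[:], cost(data, current, k)
    tabu = {}  # (object index, cluster) -> last iteration at which the move is forbidden
    for it in range(iters):
        candidates = []
        for _ in range(n_candidates):
            i, c = rng.randrange(len(data)), rng.randrange(k)
            if c == current[i]:
                continue
            trial = current[:]
            trial[i] = c
            trial_cost = cost(data, trial, k)
            is_tabu = tabu.get((i, c), -1) >= it
            # Aspiration: a tabu move is accepted only if it beats the best cost so far.
            if is_tabu and trial_cost >= best_cost:
                continue
            candidates.append((trial_cost, i, c))
        if not candidates:
            continue
        # Take the best admissible move even if it worsens the current solution;
        # this is what lets the search leave a local optimum.
        trial_cost, i, c = min(candidates)
        tabu[(i, current[i])] = it + tenure  # forbid moving object i straight back
        current[i] = c
        if trial_cost < best_cost:
            best, best_cost = current[:], trial_cost
    return best, best_cost

data = [("a", "x"), ("a", "x"), ("a", "x"), ("b", "y"), ("b", "y"), ("b", "y")]
labels, total = tabu_cluster(data, k=2)
```

Here `labels` holds one cluster index per object and `total` is the corresponding cost. Accepting non-improving moves while a short-term memory blocks immediate reversals is the mechanism that allows the search to move beyond local optimality; the tenure and candidate count are arbitrary values that would need tuning on real data.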

Keywords: Clustering, k-means, k-modes, Tabu search, Numeric data, Categorical data

Article history: Received 15 March 2001, Revised 10 September 2001, Accepted 20 November 2001, Available online 8 February 2002.

Paper URL: https://doi.org/10.1016/S0031-3203(02)00021-3