Categorical data visualization and clustering using subjective factors
Authors:
Abstract
Clustering is an important data mining problem. However, most earlier work on clustering focused on numeric attributes, whose values have a natural ordering. Recently, clustering data with categorical attributes, whose values have no natural ordering, has received more attention. A common issue in cluster analysis is that there is no single correct answer for the number of clusters, since cluster analysis involves subjective human judgement. Interactive visualization is one method that lets users decide on proper clustering parameters. In this paper, a new clustering approach called CDCS (Categorical Data Clustering with Subjective factors) is introduced, in which a visualization tool for clustered categorical data is developed so that the result of adjusting parameters is instantly reflected. The experiment shows that CDCS generates high-quality clusters compared to other typical algorithms.
Keywords: Data mining, Cluster analysis, Categorical data, Cluster visualization
Review history: Received 3 April 2004, Accepted 1 September 2004, Available online 30 September 2004.
Paper URL: https://doi.org/10.1016/j.datak.2004.09.001