Multi-objective design of hierarchical consensus functions for clustering ensembles via genetic programming

Authors:

Abstract

This paper investigates a genetic programming (GP) approach to the multi-objective design of hierarchical consensus functions for clustering ensembles. In this approach, data partitions obtained via different clustering techniques are continuously refined (via selection and merging) by a population of fusion hierarchies that use complementary validation indices as objective functions. To assess the efficiency and effectiveness of the novel framework, a series of systematic experiments was conducted on a number of artificial, benchmark and bioinformatics datasets; these involved eleven variants of the proposed GP-based algorithm and a comparison with basic as well as advanced clustering methods, some of which are clustering ensembles and/or multi-objective in nature. Overall, the results corroborate the view that having fusion hierarchies operate on well-chosen subsets of data partitions is an effective strategy that may yield significant gains in clustering robustness.

Keywords: Cluster analysis, Clustering ensembles, Multi-objective clustering, Hierarchical fusion, Partition selection, Genetic programming

Publication history: Available online 1 February 2011.

Official paper URL: https://doi.org/10.1016/j.dss.2011.01.014