A hierarchical clusterer ensemble method based on boosting theory

Authors:

Highlights:

Abstract

Bagging and boosting are two well-known methods for building classifier ensembles. Clusterer ensemble methods that adopt the boosting concept are generally agreed to produce clusterings of improved quality and robustness. In this paper, we introduce a new boosting-based hierarchical clusterer ensemble method called Bob-Hic. The method creates a consensus hierarchical clustering (h-clustering) of a dataset, which helps improve clustering accuracy. Bob-Hic consists of several boosting iterations. In each iteration, a weighted random sample is first drawn from the original dataset, and an individual h-clustering is then created on the selected samples. At the end of the iterations, the individual clusterings are combined into a final consensus h-clustering. The intermediate structures used in the combination are distance descriptor matrices, each corresponding to an individual h-clustering result. This final integration is done through an information-theoretic approach. Experiments on popular synthetic and real datasets confirm that the proposed method improves on the results of simple clustering algorithms. In addition, our experimental results confirm that the method provides better consensus clustering quality than other available ensemble techniques.
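
The loop described in the abstract (weighted sampling, an individual h-clustering per iteration, one distance descriptor matrix per result, a final combination) can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract leaves several details open, so the following are assumptions made only for illustration: single linkage as the base clusterer, the cophenetic (merge-height) matrix as the distance descriptor, a placeholder weight update that emphasises points whose descriptor distances deviate most from their Euclidean distances, and plain averaging in place of the information-theoretic (Rényi-based) integration. All function and parameter names are hypothetical.

```python
import math
import random


def single_linkage_cophenetic(points):
    """Cophenetic (merge-height) matrix of a naive single-linkage h-clustering."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    clusters = [[i] for i in range(n)]
    coph = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        # Find the closest pair of clusters under single linkage.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        # Every cross pair of the two merged clusters first joins at height d.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = d
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return coph


def weighted_sample(weights, k, rng):
    """Draw k distinct indices with probability proportional to weights."""
    pool = list(range(len(weights)))
    chosen = []
    for _ in range(k):
        pick = rng.choices(pool, weights=[weights[i] for i in pool], k=1)[0]
        chosen.append(pick)
        pool.remove(pick)
    return chosen


def boosted_consensus(points, n_iter=10, sample_frac=0.7, seed=0):
    """Return (consensus descriptor matrix, final sampling weights)."""
    rng = random.Random(seed)
    n = len(points)
    k = max(2, int(sample_frac * n))
    weights = [1.0 / n] * n
    total = [[0.0] * n for _ in range(n)]
    count = [[0] * n for _ in range(n)]
    for _ in range(n_iter):
        idx = weighted_sample(weights, k, rng)
        coph = single_linkage_cophenetic([points[i] for i in idx])
        err = [0.0] * n
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                total[i][j] += coph[a][b]
                count[i][j] += 1
                err[i] += abs(coph[a][b] - math.dist(points[i], points[j])) / k
        # Placeholder update (assumption, not from the paper): raise the
        # weight of points that the current hierarchy represents poorly.
        scale = max(err) or 1.0
        weights = [w * math.exp(e / scale) for w, e in zip(weights, err)]
        norm = sum(weights)
        weights = [w / norm for w in weights]
    # Stand-in combination: average the descriptors over the iterations in
    # which both points were sampled; pairs never sampled together are NaN.
    consensus = [
        [total[i][j] / count[i][j] if count[i][j] else float("nan")
         for j in range(n)]
        for i in range(n)
    ]
    return consensus, weights
```

A final hierarchy could then be obtained by running any h-clustering algorithm on the consensus matrix. The naive agglomeration above is far too slow for anything but toy inputs, so a library implementation would be preferable in practice.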

Keywords: Boosting, Ensemble, Hierarchical clustering, Rényi

Article history: Received 15 May 2012, Revised 12 February 2013, Accepted 13 February 2013, Available online 28 February 2013.

Official paper URL: https://doi.org/10.1016/j.knosys.2013.02.009