BETULA: Fast clustering of large data with improved BIRCH CF-Trees

Authors:

Highlights:

• Improvement of the BIRCH algorithm.

• Improved numerical accuracy.

• Faster and more accurate clustering.

• Supports Hierarchical Clustering, k-means++ and GMM.

• Up to 500x faster than Gaussian Mixture Modeling.

Keywords: Cluster analysis, BIRCH, CF-Tree, k-means, Gaussian mixture modeling, Hierarchical agglomerative clustering

Review history: Received 20 February 2021, Revised 12 August 2021, Accepted 13 October 2021, Available online 28 October 2021, Version of Record 12 May 2022.

Official paper URL: https://doi.org/10.1016/j.is.2021.101918