Generating clusters of similar sizes by constrained balanced clustering
Authors: Yuming Lin, Haibo Tang, You Li, Chuangxin Fang, Zejun Xu, Ya Zhou, Aoying Zhou
Abstract
Balanced clustering, which generates clusters of similar sizes, is useful in a variety of applications. However, existing clustering algorithms either cannot guarantee balanced results or incur relatively high time complexity to achieve them. In this work, we propose a constrained balanced clustering method, referred to as τ-balanced clustering, that generates clusters with a controllable degree of balance. First, the proposed method constrains cluster sizes in the cluster assignment phase, based on an established bound on cluster size and an established bound on the number of largest clusters. Second, we optimize the basic τ-balanced clustering method with two-level filtering, which eliminates some unnecessary calculations. Third, we design parallel versions of both the basic and the optimized τ-balanced clustering methods for GPUs (Graphics Processing Units) to improve execution efficiency through high parallelism. Finally, we conduct a series of experiments on nine benchmark datasets to verify the proposed methods. The experimental results show that our methods outperform the state-of-the-art methods.
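To make the idea of constraining cluster sizes during the assignment phase concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a per-cluster cap of ceil(n / k) + tau, which is an illustrative reading of the balance parameter τ and is not taken from the paper. The function name `capped_assign` is hypothetical, and the sketch omits the bound on the number of largest clusters, the two-level filtering, and the GPU parallelization.

```python
import math


def capped_assign(points, centroids, tau):
    """Assign each point to its nearest centroid that still has room.

    Assumption (not from the paper): every cluster may hold at most
    ceil(n / k) + tau points, where tau >= 0 loosens the balance.
    """
    n, k = len(points), len(centroids)
    cap = math.ceil(n / k) + tau

    # Every (distance, point index, centroid index) pair, closest first.
    pairs = sorted(
        (math.dist(p, c), i, j)
        for i, p in enumerate(points)
        for j, c in enumerate(centroids)
    )

    labels = [-1] * n  # -1 marks a point that is not yet assigned
    sizes = [0] * k
    for _, i, j in pairs:
        # Accept the pair only if the point is free and the cluster has room.
        if labels[i] == -1 and sizes[j] < cap:
            labels[i] = j
            sizes[j] += 1
    return labels, sizes


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (0.5, 0.5)]
    cts = [(0, 0), (10, 10)]
    print(capped_assign(pts, cts, tau=0)[1])  # strict cap
    print(capped_assign(pts, cts, tau=3)[1])  # loose cap
```

Because the total capacity k * cap is at least n, every point receives a label. In this sketch a smaller tau forces cluster sizes closer together, while a larger tau lets the assignment approach an unconstrained nearest-centroid assignment. Sorting all point-centroid pairs costs O(nk log(nk)), which illustrates why a practical method would want filtering and parallelism.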
Keywords: Balanced clustering, Optimization, GPUs-enhanced, Parallelization
Paper URL: https://doi.org/10.1007/s10489-021-02682-y