UNIC: A fast nonparametric clustering

Authors:

Highlights:

• A new algorithm is proposed to address challenges of clustering large data sets.

• UNIC has near-linear time and space complexity and does not require control parameters to be tuned in advance.

• The algorithm derives cluster structure by assessing distances between arbitrarily selected points and the rest of the set, using methods from robust statistics.

• Experimental results on synthetic and real-world data show that the algorithm performs comparably to the selected clustering methods and scales well.
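
The third highlight mentions deriving structure from distances between a selected point and the rest of the set, using robust statistics. The sketch below is not the UNIC algorithm, whose details are not given here. It is only a toy illustration of that general idea, under assumptions of our own: distances from one reference point are sorted, and a gap between consecutive distances is treated as a group boundary when it exceeds the median gap plus `k` times the median absolute deviation (MAD). The function name `split_by_distance_gaps`, the parameter `k`, and the sample data are all hypothetical.

```python
import math
import statistics


def split_by_distance_gaps(points, reference, k=3.0):
    """Toy illustration (not UNIC): group points by gaps in their
    distances to one reference point, using median and MAD of the gaps."""
    # Distance from the reference to every point, paired with the point's index.
    order = sorted((math.dist(reference, p), i) for i, p in enumerate(points))
    dists = [d for d, _ in order]
    # Gaps between consecutive sorted distances.
    gaps = [b - a for a, b in zip(dists, dists[1:])]
    if not gaps:
        return [0] * len(points)
    # Robust location and scale of the gaps (assumed choice: median and MAD).
    med = statistics.median(gaps)
    mad = statistics.median([abs(g - med) for g in gaps])
    # If the MAD is zero, any gap above the median counts as a boundary.
    threshold = med + k * mad
    labels = [0] * len(points)
    group = 0
    for pos, (_, idx) in enumerate(order):
        # Start a new group whenever the gap before this point is unusually large.
        if pos > 0 and gaps[pos - 1] > threshold:
            group += 1
        labels[idx] = group
    return labels


# Two small, well-separated groups of 2-D points (made-up data).
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1), (5.05, 5.05),
]
labels = split_by_distance_gaps(points, reference=points[0])
print(labels)
```

A single reference point only separates groups that lie at different distances from it, so a sketch like this would need several reference points to be useful on real data. No parameter beyond the illustrative `k` is involved, which is a property of this toy and not a statement about how UNIC avoids tuning.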

Keywords: Cluster analysis, Hard (conventional, crisp) clustering, Nonparametric algorithms, Data mining, Big data

Article history: Received 1 February 2019, Revised 15 August 2019, Accepted 15 November 2019, Available online 19 November 2019, Version of Record 7 December 2019.

Official paper URL: https://doi.org/10.1016/j.patcog.2019.107117