Clustering nominal data using unsupervised binary decision trees: Comparisons with the state of the art methods

Authors:

Highlights:

• An extension of clustering using binary decision trees (CUBT) is presented for nominal data.

• New heuristics are given for tuning the parameters of CUBT.

• CUBT outperforms many of the existing approaches for nominal datasets.

• The tree structure aids interpretation of the obtained clusters.

• The method is usable for direct prediction.

• The method may be used with parallel computing and thus for big data.

Keywords: CUBT, Unsupervised learning, Clustering, Binary decision trees, Nominal data, Mutual information, Entropy

Review history: Received 19 March 2016, Revised 17 October 2016, Accepted 24 January 2017, Available online 1 February 2017, Version of Record 25 March 2017.

Paper URL: https://doi.org/10.1016/j.patcog.2017.01.031