Evidential clustering of large dissimilarity data

Abstract

In evidential clustering, the membership of objects in clusters is considered to be uncertain and is represented by Dempster-Shafer mass functions, forming a credal partition. The EVCLUS algorithm constructs a credal partition in such a way that larger dissimilarities between objects correspond to higher degrees of conflict between the associated mass functions. In this paper, we present several improvements to EVCLUS, making it applicable to very large dissimilarity data. First, the gradient-based optimization procedure in the original EVCLUS algorithm is replaced by a much faster iterative row-wise quadratic programming method. Second, we show that EVCLUS can be provided with only a random sample of the dissimilarities, reducing the time and space complexity from quadratic to roughly linear. Finally, we introduce a two-step approach to construct credal partitions assigning masses to selected pairs of clusters, which makes the algorithm's outputs more informative than those of the original EVCLUS while remaining manageable for large numbers of clusters.
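
To make two of the ideas in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a mass function is stored as a dictionary mapping focal sets (frozensets of cluster labels) to masses, and it computes the usual Dempster-Shafer degree of conflict, i.e. the sum of m1(A)·m2(B) over all pairs of focal sets with empty intersection. The helper `sample_pairs` and its parameter `k` are hypothetical names illustrating one way to keep only a random sample of dissimilarities: each object is paired with `k` randomly chosen other objects, so the number of stored pairs grows as n·k instead of n(n-1)/2.

```python
import random


def conflict(m1, m2):
    """Degree of conflict between two mass functions.

    Each mass function is a dict {frozenset_of_clusters: mass}.
    The conflict is the sum of m1(A) * m2(B) over all pairs of
    focal sets A, B whose intersection is empty.
    """
    return sum(
        v1 * v2
        for a, v1 in m1.items()
        for b, v2 in m2.items()
        if not (a & b)
    )


def sample_pairs(n, k, seed=0):
    """Hypothetical sampling scheme (an assumption, for illustration only).

    For each object i in 0..n-1, draw k distinct partners j != i,
    giving n * k pairs instead of all n * (n - 1) / 2.
    """
    rng = random.Random(seed)
    pairs = []
    for i in range(n):
        # Sample from n - 1 slots, then shift indices >= i to skip i itself.
        for j in rng.sample(range(n - 1), k):
            pairs.append((i, j + 1 if j >= i else j))
    return pairs


# Two objects that mostly belong to different clusters: high conflict.
m_a = {frozenset({1}): 0.9, frozenset({1, 2}): 0.1}
m_b = {frozenset({2}): 0.8, frozenset({1, 2}): 0.2}

print(conflict(m_a, m_b))        # conflict between dissimilar objects
print(conflict(m_a, m_a))        # an object never conflicts with itself here
print(len(sample_pairs(100, 5))) # number of sampled pairs
```

In this toy case only the pair of focal sets {1} and {2} is disjoint, so the conflict between the two objects is 0.9 × 0.8. An EVCLUS-style method would then look for mass functions whose conflicts match the (sampled) dissimilarities as closely as possible.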

Keywords: Dempster-Shafer theory, Evidence theory, Belief functions, Unsupervised learning, Credal partition, Relational data, Proximity data, Pairwise data

Review history: Received 9 April 2016, Revised 20 May 2016, Accepted 22 May 2016, Available online 27 May 2016, Version of Record 18 June 2016.

Paper URL: https://doi.org/10.1016/j.knosys.2016.05.043