Robust clustering by deterministic agglomeration EM of mixtures of multivariate t-distributions

Authors:

Abstract

This paper presents new robust clustering algorithms that are significantly less sensitive to noise and initialization than traditional mixture decomposition algorithms, and that simplify the determination of the optimal number of clusters in the data set. The algorithms implement maximum likelihood mixture decomposition of multivariate t-distributions, a robust parametric extension of Gaussian mixture decomposition. We achieve improved convergence capability relative to the expectation–maximization (EM) approach by deriving deterministic annealing EM (DAEM) algorithms for this mixture model and turning them into agglomerative algorithms (going through a monotonically decreasing number of components), an approach we term deterministic agglomeration EM (DAGEM). Two versions are derived, based on two variants of DAEM for mixture models. Simulation studies demonstrate the algorithms’ performance for mixtures with isotropic and non-isotropic covariances in 2 and 10 dimensions, with known or unknown levels of outlier contamination.
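To make the ingredients named in the abstract concrete, the following is a minimal sketch of one annealed EM iteration for a mixture of multivariate t-distributions. It is not the authors' DAGEM algorithm. It assumes one common form of DAEM in which the posterior responsibilities are tempered by an inverse temperature `beta` (with `beta = 1` recovering ordinary EM), it keeps the degrees of freedom `nu` fixed, and it omits the agglomeration step entirely. The function names `log_t_pdf` and `daem_step` and the `ridge` safeguard are hypothetical, introduced only for illustration.

```python
import math
import numpy as np


def log_t_pdf(X, mu, Sigma, nu):
    """Log density of a multivariate t-distribution at each row of X.

    Also returns the squared Mahalanobis distances, which the
    M-step needs for the robustness weights.
    """
    n, p = X.shape
    L = np.linalg.cholesky(Sigma)
    z = np.linalg.solve(L, (X - mu).T)        # shape (p, n)
    delta = np.sum(z ** 2, axis=0)            # squared Mahalanobis distances
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    const = (math.lgamma((nu + p) / 2.0) - math.lgamma(nu / 2.0)
             - 0.5 * p * math.log(nu * math.pi) - 0.5 * log_det)
    return const - 0.5 * (nu + p) * np.log1p(delta / nu), delta


def daem_step(X, pis, mus, Sigmas, nu, beta, ridge=1e-6):
    """One annealed EM iteration (assumed DAEM form: tempered posteriors)."""
    X = np.asarray(X, dtype=float)
    mus = np.asarray(mus, dtype=float)
    Sigmas = np.asarray(Sigmas, dtype=float)
    n, p = X.shape
    K = len(pis)

    # E-step: responsibilities proportional to (pi_k * f_k(x))^beta,
    # and t-distribution weights u that down-weight outlying points.
    log_w = np.empty((n, K))
    u = np.empty((n, K))
    for k in range(K):
        logf, delta = log_t_pdf(X, mus[k], Sigmas[k], nu)
        log_w[:, k] = beta * (np.log(pis[k]) + logf)
        u[:, k] = (nu + p) / (nu + delta)
    log_w -= log_w.max(axis=1, keepdims=True)  # numerical stability
    r = np.exp(log_w)
    r /= r.sum(axis=1, keepdims=True)

    # M-step: weighted updates with nu held fixed.
    new_pis = r.mean(axis=0)
    new_mus = np.empty_like(mus)
    new_Sigmas = np.empty_like(Sigmas)
    for k in range(K):
        w = r[:, k] * u[:, k]
        new_mus[k] = w @ X / w.sum()
        d = X - new_mus[k]
        new_Sigmas[k] = (w[:, None] * d).T @ d / r[:, k].sum()
        new_Sigmas[k] += ridge * np.eye(p)     # illustrative safeguard only
    return new_pis, new_mus, new_Sigmas, r
```

In an annealing run, `daem_step` would be iterated to convergence at each value of an increasing schedule for `beta` ending at 1. The schedule itself, the estimation of `nu`, and the rule for merging or discarding components are design choices that this sketch leaves open.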

Keywords: Clustering, Finite mixture models, EM algorithm, Robust algorithms, t-distribution, Deterministic annealing, Agglomerative algorithms

Article history: Received 16 August 2000, Accepted 5 March 2001, Available online 11 February 2002.

Paper URL: https://doi.org/10.1016/S0031-3203(01)00080-2