Clustering with block mixture models

Authors:

Abstract

Basing cluster analysis on mixture models has become a classical and powerful approach. Until now, this approach, which makes it possible to explain some classic clustering criteria such as the well-known k-means criterion and to propose general criteria, has been developed to classify a set of objects measured on a set of variables. For this kind of data, however, although most clustering procedures are designed to construct an optimal partition of the objects or, sometimes, of the variables, there exist other methods, named block clustering methods, which consider the two sets simultaneously and organize the data into homogeneous blocks. In this work, a new mixture model called the block mixture model is proposed to take this situation into account. This model makes it possible to embed the simultaneous clustering of objects and variables in a mixture approach. We first consider this probabilistic model in a general context and develop a new algorithm for simultaneous partitioning based on the CEM algorithm. Then, we focus on the case of binary data and show that our approach allows us to extend a block clustering method that had been proposed for this case. Simplicity, fast convergence and the ability to process large data sets are the major advantages of the proposed approach.
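
To make the idea of simultaneous partitioning more concrete, the following is a minimal sketch of a block CEM-style procedure for binary data. It is not taken from the paper: the function names (`block_cem_bernoulli`, `_params`), the random initialization, the stopping rule and the choice of one Bernoulli parameter per block are all assumptions made for illustration, and the paper's own parameterization may differ. The sketch alternates a classification step on the rows (with the column partition fixed) and on the columns (with the row partition fixed), re-estimating the parameters in between.

```python
import numpy as np

def _params(X, Z, W, eps):
    """Estimate proportions and block parameters for fixed partitions.

    Z (n x g) and W (d x m) are one-hot indicator matrices of the
    row and column partitions. Names are hypothetical.
    """
    sizes = np.outer(Z.sum(axis=0), W.sum(axis=0))  # number of cells per block
    ones = Z.T @ X @ W                              # number of 1s per block
    alpha = np.clip(ones / np.maximum(sizes, 1), eps, 1 - eps)
    pi = np.clip(Z.mean(axis=0), eps, None)         # row-cluster proportions
    rho = np.clip(W.mean(axis=0), eps, None)        # column-cluster proportions
    return pi, rho, alpha

def block_cem_bernoulli(X, g, m, n_iter=20, seed=0, eps=1e-6):
    """Sketch of a block CEM loop, assuming a Bernoulli block model."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    z = rng.integers(g, size=n)  # random initial row labels (assumption)
    w = rng.integers(m, size=d)  # random initial column labels (assumption)
    for _ in range(n_iter):
        # Row step: columns fixed, work on the reduced n x m matrix of counts.
        Z, W = np.eye(g)[z], np.eye(m)[w]
        pi, rho, alpha = _params(X, Z, W, eps)
        U = X @ W
        score = (np.log(pi) + U @ np.log(alpha).T
                 + (W.sum(axis=0) - U) @ np.log1p(-alpha).T)
        z_new = score.argmax(axis=1)
        # Column step: rows fixed, work on the reduced d x g matrix of counts.
        Z = np.eye(g)[z_new]
        pi, rho, alpha = _params(X, Z, W, eps)
        V = X.T @ Z
        score = (np.log(rho) + V @ np.log(alpha)
                 + (Z.sum(axis=0) - V) @ np.log1p(-alpha))
        w_new = score.argmax(axis=1)
        # Stop when neither partition changes.
        if np.array_equal(z_new, z) and np.array_equal(w_new, w):
            break
        z, w = z_new, w_new
    _, _, alpha = _params(X, np.eye(g)[z], np.eye(m)[w], eps)
    return z, w, alpha

# Usage on a small synthetic binary matrix (illustrative data only).
demo = (np.random.default_rng(1).random((30, 20)) < 0.5).astype(float)
rows, cols, block_probs = block_cem_bernoulli(demo, g=2, m=3)
```

Each step only needs the reduced count matrices `U` and `V` rather than the full data per cluster pair, which is one plausible reason such a scheme stays cheap on large data sets. Like other CEM-type procedures, it can stop at a local optimum, so several random starts would normally be compared.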

Keywords: Clustering, Mixture model, Block mixture model, Latent block model, EM algorithm, Block CEM algorithm

Review history: Received 22 June 2001, Accepted 18 March 2002, Available online 4 June 2002.

Paper URL: https://doi.org/10.1016/S0031-3203(02)00074-2