Discrete data clustering using finite mixture models

Abstract

Finite mixture models have been applied to a range of computer vision, image processing, and pattern recognition tasks. Most work on finite mixture models has focused on mixtures for continuous data. However, many applications involve and generate discrete data, for which discrete mixtures are better suited. In this paper, we investigate the problem of discrete data modeling using finite mixture models. We propose a novel, well-motivated mixture that we call the multinomial generalized Dirichlet mixture. The novel model is compared with other discrete mixtures. We designed experiments involving the modeling and summarization of spatial color image databases, as well as text classification, to show the robustness, flexibility, and merits of our approach.

Keywords: Discrete data, Finite mixture models, Multinomial, Generalized Dirichlet distribution, EM, Spatial color, Image databases, Labeled and unlabeled images, Summarization, Text classification

Review history: Received 28 June 2007, Revised 16 April 2008, Accepted 24 June 2008, Available online 2 July 2008.

Official paper URL: https://doi.org/10.1016/j.patcog.2008.06.022