Robust unsupervised image categorization based on variational autoencoder with disentangled latent representations

Authors:

Highlights:

Abstract

Deep generative models have recently been applied successfully to unsupervised clustering, owing to their ability to learn good representations of the input data in a lower-dimensional latent space. In this work, we propose a robust deep generative clustering method based on a variational autoencoder (VAE) for unsupervised image categorization. The merits of our method can be summarized as follows. First, each latent representation generated by the encoder is disentangled into a cluster representation, which preserves the clustering information, and a generation representation, which preserves the information needed for generation. Thus, by using only the cluster representation, we can improve the performance and efficiency of clustering without interference from the generation task. Second, a Student’s-t mixture model is adopted as the prior over the cluster representation to enhance the robustness of our method against outliers. Third, we propose a biaugmentation module to improve the training stability of our model. In contrast with most existing deep generative clustering methods, which require a pretraining step to stabilize training, our model achieves stable training through feature disentanglement and data augmentation. We validate the proposed method through extensive experiments, comparing it with state-of-the-art methods on unsupervised image categorization.
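
To make the first two ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the latent vector is split by simple slicing, that each mixture component has a diagonal scale matrix, and that all components share one degrees-of-freedom value; the names `split_latent`, `student_t_logpdf`, `cluster_responsibilities`, and the parameter `cluster_dim` are hypothetical and do not come from the paper.

```python
import math
import numpy as np


def split_latent(z, cluster_dim):
    """Split latent vectors into a cluster part and a generation part.

    Assumption: the first `cluster_dim` coordinates hold the cluster
    representation and the rest hold the generation representation.
    """
    return z[:, :cluster_dim], z[:, cluster_dim:]


def student_t_logpdf(x, mu, scale_diag, nu):
    """Log-density of a multivariate Student's-t with diagonal scale matrix."""
    d = x.shape[-1]
    # Squared Mahalanobis distance under the diagonal scale matrix.
    delta = np.sum((x - mu) ** 2 / scale_diag, axis=-1)
    log_norm = (
        math.lgamma((nu + d) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * d * math.log(nu * math.pi)
        - 0.5 * np.sum(np.log(scale_diag))
    )
    return log_norm - 0.5 * (nu + d) * np.log1p(delta / nu)


def cluster_responsibilities(z_c, weights, mus, scale_diags, nu):
    """Posterior cluster probabilities under a Student's-t mixture."""
    log_joint = np.stack(
        [
            np.log(w) + student_t_logpdf(z_c, m, s, nu)
            for w, m, s in zip(weights, mus, scale_diags)
        ],
        axis=1,
    )
    # Subtract the row-wise maximum before exponentiating for stability.
    log_joint = log_joint - log_joint.max(axis=1, keepdims=True)
    resp = np.exp(log_joint)
    return resp / resp.sum(axis=1, keepdims=True)


# Toy usage: 5-dimensional latents, the first 2 dimensions used for clustering.
rng = np.random.default_rng(0)
z = rng.normal(size=(6, 5))
z_cluster, z_generation = split_latent(z, cluster_dim=2)

weights = np.array([0.5, 0.5])
mus = np.array([[0.0, 0.0], [4.0, 4.0]])
scale_diags = np.ones((2, 2))
resp = cluster_responsibilities(z_cluster, weights, mus, scale_diags, nu=3.0)
labels = resp.argmax(axis=1)
```

Only `z_cluster` enters the clustering step here, mirroring the claim that clustering can proceed without interference from the generation representation. The heavy tails of the Student's-t density mean that a point far from every component centre is penalized polynomially rather than exponentially, which is the usual motivation for using it when outliers are expected.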

Keywords: Clustering, Variational autoencoder (VAE), Disentangled latent representations, Robust training, Mixture model, Student’s-t distribution

Article history: Received 26 September 2021, Revised 19 March 2022, Accepted 24 March 2022, Available online 31 March 2022, Version of Record 7 April 2022.

Official paper URL: https://doi.org/10.1016/j.knosys.2022.108671