Social image refinement and annotation via weakly-supervised variational auto-encoder

Authors:

Abstract

The ever-increasing volume of social images and their imperfect labels has made social image refinement and annotation a crucial problem in supervised learning. However, previous models based on nearest neighbors or matrix completion are limited when the social image set is huge and the labels are highly sparse. Deep generative models use inference and generative networks to infer latent variables from an observed data variable; they can handle imperfect data, capture noisy data, and fill in missing data variables. In this paper, we propose a new social image refinement and annotation model based on a weakly-supervised variational auto-encoder. First, we formulate the social image refinement and annotation problem as a joint distribution of social images and labels in a probabilistic generative model. Second, we derive a new evidence lower bound objective to handle imperfect labels. Third, we design a new multi-layer neural network, comprising inference and generative networks, to optimize the new evidence lower bound efficiently. Finally, we compare our model with other representative models on several real-world social image datasets. Experimental results on social image refinement and annotation tasks show that the proposed model is competitive with, or even better than, existing state-of-the-art methods.
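To make the idea of an evidence lower bound over images and imperfect labels concrete, here is a minimal sketch of the standard variational auto-encoder bound with a label term that skips missing entries. It is not the bound derived in the paper: the function names, the Bernoulli label likelihood, the standard normal prior, and the rule of masking unobserved labels are all assumptions made for illustration.

```python
import math

def gaussian_kl(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))

def masked_label_log_lik(probs, labels, observed):
    """Bernoulli log-likelihood summed over observed labels only.

    observed[i] is False where a label is missing, so that entry is skipped.
    (Hypothetical handling of imperfect labels, not the paper's method.)
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, y, seen in zip(probs, labels, observed):
        if seen:
            total += y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
    return total

def elbo_sketch(image_log_lik, probs, labels, observed, mu, log_var):
    """Single-sample ELBO estimate: likelihood terms minus the KL penalty.

    image_log_lik stands in for log p(x | z) from a generative network.
    """
    return (image_log_lik
            + masked_label_log_lik(probs, labels, observed)
            - gaussian_kl(mu, log_var))
```

In this sketch, training would maximize `elbo_sketch` with respect to the parameters of the inference network (which would produce `mu` and `log_var`) and the generative network (which would produce `probs` and `image_log_lik`).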

Keywords: Computer vision, Generative model, Variational auto-encoder, Image refinement, Image annotation

Review history: Received 17 May 2019, Revised 6 October 2019, Accepted 23 November 2019, Available online 27 November 2019, Version of Record 24 February 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2019.105259