Combining global, regional and contextual features for automatic image annotation

Authors:

Highlights:

Abstract

This paper presents a novel approach to automatic image annotation that combines global, regional, and contextual features through an extended cross-media relevance model. Unlike typical annotation methods, which use either global or regional features exclusively and neglect the textual context among the annotated words, the proposed approach incorporates all three kinds of information, each of which helps describe image semantics, and annotates images by estimating their joint probability. Specifically, the global features are represented as a distribution vector over visual topics, and the textual context is modeled as a multinomial distribution. The global features capture the image-wide distribution of visual topics, while the textual context relaxes the assumption of mutual independence among annotated words that most existing methods adopt. Both the global features and the textual context are learned from the training data by probabilistic latent semantic analysis. Experiments on the 5k Corel image set show that combining these three kinds of information is beneficial for image annotation.
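For orientation, the baseline cross-media relevance model (CMRM) that the paper extends estimates the joint probability of a candidate word w and the regional (blob) representation b_1, …, b_m of an unannotated image by averaging over the annotated training images J in the training set T:

\[
P(w, b_1, \dots, b_m) \;\approx\; \sum_{J \in \mathcal{T}} P(J)\, P(w \mid J) \prod_{i=1}^{m} P(b_i \mid J)
\]

Based on the abstract, the proposed extension augments each training-image term with additional factors: one scoring how well the unannotated image's PLSA-derived global visual-topic distribution matches that of J, and one scoring the candidate word under a multinomial textual-context model over the words already assigned, so annotations are no longer treated as mutually independent. The formula above is only the standard CMRM starting point; the exact extended factorization is defined in the paper itself.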

Keywords: Global and regional features, Textual context, Cross-media relevance model, Latent semantic analysis, Image annotation

Article history: Received 24 December 2007, Revised 10 April 2008, Accepted 7 May 2008, Available online 17 May 2008.

DOI: https://doi.org/10.1016/j.patcog.2008.05.010