Compact descriptors for sketch-based image retrieval using a triplet loss convolutional neural network

Authors:

Highlights:

Abstract

We present an efficient representation for sketch-based image retrieval (SBIR) derived from a triplet loss convolutional neural network (CNN). We treat SBIR as a cross-domain modelling problem, in which a depiction-invariant embedding of sketch and photo data is learned by regression over a Siamese CNN architecture with half-shared weights and a modified triplet loss function. Uniquely, we demonstrate the ability of our learned image descriptor to generalise beyond the object categories present in our training data, forming a basis for general cross-category SBIR. We explore appropriate strategies for training, and for deriving from the learned representation a compact image descriptor suitable for indexing data on resource-constrained (e.g. mobile) devices. We show the learned descriptors to outperform the state of the art in SBIR on the de facto standard Flickr15k dataset, using a significantly more compact search index (56 bits per image, i.e. ≈ 105 KB in total) than previous methods. Datasets and models are available from the CVSSP datasets server at www.cvssp.org.
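The two core ingredients named in the abstract, a Siamese CNN with half-shared weights and a triplet loss, can be illustrated with a minimal PyTorch sketch. This is a hedged reconstruction, not the paper's architecture: all layer sizes, the margin value, and the names (HalfSharedSiamese, stem, embed_sketch, embed_photo) are illustrative assumptions.

```python
# Minimal sketch: domain-specific (unshared) stems feed a shared head, and a
# triplet loss pulls sketch/photo pairs together across domains. Illustrative
# only; layer shapes and the margin are assumptions, not the paper's values.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HalfSharedSiamese(nn.Module):
    def __init__(self, embed_dim=128):
        super().__init__()
        # Unshared stems: one per domain, so each branch can adapt to its own
        # input statistics (sparse sketch strokes vs. natural photo textures).
        def stem():
            return nn.Sequential(
                nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            )
        self.sketch_stem = stem()
        self.photo_stem = stem()
        # Shared head: maps both domains into one common embedding space,
        # which is what makes the descriptor depiction-invariant.
        self.shared = nn.Sequential(
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(256, embed_dim),
        )

    def embed_sketch(self, x):
        return F.normalize(self.shared(self.sketch_stem(x)), dim=1)

    def embed_photo(self, x):
        return F.normalize(self.shared(self.photo_stem(x)), dim=1)

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Pull the sketch (anchor) toward its matching photo (positive) and push
    # it away from a non-matching photo (negative) by at least `margin`.
    d_pos = (anchor - positive).pow(2).sum(dim=1)
    d_neg = (anchor - negative).pow(2).sum(dim=1)
    return F.relu(d_pos - d_neg + margin).mean()

model = HalfSharedSiamese()
sketches, pos_photos, neg_photos = (torch.randn(4, 3, 224, 224) for _ in range(3))
loss = triplet_loss(model.embed_sketch(sketches),
                    model.embed_photo(pos_photos),
                    model.embed_photo(neg_photos))
loss.backward()
```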

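The 56-bits-per-image index figure can likewise be made concrete. The sketch below assumes one common compaction recipe, PCA to 56 dimensions followed by sign binarisation and Hamming-distance search, purely as a stand-in for whichever descriptor-compaction strategy the paper actually adopts; the embeddings here are random placeholders.

```python
# Hedged sketch of a 56-bit binary code derived from a learned embedding:
# project to 56 dims (PCA, fit on sample embeddings) and binarise by sign.
# This is an assumed recipe for illustration, not the authors' method.
import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.standard_normal((15000, 128)).astype(np.float32)  # stand-in for CNN outputs

# Fit a 56-dimensional PCA projection.
mean = embeddings.mean(axis=0)
_, _, vt = np.linalg.svd(embeddings - mean, full_matrices=False)
proj = vt[:56].T  # (128, 56)

def to_code(x):
    """Map an embedding (or a batch) to a packed 56-bit code (7 bytes)."""
    bits = ((np.atleast_2d(x) - mean) @ proj > 0)
    return np.packbits(bits, axis=1)  # 56 bits -> 7 uint8s per image

index = to_code(embeddings)   # 15k images x 7 bytes = 105,000 bytes, ~105 KB
query = to_code(embeddings[42])

# Hamming-distance search: XOR the packed bytes and count the set bits.
dist = np.unpackbits(index ^ query, axis=1).sum(axis=1)
print(dist.argsort()[:10])    # ten nearest photos to the query
```

At 7 bytes per image, a 15,000-image gallery packs into roughly 105 KB, matching the index size quoted in the abstract.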
Keywords:

Article history: Received 8 June 2016, Revised 1 April 2017, Accepted 19 June 2017, Available online 22 June 2017, Version of Record 17 December 2017.

DOI: https://doi.org/10.1016/j.cviu.2017.06.007