Deep cascaded cross-modal correlation learning for fine-grained sketch-based image retrieval

Authors:

Highlights:

• A simple yet effective pipeline for FG-SBIR is built by combining the beneficial multimodal cues in sketches and annotated images.

• A deep cascaded neural network architecture with deep representation, embedding, and ranking is established for revealing multimodal relationships.

• Two extended image datasets are collected to validate the generalization ability of our scheme, demonstrating its effectiveness for both SBIR and FG-SBIR.


Keywords: Fine-grained Sketch-based Image Retrieval (FG-SBIR), Deep Cascaded Cross-modal Correlation Learning, Deep Multimodal Representation, Deep Multimodal Embedding, Deep Triplet Ranking

Review history: Received 24 April 2019, Revised 4 November 2019, Accepted 4 December 2019, Available online 11 December 2019, Version of Record 20 December 2019.

Official paper URL: https://doi.org/10.1016/j.patcog.2019.107148