Caption generation on scenes with seen and unseen object categories
Authors:
Highlights:
• The problem of true zero-shot image captioning (ZSC) is proposed.
• ZSC involves captioning over classes with no visual or textual training examples.
• A zero-shot object detection-driven approach is proposed to detect unseen objects.
• A template-based model is used to transform detections into sentences.
• A new evaluation metric (V-METEOR) is proposed for ZSC evaluation purposes.
Keywords: Zero-shot learning, Zero-shot image captioning
Review history: Received 30 March 2022, Revised 17 June 2022, Accepted 22 June 2022, Available online 27 June 2022, Version of Record 6 July 2022.
Paper URL: https://doi.org/10.1016/j.imavis.2022.104515