Mask-CNN: Localizing parts and selecting descriptors for fine-grained bird species categorization

Authors:

Highlights:

• To the best of our knowledge, Mask-CNN is the first end-to-end model that selects deep convolutional descriptors for object recognition, especially for fine-grained image recognition.

• We present a novel and efficient part-based three-stream model for fine-grained recognition. By discarding the fully connected layers, the proposed M-CNN is computationally efficient (cf. Table 1 and Table 4 in the experiments). Additionally, compared with state-of-the-art methods, M-CNN has a smaller feature dimensionality. Moreover, it achieves the highest classification accuracy on CUB200-2011 and Birdsnap among published methods.

• The part localization performance of the proposed model exceeds that of other part-based fine-grained approaches that require additional bounding boxes. In particular, M-CNN is 12.76% higher than the state of the art for head localization on CUB200-2011.
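
As a rough illustration of what "selecting deep convolutional descriptors" without fully connected layers could look like, the sketch below keeps only the descriptors of a convolutional feature map that fall inside a binary mask and then pools them into a single vector. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `select_descriptors` and `pool_descriptors`, the nested-list data layout, and the choice of concatenated average and max pooling are all illustrative and are not given in the highlights above.

```python
from typing import List, Sequence


def select_descriptors(
    feature_map: Sequence[Sequence[Sequence[float]]],
    mask: Sequence[Sequence[int]],
) -> List[List[float]]:
    """Keep the D-dimensional descriptors at positions where mask is nonzero.

    feature_map is an H x W grid of D-dimensional descriptors (hypothetical
    layout); mask is an H x W grid of 0/1 values at the same resolution.
    """
    selected = []
    for row_feats, row_mask in zip(feature_map, mask):
        for descriptor, keep in zip(row_feats, row_mask):
            if keep:
                selected.append(list(descriptor))
    return selected


def pool_descriptors(descriptors: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate average- and max-pooled descriptors into one 2*D vector.

    The pooling scheme is an assumption made for illustration only.
    """
    if not descriptors:
        raise ValueError("no descriptors were selected by the mask")
    count = len(descriptors)
    dim = len(descriptors[0])
    avg = [sum(d[k] for d in descriptors) / count for k in range(dim)]
    mx = [max(d[k] for d in descriptors) for k in range(dim)]
    return avg + mx


# Toy 2 x 2 feature map with 2-dimensional descriptors.
toy_map = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
toy_mask = [
    [1, 0],
    [0, 1],
]

kept = select_descriptors(toy_map, toy_mask)
vector = pool_descriptors(kept)
print(vector)  # [4.0, 5.0, 7.0, 8.0]
```

In this toy case the mask keeps two of the four positions, so the pooled vector has length 2 x D = 4 regardless of how many positions the mask selects, which is one way a pooled representation can stay compact without fully connected layers.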

Keywords: Fine-grained image recognition, Deep descriptor selection, Part localization

Review history: Received 17 June 2017, Revised 3 September 2017, Accepted 6 October 2017, Available online 13 October 2017, Version of Record 8 January 2018.

Official paper URL: https://doi.org/10.1016/j.patcog.2017.10.002