Detection-Oriented Backbone Trained from Near Scratch and Local Feature Refinement for Small Object Detection

Authors: Zhiwei Yan, Huicheng Zheng, Ye Li, Lvran Chen

Abstract

Current detection networks usually struggle to detect small-scale object instances because of spatial information loss and a lack of semantics. In this paper, we propose a one-stage detector named LocalNet, which pays particular attention to modeling detailed information. LocalNet is built upon our redesigned detection-oriented backbone, called long neck ResNet, which aims to preserve more detailed information in the early stages to enhance the representation of small objects. Furthermore, to enhance the semantics in the detection layers, we propose a local detail-context module, which reintroduces the detailed information lost in the network and exploits the local context within a restricted receptive field range. Moreover, we explore a method for training detectors nearly or entirely from scratch, which allows network structures to be designed with more freedom. With nearly \(94\%\) of the pretrained backbone parameters randomly reinitialized, our model improves the mAP of our baseline model from \(75.0\%\) to \(82.3\%\) on the PASCAL VOC dataset with an input size of \(300\times 300\) and achieves state-of-the-art accuracy. Even when trained from scratch, our model achieves \(80.8\%\) mAP, which is 5.8 percentage points higher than the mAP of our baseline model with a fully pretrained backbone.
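To make the "nearly \(94\%\) reinitialized" figure concrete, the sketch below shows one way such a fraction could be computed when only some backbone stages keep their pretrained weights. The stage names, the parameter counts, and the choice of which stages stay pretrained are all hypothetical; they are picked only so the result lands near the quoted figure. The abstract does not say which layers the authors reinitialize.

```python
def reinit_fraction(param_counts, keep_pretrained):
    """Fraction of backbone parameters that are randomly reinitialized.

    param_counts: dict mapping a stage name to its parameter count.
    keep_pretrained: set of stage names whose pretrained weights are kept.
    """
    total = sum(param_counts.values())
    reinit = sum(n for name, n in param_counts.items()
                 if name not in keep_pretrained)
    return reinit / total


# Hypothetical per-stage parameter counts (illustrative, not from the paper).
counts = {"stem": 600, "stage1": 400, "stage2": 2000,
          "stage3": 5000, "stage4": 8000}

# Assumption: only the earliest stages keep pretrained weights.
frac = reinit_fraction(counts, keep_pretrained={"stem", "stage1"})
print(f"reinitialized fraction: {frac:.4f}")
```

With these made-up counts, 15000 of 16000 parameters are reinitialized. Early layers hold few parameters, so even a small pretrained portion leaves most of the backbone to be trained from random initialization.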

Keywords: Small object detection, Detection backbone, Local feature representation, Receptive field

Paper URL: https://doi.org/10.1007/s11063-021-10493-y