Dilated convolution based RCNN using feature fusion for Low-Altitude aerial objects

作者：

Highlights：

•

摘要

The low-altitude aerial objects are hard to detect by existing deep learning-based object detectors because of the scale variance, small size, and occlusion-related problems. Deep learning-based detectors do not consider contextual information about the scale information of small-sized objects in low-altitude aerial images. This paper proposes a new system using the concept of receptive fields and fusion of feature maps to improve the efficiency of deep object detectors in low-altitude aerial images. A Dilated ResNet Module (DRM) is proposed, motivated from the trident networks, which works on dilated convolutions to study the contextual data for specifically small-sized objects. Applicability of this component builds the model strong towards scale variations in low-altitude aerial objects. Then, Feature Fusion Module (FFM) is created to offer semantic intelligence for better detection of low-altitude aerial objects. We have chosen vastly deployed faster RCNN as the base detector for the proposal of our technique. The dilated convolution-based RCNN using feature fusion (DCRFF) system is implemented on a benchmark low-altitude UAV based-object detection dataset, VisDrone, which contains multiple object categories of pedestrians, vehicles in crowded scenes. The experiments exhibit the enactment of the given detector on chosen low-altitude aerial object dataset. The proposed system of DCRFF achieves 35.04% mAP on the challenging VisDrone dataset, indicating an average improvement of 2% when compared.

论文关键词：Region proposal,Feature fusion,Dilated convolutions,Receptive field,Object detection

论文评审过程：Received 12 August 2021, Revised 12 January 2022, Accepted 28 March 2022, Available online 4 April 2022, Version of Record 9 April 2022.

论文官网地址：https://doi.org/10.1016/j.eswa.2022.117106