Prediction of the driver’s focus of attention based on feature visualization of a deep autonomous driving model

Authors:

Highlights:

Abstract:

Driver focus of attention (DFoA) is a fundamental research problem in human-like autonomous driving systems. However, most existing methods require large amounts of ground-truth DFoA data for training, which are difficult to collect. Inspired by the visual interpretability of neural networks, this study develops a DFoA prediction method based on feature visualization of a deep autonomous driving model, which requires no ground-truth DFoA data for training. We propose a multimodal spatiotemporal convolutional network with an attention mechanism for DFoA prediction. First, semantic and depth images are generated from RGB video frames so that a multipath convolutional network can learn the spatiotemporal information of successive images. A parameter-free attention mechanism with 3-D weights is incorporated, using an energy function to calculate the importance of each neuron. A graph attention network learns the semantic context features most relevant to driving behavior. The learned features are fused, and a convolutional long short-term memory (ConvLSTM) network models the evolution of the fused features across successive frames while accounting for historical scene variation. Finally, a novel feature visualization method predicts DFoA by visualizing the driving-behavior-relevant features. Experimental results demonstrate that the proposed method accurately predicts DFoA.
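The "parameter-free attention mechanism with 3-D weights" driven by a per-neuron energy function, as described in the abstract, resembles the SimAM formulation, in which each neuron's importance is derived from its squared deviation from the channel mean, with no learnable parameters. A minimal NumPy sketch under that assumption (the function name and the regularizer `lam` are illustrative, not from the paper):

```python
import numpy as np

def parameter_free_attention(x, lam=1e-4):
    """SimAM-style parameter-free 3-D attention over a feature map.

    x: feature map of shape (C, H, W).
    Returns a reweighted feature map of the same shape, where each
    neuron is scaled by a sigmoid of its energy (no learnable weights).
    """
    c, h, w = x.shape
    n = h * w - 1
    mu = x.mean(axis=(1, 2), keepdims=True)       # per-channel mean
    d = (x - mu) ** 2                             # squared deviation per neuron
    var = d.sum(axis=(1, 2), keepdims=True) / n   # per-channel variance estimate
    energy = d / (4.0 * (var + lam)) + 0.5        # importance of each neuron
    weights = 1.0 / (1.0 + np.exp(-energy))       # sigmoid -> 3-D attention weights
    return x * weights
```

Because the weights are a sigmoid of a non-negative energy, every neuron is scaled by a factor in (0.5, 1), so the mechanism reweights features without changing their sign.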

Keywords: Driver's attention, Attention mechanism, Autonomous driving, Feature visualization, Graph attention network

Article history: Received 2 December 2021, Revised 4 May 2022, Accepted 5 May 2022, Available online 18 May 2022, Version of Record 17 June 2022.

DOI: https://doi.org/10.1016/j.knosys.2022.109006