Three-step action search networks with deep Q-learning for real-time object tracking

Authors:

Highlights:

• A Three-Step Action Search network (TSAS) is designed for real-time object tracking.

• A collaborative learning strategy is developed to train the TSAS network so that it becomes more discriminative.

• The TSAS network is supervised with action-classification labels and is also formulated as reinforcement learning of cumulative rewards along the action steps and the time steps.

• Two action-value functions are approximated by deep networks to determine the best action for object tracking (see the illustrative sketch after this list).

• Deep reinforcement learning is exploited to deal effectively with the localization delay across the action steps and to exploit long-term information in videos efficiently.
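The highlights state that two action-value functions are approximated with deep networks and trained by Q-learning over cumulative rewards. The paper's exact TSAS architecture, action set, and reward design are not reproduced here; the following is a minimal PyTorch sketch of a generic deep Q-learning update with an online network and a second (target) network. The class `ActionValueNet`, the constants `NUM_ACTIONS` and `FEATURE_DIM`, the discount `gamma`, and the helper functions `q_learning_step` and `select_action` are all illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: a generic deep Q-learning update for discrete
# bounding-box actions (e.g. shifts/scalings/stop). This is NOT the exact
# TSAS network; sizes, action set, and hyperparameters are assumptions.
import torch
import torch.nn as nn

NUM_ACTIONS = 11        # assumed action set: translations, scalings, stop
FEATURE_DIM = 512       # assumed dimension of the state feature vector

class ActionValueNet(nn.Module):
    """Small MLP that maps a state feature to Q-values over the actions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEATURE_DIM, 256), nn.ReLU(),
            nn.Linear(256, NUM_ACTIONS),
        )

    def forward(self, state):
        return self.net(state)

q_net = ActionValueNet()              # online action-value function
target_net = ActionValueNet()         # second (target) action-value function
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-4)
gamma = 0.99                          # discount over action/time steps

def q_learning_step(state, action, reward, next_state, done):
    """One Bellman backup on a batch of transitions (s, a, r, s')."""
    q_sa = q_net(state).gather(1, action.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Bootstrapped target from the second network; zeroed at episode end.
        q_next = target_net(next_state).max(dim=1).values
        target = reward + gamma * q_next * (1.0 - done)
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def select_action(state):
    """Greedy action selection at tracking time: argmax_a Q(s, a)."""
    with torch.no_grad():
        return q_net(state).argmax(dim=1)
```

Keeping a separate target network for the bootstrapped term is the standard way to stabilize the Bellman backup; per the highlights, the paper additionally supervises the network with action-classification labels, which is not shown in this generic sketch.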


Keywords: Object tracking, Deep Q-learning, Action search network

Article history: Received 3 April 2019, Revised 28 November 2019, Accepted 24 December 2019, Available online 2 January 2020, Version of Record 15 January 2020.

DOI: https://doi.org/10.1016/j.patcog.2019.107188