A component-based video content representation for action recognition

Authors:

Highlights:

• An innovative framework is presented for recognizing human actions without the need for any human bounding-box annotations

• The proposed method goes beyond whole-frame recognition and estimates regions of interest in each frame

• All action components are identified, rather than a single bounding box in each frame

• A priority-based approach is proposed that learns how to utilize foreground, background, and motion in each activity class (a minimal sketch follows this list)

• State-of-the-art results are obtained on four challenging datasets
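The paper itself is not reproduced here, so the following is only a minimal sketch of what a learned, per-class weighting ("priority") over foreground, background, and motion streams could look like. All module names, feature sizes, and the fusion rule are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch (NOT the authors' method): three feature streams fused with
# learned per-class priorities. Backbones are stand-in linear encoders.
import torch
import torch.nn as nn


class PriorityFusion(nn.Module):
    def __init__(self, feat_dim=512, num_classes=101):
        super().__init__()
        # One lightweight encoder per stream (placeholders for real CNN/LSTM backbones).
        self.fg_net = nn.Sequential(nn.LazyLinear(feat_dim), nn.ReLU())
        self.bg_net = nn.Sequential(nn.LazyLinear(feat_dim), nn.ReLU())
        self.motion_net = nn.Sequential(nn.LazyLinear(feat_dim), nn.ReLU())
        # Learned priority of each class over the three streams (softmax-normalized).
        self.priority_logits = nn.Parameter(torch.zeros(num_classes, 3))
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, fg, bg, motion):
        # Per-stream class scores, stacked along a stream axis: (B, C, 3).
        scores = torch.stack(
            [self.classifier(self.fg_net(fg)),
             self.classifier(self.bg_net(bg)),
             self.classifier(self.motion_net(motion))],
            dim=-1)
        # Each class weighs the three streams according to its learned priority.
        weights = torch.softmax(self.priority_logits, dim=-1)  # (C, 3)
        return (scores * weights).sum(dim=-1)                  # (B, C)


if __name__ == "__main__":
    model = PriorityFusion()
    fg = torch.randn(2, 2048)      # pooled foreground features (assumed shape)
    bg = torch.randn(2, 2048)      # pooled background features (assumed shape)
    motion = torch.randn(2, 2048)  # pooled motion (e.g. optical-flow) features
    print(model(fg, bg, motion).shape)  # torch.Size([2, 101])
```

The per-class softmax weights make the relative importance of each component directly inspectable per activity class; how the actual paper parameterizes or trains this weighting is not specified in this metadata page.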

Keywords: Actionness likelihood, Action recognition, Action components, LSTM, Three-stream convolutional neural network

Article history: Received 1 June 2019, Revised 18 August 2019, Accepted 20 August 2019, Available online 29 August 2019, Version of Record 22 October 2019.

DOI: https://doi.org/10.1016/j.imavis.2019.08.009