Online learnable keyframe extraction in videos and its application with semantic word vector in action recognition

Highlights:

• We propose a new online learnable keyframe extraction module (OKFEM) to extract keyframes in a video sequence.

• The proposed OKFEM module is the first online module that can process frames sequentially and select the keyframes based on processed frames only.

• OKFEM can also be used for transfer learning. In the transfer learning setting, our method achieves state-of-the-art results on summarization datasets.

• We also propose a plugin module and a novel train/test strategy, ITTS, which for the first time incorporate a semantic word vector along with keyframe features as input to the classification models. The resulting action recognition approach further improves on state-of-the-art results on action recognition datasets.

Article history: Received 25 September 2020, Revised 28 June 2021, Accepted 21 August 2021, Available online 29 August 2021, Version of Record 5 September 2021.