Domain learning joint with semantic adaptation for human action recognition

Highlights:

• A novel knowledge adaptation framework called Semantic Adaptation based on Vector of Locally Max-Pooled deep learned Features (SA-VLMPF) is proposed for action recognition.

• A Cascaded Convolution Fusion Strategy (CCFS) is proposed to integrate the spatial features and temporal features effectively.

• A feature encoding method called Vector of Locally Max-Pooled deep learned Features (VLMPF) is introduced for long-range video representation.

• Extensive experiments on several public benchmark datasets verify the effectiveness of the proposed framework.

Keywords: Knowledge adaptation, Two-stream network, Video representation, Action recognition, Cascaded convolution fusion strategy

Article history: Received 9 February 2018, Revised 17 November 2018, Accepted 24 January 2019, Available online 29 January 2019, Version of Record 1 February 2019.