A Context Based Deep Temporal Embedding Network in Action Recognition

Authors: Maryam Koohzadi, Nasrollah Moghadam Charkari

Abstract

Long-term temporal representation methods incur high computational cost, which restricts their practical use in real-world applications. We propose a two-step deep residual method that efficiently learns long-term discriminative temporal representations while significantly reducing computational cost. In the first step, a novel self-supervised deep temporal embedding method embeds repetitive short-term motions into a cluster-friendly feature space. In the second step, an efficient temporal representation is built by leveraging the differences between the original data and its associated repetitive motion clusters, which serves as a novel deep residual method. Experimental results demonstrate that the proposed method achieves competitive results on several challenging human action recognition datasets, including UCF101, HMDB51, THUMOS14, and Kinetics-400.
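
The second step described in the abstract, taking differences between the data and its associated motion clusters, can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no architectural details, so the sketch assumes the short-term embeddings and cluster centroids are already available as NumPy arrays, uses nearest-centroid assignment, and sums residuals per cluster (a VLAD-style pooling chosen here purely for illustration). The function name `residual_representation` and its parameters are hypothetical.

```python
import numpy as np

def residual_representation(features, centroids):
    """Illustrative residual pooling (assumed, not from the paper).

    features:  (T, D) array of short-term motion embeddings for one video.
    centroids: (K, D) array of cluster centres for repetitive motions.
    Returns the cluster index per feature and a flattened (K * D,) vector
    of per-cluster summed residuals.
    """
    # Squared Euclidean distance from every feature to every centroid: (T, K).
    diffs = features[:, None, :] - centroids[None, :, :]
    sq_dist = (diffs ** 2).sum(axis=2)

    # Assign each feature to its nearest centroid.
    assign = sq_dist.argmin(axis=1)

    # Residual: difference between the feature and its assigned centroid.
    residuals = features - centroids[assign]

    # Accumulate residuals per cluster (np.add.at handles repeated indices).
    pooled = np.zeros(centroids.shape, dtype=float)
    np.add.at(pooled, assign, residuals)
    return assign, pooled.reshape(-1)
```

The pooled vector has a fixed length regardless of the number of short-term features, which is one plausible way such a residual could summarise a long video compactly.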

Keywords: Deep temporal embedding, Self-supervision, Residual technique, Two-step deep method, Long-term temporal representation

Review process:

Paper URL: https://doi.org/10.1007/s11063-020-10248-1