A Context Based Deep Temporal Embedding Network in Action Recognition

Authors: Maryam Koohzadi, Nasrollah Moghadam Charkari

Abstract

Long-term temporal representation methods incur high computational cost, which restricts their practical use in real-world applications. We propose a two-step deep residual method that efficiently learns a long-term discriminative temporal representation while significantly reducing computational cost. In the first step, a novel self-supervised deep temporal embedding method embeds repetitive short-term motions in a cluster-friendly feature space. In the second step, an efficient temporal representation is built from the differences between the original data and its associated repetitive motion clusters, which serves as a novel deep residual method. Experimental results demonstrate that the proposed method achieves competitive results on challenging human action recognition datasets, including UCF101, HMDB51, THUMOS14, and Kinetics-400.
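
The second step, which represents a video by the differences between its data and the associated motion clusters, can be illustrated with a small sketch. This is not the authors' network: it swaps in a plain VLAD-style residual aggregation over precomputed features, and it assumes that short-term segment features and cluster centers are already available. The function name `residual_descriptor` and the array shapes are hypothetical and chosen only for illustration.

```python
import numpy as np


def residual_descriptor(features, centers):
    """VLAD-style sketch (an assumption, not the paper's architecture).

    features: (T, D) array, one row per short-term segment embedding.
    centers:  (K, D) array, one row per repetitive-motion cluster center.
    Returns a flat, L2-normalized (K * D,) descriptor of summed residuals.
    """
    # Squared Euclidean distance from every segment to every center: (T, K).
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Hard-assign each segment to its nearest cluster.
    assign = d2.argmin(axis=1)

    num_clusters, dim = centers.shape
    desc = np.zeros((num_clusters, dim))
    for k in range(num_clusters):
        members = features[assign == k]
        if len(members):
            # Accumulate the differences between segments and their center.
            desc[k] = (members - centers[k]).sum(axis=0)

    flat = desc.ravel()
    norm = np.linalg.norm(flat)
    # Leave an all-zero descriptor unchanged to avoid dividing by zero.
    return flat / norm if norm > 0 else flat
```

For a video with T segments the descriptor has a fixed length of K * D regardless of T, which is one reason residual aggregation of this kind is a cheap way to summarize long sequences.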

Keywords: Deep temporal embedding, Self-supervision, Residual technique, Two-step deep method, Long-term temporal representation

Paper URL: https://doi.org/10.1007/s11063-020-10248-1