Triplet attention multiple spacetime-semantic graph convolutional network for skeleton-based action recognition

作者：Yanjing Sun, Han Huang, Xiao Yun, Bin Yang, Kaiwen Dong

摘要

Skeleton-based action recognition has recently attracted widespread attention in the field of computer vision. Previous studies on skeleton-based action recognition are susceptible to interferences from redundant video frames in judging complex actions but ignore the fact that the spatial-temporal features of different actions are extremely different. To solve these problems, we propose a triplet attention multiple spacetime-semantic graph convolutional network for skeleton-based action recognition (AM-GCN), which can not only capture the multiple spacetime-semantic feature from the video images to avoid limited information diversity from single-layer feature representation but can also improve the generalization ability of the network. We also present the triplet attention mechanism to apply an attention mechanism to different key points, key channels, and key frames of the actions, improving the accuracy and interpretability of the judgement of complex actions. In addition, different kinds of spacetime-semantic feature information are combined through the proposed fusion decision for comprehensive prediction in order to improve the robustness of the algorithm. We validate AM-GCN with two standard datasets, NTU-RGBD and Kinetics, and compare it with other mainstream models. The results show that the proposed model achieves tremendous improvement.

论文关键词：Action recognition, Graph convolutional neural network, Spacetime-semantic feature, Triplet attention, Fusion decision

论文评审过程：

论文官网地址：https://doi.org/10.1007/s10489-021-02370-x