Human Action Recognition Based on Spatial Temporal Adaptive Residual Graph Convolutional Networks with Attention Mechanism

Bibliographic Details
Published in: Chinese Control Conference, pp. 7622-7627
Main Authors: Song, Lu; He, Yi; Yuan, Huaqing; Du, Peng
Format: Conference Paper
Language: English
Published: Technical Committee on Control Theory, Chinese Association of Automation, 28.07.2024
ISSN: 1934-1768
Online Access: Full Text
Description
Summary: In human action recognition based on human skeleton data, Spatial-Temporal Graph Convolutional Networks (ST-GCN) have recently achieved remarkable performance. However, the ST-GCN model relies on a fixed skeleton graph and therefore captures only the local physical relationships among skeleton joints, which may not suit the diversity of action categories. To address this, we propose a spatial-temporal adaptive residual graph convolutional network with an attention mechanism. We enhance the flexibility of the graph structure by introducing adaptive graph convolution, which improves the model's generalization and allows it to be applied to a wider range of data samples. In addition, adding residual links to the graph convolutional layers of ST-GCN facilitates the fusion of local and global features of the skeleton data while mitigating network degradation. We further add an attention mechanism to ST-GCN to selectively emphasize useful features and suppress irrelevant ones. We evaluate the proposed approach on two large-scale datasets, NTU-RGB+D and Kinetics. The experimental results show that our approach outperforms several prominent prior methods, achieving higher accuracy and strong recognition performance.
DOI: 10.23919/CCC63176.2024.10662672
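
The abstract above describes three additions to ST-GCN: an adaptive (learnable) graph, residual links around the graph convolution, and an attention mechanism. The following is a minimal PyTorch-style sketch of how such a block might be organized, not the authors' implementation; the module names, the way the adaptive adjacency is added to the fixed skeleton graph, the temporal kernel size, and the squeeze-and-excitation style attention gate are all illustrative assumptions.

```python
# Minimal sketch (not the paper's code) of an adaptive residual GCN block
# with channel attention, assuming NTU-RGB+D-style input of shape (N, C, T, V):
# N clips, C coordinate channels, T frames, V skeleton joints.

import torch
import torch.nn as nn


class AdaptiveResidualGCNBlock(nn.Module):
    def __init__(self, in_channels, out_channels, A, reduction=4):
        super().__init__()
        # Fixed physical skeleton graph (V x V), stored as a buffer.
        self.register_buffer("A_fixed", A)
        # Learnable offset so the effective graph can deviate from the skeleton.
        self.A_adaptive = nn.Parameter(torch.zeros_like(A))

        # Spatial graph convolution implemented as a 1x1 channel mixing.
        self.gcn = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        # Temporal convolution along the frame axis (kernel 9 is a common choice).
        self.tcn = nn.Conv2d(out_channels, out_channels,
                             kernel_size=(9, 1), padding=(4, 0))

        # Residual path (1x1 conv when the channel count changes).
        self.residual = (nn.Identity() if in_channels == out_channels
                         else nn.Conv2d(in_channels, out_channels, kernel_size=1))

        # Squeeze-and-excitation style channel attention.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(out_channels, out_channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels // reduction, out_channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # x: (N, C, T, V)
        A = self.A_fixed + self.A_adaptive           # adaptive graph
        y = torch.einsum("nctv,vw->nctw", x, A)      # aggregate over joints
        y = self.tcn(self.gcn(y))
        y = y * self.attn(y)                         # attention re-weighting
        return self.relu(y + self.residual(x))       # residual fusion


# Usage example with random data for a 25-joint skeleton (NTU-RGB+D layout).
if __name__ == "__main__":
    A = torch.eye(25)                        # placeholder adjacency
    block = AdaptiveResidualGCNBlock(3, 64, A)
    out = block(torch.randn(8, 3, 50, 25))   # 8 clips, 3 coords, 50 frames
    print(out.shape)                         # torch.Size([8, 64, 50, 25])
```

In this sketch the attention gate re-weights channels before the residual addition; the paper's actual placement of attention, normalization layers, and the number of adjacency subsets may differ.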