Human skeleton pose and spatio-temporal feature-based activity recognition using ST-GCN

Bibliographic details
Published in: Multimedia Tools and Applications, Vol. 83, No. 5, pp. 12705-12730
Main authors: Lovanshi, Mayank; Tiwari, Vivek
Format: Journal Article
Language: English
Published: New York: Springer US, 1 February 2024
Springer Nature B.V.
ISSN: 1380-7501, 1573-7721
Description
Summary: Skeleton-based human activity recognition has recently attracted considerable attention because skeleton data is robust to changes in lighting, body size, dynamic camera perspectives, and complex backgrounds. The Spatial-Temporal Graph Convolutional Network (ST-GCN) model has been shown to learn spatial and temporal dependencies effectively from skeleton data. However, making efficient use of the detailed information in 3D skeletons remains a significant challenge, particularly for human joint motion patterns and the linkages between joints. This study proposes a custom ST-GCN model that operates on skeleton joints for human activity recognition. Special attention is given to spatial and temporal features, which are fed to the classification model for better pose estimation. A comparative study of activity recognition is presented on large-scale databases such as the NTU-RGB-D, Kinetics-Skeleton, and Florence 3D datasets. In Top-1 accuracy, the custom ST-GCN model outperforms the state-of-the-art method on NTU-RGB-D, Kinetics-Skeleton, and Florence 3D by 0.7%, 1.25%, and 1.92%, respectively. In Top-5 accuracy, it improves results by 0.5%, 0.73%, and 1.52%, respectively. These results suggest that the presented graph-based topologies capture the dynamics of a motion-based skeleton sequence better than some other approaches.
DOI: 10.1007/s11042-023-16001-9