Human skeleton pose and spatio-temporal feature-based activity recognition using ST-GCN
| Published in: | Multimedia Tools and Applications, Vol. 83, No. 5, pp. 12705-12730 |
|---|---|
| Main authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: Springer US, 1 February 2024; Springer Nature B.V |
| Subjects: | |
| ISSN: | 1573-7721, 1380-7501 |
| Summary: | Skeleton-based human activity recognition has recently attracted considerable attention because skeleton data is robust to changes in lighting, body size, camera viewpoint, and complicated backgrounds. The Spatial-Temporal Graph Convolutional Network (ST-GCN) model has been shown to learn spatial and temporal dependencies effectively from skeleton data. However, efficient use of in-depth 3D skeleton information remains a significant challenge, specifically for human joint motion patterns and joint linkage information. This study proposes a custom ST-GCN model that operates on skeleton joints for human activity recognition. Special attention is given to spatial and temporal features, which are fed to the classification model for better pose estimation. A comparative study of activity recognition is presented using large-scale datasets such as NTU-RGB-D, Kinetics-Skeleton, and Florence 3D. In Top-1 accuracy, the custom ST-GCN model outperforms the state-of-the-art method on NTU-RGB-D, Kinetics-Skeleton, and Florence 3D by 0.7%, 1.25%, and 1.92%, respectively. In Top-5 accuracy, it improves on that method by 0.5%, 0.73%, and 1.52%, respectively. These results indicate that the presented graph-based topologies capture the changing aspects of a motion-based skeleton sequence better than some of the other approaches. |
|---|---|
| Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| DOI: | 10.1007/s11042-023-16001-9 |