Masked Autoencoders for Spatial-Temporal Relationship in Video-Based Group Activity Recognition

Group Activity Recognition (GAR) is a challenging problem involving several intricacies. The core of GAR lies in delving into spatiotemporal features to generate appropriate scene representations. Previous methods, however, either feature a complex framework requiring individual action labels or nee...

Bibliographic Details
Published in:	IEEE Access, vol. 12, pp. 132084-132095
Main authors:	Yadav, Rajeshwar; Halder, Raju; Banda, Gourinath
Format:	Journal Article
Language:	English
Published:	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2024
Subjects:	Activity recognition; CAD; Cognitive tasks; Collaborative software; Complexity; Computer aided design; Crime; Datasets; Encoding; Feature recognition; Group activity recognition (GAR); hostage crime; IITP hostage dataset; Image reconstruction; Labels; masked autoencoder; Predictive models; Representations; Solid modeling; spatial and temporal interaction; Spatiotemporal data; Spatiotemporal phenomena; Video; Videos; vision transformer
ISSN:	2169-3536
Online access:	Full text