Visual perception enhancement fall detection algorithm based on vision transformer

Bibliographic Details
Published in: Signal, Image and Video Processing, Vol. 19, Issue 1, p. 18
Authors: Cai, Xi; Wang, Xiangcheng; Bao, Kexin; Chen, Yinuo; Jiao, Yin; Han, Guang
Format: Journal Article
Language: English
Published: London: Springer London, 01.01.2025
Springer Nature B.V.
ISSN: 1863-1703, 1863-1711
Online access: Full text
Description
Abstract: Fall detection is a crucial research topic in public healthcare. With advances in intelligent surveillance and deep learning, vision-based fall detection has gained significant attention. While numerous deep learning algorithms prevail in video fall detection due to their excellent feature processing capabilities, they exhibit limitations in handling long-term spatiotemporal dependencies. Recently, the Vision Transformer has shown considerable potential in integrating global information and understanding long-term spatiotemporal dependencies, thus providing novel solutions. In view of this, we propose a visual perception enhancement fall detection algorithm based on the Vision Transformer. We use Vision Transformer-Base as the baseline model for analyzing global motion information in videos. On this basis, to address the model’s difficulty in capturing subtle motion changes across video frames, we design an inter-frame motion information enhancement module. Concurrently, we propose a locality perception enhancement self-attention mechanism to overcome the model’s weak focus on local key features within each frame. Experimental results show that our method achieves notable performance on the Le2i and UR datasets, surpassing several advanced methods.
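The abstract names the inter-frame motion information enhancement module and the locality perception enhancement self-attention only at a high level. The following is a minimal, hypothetical PyTorch sketch of how such components could be wired around ViT-Base patch tokens; the frame-difference projection and the Gaussian distance bias are assumptions introduced purely for illustration, not the authors' implementation.

# Hypothetical PyTorch sketch (not the authors' released code): illustrates the two
# ideas named in the abstract -- enhancing inter-frame motion cues and biasing
# self-attention toward local patches -- under assumed designs.
import torch
import torch.nn as nn


class InterFrameMotionEnhancement(nn.Module):
    # Adds projected frame-to-frame token differences back onto the patch tokens,
    # making subtle motion between consecutive frames explicit (assumed design).
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, frames, patches, dim) per-frame patch embeddings
        diff = tokens[:, 1:] - tokens[:, :-1]                  # motion between frames
        diff = torch.cat([torch.zeros_like(tokens[:, :1]), diff], dim=1)
        return tokens + self.proj(diff)                        # residual enhancement


class LocalityBiasedSelfAttention(nn.Module):
    # Multi-head self-attention with an additive Gaussian distance bias, so each
    # patch attends more strongly to spatially nearby patches (illustrative choice).
    def __init__(self, dim: int, num_heads: int, grid_size: int, sigma: float = 2.0):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        ys, xs = torch.meshgrid(
            torch.arange(grid_size), torch.arange(grid_size), indexing="ij"
        )
        coords = torch.stack([ys.flatten(), xs.flatten()], dim=-1).float()   # (N, 2)
        dist2 = (coords[:, None] - coords[None, :]).pow(2).sum(-1)           # (N, N)
        # Added to the attention logits: ~0 for close patches, negative for far ones.
        self.register_buffer("bias", -dist2 / (2.0 * sigma ** 2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, patches, dim) tokens of a single frame
        out, _ = self.attn(x, x, x, attn_mask=self.bias)
        return out


if __name__ == "__main__":
    clips = torch.randn(2, 16, 49, 768)       # 2 clips, 16 frames, 7x7 patches, ViT-Base dim
    clips = InterFrameMotionEnhancement(768)(clips)
    frame_attn = LocalityBiasedSelfAttention(768, num_heads=12, grid_size=7)
    print(frame_attn(clips[:, 0]).shape)      # torch.Size([2, 49, 768])

In a full model, the motion-enhanced tokens would feed into the Transformer blocks whose attention layers carry the locality bias; the exact fusion and bias formulation used by the paper may differ.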
DOI: 10.1007/s11760-024-03652-w