Vision‐audio fusion SLAM in dynamic environments

Bibliographic Details
Published in: CAAI Transactions on Intelligence Technology, Vol. 8, No. 4, pp. 1364-1373
Authors: Zhang, Tianwei; Zhang, Huayan; Li, Xiaofei
Format: Journal Article
Language: English
Published: Beijing: John Wiley & Sons, Inc, 01.12.2023
ISSN: 2468-2322, 2468-6557
Online access: Full text
Abstract

Moving humans, agents, and other subjects pose many challenges to robot self-localisation and environment perception. To adapt to dynamic environments, SLAM researchers typically apply deep learning image segmentation models to eliminate these moving obstacles. However, such moving-obstacle segmentation methods consume too many computational resources for the on-board processors of mobile robots. In industrial mobile-robot collaboration scenarios, the noise emitted by mobile robots is easily picked up by on-board audio-sensing processors, and the direction of a sound source can be effectively estimated by sound source localisation algorithms, but estimating the distance to a sound source is difficult. Conversely, in visual perception the 3D structure of the scene is relatively easy to obtain, whereas recognising and segmenting moving objects is harder. To address these problems, a novel vision-audio fusion method is proposed that combines sound source localisation with a visual SLAM scheme, thereby eliminating the effect of dynamic obstacles on multi-agent systems. Experiments with several heterogeneous robots in different dynamic scenes show that the method delivers very stable self-localisation and environment-reconstruction performance.
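
The record contains only the abstract, but the fusion idea it describes (audio supplies the direction of a noisy robot, vision supplies scene depth) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, camera intrinsics, 80-pixel mask radius, and 5 m depth cut-off are all hypothetical, and a real system would use a calibrated direction-of-arrival estimator and a proper segmentation of the masked region.

import numpy as np

# Illustrative pinhole intrinsics (fx, fy, cx, cy); not values from the paper.
K = np.array([[525.0,   0.0, 319.5],
              [  0.0, 525.0, 239.5],
              [  0.0,   0.0,   1.0]])

def bearing_to_pixel(azimuth, elevation, K):
    """Project a sound-source bearing (radians, camera frame) onto the image.

    Assumes a camera frame with x right, y down, z forward, a microphone
    array extrinsically calibrated to the camera, and a source in front of
    the camera (w > 0)."""
    ray = np.array([np.cos(elevation) * np.sin(azimuth),
                    -np.sin(elevation),
                    np.cos(elevation) * np.cos(azimuth)])
    u, v, w = K @ ray
    return np.array([u / w, v / w])

def filter_dynamic_keypoints(keypoints, depth, azimuth, elevation, K,
                             pixel_radius=80.0, max_robot_depth=5.0):
    """Drop keypoints that lie inside a circle around the projected
    sound-source direction AND are close enough (per the depth image)
    to be the noisy robot; the survivors are treated as static points
    for SLAM tracking.

    keypoints : (N, 2) array of (u, v) pixel coordinates
    depth     : H x W depth image in metres"""
    centre = bearing_to_pixel(azimuth, elevation, K)
    kept = []
    for u, v in keypoints:
        inside = np.hypot(u - centre[0], v - centre[1]) <= pixel_radius
        near = depth[int(v), int(u)] <= max_robot_depth
        if not (inside and near):   # keep everything outside the masked region
            kept.append((u, v))
    return np.asarray(kept)

Combining the two cues this way plays to each sensor's strength noted in the abstract: the audio bearing compensates for vision's weak moving-object recognition, while the depth test compensates for audio's poor distance estimation by rejecting static background that merely lies along the same bearing.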
DOI: 10.1049/cit2.12206