Soft Actor-Critic for Navigation of Mobile Robots



Bibliographic details
Published in: Journal of Intelligent & Robotic Systems, Vol. 102, No. 2, p. 31
Main authors: de Jesus, Junior Costa; Kich, Victor Augusto; Kolling, Alisson Henrique; Grando, Ricardo Bedin; Cuadros, Marco Antonio de Souza Leite; Gamarra, Daniel Fernando Tello
Format: Journal Article
Language: English
Published: Dordrecht: Springer Netherlands, 01.06.2021
ISSN: 0921-0296, 1573-0409
Online access: Full text
Description
Abstract: This paper presents a study of two deep reinforcement learning techniques for mobile robot navigation: the Soft Actor-Critic (SAC) algorithm, which is compared with the Deep Deterministic Policy Gradient (DDPG) algorithm in the same setting. To make the robot reach a target in an environment, both networks take as inputs 10 laser range findings, the previous linear and angular velocities, and the relative position and angle of the mobile robot to the target. As outputs, the networks produce the linear and angular velocity of the mobile robot. The reward function was designed to give the agent a positive reward only when it reaches the target and a negative reward when it collides with any object. The proposed architecture was applied successfully in two simulated environments, and a comparison between the two techniques, based on the results obtained, demonstrated that the SAC algorithm outperforms the DDPG algorithm for mobile robot navigation (code available at https://github.com/dranaju/project).
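The state representation and sparse reward described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, reward magnitudes (`GOAL_REWARD`, `COLLISION_PENALTY`), and the exact input ordering are assumptions; only the input components (10 laser readings, previous velocities, relative distance and angle to the target) and the reward structure (positive only on reaching the target, negative on collision) come from the abstract.

```python
import numpy as np

# Assumed reward magnitudes; the paper does not state these values here.
GOAL_REWARD = 100.0
COLLISION_PENALTY = -10.0

def build_state(laser_ranges, prev_linear_v, prev_angular_v,
                dist_to_goal, angle_to_goal):
    """Concatenate the 10 laser range findings, the previous linear and
    angular velocities, and the relative distance/angle to the target
    into a single network input vector (shape (14,))."""
    laser_ranges = np.asarray(laser_ranges, dtype=np.float64)
    assert laser_ranges.shape == (10,), "expected 10 laser readings"
    return np.concatenate([
        laser_ranges,
        [prev_linear_v, prev_angular_v, dist_to_goal, angle_to_goal],
    ])

def reward(reached_goal: bool, collided: bool) -> float:
    """Sparse reward: positive only when the agent reaches the target,
    negative when it collides with any object, zero otherwise."""
    if reached_goal:
        return GOAL_REWARD
    if collided:
        return COLLISION_PENALTY
    return 0.0
```

Both SAC and DDPG would then map this 14-dimensional state to a continuous action, the robot's linear and angular velocity.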
DOI:10.1007/s10846-021-01367-5