Bidirectional Obstacle Avoidance Enhancement‐Deep Deterministic Policy Gradient: A Novel Algorithm for Mobile‐Robot Path Planning in Unknown Dynamic Environments

Bibliographic Details
Published in: Advanced Intelligent Systems, Vol. 6, No. 4
Main Authors: Xue, Junxiao, Zhang, Shiwen, Lu, Yafei, Yan, Xiaoran, Zheng, Yuanxun
Format: Journal Article
Language: English
Published: Weinheim: John Wiley & Sons, Inc., 01.04.2024
ISSN: 2640-4567
Description
Summary: Real‐time path planning in unknown dynamic environments is a significant challenge for mobile robots. Many researchers have attempted to solve this problem by introducing deep reinforcement learning, which trains agents through interaction with their environments. A method called BOAE‐DDPG, which combines the novel bidirectional obstacle avoidance enhancement (BOAE) mechanism with the deep deterministic policy gradient (DDPG) algorithm, is proposed to enhance the learning of obstacle avoidance. Inspired by the analysis of reaction advantage in dynamic psychology, the BOAE mechanism addresses obstacle‐avoidance reactions from both the state and the action perspectives. A cross‐attention mechanism is incorporated to sharpen attention to valuable obstacle‐avoidance information, while the obstacle‐avoidance behavioral advantage is estimated separately using a modified dueling network. Based on the learning goals of the mobile robot, new assistive reward factors are incorporated into the reward function to promote learning and convergence. The proposed method is validated through several experiments conducted on the Gazebo simulation platform. The results show that the proposed method is well suited to path planning in unknown environments and has excellent obstacle‐avoidance learning capability.

A novel deep reinforcement learning‐based method called bidirectional obstacle avoidance enhancement‐deep deterministic policy gradient (BOAE‐DDPG) for mobile‐robot path planning in unknown dynamic environments is proposed. The core BOAE mechanism is inspired by dynamic psychology, making BOAE‐DDPG better at learning obstacle avoidance without relying on environmental information. In addition, new assistive reward factors designed for path planning promote learning and convergence.
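For readers unfamiliar with dueling networks, the decomposition the abstract alludes to can be illustrated with a minimal NumPy sketch. This is not the authors' modified architecture; it only shows the standard dueling idea of combining a scalar state value V(s) with per-action advantages A(s, a), with a mean-subtraction baseline for identifiability. The function name and shapes are illustrative assumptions.

```python
import numpy as np

def dueling_q(state_value, advantages):
    """Combine a scalar state value V(s) with per-action advantages
    A(s, a) into Q-values: Q(s, a) = V(s) + (A(s, a) - mean_a A(s, a)).

    Subtracting the mean advantage removes the ambiguity between V and A
    (a constant could otherwise be shifted freely between them).
    """
    advantages = np.asarray(advantages, dtype=float)
    return state_value + (advantages - advantages.mean())

# Example: V(s) = 1.0, three candidate actions with advantages summing to 0,
# so the mean-subtraction leaves them unchanged.
q = dueling_q(1.0, [0.5, -0.5, 0.0])
# → array([1.5, 0.5, 1.0])
```

In BOAE‐DDPG the advantage stream is estimated separately for obstacle‐avoidance behavior; in a full implementation V and A would be neural-network heads trained jointly with the DDPG critic rather than fixed arrays as here.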
DOI: 10.1002/aisy.202300444