Aggressive and robust low-level control and trajectory tracking for quadrotors with deep reinforcement learning

Bibliographic details
Published in: Neural computing & applications, Vol. 37, No. 3, pp. 1223-1240
Main authors: Chen, Shiyu; Li, Yanjie; Lou, Yunjiang; Lin, Ke
Format: Journal Article
Language: English
Published: London: Springer London, 1 January 2025
Springer Nature B.V
ISSN:0941-0643, 1433-3058
Online access: Full text
Description
Summary: Executing accurate trajectory tracking tasks using a high-performance low-level controller is crucial for quadrotors to be applied in various scenarios, especially those involving uncertain disturbances. However, due to the uncertainties in disturbed environments, developing effective low-level controllers with traditional model-based control schemes is challenging. This paper presents an aggressive and robust reinforcement learning (RL)-based low-level control policy for quadrotors. The policy maps the observed quadrotor state directly to motor thrust commands, without requiring the quadrotor dynamics. Additionally, a trajectory generation pipeline is developed to improve the accuracy of trajectory tracking tasks based on differential flatness. With the learned low-level control policy, extensive simulations and real-world experiments are implemented to validate the performance of the policy. The results indicate that our RL-based low-level control policy outperforms traditional proportional–integral–derivative (PID) control methods and related learning-based policies in terms of accuracy and robustness, particularly in environments with uncertain disturbances. Furthermore, the proposed RL-based control policy exhibits an aggressive response in trajectory tracking, even when the speed of the desired trajectory is increased to 6 m/s. Moreover, the learned policy demonstrates strong vibration suppression capabilities and enables the quadrotor to recover to a hovering state from random initial conditions with shorter response time.
DOI:10.1007/s00521-024-10675-4