Aggressive and robust low-level control and trajectory tracking for quadrotors with deep reinforcement learning

Bibliographic details
Published in: Neural computing & applications, Vol. 37, No. 3, pp. 1223-1240
Main authors: Chen, Shiyu; Li, Yanjie; Lou, Yunjiang; Lin, Ke
Format: Journal Article
Language: English
Published: London: Springer London, 01.01.2025
Springer Nature B.V.
ISSN: 0941-0643, 1433-3058
Description
Summary: Executing accurate trajectory tracking tasks using a high-performance low-level controller is crucial for quadrotors to be applied in various scenarios, especially those involving uncertain disturbances. However, due to the uncertainties in disturbed environments, developing effective low-level controllers with traditional model-based control schemes is challenging. This paper presents an aggressive and robust reinforcement learning (RL)-based low-level control policy for quadrotors. The policy maps the observed quadrotor state directly to motor thrust commands, without requiring the quadrotor dynamics. Additionally, a trajectory generation pipeline is developed to improve the accuracy of trajectory tracking tasks based on differential flatness. With the learned low-level control policy, extensive simulations and real-world experiments are implemented to validate the performance of the policy. The results indicate that our RL-based low-level control policy outperforms traditional proportional–integral–derivative (PID) control methods and related learning-based policies in terms of accuracy and robustness, particularly in environments with uncertain disturbances. Furthermore, the proposed RL-based control policy exhibits an aggressive response in trajectory tracking, even when the speed of the desired trajectory is increased to 6 m/s. Moreover, the learned policy demonstrates strong vibration suppression capabilities and enables the quadrotor to recover to a hovering state from random initial conditions with shorter response time.
DOI: 10.1007/s00521-024-10675-4