Aggressive and robust low-level control and trajectory tracking for quadrotors with deep reinforcement learning
Executing accurate trajectory tracking tasks using a high-performance low-level controller is crucial for quadrotors to be applied in various scenarios, especially those involving uncertain disturbances. However, due to the uncertainties in disturbed environments, developing effective low-level cont...
Uloženo v:
| Vydáno v: | Neural computing & applications Ročník 37; číslo 3; s. 1223 - 1240 |
|---|---|
| Hlavní autoři: | , , , |
| Médium: | Journal Article |
| Jazyk: | angličtina |
| Vydáno: |
London
Springer London
01.01.2025
Springer Nature B.V |
| Témata: | |
| ISSN: | 0941-0643, 1433-3058 |
| On-line přístup: | Získat plný text |
| Tagy: |
Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
|
| Shrnutí: | Executing accurate trajectory tracking tasks using a high-performance low-level controller is crucial for quadrotors to be applied in various scenarios, especially those involving uncertain disturbances. However, due to the uncertainties in disturbed environments, developing effective low-level controllers with traditional model-based control schemes is challenging. This paper presents an aggressive and robust reinforcement learning (RL)-based low-level control policy for quadrotors. The policy maps the observed quadrotor state directly to motor thrust commands, without requiring the quadrotor dynamics. Additionally, a trajectory generation pipeline is developed to improve the accuracy of trajectory tracking tasks based on differential flatness. With the learned low-level control policy, extensive simulations and real-world experiments are implemented to validate the performance of the policy. The results indicate that our RL-based low-level control policy outperforms traditional proportional–integral–derivative (PID) control methods and related learning-based policies in terms of accuracy and robustness, particularly in environments with uncertain disturbances. Furthermore, the proposed RL-based control policy exhibits an aggressive response in trajectory tracking, even when the speed of the desired trajectory is increased to 6 m/s. Moreover, the learned policy demonstrates strong vibration suppression capabilities and enables the quadrotor to recover to a hovering state from random initial conditions with shorter response time. |
|---|---|
| Bibliografie: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ISSN: | 0941-0643 1433-3058 |
| DOI: | 10.1007/s00521-024-10675-4 |