Drone Elevation Control Based on Python-Unity Integrated Framework for Reinforcement Learning Applications

Reinforcement learning (RL) applications require a huge effort to become established in real-world environments, due to the injury and break down risks during interactions between the RL agent and the environment, in the online training process. In addition, the RL platform tools (e.g., Python OpenA...

Celý popis

Uložené v:
Podrobná bibliografia
Vydané v:Drones (Basel) Ročník 7; číslo 4; s. 225
Hlavní autori: Abbass, Mahmoud Abdelkader Bashery, Kang, Hyun-Soo
Médium: Journal Article
Jazyk:English
Vydavateľské údaje: Basel MDPI AG 01.04.2023
Predmet:
ISSN:2504-446X, 2504-446X
On-line prístup:Získať plný text
Tagy: Pridať tag
Žiadne tagy, Buďte prvý, kto otaguje tento záznam!
Popis
Shrnutí:Reinforcement learning (RL) applications require a huge effort to become established in real-world environments, due to the injury and break down risks during interactions between the RL agent and the environment, in the online training process. In addition, the RL platform tools (e.g., Python OpenAI’s Gym, Unity ML-Agents, PyBullet, DART, MoJoCo, RaiSim, Isaac, and AirSim), that are required to reduce the real-world challenges, suffer from drawbacks (e.g., the limited number of examples and applications, and difficulties in implementation of the RL algorithms, due to difficulties with the programing language). This paper presents an integrated RL framework, based on Python–Unity interaction, to demonstrate the ability to create a new RL platform tool, based on making a stable user datagram protocol (UDP) communication between the RL agent algorithm (developed using the Python programing language as a server), and the simulation environment (created using the Unity simulation software as a client). This Python–Unity integration process, increases the advantage of the overall RL platform (i.e., flexibility, scalability, and robustness), with the ability to create different environment specifications. The challenge of RL algorithms’ implementation and development is also achieved. The proposed framework is validated by applying two popular deep RL algorithms (i.e., Vanilla Policy Gradient (VPG) and Actor-Critic (A2C)), on an elevation control challenge for a quadcopter drone. The validation results for these experimental tests, prove the innovation of the proposed framework, to be used in RL applications, because both implemented algorithms achieve high stability, by achieving convergence to the required performance through the semi-online training process.
Bibliografia:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:2504-446X
2504-446X
DOI:10.3390/drones7040225