Optimizing Autonomous Vehicle Performance Using Improved Proximal Policy Optimization.

Bibliographic Details
Title: Optimizing Autonomous Vehicle Performance Using Improved Proximal Policy Optimization.
Authors: Bilban, Mehmet, İnan, Onur
Source: Sensors (1424-8220); March 2025, Vol. 25, Issue 6, p. 1941, 25 pp.
Subject Terms: LÉVY processes, CITY traffic, REINFORCEMENT learning, ACCELERATION (Mechanics), AUTONOMOUS vehicles, TRAFFIC signs & signals
Abstract: Highlights: This study introduces a new Lévy flight-integrated proximal policy optimization (LFPPO) algorithm for enhanced exploration and control in autonomous vehicles. It enables autonomous vehicles to overcome the exploration limitations of the standard PPO algorithm, providing improved decision-making and finer control over speed and acceleration, especially in complex urban environments. The LFPPO algorithm achieves superior performance and reliability in dynamic driving scenarios: experimental results in the CARLA simulator show that it significantly outperforms the standard PPO algorithm, achieving a 99% success rate (vs. 81%) and exhibiting robust, reliable autonomous driving with optimized speed and acceleration control in dynamic urban traffic.

What are the main findings? Integrating Lévy flight into the proximal policy optimization (PPO) algorithm (LFPPO) significantly improves the algorithm's exploration capabilities, allowing it to escape local minima and achieve better policy optimization. Experimental results in the CARLA simulator show that the LFPPO algorithm achieves a 99% success rate, compared with 81% for the standard PPO algorithm, demonstrating enhanced stability and higher rewards in autonomous vehicle decision-making.

What is the implication of the main finding? The LFPPO algorithm enables autonomous vehicles to make more reliable and safer decisions in complex, dynamic traffic conditions, enhancing overall driving performance. Integrating real-time data streaming with Apache Kafka lets autonomous systems process and react to dynamic environments more efficiently, improving real-time decision-making.

Autonomous vehicles must make quick, accurate decisions to operate efficiently in complex and dynamic urban traffic, which requires a reliable and stable learning mechanism. The proximal policy optimization (PPO) algorithm stands out among reinforcement learning (RL) methods for its consistent learning process, ensuring stable decisions under varying conditions while avoiding abrupt deviations during execution. However, PPO often becomes trapped in a limited search space during policy updates, restricting its adaptability to environmental changes and its exploration of alternative strategies. To overcome this limitation, we integrated Lévy flight's chaotic, wide-ranging exploration into the PPO algorithm, helping it explore larger solution spaces and reducing the risk of getting stuck in local minima. In this study, we collected real-time data such as speed, acceleration, traffic sign positions, vehicle locations, traffic light statuses, and distances to surrounding objects from the CARLA simulator, processed via Apache Kafka. These data were analyzed by both the standard PPO algorithm and our novel Lévy flight-enhanced PPO (LFPPO) algorithm. While PPO offers consistency, its limited exploration hampers adaptability. The LFPPO algorithm overcomes this by combining Lévy flight's chaotic exploration with Apache Kafka's real-time data streaming, an advancement absent from state-of-the-art methods. Tested in CARLA, the LFPPO algorithm achieved a 99% success rate compared with the PPO algorithm's 81%, demonstrating superior stability and rewards. These innovations enhance safety and RL exploration, with the LFPPO algorithm reducing collisions to 1% versus the PPO algorithm's 19%, advancing autonomous driving beyond existing techniques. [ABSTRACT FROM AUTHOR]
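The abstract does not describe exactly how Lévy flight is wired into PPO, so the following is only a minimal Python sketch of the general idea: Lévy-distributed step lengths drawn with Mantegna's algorithm and added as heavy-tailed exploration noise to a Gaussian PPO policy's action sample. The function names (levy_step, sample_action) and the levy_scale parameter are illustrative assumptions, not details from the paper.

    import numpy as np
    from scipy.special import gamma as G

    def levy_step(beta: float = 1.5, size: int = 1) -> np.ndarray:
        # Mantegna's algorithm: a ratio of scaled Gaussians yields
        # Lévy-stable step lengths with tail index `beta`.
        sigma = (G(1 + beta) * np.sin(np.pi * beta / 2)
                 / (G((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
        u = np.random.normal(0.0, sigma, size)
        v = np.random.normal(0.0, 1.0, size)
        return u / np.abs(v) ** (1 / beta)  # many small steps, rare long jumps

    def sample_action(mu: np.ndarray, std: np.ndarray,
                      levy_scale: float = 0.05) -> np.ndarray:
        # Standard Gaussian PPO action sample plus a small Lévy perturbation,
        # giving the policy occasional long jumps out of a local basin.
        return np.random.normal(mu, std) + levy_scale * levy_step(size=mu.shape[0])

For example, sample_action(np.array([0.3, 0.0]), np.array([0.1, 0.1])) would perturb a hypothetical [throttle, steer] action; most perturbations stay near zero, while the heavy tail of the Lévy distribution produces the rare large excursions that broaden exploration beyond what Gaussian noise alone provides.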
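Similarly, the record states only that CARLA telemetry (speed, acceleration, traffic light states, distances) was streamed through Apache Kafka. Below is a minimal sketch of consuming such a stream with the kafka-python client; the topic name, broker address, and message schema are assumptions, since the abstract gives no configuration details.

    import json
    from kafka import KafkaConsumer  # kafka-python client

    # Hypothetical topic, broker, and message fields for illustration only.
    consumer = KafkaConsumer(
        "carla-telemetry",
        bootstrap_servers="localhost:9092",
        auto_offset_reset="latest",
        value_deserializer=lambda m: json.loads(m.decode("utf-8")),
    )

    for msg in consumer:
        obs = msg.value  # e.g. {"speed": 8.2, "accel": 0.4, "tl_state": "red"}
        # hand `obs` to the RL agent's observation builder / policy step here

Decoupling the simulator from the learner this way lets the agent consume the latest environment state asynchronously, which is the real-time responsiveness benefit the authors attribute to the Kafka integration.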
Copyright of Sensors (14248220) is the property of MDPI and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Complementary Index
ISSN: 1424-8220
DOI: 10.3390/s25061941