Comparative Study of Reinforcement Learning Performance Based on PPO and DQN Algorithms
| Title: | Comparative Study of Reinforcement Learning Performance Based on PPO and DQN Algorithms |
|---|---|
| Authors: | Ce Tan |
| Source: | Applied and Computational Engineering. 175:30-36 |
| Publisher Information: | EWA Publishing, 2025. |
| Publication Year: | 2025 |
| Description: | With the rapid development of artificial intelligence technology, reinforcement learning (RL) has emerged as a core research direction in the field of intelligent decision-making. Among numerous reinforcement learning algorithms, Deep Q-Network (DQN) and Proximal Policy Optimization (PPO) have gained widespread attention due to their outstanding performance. These two algorithms have been extensively applied in areas such as autonomous driving and game AI, demonstrating strong adaptability and effectiveness. However, despite numerous application instances, systematic comparative studies on their specific performance differences remain relatively scarce. This study aims to systematically evaluate the differences between DQN and PPO algorithms across four performance metrics: convergence speed, stability, sample efficiency, and computational complexity. By combining theoretical analysis and experimental validation, we selected classic reinforcement learning environments, CartPole (for discrete action testing) and CarRacing (for continuous action evaluation), to conduct a detailed performance assessment. The results show that DQN exhibits superior performance in discrete action environments with faster convergence and higher sample efficiency, whereas PPO demonstrates greater stability and adaptability in continuous action environments. (See the sketch after this record.) |
| Document Type: | Article |
| ISSN: | 2755-273X, 2755-2721 |
| DOI: | 10.54254/2755-2721/2025.ast24879 |
| Accession Number: | edsair.doi...........d7d8875f1810b568742508f75862ce8a |
| Database: | OpenAIRE |
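The record does not include the authors' code. As a rough illustration of the kind of comparison the abstract describes, the sketch below sets up DQN and PPO on CartPole (discrete actions) and PPO on CarRacing (continuous actions) using Gymnasium and Stable-Baselines3. The library choice, policy networks, timestep budget, and evaluation routine are assumptions for illustration, not details taken from the paper.

```python
# Minimal DQN-vs-PPO comparison sketch (assumed setup, not the paper's code).
# Requires: gymnasium, gymnasium[box2d] (for CarRacing), stable-baselines3.
import gymnasium as gym
from stable_baselines3 import DQN, PPO
from stable_baselines3.common.evaluation import evaluate_policy

TOTAL_TIMESTEPS = 50_000  # illustrative training budget, not from the paper


def train_and_evaluate(algo_cls, env_id, policy="MlpPolicy"):
    """Train one agent and return mean/std episodic return over 20 eval episodes."""
    env = gym.make(env_id)
    model = algo_cls(policy, env, verbose=0)
    model.learn(total_timesteps=TOTAL_TIMESTEPS)
    mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=20)
    env.close()
    return mean_reward, std_reward


if __name__ == "__main__":
    # Discrete-action benchmark: both algorithms handle CartPole directly.
    for algo in (DQN, PPO):
        mean_r, std_r = train_and_evaluate(algo, "CartPole-v1")
        print(f"CartPole-v1  {algo.__name__}: {mean_r:.1f} +/- {std_r:.1f}")

    # Continuous-action benchmark: CarRacing exposes a Box action space that
    # vanilla DQN cannot act in without discretization, so only PPO is run here.
    # The environment version string depends on the installed Gymnasium release.
    mean_r, std_r = train_and_evaluate(PPO, "CarRacing-v3", policy="CnnPolicy")
    print(f"CarRacing-v3 PPO: {mean_r:.1f} +/- {std_r:.1f}")
```

With such a harness, convergence speed and sample efficiency could be tracked by logging episodic return against environment steps during training, while stability could be gauged from the variance across repeated runs; the paper's actual metrics and hyperparameters should be taken from the article itself.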