Comparative Study of Reinforcement Learning Performance Based on PPO and DQN Algorithms

Bibliographic Details
Title: Comparative Study of Reinforcement Learning Performance Based on PPO and DQN Algorithms
Authors: Ce Tan
Source: Applied and Computational Engineering. 175:30-36
Publisher Information: EWA Publishing, 2025.
Publication Year: 2025
Description: With the rapid development of artificial intelligence technology, reinforcement learning (RL) has emerged as a core research direction in the field of intelligent decision-making. Among the many reinforcement learning algorithms, Deep Q-Network (DQN) and Proximal Policy Optimization (PPO) have gained widespread attention due to their outstanding performance. These two algorithms have been extensively applied in areas such as autonomous driving and game AI, demonstrating strong adaptability and effectiveness. However, despite numerous applications, systematic comparative studies of their specific performance differences remain relatively scarce. This study systematically evaluates the differences between the DQN and PPO algorithms across four performance metrics: convergence speed, stability, sample efficiency, and computational complexity. Combining theoretical analysis with experimental validation, we selected classic reinforcement learning environments, CartPole (for discrete action testing) and CarRacing (for continuous action evaluation), to conduct a detailed performance assessment. The results show that DQN performs better in discrete action environments, with faster convergence and higher sample efficiency, whereas PPO demonstrates greater stability and adaptability in continuous action environments.
Document Type: Article
ISSN: 2755-273X; 2755-2721
DOI: 10.54254/2755-2721/2025.ast24879
Accession Number: edsair.doi...........d7d8875f1810b568742508f75862ce8a
Database: OpenAIRE
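Illustrative example: to make the benchmark described in the abstract concrete, the following is a minimal sketch of how a DQN-versus-PPO comparison on CartPole might be run, assuming Gymnasium and Stable-Baselines3 as tooling. The paper does not state its actual implementation, hyperparameters, or evaluation protocol, so this is an assumption-laden sketch rather than the authors' method.

```python
# Hypothetical DQN vs. PPO comparison on CartPole, assuming Gymnasium and
# Stable-Baselines3 as tooling; not the paper's actual code or hyperparameters.
import gymnasium as gym
from stable_baselines3 import DQN, PPO
from stable_baselines3.common.evaluation import evaluate_policy

def train_and_evaluate(algo_cls, env_id="CartPole-v1", steps=50_000):
    """Train one algorithm on a discrete-action task and report mean episodic return."""
    env = gym.make(env_id)
    model = algo_cls("MlpPolicy", env, verbose=0)
    model.learn(total_timesteps=steps)
    mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=20)
    env.close()
    return mean_reward, std_reward

for algo in (DQN, PPO):
    mean, std = train_and_evaluate(algo)
    print(f"{algo.__name__}: mean return {mean:.1f} +/- {std:.1f}")
```

For the continuous-action CarRacing setting mentioned in the abstract, PPO can be applied directly, whereas the value-based DQN requires a discretized action space (for example, Gymnasium's CarRacing environment exposes a continuous=False option).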