Efficient evolutionary curriculum learning for scalable multi-agent reinforcement learning

Bibliographic Details
Published in:Journal of King Saud University. Computer and information sciences Vol. 37; no. 8; pp. 243 - 25
Main Authors: Li, Chao, Liu, Yanfei, Wang, Jieling, Wang, Zhong, Wang, Chengjin, Tian, Qi
Format: Journal Article
Language:English
Published: Cham: Springer International Publishing, 01.10.2025
Springer Nature B.V
Springer
ISSN:1319-1578, 2213-1248
Description
Summary:Scalability is a key factor for the deployment of multi-agent reinforcement learning (MARL), and effectively controlling computational cost while improving scalability has become a core challenge in this field. Although existing research has extensively explored how to enhance the scalability of algorithms, it generally overlooks the computational overhead and training costs of models. To address this critical challenge, we introduce Efficient Evolutionary Curriculum Learning (E2CL). This method integrates evolutionary ideas into curriculum learning, training multiple populations of agents in parallel at each stage. Through competition, selection, crossover, and mutation operations, the fittest populations are selected for training in the next stage, thereby overcoming the bottleneck in traditional curriculum learning whereby the optimal agent from the previous stage struggles to adapt to new stages. To further optimize training efficiency, we propose a dual-dimensional hybrid probability sampling mechanism for population selection based on population rewards and training stability. This mechanism effectively reduces redundant competition during the evolutionary process, significantly lowering the computational overhead of the algorithm. Additionally, we design a hybrid curriculum knowledge transfer method that combines model reload and buffer reuse to maximize the utilization of knowledge between curricula, enhancing jumpstart performance in new curriculum stages. Simulation results on cooperative and adversarial tasks show that E2CL significantly reduces training cost while maintaining performance, establishing a balance between performance and resource consumption. E2CL offers an efficient paradigm for multi-agent collaborative training and a practical solution for MARL training in resource-sensitive scenarios.
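The record does not give the paper's exact formulation of the dual-dimensional hybrid probability sampling. A minimal illustrative sketch, assuming the two dimensions are a population's mean reward and a stability proxy (inverse of reward dispersion), mixed by a hypothetical weight `alpha` (the function and parameter names here are illustrative, not from the paper):

```python
import random
import statistics

def hybrid_selection_probs(reward_histories, alpha=0.5):
    """Combine mean reward and training stability into selection probabilities.

    reward_histories: one list of recent episode rewards per population.
    alpha: hypothetical mixing weight between the reward and stability dimensions.
    """
    means = [statistics.mean(h) for h in reward_histories]
    # Stability proxy: lower reward spread -> more stable training -> higher score.
    stabilities = [1.0 / (1.0 + statistics.pstdev(h)) for h in reward_histories]

    def normalize(xs):
        s = sum(xs)
        return [x / s for x in xs] if s > 0 else [1.0 / len(xs)] * len(xs)

    r, st = normalize(means), normalize(stabilities)
    # Hybrid probability: weighted mix of the two normalized dimensions.
    return [alpha * a + (1 - alpha) * b for a, b in zip(r, st)]

# Sample populations for the next curriculum stage in proportion to the hybrid score,
# so fewer pairwise competitions are needed among low-scoring populations.
histories = [[10, 12, 11], [5, 20, 2], [8, 8, 9]]
probs = hybrid_selection_probs(histories)
survivors = random.choices(range(len(histories)), weights=probs, k=2)
```

Under this assumed scoring, a population with erratic rewards (the second history) is down-weighted relative to steadier ones even when its peak reward is higher, which matches the abstract's stated goal of pruning redundant competition.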
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
DOI:10.1007/s44443-025-00215-y