DMRL: A distributed multi-agent reinforcement learning algorithm for imbalanced classification

Bibliographic Details
Published in:Knowledge-Based Systems Vol. 327; p. 114101
Main Authors: Ji, Yixin, Jing, Chao
Format: Journal Article
Language:English
Published: Elsevier B.V. 09.10.2025
ISSN:0950-7051
Description
Summary:Traditional imbalanced classification methods rely on resampling or assigning different weights to different classes to improve the recognition rate for minority classes. However, these methods ignore the importance of adaptability, particularly as the degree of imbalance increases, and are therefore significantly limited in dynamically selecting the optimal classification strategy. To tackle this issue, we propose a distributed multi-agent reinforcement learning (DMRL) method for imbalanced classification, which models the classification problem as a multi-agent Markov decision process within a distributed computing scheme. DMRL implements three key schemes: 1) a multi-agent classification scheme based on an improved double deep Q-network (MCSQ) that dynamically optimizes the imbalanced classification strategy through the reward function and importance weights; 2) a prioritized experience replay-based scheme for sampling agents' experience (PERS) that learns preferentially from important samples; 3) a distributed computing scheme based on the multi-agent centralized training and decentralized execution (CTDE) paradigm (DCSM) that combines distributed computing with multi-agent CTDE to improve learning efficiency. Finally, we evaluate DMRL on public datasets such as IMDB, CIFAR-10, Fashion-MNIST, and MNIST under various imbalance ratios. Experimental results demonstrate that DMRL outperforms eight representative methods, with a maximum improvement of 8.9% in G-mean and 10.7% in F-measure over the second-best method. We also study how the value of the reward function, the number of agents and servers, and the effectiveness of prioritized experience replay affect the performance of DMRL.
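The abstract names the building blocks of MCSQ and PERS: a double deep Q-network whose reward favors minority-class recognition, and prioritized experience replay whose importance-sampling weights scale the TD loss. The following is a minimal sketch of how those pieces typically fit together; the network shape, the reward scale rho, and all function names are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: double-DQN target with an imbalance-aware reward
# and prioritized-experience-replay (PER) loss weighting. The reward
# shape and all hyperparameters are assumptions for illustration.
from collections import namedtuple

import torch
import torch.nn as nn

Transition = namedtuple("Transition", "state action reward next_state done")

class QNet(nn.Module):
    """Small Q-network: state (feature vector) -> one Q-value per class/action."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, x):
        return self.net(x)

def imbalance_reward(action: int, label: int, minority: set, rho: float = 2.0) -> float:
    """Asymmetric reward: outcomes on minority-class samples are scaled by
    rho (> 1 as the imbalance ratio grows), pushing the agent to recognise
    the minority class. The paper's exact reward function is not given here."""
    correct = 1.0 if action == label else -1.0
    return rho * correct if label in minority else correct

def double_dqn_loss(online: QNet, target: QNet, batch, is_weights, gamma=0.99):
    """Double DQN: the online net selects next actions, the target net
    evaluates them; per-sample squared TD errors are weighted by the PER
    importance-sampling weights `is_weights`."""
    s = torch.stack([t.state for t in batch])
    a = torch.tensor([t.action for t in batch])
    r = torch.tensor([t.reward for t in batch])
    s2 = torch.stack([t.next_state for t in batch])
    done = torch.tensor([t.done for t in batch], dtype=torch.float32)

    q = online(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        a2 = online(s2).argmax(dim=1, keepdim=True)   # select with online net
        q2 = target(s2).gather(1, a2).squeeze(1)      # evaluate with target net
        y = r + gamma * (1.0 - done) * q2
    td_error = y - q
    w = torch.as_tensor(is_weights, dtype=torch.float32)
    loss = (w * td_error.pow(2)).mean()
    return loss, td_error.detach().abs()              # |TD| refreshes priorities
```

In a PER buffer, the returned |TD| magnitudes would be written back as the sampled transitions' new priorities, so the most informative (often minority-class) transitions are replayed more frequently, which is the intuition the abstract gives for PERS.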
DOI:10.1016/j.knosys.2025.114101