Multiagent-based deep reinforcement learning framework for multi-asset adaptive trading and portfolio management

The highly dynamic nature of stock markets has motivated researchers to propose various supervised learning models to assist investors to optimize financial performance. Machine learning models have been used to predict price trends, and approaches have been proposed for portfolio management. Howeve...

Full description

Saved in:
Bibliographic Details
Published in:Neurocomputing (Amsterdam) Vol. 594; p. 127800
Main Authors: Cheng, Li-Chen, Sun, Jian-Shiou
Format: Journal Article
Language:English
Published: Elsevier B.V 14.08.2024
Subjects:
ISSN:0925-2312, 1872-8286
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The highly dynamic nature of stock markets has motivated researchers to propose various supervised learning models to assist investors to optimize financial performance. Machine learning models have been used to predict price trends, and approaches have been proposed for portfolio management. However, these studies focus on only one kind of financial issue, and the methods proposed exhibit poor generalizability. We address these problems with a multi-agent portfolio adaptive trading framework based on reinforcement learning to create an automated trading system with the best trading strategy that can be achieved by long-short situation judgment and adaptive capital allocation. We use the TD3 algorithm in the multi-agent algorithm to mitigate the overestimation and overfitting exhibited by traditional value functions and improve training stability. Experimental results show that the proposed framework outperforms single-agent reinforcement learning algorithms while achieving more stable returns. •This study seeks to optimize trading strategy and portfolio management issues using RL.•The multi-agent architecture allows agents to explore chaotic environments aside from their own capital assets.•The adaptive TPM autonomously rates assets, optimizing Sortino ratio rewards through dynamic trading weights.•Experimental results show that the proposed framework outperforms single-agent reinforcement learning algorithms.
ISSN:0925-2312
1872-8286
DOI:10.1016/j.neucom.2024.127800