Joint Multi-objective Optimization for Radio Access Network Slicing Using Multi-agent Deep Reinforcement Learning

Bibliographic details
Published in: IEEE Transactions on Vehicular Technology, vol. 72, no. 9, pp. 1-16
Main authors: Zhou, Guorong; Zhao, Liqiang; Zheng, Gan; Xie, Zhijie; Song, Shenghui; Chen, Kwang-Cheng
Format: Journal Article
Language: English
Published: New York: IEEE, 1 September 2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN:0018-9545, 1939-9359
Online access: Full text
Description
Abstract: Radio access network (RAN) slices can provide various customized services for next-generation wireless networks, so multiple performance metrics of different types of RAN slices need to be jointly optimized. However, existing work on the multi-objective optimization problem (MOOP) for RAN slicing treats it only in scalar form, which makes simultaneous optimization difficult to achieve. In this paper, we consider a non-scalar MOOP for RAN slicing with three types of slices, i.e., the high-bandwidth slice, the low-delay slice, and the wide-coverage slice, over the same underlying physical network. We jointly optimize the throughput, the transmission delay, and the coverage area through user-oriented dynamic deployment of virtual base stations (vBSs) and through sub-channel and power allocation. An improved multi-agent deep deterministic policy gradient (IMADDPG) algorithm, characterized by centralized training and distributed execution, is proposed to solve this non-deterministic polynomial-time hard (NP-hard) problem. The rank voting method is introduced in the inference process to obtain near-Pareto-optimal solutions. Simulation results verify that the proposed scheme achieves better performance than the traditional scalar utility method and other benchmark algorithms. The proposed scheme can flexibly approach any point on the Pareto boundary, whereas the traditional scalar method only subjectively approaches one of the Pareto-optimal solutions. Furthermore, our proposal strikes a compelling tradeoff among the three types of RAN slices owing to the non-dominance between Pareto-optimal solutions.
DOI:10.1109/TVT.2023.3268671
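
Illustrative note: the abstract relies on two ideas, non-dominance between Pareto-optimal solutions and a rank voting step that picks among them. The Python sketch below shows one generic way these could look. It is not the paper's algorithm: the abstract does not specify how rank voting is performed, so the Borda-style rank sum used here is an assumption, and the function names (`dominates`, `pareto_front`, `rank_vote`) and the sample numbers are hypothetical. All objectives are treated as maximized, so delay is entered with a negative sign.

```python
from typing import List, Sequence

Point = Sequence[float]  # (throughput, -delay, coverage); all maximized


def dominates(a: Point, b: Point) -> bool:
    """Return True if a Pareto-dominates b (no worse everywhere, better somewhere)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[int]:
    """Return the indices of points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


def rank_vote(points: List[Point], candidates: List[int]) -> int:
    """Assumed Borda-style vote: each objective ranks the candidates
    (rank 0 = best) and the candidate with the lowest rank sum wins."""
    totals = {i: 0 for i in candidates}
    for k in range(len(points[0])):
        ordered = sorted(candidates, key=lambda i: points[i][k], reverse=True)
        for rank, i in enumerate(ordered):
            totals[i] += rank
    # Break ties by the smaller index so the result is deterministic.
    return min(candidates, key=lambda i: (totals[i], i))


# Made-up candidate solutions: (throughput, -delay, coverage).
points = [
    (10.0, -5.0, 3.0),
    (8.0, -2.0, 4.0),
    (7.0, -6.0, 2.0),  # dominated by the first candidate
    (9.0, -3.0, 5.0),
]
front = pareto_front(points)      # [0, 1, 3]
winner = rank_vote(points, front)  # 3
```

A scalar utility would instead collapse the three objectives into one weighted sum before comparing candidates, which fixes the trade-off in advance. Keeping the objectives separate and voting only among non-dominated candidates leaves that choice open until inference.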