Non-Orthogonal Age-Optimal Information Dissemination in Vehicular Networks: A Meta Multi-Objective Reinforcement Learning Approach

Bibliographic details
Published in:	IEEE Transactions on Mobile Computing, Vol. 23, No. 10, pp. 9789-9803
Main authors:	Al-Habob, Ahmed A.; Tabassum, Hina; Waqar, Omer
Format:	Magazine Article
Language:	English
Published:	IEEE, 1 October 2024
Subjects:	Age-of-information (AoI); deep reinforcement learning (DRL); Measurement; meta deep reinforcement learning (meta-DRL); multi-objective optimization; Optimization; Power demand; Reinforcement learning; Resource management; successive interference cancellation (SIC); Task analysis; Unicast
ISSN:	1536-1233, 1558-0660

Description
Summary:	This article considers minimizing the age-of-information (AoI) and transmit power consumption in a vehicular network, where a roadside unit (RSU) provides timely updates about a set of physical processes to vehicles. We consider non-orthogonal multi-modal information dissemination, which is based on superposed message transmission from the RSU and successive interference cancellation (SIC) at the vehicles. The formulated problem is a multi-objective mixed-integer nonlinear programming problem, so a Pareto-optimal front is very challenging to obtain. First, we leverage the weighted-sum approach to decompose the multi-objective problem into a set of single-objective sub-problems, one for each predefined objective-preference weight. Then, we develop a hybrid deep Q-network (DQN)-deep deterministic policy gradient (DDPG) model to solve each sub-problem for its predefined objective-preference weight. The DQN optimizes the decoding order, while the DDPG solves the continuous power allocation. This model, however, needs to be retrained for each sub-problem. We therefore present a two-stage meta-multi-objective reinforcement learning solution that estimates the Pareto front with a few fine-tuning update steps, without retraining the model for each sub-problem. Simulation results illustrate the efficacy of the proposed solutions compared to existing benchmarks and show that the meta-multi-objective reinforcement learning model estimates a high-quality Pareto frontier with reduced training time.
DOI:	10.1109/TMC.2024.3367166
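
Illustrative note: the weighted-sum decomposition mentioned in the summary can be sketched on a toy example. The sketch below is not the authors' method. It assumes a small, made-up set of candidate operating points (average AoI, transmit power) and replaces the DQN-DDPG solver with a brute-force minimum over that set; the names `weighted_sum_front`, `candidates`, and `weights` are hypothetical. Each preference weight defines one single-objective sub-problem, and the distinct winners form an estimate of the Pareto front.

```python
from typing import List, Tuple

# (average AoI, transmit power); both objectives are minimized.
Point = Tuple[float, float]


def weighted_sum_front(candidates: List[Point], weights: List[float]) -> List[Point]:
    """For each preference weight w, pick the candidate that minimizes
    w * AoI + (1 - w) * power, and return the distinct winners in order."""
    front: List[Point] = []
    for w in weights:
        # One single-objective sub-problem per weight (brute force here,
        # standing in for a learned solver).
        best = min(candidates, key=lambda p: w * p[0] + (1.0 - w) * p[1])
        if best not in front:
            front.append(best)
    return front


# Hypothetical operating points, invented for illustration only.
candidates = [(2.0, 9.0), (3.0, 5.0), (5.0, 3.0), (6.0, 6.0), (9.0, 1.0)]
weights = [0.1, 0.4, 0.6, 0.9]  # larger w puts more emphasis on AoI

front = weighted_sum_front(candidates, weights)
print(front)
```

In this toy data the point (6.0, 6.0) is dominated by (3.0, 5.0) and (5.0, 3.0), so no weight selects it. Note that a weighted sum only recovers points on the convex part of a Pareto front, which is a known limitation of the scalarization itself.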