Knowledge- and Model-Driven Deep Reinforcement Learning for Efficient Federated Edge Learning: Single- and Multi-Agent Frameworks

Detailed Bibliography
Published in: IEEE Transactions on Machine Learning in Communications and Networking, Volume 3, pp. 332-352
Main Authors: Li, Yangchen; Zhao, Lingzhi; Wang, Tianle; Ding, Lianghui; Yang, Feng
Format: Journal Article
Language: English
Published: IEEE, 2025
ISSN: 2831-316X
Description
Summary: In this paper, we investigate federated learning (FL) efficiency improvement in practical edge computing systems, where edge workers have non-independent and identically distributed (non-IID) local data, as well as dynamic and heterogeneous computing and communication capabilities. We consider a general FL algorithm with configurable parameters, including the number of local iterations, mini-batch sizes, step sizes, aggregation weights, and quantization parameters, and provide a rigorous convergence analysis. We formulate a joint optimization problem for FL worker selection and algorithm parameter configuration to minimize the final test loss subject to time and energy constraints. The resulting problem is a complicated stochastic sequential decision-making problem with an implicit objective function and unknown transition probabilities. To address these challenges, we propose knowledge/model-driven single-agent and multi-agent deep reinforcement learning (DRL) frameworks. We transform the primal problem into a Markov decision process (MDP) for the single-agent DRL framework and a decentralized partially-observable Markov decision process (Dec-POMDP) for the multi-agent DRL framework. We develop efficient single-agent and multi-agent asynchronous advantage actor-critic (A3C) approaches to solve the MDP and Dec-POMDP, respectively. In both frameworks, we design a knowledge-based reward to facilitate effective DRL and propose a model-based stochastic policy to tackle the mixed discrete-continuous actions and large action spaces. To reduce the computational complexities of policy learning and execution, we introduce a segmented actor-critic architecture for the single-agent DRL and a distributed actor-critic architecture for the multi-agent DRL. Numerical results demonstrate the effectiveness and advantages of the proposed frameworks in enhancing FL efficiency.
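
The abstract refers to a stochastic policy over mixed discrete-continuous actions (e.g., worker selection and local-iteration counts alongside continuous step sizes). The following is a minimal, hypothetical PyTorch sketch of one way such a policy head could be structured; it is not the authors' implementation, and all class, dimension, and parameter names are illustrative assumptions.

# Hypothetical sketch (not from the paper): a policy with discrete heads for
# worker selection and local-iteration count, and a Gaussian head for step sizes.
import torch
import torch.nn as nn
from torch.distributions import Categorical, Normal

class MixedActionPolicy(nn.Module):
    def __init__(self, obs_dim, n_workers, n_iter_options):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU())
        # Discrete heads: which worker to schedule, how many local iterations.
        self.select_logits = nn.Linear(128, n_workers)
        self.iter_logits = nn.Linear(128, n_iter_options)
        # Continuous head: mean and log-std of a per-worker step-size action.
        self.step_mu = nn.Linear(128, n_workers)
        self.step_log_std = nn.Parameter(torch.zeros(n_workers))

    def forward(self, obs):
        h = self.backbone(obs)
        select_dist = Categorical(logits=self.select_logits(h))       # worker choice
        iter_dist = Categorical(logits=self.iter_logits(h))           # local-iteration count
        step_dist = Normal(self.step_mu(h), self.step_log_std.exp())  # continuous step sizes
        return select_dist, iter_dist, step_dist

# Example rollout step: sample a joint action and accumulate its log-probability
# for an actor-critic (e.g., A3C-style) policy-gradient update. In practice the
# continuous sample would be squashed/clipped to a valid step-size range.
policy = MixedActionPolicy(obs_dim=16, n_workers=4, n_iter_options=5)
obs = torch.randn(1, 16)
sel_d, it_d, st_d = policy(obs)
sel, it, step = sel_d.sample(), it_d.sample(), st_d.sample()
log_prob = sel_d.log_prob(sel) + it_d.log_prob(it) + st_d.log_prob(step).sum(-1)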
DOI: 10.1109/TMLCN.2025.3534754