A Communication-Efficient Algorithm for Federated Multilevel Stochastic Compositional Optimization
| Published in: | IEEE Transactions on Signal Processing, Volume 72; pp. 1 - 15 |
|---|---|
| Main Authors: | , |
| Format: | Journal Article |
| Language: | English |
| Publication details: | New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.01.2024 |
| Subjects: | |
| ISSN: | 1053-587X, 1941-0476 |
| Online access: | Get full text |
| Summary: | Recent literature shows a growing interest in the integration of federated learning (FL) and multilevel stochastic compositional optimization (MSCO), which arises in meta-learning and reinforcement learning. It is known that a bottleneck in FL is communication efficiency when compared to fully decentralized methods. Yet, it remains unclear whether communication-efficient algorithms exist for MSCO in distributed settings. Single-loop schemes, used in recent methods, structurally require a communication round for each fixed batch of samples generated, so their communication complexity is no less than their sample complexity and hence lower bounded by $\mathcal O(1/\epsilon)$ for reaching an $\epsilon$-accurate solution. This paper studies distributed MSCO of a smooth, strongly convex objective with smooth gradients. Based on a double-loop strategy, we propose Federated Stochastic Compositional Gradient Extrapolation (FedSCGE), a federated MSCO method that attains an optimal $\mathcal O(\log\frac{1}{\epsilon})$ communication complexity while maintaining an (almost) optimal $\tilde{\mathcal O}(1/\epsilon)$ sample complexity, both of which are independent of the number of clients, making the approach scalable. Our analysis leverages the random gradient extrapolation method (RGEM) in [19] and generalizes it by overcoming the biased gradients of MSCO. To the best of our knowledge, our work is the first to show the simultaneous attainability of both complexity bounds for distributed MSCO. (A standard formulation of the MSCO objective and an illustrative sketch follow the record below.) |
|---|---|
| DOI: | 10.1109/TSP.2024.3392351 |
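
The abstract refers to the multilevel compositional objective without displaying it. For reference, the standard $T$-level MSCO formulation from the literature is the nested expectation below; the notation ($f_i$, $g_i$, $\xi_i$) is the conventional one and is an assumption here, since the record itself does not spell the formula out.

```latex
% Standard T-level stochastic compositional objective (conventional
% notation; this exact display does not appear in the record).
\min_{x \in \mathbb{R}^d} \; F(x)
  = f_1\bigl(f_2(\cdots f_T(x)\cdots)\bigr),
\qquad
f_i(y) = \mathbb{E}_{\xi_i}\bigl[g_i(y;\,\xi_i)\bigr],
\quad i = 1, \dots, T.
```

Because each layer $f_i$ is accessible only through samples $g_i(\cdot;\xi_i)$, plain stochastic estimates of the composed gradient are biased, which is the obstacle the abstract says the analysis overcomes. The complexity separation claimed in the abstract rests on the double-loop structure: samples are consumed in communication-free inner loops, while synchronization happens once per outer iteration. The sketch below is a minimal illustration of that accounting on a toy quadratic with generic local SGD and averaging; it is not the paper's FedSCGE algorithm, and every name and parameter in it is an illustrative assumption.

```python
import numpy as np

# Toy illustration (NOT the paper's FedSCGE): a generic double-loop
# federated scheme on a strongly convex quadratic, counting how many
# communication rounds versus stochastic samples are consumed.
# All names (n_clients, inner_steps, ...) are illustrative assumptions.

rng = np.random.default_rng(0)
d, n_clients = 5, 4
# Each client holds a strongly convex quadratic 0.5 * x^T A_i x.
A = [np.diag(rng.uniform(1.0, 2.0, d)) for _ in range(n_clients)]

def noisy_grad(i, x):
    """Stochastic gradient of client i's quadratic, with additive noise."""
    return A[i] @ x + 0.01 * rng.standard_normal(d)

x = np.ones(d)
comm_rounds, samples = 0, 0
outer_iters, inner_steps, lr = 20, 50, 0.05

for t in range(outer_iters):
    # Inner loop: each client runs many local stochastic steps;
    # samples accumulate with NO communication.
    local = []
    for i in range(n_clients):
        xi = x.copy()
        for _ in range(inner_steps):
            xi -= lr * noisy_grad(i, xi)
            samples += 1
        local.append(xi)
    # Outer loop: one synchronization (averaging) per outer iteration,
    # so communication grows only with the outer-loop count.
    x = np.mean(local, axis=0)
    comm_rounds += 1

print(f"communication rounds: {comm_rounds}, samples: {samples}")
# A single-loop method would instead communicate once per sampling
# step, forcing communication complexity >= sample complexity.
```

Running the sketch reports `comm_rounds` equal to the outer-loop count while `samples` grows as `outer_iters * inner_steps * n_clients`, mirroring the $\mathcal O(\log\frac{1}{\epsilon})$ versus $\tilde{\mathcal O}(1/\epsilon)$ separation the abstract claims for the double-loop approach.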