Unified Analysis of Decentralized Gradient Descent: A Contraction Mapping Framework

Detailed bibliographic record
Title: Unified Analysis of Decentralized Gradient Descent: A Contraction Mapping Framework
Authors: Larsson, Erik G.; Michelusi, Nicolo
Source: IEEE Open Journal of Signal Processing, vol. 6, pp. 507-529
Subjects: Topology, Noise measurement, Convergence, Noise, Vectors, Symmetric matrices, Signal processing algorithms, Network topology, Optimization, Heuristic algorithms, Decentralized machine learning, decentralized optimization, decentralized gradient descent (DGD), diffusion, stochastic DGD and diffusion, federated learning, communication noise, random topology, link failures, convergence, contractions, fixed points, mean Hessian theorem
Description: The decentralized gradient descent (DGD) algorithm, and its sibling, diffusion, are workhorses in decentralized machine learning, distributed inference and estimation, and multi-agent coordination. We propose a novel, principled framework for the analysis of DGD and diffusion for strongly convex, smooth objectives, and arbitrary undirected topologies, using contraction mappings coupled with a result called the mean Hessian theorem (MHT). The use of these tools yields tight convergence bounds, both in the noise-free and noisy regimes. While these bounds are qualitatively similar to results found in the literature, our approach using contractions together with the MHT decouples the algorithm dynamics (how quickly the algorithm converges to its fixed point) from its asymptotic convergence properties (how far the fixed point is from the global optimum). This yields a simple, intuitive analysis that is accessible to a broader audience. Extensions are provided to multiple local gradient updates, time-varying step sizes, noisy gradients (stochastic DGD and diffusion), communication noise, and random topologies.
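For readers unfamiliar with the algorithm the abstract analyzes, a minimal sketch of the standard DGD update on a toy problem may help: each agent mixes its neighbors' iterates with a doubly stochastic weight matrix and then takes a local gradient step. The problem data, ring topology, weights, and step size below are illustrative assumptions, not taken from the paper; the fixed-point behavior it prints (convergence to a neighborhood of the global optimum under a constant step size) is exactly the property the contraction-mapping analysis quantifies.

```python
import numpy as np

# Toy decentralized problem: agent i minimizes f_i(x) = 0.5 * (x - t_i)^2,
# so the global objective (1/n) * sum_i f_i has its optimum at mean(t).
n = 3
targets = np.array([1.0, 2.0, 6.0])   # illustrative local data

# Doubly stochastic mixing matrix for a fully connected 3-agent network
# (illustrative choice; any symmetric doubly stochastic W works here).
W = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])

alpha = 0.05          # constant step size
x = np.zeros(n)       # one scalar iterate per agent

for _ in range(2000):
    grad = x - targets            # local gradients of the quadratics f_i
    x = W @ x - alpha * grad      # DGD: consensus mixing + local gradient step

# With a constant step size, the iterates converge to a fixed point that
# lies in a neighborhood of the global optimum mean(targets) = 3.0.
print(x)
```

Shrinking alpha moves the fixed point closer to the optimum but slows the contraction, which is the dynamics-versus-asymptotics trade-off the abstract's decoupled analysis makes explicit.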
File description: electronic
Access URL: https://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-214240
https://doi.org/10.1109/OJSP.2025.3557332
Database: SwePub
ISSN: 2644-1322
DOI: 10.1109/OJSP.2025.3557332