Unified Analysis of Decentralized Gradient Descent: A Contraction Mapping Framework

Bibliographic Details
Title: Unified Analysis of Decentralized Gradient Descent: A Contraction Mapping Framework
Authors: Larsson, Erik G, Michelusi, Nicolo
Source: IEEE Open Journal of Signal Processing. 6:507-529
Subject Terms: Topology, Noise measurement, Convergence, Noise, Vectors, Symmetric matrices, Signal processing algorithms, Network topology, Optimization, Heuristic algorithms, Decentralized machine learning, decentralized optimization, decentralized gradient descent (DGD), diffusion, stochastic DGD and diffusion, federated learning, communication noise, random topology, link failures, convergence, contractions, fixed points, mean Hessian theorem
Description: The decentralized gradient descent (DGD) algorithm, and its sibling, diffusion, are workhorses in decentralized machine learning, distributed inference and estimation, and multi-agent coordination. We propose a novel, principled framework for the analysis of DGD and diffusion for strongly convex, smooth objectives, and arbitrary undirected topologies, using contraction mappings coupled with a result called the mean Hessian theorem (MHT). The use of these tools yields tight convergence bounds, both in the noise-free and noisy regimes. While these bounds are qualitatively similar to results found in the literature, our approach using contractions together with the MHT decouples the algorithm dynamics (how quickly the algorithm converges to its fixed point) from its asymptotic convergence properties (how far the fixed point is from the global optimum). This yields a simple, intuitive analysis that is accessible to a broader audience. Extensions are provided to multiple local gradient updates, time-varying step sizes, noisy gradients (stochastic DGD and diffusion), communication noise, and random topologies.
File Description: electronic
Access URL: https://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-214240
https://doi.org/10.1109/OJSP.2025.3557332
Database: SwePub
ISSN: 2644-1322
DOI: 10.1109/OJSP.2025.3557332
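The record above does not reproduce the update rule, but the standard DGD iteration its abstract refers to is well known: each agent mixes its iterate with those of its neighbors and takes a local gradient step. A minimal sketch follows, assuming scalar local quadratics, a ring topology, and a hypothetical doubly stochastic weight matrix; none of these specifics are taken from the paper.

```python
import numpy as np

# Sketch of decentralized gradient descent (DGD) on a ring of n agents.
# Agent i holds a local quadratic f_i(x) = 0.5 * (x - b_i)**2, so the global
# objective (1/n) * sum_i f_i is minimized at mean(b). The mixing matrix W
# below is an illustrative doubly stochastic choice for a ring topology.

def dgd(b, alpha=0.05, iters=2000):
    n = len(b)
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 0.5                 # self weight
        W[i, (i - 1) % n] = 0.25      # left neighbor on the ring
        W[i, (i + 1) % n] = 0.25      # right neighbor on the ring
    x = np.zeros(n)                   # one scalar iterate per agent
    for _ in range(iters):
        grad = x - b                  # local gradients of the f_i
        x = W @ x - alpha * grad      # DGD: mix with neighbors, then step
    return x

b = np.array([1.0, 2.0, 3.0, 4.0])
x = dgd(b)
```

With a constant step size, the iterates contract to a fixed point near (but not exactly at) the global optimum mean(b) = 2.5, with a bias that shrinks with alpha; this is the fixed-point-versus-optimum gap the abstract's contraction-mapping analysis quantifies.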