Distributed Stochastic Gradient Tracking Algorithm With Variance Reduction for Non-Convex Optimization

Published in: IEEE Transactions on Neural Networks and Learning Systems, vol. 34, no. 9, pp. 5310-5321
Main Authors: Jiang, Xia; Zeng, Xianlin; Sun, Jian; Chen, Jie
Format: Journal Article
Language: English
Published: Piscataway: IEEE, 01.09.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 2162-237X, 2162-2388
Description
Summary: This article proposes a distributed stochastic algorithm with variance reduction for general smooth non-convex finite-sum optimization, which has wide applications in the signal processing and machine learning communities. In the distributed setting, a large number of samples are allocated to multiple agents in the network. Each agent computes local stochastic gradients and communicates with its neighbors to seek the global optimum. In this article, we develop a modified variance reduction technique to deal with the variance introduced by stochastic gradients. Combining gradient tracking and variance reduction techniques, this article proposes a distributed stochastic algorithm, the gradient tracking algorithm with variance reduction (GT-VR), to solve large-scale non-convex finite-sum optimization over multiagent networks. A complete and rigorous proof shows that the GT-VR algorithm converges to first-order stationary points with an $O(1/k)$ convergence rate. In addition, we provide a complexity analysis of the proposed algorithm. Compared with some existing first-order methods, the proposed algorithm has a lower $\mathcal{O}(PM\epsilon^{-1})$ gradient complexity under some mild conditions. Numerical simulations comparing GT-VR with state-of-the-art algorithms verify the efficiency of the proposed algorithm.
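The record contains only the abstract, not the paper's update rules. As a rough illustration of the two ingredients the abstract names — gradient tracking over a multiagent network and a variance-reduced stochastic gradient estimator — the sketch below combines a standard gradient-tracking update with an SVRG-style estimator on a toy least-squares problem. The ring topology, step size, snapshot schedule, and estimator are all assumptions for illustration, not the paper's GT-VR algorithm (which targets non-convex objectives).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: each of P agents holds M local samples of a
# least-squares term f_{i,s}(x) = 0.5 * (a_{i,s}^T x - b_{i,s})^2.
P, M, d = 4, 20, 5
A = rng.normal(size=(P, M, d))
b = rng.normal(size=(P, M))

def local_full_grad(i, x):
    # Average gradient over agent i's M local samples.
    return A[i].T @ (A[i] @ x - b[i]) / M

def sample_grad(i, s, x):
    # Gradient of a single sample s held by agent i.
    return (A[i, s] @ x - b[i, s]) * A[i, s]

# Doubly stochastic mixing matrix for a ring of P agents (assumed topology).
W = np.zeros((P, P))
for i in range(P):
    W[i, i] = 0.5
    W[i, (i - 1) % P] = 0.25
    W[i, (i + 1) % P] = 0.25

alpha, T, snap_period = 0.05, 400, M
x = np.zeros((P, d))                      # local iterates, one row per agent
snap = x.copy()                           # SVRG-style snapshots
snap_grad = np.array([local_full_grad(i, snap[i]) for i in range(P)])
v = snap_grad.copy()                      # variance-reduced local gradients
y = v.copy()                              # gradient trackers (track avg of v)

for k in range(T):
    x = W @ x - alpha * y                 # consensus mixing + descent step
    if k % snap_period == 0:              # periodically refresh snapshots
        snap = x.copy()
        snap_grad = np.array([local_full_grad(i, snap[i]) for i in range(P)])
    v_new = np.empty_like(v)
    for i in range(P):
        s = rng.integers(M)
        # SVRG-style estimator: unbiased for the local full gradient,
        # with variance shrinking as x_i approaches the snapshot.
        v_new[i] = (sample_grad(i, s, x[i])
                    - sample_grad(i, s, snap[i]) + snap_grad[i])
    y = W @ y + v_new - v                 # gradient-tracking update
    v = v_new

avg_grad = np.mean([local_full_grad(i, x.mean(axis=0)) for i in range(P)],
                   axis=0)
print(np.linalg.norm(avg_grad))           # global gradient norm at the average
```

The tracking update `y = W @ y + v_new - v` preserves the invariant that the trackers sum to the sum of the local estimators, so each agent descends along an estimate of the global gradient rather than only its local one.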
DOI: 10.1109/TNNLS.2022.3170944