Federated Learning Using Three-Operator ADMM

Bibliographic Details
Published in:	IEEE Journal of Selected Topics in Signal Processing, vol. 17, no. 1, pp. 205-221
Main Authors:	Kant, Shashi; Silva, Jose Mairton B. da; Fodor, Gabor; Goransson, Bo; Bengtsson, Mats; Fischione, Carlo
Format:	Journal Article
Language:	English
Published:	New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.01.2023
Subjects:	Algorithms; Base stations; Communication efficiency; Computer Science with specialization in Computer Communication; Convergence; Cost function; Data models; Datasets; Distributed machine learning; Federated learning; Heterogeneity; Machine learning; Numerical models; Optimization; Servers; Signal processing algorithms; Three-operator ADMM
ISSN:	1932-4553, 1941-0484

Description
Summary:	Federated learning (FL) has emerged as a distributed machine learning paradigm that avoids transmitting the data generated on the users' side. Although data are not transmitted, edge devices have to deal with limited communication bandwidth, data heterogeneity, and straggler effects due to the limited computational resources of users' devices. A prominent approach to overcoming such difficulties is FedADMM, which is based on the classical two-operator consensus alternating direction method of multipliers (ADMM). A common assumption of FL algorithms, including FedADMM, is that they learn a global model using data only on the users' side and not on the edge server. However, in edge learning, the server is expected to be near the base station and often has direct access to rich datasets. In this paper, we argue that it is much more beneficial to leverage the rich data on the edge server than to utilize only user datasets. Specifically, we show that the mere application of FL with an additional virtual user node representing the data on the edge server is inefficient. We propose FedTOP-ADMM, which generalizes FedADMM and is based on a three-operator ADMM-type technique that exploits a smooth cost function on the edge server to learn a global model in parallel to the edge devices. Our numerical experiments indicate that FedTOP-ADMM achieves a substantial gain of up to 33% in communication efficiency in reaching a desired test accuracy relative to FedADMM, including FedADMM with a virtual user on the edge server.
DOI:	10.1109/JSTSP.2022.3221681
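
Illustration:	The summary refers to the classical two-operator consensus ADMM that underlies FedADMM. The following is a minimal sketch of that textbook scheme only, not of the paper's FedADMM or FedTOP-ADMM. It assumes a toy scalar regression b ≈ x * a with a least-squares cost per user; the function name `consensus_admm`, the penalty `rho`, and the sample data are hypothetical choices made for illustration. Each user solves a small local problem, the server averages the results, and only model values (never raw samples) are exchanged.

```python
def consensus_admm(user_data, rho=5.0, num_rounds=200):
    """Textbook consensus ADMM (scaled form) for a scalar least-squares fit.

    user_data: list with one entry per user, each a list of (a, b) samples.
    Returns the consensus model z that fits b ~= z * a over all users.
    """
    # Each user precomputes local sums; the raw samples stay on the device.
    stats = [(sum(a * a for a, _ in d), sum(a * b for a, b in d))
             for d in user_data]
    n = len(stats)
    z = 0.0            # global (consensus) model held by the server
    u = [0.0] * n      # scaled dual variable, one per user

    for _ in range(num_rounds):
        # User step: minimize local cost + (rho/2) * (x - z + u_i)^2.
        x = [(sab + rho * (z - u_i)) / (saa + rho)
             for (saa, sab), u_i in zip(stats, u)]
        # Server step: average the users' models plus their duals.
        z = sum(x_i + u_i for x_i, u_i in zip(x, u)) / n
        # Dual step: accumulate each user's disagreement with z.
        u = [u_i + x_i - z for u_i, x_i in zip(u, x)]
    return z


# Hypothetical data: three users, each holding a few (a, b) samples.
data = [[(1.0, 2.0), (2.0, 4.1)], [(1.0, 1.9), (3.0, 6.2)], [(2.0, 3.8)]]
z = consensus_admm(data)
print(z)
```

On this toy problem the consensus value should approach the pooled least-squares solution, that is, the sum of a*b over all samples divided by the sum of a*a. The three-operator variant described in the summary additionally handles a smooth cost on the edge server, which this sketch does not attempt to model.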