HADFL: Heterogeneity-aware Decentralized Federated Learning Framework

Published in: 2021 58th ACM/IEEE Design Automation Conference (DAC), pp. 1-6
Main authors: Cao, Jing; Lian, Zirui; Liu, Weihong; Zhu, Zongwei; Ji, Cheng
Format: Conference paper
Language: English
Published: IEEE, 5 December 2021
Online access: full text at https://ieeexplore.ieee.org/document/9586101

Abstract: Federated learning (FL) supports training models on geographically distributed devices. However, traditional FL systems adopt a centralized synchronous strategy, which puts high pressure on communication and challenges model generalization. Existing optimizations of FL either fail to speed up training on heterogeneous devices or suffer from poor communication efficiency. In this paper, we propose HADFL, a framework that supports decentralized asynchronous training on heterogeneous devices. Devices train the model locally on their own data, using heterogeneity-aware numbers of local steps. In each aggregation cycle, devices are selected probabilistically to perform model synchronization and aggregation. Compared with a traditional FL system, HADFL relieves the central server's communication pressure, efficiently utilizes heterogeneous computing power, and achieves maximum speedups of 3.15x over decentralized FedAvg and 4.68x over the PyTorch distributed training scheme, with almost no loss of convergence accuracy.
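The abstract names two mechanisms: heterogeneity-aware local steps and probabilistic selection of devices for synchronization and aggregation. The following is a minimal Python sketch of how such a scheme could be structured; the speed-proportional step rule, the Bernoulli selection probability, and the sample-weighted averaging are illustrative assumptions of ours, not the formulas from the paper.

```python
import random

def local_steps(device_speed, base_steps=100):
    # Heterogeneity-aware local steps (assumed rule): faster devices run
    # proportionally more local training steps per aggregation cycle, so
    # slow devices do not stall the round.
    return max(1, int(base_steps * device_speed))

def aggregate(models, weights):
    # Weighted average of model parameters (dicts of name -> value).
    total = sum(weights)
    return {
        name: sum(w * m[name] for w, m in zip(weights, models)) / total
        for name in models[0]
    }

def aggregation_cycle(devices, select_prob=0.5):
    # Each cycle, devices are picked probabilistically (assumed here:
    # independent Bernoulli draws) to synchronize and aggregate,
    # decentralizing the role a central server would otherwise play.
    selected = [d for d in devices if random.random() < select_prob] or devices[:1]
    merged = aggregate([d["model"] for d in selected],
                       [d["num_samples"] for d in selected])
    for d in selected:
        d["model"] = merged

# Toy run: two fast devices and one slow one, scalar "models" for brevity.
devices = [
    {"model": {"w": 1.0}, "speed": 1.0, "num_samples": 500},
    {"model": {"w": 3.0}, "speed": 0.9, "num_samples": 400},
    {"model": {"w": 5.0}, "speed": 0.3, "num_samples": 100},
]
for d in devices:
    print(f"speed {d['speed']}: {local_steps(d['speed'])} local steps")
aggregation_cycle(devices)
print([d["model"]["w"] for d in devices])
```

In a real deployment each device would hold a model state dict (e.g., from PyTorch) rather than a scalar, and synchronization would run peer-to-peer among the selected devices rather than through a server.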
Authors:
– Cao, Jing (congjia@mail.ustc.edu.cn), University of Science and Technology of China, China
– Lian, Zirui (ustclzr@mail.ustc.edu.cn), University of Science and Technology of China, China
– Liu, Weihong (lwh2017@mail.ustc.edu.cn), University of Science and Technology of China, China
– Zhu, Zongwei (zzw1988@ustc.edu.cn), University of Science and Technology of China, China
– Ji, Cheng (cheng.ji@njust.edu.cn), Nanjing University of Science and Technology, China
DOI: 10.1109/DAC18074.2021.9586101
EISBN: 9781665432740; 1665432748
Funding: China Postdoctoral Science Foundation (funder ID: 10.13039/501100002858)
Subject terms: Collaborative work; Computational modeling; Data models; Design automation; Distributed Training; Federated Learning; Heterogeneous Computing; Heterogeneous networks; Machine Learning; Performance evaluation; Training