Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning

Published in: Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), Volume 2022, pp. 10051-10061
Main authors: Qu, Liangqiong; Zhou, Yuyin; Liang, Paul Pu; Xia, Yingda; Wang, Feifei; Adeli, Ehsan; Fei-Fei, Li; Rubin, Daniel
Format: Conference Proceeding; Journal Article
Language: English
Publication details: United States: IEEE, 2022-06-01
ISSN: 1063-6919
Abstract Federated learning is an emerging research paradigm enabling collaborative training of machine learning models among different organizations while keeping data private at each institution. Despite recent progress, there remain fundamental challenges such as the lack of convergence and the potential for catastrophic forgetting across real-world heterogeneous devices. In this paper, we demonstrate that self-attention-based architectures (e.g., Transformers) are more robust to distribution shifts and hence improve federated learning over heterogeneous data. Concretely, we conduct the first rigorous empirical investigation of different neural architectures across a range of federated algorithms, real-world benchmarks, and heterogeneous data splits. Our experiments show that simply replacing convolutional networks with Transformers can greatly reduce catastrophic forgetting of previous devices, accelerate convergence, and reach a better global model, especially when dealing with heterogeneous data. We release our code and pretrained models to encourage future exploration in robust architectures as an alternative to current research efforts on the optimization front.
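As context for the abstract: the intervention it describes keeps the federated algorithm fixed and swaps the client architecture from a convolutional network to a self-attention model. The sketch below illustrates that idea with plain FedAvg in PyTorch; it is not the authors' released code, and `TinyViT`, `local_update`, and `fedavg_round` are hypothetical names, with the toy Transformer standing in for the ViTs the paper evaluates.

```python
# Illustrative FedAvg sketch (hypothetical, not the authors' released code):
# clients train a shared self-attention model on heterogeneous local data.
import copy
import torch
import torch.nn as nn

class TinyViT(nn.Module):
    """Toy Vision Transformer: conv patch embedding + encoder + linear head."""
    def __init__(self, img_size=32, patch=4, dim=64, depth=4, heads=4, classes=10):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, n_patches, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, classes)

    def forward(self, x):
        x = self.embed(x).flatten(2).transpose(1, 2) + self.pos  # (B, N, dim)
        return self.head(self.encoder(x).mean(dim=1))  # mean-pool tokens

def local_update(global_model, loader, epochs=1, lr=1e-3):
    """One client's local training pass, starting from the global weights."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model.state_dict(), len(loader.dataset)

def fedavg_round(global_model, client_loaders):
    """Average client updates weighted by local dataset size (FedAvg)."""
    updates = [local_update(global_model, dl) for dl in client_loaders]
    total = sum(n for _, n in updates)
    avg = {k: sum(sd[k].float() * (n / total) for sd, n in updates)
           for k in updates[0][0]}
    global_model.load_state_dict(avg)
    return global_model
```

The paper's finding is that under heterogeneous client data, the aggregated model degrades far less when the shared architecture is a Transformer than when it is a CNN, with the federated algorithm itself left unchanged.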
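The "heterogeneous data splits" the abstract refers to are commonly constructed in the federated learning literature by drawing per-client label proportions from a Dirichlet prior. The helper below (`dirichlet_split`, a hypothetical name) shows that generic construction; the paper's exact partitioning protocol is in its released code and may differ.

```python
# Generic label-skew split via a Dirichlet prior (illustrative construction;
# not necessarily the partitioning protocol used in the paper).
import numpy as np

def dirichlet_split(labels, n_clients, alpha=0.5, seed=0):
    """Partition sample indices across clients; smaller alpha means more skew."""
    rng = np.random.default_rng(seed)
    client_idx = [[] for _ in range(n_clients)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        props = rng.dirichlet(alpha * np.ones(n_clients))  # class share per client
        cuts = (np.cumsum(props) * len(idx)).astype(int)[:-1]
        for c, part in enumerate(np.split(idx, cuts)):
            client_idx[c].extend(part.tolist())
    return client_idx

# Example: split 5000 samples over 10 classes across 8 clients with strong skew.
labels = np.random.randint(0, 10, size=5000)
parts = dirichlet_split(labels, n_clients=8, alpha=0.1)
```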
Authors (name, affiliation, email):
1. Qu, Liangqiong (Stanford University), liangqiqu@gmail.com
2. Zhou, Yuyin (UC Santa Cruz), zhouyuyiner@gmail.com
3. Liang, Paul Pu (Carnegie Mellon University), pliang@cs.cmu.edu
4. Xia, Yingda (Johns Hopkins University), philyingdaxia@gmail.com
5. Wang, Feifei (Stanford University), ffwang@stanford.edu
6. Adeli, Ehsan (Stanford University), eadeli@stanford.edu
7. Fei-Fei, Li (Stanford University), feifeili@stanford.edu
8. Rubin, Daniel (Stanford University), rubin@stanford.edu
CODEN IEEPAD
DOI 10.1109/CVPR52688.2022.00982
Discipline Applied Sciences
Computer Science
EISBN 1665469463
9781665469463
EISSN 1063-6919
EndPage 10061
ExternalDocumentID PMC9826695
36624800
9880336
Genre orig-research
Journal Article
GrantInformation NCI NIH HHS, grant U01 CA242879 (funder ID 10.13039/100000054)
ISICitedReferencesCount 112
ISSN 1063-6919
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly true
Language English
Notes Equal contribution
OpenAccessLink https://www.ncbi.nlm.nih.gov/pmc/articles/9826695
PMID 36624800
PQID 2763335180
PQPubID 23479
PageCount 11
PublicationDate 2022-06-01
PublicationPlace United States
PublicationTitle Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online)
PublicationTitleAbbrev CVPR
PublicationTitleAlternate Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit
PublicationYear 2022
Publisher IEEE
StartPage 10051
SubjectTerms Computer architecture
Data models
Federated learning
Organizations
Privacy and federated learning
Robustness
Training
Transformers
Title Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning
URI https://ieeexplore.ieee.org/document/9880336
https://www.ncbi.nlm.nih.gov/pubmed/36624800
https://www.proquest.com/docview/2763335180
https://pubmed.ncbi.nlm.nih.gov/PMC9826695
Volume 2022