CA-SVM: Communication-Avoiding Support Vector Machines on Distributed Systems
We consider the problem of how to design and implement communication-efficient versions of parallel support vector machines, a widely used classifier in statistical machine learning, for distributed memory clusters and supercomputers. The main computational bottleneck is the training phase, in which...
| Published in: | Proceedings - IEEE International Parallel and Distributed Processing Symposium, pp. 847-859 |
|---|---|
| Main authors: | Yang You, James Demmel, Kenneth Czechowski, Le Song, Richard Vuduc |
| Format: | Conference proceeding |
| Language: | English |
| Published: | IEEE, 2015-05-01 |
| Subjects: | |
| ISSN: | 1530-2075 |
| Online access: | Full text |
| Abstract | We consider the problem of how to design and implement communication-efficient versions of parallel support vector machines, a widely used classifier in statistical machine learning, for distributed memory clusters and supercomputers. The main computational bottleneck is the training phase, in which a statistical model is built from an input data set. Prior to our study, the parallel isoefficiency of a state-of-the-art implementation scaled as $W = \Omega(P^3)$, where $W$ is the problem size and $P$ the number of processors; this scaling is worse than even a one-dimensional block row dense matrix-vector multiplication, which has $W = \Omega(P^2)$. This study considers a series of algorithmic refinements, leading ultimately to a Communication-Avoiding SVM (CASVM) method that improves the isoefficiency to nearly $W = \Omega(P)$. We evaluate these methods on 96 to 1536 processors, and show speedups of 3-16× (7× on average) over Dis-SMO, and a 95% weak-scaling efficiency on six real-world datasets, with only modest losses in overall classification accuracy. The source code can be downloaded at https://github.com/fastalgo/casvm. |
|---|---|
| Author | Le Song; Yang You; Demmel, James; Vuduc, Richard; Czechowski, Kenneth |
| Author_xml | – sequence: 1 surname: Yang You fullname: Yang You email: you-y12@mails.tsinghua.edu.cn organization: Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China – sequence: 2 givenname: James surname: Demmel fullname: Demmel, James email: demmel@berkeley.edu organization: Comput. Sci. Div., Univ. of California at Berkeley, Berkeley, CA, USA – sequence: 3 givenname: Kenneth surname: Czechowski fullname: Czechowski, Kenneth email: kentcz@gatech.edu organization: Georgia Inst. of Technol., Coll. of Comput., Atlanta, GA, USA – sequence: 4 surname: Le Song fullname: Le Song email: lsong@gatech.edu organization: Comput. Sci. Div., Univ. of California at Berkeley, Berkeley, CA, USA – sequence: 5 givenname: Richard surname: Vuduc fullname: Vuduc, Richard email: richie@gatech.edu organization: Georgia Inst. of Technol., Coll. of Comput., Atlanta, GA, USA |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/IPDPS.2015.117 |
| Discipline | Computer Science |
| EISBN | 1479986496 9781479986491 |
| EndPage | 859 |
| ExternalDocumentID | 7161571 |
| Genre | orig-research |
| ISICitedReferencesCount | 31 |
| ISSN | 1530-2075 |
| IngestDate | Wed Aug 27 01:42:48 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| PageCount | 13 |
| PublicationCentury | 2000 |
| PublicationDate | 20150501 |
| PublicationDateYYYYMMDD | 2015-05-01 |
| PublicationDate_xml | – month: 05 year: 2015 text: 20150501 day: 01 |
| PublicationDecade | 2010 |
| PublicationTitle | Proceedings - IEEE International Parallel and Distributed Processing Symposium |
| PublicationTitleAbbrev | IPDPS |
| PublicationYear | 2015 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 847 |
| SubjectTerms | Accuracy; communication avoidance; distributed memory algorithms; Kernel; Mathematical model; Partitioning algorithms; Program processors; statistical machine learning; Support vector machines; Training |
| Title | CA-SVM: Communication-Avoiding Support Vector Machines on Distributed Systems |
| URI | https://ieeexplore.ieee.org/document/7161571 |
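
The isoefficiency relations quoted in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the paper or its source code: it assumes each bound W = Omega(P^k) behaves like W proportional to P^k (constants and lower-order terms ignored), and the function name `required_growth` is hypothetical. It estimates how much the problem size would have to grow to hold parallel efficiency constant when scaling from 96 to 1536 processors, the range used in the evaluation.

```python
# Illustrative arithmetic only. Assumption: each isoefficiency bound
# W = Omega(P^k) is treated as W proportional to P^k, ignoring constants
# and lower-order terms. Nothing here comes from the CA-SVM source code.


def required_growth(p_small: int, p_large: int, exponent: int) -> float:
    """Factor by which the problem size W must grow to keep parallel
    efficiency constant when going from p_small to p_large processors,
    under the assumed scaling W ~ P**exponent."""
    return (p_large / p_small) ** exponent


# Exponents as stated in the abstract.
CASES = {
    "prior state-of-the-art implementation, W = Omega(P^3)": 3,
    "1D block-row dense matrix-vector multiply, W = Omega(P^2)": 2,
    "CA-SVM (nearly), W = Omega(P)": 1,
}

if __name__ == "__main__":
    for label, k in CASES.items():
        factor = required_growth(96, 1536, k)
        print(f"{label}: problem size must grow about {factor:g}x")
```

Under these assumptions, a 16-fold increase in processor count demands a far larger increase in problem size for the cubic case than for the nearly linear one, which is the sense in which the lower exponent indicates better scalability.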