CA-SVM: Communication-Avoiding Support Vector Machines on Distributed Systems
We consider the problem of how to design and implement communication-efficient versions of parallel support vector machines, a widely used classifier in statistical machine learning, for distributed memory clusters and supercomputers. The main computational bottleneck is the training phase, in which...
| Published in: | Proceedings - IEEE International Parallel and Distributed Processing Symposium, pp. 847-859 |
|---|---|
| Main authors: | Yang You, James Demmel, Kenneth Czechowski, Le Song, Richard Vuduc |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 1 May 2015 |
| Subjects: | |
| ISSN: | 1530-2075 |
| Online access: | Get full text |
| Abstract | We consider the problem of how to design and implement communication-efficient versions of parallel support vector machines, a widely used classifier in statistical machine learning, for distributed memory clusters and supercomputers. The main computational bottleneck is the training phase, in which a statistical model is built from an input data set. Prior to our study, the parallel isoefficiency of a state-of-the-art implementation scaled as $W = \Omega(P^3)$, where $W$ is the problem size and $P$ the number of processors; this scaling is worse than even a one-dimensional block row dense matrix vector multiplication, which has $W = \Omega(P^2)$. This study considers a series of algorithmic refinements, leading ultimately to a Communication-Avoiding SVM (CASVM) method that improves the isoefficiency to nearly $W = \Omega(P)$. We evaluate these methods on 96 to 1536 processors, and show speedups of 3-16× (7× on average) over Dis-SMO, and a 95% weak-scaling efficiency on six real-world datasets, with only modest losses in overall classification accuracy. The source code can be downloaded at https://github.com/fastalgo/casvm. |
|---|---|
| Author | Le Song; Yang You; Demmel, James; Vuduc, Richard; Czechowski, Kenneth |
| Author_xml | – sequence: 1 surname: Yang You fullname: Yang You email: you-y12@mails.tsinghua.edu.cn organization: Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China – sequence: 2 givenname: James surname: Demmel fullname: Demmel, James email: demmel@berkeley.edu organization: Comput. Sci. Div., Univ. of California at Berkeley, Berkeley, CA, USA – sequence: 3 givenname: Kenneth surname: Czechowski fullname: Czechowski, Kenneth email: kentcz@gatech.edu organization: Georgia Inst. of Technol., Coll. of Comput., Atlanta, GA, USA – sequence: 4 surname: Le Song fullname: Le Song email: lsong@gatech.edu organization: Comput. Sci. Div., Univ. of California at Berkeley, Berkeley, CA, USA – sequence: 5 givenname: Richard surname: Vuduc fullname: Vuduc, Richard email: richie@gatech.edu organization: Georgia Inst. of Technol., Coll. of Comput., Atlanta, GA, USA |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1109/IPDPS.2015.117 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Xplore POP ALL IEEE Xplore All Conference Proceedings IEEE/IET Electronic Library IEEE Proceedings Order Plans (POP All) 1998-Present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISBN | 1479986496; 9781479986491 |
| EndPage | 859 |
| ExternalDocumentID | 7161571 |
| Genre | orig-research |
| ISICitedReferencesCount | 31 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000380545200082&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 1530-2075 |
| IngestDate | Wed Aug 27 01:42:48 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| LinkModel | DirectLink |
| PageCount | 13 |
| ParticipantIDs | ieee_primary_7161571 |
| PublicationCentury | 2000 |
| PublicationDate | 20150501 |
| PublicationDateYYYYMMDD | 2015-05-01 |
| PublicationDate_xml | – month: 05 year: 2015 text: 20150501 day: 01 |
| PublicationDecade | 2010 |
| PublicationTitle | Proceedings - IEEE International Parallel and Distributed Processing Symposium |
| PublicationTitleAbbrev | IPDPS |
| PublicationYear | 2015 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssib026764926 ssj0020349 |
| Score | 1.7685196 |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 847 |
| SubjectTerms | Accuracy; communication avoidance; distributed memory algorithms; Kernel; Mathematical model; Partitioning algorithms; Program processors; statistical machine learning; Support vector machines; Training |
| Title | CA-SVM: Communication-Avoiding Support Vector Machines on Distributed Systems |
| URI | https://ieeexplore.ieee.org/document/7161571 |
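
Note on the isoefficiency figures in the abstract: the bounds relate the problem size W to the processor count P needed to hold parallel efficiency constant. The sketch below is only an illustration, not part of the paper or its released code. It assumes each bound can be read as an exact power law (W proportional to P^k, constants ignored), and the function name `required_growth` is hypothetical. Under that assumption it estimates how much W would have to grow across the evaluated range of 96 to 1536 processors.

```python
# Illustrative only: reads each isoefficiency bound from the abstract as an
# exact power law W ~ P**k. Omega notation gives a lower bound and leaves
# constants unspecified, so these factors are rough guides, not measurements.

def required_growth(p_start: int, p_end: int, exponent: int) -> float:
    """Factor by which problem size W must grow to keep efficiency constant
    when scaling from p_start to p_end processors, assuming W ~ P**exponent."""
    return (p_end / p_start) ** exponent


if __name__ == "__main__":
    cases = [
        ("prior state-of-the-art implementation, W = Omega(P^3)", 3),
        ("1D block-row dense matrix-vector multiply, W = Omega(P^2)", 2),
        ("communication-avoiding method, nearly W = Omega(P)", 1),
    ]
    for label, k in cases:
        factor = required_growth(96, 1536, k)
        print(f"{label}: W grows by about {factor:g}x")
```

Read this way, a 16-fold increase in processors calls for a far smaller increase in problem size under a linear bound than under a cubic one, which is the practical meaning of the improvement the abstract reports.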