Design and Implementation of a Communication-Optimal Classifier for Distributed Kernel Support Vector Machines
| Published in: | IEEE transactions on parallel and distributed systems Vol. 28; no. 4; pp. 974 - 988 |
|---|---|
| Main Authors: | You, Yang; Demmel, James; Czechowski, Kent; Song, Le; Vuduc, Rich |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2017-04-01 |
| ISSN: | 1045-9219, 1558-2183 |
| Abstract | We consider the problem of how to design and implement communication-efficient versions of parallel kernel support vector machines, a widely used classifier in statistical machine learning, for distributed memory clusters and supercomputers. The main computational bottleneck is the training phase, in which a statistical model is built from an input data set. Prior to our study, the parallel isoefficiency of a state-of-the-art implementation scaled as W = Ω(P³), where W is the problem size and P the number of processors; this scaling is worse than even a one-dimensional block row dense matrix vector multiplication, which has W = Ω(P²). This study considers a series of algorithmic refinements, leading ultimately to a Communication-Avoiding SVM method that improves the isoefficiency to nearly W = Ω(P). We evaluate these methods on 96 to 1,536 processors, and show speedups of 3× to 16× (7× on average) over Dis-SMO, and a 95 percent weak-scaling efficiency on six real-world datasets, with only modest losses in overall classification accuracy. The source code can be downloaded at [1]. |
|---|---|
| Author | Song, Le; Vuduc, Rich; You, Yang; Demmel, James; Czechowski, Kent |
| Author_xml | 1. You, Yang (youyang@cs.berkeley.edu), Computer Science Division, UC Berkeley, CA; 2. Demmel, James (demmel@cs.berkeley.edu), Computer Science Division, UC Berkeley, CA; 3. Czechowski, Kent (kentcz@gatech.edu), College of Computing, Georgia Tech, GA; 4. Song, Le (lsong@gatech.edu), College of Computing, Georgia Tech, GA; 5. Vuduc, Rich (richie@gatech.edu), College of Computing, Georgia Tech, GA |
| BackLink | https://www.osti.gov/biblio/1536673 (View this record in OSTI.GOV) |
| CODEN | ITDSEO |
| CitedBy_id | crossref_primary_10_1109_JSSC_2024_3412220 crossref_primary_10_1155_2022_4418606 crossref_primary_10_1007_s42514_021_00078_5 |
| Cites_doi | 10.1109/34.291440 10.1145/1961189.1961199 10.1109/TNN.2006.875989 10.1109/IPDPS.2014.88 10.1109/34.868688 10.1109/TNN.2006.878123 10.1145/1390156.1390170 10.1109/IPDPS.2013.32 10.1007/978-3-642-20429-6_10 10.1109/ICMLA.2010.53 10.1109/IPDPS.2015.117 10.1007/BF00994018 10.1016/S0305-0483(01)00026-3 10.1007/s11222-007-9033-z 10.1145/1150402.1150500 |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2017 |
| CorporateAuthor | Univ. of California, Oakland, CA (United States) |
| DOI | 10.1109/TPDS.2016.2608823 |
| Discipline | Engineering; Computer Science |
| EISSN | 1558-2183 |
| EndPage | 988 |
| ExternalDocumentID | 1536673 10_1109_TPDS_2016_2608823 7565530 |
| Genre | orig-research |
| GrantInformation_xml | BIGDATA (grant IDs: 1R01GM108341; ONR N00014-15-1-2340; NSF IIS-1218749); Office of Advanced Scientific Computing Research (grant IDs: DE-SC0008700; AC02-05CH11231); Applied Mathematics program (grant ID: DE-SC0010200); LGE; HP (funder ID: 10.13039/100004314); Samsung (funder ID: 10.13039/100004358); Intel (funder ID: 10.13039/100002418); Nokia (funder ID: 10.13039/100004356); NSF (funder ID: 10.13039/100000001) |
| ISICitedReferencesCount | 10 |
| ISSN | 1045-9219 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 4 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| Notes | USDOE Office of Science (SC); SC0008700; SC0010200; AC02-05CH11231 |
| PageCount | 15 |
| PublicationCentury | 2000 |
| PublicationDate | 2017-04-01 |
| PublicationDecade | 2010 |
| PublicationPlace | New York, United States |
| PublicationTitle | IEEE transactions on parallel and distributed systems |
| PublicationTitleAbbrev | TPDS |
| PublicationYear | 2017 |
| Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| References | [1] You, "Source code of casvm," 2015; [2] doi:10.1007/BF00994018; [3] Joachims, Text Categorization with Support Vector Machines Learning with Many Relevant Features, 1998; [4] doi:10.1016/S0305-0483(01)00026-3; [5] Leslie, "The spectrum kernel: A string kernel for SVM protein classification," Proc Pacific Symp Biocomputing, vol. 7, p. 566; [6] "NERSC systems," 2014; [7] You, "Appendix of casvm," 2015; [8] Grama, Introduction to Parallel Computing, 2003; [9] Platt, "Fast training of support vector machines using sequential minimal optimization," Advances in Kernel Methods Support Vector Learning, p. 185, 1999; [10] Graf, "Parallel support vector machines: The Cascade SVM," Advances Neural Inf Process Syst, vol. 17, p. 521, 2004; [11] Hsieh, "A divide-and-conquer solver for kernel support vector machines," p. 566, 2014; [12] doi:10.1109/TNN.2006.875989; [13] Bertin-Mahieux, "The million song dataset," Proc Int Soc for Music Inf Retrieval Conf, p. 591; [14] doi:10.1145/1961189.1961199; [15] doi:10.1145/1390156.1390170; [16] doi:10.1109/IPDPS.2014.88; [17] Forgy, "Cluster analysis of multivariate data: Efficiency versus interpretability of classifications," Biometrics, vol. 21, p. 768, 1965; [18] doi:10.1007/978-3-642-20429-6_10; [19] doi:10.1145/1150402.1150500; [20] Zanni, "Parallel software for training large scale support vector machines on multiprocessor systems," J Mach Learn Res, vol. 7, p. 1467, 2006; [21] Joachims, "Making large scale SVM learning practical," Advances in Kernel Methods Support Vector Learning, p. 169, 1999; [22] Fan, "Working set selection using second order information for training support vector machines," J Mach Learn Res, vol. 6, p. 1889, 2005; [23] doi:10.1109/ICMLA.2010.53; [24] Webb, "Introducing the Webb Spam Corpus: Using Email Spam to Identify Web Spam Automatically"; [25] Prokhorov, "Neural network competition," Proc Int Joint Conf Neural Network, vol. 1, p. 1583; [26] Dongarra, 2014; [27] doi:10.1109/IPDPS.2013.32; [28] Liao, "Parallel k-means," 2013; [29] Sonnenburg, "Pascal large scale learning challenge," Proc 25th Int Conf Mach Learn, vol. 10, p. 1937; [30] doi:10.1109/TNN.2006.878123; [31] Guyon, "Result analysis of the NIPS 2003 feature selection challenge," Proc Advances Neural Inf Process Syst, p. 545; [32] doi:10.1109/34.291440; [33] doi:10.1109/34.868688; [34] doi:10.1007/s11222-007-9033-z; [35] Si, "Memory efficient kernel approximation," Proc 31st Int Conf Mach Learn, p. 701; [36] "Covertype data set," 1998; [37] doi:10.1109/IPDPS.2015.117 |
| StartPage | 974 |
| SubjectTerms | Classifiers; Communication; communication-avoidance; Computer Science; Data models; Distributed memory; Distributed memory algorithms; Engineering; Kernel; Kernels; Machine learning; Mathematical analysis; Matrix algebra; Matrix methods; Optimization; Partitioning algorithms; Processors; Program processors; Scaling; Source code; State of the art; statistical machine learning; Statistical models; Supercomputers; Support vector machines; Training |
| Title | Design and Implementation of a Communication-Optimal Classifier for Distributed Kernel Support Vector Machines |
| URI | https://ieeexplore.ieee.org/document/7565530 https://www.proquest.com/docview/2174467380 https://www.osti.gov/biblio/1536673 |
| Volume | 28 |