Learning-Based Coded Computation
Recent advances have shown the potential for coded computation to impart resilience against slowdowns and failures that occur in distributed computing systems. However, existing coded computation approaches are either unable to support non-linear computations, or can only support a limited subset of...
| Published in: | IEEE Journal on Selected Areas in Information Theory, Vol. 1, no. 1, pp. 227-236 |
|---|---|
| Main Authors: | Kosaian, Jack; Rashmi, K. V.; Venkataraman, Shivaram |
| Format: | Journal Article |
| Language: | English |
| Published: | Piscataway: IEEE, 01.05.2020; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| ISSN: | 2641-8770 |
| Abstract | Recent advances have shown the potential for coded computation to impart resilience against slowdowns and failures that occur in distributed computing systems. However, existing coded computation approaches are either unable to support non-linear computations, or can only support a limited subset of non-linear computations while requiring high resource overhead. In this work, we propose a learning-based coded computation framework to overcome the challenges of performing coded computation for general non-linear functions. We show that careful use of machine learning within the coded computation framework can extend the reach of coded computation to imparting resilience to more general non-linear computations. We showcase the applicability of learning-based coded computation to neural network inference, a major workload in production services. Our evaluation results show that learning-based coded computation enables accurate reconstruction of unavailable results from widely deployed neural networks for a variety of inference tasks such as image classification, speech recognition, and object localization. We implement our proposed approach atop an open-source prediction serving system and show its promise in alleviating slowdowns that occur in neural network inference. These results indicate the potential for learning-based approaches to open new doors for the use of coded computation for broader, non-linear computations. |
|---|---|
| Author | Venkataraman, Shivaram; Kosaian, Jack; Rashmi, K. V. |
| Author_xml | 1. Kosaian, Jack (ORCID: 0000-0001-8812-7847; email: jkosaian@cs.cmu.edu), Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, USA. 2. Rashmi, K. V. (ORCID: 0000-0002-2227-7460; email: rvinayak@cs.cmu.edu), Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, USA. 3. Venkataraman, Shivaram (ORCID: 0000-0001-9575-7935; email: shivaram@cs.wisc.edu), Computer Science Department, University of Wisconsin, Madison, WI, USA |
| CODEN | IJSTL5 |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020 |
| DOI | 10.1109/JSAIT.2020.2983165 |
| Discipline | Computer Science |
| EISSN | 2641-8770 |
| EndPage | 236 |
| Genre | orig-research |
| GrantInformation_xml | Office of the Vice Chancellor for Research and Graduate Education, University of Wisconsin, Madison, through the Wisconsin Alumni Research Foundation (funder ID: 10.13039/100012787); NSF (grants CNS-1850483, CNS-1838733; funder ID: 10.13039/501100008982); Amazon Web Services (funder ID: 10.13039/100008536); NSF Graduate Research Fellowship (grants DGE-1745016, DGE-1252522; funder ID: 10.13039/100000001) |
| ISICitedReferencesCount | 15 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| ORCID | 0000-0002-2227-7460 0000-0001-8812-7847 0000-0001-9575-7935 |
| PageCount | 10 |
| PublicationDate | 2020-05-01 |
| PublicationPlace | Piscataway |
| PublicationTitle | IEEE journal on selected areas in information theory |
| PublicationTitleAbbrev | JSAIT |
| PublicationYear | 2020 |
| Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 227 |
| SubjectTerms | codes; Computational modeling; Computer networks; Decoding; Distributed processing; fault tolerance; Image classification; Image reconstruction; Inference; information theory; Linear functions; Machine learning; Neural networks; Object recognition; redundancy; Reliability; Resilience; Speech recognition; Training |
| Title | Learning-Based Coded Computation |
| URI | https://ieeexplore.ieee.org/document/9047948 https://www.proquest.com/docview/2531561444 |
| Volume | 1 |
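
The abstract notes that existing coded computation approaches struggle with non-linear computations. A minimal sketch of why, under stated assumptions: the sum-based parity, the subtraction decoder, and the toy functions `linear` and `nonlinear` below are hypothetical illustrations chosen for this example, not the learned encoder and decoder proposed in the article. The sketch pretends the output for `x2` is unavailable and tries to rebuild it from the output for `x1` and the output for the parity input.

```python
# Illustrative sketch only: the sum-based code and the toy functions are
# assumptions for this example, not the article's learned encoder/decoder.

def linear(x):
    """A linear function: F(x) = W x for a fixed 2x2 matrix W."""
    w = [[1.0, -2.0], [3.0, 0.5]]
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def nonlinear(x):
    """A non-linear function: ReLU applied to the linear map above."""
    return [max(0.0, v) for v in linear(x)]

def encode(x1, x2):
    """Parity input: the element-wise sum of the two inputs."""
    return [a + b for a, b in zip(x1, x2)]

def decode(parity_out, available_out):
    """Estimate the unavailable output as F(parity) - F(available input)."""
    return [p - a for p, a in zip(parity_out, available_out)]

def reconstruct(f, x1, x2):
    """Pretend F(x2) is unavailable; rebuild it from F(x1) and F(x1 + x2)."""
    return decode(f(encode(x1, x2)), f(x1))

x1, x2 = [1.0, 2.0], [-4.0, 1.0]
linear_ok = reconstruct(linear, x1, x2) == linear(x2)
nonlinear_ok = reconstruct(nonlinear, x1, x2) == nonlinear(x2)
print(linear_ok, nonlinear_ok)
```

With these inputs the reconstruction is exact for the linear function and wrong for the ReLU variant, which is the kind of gap the abstract says a learning-based framework aims to close.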