Learning-Based Coded Computation

Bibliographic Details
Published in: IEEE Journal on Selected Areas in Information Theory, Vol. 1, no. 1, pp. 227-236
Main Authors: Kosaian, Jack; Rashmi, K. V.; Venkataraman, Shivaram
Format: Journal Article
Language: English
Published: Piscataway IEEE 01.05.2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 2641-8770
Abstract: Recent advances have shown the potential for coded computation to impart resilience against slowdowns and failures that occur in distributed computing systems. However, existing coded computation approaches are either unable to support non-linear computations, or can only support a limited subset of non-linear computations while requiring high resource overhead. In this work, we propose a learning-based coded computation framework to overcome the challenges of performing coded computation for general non-linear functions. We show that careful use of machine learning within the coded computation framework can extend the reach of coded computation to imparting resilience to more general non-linear computations. We showcase the applicability of learning-based coded computation to neural network inference, a major workload in production services. Our evaluation results show that learning-based coded computation enables accurate reconstruction of unavailable results from widely deployed neural networks for a variety of inference tasks such as image classification, speech recognition, and object localization. We implement our proposed approach atop an open-source prediction serving system and show its promise in alleviating slowdowns that occur in neural network inference. These results indicate the potential for learning-based approaches to open new doors for the use of coded computation for broader, non-linear computations.
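The limitation the abstract describes can be illustrated with a toy example. The sketch below is not the paper's framework (this record does not describe the learned encoder or decoder); it assumes a simple sum-based parity over two inputs, and the function names (linear_f, nonlinear_f, encode, decode) are hypothetical. It shows that subtracting the available output from the parity output recovers the missing output exactly when the computation is linear, and fails when it is not.

```python
# Illustrative sketch only: sum-based parity over two inputs.
# All names here are hypothetical and not taken from the paper.

def linear_f(x):
    # A linear computation: scale every element by 3.
    return [3 * v for v in x]

def nonlinear_f(x):
    # A non-linear computation: square every element.
    return [v * v for v in x]

def encode(x1, x2):
    # Parity input: element-wise sum of the two data inputs.
    return [a + b for a, b in zip(x1, x2)]

def decode(parity_out, available_out):
    # Estimate the unavailable output by subtracting the available
    # output from the output computed over the parity input.
    return [p - a for p, a in zip(parity_out, available_out)]

x1, x2 = [1, 2], [3, 4]
parity = encode(x1, x2)

# Suppose the worker computing f(x2) is slow or has failed.
recovered_linear = decode(linear_f(parity), linear_f(x1))
linear_ok = recovered_linear == linear_f(x2)

recovered_nonlinear = decode(nonlinear_f(parity), nonlinear_f(x1))
nonlinear_ok = recovered_nonlinear == nonlinear_f(x2)

print(linear_ok, nonlinear_ok)
```

For the linear function the reconstruction is exact because f(x1 + x2) = f(x1) + f(x2); squaring breaks that identity, which is the kind of gap the abstract says a learning-based approach aims to close.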
Author_xml – sequence: 1
  givenname: Jack
  orcidid: 0000-0001-8812-7847
  surname: Kosaian
  fullname: Kosaian, Jack
  email: jkosaian@cs.cmu.edu
  organization: Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, USA
– sequence: 2
  givenname: K. V.
  orcidid: 0000-0002-2227-7460
  surname: Rashmi
  fullname: Rashmi, K. V.
  email: rvinayak@cs.cmu.edu
  organization: Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, USA
– sequence: 3
  givenname: Shivaram
  orcidid: 0000-0001-9575-7935
  surname: Venkataraman
  fullname: Venkataraman, Shivaram
  email: shivaram@cs.wisc.edu
  organization: Computer Science Department, University of Wisconsin, Madison, WI, USA
CODEN IJSTL5
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
DOI 10.1109/JSAIT.2020.2983165
Discipline Computer Science
EISSN 2641-8770
EndPage 236
ExternalDocumentID 10_1109_JSAIT_2020_2983165
9047948
Genre orig-research
GrantInformation_xml – fundername: Office of the Vice Chancellor for Research and Graduate Education, University of Wisconsin, Madison, through the Wisconsin Alumni Research Foundation
  funderid: 10.13039/100012787
– fundername: NSF
  grantid: CNS-1850483; CNS-1838733
  funderid: 10.13039/501100008982
– fundername: Amazon Web Services
  funderid: 10.13039/100008536
– fundername: NSF Graduate Research Fellowship
  grantid: DGE-1745016; DGE-1252522
  funderid: 10.13039/100000001
ISICitedReferencesCount 15
ISSN 2641-8770
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
ORCID 0000-0002-2227-7460
0000-0001-8812-7847
0000-0001-9575-7935
PageCount 10
PublicationDate 2020-05-01
PublicationPlace Piscataway
PublicationTitle IEEE journal on selected areas in information theory
PublicationTitleAbbrev JSAIT
PublicationYear 2020
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 227
SubjectTerms codes
Computational modeling
Computer networks
Decoding
Distributed processing
fault tolerance
Image classification
Image reconstruction
Inference
information theory
Linear functions
Machine learning
Neural networks
Object recognition
redundancy
Reliability
Resilience
Speech recognition
Training
Title Learning-Based Coded Computation
URI https://ieeexplore.ieee.org/document/9047948
https://www.proquest.com/docview/2531561444
Volume 1