Rateless Codes for Distributed Non-linear Computations

Bibliographic details
Published in:	2021 11th International Symposium on Topics in Coding (ISTC), pp. 1-5
Main authors:	Mallick, Ankur; Smith, Sophie; Joshi, Gauri
Format:	Conference paper
Language:	English
Published:	IEEE, 30 August 2021
Subjects:	Codes; Computational modeling; Data models; Distributed databases; Encoding; Machine learning; Machine learning algorithms

Description
Summary:	Machine learning today involves massive distributed computations running on cloud servers, which are highly susceptible to slowdown or straggling. Recent work has demonstrated the effectiveness of erasure codes in mitigating such slowdown for linear computations, by adding redundant computations such that the entire computation can be recovered as long as a subset of nodes finish their assigned tasks. However, most machine learning algorithms typically involve non-linear computations that cannot be directly handled by these coded computing approaches. In this work, we propose a coded computing strategy for mitigating the effect of stragglers on non-linear distributed computations. Our strategy relies on the observation that many expensive non-linear functions can be decomposed into sums of cheap non-linear functions. We show that erasure codes, specifically rateless codes, can be used to generate and compute random linear combinations of these functions at the nodes such that the original function can be computed as long as a subset of nodes return their computations. Simulations and experiments on AWS Lambda demonstrate the superiority of our approach over various uncoded baselines. A full version of this paper is accessible at [1].
DOI:	10.1109/ISTC49272.2021.9594268
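
The idea described in the summary (split an expensive function into a sum of cheap pieces, have nodes return random linear combinations of those pieces, and recover the sum from whichever results arrive) can be illustrated with a small Python sketch. This is only an illustration under assumptions that do not come from the record: the pieces `f_i(x) = tanh(i * x)` are hypothetical, sparse Gaussian coefficients stand in for a concrete rateless code, decoding uses least squares rather than whatever decoder the paper uses, and all function names are invented for this example.

```python
import numpy as np


def make_pieces(m):
    # Hypothetical cheap non-linear pieces f_i(x) = tanh(i * x), i = 1..m.
    return [lambda x, k=i + 1: np.tanh(k * x) for i in range(m)]


def encode_symbol(pieces, x, rng, degree):
    """Compute one coded symbol: a random linear combination of `degree` pieces."""
    m = len(pieces)
    coeffs = np.zeros(m)
    idx = rng.choice(m, size=degree, replace=False)
    coeffs[idx] = rng.standard_normal(degree)
    value = sum(coeffs[i] * pieces[i](x) for i in idx)
    return coeffs, value


def try_decode_sum(G, y, tol=1e-8):
    """Return sum_i f_i(x) if the all-ones vector lies in the row space of G.

    Each received symbol satisfies y = G @ f, so any a with G.T @ a = 1
    gives a @ y = 1.T @ f, which is the desired sum.
    """
    ones = np.ones(G.shape[1])
    a, *_ = np.linalg.lstsq(G.T, ones, rcond=None)
    if np.linalg.norm(G.T @ a - ones) > tol:
        return None  # not enough symbols received yet
    return float(a @ y)


def run(m=8, degree=3, x=0.7, seed=0, max_symbols=100):
    """Collect coded symbols one at a time until the sum becomes decodable."""
    rng = np.random.default_rng(seed)
    pieces = make_pieces(m)
    rows, vals = [], []
    for _ in range(max_symbols):
        coeffs, value = encode_symbol(pieces, x, rng, degree)
        rows.append(coeffs)
        vals.append(value)
        estimate = try_decode_sum(np.array(rows), np.array(vals))
        if estimate is not None:
            return estimate, len(rows)
    return None, len(rows)


if __name__ == "__main__":
    estimate, used = run()
    print("recovered sum:", estimate, "from", used, "coded symbols")
```

Because symbols are generated independently and on demand, it does not matter which nodes produce them; the loop stops as soon as enough symbols have arrived, which is the property that makes a rateless scheme attractive when some nodes straggle.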