Probabilistic and temporal failure detectors for solving distributed problems
| Published in: | Journal of Parallel and Distributed Computing, Volume 158, pp. 1-15 |
|---|---|
| Main authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Inc, 1 December 2021 |
| Subjects: | |
| ISSN: | 0743-7315, 1096-0848 |
| Summary: | Failure detectors (FDs) are celebrated for their modularity in solving distributed problems. Algorithms are constructed using FD building blocks. The synchrony assumptions needed to implement FDs are studied separately and are typically expressed as eventual guarantees that must hold, after some point in time, forever and deterministically. In practice, however, they may hold only probabilistically and temporarily. This paper studies FDs in a realistic system N, where asynchrony is inflicted by probabilistic synchronous communication. We first address a problem with ⋄S, the weakest FD to solve consensus: an implementation of “consensus with probability 1” is possible in N without randomness in the algorithm, while an implementation of “⋄S with probability 1” is impossible in N. We introduce ⋄S*, a new FD with probabilistic and temporal accuracy. We prove that ⋄S* (i) is implementable in N and (ii) can replace ⋄S in several existing deterministic consensus algorithms that use ⋄S, yielding an algorithm that solves “consensus with probability 1”. We extend our results to other FD classes, e.g., ⋄P, and to a larger set of problems (beyond consensus), which we call decisive problems.
• We propose a way to preserve the usefulness of failure detectors (FDs) as software building blocks in probabilistically synchronous systems such as N.
• We define ⋄S*, a probabilistic FD whose accuracy is ensured for arbitrarily long finite periods and which can be implemented in systems such as N.
• We present an optimal ⋄S* algorithm that, in the best case, achieves the lowest communication overhead (C-1) compared to all known ⋄S algorithms.
• We extend our FD definitions to other FD classes and to distributed computing problems beyond consensus, which we call decisive problems.
• We encapsulate the randomization of probabilistic links in the very abstraction of FD, without affecting deterministic algorithms built on top. |
| DOI: | 10.1016/j.jpdc.2021.07.017 |
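
The summary contrasts accuracy that must hold forever with accuracy that holds only for arbitrarily long finite periods. The sketch below is a rough illustration of that intuition, not the paper's ⋄S* algorithm or its system model N. It assumes a fixed-timeout heartbeat detector watching a correct process, with each heartbeat arriving in time independently with probability `p_timely`. The function names and the independence assumption are hypothetical and chosen only for this example.

```python
import random


def false_suspicions(p_timely, rounds, seed=0):
    """Simulate a fixed-timeout heartbeat detector watching a correct process.

    Assumption of this sketch: in every round the heartbeat arrives before
    the timeout with probability p_timely, independently of other rounds.
    Returns one boolean per round: True where the correct process is
    wrongly suspected in that round.
    """
    rng = random.Random(seed)
    return [rng.random() >= p_timely for _ in range(rounds)]


def longest_accurate_period(suspicions):
    """Length of the longest run of consecutive rounds with no false suspicion."""
    best = current = 0
    for suspected in suspicions:
        current = 0 if suspected else current + 1
        best = max(best, current)
    return best


if __name__ == "__main__":
    # Print: number of rounds, number of false suspicions, longest accurate run.
    for rounds in (100, 10_000, 1_000_000):
        run = false_suspicions(0.99, rounds)
        print(rounds, sum(run), longest_accurate_period(run))
```

Under these assumptions and with `p_timely` below 1, false suspicions keep recurring as the run gets longer, so accuracy never becomes permanent, while the longest accurate period tends to grow with the length of the run.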