Flexible Symmetrical Global-Snapshot Algorithms for Large-Scale Distributed Systems

Most existing global-snapshot algorithms in distributed systems use control messages to coordinate the construction of a global snapshot among all processes. Since these algorithms typically assume the underlying logical overlay topology is fully connected, the number of control messages exchanged a...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on parallel and distributed systems Vol. 24; no. 3; pp. 493 - 505
Main Author: Tsai, Jichiang
Format: Journal Article
Language:English
Published: IEEE 01.03.2013
Subjects:
ISSN:1045-9219
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Most existing global-snapshot algorithms in distributed systems use control messages to coordinate the construction of a global snapshot among all processes. Since these algorithms typically assume the underlying logical overlay topology is fully connected, the number of control messages exchanged among the whole processes is proportional to the square of number of processes, resulting in higher possibility of network congestion. Hence, such algorithms are neither efficient nor scalable for a large-scale distributed system composed of a huge number of processes. Recently, some efforts have been presented to significantly reduce the number of control messages, but doing so incurs higher response time instead. In this paper, we propose an efficient global-snapshot algorithm able to let every process finish its local snapshot in a given number of rounds. Particularly, such an algorithm allows a tradeoff between the response time and the message complexity. Moreover, our global-snapshot algorithm is symmetrical in the sense that identical steps are executed by every process. This means that our algorithm is able to achieve better workload balance and less network congestion. Most importantly, based on our framework, we demonstrate that the minimum number of control messages required by a symmetrical global-snapshot algorithm is Ω(N log N), where N is the number of processes. Finally, we also assume non-FIFO channels.
ISSN:1045-9219
DOI:10.1109/TPDS.2012.139