Efficient Distributed Algorithms for Minimum Spanning Tree in Dense Graphs

Bibliographic details
Published in: IEEE ... International Conference on Data Mining workshops, pp. 777-786
Main authors: Bateni, MohammadHossein; Monemzadeh, Morteza; Voorintholt, Kees
Format: Conference proceedings
Language: English
Published: IEEE, 1 November 2022
ISSN:2375-9259
Description
Abstract: In recent years, the Massively Parallel Computation (MPC) model capturing the MapReduce framework has become the de facto standard model for large-scale data analysis, given the ubiquity of efficient and affordable cloud implementations. In this model, an input of size $m$ is initially distributed among $t$ machines, each with a local space of size $s$. Computation proceeds in synchronous rounds in which each machine performs arbitrary local computation on its data and then sends messages to other machines. In this paper, we study the Minimum Spanning Tree (MST) problem for dense graphs in the MPC model. We say a graph $G(V, E)$ is relatively dense if $m = \Theta(n^{1+c})$, where $n = |V|$ is the number of vertices, $m = |E|$ is the number of edges in this graph, and $0 < c \leq 1$. We develop the first work- and space-efficient MPC algorithm that with high probability computes an MST of $G$ using $\lceil \log \frac{c}{\epsilon} \rceil + 1$ rounds of communication. As an MPC algorithm, our algorithm uses $t = O(n^{c-\epsilon})$ machines, each one having local storage of size $s = O(n^{1+\epsilon})$, for any $0 < \epsilon \leq c$. Indeed, not only is this algorithm very simple and easy to implement, it also simultaneously achieves optimal total work, per-machine space, and number of rounds.
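To get a feel for the bounds stated in the abstract, the following minimal Python sketch plugs concrete numbers into them. It is only an illustration, not the authors' algorithm: the function name `mpc_parameters` is hypothetical, all constants hidden by the O and Theta notation are set to 1, and the logarithm is assumed to be base 2, which the abstract does not state.

```python
import math


def mpc_parameters(n, c, eps):
    """Evaluate the abstract's bounds for concrete n, c, eps.

    Assumptions (not from the source): hidden constants are 1 and
    the logarithm in the round bound is base 2.
    """
    if not 0 < c <= 1:
        raise ValueError("c must satisfy 0 < c <= 1")
    if not 0 < eps <= c:
        raise ValueError("eps must satisfy 0 < eps <= c")

    edges = n ** (1 + c)          # m = Theta(n^(1+c))
    machines = n ** (c - eps)     # t = O(n^(c-eps))
    local_space = n ** (1 + eps)  # s = O(n^(1+eps))
    # ceil(log(c/eps)) + 1 rounds, with base 2 assumed
    rounds = math.ceil(math.log2(c / eps)) + 1

    return {
        "edges": edges,
        "machines": machines,
        "local_space": local_space,
        "rounds": rounds,
    }


if __name__ == "__main__":
    for key, value in mpc_parameters(n=10**6, c=1.0, eps=0.25).items():
        print(f"{key}: {value:,.0f}")
```

Under these assumptions, the product of the machine count and the per-machine space is $n^{c-\epsilon} \cdot n^{1+\epsilon} = n^{1+c}$, which matches the input size $m$ up to constants. Choosing $\epsilon = c$ gives a single machine and one round.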
DOI:10.1109/ICDMW58026.2022.00106