Partition-Merge: Distributed Inference and Modularity Optimization

Bibliographic details
Published in: IEEE Access, vol. 9, pp. 54032-54055
Main authors: Blondel, Vincent; Jung, Kyomin; Kohli, Pushmeet; Shah, Devavrat; Won, Seungpil
Format: Journal Article
Language: English
Publication details: Piscataway: IEEE, 2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 2169-3536
Description
Summary: This paper presents a novel meta-algorithm, Partition-Merge (PM), which takes existing centralized algorithms for graph computation and makes them distributed and faster. In a nutshell, PM divides the graph into small subgraphs using our novel randomized partitioning scheme, runs the centralized algorithm on each partition separately, and then stitches the resulting solutions together to produce a global solution. We demonstrate the efficiency of the PM algorithm on two popular problems: computation of the Maximum A Posteriori (MAP) assignment in an arbitrary pairwise Markov Random Field (MRF), and modularity optimization for community detection. We show that the resulting distributed algorithms for these problems run in time linear in the number of nodes in the graph. Furthermore, PM achieves performance comparable to, or even better than, that of the centralized algorithms as long as the graph has the polynomial growth property. More precisely, if the centralized algorithm is a $\mathcal{C}$-factor approximation with constant $\mathcal{C} \ge 1$, the resulting distributed algorithm is a $(\mathcal{C}+\delta)$-factor approximation for any small $\delta > 0$; and even if the centralized algorithm is a non-constant (e.g., logarithmic) factor approximation, the resulting distributed algorithm becomes a constant-factor approximation. For general graphs, we compute explicit bounds on the loss of performance of the resulting distributed algorithm with respect to the centralized algorithm. To show the efficiency of our algorithm, we conducted extensive experiments on both real-world and synthetic networks. The experiments demonstrate that the PM algorithm provides a good trade-off between accuracy and running time.
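
The partition, solve, and merge steps described in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation: the ball-carving partition below is only an assumed stand-in for the paper's randomized partitioning scheme, and the names `random_ball_partition`, `induced_subgraph`, `label_components`, and `partition_merge`, as well as the `max_radius` and `seed` parameters, are hypothetical. The toy solver simply labels connected components and stands in for any centralized algorithm (for example, a MAP or modularity solver).

```python
import random
from collections import deque


def random_ball_partition(adj, max_radius, rng):
    """Split the nodes into disjoint parts by carving BFS balls of random radius.

    Assumption: a simple stand-in for the paper's randomized partitioning scheme.
    """
    unassigned = set(adj)
    order = list(adj)
    rng.shuffle(order)
    parts = []
    for center in order:
        if center not in unassigned:
            continue
        radius = rng.randint(1, max_radius)
        ball = {center}
        frontier = deque([(center, 0)])
        while frontier:
            node, dist = frontier.popleft()
            if dist == radius:
                continue
            for nbr in adj[node]:
                if nbr in unassigned and nbr not in ball:
                    ball.add(nbr)
                    frontier.append((nbr, dist + 1))
        unassigned -= ball
        parts.append(ball)
    return parts


def induced_subgraph(adj, part):
    """Keep only the edges with both endpoints inside `part`."""
    return {u: [v for v in adj[u] if v in part] for u in part}


def label_components(sub):
    """Toy centralized solver (hypothetical): label each node by the
    smallest node id in its connected component."""
    labels = {}
    for start in sub:
        if start in labels:
            continue
        comp, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in sub[u]:
                if v not in comp:
                    comp.add(v)
                    stack.append(v)
        rep = min(comp)
        for u in comp:
            labels[u] = rep
    return labels


def partition_merge(adj, solve_centralized, max_radius=2, seed=0):
    """Partition the graph, solve each part independently, merge the results."""
    rng = random.Random(seed)
    solution = {}
    # The parts are disjoint, so each call could run on a separate worker;
    # they are solved sequentially here to keep the sketch short.
    for part in random_ball_partition(adj, max_radius, rng):
        solution.update(solve_centralized(induced_subgraph(adj, part)))
    return solution


# Usage: a ring of 12 nodes.
ring = {i: [(i - 1) % 12, (i + 1) % 12] for i in range(12)}
assignment = partition_merge(ring, label_components)
```

In this sketch the merge step is a plain union of the per-part solutions, which is well defined because the parts are disjoint. Edges that cross part boundaries are ignored by the local solvers; that is the source of the approximation loss the summary bounds.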
DOI:10.1109/ACCESS.2021.3070490