DyG-DPCD: A Distributed Parallel Community Detection Algorithm for Large-Scale Dynamic Graphs

Bibliographic details
Published in: International Journal of Parallel Programming, Vol. 53, No. 1, p. 4
Main authors: Sattar, Naw Safrin; Ibrahim, Khaled Z.; Buluc, Aydin; Arifuzzaman, Shaikh
Format: Journal Article
Language: English
Published: New York: Springer US, 1 February 2025
Springer Nature B.V
Springer
ISSN: 0885-7458, 1573-7640
Description
Summary: Dynamic (Temporal) graphs capture the valuable evolution of real-world systems, from the continuously evolving patterns of social interactions and genetic pathways to the dynamic fluctuations of economic forces. Detecting communities for such evolving networks poses unique challenges. Detecting and analyzing the evolution of communities within dynamic graphs unlocks valuable insights into the underlying structural and temporal patterns of real-world systems. However, the sheer volume of modern graph data and the inherent complexity of the temporal dimension pose significant challenges to scalable community detection algorithms. Addressing this gap, our work explores the limited landscape of scalable distributed-memory parallel methods specifically designed for dynamic network community detection. We propose a novel parallel algorithm, DyG-DPCD (Dynamic Graph Distributed Parallel Community Detection), to detect communities in dynamic networks using the Message Passing Interface (MPI) framework. We present a vertex-centric approach, allowing us to detect communities through local optimization. Furthermore, we enhance our baseline algorithm by incorporating three heuristics, which improve the algorithm's performance significantly while maintaining the quality of the solutions. We demonstrate the efficiency of our algorithm by experimenting on several real-world large-scale networks with hundreds of millions of edges spanning diverse domains. Notably, DyG-DPCD achieves speedups between 25× and 30× for large networks that we experimented on using NERSC compute nodes. Our algorithm outperforms the STINGER parallel re-agglomeration algorithm by 30×.
Bibliography: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
National Science Foundation (NSF)
AC05-00OR22725; AC02-05CH11231
USDOE National Nuclear Security Administration (NNSA)
DOI: 10.1007/s10766-024-00780-1