SOOP: Efficient Distributed Graph Computation Supporting Second-Order Random Walks

The second-order random walk has recently been shown to effectively improve the accuracy in graph analysis tasks. Existing work mainly focuses on centralized second-order random walk (SOW) algorithms. SOW algorithms rely on edge-to-edge transition probabilities to generate next random steps. However...

Full description

Saved in:
Bibliographic Details
Published in:Journal of computer science and technology Vol. 36; no. 5; pp. 985 - 1001
Main Authors: Niu, Songjie, Zhou, Dongyan
Format: Journal Article
Language:English
Published: Singapore Springer Singapore 01.10.2021
Springer
Springer Nature B.V
University of Chinese Academy of Sciences,Beijing 100049,China%Bytedance Technology,Beijing 100086,China
State Key Laboratory of Computer Architecture,Institute of Computing Technology,Chinese Academy of Sciences Beijing 100190,China
Subjects:
ISSN:1000-9000, 1860-4749
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The second-order random walk has recently been shown to effectively improve the accuracy in graph analysis tasks. Existing work mainly focuses on centralized second-order random walk (SOW) algorithms. SOW algorithms rely on edge-to-edge transition probabilities to generate next random steps. However, it is prohibitively costly to store all the probabilities for large-scale graphs, and restricting the number of probabilities to consider can negatively impact the accuracy of graph analysis tasks. In this paper, we propose and study an alternative approach, SOOP (second-order random walks with on-demand probability computation), that avoids the space overhead by computing the edge-to-edge transition probabilities on demand during the random walk. However, the same probabilities may be computed multiple times when the same edge appears multiple times in SOW, incurring extra cost for redundant computation and communication. We propose two optimization techniques that reduce the complexity of computing edge-to-edge transition probabilities to generate next random steps, and reduce the cost of communicating out-neighbors for the probability computation, respectively. Our experiments on real-world and synthetic graphs show that SOOP achieves orders of magnitude better performance than baseline precompute solutions, and it can efficiently computes SOW algorithms on billion-scale graphs.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:1000-9000
1860-4749
DOI:10.1007/s11390-021-1234-y