CGCGraph: Efficient CPU-GPU Co-execution for Concurrent Dynamic Graph Processing

Bibliographic details
Title: CGCGraph: Efficient CPU-GPU Co-execution for Concurrent Dynamic Graph Processing
Authors: Yiming Sun, Jie Zhang, Huawei Cao, Yuan Zhang, Xuejun An, Junying Huang, Xiaochun Ye
Source: ACM Transactions on Architecture and Code Optimization. 22:1-26
Publisher information: Association for Computing Machinery (ACM), 2025.
Publication year: 2025
Description: With the continuous growth of user scale and application data, the demand for large-scale concurrent graph processing is increasing. Typically, large-scale concurrent graph processing jobs need to process corresponding snapshots of dynamically changing graph data to obtain information at different time points. To enhance the throughput of such applications, current solutions concurrently process multiple graph snapshots on the GPU. However, when dealing with rapidly changing graph data, transferring multiple snapshots of concurrent jobs to the GPU results in high data transfer overhead between CPU and GPU. Additionally, the execution mode of existing work suffers from underutilization of GPU computational resources. In this work, we introduce CGCGraph, which can be integrated into existing GPU graph processing systems like Subway, to enable efficient concurrent graph snapshot processing jobs and enhance overall system resource utilization. The key idea is to offload unshared graph data of multiple concurrent snapshots to the CPU, reducing CPU-GPU transfer overhead. By implementing CPU-GPU co-execution, there is potential for enhanced utilization of GPU computing resources. Specifically, CGCGraph leverages kernel fusion to process shared graph data concurrently on the GPU, while executing all snapshots in parallel on the CPU, with each snapshot assigned a dedicated thread. This approach enables efficient concurrent processing within a novel CPU-GPU co-execution model, incorporating three optimization strategies targeting storage, computation, and synchronization. We integrate CGCGraph with Subway, an existing system designed for out-of-GPU-memory static graph processing. Experimental results show that the integration of CGCGraph with current GPU-based systems obtains performance improvements ranging from 1.7 to 4.5 times.
Document type: Article
Language: English
ISSN: 1544-3973, 1544-3566
DOI: 10.1145/3744904
Accession number: edsair.doi...........3be68c50c9d6cf83c495da3491f4b2a0
Database: OpenAIRE
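
Illustrative note: the split described in the abstract (data shared by all snapshots handled once, unshared data handled per snapshot on a dedicated CPU thread) can be pictured with a toy sketch. This is not the CGCGraph implementation. The function names (`split_snapshots`, `out_degrees`, `co_execute`), the representation of snapshots as sets of edge tuples, and the choice of out-degree counting as the workload are all assumptions made for illustration, and the "GPU side" is simulated by an ordinary Python call.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_snapshots(snapshots):
    """Split snapshot edge sets into one shared part and per-snapshot remainders."""
    shared = set.intersection(*snapshots)          # edges present in every snapshot
    unshared = [snap - shared for snap in snapshots]
    return shared, unshared


def out_degrees(edges):
    """Count outgoing edges per source vertex."""
    return Counter(src for src, _dst in edges)


def co_execute(snapshots):
    """Toy co-execution: shared edges processed once, unshared edges per thread."""
    shared, unshared = split_snapshots(snapshots)
    # Stand-in for the GPU side (hypothetical): the shared edges are
    # processed a single time on behalf of all snapshots.
    shared_part = out_degrees(shared)
    # Stand-in for the CPU side: one worker thread per snapshot
    # processes only that snapshot's unshared edges.
    with ThreadPoolExecutor(max_workers=len(snapshots)) as pool:
        unshared_parts = list(pool.map(out_degrees, unshared))
    # Out-degree is additive over disjoint edge sets, so the two partial
    # results can simply be summed for each snapshot.
    return [shared_part + part for part in unshared_parts]


if __name__ == "__main__":
    snapshots = [
        {(0, 1), (1, 2), (2, 0)},
        {(0, 1), (1, 2), (2, 3)},
        {(0, 1), (1, 2), (0, 3)},
    ]
    for i, degrees in enumerate(co_execute(snapshots)):
        print(i, dict(degrees))
```

Out-degree counting is used here only because its partial results combine by simple addition. Iterative algorithms would need the two sides to exchange state every round, which this sketch does not model.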