An Efficient GPU Implementation of Inclusion-Based Pointer Analysis

Bibliographic Details
Published in: IEEE Transactions on Parallel and Distributed Systems, Vol. 27, No. 2, pp. 353-366
Main Authors: Su, Yu; Ye, Ding; Xue, Jingling; Liao, Xiang-Ke
Format: Journal Article
Language: English
Published: New York: IEEE, 1 February 2016
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1045-9219, 1558-2183
Description
Summary: We present an efficient GPU implementation of Andersen's whole-program inclusion-based pointer analysis, a fundamental analysis on which many others are based, including optimising compilers, bug detection and security analyses. Andersen's algorithm makes extensive modifications to the graph that represents the pointer-manipulating statements in a program. These modifications are highly irregular, input-dependent and statically unpredictable, making it much more challenging to balance such graph workloads across a multitude of GPU cores than those dealt with by traditional graph algorithms such as DFS and BFS. To parallelise Andersen's analysis efficiently on GPUs, we introduce an imbalance-aware workload partitioning scheme that divides its workload dynamically among the concurrent warps, initially in a warp-centric manner (during the coarse-grain stage) and later, once a workload imbalance is detected, using a task-pool-based model (during the fine-grain stage). We further improve its performance by using an adaptive group propagation scheme to reduce some redundant traversals. For a set of 14 C benchmarks evaluated, our parallel implementation of Andersen's analysis achieves a significant speedup of 46 percent on average over the state of the art on an NVIDIA Tesla K20c GPU.
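For readers unfamiliar with the analysis named in the summary, the following is a minimal sequential sketch of textbook inclusion-based (Andersen-style) pointer analysis in Python. It is not the authors' GPU implementation and does not model the warp-centric or task-pool partitioning; the function name `andersen`, the four constraint lists and the variable names are all hypothetical choices made for illustration. It shows only why the constraint graph changes during the analysis: load and store statements add new copy edges as points-to sets grow.

```python
from collections import defaultdict, deque


def andersen(addr_of, copies, loads, stores):
    """Sequential worklist sketch of inclusion-based pointer analysis.

    Each argument is a list of (p, q) pairs (a hypothetical input format):
      addr_of: p = &q      copies: p = q
      loads:   p = *q      stores: *p = q
    """
    pts = defaultdict(set)          # variable -> points-to set
    succ = defaultdict(set)         # copy edge src -> dst: pts(src) is a subset of pts(dst)
    loads_from = defaultdict(list)  # q -> [p] for each p = *q
    stores_to = defaultdict(list)   # p -> [q] for each *p = q

    for p, a in addr_of:
        pts[p].add(a)
    for p, q in copies:
        succ[q].add(p)
    for p, q in loads:
        loads_from[q].append(p)
    for p, q in stores:
        stores_to[p].append(q)

    worklist = deque(pts.keys())    # deque copies the keys, so later inserts are safe
    while worklist:
        n = worklist.popleft()
        # Load/store constraints add new edges to the graph as pts(n) grows.
        for a in list(pts[n]):
            for p in loads_from[n]:         # p = *n  adds edge a -> p
                if p not in succ[a]:
                    succ[a].add(p)
                    worklist.append(a)
            for q in stores_to[n]:          # *n = q  adds edge q -> a
                if a not in succ[q]:
                    succ[q].add(a)
                    worklist.append(q)
        # Propagate pts(n) along outgoing copy edges.
        for dst in list(succ[n]):
            before = len(pts[dst])
            pts[dst] |= pts[n]
            if len(pts[dst]) > before:
                worklist.append(dst)
    return {v: s for v, s in pts.items() if s}


# Example program: p = &a; q = &b; r = p; *r = q; s = *p
result = andersen(
    addr_of=[("p", "a"), ("q", "b")],
    copies=[("r", "p")],
    loads=[("s", "p")],
    stores=[("r", "q")],
)
```

In this example the store `*r = q` adds an edge from `q` to `a` only after `r` is found to point to `a`, which is the kind of input-dependent graph modification the summary describes as hard to balance across GPU cores.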
DOI: 10.1109/TPDS.2015.2397933