An Efficient GPU Implementation of Inclusion-Based Pointer Analysis

Bibliographic details
Published in: IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 2, pp. 353-366
Main authors: Su, Yu; Ye, Ding; Xue, Jingling; Liao, Xiang-Ke
Format: Journal article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 1 February 2016
ISSN: 1045-9219, 1558-2183
Online access: Full text
Abstract: We present an efficient GPU implementation of Andersen's whole-program inclusion-based pointer analysis, a fundamental analysis on which many others are based, including those used in optimising compilers, bug detection and security analyses. Andersen's algorithm makes extensive modifications to the graph that represents the pointer-manipulating statements in a program. These modifications are highly irregular, input-dependent and statically unpredictable, making it much more challenging to balance such graph workloads across a multitude of GPU cores than those dealt with by traditional graph algorithms such as DFS and BFS. To parallelise Andersen's analysis efficiently on GPUs, we introduce an imbalance-aware workload partitioning scheme that divides its workload dynamically among the concurrent warps, initially in a warp-centric manner (during the coarse-grain stage) and later, once a workload imbalance is detected, using a task-pool-based model (during the fine-grain stage). We further improve its performance by using an adaptive group propagation scheme to reduce some redundant traversals. For a set of 14 C benchmarks evaluated, our parallel implementation of Andersen's analysis achieves a significant speedup of 46 percent on average over the state of the art on an NVIDIA Tesla K20c GPU.
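To make the abstract's reference to graph modifications concrete, the following is a minimal sequential sketch of a textbook worklist formulation of Andersen's inclusion-based analysis. It is not the GPU implementation described in the article, and it includes neither the workload partitioning nor the group propagation scheme. The function name `andersen`, the four statement lists and the variable names are assumptions made for illustration only.

```python
from collections import defaultdict, deque


def andersen(addr_of, copies, loads, stores):
    """Sequential worklist sketch of inclusion-based pointer analysis.

    Each argument is a list of (p, q) pairs (hypothetical input format):
      addr_of: p = &q    copies: p = q    loads: p = *q    stores: *p = q
    Returns a dict mapping each variable to its non-empty points-to set.
    """
    pts = defaultdict(set)        # points-to set of each variable
    succ = defaultdict(set)       # edge src -> dst means pts(src) is a subset of pts(dst)
    load_from = defaultdict(set)  # q -> {p} for every statement p = *q
    store_to = defaultdict(set)   # p -> {q} for every statement *p = q
    work = deque()

    for p, a in addr_of:
        pts[p].add(a)
        work.append(p)
    for p, q in copies:
        succ[q].add(p)
    for p, q in loads:
        load_from[q].add(p)
    for p, q in stores:
        store_to[p].add(q)

    def add_edge(src, dst):
        # New edges are discovered during the analysis; this is the
        # input-dependent graph modification mentioned in the abstract.
        if dst not in succ[src]:
            succ[src].add(dst)
            if pts[src]:
                work.append(src)  # re-propagate src along the new edge

    while work:
        n = work.popleft()
        for v in list(pts[n]):
            for p in load_from[n]:  # p = *n, and n may point to v
                add_edge(v, p)
            for q in store_to[n]:   # *n = q, and n may point to v
                add_edge(q, v)
        for dst in list(succ[n]):
            before = len(pts[dst])
            pts[dst] |= pts[n]
            if len(pts[dst]) != before:
                work.append(dst)    # dst grew, so it must be revisited

    return {var: targets for var, targets in pts.items() if targets}


if __name__ == "__main__":
    # p = &a; q = &b; r = p; *r = q; s = *p
    result = andersen(
        addr_of=[("p", "a"), ("q", "b")],
        copies=[("r", "p")],
        loads=[("s", "p")],
        stores=[("r", "q")],
    )
    for var in sorted(result):
        print(var, sorted(result[var]))
```

In this small example the store `*r = q` adds an edge from `q` to `a`, and the load `s = *p` adds an edge from `a` to `s`. Neither edge exists until the points-to sets of `r` and `p` are known, which illustrates why the amount of work per node is hard to predict statically.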
CODEN: ITDSEO
Copyright: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), Feb 2016
DOI: 10.1109/TPDS.2015.2397933
Discipline: Engineering; Computer Science
Genre: Original research
Funding: Australian Research Council, grants DP130101970 and DP150102109 (funder ID 10.13039/501100000923)
Funding: National Natural Science Foundation of China, grant 61170049 (funder ID 10.13039/501100001809)
ISI cited references count: 9
Peer reviewed: yes
Scholarly: yes
Keywords: pointer analysis; compilers; GPGPU; parallel graph algorithms
License: https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/OAPA.html
Page count: 14
Subject terms: Adaptation models; Algorithm design and analysis; Algorithms; Balancing; Benchmarks; compilers; Computer information security; GPGPU; Graphics processing units; Graphs; Instruction sets; Optimization; Parallel graph algorithms; Partitioning algorithms; pointer analysis; Switches; Synchronization; Vectors; Workload
URI: https://ieeexplore.ieee.org/document/7029117
https://www.proquest.com/docview/1757796379
https://www.proquest.com/docview/1793271461