An Efficient GPU Implementation of Inclusion-Based Pointer Analysis

Bibliographic details
Published in:	IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 2, pp. 353-366
Main authors:	Su, Yu; Ye, Ding; Xue, Jingling; Liao, Xiang-Ke
Format:	Journal Article
Language:	English
Published:	New York: IEEE, 1 February 2016; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Adaptation models; Algorithm design and analysis; Algorithms; Balancing; Benchmarks; compilers; Computer information security; GPGPU; Graphics processing units; Graphs; Instruction sets; Optimization; Parallel graph algorithms; Partitioning algorithms; pointer analysis; Switches; Synchronization; Vectors; Workload
ISSN:	1045-9219, 1558-2183
Online access:	Full text

Abstract	We present an efficient GPU implementation of Andersen's whole-program inclusion-based pointer analysis, a fundamental analysis on which many others are based, including optimising compilers, bug detection and security analyses. Andersen's algorithm makes extensive modifications to the graph that represents the pointer-manipulating statements in a program. These modifications are highly irregular, input-dependent and statically unpredictable, making it much more challenging to balance such graph workloads across a multitude of GPU cores than those dealt with by traditional graph algorithms such as DFS and BFS. To parallelise Andersen's analysis efficiently on GPUs, we introduce an imbalance-aware workload partitioning scheme that divides the workload dynamically among the concurrent warps, initially in a warp-centric manner (during the coarse-grain stage), and then switches to a task-pool-based model when a workload imbalance is detected (during the fine-grain stage). We further improve its performance by using an adaptive group propagation scheme to reduce some redundant traversals. For a set of 14 C benchmarks evaluated, our parallel implementation of Andersen's analysis achieves a significant speedup of 46 percent on average over the state-of-the-art on an NVIDIA Tesla K20c GPU.
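Illustrative note: the abstract refers to the graph modifications that Andersen's analysis performs. The following is a minimal sequential sketch of a textbook worklist solver for inclusion constraints. It is not the GPU implementation described in the article. The function name `andersen`, its parameters and the tuple encoding of statements are assumptions made for illustration only.

```python
from collections import defaultdict, deque


def andersen(addr_of, copies, loads, stores):
    """Sequential worklist solver for inclusion-based pointer constraints.

    Hypothetical input encoding (not taken from the article):
      addr_of: pairs (p, a) for statements  p = &a
      copies:  pairs (p, q) for statements  p = q
      loads:   pairs (p, q) for statements  p = *q
      stores:  pairs (p, q) for statements  *p = q
    """
    pts = defaultdict(set)          # points-to set of each variable
    succ = defaultdict(set)         # edge src -> dst means pts(src) is a subset of pts(dst)
    loads_from = defaultdict(list)  # q -> [p, ...] for each p = *q
    stores_to = defaultdict(list)   # p -> [q, ...] for each *p = q

    for p, a in addr_of:
        pts[p].add(a)
    for p, q in copies:
        succ[q].add(p)
    for p, q in loads:
        loads_from[q].append(p)
    for p, q in stores:
        stores_to[p].append(q)

    worklist = deque(list(pts))
    while worklist:
        n = worklist.popleft()
        # Load and store constraints add new edges as pts(n) grows.
        # This is the input-dependent graph modification.
        for v in list(pts[n]):
            for p in loads_from[n]:      # p = *n adds edge v -> p
                if p not in succ[v]:
                    succ[v].add(p)
                    worklist.append(v)
            for q in stores_to[n]:       # *n = q adds edge q -> v
                if v not in succ[q]:
                    succ[q].add(v)
                    worklist.append(q)
        # Propagate pts(n) along outgoing edges; revisit nodes that changed.
        for d in list(succ[n]):
            before = len(pts[d])
            pts[d] |= pts[n]
            if len(pts[d]) != before:
                worklist.append(d)
    return {v: s for v, s in pts.items() if s}
```

Because edges appear only as points-to sets grow, the amount of work per node cannot be known in advance, which is the load-balancing difficulty the abstract describes for a GPU setting.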
CODEN	ITDSEO
ContentType	Journal Article
Copyright	Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Feb 2016
DOI	10.1109/TPDS.2015.2397933
Discipline	Engineering; Computer Science
EISSN	1558-2183
EndPage	366
Genre	orig-research
GrantInformation	Australian Research Council (grants DP130101970; DP150102109; funder ID 10.13039/501100000923); National Natural Science Foundation of China (grant 61170049; funder ID 10.13039/501100001809)
ISICitedReferencesCount	9
ISSN	1045-9219
IsPeerReviewed	true
IsScholarly	true
Issue	2
Keywords	pointer analysis; compilers; GPGPU; parallel graph algorithms
Language	English
License	https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/OAPA.html
PageCount	14
PublicationCentury	2000
PublicationDate	2016-02-01
PublicationDateYYYYMMDD	2016-02-01
PublicationDate_xml	– month: 02 year: 2016 text: 2016-02-01 day: 01
PublicationDecade	2010
PublicationPlace	New York
PublicationPlace_xml	– name: New York
PublicationTitle	IEEE transactions on parallel and distributed systems
PublicationTitleAbbrev	TPDS
PublicationYear	2016
Publisher	IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml	– name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage	353
Title	An Efficient GPU Implementation of Inclusion-Based Pointer Analysis
URI	https://ieeexplore.ieee.org/document/7029117 https://www.proquest.com/docview/1757796379 https://www.proquest.com/docview/1793271461
Volume	27