GPU-accelerated Path-based Timing Analysis

Path-based Analysis (PBA) is an important step in the design closure flow for reducing slack pessimism. However, PBA is extremely time-consuming. Recent years have seen many parallel PBA algorithms, but most of them are architecturally constrained by the CPU parallelism and do not scale beyond a few...

Published in: 2021 58th ACM/IEEE Design Automation Conference (DAC), pp. 721-726
Main authors: Guo, Guannan; Huang, Tsung-Wei; Lin, Yibo; Wong, Martin
Format: Conference paper
Language: English
Published: IEEE, 2021-12-05
Abstract Path-based Analysis (PBA) is an important step in the design closure flow for reducing slack pessimism. However, PBA is extremely time-consuming. Recent years have seen many parallel PBA algorithms, but most of them are architecturally constrained by CPU parallelism and do not scale beyond a few threads. To overcome this challenge, we propose in this paper a new fast and accurate PBA algorithm that harnesses the power of the graphics processing unit (GPU). We introduce GPU-efficient data structures, high-performance kernels, and efficient CPU-GPU task decomposition strategies to accelerate PBA to a new performance milestone. Experimental results show that our method can speed up the state-of-the-art algorithm by 543× on a design of 1.6 million gates with exact accuracy. At the extreme, our method with 1 CPU and 1 GPU outperforms the state-of-the-art algorithm running on 40 CPUs by 25-45×.
Author Huang, Tsung-Wei
Guo, Guannan
Wong, Martin
Lin, Yibo
Author_xml – sequence: 1
  givenname: Guannan
  surname: Guo
  fullname: Guo, Guannan
  organization: University of Illinois at Urbana-Champaign, Department of Electrical and Computer Engineering, IL, USA
– sequence: 2
  givenname: Tsung-Wei
  surname: Huang
  fullname: Huang, Tsung-Wei
  organization: University of Utah, Department of Electrical and Computer Engineering, Salt Lake City, UT, USA
– sequence: 3
  givenname: Yibo
  surname: Lin
  fullname: Lin, Yibo
  organization: Peking University, Department of Computer Science, Beijing, China
– sequence: 4
  givenname: Martin
  surname: Wong
  fullname: Wong, Martin
  organization: University of Illinois at Urbana-Champaign, Department of Electrical and Computer Engineering, IL, USA
ContentType Conference Proceeding
DOI 10.1109/DAC18074.2021.9586316
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
EISBN 9781665432740
1665432748
EndPage 726
ExternalDocumentID 9586316
Genre orig-research
ISICitedReferencesCount 23
IsPeerReviewed false
IsScholarly true
Language English
PageCount 6
PublicationCentury 2000
PublicationDate 2021-Dec.-5
PublicationDateYYYYMMDD 2021-12-05
PublicationDate_xml – month: 12
  year: 2021
  text: 2021-Dec.-5
  day: 05
PublicationDecade 2020
PublicationTitle 2021 58th ACM/IEEE Design Automation Conference (DAC)
PublicationTitleAbbrev DAC
PublicationYear 2021
Publisher IEEE
Publisher_xml – name: IEEE
SourceID ieee
SourceType Publisher
StartPage 721
SubjectTerms Data structures
Graphics processing units
Instruction sets
Logic gates
Parallel processing
Runtime
Timing
Title GPU-accelerated Path-based Timing Analysis
URI https://ieeexplore.ieee.org/document/9586316