Exploring GPU-Accelerated Routing for FPGAs
Field Programmable Gate Arrays (FPGAs) are reconfigurable architectures able to provide a good balance between energy efficiency and flexibility with respect to CPUs and ASICs. The main drawback in using FPGAs, however, is their time-consuming routing process, significantly hindering the designer...
| Published in: | IEEE transactions on parallel and distributed systems, Vol. 30, No. 6, pp. 1331-1345 |
|---|---|
| Main authors: | Shen, Minghua; Luo, Guojie; Xiao, Nong |
| Format: | Journal Article |
| Language: | English |
| Publication details: | New York: IEEE, 1 June 2019. The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| ISSN: | 1045-9219, 1558-2183 |
| Abstract | Field Programmable Gate Arrays (FPGAs) are reconfigurable architectures that offer a good balance between energy efficiency and flexibility relative to CPUs and ASICs. The main drawback of FPGAs, however, is their time-consuming routing process, which significantly hinders designer productivity. An emerging solution is to accelerate routing by parallelization. Existing attempts to parallelize FPGA routing either do not fully exploit the available parallelism or suffer from excessive quality loss. Massive parallelism on GPUs has the potential to solve this issue but faces non-trivial challenges. To cope with these challenges, this paper explores a GPU-accelerated routing approach for FPGAs. We reduce the problem size by limiting single-net routing to a small subgraph rather than the entire graph, which enables a GPU-friendly shortest path algorithm to be used in FPGA routing. We maintain convergence after problem size reduction through dynamic expansion of the routing resource subgraph, in which the routing region of the subgraph is progressively expanded until a feasible solution is found for each net. In addition, we use a GPU platform to explore fine-grained single-net parallel routing in three ways and propose a hybrid approach that combines static and dynamic parallelization for better speedup. To explore coarse-grained multi-net parallelization, we propose a dynamic programming-based partitioning algorithm that parallelizes the routing of multiple nets while generating routing results equivalent to those of the original single-net routing. Experimental results show that our approach provides an average speedup of about 21.53× on a single GPU with a tolerable loss in routing quality and maintains a scalable speedup on large-scale routing resource graphs. To our knowledge, this is the first work to demonstrate the effectiveness of GPU-accelerated routing for FPGAs. |
|---|---|
| Author | Luo, Guojie; Shen, Minghua; Xiao, Nong |
| Author_xml | 1. Shen, Minghua (ORCID 0000-0003-4747-8020; shenmh6@mail.sysu.edu.cn), School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, China. 2. Luo, Guojie (ORCID 0000-0003-4932-3655; gluo@pku.edu.cn), Center for Energy-Efficient Computing and Applications, School of Electronics Engineering and Computer Science, Peking University, Beijing, China. 3. Xiao, Nong (xiaon6@mail.sysu.edu.cn), School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, China. |
| CODEN | ITDSEO |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019 |
| DOI | 10.1109/TPDS.2018.2885745 |
| Discipline | Engineering; Computer Science |
| EISSN | 1558-2183 |
| EndPage | 1345 |
| ExternalDocumentID | 10_1109_TPDS_2018_2885745 8567949 |
| Genre | orig-research |
| GrantInformation_xml | National Natural Science Foundation of China (grants 61433019 and 61802446; funder ID 10.13039/501100001809); Guangdong Introducing Innovative and Entrepreneurial Teams (grant 2016ZT06D211) |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 6 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| ORCID | 0000-0003-4747-8020 0000-0003-4932-3655 |
| PageCount | 15 |
| PublicationDate | 2019-06-01 |
| PublicationPlace | New York |
| PublicationTitle | IEEE transactions on parallel and distributed systems |
| PublicationTitleAbbrev | TPDS |
| PublicationYear | 2019 |
| Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| SubjectTerms | Acceleration; Algorithms; Dynamic programming; Energy conversion efficiency; Field programmable gate arrays; FPGAs; Gate arrays; GPU parallelization; Graph theory; Graphics processing units; Hardware; Heuristic algorithms; Nickel; Parallel processing; reconfigurable architectures; Routing; Shortest-path problems; Size reduction |
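
The abstract's idea of routing a single net inside a small subgraph and expanding that subgraph only when no feasible path exists can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy 2D grid in place of a real routing resource graph, uses a plain CPU Dijkstra search as a stand-in for the GPU-friendly shortest path algorithm, and the names `route_in_box` and `route_with_expansion` are hypothetical.

```python
import heapq


def route_in_box(cost, src, dst, margin):
    """Dijkstra restricted to the src/dst bounding box grown by `margin` cells.

    `cost[r][c]` is the cost of entering a cell, or None if the cell is blocked.
    Returns the path cost, or None if dst is unreachable inside the box.
    """
    rows, cols = len(cost), len(cost[0])
    r_lo = max(0, min(src[0], dst[0]) - margin)
    r_hi = min(rows - 1, max(src[0], dst[0]) + margin)
    c_lo = max(0, min(src[1], dst[1]) - margin)
    c_hi = min(cols - 1, max(src[1], dst[1]) + margin)

    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == dst:
            return d
        if d > dist[(r, c)]:
            continue  # stale heap entry
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            # Skip neighbours outside the current routing region.
            if not (r_lo <= nr <= r_hi and c_lo <= nc <= c_hi):
                continue
            if cost[nr][nc] is None:
                continue  # blocked resource
            nd = d + cost[nr][nc]
            if nd < dist.get((nr, nc), float("inf")):
                dist[(nr, nc)] = nd
                heapq.heappush(heap, (nd, (nr, nc)))
    return None


def route_with_expansion(cost, src, dst):
    """Grow the routing region one cell at a time until a path is found.

    Returns (path_cost, margin_used); path_cost is None if no path exists
    even when the region covers the whole grid.
    """
    rows, cols = len(cost), len(cost[0])
    margin = 0
    while True:
        result = route_in_box(cost, src, dst, margin)
        if result is not None:
            return result, margin
        if margin >= max(rows, cols):
            return None, margin  # region already covers the whole grid
        margin += 1


if __name__ == "__main__":
    grid = [
        [1, None, 1],
        [1, None, 1],
        [1, 1, 1],
    ]
    print(route_with_expansion(grid, (0, 0), (0, 2)))
```

In this toy grid the blocked middle column forces the region to grow by two cells before the detour along the bottom row becomes visible, so the call returns a cost of 6 with a margin of 2. The point of the design is that most nets succeed in the small initial region, so the expensive whole-graph search is rarely needed.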