Design and Optimization of LLVM Compiler for Domestic High Performance Accelerator
National University of Defense Technology independently developed a high-performance accelerator that uses an on-chip heterogeneous fusion architecture of a Central Processing Unit (CPU) and General Purpose Digital Signal Processor (GPDSP). The GPDSP, with its Very Long Instruction Word (VLIW) + Single I...
| Published in: | Ji suan ji gong cheng, Vol. 50, No. 4, pp. 321-331 |
|---|---|
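| Authors: | SONG Qiang, TANG Junlong, CHEN Zhaoyun, SHI Yang, TAN Qixuan, XIAO Ziyang, ZOU Wanghui |
| Format: | Journal Article |
| Language: | Chinese; English |
| Publication details: | Editorial Office of Computer Engineering, 15 April 2024 |
| Subjects: | General Purpose Digital Signal Processor (GPDSP); Low Level Virtual Machine (LLVM); compiler; instruction scheduling; vector instruction interface |
| ISSN: | 1000-3428 |
| Online access: | Get full text: https://doaj.org/article/29849e8025874be4aee56206a67c0f94 |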
| Abstract | The National University of Defense Technology independently developed a high-performance accelerator that uses an on-chip heterogeneous fusion architecture combining a Central Processing Unit (CPU) and a General Purpose Digital Signal Processor (GPDSP). The GPDSP, with its Very Long Instruction Word (VLIW) + Single Instruction Multiple Data (SIMD) vectorization structure, is the acceleration core that provides the main support for peak performance. However, mainstream compilers do not adequately support such accelerators in three respects: instruction layout for data-intensive computation, static allocation of hardware execution units to instructions, and GPDSP-specific vector instructions. In this study, based on the Low Level Virtual Machine (LLVM) compilation framework, the PERP method, the Ant Colony Optimization (ACO) algorithm, and the structural characteristics of the GPDSP are combined to optimize the cost model in the pre-RA-sched stage, and the instruction scheduling module is designed to be register-pressure aware. The study also proposes an instruction scheduling strategy for the post-RA-sched stage that supports static functional unit allocation; a conflict detection mechanism guarantees that the allocation is correct, which provides a software basis for the parallel execution of instructions. Furthermore, a rich and regular set of vector instruction interfaces is encapsulated in the backend to support the GPDSP vector instructions. The experimental results show that the proposed LLVM optimization method supports the GPDSP well in both functionality and performance. Specifically, the average overall speedup ratio is 4.539 on the GCC testsuite, 4.49 on the SPEC CPU 2017 floating-point tests, and 3.24 on the SPEC CPU 2017 integer tests. Additionally, vector programs that use the vector interfaces achieve an average performance improvement ratio of 97.1%. |
|---|---|
| Author | SONG Qiang, TANG Junlong, CHEN Zhaoyun, SHI Yang, TAN Qixuan, XIAO Ziyang, ZOU Wanghui |
| Author affiliations | 1. School of Physical and Electronic Sciences, Changsha University of Science and Technology, Changsha 410114, Hunan, China; 2. School of Computer, National University of Defense Technology, Changsha 410073, Hunan, China |
| ContentType | Journal Article |
| DOI | 10.19678/j.issn.1000-3428.0067000 |
| DatabaseName | DOAJ Directory of Open Access Journals |
| Discipline | Engineering |
| EndPage | 331 |
| ISSN | 1000-3428 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 4 |
| Language | Chinese; English |
| OpenAccessLink | https://doaj.org/article/29849e8025874be4aee56206a67c0f94 |
| PageCount | 11 |
| PublicationDate | 2024-04-15 |
| PublicationTitle | Ji suan ji gong cheng |
| PublicationYear | 2024 |
| Publisher | Editorial Office of Computer Engineering |
| StartPage | 321 |
| SubjectTerms | General Purpose Digital Signal Processor (GPDSP); Low Level Virtual Machine (LLVM); compiler; instruction scheduling; vector instruction interface |
| Title | Design and Optimization of LLVM Compiler for Domestic High Performance Accelerator |
| URI | https://doaj.org/article/29849e8025874be4aee56206a67c0f94 |
| Volume | 50 |
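
The abstract describes a post-RA scheduling step that statically assigns each instruction to a functional unit and uses conflict detection to keep the assignment correct. The following is only a minimal Python sketch of that general idea, not the authors' LLVM implementation. The unit names (`ALU0`, `MUL0`, `LS0`), the opcode-to-unit table, the single-cycle latency, and the `Instr` and `schedule` names are all hypothetical and chosen for illustration; the real GPDSP unit set is not given in this record.

```python
from dataclasses import dataclass

# Hypothetical functional units and the opcodes each can execute.
# Illustrative assumption only: this is not the real GPDSP unit set.
UNITS_FOR_OP = {
    "add": ["ALU0", "ALU1"],
    "mul": ["MUL0"],
    "load": ["LS0"],
    "store": ["LS0"],
}


@dataclass
class Instr:
    name: str
    op: str
    deps: tuple = ()  # names of instructions that must issue in an earlier cycle


def schedule(instrs):
    """Greedy list scheduling into VLIW bundles with static unit assignment.

    `instrs` is assumed to be in a valid dependency (program) order and every
    instruction is assumed to have a latency of one cycle.
    Returns a list of bundles; each bundle maps unit name -> instruction name.
    """
    issue_cycle = {}
    bundles = []
    for ins in instrs:
        # An instruction may issue one cycle after its latest dependency.
        cycle = max((issue_cycle[d] + 1 for d in ins.deps), default=0)
        while True:
            while len(bundles) <= cycle:
                bundles.append({})
            # Conflict detection: a unit already used in this bundle is not free.
            free = [u for u in UNITS_FOR_OP[ins.op] if u not in bundles[cycle]]
            if free:
                bundles[cycle][free[0]] = ins.name
                issue_cycle[ins.name] = cycle
                break
            cycle += 1  # every candidate unit is busy, try the next bundle
    return bundles


program = [
    Instr("a", "load"),
    Instr("b", "load"),
    Instr("c", "add", ("a", "b")),
    Instr("d", "mul", ("a",)),
    Instr("e", "add", ("a",)),
]
bundles = schedule(program)
for cycle, bundle in enumerate(bundles):
    print(cycle, bundle)
```

In this sketch the two loads cannot share the single hypothetical load/store unit, so they land in consecutive bundles, while the independent `mul` and `add` are packed next to the second load. Because each bundle is keyed by unit name, no unit can be assigned twice in the same cycle.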