Design and Optimization of LLVM Compiler for Domestic High Performance Accelerator

Bibliographic details
Published in:	Ji suan ji gong cheng, Volume 50; Issue 4; pp. 321-331
Main authors:	SONG Qiang, TANG Junlong, CHEN Zhaoyun, SHI Yang, TAN Qixuan, XIAO Ziyang, ZOU Wanghui
Format:	Journal Article
Language:	Chinese, English
Publication details:	Editorial Office of Computer Engineering, 15.04.2024
Subjects:	general purpose digital signal processor (GPDSP); low level virtual machine (LLVM); compiler; instruction scheduling; vector instruction interface
ISSN:	1000-3428

Abstract	The National University of Defense Technology has independently developed a high-performance accelerator built on an on-chip heterogeneous fusion architecture that combines a Central Processing Unit (CPU) with a General Purpose Digital Signal Processor (GPDSP). The GPDSP, with its Very Long Instruction Word (VLIW) + Single Instruction Multiple Data (SIMD) vectorization structure, is the acceleration core that provides the main support for peak performance. However, mainstream compilers do not adequately support such accelerators in three respects: instruction layout for data-intensive computation, static allocation of hardware execution units to instructions, and GPDSP-specific vector instructions. Based on the Low Level Virtual Machine (LLVM) compilation framework, this study combines the PERP method, the Ant Colony Optimization (ACO) algorithm, and the structural characteristics of the GPDSP to optimize the cost model in the pre-RA-sched stage, and designs the instruction scheduling module to be register-pressure aware. For the post-RA-sched stage, it proposes an instruction scheduling strategy that supports static functional unit allocation; a conflict detection mechanism guarantees that the allocation is correct, which provides a software basis for the parallel execution of instructions. Furthermore, a rich and regular set of vector instruction interfaces is encapsulated in the backend to support the GPDSP vector instructions. The experimental results show that the proposed LLVM optimization method supports the GPDSP well in both functionality and performance: the average overall speedup ratio is 4.539 on the GCC testsuite, 4.49 on the SPEC CPU 2017 floating-point tests, and 3.24 on the SPEC CPU 2017 integer tests. Additionally, vector programs written with the vector interfaces achieve an average performance improvement ratio of 97.1%.
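The abstract mentions static functional unit allocation guarded by a conflict detection mechanism but gives no implementation detail. The following is only a minimal sketch of that general idea, not the authors' method: the unit names (`ALU0`, `MAC0`, `LS0`, ...), the opcode-to-unit table, and the functions `has_conflict` and `allocate_bundle` are all hypothetical, and the backtracking search is an assumption chosen for clarity.

```python
from typing import Dict, List, Optional, Set

# Hypothetical unit names and opcode-to-unit table; the abstract does not
# list the GPDSP's real functional units.
CAPABLE_UNITS: Dict[str, List[str]] = {
    "add": ["ALU0", "ALU1"],
    "mul": ["MAC0", "MAC1"],
    "load": ["LS0"],
    "store": ["LS0"],
}


def has_conflict(unit: str, busy: Set[str]) -> bool:
    """Conflict check: a unit may serve only one instruction per bundle."""
    return unit in busy


def allocate_bundle(opcodes: List[str]) -> Optional[Dict[int, str]]:
    """Assign each instruction of one VLIW bundle to a free, capable unit.

    Returns a mapping from instruction index to unit name, or None when
    the bundle cannot be issued in one cycle without a unit conflict.
    """
    assignment: Dict[int, str] = {}
    busy: Set[str] = set()

    def place(i: int) -> bool:
        if i == len(opcodes):
            return True  # every instruction has a unit
        for unit in CAPABLE_UNITS.get(opcodes[i], []):
            if has_conflict(unit, busy):
                continue
            busy.add(unit)
            assignment[i] = unit
            if place(i + 1):
                return True
            # Undo the choice and try the next capable unit.
            busy.remove(unit)
            del assignment[i]
        return False

    return assignment if place(0) else None
```

In this sketch, `allocate_bundle(["add", "add", "mul", "load"])` finds a conflict-free assignment, while `allocate_bundle(["load", "store"])` returns `None` because both instructions need the single load/store unit; a scheduler built on such a check would move one of them to a later bundle.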
Affiliations	1. School of Physical and Electronic Sciences, Changsha University of Science and Technology, Changsha 410114, Hunan, China; 2. School of Computer, National University of Defense Technology, Changsha 410073, Hunan, China
DOI	10.19678/j.issn.1000-3428.0067000
DatabaseName	DOAJ Directory of Open Access Journals
Discipline	Engineering
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
OpenAccessLink	https://doaj.org/article/29849e8025874be4aee56206a67c0f94
PageCount	11