ClozeMaster: Fuzzing Rust Compiler by Harnessing LLMs for Infilling Masked Real Programs

Bibliographic details
Published in: Proceedings / International Conference on Software Engineering, pp. 1422-1435
Main authors: Gao, Hongyan; Yang, Yibiao; Sun, Maolin; Wu, Jiangchang; Zhou, Yuming; Xu, Baowen
Format: Conference paper
Language: English
Publication details: IEEE, 26.04.2025
ISSN:1558-1225
Abstract Ensuring the reliability of the Rust compiler is of paramount importance, given the increasing adoption of Rust for critical systems development, which is driven by its emphasis on memory and thread safety. However, generating valid test programs for the Rust compiler poses significant challenges, given Rust's complex syntax and strict requirements. With the growing popularity of large language models (LLMs), much research in software testing has explored using LLMs to generate test cases. Still, directly using LLMs to generate Rust programs often results in a large number of invalid test cases. Existing studies have indicated that test cases that triggered historical compiler bugs can assist in software testing. Our investigation into Rust compiler bug issues supports this observation. Inspired by existing work and our empirical research, we introduce a bracket-based masking and filling strategy called clozeMask. The clozeMask strategy involves extracting test code from historical issue reports, identifying and masking code snippets with specific structures, and using an LLM to fill in the masked portions to synthesize new test programs. This approach harnesses the generative capabilities of LLMs while retaining the ability to trigger Rust compiler bugs. It enables comprehensive testing of the compiler's behavior, particularly the exploration of edge cases. We implemented our approach as a prototype, ClozeMaster. ClozeMaster has identified 27 confirmed bugs in rustc and mrustc, of which 10 have been fixed by developers. Furthermore, our experimental results indicate that ClozeMaster outperforms existing fuzzers in terms of code coverage and effectiveness.
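Illustrative note: the abstract describes clozeMask only at a high level, so the following is a minimal Python sketch of what bracket-based masking and filling could look like, not the authors' implementation. The names `MASK`, `bracket_spans`, `masked_variants`, and `fill` are hypothetical, and the `infill` callable merely stands in for an LLM call. The sketch ignores brackets inside string literals and comments, which a real tool would need to handle.

```python
from typing import Callable, Iterator

# Bracket pairs considered for masking (an assumption; the paper may use others).
PAIRS = {"(": ")", "[": "]", "{": "}"}
MASK = "<MASK>"  # hypothetical placeholder token


def bracket_spans(source: str) -> Iterator[tuple[int, int]]:
    """Yield (start, end) indices of the text inside each balanced bracket pair."""
    stack: list[tuple[str, int]] = []
    for i, ch in enumerate(source):
        if ch in PAIRS:
            stack.append((ch, i))
        elif stack and ch == PAIRS[stack[-1][0]]:
            _, start = stack.pop()
            yield start + 1, i


def masked_variants(source: str) -> Iterator[str]:
    """Yield one program per non-empty bracket pair, with its contents masked."""
    for start, end in bracket_spans(source):
        if end > start:  # skip empty pairs such as ()
            yield source[:start] + MASK + source[end:]


def fill(masked: str, infill: Callable[[str], str]) -> str:
    """Replace the mask with text produced by `infill` (a stand-in for an LLM)."""
    return masked.replace(MASK, infill(masked), 1)


seed = "fn main() { let v = vec![1, 2]; }"  # toy seed, not taken from a real issue
variants = list(masked_variants(seed))
```

Each variant keeps the surrounding structure of the seed program, so whatever made the original interesting to the compiler is preserved around the filled-in region.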
Author Zhou, Yuming
Xu, Baowen
Gao, Hongyan
Yang, Yibiao
Wu, Jiangchang
Sun, Maolin
Author_xml – sequence: 1
  givenname: Hongyan
  surname: Gao
  fullname: Gao, Hongyan
  email: hongyangao2023@smail.nju.edu.cn
  organization: Nanjing University,State Key Laboratory for Novel Software Technology,Nanjing,China
– sequence: 2
  givenname: Yibiao
  surname: Yang
  fullname: Yang, Yibiao
  email: yangyibiao@nju.edu.cn
  organization: Nanjing University,State Key Laboratory for Novel Software Technology,Nanjing,China
– sequence: 3
  givenname: Maolin
  surname: Sun
  fullname: Sun, Maolin
  email: merlin@smail.nju.edu.cn
  organization: Nanjing University,State Key Laboratory for Novel Software Technology,Nanjing,China
– sequence: 4
  givenname: Jiangchang
  surname: Wu
  fullname: Wu, Jiangchang
  email: jiangchangwu@smail.nju.edu.cn
  organization: Nanjing University,State Key Laboratory for Novel Software Technology,Nanjing,China
– sequence: 5
  givenname: Yuming
  surname: Zhou
  fullname: Zhou, Yuming
  email: zhouyuming@nju.edu.cn
  organization: Nanjing University,State Key Laboratory for Novel Software Technology,Nanjing,China
– sequence: 6
  givenname: Baowen
  surname: Xu
  fullname: Xu, Baowen
  email: bwxu@nju.edu.cn
  organization: Nanjing University,State Key Laboratory for Novel Software Technology,Nanjing,China
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/ICSE55347.2025.00175
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Xplore Digital Library (LUT)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 9798331505691
EISSN 1558-1225
EndPage 1435
ExternalDocumentID 11029729
Genre orig-research
GroupedDBID -~X
.4S
.DC
29O
5VS
6IE
6IF
6IH
6IK
6IL
6IM
6IN
8US
AAJGR
AAWTH
ABLEC
ADZIZ
ALMA_UNASSIGNED_HOLDINGS
ARCSS
AVWKF
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CBEJK
CHZPO
EDO
FEDTE
I-F
IEGSK
IJVOP
IPLJI
M43
OCL
RIE
RIL
RIO
ID FETCH-LOGICAL-a260t-a7883c3522691519908403e12f0f6816ec36fd39a73faf7309f77766b3a231e3
IEDL.DBID RIE
ISICitedReferencesCount 0
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001538318100111&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Wed Aug 27 01:40:13 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-a260t-a7883c3522691519908403e12f0f6816ec36fd39a73faf7309f77766b3a231e3
PageCount 14
ParticipantIDs ieee_primary_11029729
PublicationCentury 2000
PublicationDate 2025-April-26
PublicationDateYYYYMMDD 2025-04-26
PublicationDate_xml – month: 04
  year: 2025
  text: 2025-April-26
  day: 26
PublicationDecade 2020
PublicationTitle Proceedings / International Conference on Software Engineering
PublicationTitleAbbrev ICSE
PublicationYear 2025
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0006499
Score 2.3037012
Snippet Ensuring the reliability of the Rust compiler is of paramount importance, given increasing adoption of Rust for critical systems development, due to its...
SourceID ieee
SourceType Publisher
StartPage 1422
SubjectTerms bug detection
Codes
Computer bugs
Fuzzing
Instruction sets
large language model
Large language models
Prototypes
Reliability
Rust Compiler
Safety
Software engineering
Syntactics
Title ClozeMaster: Fuzzing Rust Compiler by Harnessing LLMs for Infilling Masked Real Programs
URI https://ieeexplore.ieee.org/document/11029729
WOSCitedRecordID wos001538318100111&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
link http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV1LTwIxEG6EePCED4ziIz14Xdnduu3WK4FAooQgB25kttsmRFjMPkzk1zstC3rx4K1pmzSZZub7pp0HIQ9xDFJHMvICqQEdlMB4oKLUE34iANAiGmNcswkxHsfzuZzUyeouF0Zr7YLP9KMdur_8dKMq-1TWRagKJbLBBmkIIXbJWgezy5G717lxgS-7o95bP4rYk0AfMLTvJoENJfzVQcUByKD1z6NPSfsnFY9ODiBzRo50dk5a-14MtFbNCzLvrTZbvQZb9-CZDqrtFrfTaVWU1G5G3c9p8kWHkFvbZtdeVuuCImWlo8wsXWVu-grFu07pFMmjPdQGbhVtMhv0Z72hV3dN8AB9k9IDdGqZcrxKIpwj2qAPxzRK3Tc8DrhWjJuUSRDMgEEFlwaFyXnCALmeZpekmW0yfUWoRDoEoIwKUbhgUpn4YQIyUIh8vuDmmrStoBYfu7oYi72MOn_M35ATexf2Lybkt6RZ5pW-I8fqs1wW-b27zW8IWKCl
linkProvider IEEE
linkToHtml http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV1LTwIxEG4UTfSED4xve_C6srtl261XAoEIhCAHbmS22yZEWcw-TOTXO10W9OLBW9M2aTLNzPdNOw9CHsMQpA5k4HhSAzoonnFABbEj3EgAoEU0xpTNJsRoFM5mclwlq5e5MFrrMvhMP9lh-Zcfr1Rhn8qaCFW-RDa4Tw6CVsv3NulaO8PLkb1X2XGeK5v99msnCFhLoBfo25cTzwYT_uqhUkJIt_7Pw09I4ycZj453MHNK9nRyRurbbgy0Us5zMmu_r9Z6CbbywTPtFus1bqeTIsup3Yzan9Loi_YgtdbNrg3elxlF0kr7iVmUtbnpELI3HdMJ0kd7qA3dyhpk2u1M2z2n6pvgAHonuQPo1jJVMiuJgI54g14c0yh31_DQ41oxbmImQTADBlVcGiEE5xEDZHuaXZBaskr0JaESCRGAMspH4YKJZeT6EUhPIfa5gpsr0rCCmn9sKmPMtzK6_mP-gRz1psPBfNAfvdyQY3sv9mfG57eklqeFviOH6jNfZOl9ebPfk2Kj7A
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=Proceedings+%2F+International+Conference+on+Software+Engineering&rft.atitle=Clozemaster%3A+Fuzzing+Rust+Compiler+by+Harnessing+Llms+for+Infilling+Masked+Real+Programs&rft.au=Gao%2C+Hongyan&rft.au=Yang%2C+Yibiao&rft.au=Sun%2C+Maolin&rft.au=Wu%2C+Jiangchang&rft.date=2025-04-26&rft.pub=IEEE&rft.eissn=1558-1225&rft.spage=1422&rft.epage=1435&rft_id=info:doi/10.1109%2FICSE55347.2025.00175&rft.externalDocID=11029729