Clozemaster: Fuzzing Rust Compiler by Harnessing Llms for Infilling Masked Real Programs

Ensuring the reliability of the Rust compiler is of paramount importance, given increasing adoption of Rust for critical systems development, due to its emphasis on memory and thread safety. However, generating valid test programs for the Rust compiler poses significant challenges, given Rust's...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Proceedings / International Conference on Software Engineering S. 1422 - 1435
Hauptverfasser:	Gao, Hongyan, Yang, Yibiao, Sun, Maolin, Wu, Jiangchang, Zhou, Yuming, Xu, Baowen
Format:	Tagungsbericht
Sprache:	Englisch
Veröffentlicht:	IEEE 26.04.2025
Schlagworte:	bug detection Codes Computer bugs Fuzzing Instruction sets large language model Large language models Prototypes Reliability Rust Compiler Safety Software engineering Syntactics
ISSN:	1558-1225
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Ensuring the reliability of the Rust compiler is of paramount importance, given increasing adoption of Rust for critical systems development, due to its emphasis on memory and thread safety. However, generating valid test programs for the Rust compiler poses significant challenges, given Rust's complex syntax and strict requirements. With the growing popularity of large language models (LLMs), much research in software testing has explored using LLMs to generate test cases. Still, directly using LLMs to generate Rust programs often results in a large number of invalid test cases. Existing studies have indicated that test cases triggering historical compiler bugs can assist in software testing. Our investigation into Rust compiler bug issues supports this observation. Inspired by existing work and our empirical research, we introduce a bracket-based masking and filling strategy called clozeMask. The clozeMask strategy involves extracting test code from historical issue reports, identifying and masking code snippets with specific structures, and using an LLM to fill in the masked portions for synthesizing new test programs. This approach harnesses the generative capabilities of LLMs while retaining the ability to trigger Rust compiler bugs. It enables comprehensive testing of the compiler's behavior, particularly exploring edge cases. We implemented our approach as a prototype ClozeMaster. ClozeMaster has identified 27 confirmed bugs for rustc and mrustc, of which 10 have been fixed by developers. Furthermore, our experimental results indicate that ClozeMaster outperforms existing fuzzers in terms of code coverage and effectiveness.
ISSN:	1558-1225
DOI:	10.1109/ICSE55347.2025.00175