Late Breaking Results: Fine-Tuning LLMs for Test Stimuli Generation

Bibliographic details
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC), pp. 1-2
Main authors: Park, Hyeonwoo; Park, Seonghyeon; Kang, Seokhyeong
Format: Conference paper
Language: English
Published: IEEE, 22 June 2025
Description
Summary: The understanding and reasoning capabilities of large language models (LLMs) with text data have made them widely used for test stimuli generation. Existing studies have primarily focused on methods such as prompt engineering or providing feedback on the LLMs' generated outputs to improve test stimuli generation. However, these approaches have not been successful in enhancing the LLMs' domain-specific performance in generating test stimuli. In this paper, we introduce a framework for fine-tuning LLMs for test stimuli generation through dataset generation and reinforcement learning (RL). Our dataset generation approach creates a table-shaped test stimuli dataset, which helps ensure that the LLM produces consistent outputs. Additionally, our two-stage fine-tuning process involves training the LLMs on domain-specific data and using RL to provide feedback on the generated outputs, further enhancing the LLMs' performance in test stimuli generation. Experimental results confirm that our framework improves syntax correctness and code coverage of test stimuli, outperforming commercial models.
DOI: 10.1109/DAC63849.2025.11132967