Late Breaking Results: Fine-Tuning LLMs for Test Stimuli Generation

Bibliographic Details
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC), pp. 1-2
Main Authors: Park, Hyeonwoo, Park, Seonghyeon, Kang, Seokhyeong
Format: Conference Proceeding
Language: English
Published: IEEE, 22.06.2025
Description
Summary: The understanding and reasoning capabilities of large language models (LLMs) with text data have made them widely used for test stimuli generation. Existing studies have primarily focused on methods such as prompt engineering or providing feedback on the LLMs' generated outputs to improve test stimuli generation. However, these approaches have not succeeded in enhancing the LLMs' domain-specific performance in generating test stimuli. In this paper, we introduce a framework for fine-tuning LLMs for test stimuli generation through dataset generation and reinforcement learning (RL). Our dataset generation approach creates a table-shaped test stimuli dataset, which helps ensure that the LLM produces consistent outputs. Additionally, our two-stage fine-tuning process involves training the LLMs on domain-specific data and using RL to provide feedback on the generated outputs, further enhancing the LLMs' performance in test stimuli generation. Experimental results confirm that our framework improves the syntax correctness and code coverage of test stimuli, outperforming commercial models.
DOI:10.1109/DAC63849.2025.11132967
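The record gives no implementation details beyond the abstract. As a rough illustration of the kind of RL feedback signal the abstract describes, which rewards syntax correctness and code coverage of generated stimuli, a minimal Python sketch follows; the iverilog compile check, the simulator-reported coverage input, and the weights are assumptions made for illustration, not the authors' implementation.

# Illustrative sketch only: a scalar reward combining syntax correctness and
# code coverage for RL fine-tuning of an LLM that generates test stimuli.
# The iverilog compile check, the coverage input, and the weights are
# assumptions for illustration, not the paper's implementation.
import subprocess
import tempfile

def syntax_ok(stimulus: str) -> bool:
    """Return True if the generated (Verilog) stimulus compiles cleanly."""
    with tempfile.NamedTemporaryFile("w", suffix=".v", delete=False) as f:
        f.write(stimulus)
        path = f.name
    # 'iverilog -t null' elaborates the source without producing output,
    # which serves as a syntax/elaboration check.
    result = subprocess.run(["iverilog", "-t", "null", path], capture_output=True)
    return result.returncode == 0

def reward(stimulus: str, coverage: float,
           w_syntax: float = 0.3, w_cov: float = 0.7) -> float:
    """Reward = bonus for compiling + weighted simulator-reported coverage in [0, 1]."""
    return w_syntax * float(syntax_ok(stimulus)) + w_cov * coverage

In a PPO-style fine-tuning loop, such a reward could be computed for each generated stimulus after simulation and used to update the policy model; a supervised stage on the table-shaped dataset followed by this RL stage would mirror the two-stage process the abstract describes.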