CADialogue: A multimodal LLM-powered conversational assistant for intuitive parametric CAD modeling
Recent advances in generative Artificial Intelligence (AI)—particularly Large Language Models (LLMs)—offer a new paradigm for CAD interaction by enabling natural and intuitive input through texts, images, and context-aware selections. In this study, we present CADialogue, a multimodal LLM-powered co...
Saved in:
| Published in: | Computer aided design Vol. 191; p. 104006 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: |
Elsevier Ltd
01.02.2026
|
| Subjects: | |
| ISSN: | 0010-4485 |
| Online Access: | Get full text |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Recent advances in generative Artificial Intelligence (AI)—particularly Large Language Models (LLMs)—offer a new paradigm for CAD interaction by enabling natural and intuitive input through texts, images, and context-aware selections. In this study, we present CADialogue, a multimodal LLM-powered conversational assistant to enable intuitive parametric CAD modeling through natural language, speech, image, and selection-based geometry interactions. Built on a general-purpose large language model, CADialogue translates user prompts into executable code to support geometry creation and context-aware editing. The system features a modular architecture that decouples prompt handling, refinement logic, and execution—allowing seamless model replacement as LLMs develop—and includes caching for rapid reuse of validated designs. We evaluate the system on 70 modeling and 10 editing tasks across varying difficulty levels, assessing performance in terms of accuracy, refinement behavior, and execution time. Results show an overall success rate of 95.71%, combining a 91.43% baseline under Text-Only input with additional recoveries enabled by Text + Image input, with robust recovery from failure via self-correction and human-in-the-loop refinement. Comparative analysis reveals that image input improves success in semantically complex prompts but introduces additional processing time. Furthermore, caching confirmed macros yields over 85.71% speedup in repeated executions. These findings highlight the potential of general-purpose LLMs for enabling accessible, iterative, and accurate CAD modeling workflows without domain-specific fine-tuning. The source code and dataset for CADialogue are available at https://github.com/Hiram31/CADialogue. |
|---|---|
| ISSN: | 0010-4485 |
| DOI: | 10.1016/j.cad.2025.104006 |