An agentic vision-action framework for generative 3D architectural modeling from sketches.
| Title: | An agentic vision-action framework for generative 3D architectural modeling from sketches. |
|---|---|
| Authors: | Zhong, Ximing, Liang, Jiadong, Meng, Xianchuan, Li, Yingkai, Fricker, Pia, Koh, Immanuel |
| Source: | International Journal of Architectural Computing; Sep 2025, Vol. 23 Issue 3, p679-700, 22p |
| Subject Terms: | THREE-dimensional modeling, ARCHITECTURAL design, SPACE perception, HUMAN-computer interaction, INTELLIGENT agents, GENERATIVE artificial intelligence |
| Abstract: | In recent years, advances in generative AI have enabled the direct generation of 3D models from sketches or images, offering new possibilities in architectural design. However, most current AI-driven modeling approaches still operate as "black boxes," exhibiting issues such as opaque modeling processes, non-editable outputs, and a lack of semantic depth. In the field of architectural design, ideal tools should not only support structured component generation and spatial reasoning but also facilitate iterative workflows and collaborative creation. To address these challenges, inspired by the iterative design processes of human architects, we propose an agentic vision-action framework to assist architects in reasoning about and generating controllable, explainable 3D models from simple sketches. The framework involves the collaboration of multiple AI agents—including a Vision Agent, a 3D Reasoning Agent, a Reflection Agent, and a Data-Driven 3D Layout Agent—that collectively support sketch interpretation, spatial reasoning, and the generation of editable, structured 3D models. By integrating vision-language models (VLMs) with data-driven techniques, the system predicts detailed 3D spatial layouts and enables intuitive modifications through both visual and language inputs. Experimental results show that our approach surpasses existing methods in sketch interpretation, spatial reasoning, and structured 3D model generation. The outputs are not only editable and semantically rich but also composed of interpretable and traceable modeling steps, highlighting the potential of AI to assist architects in explainable and controllable design workflows. Instead of replicating human cognition, the framework is designed to augment it by enabling iterative feedback loops that interpret ambiguity, co-evolve design intent, and support co-constructive human–AI collaboration. [ABSTRACT FROM AUTHOR] |
| Copyright: | Copyright of International Journal of Architectural Computing is the property of Sage Publications Inc. and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.) |
| Database: | Complementary Index |