Do LLMs Speak BPMN? An Evaluation of Their Process Modeling Capabilities Based on Quality Measures.

Bibliographic Details
Title:	Do LLMs Speak BPMN? An Evaluation of Their Process Modeling Capabilities Based on Quality Measures.
Authors:	Drakopoulos, Panagiotis; Malousoudis, Panagiotis; Nousias, Nikolaos; Tsakalidis, George; Vergidis, Kostas
Source:	Computation; Jan2026, Vol. 14 Issue 1, p10, 19p
Subject Terms:	BUSINESS process modeling, LANGUAGE models, READABILITY (Literary style), AUTOMATION, EVALUATION research, STATISTICAL accuracy, FLOW charts
Abstract:	Large Language Models (LLMs) are emerging as powerful tools for automating business process modeling, promising to streamline the translation of textual process descriptions into Business Process Model and Notation (BPMN) diagrams. However, the extent to which these AI systems can produce high-quality BPMN models has not yet been rigorously evaluated. This paper presents an early evaluation of five LLM-powered BPMN generation tools that automatically convert textual process descriptions into BPMN models. To assess the external quality of these AI-generated models, we introduce a novel structured evaluation framework that scores each BPMN diagram across three key process model quality dimensions: clarity, correctness, and completeness, covering both accuracy and diagram understandability. Using this framework, we conducted experiments where each tool was tasked with modeling the same set of textual process scenarios, and the resulting diagrams were systematically scored based on the criteria. This approach provides a consistent and repeatable evaluation procedure and offers a new lens for comparing LLM-based modeling capabilities. Given the focused scope of the study, the results should be interpreted as an exploratory benchmark that surfaces initial observations about tool performance rather than definitive conclusions. Our findings reveal that while current LLM-based tools can produce BPMN diagrams that capture the main elements of a process description, they often exhibit errors such as missing steps, inconsistent logic, or modeling rule violations, highlighting limitations in achieving fully correct and complete models. The clarity and readability of the generated diagrams also vary, indicating that these AI models are still maturing in generating easily interpretable process flows. We conclude that although LLMs show promise in automating BPMN modeling, significant improvements are needed for them to consistently generate both syntactically and semantically valid process models. [ABSTRACT FROM AUTHOR]
	Copyright of Computation is the property of MDPI and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database:	Biomedical Index

Description
ISSN:	20793197
DOI:	10.3390/computation14010010