Using LLMs to Extract UML Class Diagrams from Java and Python Programs: An Empirical Study

Bibliographic details
Title: Using LLMs to Extract UML Class Diagrams from Java and Python Programs: An Empirical Study
Authors: Siala, Hanan, Lano, Kevin
Source: Siala, H & Lano, K 2025, 'Using LLMs to Extract UML Class Diagrams from Java and Python Programs : An Empirical Study', CEUR Workshop Proceedings, vol. 4122.
Publication year: 2025
Collection: King's College, London: Research Portal
Subjects: Unified Modeling Language (UML), UML Class Diagram, Model-driven Reverse Engineering (MDRE), Machine Learning, Large Language Models (LLMs), Java programs, Python programs
Description: In this paper, we present a comprehensive study of the capabilities of five large language models (LLMs), namely StarCoder2, LLaMA, CodeLlama, Mistral, and DeepSeek, for abstracting UML class diagrams from code, with the aim of providing researchers and developers with insights into the capabilities and limitations of using various LLMs in a model-driven reverse engineering process. We evaluate the LLMs by prompting them to generate UML class diagrams for both Java and Python programs, focusing on accuracy, consistency, and F1 score. Our findings reveal that all LLMs achieve higher accuracy and F1 scores for Python than for Java. DeepSeek and Mistral perform best overall, while LLaMA consistently scores lowest on all metrics and for both languages.
Document type: article in journal/newspaper
File description: application/pdf
Language: English
Availability: https://kclpure.kcl.ac.uk/portal/en/publications/73e37044-11bc-48ac-8cff-42d183c617b6
https://kclpure.kcl.ac.uk/ws/files/362383754/Using_LLMs_to_extract_UML_class_diagrams_from_Java_and_Version_of_Record.pdf
https://ceur-ws.org/Vol-4122/
Rights: info:eu-repo/semantics/openAccess ; http://creativecommons.org/licenses/by/4.0/
Accession number: edsbas.EE6BED0B
Database: BASE