Using LLMs to Extract OCL Specifications from Java and Python Programs: An Empirical Study

Detailed bibliography
Title: Using LLMs to Extract OCL Specifications from Java and Python Programs: An Empirical Study
Authors: Siala, Hanan; Lano, Kevin
Source: Siala, H & Lano, K 2025, 'Using LLMs to Extract OCL Specifications from Java and Python Programs: An Empirical Study', CEUR Workshop Proceedings, vol. 4122.
Publication year: 2025
Collection: King's College, London: Research Portal
Subjects: Object Constraint Language (OCL), Machine Learning, Large Language Models (LLMs), Reverse engineering, Java programs, Python programs
Description: This paper presents a comprehensive study of the application of several open-source Large Language Models (LLMs) for abstracting Object Constraint Language (OCL) specifications from source code. We aim to provide researchers and developers with insights into the capabilities and limitations of using different LLMs to abstract OCL specifications from code. We evaluate a collection of open-source LLMs of comparable size (StarCoder2, LLaMA, CodeLlama, Mistral, and DeepSeek) by prompting them to generate OCL specifications for both Java and Python programs. The results show that both Mistral and DeepSeek outperform other LLMs in abstracting OCL specifications from both languages.
Document type: article in journal/newspaper
File description: application/pdf
Language: English
Availability: https://kclpure.kcl.ac.uk/portal/en/publications/1badd32b-4bb5-4104-8a41-f0d7543d96bc
https://kclpure.kcl.ac.uk/ws/files/362382813/Using_LLMs_to_extract_OCL_specifications_from_Java_and_Python_programs_Version_of_Record.pdf
https://ceur-ws.org/Vol-4122/
Rights: info:eu-repo/semantics/openAccess ; http://creativecommons.org/licenses/by/4.0/
Accession number: edsbas.B08DF974
Database: BASE