Investigating large language models capabilities for automatic code repair in Python.

Detailed bibliography
Title:	Investigating large language models capabilities for automatic code repair in Python.
Authors:	Omari, Safwan, Basnet, Kshitiz, Wardat, Mohammad
Source:	Cluster Computing; Nov2024, Vol. 27 Issue 8, p10717-10731, 15p
Subjects:	LANGUAGE models, CHATGPT, ALGORITHMS, ENGINEERING, DOCUMENTATION
Abstract:	Developers often encounter challenges with their introductory programming tasks as part of the development process. Unfortunately, rectifying these mistakes manually can be time-consuming and demanding. Automated program repair (APR) techniques offer a potential solution by synthesizing fixes for such errors. Previous research has investigated the utilization of both symbolic and neural techniques within the APR domain. However, these approaches typically demand significant engineering efforts or extensive datasets and training. In this paper, we explore the potential of using a large language model trained on code, specifically, we assess ChatGPT's capability to detect and repair bugs in simple Python programs. The experimental evaluation encompasses two benchmarks: QuixBugs and Textbook. Each benchmark consists of simple Python functions that implement well-known algorithms and each function contains a single bug. To gauge repair performance in various settings, several benchmark variations were introduced including addition of plain English documentation and code obfuscation. Based on thorough experiments, we found that ChatGPT was able to correctly detect and fix about 50% of the methods, when code is documented. Repair performance drops to 25% when code is obfuscated, and 15% when documentation is removed and code is obfuscated. Furthermore, when compared to existing APR systems, ChatGPT considerably outperformed them. [ABSTRACT FROM AUTHOR]
Database:	Complementary Index

ISSN:	13867857
DOI:	10.1007/s10586-024-04490-8