EX-CODE: A Robust and Explainable Model to Detect AI-Generated Code.

Saved in:
Bibliographic Details
Title: EX-CODE: A Robust and Explainable Model to Detect AI-Generated Code.
Authors: Bulla, Luana, Midolo, Alessandro, Mongiovì, Misael, Tramontana, Emiliano
Source: Information; Dec2024, Vol. 15 Issue 12, p819, 17p
Subject Terms: LANGUAGE models, ARTIFICIAL intelligence, EDUCATION ethics, CHATGPT, CLASSIFICATION
Abstract: Distinguishing whether a portion of code was implemented by a human or generated by a tool based on artificial intelligence has become hard. However, such a classification would be important, as it could point developers towards further validation of the produced code. It also holds significant importance in security, legal contexts, and educational settings, where upholding academic integrity is of utmost importance. We present EX-CODE, a novel and explainable model that leverages the probability of the occurrence of tokens within a code snippet, estimated by a language model, to distinguish human-written from AI-generated code. EX-CODE has been evaluated on a heterogeneous real-world dataset and stands out for its ability to provide human-understandable explanations of its outcomes. It achieves this by uncovering the features that cause a snippet of code to be classified as human-written (or AI-generated). [ABSTRACT FROM AUTHOR]
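The abstract's core idea, that AI-generated code tends to receive systematically different token probabilities under a language model than human-written code, can be illustrated with a minimal sketch. This is not EX-CODE's actual model: a Laplace-smoothed unigram model stands in for the real language model, and the `threshold` value is an arbitrary illustrative assumption.

```python
import math
from collections import Counter

def mean_nll(tokens, model_counts, total):
    """Mean negative log-likelihood of tokens under a toy,
    Laplace-smoothed unigram 'language model' (a stand-in for
    the real LM used by detectors like EX-CODE)."""
    vocab = len(model_counts) + 1  # +1 for unseen tokens
    logps = [
        math.log((model_counts[t] + 1) / (total + vocab))
        for t in tokens
    ]
    return -sum(logps) / len(logps)

def classify(snippet, model_counts, total, threshold):
    """Label a snippet: LM-generated text tends to be high-probability
    under an LM, i.e. to have a LOWER mean NLL than human-written text.
    The threshold here is purely illustrative."""
    tokens = snippet.split()
    score = mean_nll(tokens, model_counts, total)
    return "AI-generated" if score < threshold else "human-written"
```

A snippet made of tokens the model has seen often scores a lower mean NLL than one full of rare tokens, which is the signal a probability-based detector thresholds or feeds into a classifier.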
Copyright of Information is the property of MDPI.
Database: Complementary Index
ISSN: 2078-2489
DOI: 10.3390/info15120819