
SemetonBug: A Machine Learning Model for Automatic Bug Detection in Python Code Based on Syntactic Analysis


Detailed bibliography
Title:	SemetonBug: A Machine Learning Model for Automatic Bug Detection in Python Code Based on Syntactic Analysis
Authors:	Bahtiar Imran, Selamet Riadi, Emi Suryadi, M. Zulpahmi, Zaeniah Zaeniah, Erfan Wahyudi
Source:	Jurnal Informatika. 12:75-80
Publisher information:	Universitas Bina Sarana Informatika, 2025.
Publication year:	2025
Description:	Bug detection in Python programming is a crucial aspect of software development. This study develops an automated bug detection system that combines feature extraction based on the Abstract Syntax Tree (AST) with a Random Forest Classifier model. The dataset consists of 100 manually classified bugged files and 100 non-bugged files. The model is trained on structural code features such as the numbers of functions, classes, variables, conditions, and exception-handling constructs. Evaluation results indicate an accuracy of 86.67%, with balanced precision and recall across both classes. Confusion matrix analysis shows that false positives and false negatives occur, albeit in relatively low numbers. The accuracy curve suggests potential overfitting, as training accuracy is higher than testing accuracy. The study demonstrates that combining AST-based feature extraction with Random Forest can be an effective approach to automated bug detection, with potential improvements through model optimization and a larger dataset.
Document type:	Article
ISSN:	2528-2247 2355-6579
DOI:	10.31294/inf.v12i2.25340
Rights:	CC BY SA
Accession number:	edsair.doi...........ee702b6f060dd02ced4a2a60eff657db
Database:	OpenAIRE
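
The description mentions structural features counted from the AST (functions, classes, variables, conditions, exception handling). The record does not include the authors' code, so the following is only a minimal sketch of how such counts might be gathered with Python's standard `ast` module. The function name `extract_features`, the feature names, and the choice to count every assignment to a name as a "variable" are assumptions made for illustration, not the paper's actual definitions.

```python
import ast
from collections import Counter

# Hypothetical feature names; the paper's exact feature set is not given in this record.
FEATURE_NAMES = ["functions", "classes", "variables", "conditions", "exception_handling"]


def extract_features(source: str) -> dict:
    """Count a few structural AST features of a Python source string."""
    tree = ast.parse(source)
    counts = Counter()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            counts["functions"] += 1
        elif isinstance(node, ast.ClassDef):
            counts["classes"] += 1
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            # Assumption: each binding of a plain name counts as one "variable".
            counts["variables"] += 1
        elif isinstance(node, (ast.If, ast.IfExp)):
            counts["conditions"] += 1
        elif isinstance(node, ast.Try):
            counts["exception_handling"] += 1
    # Return the features in a fixed order so rows line up across files.
    return {name: counts[name] for name in FEATURE_NAMES}


sample = '''
class Greeter:
    def greet(self, name):
        if name:
            message = "Hello " + name
        else:
            message = "Hello"
        try:
            print(message)
        except Exception:
            pass
'''

features = extract_features(sample)
print(features)
```

Each file would yield one such row of counts; a table of these rows, together with the bugged or non-bugged labels, could then be passed to a classifier such as scikit-learn's `RandomForestClassifier`, which is the model family the description names.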


