Advanced Deep Learning Model for Improving Student Evaluation in Programming Education

Bibliographic Details
Published in:	2025 International Symposium on iNnovative Informatics of Biskra (ISNIB) pp. 1 - 6
Main Authors:	Chachoui, Yasmine; Azizi, Nabiha
Format:	Conference Proceeding
Language:	English
Published:	IEEE 28.01.2025
Subjects:	Accuracy; Automated Assessment; Bagging; Code Embeddings; Codes; Education; Feature extraction; Informatics; Machine Learning; Predictive models; Programming Education; Programming profession; Random forests; Transformers; UniXcoder

Description
Summary:	This paper investigates the potential applications of the pre-trained language model UniXcoder in programming education, with a focus on the utility of its embeddings in evaluating students' solutions. Due to its transformer-based architecture, UniXcoder is capable of generating syntactically and semantically meaningful code embeddings, enabling it to effectively understand and represent code patterns. The study uses these embeddings as input to predictive models that classify student submissions as correct or incorrect. Various machine learning models were compared on the Dublin introductory programming submission dataset (7,490 instances), the Singapore dataset (4,393 instances), and the Annaba dataset (87 instances). The primary contribution is the comparison of UniXcoder-based embeddings with traditional feature extraction methods. Results demonstrate that UniXcoder embeddings improve prediction performance, with the best model being the MLP with UniXcoder embeddings, achieving an accuracy of 96.06%, F1 score of 96.63%, Cohen's Kappa of 0.91, and ROC-AUC of 97.83% for the Dublin dataset. For the Singapore dataset, the MLP model achieved an accuracy of 87.71%, F1 score of 89.09%, Cohen's Kappa of 0.7504, and ROC-AUC of 92.73%. Additionally, the Balanced Bagging model on the Annaba dataset showed strong performance with an accuracy of 94.12%, F1 score of 95.24%, Cohen's Kappa of 0.8759, and ROC-AUC of 98.48%. These findings indicate that UniXcoder embeddings represent a promising approach for improving automated programming assessment.
DOI:	10.1109/ISNIB64820.2025.10983895
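
Illustrative sketch: The summary describes feeding code embeddings to a classifier and reporting accuracy, F1 score, Cohen's Kappa, and ROC-AUC. The Python snippet below is a minimal sketch of that evaluation loop using scikit-learn, not the authors' implementation. Everything specific in it is an assumption: the embeddings are random placeholder vectors (768 dimensions is assumed here for illustration) rather than real UniXcoder outputs, and the helper names, MLP layer size, and 80/20 split are hypothetical choices. It therefore will not reproduce the figures reported in the summary.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    cohen_kappa_score,
    f1_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier


def make_placeholder_embeddings(n_samples=400, dim=768, seed=0):
    """Return synthetic stand-ins for code embeddings and correctness labels.

    Assumption: real inputs would be one embedding vector per student
    submission; here we draw Gaussian noise and shift the "correct" class.
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n_samples)  # 1 = correct, 0 = incorrect
    vectors = rng.normal(size=(n_samples, dim)) + 0.3 * labels[:, None]
    return vectors, labels


def evaluate_mlp(X, y, seed=0):
    """Train an MLP on a stratified split and return the four metrics."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    # Hypothetical architecture: one hidden layer of 128 units.
    clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=300, random_state=seed)
    clf.fit(X_train, y_train)

    predicted = clf.predict(X_test)
    # ROC-AUC needs a score for the positive class, not a hard label.
    positive_score = clf.predict_proba(X_test)[:, 1]

    return {
        "accuracy": accuracy_score(y_test, predicted),
        "f1": f1_score(y_test, predicted),
        "cohen_kappa": cohen_kappa_score(y_test, predicted),
        "roc_auc": roc_auc_score(y_test, positive_score),
    }


X, y = make_placeholder_embeddings()
metrics = evaluate_mlp(X, y)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

To adapt the sketch, `make_placeholder_embeddings` would be replaced by a step that encodes each submission with the embedding model and pairs it with its pass/fail label; the evaluation function can stay unchanged.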