Evaluating Handwritten Answers Using DeepSeek: A Comparative Analysis of Deep Learning-Based Assessment


Detailed Bibliography
Published in: International Journal of Computational Intelligence Systems, Vol. 18, No. 1, pp. 209-16
Main authors: Bansal, Sanskar; Gupta, Vinay; Gupta, Eshita; Garg, Peeyush
Medium: Journal Article
Language: English
Published: Dordrecht: Springer Netherlands (Springer Nature B.V.), 12.08.2025
ISSN: 1875-6891, 1875-6883
Description
Summary: Artificial intelligence is revolutionizing the education sector by making learning more accessible, efficient, and personalized. Recent advances in AI have sparked significant interest in automating the evaluation of handwritten answers. Traditional handwritten evaluation is influenced by the evaluator's mental and physical state, environmental factors, human bias, emotional swings, and logistical challenges such as storage and retrieval. Although sequence-to-sequence neural networks and other existing AI evaluation methods have shown promise, they are constrained by their reliance on high-performance hardware such as GPUs, by lengthy training periods, and by difficulty handling varied scenarios. Bidirectional Encoder Representations from Transformers (BERT), a state-of-the-art technique, overcame the drawbacks of earlier NLP approaches such as Bag of Words, TF-IDF, and Word2Vec. However, BERT depends on surface-level keyword similarity: when an answer expresses the correct idea in different words, its accuracy suffers. This study presents a technique that combines optical character recognition (OCR) with the DeepSeek-R1 1.5B model to create a robust, efficient, and accurate grading system. To address the challenges above, the proposed evaluation pipeline uses the Google Cloud Vision API to extract handwritten responses and convert them into machine-readable text, providing pre-processed input for subsequent evaluation. The main aim of this study is to develop a scalable, automated, and effective system for grading handwritten responses by combining DeepSeek for response evaluation with the Google Cloud Vision API for text extraction. To assess the performance of the proposed DeepSeek evaluation method, we compare its results against cosine-similarity metrics.
After testing on multiple assignments, DeepSeek's independent evaluation method gave the best results: the lowest MAE (0.0580), the lowest RMSE (0.147), and the strongest correlation (0.895). These findings indicate that the proposed technique is reliable and accurate.
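The evaluation metrics cited above (MAE, RMSE, correlation) and the cosine-similarity baseline follow standard definitions. As a minimal illustrative sketch, assuming score lists normalized to [0, 1] (the example values below are hypothetical, not data from the paper):

```python
import math

def mae(pred, gold):
    """Mean absolute error between model-assigned and human-assigned scores."""
    return sum(abs(p - g) for p, g in zip(pred, gold)) / len(pred)

def rmse(pred, gold):
    """Root mean squared error between model and human scores."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gold)) / len(pred))

def pearson(pred, gold):
    """Pearson correlation between model and human scores."""
    n = len(pred)
    mp, mg = sum(pred) / n, sum(gold) / n
    cov = sum((p - mp) * (g - mg) for p, g in zip(pred, gold))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    sg = math.sqrt(sum((g - mg) ** 2 for g in gold))
    return cov / (sp * sg)

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (the baseline metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical example: model scores vs. human reference scores.
model_scores = [0.90, 0.75, 0.60, 0.85]
human_scores = [0.95, 0.70, 0.65, 0.80]
print(round(mae(model_scores, human_scores), 4))   # 0.05
print(round(rmse(model_scores, human_scores), 4))  # 0.05
```

Lower MAE/RMSE and higher correlation against human grades are what the study uses to rank the DeepSeek-based evaluator above the cosine-similarity baseline.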
DOI: 10.1007/s44196-025-00946-w