Ensemble learning with RAG model to reduce redundant question topics in auto-generated exam questions

Bibliographic details
Published in: Discover Computing, Vol. 28, No. 1, pp. 1-17
Main authors: R. Tharaniya Sairaj, S. R. Balasundaram
Format: Journal Article
Language: English
Publication details: Springer, 01.08.2025
ISSN: 2948-2992
Description
Summary: Reducing redundant question topics during automatic question generation (AQG) is essential for enhancing the quality of test sheets in assessment. Existing AQG models frequently generate repetitive questions due to insufficient named entity (Question Topic) diversity. This study aims to reduce redundancy in auto-generated questions by improving diversity in question topics. The methodology is organised in three main phases. First, a fuzzy ontology mapping technique with ensemble learning is applied to generate an expanded entity-relationship set from external Knowledge Graphs. Second, the generated entity-relationship set is integrated with a Retrieval Augmented Generation (RAG) model for source text expansion via augmentation. Third, T5-shd, a pretrained AQG model, is adopted to reduce repetition in generated questions. Comparison against baselines such as T5-e2e and T5-ppl shows a substantial performance gain as well as a reduction in redundancy. Experimental results on various datasets show that the proposed RAG + T5 model reduces redundant question topics while improving the ROUGE-2 metric (up to 5%) and BERTScore (up to 12%) over existing methods. The application of Ensemble Pruning, specifically with the m-EPIC algorithm, further enhances accuracy while reducing computational overhead (around 26%). These findings highlight the efficacy of combining ensemble learning with RAG-based transformers to refine AQG, ensuring improved question diversity and balanced relevance of generated questions. Additionally, this approach helps to reduce model complexity in Automatic Question Generation.
DOI: 10.1007/s10791-025-09683-2