A Novel Oversampling Technique to Solve Class Imbalance Problem: A Case Study of Students' Grades Evaluation
| Published in: | 2021 International Conference on Computing, Networking, Telecommunications & Engineering Sciences Applications (CoNTESA), pp. 69-75 |
|---|---|
| Main authors: | , , , , , |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 9 December 2021 |
| Subjects: | |
| Online access: | Full text |
| Abstract: | Students' academic performance is one of the critical factors in ranking educational institutions, particularly at the secondary level. If student performance is not appropriately defined, the institution's reputation is at risk. Data mining can be used for this purpose to attain high accuracy. However, data that is incomplete, inaccurate, or noisy, or a dataset with imbalanced class labels, is highly likely to affect the accuracy of the data mining model. This paper proposes a semi-supervised oversampling method that first prepares a balanced dataset and then classifies students' grades into a binary class reflecting overall performance in a given course. The student performance dataset from the UCI Machine Learning Repository is used, which contains student performance data for two different courses. Detailed validation results show that the decision tree algorithm performs better on the balanced dataset than on the imbalanced one. |
|---|---|
| DOI: | 10.1109/CoNTESA52813.2021.9657151 |
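
The abstract describes balancing the dataset before classification but does not spell out the semi-supervised oversampling procedure. As a rough illustration of the general idea only, the sketch below uses plain random oversampling, which is a simpler, swapped-in technique and not the paper's method. The function name `random_oversample`, the toy rows, and the pass mark of 10 on a 0-20 grade scale are all assumptions made for this example.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class.

    Note: this is plain random oversampling, not the semi-supervised
    method proposed in the paper.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows that belong to this class
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data, made up for illustration: (absences, final grade on a 0-20 scale)
toy_rows = [(2, 15), (0, 12), (4, 11), (1, 17), (3, 13), (6, 14), (10, 6), (12, 8)]
# Binary label; the pass mark of 10 is an assumption, not taken from the paper
toy_labels = ["pass" if grade >= 10 else "fail" for _, grade in toy_rows]

balanced_rows, balanced_labels = random_oversample(toy_rows, toy_labels)
print(Counter(toy_labels))       # class counts before balancing
print(Counter(balanced_labels))  # class counts after balancing
```

In a setup like the one the abstract describes, the balanced rows and labels would then be passed to a classifier such as a decision tree, with oversampling applied to the training split only so that duplicated rows do not leak into the evaluation data.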