Predicting Diabetes Mellitus with Machine Learning Techniques


Detailed bibliography
Published in: Al-Iraqia Journal for Scientific Engineering Research, Vol. 4, No. 2, pp. 20-32
Main authors: Ahmed Jassim, Heba; R. Kadhim, Omar; Khduair Taha, Zahraa; Siaw Paw, Johnny Koh; Tak, Yaw Chong; Kiong, Tiong Sieh
Format: Journal Article
Language: English
Publication details: Al-Iraqia University - College of Engineering, 19 June 2025
ISSN: 2710-2165
Description
Summary: Blood sugar disorders are a major health problem worldwide, with incidence growing rapidly and affecting human health, economic systems, and societal structures. If diabetes remains undiagnosed and untreated, blood sugar levels can vary significantly, and in critical cases this can damage essential organs such as the kidneys, the eyes, and the arteries of the heart. As a result, the medical community is increasingly focused on the prevention and early detection of diabetes mellitus. Using machine learning algorithms to analyze appropriate datasets for early disease prediction could prove life-saving. The objective of this paper is to examine four algorithms proposed to enhance the diagnosis of diabetes. The research analyzes how effectively these machine learning algorithms process datasets with minority classes. The four algorithms are evaluated on the Diabetes Prediction Dataset using the classification report (accuracy, precision, recall, and F1-score), the confusion matrix, and the ROC AUC. The Artificial Neural Network (ANN) deserves particular mention: it achieves 97% accuracy, demonstrating that it can classify both common and less common instances. The Random Forest and Decision Tree models also perform well, with only incremental differences in their results, suggesting that they handle the dataset well. The Support Vector Machine (SVM) model, however, performs worse than the other three at 96.36% and seems to struggle to classify less frequent instances correctly, so it has difficulty distinguishing prominent classes from less prominent ones.
Notably, the ANN, Random Forest, and Decision Tree models are more likely to identify rare cases, an important property when dealing with datasets that have class imbalance.
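The evaluation measures named in the summary (accuracy, precision, recall, F1-score, the confusion matrix, and ROC AUC) can be illustrated with a minimal Python sketch. It is not the authors' code and does not use the Diabetes Prediction Dataset: the labels, scores, and the 0.5 threshold below are hypothetical toy values chosen so that the positive class is the minority, and the function names are made up for this illustration.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_summary(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy plus precision, recall, and F1 for the positive (minority) class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (pairwise formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 8 negatives and 2 positives (the minority class).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.1, 0.3, 0.2, 0.4, 0.1, 0.6, 0.7, 0.4]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold of 0.5

summary = classification_summary(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(confusion_counts(y_true, y_pred))
print(summary)
print(auc)
```

On this toy data, accuracy is noticeably higher than recall on the minority class, which is why a classification report and ROC AUC are more informative than accuracy alone when classes are imbalanced.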
DOI: 10.58564/IJSER.4.2.2025.315