Predicting Diabetes Mellitus with Machine Learning Techniques
| Published in: | Al-Iraqia Journal for Scientific Engineering Research Vol. 4; no. 2; pp. 20-32 |
|---|---|
| Main authors: | , , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Al-Iraqia University - College of Engineering, 19 June 2025 |
| ISSN: | 2710-2165 |
| Summary: | Blood sugar disorders are a major health problem worldwide; their incidence is growing rapidly and affects human health, economic systems, and societal structures. If diabetes remains undiagnosed and untreated, blood sugar levels can vary significantly, and in critical cases this can damage essential organs such as the kidneys, the eyes, and the arteries of the heart. As a result, the medical community is increasingly focused on the prevention and early detection of diabetes mellitus. Using machine learning algorithms to analyze appropriate datasets for early disease prediction could prove life-saving. This paper examines four algorithms proposed to improve the diagnosis of diabetes and analyzes how effectively they process datasets with minority classes. The four algorithms are evaluated on the Diabetes Prediction Dataset using the classification report (accuracy, precision, recall, and F1-score), the confusion matrix, and the ROC AUC. The Artificial Neural Network (ANN) stands out, achieving 97% accuracy, which indicates that it classifies both common and less common instances well. The Random Forest and Decision Tree models also perform strongly, differing only marginally in their results, which suggests that they handle the dataset well. The Support Vector Machine (SVM) model scores lowest of the four, at 96.36%, and appears to struggle to classify less frequent instances correctly; it therefore has difficulty distinguishing between majority and minority classes. Notably, the ANN, Random Forest, and Decision Tree models are more effective at identifying rare cases, an important property when dealing with datasets that have class imbalance. |
|---|---|
| DOI: | 10.58564/IJSER.4.2.2025.315 |
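
The summary reports accuracy alongside precision, recall, and F1-score because accuracy alone can look strong on an imbalanced dataset even when the minority class is missed. The sketch below illustrates that point in plain Python. It is not the authors' code: the function names `confusion_counts` and `classification_metrics`, the label vectors, and the 90/10 class split are all hypothetical, and the numbers it prints are not results from the paper. ROC AUC is left out because it needs predicted scores rather than hard labels.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for the positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up toy labels (NOT the paper's data): 90 negatives, 10 positives.
y_true = [0] * 90 + [1] * 10

# A degenerate "classifier" that never predicts the minority class.
always_negative = [0] * 100

# A hypothetical model: 2 false alarms, 7 of the 10 positives found.
some_model = [0] * 88 + [1] * 2 + [1] * 7 + [0] * 3

for name, preds in [("always_negative", always_negative),
                    ("some_model", some_model)]:
    print(name, classification_metrics(y_true, preds))
```

In this toy setup the always-negative predictor still reaches high accuracy while its recall on the minority class is zero, which is why per-class metrics and the confusion matrix are the more informative view under class imbalance.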