IMPROVING SUPPORT VECTOR MACHINE PERFORMANCE WITH BINARY GAUSSIAN IMPROVED WHALE OPTIMIZATION ALGORITHM: A CASE STUDY ON DIABETES DATA

Bibliographic Details
Title: IMPROVING SUPPORT VECTOR MACHINE PERFORMANCE WITH BINARY GAUSSIAN IMPROVED WHALE OPTIMIZATION ALGORITHM: A CASE STUDY ON DIABETES DATA
Authors: Eric Julianto, Natasha Clarrisa Maharani, Syaiful Anam, Safrizal Ardana Ardiyansa, Haidar Ahmad Fajri
Source: BAREKENG: Jurnal Ilmu Matematika dan Terapan. 19:2531-2542
Publisher Information: Universitas Pattimura, 2025.
Publication Year: 2025
Description: Diabetes mellitus is a chronic condition characterized by high blood sugar that can cause severe organ damage and affects people of all ages worldwide. Early diagnosis is crucial for improving patients' quality of life, and machine learning offers a promising approach. The Support Vector Machine (SVM) is effective for classification, but feature selection is essential to ensure that the features it uses are relevant. The Whale Optimization Algorithm (WOA) is a global optimization method suited to feature selection, but it has one drawback, premature convergence, which can lead to suboptimal results. This issue is addressed by modifying the mutation operation, the convergence factor, and the population initialization, resulting in the Binary Gaussian Improved WOA (BGIWOA). This research focuses on feature selection using BGIWOA and compares it with selection based on the Variance Inflation Factor (VIF), using SVM as the classifier in both cases. The results show that BGIWOA is better than VIF and that the best BGIWOA configuration uses a linear kernel. This configuration produces the best accuracy of 95.00%. BGIWOA-SVM demonstrates better and more consistent accuracy than VIF-SVM. The best SVM model achieves an average accuracy of 95.62% on the training data and 95.58% on the validation data, with an accuracy of 93.85% on the test data. This model also yields an average precision of 94.00%, a recall of 91.00%, and an F1-score of 92.00%. The model also outperformed SVM without optimization, which achieved a training accuracy of only 84.25% and a testing accuracy of 81.30%. This model can assist in diagnosing diabetes with accurate and consistent predictions for new data. The results are specific to the diabetes dataset used in this research, so further testing on other binary datasets is necessary to confirm the model's effectiveness and generalizability across different domains and types of data.
Document Type: Article
ISSN: 2615-3017, 1978-7227
DOI: 10.30598/barekengvol19iss4pp2531-2542
Rights: CC BY SA
Accession Number: edsair.doi...........05e176f3d0a6aaabf28c704c72d1f3e1
Database: OpenAIRE
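The abstract reports an average precision of 94.00% and a recall of 91.00% alongside a score of 92.00%. As a quick consistency check, the sketch below computes the F1-score (the harmonic mean of precision and recall) from those two reported averages. It assumes the reported score is an F1-score and that combining the averaged precision and recall approximates the averaged per-class F1; the function name `f1_score` is illustrative and not taken from the article.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Averages reported in the abstract, expressed as fractions.
reported_precision = 0.94
reported_recall = 0.91

# Assumption: F1 of the averaged metrics approximates the averaged F1.
f1 = f1_score(reported_precision, reported_recall)
print(f"F1 from reported averages: {f1:.4f}")
```

Under these assumptions the value rounds to about 0.92, which agrees with the 92.00% stated in the abstract.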