Predicting Stroke Through Health Data Analysis and Stacking Integration Algorithm

Bibliographic Details
Published in: 2025 5th International Conference on Consumer Electronics and Computer Engineering (ICCECE), pp. 1-5
Main Authors: Wang, Yi; Shang, Xinping; Dong, A'ni; Zhang, Chulan
Format: Conference Proceeding
Language: English
Published: IEEE, 28 February 2025
Online Access: Full text
Description
Summary: Stroke is the second leading cause of death worldwide and a significant contributor to the global burden of disability. This study aims to analyze health data from community residents using ensemble learning algorithms to screen for early stroke symptoms, enhancing the reliability and accuracy of predictions and reducing healthcare costs. We propose a Stacking Integration method, SIXCL, which combines three gradient boosting decision tree algorithms (XGBoost, CatBoost, and LightGBM) and integrates them with L2-regularized logistic regression through stacking. To address the extreme class imbalance, oversampling techniques were employed to balance the dataset. Experimental results demonstrate that the SIXCL method achieved a prediction accuracy of 96.4%, significantly outperforming single models. Furthermore, the study identified key factors influencing stroke prediction. This research provides a cost-effective approach to early stroke prevention and screening and offers recommendations for future stroke prevention strategies.
DOI: 10.1109/ICCECE65250.2025.10984591
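
The summary mentions oversampling to balance the dataset but does not say which technique was used. As an illustration only, the sketch below shows plain random oversampling (duplicating minority-class rows until the classes are equal in size) using the Python standard library. The function name `random_oversample`, the two feature columns, and the toy records are all hypothetical and do not come from the paper.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class (assumed technique)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw with replacement to fill the gap to the majority size.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Made-up toy data: [age, BMI] for 8 non-stroke (0) and 2 stroke (1) records.
X = [[34, 22.1], [45, 27.3], [51, 24.8], [29, 21.0], [62, 30.2],
     [38, 23.5], [57, 26.9], [41, 25.4], [73, 31.0], [68, 28.7]]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

After the call, both classes contain 8 rows. In practice, oversampling is usually applied only to the training split, so that duplicated records do not leak into the evaluation data.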