Predicting Stroke Through Health Data Analysis and Stacking Integration Algorithm

Bibliographic Details
Published in: 2025 5th International Conference on Consumer Electronics and Computer Engineering (ICCECE), pp. 1-5
Main Authors: Wang, Yi; Shang, Xinping; Dong, A'ni; Zhang, Chulan
Format: Conference Proceeding
Language: English
Published: IEEE 28.02.2025
Description
Summary: Stroke is the second leading cause of death worldwide and a significant contributor to the global burden of disability. This study aims to analyze health data from community residents using ensemble learning algorithms to screen for early stroke symptoms, enhancing the reliability and accuracy of predictions and reducing healthcare costs. We propose a Stacking Integration method, SIXCL, which combines three gradient boosting decision tree algorithms (XGBoost, CatBoost, and LightGBM) and integrates them with logistic regression with L2 regularization through stacking techniques. To address the extreme data imbalance, oversampling techniques were employed to balance the dataset. Experimental results demonstrate that the SIXCL method achieved a prediction accuracy of 96.4%, significantly outperforming single models. Furthermore, the study identified key factors influencing stroke prediction. This research provides a cost-effective approach to early stroke prevention and screening and offers recommendations for future stroke prevention strategies.
DOI: 10.1109/ICCECE65250.2025.10984591