An Imbalanced-Data Processing Algorithm for the Prediction of Heart Attack in Stroke Patients
| Published in: | IEEE Access, vol. 9, pp. 25394-25404 |
|---|---|
| Main authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Piscataway: IEEE, 2021. The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 2169-3536 |
| Summary: | Predicting heart attack early among stroke patients through data analysis is one approach to reducing the high mortality rate. Stroke-patient data from the Intensive Care Unit are imbalanced because stroke patients who suffer a heart attack are a minority of all stroke patients, which makes predicting heart attack from these data a challenge. To process the imbalanced data, this paper designs an algorithm that combines random undersampling, clustering, and oversampling techniques, called the undersampling-clustering-oversampling algorithm (UCO algorithm for short). The UCO algorithm generates nearly balanced data, which are used to train machine-learning models for predicting heart attack. Extensive experiments on the Medical Information Mart for Intensive Care III database are conducted to evaluate the UCO algorithm. Setting the undersampling number to 120, denoted UCO(120), shows good performance in helping machine-learning classifiers extract features. Five classifiers are separately deployed to predict heart attack based on the outputs of UCO(120). Our results show that the random forest classifier achieves the best predictive performance, with an accuracy of 70.29% and a precision of 70.05%. Whether a stroke patient will have a heart attack can thus be predicted well using UCO(120) and random forest. |
|---|---|
| DOI: | 10.1109/ACCESS.2021.3057693 |
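
The summary names three steps (random undersampling, clustering, oversampling) but does not say how they are combined. The following is a minimal Python sketch of one plausible reading, not the authors' algorithm: the function names `kmeans` and `uco_sketch`, the choice to cluster the minority class, the interpolation-based oversampling, the cluster count `k`, and the interpretation of 120 as the number of majority samples kept are all assumptions made for illustration.

```python
import random


def kmeans(points, k, iters=20, rng=None):
    """Tiny k-means on tuples of floats; returns the non-empty clusters."""
    rng = rng or random.Random(0)
    centers = rng.sample(points, k)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest center (squared Euclidean distance).
            idx = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            clusters[idx].append(p)
        # Recompute centers; an empty cluster keeps its previous center.
        centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return [cl for cl in clusters if cl]


def uco_sketch(majority, minority, n_under=120, k=3, seed=0):
    """Hypothetical undersample-cluster-oversample pipeline (assumed design)."""
    rng = random.Random(seed)
    # Step 1 (assumption): randomly keep n_under majority-class samples.
    kept = rng.sample(majority, min(n_under, len(majority)))
    # Step 2 (assumption): cluster the minority class.
    clusters = kmeans(minority, min(k, len(minority)), rng=rng)
    # Step 3 (assumption): create synthetic minority samples by interpolating
    # between two points of the same cluster until the classes are the same size.
    synthetic = []
    while len(minority) + len(synthetic) < len(kept):
        cluster = rng.choice(clusters)
        a, b = rng.choice(cluster), rng.choice(cluster)
        t = rng.random()
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    X = kept + minority + synthetic
    y = [0] * len(kept) + [1] * (len(minority) + len(synthetic))
    return X, y


# Toy data only; these are not patient records.
data_rng = random.Random(1)
majority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(500)]
minority = [(data_rng.gauss(2, 1), data_rng.gauss(2, 1)) for _ in range(30)]
X, y = uco_sketch(majority, minority)
print(y.count(0), y.count(1))
```

This sketch balances the two classes exactly, whereas the summary describes the UCO output as "nearly balanced", so the stopping rule used by the authors likely differs. The resampled `X` and `y` would then be passed to a classifier such as a random forest.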