Learning from imbalanced COVID-19 chest X-ray (CXR) medical imaging data

Bibliographic Details
Published in:Methods (San Diego, Calif.) Vol. 202; pp. 31 - 39
Main Authors: Chan, Jonathan H., Li, Chenqi
Format: Journal Article
Language:English
Published: United States Elsevier Inc 01.06.2022
ISSN:1046-2023, 1095-9130
Description
Summary:
• Presented a systematic approach to learning from an imbalanced set of biomedical images.
• Developed a practical “survival of the fittest” approach for hyperparameter tuning of models.
• Proposed a framework that uses left-out imbalanced data for pseudo-testing.
• Provided a publicly available chest X-ray dataset on the Kaggle platform.
• Outperformed global competitors in terms of F1 and recall scores on the given dataset.

Digital medical image analysis has been evolving continually and is an area of prominent and growing importance from both research and deployment perspectives. Nonetheless, the algorithms, the methodology, and the source of the medical image data must be strictly scrutinized. As the COVID-19 pandemic has gripped much of the world, much effort has gone into developing affordable testing for the masses, and it has been shown that established and widely available chest X-ray (CXR) images may be used as a screening criterion for assistive diagnosis. Thanks to the dedicated work of various individuals and organizations, CXR images of COVID-19 subjects are publicly available for analysis. We have also provided a publicly available CXR dataset on the Kaggle platform. As a case study, this paper presents a systematic approach to learning from a typically imbalanced set of CXR images, which includes a limited number of publicly available COVID-19 images. Our results show that we are able to outperform the top finishers in a related Kaggle multi-class CXR challenge. The proposed methodology should help guide medical personnel in obtaining a robust diagnosis model to discern COVID-19 from other conditions confidently.
DOI:10.1016/j.ymeth.2021.06.002