A Light Gradient-Boosting Machine algorithm with Tree-Structured Parzen Estimator for breast cancer diagnosis
Breast cancer is a common and potentially life-threatening disease. Early and accurate diagnosis of breast cancer is crucial for effective treatment and improved patient outcomes. This study proposed using the Light Gradient-Boosting Machine (LightGBM) algorithm, Borderline- Synthetic Minority Overs...
Saved in:
| Published in: | Healthcare analytics (New York, N.Y.) Vol. 4; p. 100218 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: |
Elsevier Inc
01.12.2023
Elsevier |
| Subjects: | |
| ISSN: | 2772-4425, 2772-4425 |
| Online Access: | Get full text |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Breast cancer is a common and potentially life-threatening disease. Early and accurate diagnosis of breast cancer is crucial for effective treatment and improved patient outcomes. This study proposed using the Light Gradient-Boosting Machine (LightGBM) algorithm, Borderline- Synthetic Minority Oversampling Technique (SMOTE), and the Tree-Structured Parzen Estimator (TPE) for hyperparameter tuning to enhance the effectiveness of the Machine Learning (ML) model for diagnosing breast cancer. A 10-fold cross-validated TPE optimized Borderline-SMOTE LightGBM classifier was modelled on the Wisconsin Diagnostic Breast Cancer (WDBC) Dataset and evaluated for its performance compared to a baseline LightGBM model. The TPE-optimized Borderline-SMOTE LightGBM model exhibited a significant improvement in performance over the baseline model, achieving an average accuracy of 99.12%, specificity of 100%, precision of 100%, recall of 97.62%, F1-score of 98.80%, and a Mathews Correlation Coefficient of 98.12%. Compared to previous studies, the TPE-optimized Borderline-SMOTE LightGBM model performed exceptionally well. The study demonstrates the effectiveness of using data augmentation and hyperparameter optimization techniques to improve the performance of ML models for breast cancer diagnosis, which has significant implications for the medical field where the accurate and efficient diagnosis of breast cancer is critical.
•Propose a light gradient-boosting machine algorithm with a tree-structured Parzen estimator for breast cancer diagnosis.•The proposed model achieved an accuracy of 99.12%.•The proposed model performed exceptionally well compared to previous studies.•The model has the potential to support physicians in breast cancer diagnosis.•The study’s contributions have implications for breast cancer diagnosis and treatment. |
|---|---|
| ISSN: | 2772-4425 2772-4425 |
| DOI: | 10.1016/j.health.2023.100218 |