Predicting Tropical Cyclone Extreme Rainfall in Guangxi, China: An Interpretable Machine Learning Framework Addressing Class Imbalance and Feature Optimization
| Published in: | Meteorological Applications, Vol. 32, No. 3 |
|---|---|
| Main authors: | , , , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Chichester, UK; John Wiley & Sons, Ltd; 1 May 2025; Wiley |
| Subjects: | |
| ISSN: | 1350-4827, 1469-8080 |
| Summary: | ABSTRACT
Accurate prediction of tropical cyclone‐induced extreme rainfall (TCER) is of utmost importance for disaster mitigation in coastal regions. However, it remains a formidable challenge due to the intricate interactions among multi‐scale meteorological factors and the inherent data imbalances. This study presented an interpretable machine learning (ML) framework aimed at predicting both the occurrence and magnitude of TCER in Guangxi (GX), China. The framework integrated three supervised learning algorithms, namely XGBoost, Random Forest, and AdaBoost, along with feature selection techniques and an explainable method. A total of 202 experiments were conducted to comprehensively evaluate the framework's performance. Genetic Algorithm (GA) optimization and Shapley additive explanations (SHAP) were utilized to identify the optimal subsets of features and accurately quantify the contributions of each variable. Results showed that the optimized XGBoost model exhibited outstanding performance, integrating 18 predictors across dynamic, thermodynamic, moisture, and precursor variables, with a Threat Score of 0.41 for the classification of TCER occurrence and a Threat Score of 0.49 for the regression of rainfall magnitude, outperforming the TIGGE ensemble data in case studies of typhoons Chaba (2022) and Doksuri (2023). SHAP analysis revealed that Distance to Track is the most crucial factor for TCER occurrence. It also unveiled the existence of nonlinear interactions. For instance, an increase in vertical wind shear, favorable thermal conditions, ascending motion, and subtropical high activity can substantially amplify the likelihood of TCER when coupled with low‐level humidity accumulation. Moreover, time‐lagged variables and time‐evolution variables demonstrated their ability to capture the precursor signals of TCER events, like humidity accumulation, circulation adjustment, and typhoon intensity changes, highlighting the model's effectiveness in considering these factors. 
Therefore, this study showcases the great potential of ML in enhancing TCER prediction while maintaining physical interpretability. Additionally, it offers a valuable reference for addressing imbalance issues in similar research fields.
This study developed a machine learning (ML) framework to predict TC‐induced extreme rainfall events and their magnitudes in Guangxi (GX), China, providing a reference for the solution of imbalance problems in similar fields. This study demonstrates the transformative potential of ML in advancing TCER prediction while maintaining physical interpretability. |
|---|---|
| Bibliography: | Funding: This work was supported by the Natural Science Foundation of Guangxi (2023GXNSFBA026346, 2024GXNSFBA010259, 2020GXNSFAA159092, and 2023GXNSFBA026349), the Innovation and Development Special Project of China Meteorological Administration (CXFZ2025J013), the Guangxi Key Research and Development Program (Guike AB25069165), the Laboratory of Beihai National Climate Observatory (BNCO‐N202301), and the Guangxi Meteorological Research Program Project (Guiqike2024QN04). |
| DOI: | 10.1002/met.70052 |
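
The summary reports Threat Scores of 0.41 (occurrence) and 0.49 (magnitude) but the record does not define the metric. As a rough illustration only, the sketch below assumes the standard contingency-table definition, hits / (hits + misses + false alarms); the function name `threat_score` and the sample data are hypothetical and are not taken from the article, and how the authors thresholded rainfall magnitude for the regression score is not stated here.

```python
from typing import Sequence


def threat_score(observed: Sequence[bool], predicted: Sequence[bool]) -> float:
    """Threat Score (Critical Success Index) for binary event forecasts.

    Assumed definition: hits / (hits + misses + false_alarms).
    Correct negatives are deliberately ignored.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    pairs = list(zip(observed, predicted))
    hits = sum(1 for o, p in pairs if o and p)
    misses = sum(1 for o, p in pairs if o and not p)
    false_alarms = sum(1 for o, p in pairs if p and not o)

    denominator = hits + misses + false_alarms
    # With no observed or predicted events the score is undefined.
    return hits / denominator if denominator else float("nan")


# Hypothetical sample: 3 observed events, 3 predicted events, 2 of them matching.
sample_observed = [True, True, False, False, True, False, False, False]
sample_predicted = [True, False, True, False, True, False, False, False]
sample_score = threat_score(sample_observed, sample_predicted)
```

Because correct negatives do not enter the ratio, the score is not inflated by the many non-event cases, which is one reason it is commonly used for rare events such as extreme rainfall.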