Optimal Channel Selection for FY-4B GIIRS Explainable Machine Learning Cloud Detection Algorithm
| Published in: | IEEE Transactions on Geoscience and Remote Sensing, Vol. 63, pp. 1-12 |
|---|---|
| Main authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: IEEE, 2025. The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 0196-2892, 1558-0644 |
| Summary: | Cloud detection is a crucial preliminary step for assimilating meteorological satellite observations and retrieving other atmospheric parameters. This article presents an explainable machine learning (ML) algorithm for cloud detection using observations from the FY-4B Geostationary Interferometric Infrared Sounder (GIIRS). Four ML models, namely random forest (RF), light gradient boosting machine (LightGBM), categorical boosting (CatBoost), and extreme gradient boosting (XGBoost), were first evaluated for their effectiveness in cloud detection. After feature importance analysis, the top 250 channels were selected as model inputs, which optimizes both computational efficiency and detection accuracy. Among the evaluated models, XGBoost performed best, with a detection accuracy of 83.5%. An advanced channel selection strategy based on SHapley Additive exPlanation (SHAP) analysis is then proposed. The recognition accuracy obtained with a smaller subset of 74 channels chosen by SHAP analysis is comparable with that obtained with 250 channels. Real-case applications to FY-4B GIIRS show that the algorithm can be used operationally to retrieve GIIRS cloud mask products quickly and accurately: producing the cloud mask for the entire China region takes no more than 1 s. The results agree closely with the Advanced Geosynchronous Radiation Imager (AGRI) L2 operational cloud mask product and with high-spatial-resolution visible channel albedo observations. The algorithm also maintains high detection accuracy in regions with thin cirrus clouds. Because of the lower spatial resolution of GIIRS, the XGBoost model may classify probably cloudy and probably clear areas as clear sky, and clear-sky areas with some cloud cover as partly cloudy. To evaluate its robustness and generalizability, the model was successfully applied to a similar instrument, FY-3E/HIRAS-II, carried on a polar-orbiting satellite platform, where it also demonstrates strong potential for operational application. Furthermore, the robustness of the model was tested across different seasons. The results show that training the model on combined seasonal data (that is, increasing the representativeness of the samples) enhances cross-seasonal cloud detection performance. |
|---|---|
| DOI: | 10.1109/TGRS.2025.3597266 |
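
The summary describes reducing the input set from 250 to 74 channels using SHAP analysis but does not state the exact selection rule. The sketch below illustrates one common convention, ranking channels by their mean absolute attribution across samples and keeping the top k. It is an assumption for illustration, not the authors' procedure: the function names, the toy attribution values, and the choice of k are all hypothetical, and the attributions are assumed to have been computed beforehand (for example by a SHAP explainer applied to a trained model).

```python
from typing import List, Sequence


def mean_abs_attribution(attributions: Sequence[Sequence[float]]) -> List[float]:
    """Average |attribution| per channel over all samples (rows = samples)."""
    n_samples = len(attributions)
    n_channels = len(attributions[0])
    totals = [0.0] * n_channels
    for row in attributions:
        for j, value in enumerate(row):
            totals[j] += abs(value)
    return [total / n_samples for total in totals]


def select_top_channels(attributions: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k channels with the largest mean |attribution|.

    Indices are returned in ascending order so they can be used directly
    to subset a spectrum. Ties keep the lower channel index first.
    """
    scores = mean_abs_attribution(attributions)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Hypothetical per-sample attributions for 4 channels (not real GIIRS data).
toy = [
    [0.10, -0.80, 0.02, 0.30],
    [-0.20, 0.60, 0.01, -0.40],
    [0.15, -0.70, 0.03, 0.35],
]
selected = select_top_channels(toy, k=2)
print(selected)
```

With real data, the rows would be per-observation attributions over the candidate channels and k would be tuned against held-out detection accuracy, which is how a reduced subset could be compared with the full 250-channel input.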