Optimal Channel Selection for FY-4B GIIRS Explainable Machine Learning Cloud Detection Algorithm

Published in: IEEE Transactions on Geoscience and Remote Sensing, Vol. 63, pp. 1-12
Main authors: Yang, Haoyu; Guan, Li
Format: Journal Article
Language: English
Publication details: New York: IEEE, 2025
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 0196-2892, 1558-0644
Description
Summary: Cloud detection is a crucial preliminary step for assimilating meteorological satellite observations and retrieving other atmospheric parameters. This article presents an explainable machine learning (ML) algorithm for cloud detection using observations from the FY-4B Geostationary Interferometric Infrared Sounder (GIIRS). Four ML models, namely random forest (RF), light gradient boosting machine (LightGBM), categorical boosting (CatBoost), and extreme gradient boosting (XGBoost), were first evaluated for their effectiveness in cloud detection. Following feature importance analysis, the top 250 channels were selected as model inputs to optimize both computational efficiency and detection accuracy. Among the evaluated models, XGBoost demonstrated superior performance, with a detection accuracy of 83.5%. An advanced channel selection strategy based on SHapley Additive exPlanation (SHAP) analysis is proposed. The recognition accuracy obtained with a smaller subset of 74 channels selected by SHAP analysis is comparable to that obtained with 250 channels. Real-case applications to FY-4B GIIRS data show that the algorithm can be used operationally to retrieve GIIRS cloud mask products quickly and accurately: generating the cloud mask for the entire China region takes no more than 1 s. The results agree closely with the Advanced Geosynchronous Radiation Imager (AGRI) L2 operational cloud mask product and with high-spatial-resolution visible channel albedo observations. Additionally, the algorithm maintains high detection accuracy even in regions with thin cirrus clouds. Because of the lower spatial resolution of GIIRS, the XGBoost model may classify "probably cloudy" and "probably clear" areas as clear sky, and clear-sky areas with some cloud cover as partly cloudy. To evaluate its robustness and generalizability, the model was successfully applied to a similar instrument, FY-3E/HIRAS-II, onboard a polar-orbiting satellite platform, where it also demonstrates strong potential for operational application. Furthermore, the robustness of the model was tested across different seasons. The results reveal that training the model on combined seasonal data (that is, increasing the representativeness of the samples) enhances cross-seasonal cloud detection performance.
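
The channel-ranking idea mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that per-sample attribution values (for example, SHAP values) have already been computed by some explainer, ranks channels by their mean absolute attribution, and keeps the top k. The function names `mean_abs_attribution` and `select_top_channels` and the toy numbers are hypothetical.

```python
from typing import List, Sequence


def mean_abs_attribution(attributions: Sequence[Sequence[float]]) -> List[float]:
    """Average |attribution| per channel over all samples.

    attributions[i][j] is the (assumed precomputed) attribution of
    channel j for sample i, e.g. a SHAP value.
    """
    if not attributions:
        raise ValueError("need at least one sample")
    n_samples = len(attributions)
    n_channels = len(attributions[0])
    totals = [0.0] * n_channels
    for row in attributions:
        if len(row) != n_channels:
            raise ValueError("all samples must have the same number of channels")
        for j, value in enumerate(row):
            totals[j] += abs(value)
    return [t / n_samples for t in totals]


def select_top_channels(importance: Sequence[float], k: int) -> List[int]:
    """Return indices of the k most important channels, most important first."""
    if not 0 < k <= len(importance):
        raise ValueError("k must be between 1 and the number of channels")
    # Sort by descending importance; ties are broken by lower channel index.
    ranked = sorted(range(len(importance)), key=lambda j: (-importance[j], j))
    return ranked[:k]


# Toy example: 3 samples, 4 channels (made-up numbers, not GIIRS data).
toy = [
    [0.1, -0.5, 0.0, 0.2],
    [-0.3, 0.4, 0.1, 0.0],
    [0.2, -0.6, 0.0, 0.1],
]
importance = mean_abs_attribution(toy)
selected = select_top_channels(importance, k=2)
```

In the setting described by the summary, the same kind of ranking would be applied with k = 250 (feature importance) or k = 74 (SHAP); how the authors computed or thresholded the values is not stated in this record.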
DOI: 10.1109/TGRS.2025.3597266