Improving the prediction of global solar radiation using interpretable boosting algorithms coupled SHAP and LIME analysis: a comparative study

Bibliographic Details
Published in: Theoretical and applied climatology, Vol. 156, no. 5, p. 292
Main Authors: Merabet, Khaled; Daif, Noureddine; Di Nunno, Fabio; Granata, Francesco; Difi, Salah; Kisi, Ozgur; Heddam, Salim; Kim, Sungwon; Zounemat-Kermani, Mohammad
Format: Journal Article
Language: English
Published: Vienna: Springer Vienna, 01.05.2025
Springer Nature B.V.
ISSN:0177-798X, 1434-4483
Description
Summary: Solar radiation prediction plays a vital role in many areas of hydrological and water resources planning and management. However, the need for interpretable and explainable machine learning (ML) models has motivated the use of various interpretability methods. For these reasons, the present study developed robust ML models based on boosting algorithms and enhanced them using the SHapley Additive exPlanations (SHAP) and local interpretable model-agnostic explanations (LIME) algorithms. Six boosting algorithms were used: adaptive boosting (AdaBoost), extreme gradient boosting (XGBoost), categorical boosting (CatBoost), light gradient boosting machine (LightGBM), natural gradient boosting (NGBoost), and histogram gradient boosting (HistGBRT). All models were developed using data collected at the USGS 02187010 station, comprising various weather variables. All models were evaluated using the root mean squared error (RMSE), the mean absolute error (MAE), the coefficient of correlation (R), and the Nash–Sutcliffe efficiency (NSE), under two scenarios: (i) scenario 1, using only weather variables, and (ii) scenario 2, using weather variables combined with periodicity numbers, i.e., day, month, and year number. The results indicate that the boosting models using periodicity outperform the models without periodicity. For scenario 1, the best accuracy was obtained using CatBoost1, with R, NSE, RMSE, and MAE values of 0.835, 0.697, 44.407 W/m², and 34.721 W/m², respectively. Under scenario 2, model performance improved significantly, with R, NSE, RMSE, and MAE reaching 0.925, 0.856, 30.617 W/m², and 22.925 W/m², respectively, obtained using CatBoost1 and HistGBRT1.
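The four evaluation metrics named in the summary can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of RMSE, MAE, Pearson's R, and NSE; the function names and the sample values are hypothetical and are not taken from the article, which may compute these quantities differently.

```python
import math

def rmse(obs, pred):
    """Root mean squared error, in the units of the observations (e.g. W/m²)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error, in the units of the observations."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def pearson_r(obs, pred):
    """Pearson coefficient of correlation between observed and predicted values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus squared error over variance of observations."""
    mean_o = sum(obs) / len(obs)
    sq_err = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sq_dev = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - sq_err / sq_dev

# Hypothetical daily solar radiation values (W/m²), for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print("RMSE:", rmse(observed, predicted))
print("MAE: ", mae(observed, predicted))
print("R:   ", pearson_r(observed, predicted))
print("NSE: ", nse(observed, predicted))
```

RMSE and MAE carry the units of the target variable, while R and NSE are dimensionless; an NSE of 1 indicates a perfect fit and an NSE of 0 means the model does no better than predicting the observed mean.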
DOI:10.1007/s00704-025-05507-x