Enhancing Hydrogen Energy Consumption Prediction Based on Stacked Machine Learning Model with Shapley Additive Explanations

Bibliographic details
Published in: Process integration and optimization for sustainability Vol. 9; no. 5; pp. 1847 - 1868
Main authors: Elshewey, Ahmed M., Hassan, Samah A. Z., Youssef, Rasha Y., El-Bakry, Hazem M., Osman, Ahmed M.
Format: Journal Article
Language: English
Published: Singapore, Springer Nature Singapore, 1 November 2025
Springer Nature B.V
ISSN:2509-4238, 2509-4246
Description
Summary: Enhancing hydrogen-based energy systems requires accurate forecasts of hydrogen consumption. This paper compares a random forest regressor, a multi-layer perceptron, a support vector regressor, and a gradient boosting regressor for forecasting hydrogen consumption from production capacity, consumption capacity, and geographical coordinates (latitude and longitude). The dataset, obtained from the European Hydrogen Observatory, was analyzed using statistical techniques to ensure robust preprocessing and feature selection. It encompasses hydrogen consumption data from various production pathways (e.g., electrolysis and steam reforming), as reported by the European Clean Hydrogen Observatory, although specific production methods are not individually labeled. The study evaluates model performance using standard regression metrics: mean absolute error, mean squared error, root mean squared error, coefficient of determination, and median absolute error. Among the individual models, the random forest regressor performed best, with a coefficient of determination of 0.9789 and low error metrics (mean absolute error = 0.0010, mean squared error = 0.0030, and root mean squared error = 0.0034). To further improve prediction accuracy, a stacking ensemble model was developed that combines the random forest regressor, multi-layer perceptron, support vector regressor, and gradient boosting regressor as base learners, with Ridge regression as the meta-learner. The stacked model outperformed all individual models, achieving a coefficient of determination of 0.9963 and lower error metrics (mean absolute error = 0.0009, mean squared error = 0.0002, and root mean squared error = 0.0014). To gain deeper insight into feature importance, SHapley Additive exPlanations (SHAP) analysis was conducted on the stacked model. The results indicate that latitude, longitude, and production capacity are the most influential factors affecting hydrogen consumption.
These findings indicate that the stacked model can effectively predict hydrogen consumption, helping researchers and policymakers optimize hydrogen distribution and consumption strategies for sustainable energy planning.
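The stacking setup described in the summary (four base learners with a Ridge meta-learner, scored with the five listed metrics) could be sketched with scikit-learn roughly as follows. This is a minimal illustration, not the authors' implementation: the data are synthetic stand-ins rather than the European Hydrogen Observatory dataset, and the helper names (`make_synthetic_data`, `build_stacked_model`, `evaluate`), the hyperparameters, the feature scaling, the 5-fold stacking, and the 80/20 split are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.metrics import (
    mean_absolute_error,
    mean_squared_error,
    median_absolute_error,
    r2_score,
)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_synthetic_data(n_samples=300, seed=0):
    """Return made-up (X, y). Hypothetical data, NOT the observatory dataset."""
    rng = np.random.default_rng(seed)
    production_capacity = rng.uniform(0.0, 1.0, n_samples)
    consumption_capacity = rng.uniform(0.0, 1.0, n_samples)
    latitude = rng.uniform(35.0, 70.0, n_samples)
    longitude = rng.uniform(-10.0, 30.0, n_samples)
    X = np.column_stack(
        [production_capacity, consumption_capacity, latitude, longitude]
    )
    # Arbitrary target chosen only so the models have something to learn.
    y = (
        0.6 * production_capacity
        + 0.2 * consumption_capacity
        + 0.01 * (latitude - 50.0)
        + rng.normal(0.0, 0.02, n_samples)
    )
    return X, y


def build_stacked_model(seed=0):
    """Four base learners combined by a Ridge meta-learner (assumed settings)."""
    base_learners = [
        ("rf", RandomForestRegressor(n_estimators=100, random_state=seed)),
        # MLP and SVR are scale-sensitive, so they are wrapped with a scaler.
        ("mlp", make_pipeline(
            StandardScaler(),
            MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                         random_state=seed),
        )),
        ("svr", make_pipeline(StandardScaler(), SVR())),
        ("gbr", GradientBoostingRegressor(random_state=seed)),
    ]
    # cv=5: the meta-learner is trained on out-of-fold base predictions.
    return StackingRegressor(
        estimators=base_learners, final_estimator=Ridge(), cv=5
    )


def evaluate(model, X, y, seed=0):
    """Fit on an 80/20 split and return the five metrics named in the summary."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    return {
        "mae": mean_absolute_error(y_test, y_pred),
        "mse": mse,
        "rmse": float(np.sqrt(mse)),
        "r2": r2_score(y_test, y_pred),
        "median_ae": median_absolute_error(y_test, y_pred),
    }


if __name__ == "__main__":
    X, y = make_synthetic_data()
    for name, value in evaluate(build_stacked_model(), X, y).items():
        print(f"{name}: {value:.4f}")
```

Scores obtained on the synthetic data say nothing about the results reported in the paper. The SHAP analysis mentioned in the summary is not covered by this sketch.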
DOI:10.1007/s41660-025-00539-2