Integrating multiple feature engineering methods with CatBoost algorithm for the prediction and interpretation of nitrogenous components in bio-oil from biomass pyrolysis

Bibliographic details
Published in: Bioresource Technology, Vol. 440, p. 133505
Main authors: Liu, Xiaorui, Wang, Mingzhu, Yang, Haiping
Format: Journal Article
Language: English
Published: England, Elsevier Ltd, 01 January 2026
ISSN: 0960-8524, 1873-2976
Online access: Full text
Description
Summary:
•A novel framework integrating CatBoost and feature engineering methods was proposed.
•Fast pyrolysis enabled higher N content in bio-oil.
•Linear relationships among variables exerted the greatest influence on prediction.
•N was the common feature with dominant importance for each output.
•PLS-CatBoost exhibited excellent prediction performance with the fewest features.

Nitrogenous components (NCs) in bio-oil have a dual nature: they are regarded as valuable chemicals, yet they are undesirable when bio-oil is used as a combustible biofuel because they act as NOx precursors. Understanding the formation of NCs is therefore important for regulating bio-oil quality according to its end use. However, this is challenging because of the diversity of experiments in existing studies. In this study, a novel framework integrating feature engineering methods with the CatBoost algorithm was constructed to accurately predict and explain the formation of NCs from biomass properties and pyrolysis conditions. Statistical analysis revealed that NCs in bio-oil existed mainly as N-heterocyclics, followed by amides/amines and nitriles. By eliminating the linear relationships among features, the PLS-CatBoost framework gave the best performance for predicting the N content of the oil, while PCC-CatBoost performed best for the NC species; all R² values were higher than 0.94. N content in biomass was the common feature with dominant and positive importance for each output. When verified against the authors' own experiments, PLS-CatBoost exhibited excellent prediction performance using the fewest features.
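The summary mentions eliminating linear relationships among features before model training. The following is a minimal sketch of one common way to do that, not the authors' procedure: it assumes that "PCC" stands for the Pearson correlation coefficient and that redundant features are dropped greedily above a cutoff. The function names, the 0.9 threshold, the feature names, and all data values are hypothetical and chosen only for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / (sx * sy)

def drop_correlated(features, threshold=0.9):
    """Greedily keep each feature whose |r| with every already-kept
    feature is below the threshold (hypothetical cutoff)."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) < threshold
               for k in kept):
            kept.append(name)
    return kept

# Synthetic, made-up values; not data from the article.
data = {
    "N_biomass": [0.5, 1.2, 2.8, 4.1, 6.0],
    "protein": [3.125, 7.5, 17.5, 25.625, 37.5],  # exactly 6.25 * N_biomass
    "temperature": [500, 400, 600, 450, 550],
}
selected = drop_correlated(data)
print(selected)
```

The retained columns would then be passed to a gradient-boosting regressor such as CatBoost; that step is omitted here because the record gives no model settings.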
DOI: 10.1016/j.biortech.2025.133505