Geographical origin identification of sweet cherry based on quality traits combined with DD-SIMCA and XGBoost

Bibliographic Details
Published in: Food Chemistry, Vol. 492, no. Pt 2, p. 145525
Main Authors: Wu, Linxia; Liu, Ziye; Wang, Meng
Format: Journal Article
Language: English
Published: England, Elsevier Ltd, 15.11.2025
ISSN: 0308-8146, 1873-7072
Description
Summary: Geographical origin identification technologies based on physical and nutritional characteristics have recently been developed and applied. This study evaluated the feasibility of identifying the geographical origin of sweet cherries using organoleptic traits and phenolic compound profiles. Data-driven soft independent modeling of class analogy (DD-SIMCA) and extreme gradient boosting (XGBoost) were applied to 170 sweet cherry samples collected in 2023 and 2024 from Beijing, Dalian, Tianshui, and Yantai, China. Measurements included transverse diameter, longitudinal diameter, fruit weight, soluble solid content, titratable acidity, organic acids, ascorbic acid, and 14 phenolic compounds. The DD-SIMCA model showed high sensitivity (98.00 %) and specificity (100.00 %). XGBoost yielded a prediction accuracy of 94.12 %, outperforming LDA (82.35 %), RF (88.24 %), and k-NN (82.35 %). Key discriminatory features included malic acid, quinic acid, citric acid, kaempferol-3-O-rutinoside, titratable acidity, and cyanidin-3-O-rutinoside. These findings indicate that DD-SIMCA and XGBoost are effective methods for the geographical origin identification of sweet cherries based on quality attributes. This approach supports quality assurance and control in regional production systems.
• The geographical origin of sweet cherries was identified by DD-SIMCA and XGBoost.
• The quality traits included organoleptic traits and phenolic compound profiles.
• The DD-SIMCA model had good sensitivity (98.00 %) and specificity (100.00 %).
• XGBoost exhibited higher prediction accuracy (94.12 %) compared with LDA, RF, and k-NN.
• The key features were malic acid, quinic acid, citric acid, and kaempferol-3-O-rutinoside.
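For readers unfamiliar with the metrics quoted in the summary, the following is a minimal Python sketch of how sensitivity, specificity, and prediction accuracy are conventionally computed from classification counts. The function names and every count below are hypothetical, chosen only so that the results round to the reported percentages; this record does not state the actual confusion matrices or the size of the test set.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of genuine target-class samples that the model accepts."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-target samples that the model correctly rejects."""
    return true_neg / (true_neg + false_pos)


def accuracy(correct: int, total: int) -> float:
    """Fraction of test samples assigned to the correct class."""
    return correct / total


def as_percent(value: float) -> str:
    """Format a fraction in the 'NN.NN %' style used in the summary."""
    return f"{100 * value:.2f} %"


# Hypothetical counts (assumptions, not data from the study).
print(as_percent(sensitivity(true_pos=49, false_neg=1)))   # 98.00 %
print(as_percent(specificity(true_neg=120, false_pos=0)))  # 100.00 %

# Hypothetical 17-sample test set: 16, 15, and 14 correct predictions.
for correct in (16, 15, 14):
    print(correct, as_percent(accuracy(correct, 17)))
```

With these assumed counts, 16, 15, and 14 correct predictions out of 17 give 94.12 %, 88.24 %, and 82.35 %, matching the accuracies quoted for XGBoost, RF, and LDA/k-NN. Other test-set sizes (for example, multiples of 17) would produce the same percentages, so this does not establish the split the authors used.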
DOI: 10.1016/j.foodchem.2025.145525