Genomic Selection for Cashmere Traits in Inner Mongolian Cashmere Goats Using Random Forest, Gradient Boosting Decision Tree, Extreme Gradient Boosting and Light Gradient Boosting Machine Methods.

Detailed bibliography
Title: Genomic Selection for Cashmere Traits in Inner Mongolian Cashmere Goats Using Random Forest, Gradient Boosting Decision Tree, Extreme Gradient Boosting and Light Gradient Boosting Machine Methods.
Authors: Liu, Jiaqi; Yan, Xiaochun; Li, Wenze; Xue, Shan-Hui; Wang, Zhiying; Su, Rui
Source: Animals (2076-2615); Oct2025, Vol. 15 Issue 20, p2940, 14p
Subjects: CASHMERE, ANIMAL breeding, GENETICS, BOOSTING algorithms, ENSEMBLE learning, RANDOM forest algorithms, MACHINE learning
Abstract: Simple Summary: This study applies machine learning algorithms to genomic selection for cashmere traits in Inner Mongolian cashmere goats. By comparing the prediction accuracy of several algorithms, it explores the feasibility of using them for genomic selection of these traits, with the goal of improving the accuracy of genomic selection and enhancing breeding efficiency. Fiber length and cashmere production can enhance the economic value of cashmere goats. We analyzed cashmere trait data from 2299 cashmere goats, including fiber length, cashmere diameter, and cashmere production. We used RF, XGBoost, GBDT, and LightGBM for genomic selection. For fiber length, cashmere production, and cashmere diameter, LightGBM, RF, and GBDT, respectively, achieved the highest selection accuracy after hyperparameter optimization. XGBoost had the lowest prediction accuracy of all the models, at 0.541, 0.309, and 0.387 for fiber length, cashmere production, and cashmere diameter, respectively. For machine learning methods, hyperparameter tuning is essential, as it can improve prediction accuracy.
In recent years, Machine Learning (ML) has garnered increasing attention for its applications in genomic prediction. ML can process high-dimensional genomic data and fit nonlinear models. Compared to traditional Genomic Selection (GS) methods, ML algorithms can improve computational efficiency and prediction accuracy. This study therefore seeks to identify the best-performing machine learning algorithm for genome-wide selection of cashmere traits in Inner Mongolian cashmere goats. It compared the genomic prediction accuracy for cashmere traits of four machine learning algorithms: Random Forest (RF), Extreme Gradient Boosting (XGBoost), Gradient Boosting Decision Tree (GBDT), and Light Gradient Boosting Machine (LightGBM). The comparison used genotype data and cashmere trait phenotypic data from 2299 Inner Mongolian cashmere goats. The results showed that after parameter optimization, LightGBM achieved the highest selection accuracy for fiber length (56.4%), RF achieved the highest selection accuracy for cashmere production (35.2%), and GBDT achieved the highest selection accuracy for cashmere diameter (40.4%). Compared with genomic best linear unbiased prediction (GBLUP), the accuracy improved by 0.8–2.7%. Across the three traits, XGBoost had the lowest prediction accuracy, at 0.541, 0.309, and 0.387 for fiber length, cashmere production, and cashmere diameter, respectively. Additionally, following parameter optimization, the prediction accuracy of the four machine learning methods for cashmere fineness, cashmere yield, and fiber length improved by an average of 2.9%, 2.7%, and 3.8%, respectively. The mean squared error (MSE) and mean absolute error (MAE) for all machine learning methods also decreased, indicating that hyperparameter tuning can enhance prediction accuracy in ML algorithms. [ABSTRACT FROM AUTHOR]
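The evaluation loop described in the abstract (tune hyperparameters, then compare prediction accuracy, MSE, and MAE) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not define "prediction accuracy", so the sketch assumes the common genomic selection convention of the Pearson correlation between observed and predicted phenotypes in held-out animals. The genotypes and phenotypes are simulated, the booster is a toy gradient-boosting model built from single-SNP decision stumps, and every name (`fit_gbdt`, `cross_validate`, the grid values) is hypothetical.

```python
import math
import random


def pearson_r(obs, pred):
    """Pearson correlation between observed and predicted values."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp) if so > 0 and sp > 0 else 0.0


def mse(obs, pred):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def best_stump(X, resid):
    """Find the single-SNP split that minimises squared error of the residuals."""
    best = None
    for j in range(len(X[0])):
        for thr in (0.5, 1.5):  # assumed genotype coding: 0, 1, 2
            left = [r for x, r in zip(X, resid) if x[j] <= thr]
            right = [r for x, r in zip(X, resid) if x[j] > thr]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, ml, mr)
    return best


def fit_gbdt(X, y, n_rounds, learning_rate):
    """Toy gradient boosting with squared loss and depth-1 trees."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        stump = best_stump(X, resid)
        if stump is None:
            break
        _, j, thr, ml, mr = stump
        stumps.append((j, thr, ml, mr))
        pred = [p + learning_rate * (ml if x[j] <= thr else mr) for x, p in zip(X, pred)]
    return base, learning_rate, stumps


def predict(model, X):
    base, lr, stumps = model
    return [base + lr * sum(ml if x[j] <= thr else mr for j, thr, ml, mr in stumps) for x in X]


def cross_validate(X, y, n_rounds, learning_rate, k=3):
    """Return mean (Pearson r, MSE, MAE) over k held-out folds."""
    scores = []
    for fold in range(k):
        test = [i for i in range(len(y)) if i % k == fold]
        train = [i for i in range(len(y)) if i % k != fold]
        model = fit_gbdt([X[i] for i in train], [y[i] for i in train], n_rounds, learning_rate)
        obs = [y[i] for i in test]
        pred = predict(model, [X[i] for i in test])
        scores.append((pearson_r(obs, pred), mse(obs, pred), mae(obs, pred)))
    return tuple(sum(s[m] for s in scores) / k for m in range(3))


# Simulated data (not the study's): 120 animals, 10 SNPs, additive effects plus noise.
rng = random.Random(0)
n_animals, n_snps = 120, 10
X = [[rng.choice((0, 1, 2)) for _ in range(n_snps)] for _ in range(n_animals)]
effects = [rng.gauss(0, 1) for _ in range(n_snps)]
y = [sum(e * g for e, g in zip(effects, row)) + rng.gauss(0, 1) for row in X]

# Small illustrative grid; the values are arbitrary choices for this sketch.
grid = [(rounds, lr) for rounds in (5, 20) for lr in (0.1, 0.5)]
results = {params: cross_validate(X, y, *params) for params in grid}
best_params = max(results, key=lambda params: results[params][0])

for params in grid:
    r, err_sq, err_abs = results[params]
    print(f"rounds={params[0]:>2} lr={params[1]}: r={r:.3f} MSE={err_sq:.3f} MAE={err_abs:.3f}")
print("selected:", best_params)
```

The toy booster only keeps the sketch free of dependencies; in practice RF, GBDT, XGBoost, and LightGBM would come from established libraries, and the cross-validation scheme and search grid would need to match the study's full text, which this record does not describe.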
Copyright of Animals (2076-2615) is the property of MDPI and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Biomedical Index
ISSN: 2076-2615
DOI: 10.3390/ani15202940