Stacking Based LightGBM-CatBoost-RandomForest Algorithm and Its Application in Big Data Modeling

Bibliographic details
Published in:	2022 4th International Conference on Data-driven Optimization of Complex Systems (DOCS), pp. 1-6
Main authors:	Wang, Zhihong; Ren, Hongru; Lu, Renquan; Huang, Lirong
Format:	Conference paper
Language:	English
Published:	IEEE, 28 October 2022
Subjects:	Big Data; Big Data Model; Classification algorithms; Data models; Ensemble learning; LightGBM-CatBoost-RandomForest; Prediction algorithms; Predictive models; Stacking

Description
Summary:	In recent years, big data model prediction has been applied in a growing number of fields, but improving model accuracy has remained a major problem. Integrating multiple base classifiers with an ensemble algorithm is an efficient way to improve model accuracy. In this paper, LightGBM, CatBoost and RandomForest are used as base classifiers, and the Stacking method from ensemble learning is used to build a combined LightGBM-CatBoost-RandomForest model. A comparative experiment is carried out against the SVM-KNN combination model based on the soft voting method from the existing literature. The results show that the Stacking-based LightGBM-CatBoost-RandomForest combined model performs well on four model evaluation indicators: accuracy, precision, recall and F1 score.
DOI:	10.1109/DOCS55193.2022.9967714