EnStack: An Ensemble Stacking Framework of Large Language Models for Enhanced Vulnerability Detection in Source Code

Bibliographic Details
Published in:	IEEE International Conference on Big Data, pp. 6356-6364
Main Authors:	Ridoy, Shahriyar Zaman; Shazzad Hossain Shaon, Md; Cuzzocrea, Alfredo; Akter, Mst Shapna
Format:	Conference Paper
Language:	English
Published:	IEEE, 15 December 2024
Subjects:	Accuracy; CodeBERT; Codes; Ensemble Stacking; GraphCodeBERT; Large language models; Large Language Models (LLMs); Logistic regression; Predictive models; Security; Semantics; Source Code Analysis; Source coding; Stacking; Support vector machines; UniXcoder; Vulnerability Detection
ISSN:	2573-2978

Description
Summary:	Automated detection of software vulnerabilities is critical for enhancing security, yet existing methods often struggle with the complexity and diversity of modern codebases. In this paper, we propose a novel ensemble stacking approach that combines multiple pre-trained large language models (LLMs), namely CodeBERT, GraphCodeBERT, and UniXcoder, to improve vulnerability detection in source code. Our method brings together the semantic understanding of CodeBERT, the structural code representations of GraphCodeBERT, and the cross-modal capabilities of UniXcoder. By fine-tuning these models on the Draper VDISC dataset and integrating their predictions using meta-classifiers such as Logistic Regression, Support Vector Machines (SVM), Random Forest, and XGBoost, we capture complex code patterns that individual models may miss. The meta-classifiers aggregate the strengths of each model, enhancing overall predictive performance. Our ensemble demonstrates significant performance gains over existing methods, with notable improvements in accuracy, precision, recall, F1-score, and AUC-score. This advancement addresses the challenge of detecting subtle and complex vulnerabilities in diverse programming contexts. The results suggest that our ensemble stacking approach offers a more robust and comprehensive solution for automated vulnerability detection, potentially influencing future AI-driven security practices.
DOI:	10.1109/BigData62323.2024.10825609
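
The stacking step described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the three base models are replaced by a hypothetical `synthetic_base_outputs` helper that generates noisy probabilities, and the meta-classifier is a hand-written logistic regression trained by batch gradient descent. The learning rate, epoch count, noise level, and all function names are assumptions chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_meta_classifier(features, labels, lr=0.5, epochs=500):
    """Fit a logistic-regression meta-classifier with batch gradient descent.

    features: list of rows, one base-model probability per column.
    labels:   list of 0/1 ground-truth labels (1 = vulnerable).
    Returns (weights, bias).
    """
    n_samples, n_models = len(features), len(features[0])
    weights, bias = [0.0] * n_models, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_models, 0.0
        for row, y in zip(features, labels):
            err = sigmoid(sum(w * x for w, x in zip(weights, row)) + bias) - y
            for j, x in enumerate(row):
                grad_w[j] += err * x
            grad_b += err
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def predict(weights, bias, row):
    """Return 1 if the stacked probability is at least 0.5, else 0."""
    return int(sigmoid(sum(w * x for w, x in zip(weights, row)) + bias) >= 0.5)


def synthetic_base_outputs(n, seed=0):
    """Hypothetical stand-in for the three fine-tuned models' probabilities."""
    rng = random.Random(seed)
    features, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        # Each simulated model is noisy but centred on the correct side of 0.5.
        row = [min(1.0, max(0.0, rng.gauss(0.7 if y else 0.3, 0.2))) for _ in range(3)]
        features.append(row)
        labels.append(y)
    return features, labels


# Assumed setup: one split to fit the meta-classifier, another to evaluate it.
train_x, train_y = synthetic_base_outputs(400, seed=0)
test_x, test_y = synthetic_base_outputs(200, seed=1)
weights, bias = fit_meta_classifier(train_x, train_y)

stacked_acc = sum(predict(weights, bias, r) == y for r, y in zip(test_x, test_y)) / len(test_y)
single_acc = sum((r[0] >= 0.5) == bool(y) for r, y in zip(test_x, test_y)) / len(test_y)
print(f"single model accuracy: {single_acc:.3f}, stacked accuracy: {stacked_acc:.3f}")
```

In a real pipeline the columns of `train_x` would be the probabilities produced by the fine-tuned CodeBERT, GraphCodeBERT, and UniXcoder models on examples they were not trained on, since fitting the meta-classifier on predictions for already-seen data tends to overstate how reliable each base model is. The same feature matrix could be passed to an SVM, Random Forest, or XGBoost meta-classifier instead.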