EnStack: An Ensemble Stacking Framework of Large Language Models for Enhanced Vulnerability Detection in Source Code

Bibliographic Details
Published in:	IEEE International Conference on Big Data, pp. 6356-6364
Main Authors:	Ridoy, Shahriyar Zaman; Shazzad Hossain Shaon, Md; Cuzzocrea, Alfredo; Akter, Mst Shapna
Format:	Conference Proceeding
Language:	English
Published:	IEEE, 15 December 2024
Subjects:	Accuracy; CodeBERT; Codes; Ensemble Stacking; GraphCodeBERT; Large language models; Large Language Models (LLMs); Logistic regression; Predictive models; Security; Semantics; Source Code Analysis; Source coding; Stacking; Support vector machines; UniXcoder; Vulnerability Detection
ISSN:	2573-2978
Online Access:	Full text

Description
Summary:	Automated detection of software vulnerabilities is critical for enhancing security, yet existing methods often struggle with the complexity and diversity of modern codebases. In this paper, we propose a novel ensemble stacking approach that synergizes multiple pre-trained large language models (LLMs), namely CodeBERT, GraphCodeBERT, and UniXcoder, to improve vulnerability detection in source code. Our method uniquely combines the semantic understanding of CodeBERT, the structural code representations of GraphCodeBERT, and the cross-modal capabilities of UniXcoder. By fine-tuning these models on the Draper VDISC dataset and integrating their predictions using meta-classifiers such as Logistic Regression, Support Vector Machines (SVM), Random Forest, and XGBoost, we effectively capture complex code patterns that individual models may miss. The meta-classifiers aggregate the strengths of each model, enhancing overall predictive performance. Our ensemble demonstrates significant performance gains over existing methods, with notable improvements in accuracy, precision, recall, F1-score, and AUC-score. This advancement addresses the challenge of detecting subtle and complex vulnerabilities in diverse programming contexts. The results suggest that our ensemble stacking approach offers a more robust and comprehensive solution for automated vulnerability detection, potentially influencing future AI-driven security practices.
DOI:	10.1109/BigData62323.2024.10825609
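
Illustrative sketch (not part of the catalog record or the paper): the summary describes stacking, in which a meta-classifier is trained on the predictions of several base models. The Python below is a minimal sketch of that general idea under loud assumptions. The three "base models" are simulated with synthetic noisy probabilities rather than fine-tuned CodeBERT, GraphCodeBERT, and UniXcoder; the meta-classifier is a hand-written logistic regression; and every function name, noise level, and hyperparameter is hypothetical. It does not reproduce the authors' pipeline or results.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate_base_outputs(n, noise_levels, seed=0):
    """ASSUMPTION: stand-in for fine-tuned base models.

    Each column is one hypothetical model's predicted probability that a
    code sample is vulnerable: the true label plus Gaussian noise,
    clipped to the interval [0, 1].
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [min(1.0, max(0.0, label + rng.gauss(0.0, s))) for s in noise_levels]
        X.append(row)
        y.append(label)
    return X, y


def fit_meta_logreg(X, y, lr=0.5, epochs=500):
    """Train a logistic-regression meta-classifier by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) - label
            for j in range(d):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, row):
    """Return 1 (vulnerable) if the stacked probability is at least 0.5."""
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) >= 0.5 else 0


def accuracy(preds, labels):
    return sum(p == t for p, t in zip(preds, labels)) / len(labels)


# Hypothetical noise levels for the three simulated base models.
NOISE = [0.45, 0.50, 0.55]
X_train, y_train = simulate_base_outputs(400, NOISE, seed=1)
X_test, y_test = simulate_base_outputs(200, NOISE, seed=2)

# The meta-classifier sees only base-model probabilities, never the code itself.
w, b = fit_meta_logreg(X_train, y_train)
meta_acc = accuracy([predict(w, b, r) for r in X_test], y_test)
base_accs = [
    accuracy([1 if r[j] >= 0.5 else 0 for r in X_test], y_test)
    for j in range(len(NOISE))
]
print("base accuracies:", base_accs)
print("stacked accuracy:", meta_acc)
```

In a real stacking setup, the meta-classifier is usually trained on base-model predictions for held-out data (for example, out-of-fold predictions), so that it does not simply learn from base models that have memorized their own training set. Whether the paper does this is not stated in the summary.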