Comparing Unidirectional, Bidirectional, and Word2vec Models for Discovering Vulnerabilities in Compiled Lifted Code

Bibliographic details
Published in:	Proceedings (International Symposium on Digital Forensic and Security. Online), pp. 1-6
Main authors:	McCully, Gary A., Hastings, John D., Xu, Shengjie, Fortier, Adam
Format:	Conference paper
Language:	English
Published:	IEEE, 24 April 2025
Subjects:	Accuracy; Bidirectional control; Binary Security; Buffer Overflows; Codes; Encoding; GPT-2; Long short term memory; Machine Learning; Neural networks; Organizations; Recurrent neural networks; Training; Transformers; Unidirectional Encoders
ISSN:	2768-1831

Description
Summary:	Ransomware and other forms of malware cause significant financial and operational damage to organizations by exploiting long-standing and often difficult-to-detect software vulnerabilities. To detect vulnerabilities such as buffer overflows in compiled code, this research investigates the application of unidirectional transformer-based embeddings, specifically GPT-2. Using a dataset of LLVM functions, we trained a GPT-2 model to generate embeddings, which were subsequently used to build LSTM neural networks to differentiate between vulnerable and non-vulnerable code. Our study reveals that embeddings from the GPT-2 model significantly outperform those from the bidirectional models BERT and RoBERTa, achieving an accuracy of 92.5% and an F1-score of 89.7%. LSTM neural networks were developed with both frozen and unfrozen embedding model layers. The highest-performing model was obtained when the embedding layers were unfrozen. The research also compares optimizers in this domain and finds that SGD outperforms Adam. Overall, these findings reveal important insights into the potential of unidirectional transformer-based approaches in enhancing cybersecurity defenses.
DOI:	10.1109/ISDFS65363.2025.11012025