GBsim: A Robust GCN-BERT Approach for Cross-Architecture Binary Code Similarity Analysis.

Detailed bibliography
Title:	GBsim: A Robust GCN-BERT Approach for Cross-Architecture Binary Code Similarity Analysis.
Authors:	Du J; Wei Q; Wang Y; Bai X. All authors: School of Cyber Science and Engineering, Information Engineering University, Zhengzhou 450001, China.
Source:	Entropy (Basel, Switzerland) [Entropy (Basel)] 2025 Apr 07; Vol. 27 (4). Date of Electronic Publication: 2025 Apr 07.
Publication Type:	Journal Article
Language:	English
Journal Info:	Publisher: MDPI Country of Publication: Switzerland NLM ID: 101243874 Publication Model: Electronic Cited Medium: Internet ISSN: 1099-4300 (Electronic) Linking ISSN: 10994300 NLM ISO Abbreviation: Entropy (Basel) Subsets: PubMed not MEDLINE
Imprint Name(s):	Original Publication: Basel, Switzerland : MDPI, 1999-
Abstract:	Recent advances in graph neural networks have transformed structural pattern learning in domains ranging from social network analysis to biomolecular modeling. Nevertheless, practical deployments in mission-critical scenarios such as binary code similarity detection face two fundamental obstacles: first, the inherent noise in graph construction processes, exemplified by incomplete control flow edges during binary function recovery; second, the substantial distribution discrepancies caused by cross-architecture instruction set variations. Conventional GNN architectures demonstrate severe performance degradation under such low signal-to-noise ratio conditions and cross-domain operational environments, particularly in security-sensitive vulnerability identification tasks where feature instability or domain shifts could trigger critical false judgments. To address these challenges, we propose GBsim, a novel approach that combines graph neural networks with natural language processing. GBsim employs a cross-architecture language model to transform binary functions into semantic graphs, leverages a multilayer GCN for structural feature extraction, and employs a Transformer layer to integrate semantic information, generating robust cross-architecture embeddings that maintain high performance despite significant distribution shifts. Extensive experiments on a large-scale cross-architecture dataset show that GBsim achieves an MRR of 0.901 and a Recall@1 of 0.831, outperforming state-of-the-art methods. In real-world vulnerability detection tasks, GBsim achieves an average recall rate of 81.3% on a 1-day vulnerability dataset, demonstrating its practical effectiveness in identifying security threats and outperforming existing methods by 2.1%. This performance advantage stems from GBsim's ability to maximize information preservation across architectural boundaries, enhancing model robustness in the presence of noise and distribution shifts.
Contributed Indexing:	Keywords: binary code similarity analysis; cross-architecture embedding; graph neural network robustness; hybrid deep learning
Entry Date(s):	Date Created: 20250426 Latest Revision: 20250429
Update Code:	20250429
PubMed Central ID:	PMC12025366
DOI:	10.3390/e27040392
PMID:	40282627
Database:	MEDLINE
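
Note on the metrics cited in the abstract: MRR (mean reciprocal rank) and Recall@1 are standard retrieval metrics. The following is a minimal Python sketch of how such scores are commonly computed, assuming each query function has exactly one true match in its candidate pool. The function names and the sample ranks are hypothetical illustrations; this record does not describe the authors' evaluation code, pool size, or data.

```python
from typing import Sequence


def rank_of_match(scores: Sequence[float], match_index: int) -> int:
    """1-based rank of the true match when candidates are sorted by descending score."""
    target = scores[match_index]
    # Every candidate scoring strictly higher pushes the true match down one place.
    return 1 + sum(1 for s in scores if s > target)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all queries; 1.0 means every match was ranked first."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def recall_at_k(ranks: Sequence[int], k: int = 1) -> float:
    """Fraction of queries whose true match appears in the top k candidates."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks of the true match for four queries (not data from the paper).
example_ranks = [1, 2, 1, 4]
mrr = mean_reciprocal_rank(example_ranks)      # (1 + 1/2 + 1 + 1/4) / 4
recall1 = recall_at_k(example_ranks, k=1)      # 2 of 4 queries ranked first
```

Under this definition Recall@1 can never exceed MRR, because each query ranked first contributes 1 to both sums while every other query still adds a positive amount to MRR. The reported pair (0.831 and 0.901) is consistent with that relationship.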
