Malware Detection Based on API Call Sequence Analysis: A Gated Recurrent Unit–Generative Adversarial Network Model Approach

Bibliographic details
Published in: Future Internet, Volume 16, Issue 10, p. 369
Main authors: Owoh, Nsikak; Adejoh, John; Hosseinzadeh, Salaheddin; Ashawa, Moses; Osamor, Jude; Qureshi, Ayyaz
Format: Journal Article
Language: English
Published: Basel: MDPI AG, 1 October 2024
ISSN: 1999-5903
Description
Summary: Malware remains a major threat to computer systems, with a vast number of new samples being identified and documented regularly. Windows systems are particularly vulnerable to malicious programs such as viruses, worms, and trojans. Dynamic analysis, which involves observing malware behavior during execution in a controlled environment, has emerged as a powerful technique for detection. This approach often focuses on analyzing Application Programming Interface (API) calls, which represent the interactions between the malware and the operating system. Recent advances in deep learning have shown promise in improving malware detection accuracy using API call sequence data. However, the potential of Generative Adversarial Networks (GANs) for this purpose remains largely unexplored. This paper proposes a novel hybrid deep learning model combining Gated Recurrent Units (GRUs) and GANs to enhance malware detection based on API call sequences from Windows portable executable files. We evaluate our GRU–GAN model against other approaches, such as Bidirectional Long Short-Term Memory (BiLSTM) and Bidirectional Gated Recurrent Unit (BiGRU), on multiple datasets. The results demonstrate the superior performance of our hybrid model, which achieves 98.9% accuracy on the most challenging dataset. It also outperforms existing models in resource utilization, with faster training and testing times and low memory usage.
DOI: 10.3390/fi16100369