Compiler-provenance identification in obfuscated binaries using vision transformers
| Title: | Compiler-provenance identification in obfuscated binaries using vision transformers |
|---|---|
| Authors: | Khan, Wasif; Alrabaee, Saed; Al-kfairy, Mousa; Tang, Jie; Raymond Choo, Kim Kwang |
| Source: | All Works |
| Publisher Information: | ZU Scholars |
| Publication Year: | 2024 |
| Subjects: | Binary code analysis; Compiler provenance; Malware analysis; Reverse engineering; Computer Sciences |
| Description: | Extracting compiler-provenance-related information (e.g., the source of a compiler, its version, its optimization settings, and compiler-related functions) is crucial for binary-analysis tasks such as function fingerprinting, detecting code clones, and determining authorship attribution. However, the presence of obfuscation techniques has complicated the efforts to automate such extraction. In this paper, we propose an efficient and resilient approach to provenance identification in obfuscated binaries using advanced pre-trained computer-vision models. To achieve this, we transform the program binaries into images and apply a two-layer approach for compiler and optimization prediction. Extensive results from experiments performed on a large-scale dataset show that the proposed method can achieve an accuracy of over 98% for both obfuscated and deobfuscated binaries. |
| Document Type: | text |
| File Description: | application/pdf |
| Language: | unknown |
| Relation: | https://zuscholars.zu.ac.ae/works/6635; https://zuscholars.zu.ac.ae/context/works/article/7672/viewcontent/1_s2.0_S2666281724000830_main.pdf |
| DOI: | 10.1016/j.fsidi.2024.301764 |
| Availability: | https://zuscholars.zu.ac.ae/works/6635; https://doi.org/10.1016/j.fsidi.2024.301764; https://zuscholars.zu.ac.ae/context/works/article/7672/viewcontent/1_s2.0_S2666281724000830_main.pdf |
| Rights: | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
| Accession Number: | edsbas.E4055BE0 |
| Database: | BASE |
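
The abstract mentions transforming program binaries into images before classification, but this record does not say which encoding the paper uses. As a minimal sketch only, assuming the common one-byte-per-pixel grayscale layout with a fixed row width (the function name `bytes_to_grayscale_rows`, the default width, and the zero padding are illustrative assumptions, not the authors' method), the conversion step might look like this:

```python
import math


def bytes_to_grayscale_rows(data: bytes, width: int = 256) -> list[list[int]]:
    """Lay raw bytes out as rows of `width` grayscale pixels (0-255).

    Assumption: one byte maps to one pixel, and the final row is
    zero-padded. The paper's actual encoding may differ.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    # Pad with zero bytes so the data fills a complete height x width grid.
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Hypothetical input: a 10-byte stand-in for a binary, laid out 4 pixels wide.
rows = bytes_to_grayscale_rows(bytes(range(10)), width=4)
```

The resulting grid would still need resizing to the input resolution of whatever pre-trained vision model is used, and the two-stage prediction (compiler first, then optimization level) described in the abstract is not covered by this sketch.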