RedJsod: A readable javascript obfuscation detector using semantic-based analysis

Detailed bibliography
Title: RedJsod: A readable javascript obfuscation detector using semantic-based analysis
Authors: Al-Taharwa, I.A., Lee, H.-M., Jeng, A.B., Wu, K.-P., Mao, C.-H., Wei, T.-E., Chen, S.-M.
Publication year: 2012
Collection: National Taiwan University of Science and Technology Repository (NTUSTR) / 台灣科技大學
Subjects: AST representation, detection, encoding, feature-based, JavaScript malware, obfuscation, static analysis
Description: JavaScript allows Web developers to hide the intent of their code inside different-looking scripts, known as obfuscated code. Automatic detection of obfuscated code is generally tackled from a readability perspective. Recently, however, obfuscation has exhibited patterns that modify both syntactic and semantic characteristics while preserving readability. There are two problems in dealing with readable obfuscation: (1) it is difficult to locate, since it does not manipulate suspicious strings; and (2) it is a common and essential practice adopted in both benign and malicious code. In this work, we first investigate why and how readable obfuscation can hinder the detection of maliciousness and prevent the static analysis of suspicious scripts. Next, we propose a readable JavaScript obfuscation detector (RedJsod) to deal with this type of threat. RedJsod is a well-defined detector based on a variable length context-based feature extraction (VCLFE) scheme that takes advantage of the abstract syntax tree (AST) representation of a given JavaScript code to infer run-time behaviors statically. We applied RedJsod to three datasets collected from real-world Web pages to evaluate its effectiveness. We also tested RedJsod on well-known readable obfuscation samples cited in related work as a proof-of-concept illustration. Our experimental results indicate that RedJsod achieved very high detection rates (greater than 97% accuracy), eliminated false negatives completely, and yielded very few false positives. © 2012 IEEE.
Document type: other/unknown material
Language: English
Relation: Proc. of the 11th IEEE Int. Conference on Trust, Security and Privacy in Computing and Communications, TrustCom-2012 - 11th IEEE Int. Conference on Ubiquitous Computing and Communications, IUCC-2012, pages 1370 - 1375; http://ir.lib.ntust.edu.tw/handle/987654321/43617; http://ir.lib.ntust.edu.tw/bitstream/987654321/43617/-1/index.html
Availability: http://ir.lib.ntust.edu.tw/handle/987654321/43617
http://ir.lib.ntust.edu.tw/bitstream/987654321/43617/-1/index.html
Accession number: edsbas.F7629C70
Database: BASE
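Illustrative note: the abstract refers to context-based features of variable length taken from a script's AST, but this record does not say how VCLFE is defined. The Python sketch below is therefore only a generic illustration of the idea, not the authors' scheme. It assumes a hand-written, simplified ESTree-like toy tree for the JavaScript snippet `eval(a + b)`, and the names `TOY_AST`, `context_features`, and `max_len` are hypothetical. Each node contributes one feature for every ancestor chain of length 1 up to `max_len` that ends at that node.

```python
from collections import Counter

# Hypothetical, simplified ESTree-like toy AST for the JavaScript: eval(a + b)
# (hand-written for illustration; a real parser would produce a richer tree).
TOY_AST = {
    "type": "Program",
    "children": [
        {
            "type": "CallExpression",
            "children": [
                {"type": "Identifier", "children": []},  # callee: eval
                {
                    "type": "BinaryExpression",  # argument: a + b
                    "children": [
                        {"type": "Identifier", "children": []},
                        {"type": "Identifier", "children": []},
                    ],
                },
            ],
        }
    ],
}


def context_features(node, max_len=3, ancestors=()):
    """Count node-type chains of length 1..max_len that end at each node."""
    counts = Counter()
    path = ancestors + (node["type"],)
    # Emit every suffix of the root-to-node path, up to max_len types long.
    for length in range(1, min(max_len, len(path)) + 1):
        counts[">".join(path[-length:])] += 1
    # Recurse into children, passing the full path as their ancestor context.
    for child in node["children"]:
        counts.update(context_features(child, max_len, path))
    return counts


features = context_features(TOY_AST)
for name, count in sorted(features.items()):
    print(f"{name}: {count}")
```

A feature vector of this kind could then be passed to any classifier. Which contexts the actual system keeps, and how it weights them, is not stated in this record.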