Similarity-based Android malware detection using Hamming distance of static binary features

Bibliographic details
Published in: Future generation computer systems, Volume 105, pp. 230-247
Main authors: Taheri, Rahim; Ghahramani, Meysam; Javidan, Reza; Shojafar, Mohammad; Pooranian, Zahra; Conti, Mauro
Format: Journal Article
Language: English
Published: Elsevier B.V., 1 April 2020
ISSN:0167-739X
Description
Summary: In this paper, we develop four malware detection methods that use Hamming distance to measure similarity between samples: first nearest neighbors (FNN), all nearest neighbors (ANN), weighted all nearest neighbors (WANN), and k-medoid based nearest neighbors (KMNN). The proposed methods trigger an alarm when an Android app is detected as malicious, which helps prevent detected malware from spreading on a broader scale. We provide a detailed description of the proposed detection methods and related algorithms, together with an extensive analysis of their suitability. We run our experiments on three datasets of benign and malicious Android apps: Drebin, Contagio, and Genome. To corroborate the effectiveness of our classifiers, we compare their performance with state-of-the-art classification and malware detection algorithms, namely the Mixed and Separated solutions, the program dissimilarity measure based on entropy (PDME), and FalDroid. We evaluate three types of features on these datasets: API, intent, and permission features. The results confirm that the accuracy of the proposed algorithms exceeds 90%, and in some cases (i.e., when API features are used) exceeds 99%, which is comparable with existing state-of-the-art solutions.
• We show that Hamming distance achieves results similar to those of other approaches.
• We propose four scenarios for malware detection using Hamming distances.
• We obtain the maximum achievable accuracy using the Hamming distance as a threshold.
• We evaluate our methods using three standard datasets and various features.
• We compare our malware detection methods against three cutting-edge solutions.
DOI:10.1016/j.future.2019.11.034
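
The summary names Hamming distance over static binary features as the similarity measure but does not spell out the algorithms. The following is a minimal sketch of the general idea only, not the authors' implementation: it assumes each app is encoded as a fixed-length 0/1 vector (for example, one bit per permission) and labels a sample with the class of its single closest training vector. The function names, the toy vectors, and the tie-breaking rule (first minimum wins) are all illustrative assumptions.

```python
def hamming(a, b):
    """Count the positions at which two equal-length binary vectors differ."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbor_label(sample, training_set):
    """Return the label of the training vector closest to `sample`.

    `training_set` is a list of (vector, label) pairs. Ties are broken by
    taking the first minimum, which is an assumption of this sketch.
    """
    _, label = min(training_set, key=lambda item: hamming(sample, item[0]))
    return label


# Hypothetical toy data: each bit marks whether a feature is present.
training_set = [
    ((1, 1, 0, 1, 0), "malware"),
    ((1, 0, 0, 1, 1), "malware"),
    ((0, 0, 1, 0, 0), "benign"),
    ((0, 1, 1, 0, 0), "benign"),
]

sample = (1, 1, 0, 0, 0)
predicted = nearest_neighbor_label(sample, training_set)
print(predicted)
```

With these toy vectors the sample is at distance 1 from the first malware vector and at distance 2 or more from every other vector, so the sketch labels it as malware. The ANN, WANN, and KMNN variants named in the summary would need details from the full paper and are not reproduced here.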