Similarity-based Android malware detection using Hamming distance of static binary features

Bibliographic details
Published in: Future Generation Computer Systems, Vol. 105, pp. 230-247
Main authors: Taheri, Rahim; Ghahramani, Meysam; Javidan, Reza; Shojafar, Mohammad; Pooranian, Zahra; Conti, Mauro
Format: Journal Article
Language: English
Published: Elsevier B.V., 1 April 2020
ISSN:0167-739X
Description
Summary: In this paper, we develop four malware detection methods that use Hamming distance to measure the similarity between samples: first nearest neighbors (FNN), all nearest neighbors (ANN), weighted all nearest neighbors (WANN), and k-medoid based nearest neighbors (KMNN). The proposed methods raise an alarm when an Android app is detected as malicious, which helps prevent detected malware from spreading more widely. We provide a detailed description of the proposed detection methods and related algorithms, together with an extensive analysis of their suitability. We run our experiments on three datasets of benign and malware Android apps: Drebin, Contagio, and Genome. To corroborate the effectiveness of our classifier, we compare its performance with state-of-the-art classification and malware detection algorithms, namely the Mixed and Separated solutions, the program dissimilarity measure based on entropy (PDME), and the FalDroid algorithms. We test three types of features on these datasets: API, intent, and permission features. The results confirm that the accuracy of the proposed algorithms exceeds 90%, and in some cases (i.e., when using API features) exceeds 99%, which is comparable with existing state-of-the-art solutions.
• We show that using Hamming distance achieves results similar to those of other methods.
• We propose four scenarios for malware detection using Hamming distances.
• We obtain the maximum achievable accuracy using the Hamming distance as a threshold.
• We evaluate our methods using three standard datasets and various features.
• We compare our malware detection methods against three cutting-edge solutions.
DOI:10.1016/j.future.2019.11.034
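
The summary describes similarity-based detection over static binary features using Hamming distance. The record gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the authors' method: it computes the Hamming distance between binary feature vectors and assigns a query app the label of its single closest training sample. The function names, the toy feature vectors, and the tie-breaking rule (first closest sample wins) are all assumptions made for illustration.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length binary vectors differ."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def first_nearest_neighbor(query, samples, labels):
    """Return (label, distance) of the training sample closest to `query`.

    Hypothetical helper: ties are resolved by taking the first closest sample.
    """
    distances = [hamming_distance(query, s) for s in samples]
    best = min(range(len(samples)), key=distances.__getitem__)
    return labels[best], distances[best]


# Toy data (made up): each row is an app, each column a binary static
# feature, e.g. whether a given permission is requested.
samples = [
    [1, 0, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 1, 0, 1, 1],
]
labels = ["malware", "malware", "benign", "benign"]

query = [1, 0, 1, 0, 0]
label, distance = first_nearest_neighbor(query, samples, labels)
print(label, distance)
```

In this toy example the query differs from the first training sample in a single feature, so it inherits that sample's label. A real detector would additionally need feature extraction from the app package and, as the highlights suggest, some distance threshold, neither of which is specified in this record.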