A Hybrid Intrusion Detection System Based on Scalable K-Means+ Random Forest and Deep Learning
Digital assets have come under various network security threats in the digital age. As a kind of security equipment to protect digital assets, intrusion detection system (IDS) is less efficient if the alert is not timely and IDS is useless if the accuracy cannot meet the requirements. Therefore, an...
Saved in:
| Published in: | IEEE access Vol. 9; pp. 75729 - 75740 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: |
Piscataway
IEEE
2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 2169-3536, 2169-3536 |
| Online Access: | Get full text |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Digital assets have come under various network security threats in the digital age. As a kind of security equipment to protect digital assets, intrusion detection system (IDS) is less efficient if the alert is not timely and IDS is useless if the accuracy cannot meet the requirements. Therefore, an intrusion detection model that combines machine learning with deep learning is proposed in this paper. The model uses the k-means and the random forest (RF) algorithms for the binary classification, and distributed computing of these algorithms is implemented on the Spark platform to quickly classify normal events and attack events. Then, by using the convolutional neural network (CNN), long short-term memory (LSTM), and other deep learning algorithms, the events judged as abnormal are further classified into different attack types finally. At this stage, adaptive synthetic sampling (ADASYN) is adopted to solve the unbalanced dataset. The NSL-KDD and CIS-IDS2017 datasets are used to evaluate the performance of the proposed model. The experimental results show that the proposed model has better TPR for most of attack events, faster data preprocessing speed, and potentially less training time. In particular, the accuracy of multi-target classification can reach as high as 85.24% in the NSL-KDD dataset and 99.91% in the CIC-IDS2017 dataset. |
|---|---|
| Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ISSN: | 2169-3536 2169-3536 |
| DOI: | 10.1109/ACCESS.2021.3082147 |