Research on Abnormal Traffic Detection in Industrial Control Network Based on CVAE-CatBoost
| Published in: | Ji suan ji gong cheng Vol. 49; No. 5; pp. 173-180 |
|---|---|
| Main author: | |
| Format: | Journal Article |
| Language: | Chinese, English |
| Publication details: | Editorial Office of Computer Engineering, 1 May 2023 |
| Subject: | |
| ISSN: | 1000-3428 |
| Summary: | To detect abnormal traffic in Industrial Control Networks (ICN), a new detection model based on the Conditional Variational Autoencoder (CVAE) and the Categorical Features Gradient Boosting (CatBoost) algorithm is proposed to address the unbalanced data distribution and low detection rate of existing models. CVAE uses label information as a constraint to control the category of generated samples. The CatBoost algorithm overcomes gradient bias by introducing unbiased estimation, improves prediction accuracy, and reduces the risk of overfitting by adopting various tree growth modes. CVAE is used to augment the data, expanding rare attack samples to build a balanced dataset with a uniform class distribution. CatBoost then serves as the anomaly traffic detection model, identifying attack samples such as DoS and Fuzzers and outputting the classification results. Experimental results show that on the UNSW-NB15 dataset, after data augmentation with CVAE, CatBoost improves the F1 value by 25.16 percentage points on average, with overall precision, recall, and F1 value reaching 87.85%, 87.87%, and 87.86%, respectively; on the ZYELL_NCTU NetTraffic_1.0 dataset, after data augmentation with CVAE, CatBoost improves the F1 value by 16.32% on average, and the overall precision, recall, and F1 value reach 99.85%. The proposed model effectively avoids data imbalance problems and has better detection performance and generalization ability than machine learning and deep learning algorithms such as K-Nearest Neighbor (KNN), Random Forest (RF), and Convolutional Neural Network (CNN). |
|---|---|
| DOI: | 10.19678/j.issn.1000-3428.0065478 |
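
The summary describes two quantitative ideas that a short sketch may make more concrete: expanding rare attack classes until the dataset is balanced, and the relationship between precision, recall, and F1. The Python below is only an illustration under stated assumptions, not the authors' implementation. The helper names `synthetic_quota` and `f1_score` are hypothetical, the toy class counts are invented, and balancing every class up to the size of the largest class is an assumed policy. The record also does not say how the overall F1 value was averaged across classes; treating it as the harmonic mean of the overall precision and recall is an assumption that happens to agree with the reported UNSW-NB15 figures.

```python
from collections import Counter


def synthetic_quota(labels):
    """Return how many synthetic samples each class would need
    to match the size of the largest class (assumed balancing policy)."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels: the class names echo the summary, the counts are invented.
toy_labels = ["Normal"] * 90 + ["DoS"] * 8 + ["Fuzzers"] * 2

# Number of samples a conditional generator (e.g. a CVAE conditioned on the
# class label) would be asked to produce for each class.
quota = synthetic_quota(toy_labels)

# Precision and recall as reported in the summary for UNSW-NB15 (in percent).
reported_f1 = round(f1_score(87.85, 87.87), 2)

print(quota)
print(reported_f1)
```

In a full pipeline, the per-class quota would determine how many samples to draw from the generator for each label before the balanced dataset is passed to the classifier.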