A parallel algorithm for network traffic anomaly detection based on Isolation Forest

Bibliographic details
Published in: International journal of distributed sensor networks, Volume 14, Issue 11, p. 155014771881447
Main authors: Tao, Xiaoling; Peng, Yang; Zhao, Feng; Zhao, Peichao; Wang, Yong
Format: Journal Article
Language: English
Publication details: London, England: SAGE Publications, 2018-11-01
Wiley
ISSN: 1550-1477
Abstract With the rapid development of large-scale complex networks and the proliferation of social network applications, the volume of network traffic data is growing tremendously, and efficient anomaly detection on such massive traffic data is crucial to many network applications, such as malware detection, load balancing, and network intrusion detection. Although many methods exist for network traffic anomaly detection, they are all designed for a single machine and cannot handle traffic data so large that a single computer cannot store and process it. To solve this problem, we propose a parallel algorithm based on Isolation Forest and Spark for network traffic anomaly detection. It combines the strengths of the Isolation Forest algorithm in network traffic anomaly detection with the big data processing capability of Spark. We also parallelize the modeling and evaluation processes: by assigning tasks to multiple compute nodes, the proposed Isolation Forest and Spark algorithm performs anomaly detection and evaluation efficiently, which also removes the computation bottleneck of a single machine. Extensive experiments on real-world datasets show that the proposed Isolation Forest and Spark algorithm is efficient and scales well for anomaly detection on large network traffic data.
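The idea in the abstract, building isolation trees as independent tasks and then scoring records against the whole forest, can be illustrated with a small sketch. This is not the authors' implementation and does not use Spark: a standard-library thread pool stands in for the compute nodes, and every name here (`build_tree`, `fit_forest`, `score`, the default parameter values) is an assumption made for illustration. The scoring rule follows the usual Isolation Forest definition, s(x) = 2^(-E[h(x)] / c(psi)), where h(x) is the path length of x in a tree and psi is the subsample size.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

EULER_GAMMA = 0.5772156649


def c(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random value until points are isolated."""
    if depth >= max_depth or len(points) <= 1:
        return ("leaf", len(points))
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # this feature cannot separate the remaining points
        return ("leaf", len(points))
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, x, depth=0):
    """Depth at which x lands in a leaf, adjusted for the points left unsplit there."""
    if tree[0] == "leaf":
        return depth + c(tree[1])
    _, dim, split, left, right = tree
    return path_length(left if x[dim] < split else right, x, depth + 1)


def train_one(task):
    """One independent task: subsample the data and grow a single tree."""
    data, sample_size, seed = task
    rng = random.Random(seed)
    sample = rng.sample(data, min(sample_size, len(data)))
    max_depth = math.ceil(math.log2(max(len(sample), 2)))
    return build_tree(sample, 0, max_depth, rng)


def fit_forest(data, n_trees=100, sample_size=256, workers=4, seed=0):
    """Grow the trees as separate tasks; the thread pool is a stand-in for cluster workers."""
    tasks = [(data, sample_size, seed + i) for i in range(n_trees)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        forest = list(pool.map(train_one, tasks))
    return forest, min(sample_size, len(data))


def score(forest, psi, x):
    """Anomaly score in (0, 1]; values closer to 1 indicate more anomalous records."""
    mean_depth = sum(path_length(t, x) for t in forest) / len(forest)
    return 2.0 ** (-mean_depth / c(psi))
```

Because each tree depends only on its own subsample and seed, the tasks share no state, which is what makes tree construction straightforward to distribute; scoring is likewise independent per record. In CPython the thread pool mainly illustrates the task decomposition rather than delivering real speedup.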
Author Zhao, Peichao
Peng, Yang
Tao, Xiaoling
Zhao, Feng
Wang, Yong
Author_xml – sequence: 1
  givenname: Xiaoling
  surname: Tao
  fullname: Tao, Xiaoling
– sequence: 2
  givenname: Yang
  orcidid: 0000-0003-4043-8165
  surname: Peng
  fullname: Peng, Yang
– sequence: 3
  givenname: Feng
  surname: Zhao
  fullname: Zhao, Feng
  email: zhaofeng@guet.edu.cn
– sequence: 4
  givenname: Peichao
  surname: Zhao
  fullname: Zhao, Peichao
  email: zhaofeng@guet.edu.cn
– sequence: 5
  givenname: Yong
  surname: Wang
  fullname: Wang, Yong
CitedBy_id crossref_primary_10_1016_j_est_2022_104177
crossref_primary_10_1088_1361_6501_ac9545
crossref_primary_10_1155_2021_6636270
crossref_primary_10_1177_03611981241302335
crossref_primary_10_1155_2021_5576504
crossref_primary_10_1155_2023_5162254
crossref_primary_10_3390_batteries9070346
crossref_primary_10_3233_JCS_230092
crossref_primary_10_1007_s11770_025_1178_z
crossref_primary_10_3390_s22239144
crossref_primary_10_1177_14759217251362150
crossref_primary_10_1145_3620676
crossref_primary_10_1109_TIM_2021_3062684
crossref_primary_10_1016_j_cjche_2025_05_012
crossref_primary_10_1155_2020_6046729
crossref_primary_10_1016_j_icte_2020_06_003
crossref_primary_10_1109_ACCESS_2020_3022855
crossref_primary_10_1016_j_is_2025_102524
crossref_primary_10_1007_s10111_024_00776_4
crossref_primary_10_1016_j_anucene_2021_108785
crossref_primary_10_1186_s40537_025_01149_y
crossref_primary_10_1016_j_cose_2024_104126
crossref_primary_10_1016_j_foodchem_2021_131981
crossref_primary_10_1016_j_suscom_2022_100764
crossref_primary_10_1002_er_8471
crossref_primary_10_1109_ACCESS_2025_3528114
crossref_primary_10_3390_s22249626
crossref_primary_10_1177_14759217251339607
crossref_primary_10_1016_j_knosys_2019_105191
crossref_primary_10_1007_s10489_020_01886_y
crossref_primary_10_1016_j_chroma_2022_463486
Cites_doi 10.1016/j.neucom.2015.10.009
10.1016/j.physa.2016.12.069
10.1016/j.neucom.2014.09.083
10.1109/TNSE.2018.2830307
10.1007/978-3-319-01604-7_42
10.1109/TMM.2016.2537781
10.1587/transcom.2016EBP3239
10.1007/s00521-012-1263-0
10.1016/j.cose.2016.10.010
10.1109/TSE.1987.232894
10.1007/978-94-015-3994-4
10.1155/2016/9653230
10.1002/ett.2619
10.3390/e17042367
ContentType Journal Article
Copyright The Author(s) 2018
Copyright_xml – notice: The Author(s) 2018
DBID AFRWT
AAYXX
CITATION
DOA
DOI 10.1177/1550147718814471
DatabaseName Sage Journals GOLD Open Access 2024
CrossRef
DOAJ Directory of Open Access Journals
DatabaseTitle CrossRef
DatabaseTitleList
CrossRef

Database_xml – sequence: 1
  dbid: DOA
  name: DOAJ Directory of Open Access Journals
  url: https://www.doaj.org/
  sourceTypes: Open Website
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISSN 1550-1477
ExternalDocumentID oai_doaj_org_article_7941f0ad079b44c2abe447259ee7ef55
10_1177_1550147718814471
10.1177_1550147718814471
GrantInformation_xml – fundername: the Science and Technology Program of Guangxi
  grantid: No. AB17195045
– fundername: the Open Projects of State Key Laboratory of Integrated Service Networks (ISN) of Xidian University
  grantid: No. ISN19-13
– fundername: the National Natural Science Foundation of China
  grantid: No. 61363006
– fundername: the National Natural Science Foundation of Guangxi
  grantid: No.2016GXNSFAA380098
ID FETCH-LOGICAL-c389t-7ad90bb0c57c20c24f5dbf14ede65db2c52cc1c2d951c1979b9d010acf334ccc3
IEDL.DBID DOA
ISICitedReferencesCount 46
ISSN 1550-1477
IngestDate Fri Oct 03 12:53:42 EDT 2025
Sat Nov 29 06:19:37 EST 2025
Tue Nov 18 21:54:05 EST 2025
Tue Jun 17 22:47:17 EDT 2025
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 11
Keywords anomaly detection
Spark
Isolation Forest
parallelization
Network traffic
Language English
License This article is distributed under the terms of the Creative Commons Attribution 4.0 License (http://www.creativecommons.org/licenses/by/4.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c389t-7ad90bb0c57c20c24f5dbf14ede65db2c52cc1c2d951c1979b9d010acf334ccc3
ORCID 0000-0003-4043-8165
OpenAccessLink https://doaj.org/article/7941f0ad079b44c2abe447259ee7ef55
ParticipantIDs doaj_primary_oai_doaj_org_article_7941f0ad079b44c2abe447259ee7ef55
crossref_primary_10_1177_1550147718814471
crossref_citationtrail_10_1177_1550147718814471
sage_journals_10_1177_1550147718814471
PublicationCentury 2000
PublicationDate 2018-11-01
PublicationDateYYYYMMDD 2018-11-01
PublicationDate_xml – month: 11
  year: 2018
  text: 2018-11-01
  day: 01
PublicationDecade 2010
PublicationPlace London, England
PublicationPlace_xml – name: London, England
PublicationTitle International journal of distributed sensor networks
PublicationYear 2018
Publisher SAGE Publications
Wiley
Publisher_xml – name: SAGE Publications
– name: Wiley
References Chen, Yeo, Lee 2017; 65
Matsuda, Morita, Kudo 2017; 100
Li 2017; 472
De la Hoz, De La Hoz, Ortiz 2015; 164
Du, Jiang, Qian 2016; 18
Jiang, Yao, Xu 2015; 26
Moustafa, Slay 2016; 25
Yu, Jibin, Jiang 2016
Kadri, Harrou, Chaabane 2016; 173
Bereziński, Jasiul, Szpyrka 2015; 17
Denning 1987; 13
Cai, He, Guan 2018; 15
Sheikhan, Jadidi 2014; 24
Huang Q (bibr19-1550147718814471)
bibr13-1550147718814471
bibr18-1550147718814471
bibr9-1550147718814471
bibr14-1550147718814471
Paul B (bibr5-1550147718814471)
Moustafa N (bibr28-1550147718814471)
Liu FT (bibr26-1550147718814471)
bibr8-1550147718814471
bibr17-1550147718814471
bibr4-1550147718814471
Yu G (bibr12-1550147718814471)
Moustafa N (bibr27-1550147718814471) 2016; 25
Cai Z (bibr2-1550147718814471) 2018; 15
Salem O (bibr7-1550147718814471)
bibr11-1550147718814471
Celenk M (bibr15-1550147718814471)
Han J (bibr16-1550147718814471)
Shon T (bibr23-1550147718814471)
bibr10-1550147718814471
bibr3-1550147718814471
bibr25-1550147718814471
bibr1-1550147718814471
bibr6-1550147718814471
Duong NH (bibr21-1550147718814471)
Kumari R (bibr22-1550147718814471)
bibr20-1550147718814471
References_xml – volume: 65
  start-page: 314
  year: 2017
  end-page: 328
  article-title: Detection of network anomalies using Improved-MSPCA with sketches
  publication-title: Comput Secur
– volume: 472
  start-page: 164
  year: 2017
  end-page: 187
  article-title: Record length requirement of long-range dependent teletraffic
  publication-title: Physica A
– volume: 13
  start-page: 222
  year: 1987
  end-page: 232
  article-title: An intrusion-detection model
  publication-title: IEEE T Software Eng
– volume: 173
  start-page: 2102
  year: 2016
  end-page: 2114
  article-title: Seasonal ARMA-based SPC charts for anomaly detection: application to emergency department systems
  publication-title: Neurocomputing
– volume: 15
  start-page: 577
  year: 2018
  end-page: 590
  article-title: Collective data-sanitization for preventing sensitive information inference attacks in social networks
  publication-title: IEEE T Depend Secure
– volume: 17
  start-page: 2367
  year: 2015
  end-page: 2408
  article-title: An entropy-based network anomaly detection method
  publication-title: Entropy
– volume: 24
  start-page: 599
  year: 2014
  end-page: 611
  article-title: Flow-based anomaly detection in high-speed links using modified GSA-optimized neural network
  publication-title: Neural Comput Appl
– volume: 18
  start-page: 820
  year: 2016
  end-page: 830
  article-title: Resource allocation with video traffic prediction in cloud-based space systems
  publication-title: IEEE T Multimedia
– volume: 100
  start-page: 749
  year: 2017
  end-page: 761
  article-title: Traffic anomaly detection based on robust principal component analysis using periodic traffic behavior
  publication-title: IEICE T Commun
– year: 2016
  article-title: An improved ARIMA-based traffic anomaly detection algorithm for wireless sensor networks
  publication-title: Int J Distrib Sens N
– volume: 164
  start-page: 71
  year: 2015
  end-page: 81
  article-title: PCA filtering and probabilistic SOM for network intrusion detection
  publication-title: Neurocomputing
– volume: 25
  start-page: 18
  year: 2016
  end-page: 31
  article-title: The evaluation of network anomaly detection systems: statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data set
  publication-title: Inform Secur J
– volume: 26
  start-page: 308
  year: 2015
  end-page: 317
  article-title: Multi-scale anomaly detection for high-speed network traffic
  publication-title: T Emerg Telecommun T
– ident: bibr8-1550147718814471
  doi: 10.1016/j.neucom.2015.10.009
– ident: bibr13-1550147718814471
  doi: 10.1016/j.physa.2016.12.069
– start-page: 3548
  volume-title: IEEE international conference on systems, man and cybernetics
  ident: bibr15-1550147718814471
– start-page: 176
  volume-title: Proceedings from the sixth annual IEEE SMC information assurance workshop
  ident: bibr23-1550147718814471
– ident: bibr10-1550147718814471
  doi: 10.1016/j.neucom.2014.09.083
– ident: bibr1-1550147718814471
  doi: 10.1109/TNSE.2018.2830307
– ident: bibr6-1550147718814471
  doi: 10.1007/978-3-319-01604-7_42
– ident: bibr14-1550147718814471
  doi: 10.1109/TMM.2016.2537781
– start-page: 157
  volume-title: Fourth international conference on advances in computing and communications
  ident: bibr5-1550147718814471
– start-page: 413
  volume-title: Eighth IEEE international conference on data mining
  ident: bibr26-1550147718814471
– volume: 15
  start-page: 577
  year: 2018
  ident: bibr2-1550147718814471
  publication-title: IEEE T Depend Secure
– ident: bibr9-1550147718814471
  doi: 10.1587/transcom.2016EBP3239
– ident: bibr25-1550147718814471
  doi: 10.1007/s00521-012-1263-0
– ident: bibr20-1550147718814471
  doi: 10.1016/j.cose.2016.10.010
– ident: bibr4-1550147718814471
  doi: 10.1109/TSE.1987.232894
– start-page: 1420
  volume-title: IEEE conference on computer communications
  ident: bibr19-1550147718814471
– ident: bibr3-1550147718814471
  doi: 10.1007/978-94-015-3994-4
– start-page: 1
  volume-title: Military communications and information systems conference (MilCIS)
  ident: bibr28-1550147718814471
– start-page: 208
  volume-title: IEEE intelligent vehicles symposium
  ident: bibr12-1550147718814471
– start-page: 387
  volume-title: 3rd international conference on recent advances in information technology (RAIT)
  ident: bibr22-1550147718814471
– start-page: 4373
  volume-title: IEEE international conference on communications (ICC)
  ident: bibr7-1550147718814471
– volume: 25
  start-page: 18
  year: 2016
  ident: bibr27-1550147718814471
  publication-title: Inform Secur J
– ident: bibr17-1550147718814471
  doi: 10.1155/2016/9653230
– ident: bibr18-1550147718814471
  doi: 10.1002/ett.2619
– start-page: 644
  volume-title: 18th international conference on advanced communication technology (ICACT)
  ident: bibr21-1550147718814471
– start-page: 1
  volume-title: Proceedings of IEEE Southeastcon
  ident: bibr16-1550147718814471
– ident: bibr11-1550147718814471
  doi: 10.3390/e17042367
SSID ssj0036261
Score 2.3413706
SourceID doaj
crossref
sage
SourceType Open Website
Enrichment Source
Index Database
Publisher
StartPage 155014771881447
Title A parallel algorithm for network traffic anomaly detection based on Isolation Forest
URI https://journals.sagepub.com/doi/full/10.1177/1550147718814471
https://doaj.org/article/7941f0ad079b44c2abe447259ee7ef55
Volume 14
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVAON
  databaseName: DOAJ Directory of Open Access Journals
  customDbUrl:
  eissn: 1550-1477
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0036261
  issn: 1550-1477
  databaseCode: DOA
  dateStart: 20110101
  isFulltext: true
  titleUrlDefault: https://www.doaj.org/
  providerName: Directory of Open Access Journals
linkProvider Directory of Open Access Journals