A parallel algorithm for network traffic anomaly detection based on Isolation Forest
| Published in: | International journal of distributed sensor networks Vol. 14; no. 11; p. 155014771881447 |
|---|---|
| Main authors: | Tao, Xiaoling; Peng, Yang; Zhao, Feng; Zhao, Peichao; Wang, Yong |
| Format: | Journal Article |
| Language: | English |
| Publication details: | London, England: SAGE Publications, 01.11.2018; Wiley |
| Subject: | |
| ISSN: | 1550-1477 |
| Online access: | Get full text |
| Abstract | With the rapid development of large-scale complex networks and the proliferation of social network applications, the volume of network traffic data is growing tremendously, and efficient anomaly detection on this massive data is crucial to many network applications, such as malware detection, load balancing, and network intrusion detection. Although many methods exist for network traffic anomaly detection, they are all designed for a single machine and fail when the traffic data are too large for one computer to store and process. To solve this problem, we propose a parallel algorithm based on Isolation Forest and Spark for network traffic anomaly detection. It combines the strengths of the Isolation Forest algorithm in network traffic anomaly detection with the big data processing capability of Spark. We also apply parallelization to both the modeling and the evaluation process. By assigning tasks to multiple compute nodes during computation, the Isolation Forest and Spark approach performs anomaly detection and evaluation efficiently and removes the computation bottleneck of a single machine. Extensive experiments on real-world datasets show that the proposed approach is efficient and scales well for anomaly detection on large network traffic data. |
|---|---|
| Author | Zhao, Peichao; Peng, Yang; Tao, Xiaoling; Zhao, Feng; Wang, Yong |
| Author_xml | 1. Tao, Xiaoling; 2. Peng, Yang (ORCID: 0000-0003-4043-8165); 3. Zhao, Feng (email: zhaofeng@guet.edu.cn); 4. Zhao, Peichao (email: zhaofeng@guet.edu.cn); 5. Wang, Yong |
| ContentType | Journal Article |
| Copyright | The Author(s) 2018 |
| DOI | 10.1177/1550147718814471 |
| Discipline | Computer Science |
| EISSN | 1550-1477 |
| GrantInformation_xml | Science and Technology Program of Guangxi (grant no. AB17195045); Open Projects of State Key Laboratory of Integrated Service Networks (ISN) of Xidian University (grant no. ISN19-13); National Natural Science Foundation of China (grant no. 61363006); National Natural Science Foundation of Guangxi (grant no. 2016GXNSFAA380098) |
| ISICitedReferencesCount | 46 |
| ISSN | 1550-1477 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 11 |
| Keywords | anomaly detection; Spark; Isolation Forest; parallelization; network traffic |
| Language | English |
| License | This article is distributed under the terms of the Creative Commons Attribution 4.0 License (http://www.creativecommons.org/licenses/by/4.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage). |
| ORCID | 0000-0003-4043-8165 |
| OpenAccessLink | https://doaj.org/article/7941f0ad079b44c2abe447259ee7ef55 |
| PublicationCentury | 2000 |
| PublicationDate | 2018-11-01 |
| PublicationDateYYYYMMDD | 2018-11-01 |
| PublicationDate_xml | – month: 11 year: 2018 text: 2018-11-01 day: 01 |
| PublicationDecade | 2010 |
| PublicationPlace | London, England |
| PublicationPlace_xml | – name: London, England |
| PublicationTitle | International journal of distributed sensor networks |
| PublicationYear | 2018 |
| Publisher | SAGE Publications; Wiley |
| Publisher_xml | – name: SAGE Publications – name: Wiley |
| StartPage | 155014771881447 |
| Title | A parallel algorithm for network traffic anomaly detection based on Isolation Forest |
| URI | https://journals.sagepub.com/doi/full/10.1177/1550147718814471 https://doaj.org/article/7941f0ad079b44c2abe447259ee7ef55 |
| Volume | 14 |
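
The abstract describes distributing Isolation Forest modeling and evaluation across multiple compute nodes. The record does not include the authors' code, so the following is only a minimal single-machine sketch of the general idea: each isolation tree is built as an independent task, which is the property that makes the algorithm easy to parallelize. It assumes a Python thread pool as a stand-in for Spark workers, and all names (`fit_forest`, `train_one`, `anomaly_score`, the toy data) are hypothetical illustrations, not the paper's implementation.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively partition points with random axis-aligned splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # no split possible on this feature
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, x, depth=0):
    """Depth at which x lands, plus the expected extra depth for an unsplit leaf."""
    if "split" not in tree:
        return depth + c_factor(tree["size"])
    child = tree["left"] if x[tree["dim"]] < tree["split"] else tree["right"]
    return path_length(child, x, depth + 1)


def train_one(task):
    """One independent task: subsample the data and grow a single tree."""
    data, psi, seed = task
    rng = random.Random(seed)  # per-task seed keeps the result reproducible
    sample = rng.sample(data, psi)
    max_depth = math.ceil(math.log2(max(psi, 2)))
    return build_tree(sample, 0, max_depth, rng)


def fit_forest(data, n_trees=100, sample_size=256, workers=4, seed=0):
    """Grow trees concurrently; the thread pool is an assumed stand-in for a cluster."""
    psi = min(sample_size, len(data))
    tasks = [(data, psi, seed + i) for i in range(n_trees)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        trees = list(pool.map(train_one, tasks))
    return {"trees": trees, "psi": psi}


def anomaly_score(forest, x):
    """Standard Isolation Forest score: values closer to 1 are more anomalous."""
    trees = forest["trees"]
    mean_h = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_h / c_factor(forest["psi"]))


# Toy data (hypothetical): a Gaussian cluster plus one far-away point.
rng = random.Random(42)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
data.append((10.0, 10.0))

forest = fit_forest(data, n_trees=50, workers=4, seed=7)
outlier_score = anomaly_score(forest, (10.0, 10.0))
inlier_score = anomaly_score(forest, (0.0, 0.0))
```

Because no tree depends on any other, the same task list could in principle be handed to a distributed engine instead of a thread pool, and scoring can likewise be split by data partition. How the paper actually maps these steps onto Spark is not stated in this record.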