Anomaly Detection in Networking Logs Using Unsupervised Autoencoder Learning


Bibliographic details
Published in: ICFAI journal of computer sciences, Vol. 19, No. 3, pp. 47-60
Main author: Bar, Kaushik
Format: Journal Article
Language: English
Publication details: Hyderabad: IUP Publications, 10.07.2025
ISSN: 0973-9904
Description
Summary: Modern cloud-based infrastructures frequently operate across multilayered, multi-OS environments supported by numerous vendors. Diagnosing anomalies in such complex systems is a time-consuming task that often results in inaccurate root cause attribution, harming the credibility of otherwise reliable service components. The paper addresses the critical challenge of anomaly detection in networking logs and subsequent root cause analysis through hardware status data in such virtualized infrastructures. It identifies the limitations of traditional anomaly detection methods, including clustering-based (LOF, SOF), statistical (Gaussian-based), rule-based, and supervised approaches, which often fail under noisy, high-dimensional, or sparsely labeled settings. To overcome these limitations, the paper proposes a two-stage architecture: (1) an autoencoder-based anomaly detector trained on textual networking logs; and (2) a self-supervised long short-term memory (LSTM) autoencoder trained on hardware metrics augmented by log-derived anomaly flags. This hybrid approach captures temporal dependencies, reduces false positives, and improves root cause traceability. Evaluated on a proprietary dataset comprising over 200K entries, the proposed method outperformed traditional baselines, achieving an F1 score of 0.87, surpassing others by a margin of at least 12%. This solution offers a scalable and automated diagnostic tool for distributed systems with minimal human intervention.
DOI: 10.71329/IUPJCS/2025.19.3.47-60