LLM-AE-MP: Web Attack Detection Using a Large Language Model with Autoencoder and Multilayer Perceptron

Published in: Expert Systems with Applications, Volume 274, p. 126982
Main authors: Yang, Jing; Wu, Yuangui; Yuan, Yuping; Xue, Haozhong; Bourouis, Sami; Abdel-Salam, Mahmoud; Prajapat, Sunil; Por, Lip Yee
Format: Journal Article
Language: English
Published: Elsevier Ltd, 15.05.2025
ISSN: 0957-4174
Description
Summary: Web applications store sensitive data, which makes them prime targets for cybercriminals and a national security risk. This study introduces a new approach to distinguishing legitimate from malicious hypertext transfer protocol (HTTP) requests using an autoencoder (AE). The AE provides efficient feature distillation, which makes the model more sensitive to anomalies in HTTP traffic. The AE framework is combined with a transductive long short-term memory (TLSTM) network, which is trained with an advanced generative adversarial network (GAN). The GAN promotes an adaptive learning environment, significantly boosting the robustness and generalizability of our method against evolving web attack vectors. TLSTM uses transductive learning to focus on data points near the test set, which improves the adaptability of the model and allows it to outperform traditional LSTM models. In our GAN, the generator purposely excludes gradients from the most influential batch elements, improving the ability of the model to generate diverse and generalized outputs. After the AE is trained, its latent representations are passed to a multilayer perceptron (MLP) for detection. To address class imbalance in the MLP, we use a reinforcement learning (RL) strategy. The RL approach adjusts incentives so that the model better identifies less frequent but critical malicious instances, supporting a balanced security assessment. On the CSIC 2010 (Spanish National Research Council 2010), FWAF (web application firewall), and HttpParams datasets, our method outperforms existing techniques, reaching accuracy, F-measure, geometric mean (G-means), and area under the curve (AUC) of (90.937%, 89.755%, 88.446%, 0.838), (89.055%, 90.663%, 88.334%, 0.847), and (92.242%, 93.774%, 91.356%, 0.897), respectively. Moreover, our model achieves efficient runtime and memory usage across the datasets, providing a practical solution for real-time web attack detection. These results confirm the effectiveness of the model in security contexts, representing a substantial advancement in web attack detection and the improvement of investigative strategies.
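
Note on the reported metrics: the summary does not spell out how accuracy, F-measure, and G-means are computed, so the following is only a minimal sketch using the standard textbook definitions for a binary detector (malicious as the positive class). The function name `binary_metrics` and all counts are hypothetical and are not taken from the paper. It illustrates why G-means is commonly reported alongside accuracy when malicious requests are rare.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Accuracy, F-measure and G-mean from confusion-matrix counts.

    Standard definitions, assumed here; the paper may define them differently.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    g_mean = math.sqrt(recall * specificity)             # geometric mean
    return {"accuracy": accuracy, "f_measure": f_measure, "g_mean": g_mean}

# Hypothetical imbalanced test set: 90 malicious and 910 benign requests.
detector = binary_metrics(tp=80, fp=10, tn=900, fn=10)

# A trivial classifier that labels every request as benign.
all_benign = binary_metrics(tp=0, fp=0, tn=910, fn=90)

print(detector)
print(all_benign)
```

With these made-up counts the all-benign classifier still scores 91% accuracy, while its F-measure and G-mean are both zero, which is the kind of imbalance effect the RL strategy in the summary is meant to counter.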
DOI: 10.1016/j.eswa.2025.126982