An optimized multi-scale convolutional autoencoder for efficient abnormal event detection using RGB, depth and optical flow data
| Published in: | Multimedia tools and applications, Vol. 84, No. 28, pp. 34401-34435 |
|---|---|
| Main author: | |
| Format: | Journal Article |
| Language: | English |
| Publication details: | New York; Springer US; 01.08.2025; Springer Nature B.V |
| Subjects: | |
| ISSN: | 1573-7721, 1380-7501 |
| Summary: | In this study, we propose a novel framework for detecting abnormal events in surveillance videos, a critical yet challenging task in security applications. This research introduces a robust and efficient solution for video anomaly detection, offering substantial improvements in surveillance systems' ability to detect abnormal events, thereby contributing to enhanced security measures in public spaces. The proposed framework utilizes a Multiscale Convolutional Autoencoder (MSCAE) that processes inputs from RGB, depth, and optical flow video clips, enhancing the detection accuracy in complex scenes characterized by varying object scales, aspect ratios, and occlusions. To address the challenge of noise and preserve edges in video data, we implement a two-pass bilateral smooth filtering method, which is effective for noise-invariant, edge-preserving image smoothing. For object detection within these complex scenes, an enhanced Faster R-CNN model is employed. This model's performance is further refined through transfer learning on a dataset specifically composed of abnormal event videos. We also introduce significant improvements to the region proposal network (RPN) of the Faster R-CNN, particularly in non-maximum suppression (NMS) and anchor generation techniques, to better detect anomalies in diverse and complex environments. Furthermore, the MSCAE is integrated with Long Short-Term Memory (LSTM) neural networks to classify the detected anomalies, creating an end-to-end solution for video anomaly detection. Hyperparameter optimization for our deep learning models is performed using the Chameleon Swarm Algorithm, ensuring optimal model performance. Our framework was rigorously tested on the CUHK Avenue dataset, where it achieved a remarkable 99.5% accuracy, significantly outperforming existing methods and demonstrating the effectiveness of our approach. |
|---|---|
| DOI: | 10.1007/s11042-025-20608-5 |