Multi-Stream 3D latent feature clustering for abnormality detection in videos


Published in: Applied Intelligence (Dordrecht, Netherlands), Vol. 52, No. 1, pp. 1126-1143
Main authors: Asad, Mujtaba; Jiang, He; Yang, Jie; Tu, Enmei; Malik, Aftab Ahmad
Format: Journal Article
Language: English
Published: New York: Springer US, 01.01.2022
Springer Nature B.V
ISSN:0924-669X, 1573-7497
Description
Summary: Detection of abnormal behavior in surveillance videos is essential for public safety and monitoring. However, human-based surveillance systems require constant focus and attention, which is difficult to sustain. Therefore, automatic detection of such events is of great significance. Abnormal event detection is a challenging problem due to the scarcity of labelled data and the low probability of occurrence of such events. In this paper, we propose a novel multi-stream two-stage architecture to detect abnormal behavior in videos. Our contributions are three-fold: 1) In the first stage, we propose a 3D Convolutional Autoencoder (3DCAE) architecture that extracts appearance and motion features, in an unsupervised manner, from both the video frame and dynamic flow input streams of training videos containing normal events. 2) We use a multi-objective loss function for 3DCAE reconstruction that focuses more on foreground moving objects than on stationary background information. 3) In the second stage, the fused latent features from the video frame and dynamic flow inputs are grouped into different clusters of normality. We then eliminate the smaller or sparse clusters, which are assumed to contain noisy patterns in the training data, so that the remaining clusters represent stronger normality patterns. A Deep one-class Support Vector Data Description (SVDD) classifier is then trained on these 3D normality clusters to generate an anomaly score for each sample, differentiating between normal and abnormal occurrences. Experimental results on three benchmark datasets (UCSD Pedestrian, Shanghai Tech, and Avenue) show significant improvement in performance compared to state-of-the-art approaches.
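The second stage described in the summary (cluster the fused latent features, discard small clusters, then score samples) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not name the clustering algorithm, so plain k-means with a deterministic farthest-point initialization is assumed here, and the distance to the nearest retained cluster center is used as a simple stand-in for the Deep SVDD classifier. The function names, the `min_size` threshold, and the toy 2-D "latent features" are all hypothetical.

```python
import math


def farthest_point_init(points, k):
    """Deterministic init (an assumption): start at points[0], then repeatedly
    add the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return [list(c) for c in centers]


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; returns (centers, labels)."""
    centers = farthest_point_init(points, k)
    labels = []
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        # Move each center to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


def prune_small_clusters(centers, labels, min_size):
    """Keep only centers whose cluster has at least min_size members;
    small clusters are treated as noise in the training data."""
    counts = [labels.count(j) for j in range(len(centers))]
    return [c for c, n in zip(centers, counts) if n >= min_size]


def anomaly_score(x, kept_centers):
    """Stand-in for Deep SVDD: distance to the nearest retained center.
    Larger values mean the sample is further from learned normality."""
    return min(math.dist(x, c) for c in kept_centers)


def make_demo_features():
    """Toy 2-D 'latent features': two dense normal groups plus two noisy points."""
    grid = [(i * 0.2, j * 0.2) for i in range(6) for j in range(6)]
    group_a = grid
    group_b = [(x + 10.0, y + 10.0) for x, y in grid]
    noise = [(50.0, -50.0), (51.0, -49.0)]
    return group_a + group_b + noise


features = make_demo_features()
centers, labels = kmeans(features, k=3)
normality_centers = prune_small_clusters(centers, labels, min_size=5)

print("retained clusters:", len(normality_centers))
print("score near a normal group:", anomaly_score((0.4, 0.6), normality_centers))
print("score far from any group:", anomaly_score((50.0, -50.0), normality_centers))
```

In this toy setup the two-point noisy cluster falls below `min_size` and is dropped, so a test sample resembling that noise receives a high score instead of being treated as normal. A real pipeline would operate on high-dimensional 3DCAE latent features and train a one-class classifier rather than use raw center distances.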
DOI:10.1007/s10489-021-02356-9