DCTFormer: A Dual-Branch Transformer With Cloze Tests for Video Anomaly Detection
| Published in: | IEEE Transactions on Multimedia, pp. 1–11 |
|---|---|
| Main Authors: | , , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | IEEE, 2025 |
| Subjects: | |
| ISSN: | 1520-9210, 1941-0077 |
| Online Access: | Get full text |
| Summary: | Video anomaly detection is of critical importance in safety-critical scenarios. The key challenge is to effectively capture the spatio-temporal features of videos and learn normal patterns from the training data. However, existing methods often fall short in modelling intra-channel and inter-channel correlations as well as dynamic dependencies between video frames, which limits model robustness and generalization. To address these issues, we propose DCTFormer, a dual-branch framework that integrates RGB and optical flow branches for video anomaly detection. First, we design a novel module, TRAECT (Transformer-based Residual Autoencoder with Cloze Tests), which incorporates high-level semantics and temporal context to improve the ability to learn spatio-temporal relationships by capturing intra-channel and inter-channel correlations. More importantly, conditioned on the RGB branch, we propose a new optical flow completion approach that incorporates richer motion dynamics, learning dynamic dependencies between video frames and optical flows through a conditional variational autoencoder. Finally, we introduce an ensemble strategy to compute anomaly scores for both branches, thereby fully exploiting the modality information of each branch. Experiments on three challenging benchmark datasets demonstrate the efficacy of our framework, which outperforms current state-of-the-art approaches in anomaly detection performance. |
|---|---|
| DOI: | 10.1109/TMM.2025.3613082 |