Attention-based vector quantized variational autoencoder for anomaly detection by using orthogonal subspace constraints
This paper introduces a new framework that uses a vector quantized variational autoencoder (VQVAE) enhanced by orthogonal subspace constraints (OSC) and pyramid criss-cross attention (PCCA). The framework was designed for anomaly detection in industrial product image datasets. Previous studies on mo...
Gespeichert in:
| Veröffentlicht in: | Pattern recognition Jg. 164; S. 111500 |
|---|---|
| Hauptverfasser: | , , , |
| Format: | Journal Article |
| Sprache: | Englisch |
| Veröffentlicht: |
Elsevier Ltd
01.08.2025
|
| Schlagworte: | |
| ISSN: | 0031-3203 |
| Online-Zugang: | Volltext |
| Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
| Zusammenfassung: | This paper introduces a new framework that uses a vector quantized variational autoencoder (VQVAE) enhanced by orthogonal subspace constraints (OSC) and pyramid criss-cross attention (PCCA). The framework was designed for anomaly detection in industrial product image datasets. Previous studies on modeling low-dimensional feature distributions have been unable to effectively distinguish between normal features and noisy/abnormal information, which is effectively addressed using OSC in this study. Then, the vector quantized mechanism is embodied in these two complementary subspaces to obtain normal and abnormal embedding subspaces and discrete representations for normal and noisy information, respectively. The proposed approach robustly represents low-dimensional discrete manifolds to present the information from normal data using a limited number of feature vectors. Additionally, two PCCA modules are proposed to capture feature maps from different layers in the encoder and decoder, benefitting the low-dimensional mapping and reconstruction process. The features of different layers are treated as the query (Q), key (K), and value (V), which could capture both low-level and high-level features, incorporating comprehensive contextual information. The effectiveness of the proposed framework for anomaly detection is assessed by comparing its performance with those of the state-of-the-art approaches on various publicly available industrial product image datasets.
•Introduced a novel vector quantized variational autoencoder (VQVAE) framework for robust anomaly detection.•Utilized pyramid criss-cross attention (PCCA) modules to enhance feature encoding and reconstruction.•Effectively presented normal and abnormal features by using orthogonal subspace constraints (OSCs).•Achieved notable improvements in anomaly detection on several industrial image datasets. |
|---|---|
| ISSN: | 0031-3203 |
| DOI: | 10.1016/j.patcog.2025.111500 |