Cross-Level Attentive Feature Aggregation for Change Detection

Bibliographic Details
Published in:	IEEE Transactions on Circuits and Systems for Video Technology, Vol. 34, No. 7, pp. 6051-6062
Main Authors:	Wang, Guangxing; Cheng, Gong; Zhou, Peicheng; Han, Junwei
Format:	Journal Article
Language:	English
Published:	New York: IEEE, 2024-07-01. The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	attention mechanism; Attention mechanisms; Change detection; Change detection algorithms; Effectiveness; feature aggregation; Feature extraction; feature pyramid network; Fuses; Logic gates; Modulation; Remote sensing; Transformers
ISSN:	1051-8215, 1558-2205

Description
Summary:	This article studies change detection within pairs of optical images remotely sensed from overhead views. We consider that a high-performance solution to this task entails highly effective multi-level feature interaction. With that in mind, we propose a novel approach characterized by two attentive feature aggregation schemes that handle cross-level features in different processes. For the Siamese-based feature extraction of the bi-temporal image pair, we place emphasis on constructing semantically strong and contextually rich pyramidal feature representations to enable comprehensive matching and differencing. To this end, we leverage a feature pyramid network and reformulate its cross-level feature merging procedure as top-down modulation with multiplicative channel attention and additive gated attention. For the multi-level difference feature fusion, we progressively fuse the derived difference feature pyramid in an attend-then-filter manner. This makes the high-level fused features and the adjacent lower-level difference features constrain each other, and thus allows steady feature fusion for specifying change regions. In addition, we build an upsampling head as a replacement for the normal heads followed by static upsampling. Our implementation contains a stack of upsampling modules that allocate features for each pixel. Each module has a learnable branch that produces attentive residuals for refining the statically upsampled results. We conduct extensive experiments on four public datasets, and the results show that our approach achieves state-of-the-art performance. Code is available at https://github.com/xingronaldo/CLAFA.
DOI:	10.1109/TCSVT.2023.3344092
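
Illustrative note: the summary describes reformulating the feature pyramid's cross-level merging as top-down modulation with multiplicative channel attention and additive gated attention. The NumPy sketch below shows one way such a merge step could look. It is not taken from the article or its repository: the function names, the weights `w_channel` and `w_gate`, the nearest-neighbour upsampling, and the exact form of both attention terms are assumptions made only for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def upsample_nearest_2x(x):
    # (C, H, W) -> (C, 2H, 2W): repeat every pixel along both spatial axes.
    return x.repeat(2, axis=1).repeat(2, axis=2)


def top_down_modulate(low, high, w_channel, w_gate):
    """Merge a higher-level map into the adjacent lower-level map.

    low:       (C, 2H, 2W) lower-level (finer) features
    high:      (C, H, W)   higher-level (coarser) features
    w_channel: (C, C)      hypothetical channel-attention weights
    w_gate:    (2C,)       hypothetical gate weights
    """
    # Multiplicative channel attention (assumed form): pool the
    # higher-level map per channel and rescale the lower-level channels.
    pooled = high.mean(axis=(1, 2))                  # (C,)
    channel_attn = sigmoid(w_channel @ pooled)       # (C,)
    modulated = low * channel_attn[:, None, None]

    # Additive gated attention (assumed form): a per-pixel gate decides
    # how much upsampled higher-level content is added back in.
    up = upsample_nearest_2x(high)                   # (C, 2H, 2W)
    stacked = np.concatenate([low, up], axis=0)      # (2C, 2H, 2W)
    gate = sigmoid(np.tensordot(w_gate, stacked, axes=1))  # (2H, 2W)
    return modulated + gate[None] * up


# Usage with random features and random (untrained) weights.
rng = np.random.default_rng(0)
low = rng.standard_normal((4, 8, 8))
high = rng.standard_normal((4, 4, 4))
merged = top_down_modulate(low, high,
                           rng.standard_normal((4, 4)),
                           rng.standard_normal(8))
print(merged.shape)
```

In a trained network the two weight arrays would be learned layers; here they are plain arrays so that the data flow of the multiplicative and additive paths stays visible.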