Dual Cross-Attention for medical image segmentation
| Published in: | Engineering applications of artificial intelligence Vol. 126; p. 107139 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Ltd, 01.11.2023 |
| Subjects: | |
| ISSN: | 0952-1976, 1873-6769 |
| Summary: | We propose Dual Cross-Attention (DCA), a simple yet effective attention module that enhances skip-connections in U-Net-based architectures for medical image segmentation. The plain skip-connection scheme in U-Net-based architectures struggles to capture multi-scale context, resulting in a semantic gap between encoder and decoder features. This semantic gap causes redundancy between low- and high-level features, which ultimately limits segmentation performance. In this paper, we address this issue by sequentially capturing channel and spatial dependencies across multi-scale encoder features, adaptively combining low- and high-level features at various scales to bridge the semantic gap. First, the Channel Cross-Attention (CCA) module extracts global channel-wise dependencies by applying cross-attention across the channel tokens of multi-scale encoder features. Then, the Spatial Cross-Attention (SCA) module applies cross-attention across spatial tokens to capture spatial dependencies. Finally, these fine-grained encoder features are up-sampled and connected to their corresponding decoder parts to form the skip-connection scheme. The proposed DCA module can be integrated into any encoder–decoder architecture with skip-connections, such as U-Net and its variants, as well as advanced architectures based on vision transformers. Experimental results on six medical image segmentation datasets demonstrate that the DCA module consistently improves overall segmentation performance with only a slight increase in parameters. Our code is available at: https://github.com/gorkemcanates/Dual-Cross-Attention.
• The skip-connection schemes in U-Net and its variants introduce a semantic gap between the encoder and decoder in medical image segmentation.
• We propose Dual Cross-Attention (DCA) to enhance skip-connections in U-Net-based architectures for medical image segmentation.
• DCA utilizes cross-attention in both channel and spatial dimensions to capture the rich global context.
• The proposed DCA module can be integrated into any encoder–decoder architecture with skip-connections, such as U-Net and its variants.
• DCA significantly improves existing U-Net-based methods with a slight computational overhead. |
|---|---|
| DOI: | 10.1016/j.engappai.2023.107139 |
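
The pipeline described in the summary (tokenize multi-scale encoder features, apply channel cross-attention, then spatial cross-attention, then up-sample back to each encoder resolution) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which is available in the repository linked above. All function names, the average-pooling tokenizer, the scaling factors, and the nearest-neighbour up-sampling are assumptions made for illustration. Learned projections, normalization layers, and residual connections are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def to_tokens(feat, grid):
    # Assumption: average-pool a (C, H, W) map onto a grid x grid layout
    # so every scale yields the same number of tokens P = grid * grid.
    c, h, w = feat.shape
    ph, pw = h // grid, w // grid
    pooled = feat.reshape(c, grid, ph, grid, pw).mean(axis=(2, 4))
    return pooled.reshape(c, grid * grid).T  # (P, C)

def channel_cross_attention(tokens):
    # Queries come from each scale, keys and values from the channel-wise
    # concatenation of all scales; attention weights relate channels.
    cat = np.concatenate(tokens, axis=1)  # (P, C_total)
    out = []
    for t in tokens:
        q, k, v = t.T, cat.T, cat.T  # (C_i, P), (C_total, P), (C_total, P)
        attn = softmax(q @ k.T / np.sqrt(k.shape[0]))  # (C_i, C_total)
        out.append((attn @ v).T)  # back to (P, C_i)
    return out

def spatial_cross_attention(tokens):
    # Queries and keys come from the concatenated tokens, values from each
    # scale; attention weights relate spatial positions.
    cat = np.concatenate(tokens, axis=1)  # (P, C_total)
    attn = softmax(cat @ cat.T / np.sqrt(cat.shape[1]))  # (P, P)
    return [attn @ t for t in tokens]  # each (P, C_i)

def upsample(tokens, grid, size):
    # Assumption: nearest-neighbour up-sampling back to the encoder resolution.
    c = tokens.shape[1]
    fmap = tokens.T.reshape(c, grid, grid)
    return fmap.repeat(size[0] // grid, axis=1).repeat(size[1] // grid, axis=2)

def dual_cross_attention(features, grid=4):
    # Sequential CCA then SCA over tokens from all encoder stages.
    tokens = [to_tokens(f, grid) for f in features]
    tokens = channel_cross_attention(tokens)
    tokens = spatial_cross_attention(tokens)
    return [upsample(t, grid, f.shape[1:]) for t, f in zip(tokens, features)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three hypothetical encoder stages with shapes (C, H, W).
    feats = [rng.standard_normal(s) for s in [(8, 32, 32), (16, 16, 16), (32, 8, 8)]]
    for f, o in zip(feats, dual_cross_attention(feats)):
        print(f.shape, "->", o.shape)
```

Each output has the same shape as its encoder input, so in a U-Net-style model it could be passed to the corresponding decoder stage in place of the plain skip-connection.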