CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion
| Published in: | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), pp. 5906-5916 |
|---|---|
| Main authors: | , , , , , , , |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 01.06.2023 |
| ISSN: | 1063-6919 |
| Summary: | Multi-modality (MM) image fusion aims to render fused images that maintain the merits of different modalities, e.g., functional highlights and detailed textures. To tackle the challenges of modeling cross-modality features and decomposing desirable modality-specific and modality-shared features, we propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network. First, CDDFuse uses Restormer blocks to extract cross-modality shallow features. We then introduce a dual-branch Transformer-CNN feature extractor, with Lite Transformer (LT) blocks leveraging long-range attention to handle low-frequency global features and Invertible Neural Network (INN) blocks focusing on extracting high-frequency local information. A correlation-driven loss is further proposed to make the low-frequency features correlated while keeping the high-frequency features uncorrelated, based on the embedded information. The LT-based global fusion and INN-based local fusion layers then output the fused image. Extensive experiments demonstrate that CDDFuse achieves promising results on multiple fusion tasks, including infrared-visible image fusion and medical image fusion. We also show that CDDFuse boosts performance on downstream infrared-visible semantic segmentation and object detection in a unified benchmark. The code is available at https://github.com/Zhaozixiang1228/MMIF-CDDFuse. |
|---|---|
| DOI: | 10.1109/CVPR52729.2023.00572 |
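
The core training signal the abstract describes is the correlation-driven decomposition loss: base (low-frequency) features of the two modalities should be highly correlated, while detail (high-frequency) features should be decorrelated. Below is a minimal PyTorch sketch of one plausible form of that loss. The Pearson-correlation measure over flattened feature maps, the ratio formulation, the clamping, and all function names and tensor shapes are illustrative assumptions, not the authors' verified implementation; see the linked repository for the actual code.

```python
import torch


def correlation_coefficient(a: torch.Tensor, b: torch.Tensor,
                            eps: float = 1e-6) -> torch.Tensor:
    """Pearson correlation between feature maps of shape (B, C, H, W),
    computed over all elements of each sample, then averaged over the batch."""
    a = a.flatten(start_dim=1)
    b = b.flatten(start_dim=1)
    a = a - a.mean(dim=1, keepdim=True)
    b = b - b.mean(dim=1, keepdim=True)
    num = (a * b).sum(dim=1)
    den = a.norm(dim=1) * b.norm(dim=1) + eps
    return (num / den).mean()


def correlation_driven_loss(base_ir: torch.Tensor, base_vis: torch.Tensor,
                            detail_ir: torch.Tensor, detail_vis: torch.Tensor,
                            eps: float = 1e-6) -> torch.Tensor:
    """Hypothetical decomposition loss: small when the base features of the two
    modalities correlate strongly and the detail features do not, matching the
    behavior the abstract describes. The exact ratio form is an assumption."""
    cc_base = correlation_coefficient(base_ir, base_vis)
    cc_detail = correlation_coefficient(detail_ir, detail_vis)
    # Clamping guards against a non-positive base correlation early in
    # training (an assumption made for numerical stability in this sketch).
    return cc_detail ** 2 / torch.clamp(cc_base, min=eps)


# Example usage with random feature maps (shapes are purely illustrative):
b_ir, b_vis = torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32)
d_ir, d_vis = torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32)
loss = correlation_driven_loss(b_ir, b_vis, d_ir, d_vis)
```

In training, `base_*` would come from the LT (low-frequency, global) branch and `detail_*` from the INN (high-frequency, local) branch for the infrared and visible inputs; minimizing the loss pushes the base features of the two modalities together while driving their detail features apart.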