Multi-Modal Image Super-Resolution via Deep Convolutional Transform Learning

Detailed bibliography
Published in: 2024 32nd European Signal Processing Conference (EUSIPCO), pp. 671-675
Main authors: Kumar, Kriti; Majumdar, Angshul; Kumar, A Anil; Chandra, M Girish
Format: Conference paper
Language: English
Published: European Association for Signal Processing - EURASIP, 26.08.2024
ISSN:2076-1465
Description
Abstract: Real-world situations often involve processing data from diverse imaging modalities such as Multispectral (MS), Near Infrared (NIR), and RGB, each capturing different aspects of the same scene. These modalities often vary in spatial and spectral resolution. Hence, Multi-modal Image Super-Resolution (MISR) techniques are required to improve the spatial/spectral resolution of the target modality with the help of a High Resolution (HR) guidance modality that shares common features such as textures, edges, and other structures. Traditional MISR approaches using Convolutional Neural Networks (CNNs) typically employ an encoder-decoder architecture, which is prone to overfitting in data-limited scenarios. This work proposes a novel deep convolutional analysis sparse coding method that utilizes convolutional transforms within a fusion framework, eliminating the need for a decoder network. This reduces the number of trainable parameters and makes the method better suited to data-limited application scenarios. A joint optimization framework is proposed, which learns deep convolutional transforms for Low Resolution (LR) images of the target modality and HR images of the guidance modality, along with a fusion transform that combines these transform features to reconstruct HR images of the target modality. In contrast to dictionary-based synthesis sparse coding methods for MISR, the proposed approach offers improved performance with reduced complexity, leveraging the inherent advantages of transform learning. The efficacy of the proposed method is demonstrated on RGB-NIR and RGB-MS datasets, showing superior reconstruction performance compared to state-of-the-art techniques without introducing additional artifacts from the guidance image.
DOI:10.23919/EUSIPCO63174.2024.10715007
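As a rough illustration of the analysis-sparse-coding idea the abstract describes, the sketch below applies a convolutional analysis transform (filter bank plus hard thresholding) to a toy target and guidance image and fuses the resulting feature channels with a linear map. Every name, the random filter banks, and the fusion weights are hypothetical stand-ins for exposition only; they are not the authors' learned transforms or their joint optimization procedure.

```python
import numpy as np


def conv2d_same(image, kernel):
    """Cross-correlation with zero padding, CNN-style ('same' output size).
    (A true convolution would flip the kernel; irrelevant for this sketch.)"""
    k = kernel.shape[0]
    padded = np.pad(image, k // 2)
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    return np.einsum("ijkl,kl->ij", windows, kernel)


def analysis_transform(image, filters, threshold=0.1):
    """Convolutional analysis transform: convolve the image with each filter,
    then sparsify the coefficient maps by hard thresholding."""
    feats = np.stack([conv2d_same(image, f) for f in filters])
    return feats * (np.abs(feats) >= threshold)  # zero out small coefficients


rng = np.random.default_rng(0)
lr_target = rng.standard_normal((16, 16))  # toy LR target-modality image
hr_guide = rng.standard_normal((16, 16))   # toy HR guidance-modality image

# Hypothetical "learned" 3x3 filter banks (random stand-ins here).
filters_t = rng.standard_normal((4, 3, 3)) * 0.1
filters_g = rng.standard_normal((4, 3, 3)) * 0.1

z_t = analysis_transform(lr_target, filters_t)  # target features, (4, 16, 16)
z_g = analysis_transform(hr_guide, filters_g)   # guidance features, (4, 16, 16)

# Fusion transform: a linear map across the concatenated feature channels,
# producing the HR target-modality estimate.
fused = np.concatenate([z_t, z_g])           # (8, 16, 16)
W = rng.standard_normal((1, 8)) * 0.1        # hypothetical fusion weights
hr_estimate = np.tensordot(W, fused, axes=([1], [0]))[0]
print(hr_estimate.shape)  # (16, 16)
```

In the paper's formulation these transforms and the fusion map are learned jointly rather than drawn at random; the point of the sketch is only that the analysis (transform) direction maps images to sparse features without requiring a decoder network to map features back to images.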