Single image defocus deblurring via multimodal-guided diffusion and depth-aware fusion

Bibliographic Details
Published in:	Pattern recognition, Vol. 171, p. 112133
Main Authors:	Li, Xiaopan, Wu, Shiqian, Zhu, Qile, Xie, Shoulie, Agaian, Sos
Format:	Journal Article
Language:	English
Published:	Elsevier Ltd, 1 March 2026
Subjects:	Defocus deblurring; Depth-aware fusion; Diffusion model; Multimodal representation
ISSN:	0031-3203
Online Access:	Full text

Description
Summary:	Defocus blur is inherently related to depth of field, yet its complex spatial variations pose significant challenges for existing methods. Two primary limitations persist: (1) insufficient semantic perception, introducing artifacts in degraded regions, and (2) the inability to explicitly infer depth cues, leading to over-sharpening in focused areas and inadequate restoration in defocused regions. To address these issues, we propose MDDF-SIDD, a novel two-stage framework that combines multimodal-guided diffusion and depth-aware fusion for single-image defocus deblurring. In the first stage, we construct a multimodal representation by integrating fidelity-aware image features with blur-adaptive prompt features. This representation guides a pretrained text-to-image diffusion model to achieve high perceptual quality with precise structural and semantic alignment. In the second stage, we exploit depth priors to generate a depth-adaptive weight map, distinguishing focused regions from defocused ones. This enables a Laplacian pyramid-based fusion strategy, which adaptively blends sharp details from the original image with the refined estimation from the first stage, ensuring both local detail preservation and global consistency. Extensive experiments demonstrate that MDDF-SIDD outperforms state-of-the-art methods in both quantitative metrics and perceptual fidelity, setting a new benchmark for defocus deblurring.
DOI:	10.1016/j.patcog.2025.112133
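
Illustrative note: the second stage described in the summary (a depth-derived weight map driving a Laplacian pyramid fusion of the original image and the first-stage estimate) can be sketched roughly as follows. This is not the authors' implementation. The Gaussian weighting in `focus_weight`, its `focus_depth` and `sigma` parameters, and the simple average-pooling pyramid are all assumptions chosen to keep the example short and dependent only on NumPy; the record does not say how the weight map or the pyramid is actually built.

```python
import numpy as np

def downsample(img):
    """Halve resolution by 2x2 average pooling (stand-in for blur + decimation)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double resolution by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def gaussian_pyramid(img, levels):
    """Return [full-res, 1/2, 1/4, ...] with levels + 1 entries."""
    pyr = [np.asarray(img, dtype=np.float64)]
    for _ in range(levels):
        pyr.append(downsample(pyr[-1]))
    return pyr

def laplacian_pyramid(img, levels):
    """Band-pass residuals per level, plus the coarsest low-pass image."""
    g = gaussian_pyramid(img, levels)
    bands = [g[i] - upsample(g[i + 1]) for i in range(levels)]
    return bands + [g[-1]]

def focus_weight(depth, focus_depth, sigma):
    """Hypothetical weight map: near 1 close to the focal depth, near 0 far from it."""
    return np.exp(-((np.asarray(depth, dtype=np.float64) - focus_depth) / sigma) ** 2)

def pyramid_fuse(original, restored, weight, levels=3):
    """Blend two grayscale images per pyramid level.

    weight = 1 keeps the original (in-focus) pixels,
    weight = 0 keeps the restored (deblurred) pixels.
    """
    h, w = original.shape
    if h % 2 ** levels or w % 2 ** levels:
        raise ValueError("image sides must be divisible by 2**levels")
    lap_orig = laplacian_pyramid(original, levels)
    lap_rest = laplacian_pyramid(restored, levels)
    w_pyr = gaussian_pyramid(weight, levels)
    fused = [m * a + (1.0 - m) * b
             for a, b, m in zip(lap_orig, lap_rest, w_pyr)]
    # Collapse the pyramid from coarsest to finest.
    out = fused[-1]
    for band in reversed(fused[:-1]):
        out = upsample(out) + band
    return out
```

With a weight map of all ones the sketch returns the original image, and with all zeros it returns the restored one; intermediate weights blend the two band by band, which is what avoids visible seams at the boundary between focused and defocused regions.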