Single image defocus deblurring via multimodal-guided diffusion and depth-aware fusion

Bibliographic Details
Published in: Pattern Recognition, Vol. 171, p. 112133
Main Authors: Li, Xiaopan; Wu, Shiqian; Zhu, Qile; Xie, Shoulie; Agaian, Sos
Format: Journal Article
Language: English
Published: Elsevier Ltd 01.03.2026
ISSN: 0031-3203
Description
Summary: Defocus blur is inherently related to depth of field, yet its complex spatial variations pose significant challenges for existing methods. Two primary limitations persist: (1) insufficient semantic perception, introducing artifacts in degraded regions, and (2) the inability to explicitly infer depth cues, leading to over-sharpening in focused areas and inadequate restoration in defocused regions. To address these issues, we propose MDDF-SIDD, a novel two-stage framework that combines multimodal-guided diffusion and depth-aware fusion for single-image defocus deblurring. In the first stage, we construct a multimodal representation by integrating fidelity-aware image features with blur-adaptive prompt features. This representation guides a pretrained text-to-image diffusion model to achieve high perceptual quality with precise structural and semantic alignment. In the second stage, we exploit depth priors to generate a depth-adaptive weight map, distinguishing focused regions from defocused ones. This enables a Laplacian pyramid-based fusion strategy, which adaptively blends sharp details from the original image with the refined estimation from the first stage, ensuring both local detail preservation and global consistency. Extensive experiments demonstrate that MDDF-SIDD outperforms state-of-the-art methods in both quantitative metrics and perceptual fidelity, setting a new benchmark for defocus deblurring.
DOI: 10.1016/j.patcog.2025.112133
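
Illustrative sketch: the second stage described in the summary (a depth-adaptive weight map driving a Laplacian pyramid fusion of the original image and the first-stage estimate) can be illustrated roughly as follows. This is not the authors' implementation. The function names, the Gaussian-shaped `focus_weight`, its `focus_depth` and `sigma` parameters, and the simple 2x2 box-filter pyramid are all assumptions chosen for brevity. Inputs are assumed to be 2-D float arrays whose sides are divisible by 2^(levels-1).

```python
import numpy as np


def downsample(img):
    """Halve resolution by averaging each 2x2 block (sides must be even)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(img):
    """Double resolution by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


def smooth_pyramid(img, levels):
    """Coarse-to-fine stack; a box filter stands in for a Gaussian kernel."""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(downsample(pyr[-1]))
    return pyr


def laplacian_pyramid(img, levels):
    """Band-pass residuals per level, plus the coarsest low-pass image."""
    sp = smooth_pyramid(img, levels)
    bands = [sp[k] - upsample(sp[k + 1]) for k in range(levels - 1)]
    bands.append(sp[-1])
    return bands


def focus_weight(depth, focus_depth, sigma):
    """Hypothetical weight map: near 1 at the focal depth, falling toward 0."""
    return np.exp(-((depth - focus_depth) ** 2) / (2.0 * sigma ** 2))


def fuse(original, restored, weight, levels=3):
    """Blend per pyramid level: weight 1 keeps original, 0 keeps restored."""
    lo = laplacian_pyramid(original, levels)
    lr = laplacian_pyramid(restored, levels)
    wp = smooth_pyramid(weight, levels)
    fused = [w * a + (1.0 - w) * b for w, a, b in zip(wp, lo, lr)]
    # Collapse the fused pyramid from coarsest to finest level.
    out = fused[-1]
    for band in reversed(fused[:-1]):
        out = upsample(out) + band
    return out
```

Blending each frequency band separately with a smoothed copy of the weight map is the usual motivation for pyramid fusion: transitions between regions are spread over a width that matches each band, which tends to avoid visible seams. A call would look like `fuse(original, restored, focus_weight(depth, focus_depth=2.0, sigma=0.5))`, where `depth` is a depth map supplied by some external estimator.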