FADE: A Task-Agnostic Upsampling Operator for Encoder–Decoder Architectures

Detailed bibliography
Published in:	International Journal of Computer Vision, Vol. 133, No. 1, pp. 151-172
Main authors:	Lu, Hao; Liu, Wenze; Fu, Hongtao; Cao, Zhiguo
Format:	Journal Article
Language:	English
Publication details:	New York: Springer US, 01.01.2025; Springer Nature B.V
Subjects:	Artificial Intelligence; Computer Imaging; Computer Science; Encoders-Decoders; Image Processing and Computer Vision; Image segmentation; Parameter sensitivity; Pattern Recognition; Pattern Recognition and Graphics; Semantic segmentation; Semantics; Vision; Instance segmentation; Feature upsampling; Object detection; Image matting; Dense prediction; Depth estimation
ISSN:	0920-5691, 1573-1405

Description
Summary:	The goal of this work is to develop a task-agnostic feature upsampling operator for dense prediction, one that facilitates not only region-sensitive tasks such as semantic segmentation but also detail-sensitive tasks such as image matting. Prior upsampling operators often work well on one type of task but not both. We argue that task-agnostic upsampling should dynamically trade off between semantic preservation and detail delineation, rather than being biased toward one of the two properties. In this paper, we present FADE, a novel, plug-and-play, lightweight, and task-agnostic upsampling operator that fuses the assets of decoder and encoder features at three levels: (i) considering both encoder and decoder features in upsampling kernel generation; (ii) controlling the per-point contribution of the encoder/decoder features in upsampling kernels with an efficient semi-shift convolutional operator; and (iii) enabling the selective pass of encoder features with a decoder-dependent gating mechanism to compensate for details. To improve the practicality of FADE, we additionally study parameter- and memory-efficient implementations of semi-shift convolution. We analyze the upsampling behavior of FADE on toy data and show through large-scale experiments that FADE is task-agnostic, with consistent performance improvement on a number of dense prediction tasks at little extra cost. For the first time, we demonstrate robust feature upsampling on both region- and detail-sensitive tasks. Code is available at: https://github.com/poppinace/fade
DOI:	10.1007/s11263-024-02191-8
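
Illustrative note: item (iii) of the summary, the decoder-dependent gate that selectively passes encoder features, can be pictured with the minimal NumPy sketch below. This is not the authors' implementation (see the repository linked in the summary for that). The function names, the single-channel gate produced by a 1x1 projection, and the nearest-neighbor upsampling used as a stand-in for FADE's kernel-based upsampling are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def nearest_upsample(x, scale=2):
    """Nearest-neighbor upsampling of a (C, H, W) array along H and W."""
    return x.repeat(scale, axis=1).repeat(scale, axis=2)


def gated_fusion(encoder_feat, decoder_feat, gate_weights, scale=2):
    """Hypothetical decoder-dependent gating (illustrative only).

    encoder_feat: (C, scale*H, scale*W) high-resolution encoder feature
    decoder_feat: (C, H, W) low-resolution decoder feature
    gate_weights: (C,) weights of an assumed 1x1 projection to one channel
    """
    # The gate is computed from the decoder feature alone: (C, H, W) -> (H, W).
    gate_logits = np.tensordot(gate_weights, decoder_feat, axes=([0], [0]))
    gate = sigmoid(gate_logits)[None]            # (1, H, W), values in (0, 1)
    gate_up = nearest_upsample(gate, scale)      # match encoder resolution
    # Stand-in for the upsampled decoder feature (assumption: nearest neighbor).
    decoder_up = nearest_upsample(decoder_feat, scale)
    # Per-point blend: gate near 1 passes encoder detail, near 0 keeps decoder.
    return gate_up * encoder_feat + (1.0 - gate_up) * decoder_up


rng = np.random.default_rng(0)
enc = rng.normal(size=(3, 4, 4))
dec = rng.normal(size=(3, 2, 2))
fused = gated_fusion(enc, dec, gate_weights=np.zeros(3))
print(fused.shape)
```

With all-zero gate weights the gate equals 0.5 everywhere, so the result is the plain average of the encoder feature and the upsampled decoder feature; a learned gate would instead vary per spatial location.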