Dynamic Feature Integration for Simultaneous Detection of Salient Object, Edge and Skeleton

Bibliographic Details
Published in:	IEEE Transactions on Image Processing, Vol. 29, p. 1
Main Authors:	Liu, Jiang-Jiang; Hou, Qibin; Cheng, Ming-Ming
Format:	Journal Article
Language:	English
Published:	United States: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.01.2020
Subjects:	Ablation; Annotations; Datasets; Edge detection; Image segmentation; Joint learning; Modules; Salience; Salient object segmentation; Skeleton extraction
ISSN:	1057-7149, 1941-0042

Description
Summary:	Salient object segmentation, edge detection, and skeleton extraction are three contrasting low-level pixel-wise vision problems, for which existing works have mostly focused on designing tailored methods for each individual task. However, it is inconvenient and inefficient to store a pre-trained model for each task and perform multiple different tasks in sequence. Some methods solve specific related tasks jointly, but they require datasets that provide the different types of annotations at the same time. In this paper, we first show some similarities shared by these tasks and then demonstrate how they can be leveraged for developing a unified framework that can be trained end-to-end. In particular, we introduce a selective integration module that allows each task to dynamically choose features at different levels from the shared backbone based on its own characteristics. Furthermore, we design a task-adaptive attention module, aiming at intelligently allocating information for different tasks according to the image content priors. To evaluate the performance of our proposed network on these tasks, we conduct exhaustive experiments on multiple representative datasets. We show that although these tasks are naturally quite different, our network works well on all of them and even performs better than current single-purpose state-of-the-art methods. In addition, we conduct ablation analyses that provide a full understanding of the design principles of the proposed framework. To facilitate future research, source code will be released.
DOI:	10.1109/TIP.2020.3017352