Multi-modal deep network for RGB-D segmentation of clothes

Bibliographic Details
Published in: Electronics Letters, Vol. 56, No. 9, pp. 432-435
Main authors: Joukovsky, B., Hu, P., Munteanu, A.
Format: Journal Article
Language: English
Published: The Institution of Engineering and Technology, 30.04.2020
ISSN: 0013-5194, 1350-911X
Online Access: Full Text
Description
Abstract: In this Letter, the authors propose a deep-learning-based method to perform semantic segmentation of clothes from RGB-D images of people. First, they present a synthetic dataset containing more than 50,000 RGB-D samples of characters in different clothing styles, featuring various poses and environments for a total of nine semantic classes. The proposed data generation pipeline allows for fast production of RGB images, depth images, and ground-truth label maps. Second, a novel multi-modal encoder-decoder convolutional network is proposed which operates on the RGB and depth modalities. Multi-modal features are merged using trained fusion modules which apply multi-scale atrous convolutions in the fusion process. The method is numerically evaluated on synthetic data and visually assessed on real-world data. The experiments demonstrate the advantages of the proposed model over existing methods.
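The abstract only summarizes the fusion design, so the following is a minimal PyTorch sketch of one plausible reading: concatenate the RGB and depth feature maps, run parallel dilated (atrous) 3x3 convolutions at several rates, and merge the multi-scale responses with a 1x1 projection. The class name AtrousFusion, the dilation rates, and the channel sizes are illustrative assumptions, not the authors' published implementation.

```python
import torch
import torch.nn as nn

class AtrousFusion(nn.Module):
    """Hypothetical fusion module: merges RGB and depth feature maps
    using parallel atrous (dilated) convolutions at multiple rates."""

    def __init__(self, channels, rates=(1, 2, 4)):
        super().__init__()
        # One dilated 3x3 branch per rate, applied to the concatenated
        # RGB+depth features; padding=rate keeps the spatial size fixed.
        self.branches = nn.ModuleList(
            nn.Conv2d(2 * channels, channels, kernel_size=3,
                      padding=r, dilation=r)
            for r in rates
        )
        # 1x1 convolution projects the stacked multi-scale responses
        # back to a single fused feature stream.
        self.project = nn.Conv2d(len(rates) * channels, channels,
                                 kernel_size=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, rgb_feat, depth_feat):
        x = torch.cat([rgb_feat, depth_feat], dim=1)
        multi_scale = torch.cat([self.relu(b(x)) for b in self.branches],
                                dim=1)
        return self.relu(self.project(multi_scale))

# Usage: fuse 64-channel feature maps from the two encoder streams.
fusion = AtrousFusion(channels=64)
fused = fusion(torch.randn(1, 64, 56, 56), torch.randn(1, 64, 56, 56))
print(fused.shape)  # torch.Size([1, 64, 56, 56])
```

Using several dilation rates in parallel lets the fusion module combine the two modalities at multiple receptive-field sizes without downsampling, which is the usual motivation for atrous convolutions in segmentation networks.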
DOI: 10.1049/el.2019.4150