DRIT++: Diverse Image-to-Image Translation via Disentangled Representations

Bibliographic details
Published in:	International Journal of Computer Vision, Vol. 128, No. 10-11, pp. 2402-2417
Main authors:	Lee, Hsin-Ying; Tseng, Hung-Yu; Mao, Qi; Huang, Jia-Bin; Lu, Yu-Ding; Singh, Maneesh; Yang, Ming-Hsuan
Format:	Journal Article
Language:	English
Publication details:	New York: Springer US (Springer), 1 November 2020
Subjects:	Artificial Intelligence; Computer Imaging; Computer Science; Image Processing and Computer Vision; Pattern Recognition; Pattern Recognition and Graphics; Special Issue on Generative Adversarial Networks for Computer Vision; Vision
ISSN:	0920-5691, 1573-1405

Description
Summary:	Image-to-image translation aims to learn the mapping between two visual domains. There are two main challenges for this task: (1) the lack of aligned training pairs and (2) multiple possible outputs from a single input image. In this work, we present an approach based on disentangled representations for generating diverse outputs without paired training images. To synthesize diverse outputs, we propose to embed images onto two spaces: a domain-invariant content space capturing shared information across domains and a domain-specific attribute space. Our model takes the encoded content features extracted from a given input and attribute vectors sampled from the attribute space to synthesize diverse outputs at test time. To handle unpaired training data, we introduce a cross-cycle consistency loss based on disentangled representations. Qualitative results show that our model can generate diverse and realistic images on a wide range of tasks without paired training data. For quantitative evaluations, we measure realism with a user study and Fréchet inception distance, and measure diversity with the perceptual distance metric, Jensen–Shannon divergence, and number of statistically-different bins.
DOI:	10.1007/s11263-019-01284-z
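
Illustrative note: the summary describes splitting each image into a shared content code and a domain-specific attribute code, swapping codes across domains, and sampling attribute vectors for diversity. The sketch below is one plausible reading of that idea and not the authors' implementation. It assumes toy linear encoders and generators built with NumPy, so the second swap recovers the inputs exactly; every name in it (`make_domain`, `cross_cycle_loss`, the dimensions, the standard normal attribute prior) is hypothetical.

```python
import numpy as np

CONTENT_DIM, ATTR_DIM = 2, 2
DIM = CONTENT_DIM + ATTR_DIM


def make_domain(rng):
    """Build toy (content encoder, attribute encoder, generator) for one domain.

    Assumption: an "image" is a vector that is a linear mix of a content
    code and an attribute code. A real model would use learned networks.
    """
    mix = rng.normal(size=(DIM, DIM)) + 3.0 * np.eye(DIM)
    unmix = np.linalg.inv(mix)  # raises LinAlgError if mix is singular

    def encode_content(img):
        return (unmix @ img)[:CONTENT_DIM]

    def encode_attr(img):
        return (unmix @ img)[CONTENT_DIM:]

    def generate(content, attr):
        return mix @ np.concatenate([content, attr])

    return encode_content, encode_attr, generate


def cross_cycle_loss(x, y, dom_x, dom_y):
    """L1 error after swapping content codes twice between two unpaired images."""
    enc_c_x, enc_a_x, gen_x = dom_x
    enc_c_y, enc_a_y, gen_y = dom_y

    # First swap: keep each domain's attribute, exchange the content codes.
    u = gen_x(enc_c_y(y), enc_a_x(x))  # looks like domain X, content of y
    v = gen_y(enc_c_x(x), enc_a_y(y))  # looks like domain Y, content of x

    # Second swap: exchange content again, which should recover x and y.
    x_hat = gen_x(enc_c_y(v), enc_a_x(u))
    y_hat = gen_y(enc_c_x(u), enc_a_y(v))
    return np.abs(x_hat - x).sum() + np.abs(y_hat - y).sum()


rng = np.random.default_rng(0)
dom_x, dom_y = make_domain(rng), make_domain(rng)
x, y = rng.normal(size=DIM), rng.normal(size=DIM)  # unpaired toy "images"

loss = cross_cycle_loss(x, y, dom_x, dom_y)

# Diverse outputs at test time: fix the content of x and sample several
# attribute vectors (assumed here to follow a standard normal prior).
content_x = dom_x[0](x)
samples = [dom_y[2](content_x, rng.normal(size=ATTR_DIM)) for _ in range(3)]
```

In this toy setup the loss is zero up to floating-point error because the encoders exactly invert the generators; in a trained model the same quantity would be minimized as a training objective. The three `samples` share the content code of `x` but differ in their attribute codes, which is the sense in which one input yields multiple outputs.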