Cloth Interactive Transformer for Virtual Try-On

Detailed Bibliography
Title: Cloth Interactive Transformer for Virtual Try-On
Authors: Bin Ren, Hao Tang, Fanyang Meng, Runwei Ding, Philip H. S. Torr, Nicu Sebe
Source: ACM Transactions on Multimedia Computing, Communications and Applications, 20 (4)
Publication Status: Preprint
Publisher Information: Association for Computing Machinery (ACM), 2023.
Year of Publication: 2023
Subjects: FOS: Computer and information sciences; Computer Vision and Pattern Recognition (cs.CV); Computer Science - Computer Vision and Pattern Recognition; 0202 electrical engineering, electronic engineering, information engineering; 02 engineering and technology; Computing methodologies → Visual content-based indexing and retrieval; virtual try-on; transformer; garment transfer; cross attention
Description: 2D image-based virtual try-on has attracted increasing interest from the multimedia and computer vision fields due to its enormous commercial value. Nevertheless, most existing image-based virtual try-on approaches directly combine the person-identity representation and the in-shop clothing items without taking their mutual correlations into consideration. Moreover, these methods are commonly built on purely convolutional neural network (CNN) architectures, which struggle to capture long-range correlations among the input pixels and therefore tend to produce inconsistent results. To alleviate these issues, in this article we propose a novel two-stage cloth interactive transformer (CIT) method for the virtual try-on task. In the first stage, we design a CIT matching block that precisely captures the long-range correlations between the cloth-agnostic person information and the in-shop cloth information, making the warped in-shop clothing items look more natural in appearance. In the second stage, we put forth a CIT reasoning block that establishes global mutual interactive dependencies among the person representation, the warped clothing item, and the corresponding warped cloth mask. Based on these mutual dependencies, the final try-on results are more realistic. Substantial empirical results on a public fashion dataset show that the proposed CIT attains competitive virtual try-on performance.
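The abstract's core mechanism, relating cloth-agnostic person features to in-shop cloth features via cross attention, can be illustrated with a minimal scaled dot-product sketch. This is not the authors' implementation; the function and variable names (`cross_attention`, `person_feats`, `cloth_feats`) are hypothetical, and real CIT blocks operate on learned multi-head projections of CNN feature maps.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(person_feats, cloth_feats):
    """Illustrative cross attention: person features act as queries,
    cloth features as keys and values (shapes: (Np, d) and (Nc, d))."""
    d = person_feats.shape[-1]
    # Similarity between every person position and every cloth position.
    scores = person_feats @ cloth_feats.T / np.sqrt(d)   # (Np, Nc)
    weights = softmax(scores, axis=-1)                   # each row sums to 1
    # Each person position gathers a weighted mixture of cloth features.
    return weights @ cloth_feats                         # (Np, d)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    person = rng.standard_normal((4, 8))   # 4 person positions, 8-dim features
    cloth = rng.standard_normal((6, 8))    # 6 cloth positions, 8-dim features
    out = cross_attention(person, cloth)
    print(out.shape)                       # (4, 8)
```

Unlike per-pixel concatenation, every output position here depends on all cloth positions, which is the long-range interaction the paper argues CNNs alone capture poorly.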
Document Type: Article; Conference object
File Description: application/pdf
Language: English
ISSN: 1551-6857 (print); 1551-6865 (online)
DOI: 10.1145/3617374
DOI: 10.48550/arxiv.2104.05519
DOI: 10.3929/ethz-b-000655404
Access URL: http://arxiv.org/abs/2104.05519
https://ora.ox.ac.uk/objects/uuid:e700e92e-eb57-41a0-8270-a5b3de2bf4d3
http://hdl.handle.net/20.500.11850/655404
Rights: CC BY
arXiv Non-Exclusive Distribution
Accession Number: edsair.doi.dedup.....95ebbce1caade695a7dad3e739b67fef
Database: OpenAIRE