SignAvatar: Sign Language 3D Motion Reconstruction and Generation

Bibliographic details
Published in: IEEE International Conference and Workshops on Automatic Face and Gesture Recognition : FG, pp. 1-10
Main authors: Dong, Lu; Chaudhary, Lipisha; Xu, Fei; Wang, Xiao; Lary, Mason; Nwogu, Ifeoma
Format: Conference paper
Language: English
Published: IEEE, 27 May 2024
ISSN: 2770-8330
Description
Summary: Achieving expressive 3D motion reconstruction and automatic generation for isolated sign words can be challenging, due to the lack of real-world 3D sign-word data, the complex nuances of signing motions, and the cross-modal understanding of sign language semantics. To address these challenges, we introduce SignAvatar, a framework capable of both word-level sign language reconstruction and generation. SignAvatar employs a transformer-based conditional variational autoencoder architecture, effectively establishing relationships across different semantic modalities. Additionally, this approach incorporates a curriculum learning strategy to enhance the model's robustness and generalization, resulting in more realistic motions. Furthermore, we contribute the ASL3DWord dataset, composed of 3D joint rotation data for the body, hands, and face, for unique sign words. We demonstrate the effectiveness of SignAvatar through extensive experiments, showcasing its superior reconstruction and automatic generation capabilities. The code and dataset are available on the project page.
DOI: 10.1109/FG59268.2024.10581934