A Comparative Study of Video-Based Human Representations for American Sign Language Alphabet Generation

Published in: IEEE International Conference and Workshops on Automatic Face and Gesture Recognition : FG, pp. 1-6
Main authors: Xu, Fei; Chaudhary, Lipisha; Dong, Lu; Setlur, Srirangaraj; Govindaraju, Venu; Nwogu, Ifeoma
Format: Conference paper
Language: English
Publication details: IEEE, 27.05.2024
ISSN: 2770-8330
Abstract: Sign language is a complex visual language, and automatic interpretation of sign language can facilitate communication involving deaf individuals. As one of the essential components of sign language, fingerspelling connects natural spoken languages to sign language and expands the scale of the sign language vocabulary. In practice, it is challenging to analyze fingerspelling alphabets because of their signing speed and small motion range. The use of synthetic data has the potential to further improve the analysis of fingerspelling alphabets at scale. In this paper, we evaluate how different video-based human representations perform in a framework for Alphabet Generation for American Sign Language (ASL). We tested three mainstream video-based human representations: two-stream inflated 3D ConvNet, 3D landmarks of body joints, and rotation matrices of body joints. We also evaluated the effect of different skeleton graphs and selected body joints. The generation process of ASL fingerspelling used a transformer-based Conditional Variational Autoencoder. To train the model, we collected ASL alphabet signing videos from 17 signers with dynamic alphabet signing. The generated alphabets were evaluated using automatic metrics of quality such as FID, and we also considered supervised metrics by recognizing the generated entries using Spatio-Temporal Graph Convolutional Networks. Our experiments show that using the rotation matrices of the upper body joints and the signing hand gives the best results for the generation of ASL alphabet signing. Going forward, our goal is to produce articulated fingerspelling words by combining individual alphabets learned in this work.
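As an illustration of the "rotation matrices of body joints" representation named in the abstract, the following is a minimal sketch, not the authors' pipeline. It assumes each joint's rotation is available as an axis and an angle, converts it to a 3x3 matrix with Rodrigues' formula, and flattens one frame into a feature vector. The function names `rotation_matrix` and `pose_features` and the two-joint frame are hypothetical.

```python
import math


def rotation_matrix(axis, angle):
    """Rodrigues' formula: 3x3 rotation matrix for `angle` radians about `axis`."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("axis must be non-zero")
    x, y, z = x / norm, y / norm, z / norm  # normalize to a unit axis
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [t * x * x + c, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]


def pose_features(joint_rotations):
    """Flatten one frame's per-joint 3x3 matrices into a single feature vector."""
    return [v for R in joint_rotations for row in R for v in row]


# Hypothetical two-joint frame (assumption, not from the paper):
# first joint rotated 90 degrees about z, second joint at rest (identity).
frame = [rotation_matrix((0, 0, 1), math.pi / 2), rotation_matrix((1, 0, 0), 0.0)]
features = pose_features(frame)  # 2 joints x 9 matrix entries
```

Each joint contributes nine values per frame, so a clip of T frames with J selected joints would become a T x 9J sequence under this assumed layout. How the paper actually extracts and orders the joints is not stated in this record.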
Authors (in sequence order):
  1. Xu, Fei
  2. Chaudhary, Lipisha
  3. Dong, Lu
  4. Setlur, Srirangaraj
  5. Govindaraju, Venu
  6. Nwogu, Ifeoma
Affiliation (all authors): University at Buffalo, Department of Computer Science and Engineering, Buffalo, New York, USA
DOI 10.1109/FG59268.2024.10582020
Discipline Applied Sciences
EISBN 9798350394948
EISSN 2770-8330
EndPage 6
ExternalDocumentID 10582020
Genre orig-research
ISICitedReferencesCount 2
IsPeerReviewed false
IsScholarly false
Language English
PageCount 6
PublicationDate 2024-May-27
PublicationTitle IEEE International Conference and Workshops on Automatic Face and Gesture Recognition : FG
PublicationTitleAbbrev FG
PublicationYear 2024
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 1
SubjectTerms Avatars; Measurement; Sign language; Three-dimensional displays; Transformers; Visualization; Vocabulary
Title A Comparative Study of Video-Based Human Representations for American Sign Language Alphabet Generation
URI https://ieeexplore.ieee.org/document/10582020