A Comparative Study of Video-Based Human Representations for American Sign Language Alphabet Generation
Sign language is a complex visual language, and automatic interpretations of sign language can facilitate communication involving deaf individuals. As one of the essential components of sign language, fingerspelling connects the natural spoken languages to the sign language and expands the scale of...
| Published in: | IEEE International Conference and Workshops on Automatic Face and Gesture Recognition : FG pp. 1 - 6 |
|---|---|
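| Main authors: | Xu, Fei; Chaudhary, Lipisha; Dong, Lu; Setlur, Srirangaraj; Govindaraju, Venu; Nwogu, Ifeoma |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 27.05.2024 |
| Subjects: | Avatars; Measurement; Sign language; Three-dimensional displays; Transformers; Visualization; Vocabulary |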
| ISSN: | 2770-8330 |
| Online access: | Get full text |
| Abstract | Sign language is a complex visual language, and automatic interpretation of sign language can facilitate communication involving deaf individuals. As one of the essential components of sign language, fingerspelling connects natural spoken languages to sign language and expands the scale of the sign language vocabulary. In practice, fingerspelling alphabets are challenging to analyze because of their signing speed and small motion range. Synthetic data has the potential to further improve the analysis of fingerspelling alphabets at scale. In this paper, we evaluate how different video-based human representations perform in a framework for Alphabet Generation for American Sign Language (ASL). We tested three mainstream video-based human representations: two-stream inflated 3D ConvNet, 3D landmarks of body joints, and rotation matrices of body joints. We also evaluated the effect of different skeleton graphs and selected body joints. The generation process of ASL fingerspelling used a transformer-based Conditional Variational Autoencoder. To train the model, we collected ASL alphabet signing videos from 17 signers with dynamic alphabet signing. The generated alphabets were evaluated using automatic quality metrics such as FID, and we also considered supervised metrics by recognizing the generated entries using Spatio-Temporal Graph Convolutional Networks. Our experiments show that using the rotation matrices of the upper body joints and the signing hand gives the best results for the generation of ASL alphabet signing. Going forward, our goal is to produce articulated fingerspelling words by combining the individual alphabets learned in this work. |
|---|---|
| Author | Dong, Lu; Chaudhary, Lipisha; Govindaraju, Venu; Nwogu, Ifeoma; Xu, Fei; Setlur, Srirangaraj |
| Author_xml | 1. Xu, Fei; 2. Chaudhary, Lipisha; 3. Dong, Lu; 4. Setlur, Srirangaraj; 5. Govindaraju, Venu; 6. Nwogu, Ifeoma (all: University at Buffalo, Department of Computer Science and Engineering, Buffalo, New York, USA) |
| ContentType | Conference Proceeding |
| DOI | 10.1109/FG59268.2024.10582020 |
| Discipline | Applied Sciences |
| EISBN | 9798350394948 |
| EISSN | 2770-8330 |
| EndPage | 6 |
| ExternalDocumentID | 10582020 |
| Genre | orig-research |
| ISICitedReferencesCount | 2 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| PageCount | 6 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-May-27 |
| PublicationDateYYYYMMDD | 2024-05-27 |
| PublicationDecade | 2020 |
| PublicationTitle | IEEE International Conference and Workshops on Automatic Face and Gesture Recognition : FG |
| PublicationTitleAbbrev | FG |
| PublicationYear | 2024 |
| Publisher | IEEE |
| StartPage | 1 |
| SubjectTerms | Avatars; Measurement; Sign language; Three-dimensional displays; Transformers; Visualization; Vocabulary |
| Title | A Comparative Study of Video-Based Human Representations for American Sign Language Alphabet Generation |
| URI | https://ieeexplore.ieee.org/document/10582020 |
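
The abstract compares 3D landmarks of body joints with rotation matrices of body joints as pose representations. As a rough illustration of how the two relate, the sketch below converts per-joint rotation matrices into 3D joint positions by forward kinematics on a simple joint chain. It is not the authors' pipeline: the function names (`rot_z`, `forward_kinematics`), the single-chain skeleton, and the bone offsets are all hypothetical and chosen only for this example.

```python
import math


def rot_z(theta):
    """Rotation matrix for an angle theta (radians) about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(r, v):
    """Apply a 3x3 rotation matrix r to a 3-vector v."""
    return [sum(r[i][k] * v[k] for k in range(3)) for i in range(3)]


def forward_kinematics(local_rotations, bone_offsets, root=(0.0, 0.0, 0.0)):
    """Convert rotation matrices into 3D landmarks for a hypothetical joint chain.

    local_rotations[i] is the rotation of joint i relative to its parent.
    bone_offsets[i] is the rest-pose offset from joint i to joint i + 1,
    expressed in the local frame of joint i.
    """
    positions = [list(root)]
    global_rot = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for rot, offset in zip(local_rotations, bone_offsets):
        # Accumulate rotations down the chain, then place the next joint.
        global_rot = matmul(global_rot, rot)
        step = apply(global_rot, offset)
        positions.append([p + d for p, d in zip(positions[-1], step)])
    return positions


# Two unit-length bones along x; the first joint bends +90 degrees about z,
# the second bends back by -90 degrees.
landmarks = forward_kinematics(
    [rot_z(math.pi / 2), rot_z(-math.pi / 2)],
    [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
)
```

In this toy chain the rotation representation stores only joint angles, so bone lengths stay fixed by construction, whereas a landmark representation stores the resulting positions directly. Whether that property explains the result reported in the abstract is not stated in this record.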