Adversarial autoencoder for continuous sign language recognition


Detailed Description

Bibliographic Details
Published in: Concurrency and Computation, Vol. 36, Issue 22
Main Authors: Kamal, Suhail Muhammad; Chen, Yidong; Li, Shaozi
Format: Journal Article
Language: English
Published: Hoboken: Wiley Subscription Services, Inc., 10.10.2024
ISSN:1532-0626, 1532-0634
Online Access: Full text
Description
Summary: Sign language serves as a vital communication medium for the deaf community, encompassing a diverse array of signs conveyed through distinct hand shapes along with non‐manual gestures like facial expressions and body movements. Accurate recognition of sign language is crucial for bridging the communication gap between deaf and hearing individuals, yet the scarcity of large‐scale datasets poses a significant challenge in developing robust recognition technologies. Existing works address this challenge with various strategies, such as enhancing visual modules, incorporating pretrained visual models, and leveraging multiple modalities to improve performance and mitigate overfitting. However, the exploration of the contextual module, responsible for modeling long‐term dependencies, remains limited. This work introduces an Adversarial Autoencoder for Continuous Sign Language Recognition, AA‐CSLR, to address the constraints imposed by limited data availability, leveraging the capabilities of generative models. The integration of pretrained knowledge, coupled with cross‐modal alignment, enhances the representation of sign language by effectively aligning visual and textual features. Through extensive experiments on publicly available datasets (PHOENIX‐2014, PHOENIX‐2014T, and CSL‐Daily), we demonstrate the effectiveness of our proposed method in achieving competitive performance in continuous sign language recognition.
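To make the abstract's core idea concrete, the following is a minimal, illustrative sketch of the general adversarial autoencoder pattern the title refers to: an encoder/decoder pair trained for reconstruction, plus a discriminator that pushes latent codes toward a prior distribution. All layer sizes and the single-layer networks here are hypothetical simplifications for exposition; this is not the AA‐CSLR architecture from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_LAT = 16, 4  # input and latent dimensions (assumed for illustration)
W_enc = rng.normal(scale=0.1, size=(D_IN, D_LAT))
W_dec = rng.normal(scale=0.1, size=(D_LAT, D_IN))
w_dis = rng.normal(scale=0.1, size=(D_LAT,))

def encode(x):
    """Map input features to a latent code z."""
    return np.tanh(x @ W_enc)

def decode(z):
    """Reconstruct the input from the latent code."""
    return z @ W_dec

def discriminate(z):
    """Score how likely z is to have come from the prior (sigmoid output)."""
    return 1.0 / (1.0 + np.exp(-(z @ w_dis)))

x = rng.normal(size=(8, D_IN))        # a toy batch of visual features
z = encode(x)
x_hat = decode(z)

# Autoencoder objective: reconstruction error.
recon_loss = np.mean((x - x_hat) ** 2)

# Adversarial objective: the discriminator should score samples from the
# Gaussian prior high and encoder outputs low; training the encoder to
# fool it shapes the latent space toward the prior.
z_prior = rng.normal(size=z.shape)
dis_loss = -np.mean(np.log(discriminate(z_prior) + 1e-8)
                    + np.log(1.0 - discriminate(z) + 1e-8))
```

In a full system the two losses would be minimized alternately (discriminator step, then encoder/decoder step), which is what lets a generative latent space compensate for limited training data.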
DOI: 10.1002/cpe.8220