Spatial-temporal generative network based on deep long short-term memory autoencoder for hand skeleton data sequences reconstruction and recognition
| Published in: | Engineering applications of artificial intelligence Vol. 161; p. 112289 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Ltd, 12.12.2025 |
| Subjects: | |
| ISSN: | 0952-1976 |
| Summary: | Convolutional neural networks attract the most research attention in the developing field of Hand Gesture Recognition (HGR). Nevertheless, these approaches struggle to adapt to time-series data, and in skeleton-based HGR, extracting spatial–temporal information remains a challenge. Recently, recurrent neural networks have shown strong performance in recognizing hand gestures by processing variable-length time-series data. Although they outperform traditional methods when large amounts of training data are available, their effectiveness diminishes significantly when data availability is constrained. In this study, we introduce an unsupervised data augmentation network, the Spatial-Temporal Generative Network (STGN), which reconstructs both the spatial and temporal information of the input sequences by leveraging a Deep Long Short-Term Memory Auto-Encoder (DLSTM-AE) network. The DLSTM-AE is then combined with different Long Short-Term Memory (LSTM) network variants, forming an integrated network that can be trained end-to-end for HGR. Through experiments on the LeapGestureDB dataset (Leap Motion-based Gesture Dataset) and the RIT dataset (Rochester Institute of Technology Hand Gesture Dataset), we show that data reconstruction using STGN has a pronounced effect on the accuracy of recognizing time-series-based hand gestures. In all experiments, the best recognition results are achieved on the augmented dataset, with accuracies improving by 2 to 10% across all tested LSTM networks. For reproducible research, the code is available at: https://github.com/AMEURsafa/STGN. |
|---|---|
| Highlights: | • A DLSTM-AE reconstructs the spatial and temporal representation of the input sequences. • Generation of larger, more reliable datasets, enhancing the generalizability of the models. • HGR model: DLSTM-AE for data reconstruction and various RNN models for classification. • Effectiveness and efficiency evaluation of the model on two benchmark datasets. |
| ISSN: | 0952-1976 |
| DOI: | 10.1016/j.engappai.2025.112289 |
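
The record above describes the STGN's core component: a deep LSTM autoencoder that reconstructs hand-skeleton sequences, whose outputs augment the training data of downstream LSTM classifiers. The following is a minimal, hypothetical PyTorch-style sketch of such an autoencoder; it is not the authors' implementation, and the class name, layer sizes, joint count, and training details are assumptions for illustration only.

```python
# Hypothetical sketch of a deep LSTM autoencoder (DLSTM-AE) for hand-skeleton
# sequence reconstruction, in the spirit of the STGN described in the abstract.
# NOT the authors' code; all names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn


class DLSTMAutoencoder(nn.Module):
    def __init__(self, n_joints=22, coords=3, hidden=128, latent=64, num_layers=2):
        super().__init__()
        feat = n_joints * coords                      # flattened skeleton frame
        # Encoder: stacked LSTM compresses the sequence into a latent summary.
        self.encoder = nn.LSTM(feat, hidden, num_layers, batch_first=True)
        self.to_latent = nn.Linear(hidden, latent)
        # Decoder: stacked LSTM reconstructs the full spatial-temporal sequence.
        self.decoder = nn.LSTM(latent, hidden, num_layers, batch_first=True)
        self.to_frame = nn.Linear(hidden, feat)

    def forward(self, x):
        # x: (batch, time, joints * coords) skeleton sequence
        _, (h, _) = self.encoder(x)
        z = self.to_latent(h[-1])                     # (batch, latent) summary
        # Repeat the latent vector at every time step to drive the decoder.
        z_seq = z.unsqueeze(1).expand(-1, x.size(1), -1)
        out, _ = self.decoder(z_seq)
        return self.to_frame(out)                     # reconstructed sequence


# Usage: train with a frame-wise reconstruction loss; reconstructed sequences
# can then augment the training set of a downstream LSTM gesture classifier.
model = DLSTMAutoencoder()
seq = torch.randn(8, 60, 22 * 3)                      # 8 gestures, 60 frames each
loss = nn.functional.mse_loss(model(seq), seq)
loss.backward()
```

In such a setup, the autoencoder learns to reproduce both the per-frame joint layout (spatial) and its evolution over time (temporal); its reconstructions are then added to the training data, which is the augmentation effect the abstract credits for the reported 2 to 10% accuracy gains across the tested LSTM classifiers.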