Hybrid convolution (1D/2D)-based adaptive and attention-aided residual DenseNet approach on brain-computer interface for automatic imagined speech recognition framework

Bibliographic Details
Published in: Computer Speech & Language, Vol. 96, p. 101866
Main Authors: Anusha, Jalagadugu, Chandramohan Reddy, S.
Format: Journal Article
Language:English
Published: Elsevier Ltd 01.02.2026
ISSN:0885-2308
Description
Summary: Neuroscience and rehabilitation may benefit from Brain-Computer Interfaces (BCIs) based on Electroencephalograms (EEG). A person with a neurological impairment benefits from this process because they can express their thoughts to the outside world without any external devices. Imagined speech recognition has achieved only limited success in recent years, primarily because brain signals are weaker and more unpredictable than speech signals, which makes the recognition task difficult for machine-learning-based algorithms. Deep learning models have since transformed computer vision and generally outperform machine learning models. The main aim of this research is to design an automatic BCI-based imagined speech recognition framework using deep learning. Two publicly available datasets supply the required EEG signals. The collected EEG signals are pre-processed and then transformed into spectrogram images. In parallel, Empirical Mean Curve Decomposition (EMCD) processes the pre-processed EEG to obtain subbands. From these subbands, statistical features are extracted: root mean square, average amplitude change, kurtosis, Shannon Entropy (SE), Renyi entropy, maximum fractal length, Hjorth Mobility (HM), Hjorth Complexity (HC), enhanced wavelength, integrated EEG, slope sign change, zero crossing, activity, mean, Simple Square Integral (SSI), Tsallis Entropy (TE), skewness, median, standard deviation, and variance. The features from the decomposed signal and the spectrogram images from the pre-processed signal are then passed to the Hybrid Convolution (1D-2D)-based Adaptive and Attention Residual DenseNet (HC-AARDNet) to produce the final recognized speech output. The developed HC-AARDNet's parameters are optimized using the Enhanced Heap-Based Optimizer Algorithm (EHOA).
Several performance metrics are used to evaluate the proposed speech recognition model. The recommended system attained an accuracy, specificity, and sensitivity of 94.72%, 94.72%, and 94.69%, respectively.
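To illustrate the kind of subband feature extraction the abstract describes, the sketch below computes a few of the named statistics (root mean square, zero crossings, Hjorth activity/mobility/complexity, and a histogram-based Shannon entropy) on a synthetic signal standing in for one EMCD subband. This is a minimal NumPy sketch under assumed definitions, not the authors' implementation; the paper's exact feature formulas and the EMCD step itself are not reproduced here.

```python
import numpy as np

def hjorth_params(x):
    """Hjorth activity, mobility, and complexity of a 1-D signal."""
    dx = np.diff(x)
    ddx = np.diff(dx)
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / np.var(x))
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity

def shannon_entropy(x, bins=32):
    """Shannon entropy (bits) of the signal's amplitude histogram."""
    counts, _ = np.histogram(x, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def subband_features(x):
    """A handful of the statistical features named in the abstract."""
    return {
        "rms": float(np.sqrt(np.mean(x ** 2))),
        "zero_crossings": int(np.sum(np.diff(np.sign(x)) != 0)),
        "hjorth_activity": float(np.var(x)),
        "hjorth_mobility": hjorth_params(x)[1],
        "hjorth_complexity": hjorth_params(x)[2],
        "shannon_entropy": float(shannon_entropy(x)),
    }

# Synthetic stand-in for one decomposed EEG subband.
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
feats = subband_features(x)
```

In a full pipeline, this dictionary would be computed per subband and per channel, then concatenated with the spectrogram-image branch before classification.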
DOI:10.1016/j.csl.2025.101866