Hybrid convolution (1D/2D)-based adaptive and attention-aided residual DenseNet approach on brain-computer interface for automatic imagined speech recognition framework
| Published in: | Computer speech & language Vol. 96; p. 101866 |
|---|---|
| Main Authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Ltd, 01.02.2026 |
| ISSN: | 0885-2308 |
| Summary: | Neuroscience and rehabilitation may benefit from the use of a Brain-Computer Interface (BCI) based on Electroencephalograms (EEG). A person with a neurological impairment benefits from this process, since it enables them to express their thoughts to the outside world without the need for any devices. Imagined speech recognition has attained only limited success in recent years, primarily because brain impulses are weaker and more unpredictable than speech signals, which makes it difficult for machine-learning-based algorithms to perform the recognition. Deep learning has since transformed computer vision and generally provides better performance than traditional machine learning models. Designing an automatic BCI-based imagined speech recognition framework using deep learning is the main aim of this research. Two publicly available datasets are used to collect the required EEG signals. The collected EEG signals are pre-processed and then transformed into spectrogram images. In parallel, Empirical Mean Curve Decomposition (EMCD) processes the pre-processed EEG to obtain sub-bands. From these sub-bands, statistical features such as root mean square, average amplitude change, kurtosis, Shannon Entropy (SE), Renyi entropy, maximum fractal length, Hjorth Mobility (HM), Hjorth Complexity (HC), enhanced wavelength, integrated EEG, slope sign change, zero crossing, activity, mean, Simple Square Integral (SSI), Tsallis Entropy (TE), skewness, median, standard deviation, and variance are obtained. The features extracted from the decomposed signal, together with the spectrogram images obtained from the pre-processed signal, are passed to the Hybrid Convolution (1D-2D)-based Adaptive and Attention Residual DenseNet (HC-AARDNet) to produce the final recognized speech output. The developed HC-AARDNet's parameters are optimized using the Enhanced Heap-Based Optimizer Algorithm (EHOA). Several performance metrics are used to evaluate the proposed speech recognition model; the recommended system attained an accuracy, specificity, and sensitivity of 94.72, 94.72, and 94.69, respectively. (A minimal illustrative sketch of the spectrogram and feature-extraction steps appears after this record.) |
|---|---|
| ISSN: | 0885-2308 |
| DOI: | 10.1016/j.csl.2025.101866 |
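The abstract describes a pipeline in which pre-processed EEG is converted into spectrogram images, EMCD yields sub-bands, hand-crafted statistics are extracted from those sub-bands, and both representations feed the HC-AARDNet classifier. The Python sketch below illustrates only the spectrogram conversion and a subset of the listed statistical features, under assumed parameters (a 256 Hz sampling rate and a 128-sample STFT window, neither specified in this record). It is not the authors' implementation; the EMCD decomposition, the HC-AARDNet model, and the EHOA optimizer are not reproduced here.

```python
# Minimal sketch (not the paper's code) of two steps named in the abstract:
# turning one pre-processed EEG channel into a spectrogram image and computing
# a few of the listed statistical features.
import numpy as np
from scipy import signal, stats

FS = 256  # assumed sampling rate in Hz (not given in this record)


def eeg_to_spectrogram(x: np.ndarray, fs: int = FS) -> np.ndarray:
    """Return a log-magnitude time-frequency image of one EEG channel."""
    _, _, Sxx = signal.spectrogram(x, fs=fs, nperseg=128, noverlap=64)
    return np.log1p(Sxx)  # log scaling compresses the dynamic range


def statistical_features(x: np.ndarray) -> dict:
    """A subset of the hand-crafted features listed in the abstract."""
    dx = np.diff(x)
    ddx = np.diff(dx)
    var_x, var_dx = np.var(x), np.var(dx)
    mobility = np.sqrt(var_dx / var_x)                      # Hjorth Mobility
    complexity = np.sqrt(np.var(ddx) / var_dx) / mobility   # Hjorth Complexity

    # Discretize the amplitude distribution for Shannon entropy.
    hist, _ = np.histogram(x, bins=64, density=True)
    p = hist[hist > 0]
    p = p / p.sum()

    return {
        "rms": np.sqrt(np.mean(x ** 2)),                    # root mean square
        "kurtosis": stats.kurtosis(x),
        "skewness": stats.skew(x),
        "shannon_entropy": -np.sum(p * np.log2(p)),
        "hjorth_activity": var_x,                           # activity (variance)
        "hjorth_mobility": mobility,
        "hjorth_complexity": complexity,
        "zero_crossings": int(np.sum(x[:-1] * x[1:] < 0)),
        "ssi": np.sum(x ** 2),                              # Simple Square Integral
        "waveform_length": np.sum(np.abs(dx)),              # basic form of the listed
                                                            # "enhanced wavelength" feature
    }


if __name__ == "__main__":
    x = np.random.randn(FS * 2)        # stand-in for one pre-processed EEG channel
    img = eeg_to_spectrogram(x)
    feats = statistical_features(x)
    print(img.shape, {k: round(float(v), 3) for k, v in feats.items()})
```

In the described pipeline, such statistics would be computed on each EMCD sub-band, while the spectrogram images are derived from the pre-processed signal itself; both are then fed to the HC-AARDNet classifier.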