Feature Optimization and Stacked Ensemble Learning for Parkinson's Disease Classification Using Speech Analysis

Bibliographic Details
Published in: Technical review - IETE, Vol. 42, No. 5, pp. 632-650
Main Authors: Agrawal, Sneha; Sahu, Satya Prakash
Format: Journal Article
Language: English
Published: Taylor & Francis, 03.09.2025
ISSN:0256-4602, 0974-5971
Description
Summary: Parkinson's Disease (PD) is the second most common neurodegenerative disorder. Its symptoms worsen over time, and early diagnosis is challenging. Changes in speech have been identified as an early symptom of PD. However, medical datasets often have a small sample size, while speech signal analysis generates high-dimensional data, so rigorous feature selection is essential for obtaining the most informative set of PD characteristics. This paper proposes a hybrid filter-wrapper feature selection approach for PD classification using a publicly available speech dataset (188 PD, 64 healthy subjects). The Maximum Relevancy Minimum Redundancy (mRMR) and Relief algorithms first select the top-ranked features (filter stage); the Modified Whale Optimization Algorithm (mWOA) then refines this selection into an optimized feature subset (wrapper stage). Class imbalance is addressed using SMOTE. A stacked ensemble model is developed that integrates Decision Tree, Support Vector Machine, Naïve Bayes, and k-Nearest Neighbour base learners with shallow and deep neural networks, with hyperparameters tuned via grid search. The proposed approach is evaluated against state-of-the-art methods on accuracy, precision, recall, and F1-score. Results demonstrate that hybrid feature selection and hyperparameter tuning reduce the computational burden while improving classification accuracy, making the approach a promising framework for PD detection from speech data.
DOI:10.1080/02564602.2025.2560813
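
The class-balancing step named in the summary can be illustrated with a minimal SMOTE-style sketch. This is not the authors' implementation: the function name `smote`, the neighbour count `k=5`, the random seeds, and the two-dimensional toy data are all assumptions made for illustration. Only the class sizes (188 PD, 64 healthy) come from the record. Each synthetic point is placed at a random position on the segment between a minority-class sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy stand-in data: class sizes match the summary (188 PD, 64 healthy),
# but the two feature values per subject are random, not real speech features.
rng = random.Random(42)
pd_samples = [[rng.gauss(1.0, 0.5), rng.gauss(1.0, 0.5)] for _ in range(188)]
healthy_samples = [[rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(64)]

# Generate enough synthetic healthy subjects to match the PD class size.
new_healthy = smote(healthy_samples, n_new=len(pd_samples) - len(healthy_samples))
balanced_healthy = healthy_samples + new_healthy
print(len(pd_samples), len(balanced_healthy))
```

In a real evaluation, oversampling would normally be applied to the training folds only, so that synthetic points never leak into the test data. Whether the paper does this is not stated in the summary.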