Feature Optimization and Stacked Ensemble Learning for Parkinson's Disease Classification Using Speech Analysis

Bibliographic Details
Published in: Technical review - IETE Vol. 42; no. 5; pp. 632-650
Main Authors: Agrawal, Sneha; Sahu, Satya Prakash
Format: Journal Article
Language: English
Published: Taylor & Francis 03.09.2025
ISSN: 0256-4602, 0974-5971
Description
Summary: Parkinson's Disease (PD) is the second most common neurodegenerative disorder, whose symptoms worsen over time, making early diagnosis a challenging task. Changes in speech have been identified as an early symptom of PD. However, medical datasets often have a small sample size, while speech signal analysis generates high-dimensional data. Therefore, rigorous feature selection is essential for obtaining the best set of PD characteristics. This paper proposes a hybrid filter-wrapper feature selection approach for PD classification using a publicly available speech dataset (188 PD, 64 healthy subjects). The Maximum Relevancy Minimum Redundancy (mRMR) and Relief algorithms are used to select top-ranked features, after which the Modified Whale Optimization Algorithm (mWOA) refines the selection into an optimized feature subset. Class imbalance is addressed using the Synthetic Minority Over-sampling Technique (SMOTE). A stacked ensemble model is developed that integrates the base learners Decision Tree, Support Vector Machine, Naïve Bayes, k-Nearest Neighbour, and shallow and deep neural networks, with hyperparameters tuned via grid search. The proposed approach is evaluated against state-of-the-art methods based on accuracy, precision, recall, and F1-score. Results demonstrate that hybrid feature selection and hyperparameter tuning reduce computational burden while improving classification accuracy, making the proposed approach a promising framework for PD detection from speech data.
DOI: 10.1080/02564602.2025.2560813
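
The summary states that class imbalance (188 PD versus 64 healthy subjects) is handled with SMOTE but, in this record, gives no implementation details. The following is a minimal sketch of SMOTE-style oversampling in plain Python, offered only as an illustration of the idea: each synthetic sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. The function name `smote_oversample`, the choice of `k=5`, the seeds, and the two-dimensional toy features are all assumptions made for this example and are not taken from the paper.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style).

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data using only the class sizes quoted in the summary.
# The feature values are random placeholders, not real speech features.
rng = random.Random(42)
pd_samples = [[rng.gauss(1.0, 0.5), rng.gauss(1.0, 0.5)] for _ in range(188)]
healthy_samples = [[rng.gauss(-1.0, 0.5), rng.gauss(-1.0, 0.5)] for _ in range(64)]

# Number of synthetic healthy samples needed to match the PD class size.
n_needed = len(pd_samples) - len(healthy_samples)
balanced_healthy = healthy_samples + smote_oversample(healthy_samples, n_needed)
```

With these class sizes, 188 - 64 = 124 synthetic healthy samples bring the two classes to equal size. In practice, oversampling of this kind is usually applied only to the training portion of each split, so that synthetic points derived from test subjects do not leak into model fitting.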