Optimal feature selection through reinforcement learning and fuzzy signature for improving classification accuracy

Bibliographic Details
Published in: Multimedia Tools and Applications, Vol. 84, No. 10, pp. 6931-6965
Main Authors: Mansouri, Najme, Zandvakili, Aboozar, Javidi, Mohammad Masoud
Format: Journal Article
Language: English
Published: New York: Springer US, 01.03.2025
Springer Nature B.V.
ISSN: 1380-7501, 1573-7721
Description
Summary: The main objective of feature selection is to reduce the computational cost of a predictive model while increasing its performance. An exact search approach evaluates all possible combinations of features, which requires exponential time. Meta-heuristic algorithms are another option, but they have drawbacks of their own. This paper converts feature selection into a multi-armed bandit (MAB) problem solved with the ε-Greedy algorithm. The ε parameter balances exploration and exploitation. In the traditional ε-Greedy algorithm, ε is fixed, and a good balance cannot be achieved with a fixed parameter. A fuzzy signature approach is therefore used to adaptively adjust the ε parameter and exploit the exploration-exploitation trade-off inherent in the multi-armed bandit problem. In each episode, rewards are calculated from correlations between features and the objective functions (e.g., classification error, number of selected features, and redundancy). The calculated reward is used both to adjust the ε parameter adaptively and to determine the number of iterations of the inner loop of the ε-Greedy algorithm. Adaptive adjustment makes the value of ε dynamic, adapting to the behavior of the environment. Lastly, the proposed algorithm is compared with the Bat Algorithm (BA), Grasshopper Optimization Algorithm (GOA), Binary Monarch Butterfly Optimization (BMBO), Upper Confidence Bound (UCB), Stochastic Gradient Ascent (SGA), Greedy, and classical ε-Greedy algorithms. Compared to BA, GOA, BMBO, UCB, SGA, Greedy, and classical ε-Greedy, it improves classification accuracy by 8.5%, 10.6%, 5.9%, 12.9%, 17.5%, 22.0%, and 3.2%, respectively.
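
To make the adaptive ε-Greedy idea concrete, below is a minimal Python sketch, not the authors' implementation: it treats each feature as a bandit arm, uses the absolute feature-target correlation as a stand-in reward, and shrinks ε from the running mean reward in place of the paper's fuzzy-signature adjustment. All names and parameters (select_features, n_select, n_episodes) are illustrative assumptions.

# Minimal sketch of epsilon-greedy feature selection as a multi-armed bandit.
# Assumptions (not from the paper): the reward is a simple correlation-based
# score, and epsilon is decayed from the running mean reward rather than by
# the paper's fuzzy-signature rule.

import numpy as np

def select_features(X, y, n_select=5, n_episodes=200, seed=0):
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    counts = np.zeros(n_features)   # how often each feature (arm) was pulled
    values = np.zeros(n_features)   # running mean reward per feature
    eps = 1.0                       # start fully exploratory

    # Stand-in reward: |Pearson correlation| of each feature with the target.
    rewards = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)])
    rewards = np.nan_to_num(rewards)

    for _ in range(n_episodes):
        if rng.random() < eps:
            arm = int(rng.integers(n_features))   # explore: random feature
        else:
            arm = int(np.argmax(values))          # exploit: best feature so far
        r = rewards[arm]
        counts[arm] += 1
        values[arm] += (r - values[arm]) / counts[arm]   # incremental mean update

        # Adaptive epsilon (illustrative heuristic): shrink exploration as the
        # average observed reward grows, so the value of epsilon stays dynamic.
        eps = max(0.05, 1.0 - values[counts > 0].mean())

    # Return the indices of the n_select features with the highest estimated value.
    return np.argsort(values)[::-1][:n_select]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(300, 20))
    y = (X[:, 3] + 0.5 * X[:, 7] + 0.1 * rng.normal(size=300) > 0).astype(int)
    print("Selected features:", select_features(X, y))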
DOI:10.1007/s11042-024-19069-z