Bi-objective evolutionary hyper-heuristics in automated machine learning for text classification tasks

Bibliographic Details
Published in:Swarm and evolutionary computation Vol. 98; p. 102073
Main Authors: Estrella-Ramírez, Jonathan, de la Calleja, Jorge, Carranza, Juan Carlos Gómez
Format: Journal Article
Language:English
Published: Elsevier B.V 01.10.2025
Subjects:
ISSN:2210-6502
Description
Summary:This paper proposes an evolutionary model based on hyper-heuristics to automate the selection of classification methods for text datasets under a bi-objective approach. The model has three nested levels. At the first level, individual methods classify datasets, recording two performances that are often in conflict: the number of misclassifications and the computational time. At the second level, hyper-heuristics, expressed as sets of if→then rules, select classification methods for datasets based on 16 meta-features representing the data distribution. The fitness of a hyper-heuristic is evaluated on a training group of datasets by aggregating the two low-level performances of the chosen methods. At the third level, the multi-objective evolutionary algorithm Strength Pareto Evolutionary Algorithm 2 (SPEA2) evolves hyper-heuristic populations with the bi-objective goal of minimizing the two aggregated performances. The result is a Pareto-approximated front of hyper-heuristics, offering a range of solutions from computationally efficient to high classification performance. Finally, the model evaluates the front on an independent test group of datasets and retains the hyper-heuristics that are not dominated. We evaluated the resulting fronts through extensive experiments, measuring several quality indicators, and compared the model's fronts with a baseline front consisting of non-dominated individual classification methods and with four state-of-the-art automated machine learning tools (AutoKeras, AutoGluon, H2O, and TPOT). The proposed model yields larger, more diverse Pareto-approximated fronts that outperform the baseline front, allowing solution selection based on available resources and the trade-off between classification performance and computational cost.
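The if→then hyper-heuristic rules at the second level can be pictured as conditions on dataset meta-features that map to a classification method. A minimal sketch of this idea follows; the meta-feature names (`n_docs`, `vocab_size`), thresholds, and method labels are hypothetical illustrations, not the paper's actual 16 meta-features or rule encoding:

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Rule:
    """One if→then rule: if all meta-features fall in their intervals, pick `method`."""
    conditions: Dict[str, Tuple[float, float]]  # meta-feature -> (low, high)
    method: str                                 # classification method to select

    def matches(self, meta: Dict[str, float]) -> bool:
        return all(lo <= meta[f] <= hi for f, (lo, hi) in self.conditions.items())

def select_method(rules: List[Rule], meta: Dict[str, float], default: str) -> str:
    """Apply the first matching rule; fall back to a default method."""
    for rule in rules:
        if rule.matches(meta):
            return rule.method
    return default

# Hypothetical rule set over two illustrative meta-features.
rules = [
    Rule({"n_docs": (0, 5000), "vocab_size": (0, 20000)}, "naive_bayes"),
    Rule({"n_docs": (5000, 1e9)}, "linear_svm"),
]
print(select_method(rules, {"n_docs": 1200, "vocab_size": 8000}, "knn"))  # naive_bayes
```

In the paper's model, the rule sets themselves are the individuals evolved by SPEA2; this sketch only shows how a fixed rule set would route one dataset to one method.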
•An evolutionary model based on hyper-heuristics (HHs) is presented.
•The HHs select the most adequate classification methods for text datasets.
•The selection is bi-objective: classification performance and computational time.
•SPEA2 is used to produce Pareto-approximated fronts (PAFs) of HHs using the two objectives.
•The model's PAFs outperform the baseline considering different quality indicators.
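The non-dominance filtering described in the abstract (keeping only hyper-heuristics not dominated in both objectives) can be sketched as a plain Pareto filter over (misclassifications, time) pairs. The performance values below are invented for illustration; both objectives are minimized:

```python
from typing import List, Tuple

def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """a dominates b if it is no worse in both objectives and strictly better in one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])

def pareto_front(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the non-dominated subset of the points."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Hypothetical (misclassifications, time in seconds) per candidate method.
perf = [(120, 3.2), (95, 7.5), (95, 9.0), (200, 1.1), (80, 20.0)]
print(sorted(pareto_front(perf)))
# → [(80, 20.0), (95, 7.5), (120, 3.2), (200, 1.1)]
```

The surviving points span the trade-off the abstract describes: from the fastest method with many errors to the most accurate but slowest one. SPEA2 adds density-based selection and an archive on top of this basic dominance test.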
DOI:10.1016/j.swevo.2025.102073