Bi-objective evolutionary hyper-heuristics in automated machine learning for text classification tasks
| Published in: | Swarm and evolutionary computation Vol. 98; p. 102073 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V., 01.10.2025 |
| Subjects: | |
| ISSN: | 2210-6502 |
| Summary: | This paper proposes an evolutionary model based on hyper-heuristics to automate the selection of classification methods for text datasets under a bi-objective approach. The model has three nested levels. At the first level, individual methods classify datasets, and two performance measures are recorded: the number of misclassifications and the computational time, which are often in conflict. At the second level, hyper-heuristics, expressed as sets of if→then rules, select classification methods for datasets based on 16 meta-features representing the data distribution. The fitness of a hyper-heuristic is evaluated on a training group of datasets by aggregating the two low-level performances of the chosen methods. At the third level, the multi-objective evolutionary algorithm Strength Pareto Evolutionary Algorithm 2 (SPEA2) evolves populations of hyper-heuristics with the two objectives of minimizing both aggregated performances. The result is a Pareto-approximated front of hyper-heuristics, which offers a range of solutions from computationally efficient to highly accurate. Finally, the model evaluates the front on an independent test group of datasets and selects those hyper-heuristics that are not dominated. We evaluate the resulting fronts through extensive experiments, measuring several quality indicators. We compare the model’s fronts with a baseline front consisting of non-dominated individual classification methods and four state-of-the-art automated machine learning tools (AutoKeras, AutoGluon, H2O, and TPOT). The proposed model yields larger, more diverse Pareto-approximated fronts that outperform the baseline front, allowing solution selection based on available resources and trade-offs between performance and cost.
• An evolutionary model based on hyper-heuristics (HHs) is presented. • The HHs select the most adequate classification methods for text datasets. • The selection is bi-objective: classification performance and computational time. • SPEA2 is used to produce Pareto-approximated fronts (PAFs) of HHs using the two objectives. • The model’s PAFs outperform the baseline across different quality indicators. |
|---|---|
| DOI: | 10.1016/j.swevo.2025.102073 |
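
The second and final steps described in the summary (rule-based method selection, aggregation of the two performances, and filtering of non-dominated hyper-heuristics) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The rule representation, the first-match policy, the fallback method, the use of a plain sum as the aggregation, and all names (`select_method`, `evaluate`, `dominates`, `non_dominated`) are assumptions made for illustration only, and the evolutionary search by SPEA2 is not shown.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical rule format: (condition on the meta-feature vector, method name).
Rule = Tuple[Callable[[Sequence[float]], bool], str]
Objectives = Tuple[float, float]  # (misclassifications, computational time)


def select_method(rules: List[Rule], meta: Sequence[float], default: str) -> str:
    """Return the method of the first rule whose condition holds (assumed policy)."""
    for condition, method in rules:
        if condition(meta):
            return method
    return default  # assumed fallback when no rule fires


def evaluate(
    rules: List[Rule],
    default: str,
    datasets: Dict[str, Sequence[float]],
    performance: Dict[Tuple[str, str], Objectives],
) -> Objectives:
    """Aggregate both objectives over a group of datasets.

    A plain sum is used here; the aggregation in the paper may differ.
    """
    total_errors, total_time = 0.0, 0.0
    for name, meta in datasets.items():
        errors, seconds = performance[(name, select_method(rules, meta, default))]
        total_errors += errors
        total_time += seconds
    return total_errors, total_time


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b in both objectives and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates (minimization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

With both objectives minimized, a hyper-heuristic stays on the front only if no other candidate is at least as good in errors and time and strictly better in one of them, which is what gives the range from cheap to accurate solutions mentioned in the summary.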