Bi-objective evolutionary hyper-heuristics in automated machine learning for text classification tasks

Bibliographic Details
Published in: Swarm and Evolutionary Computation, Volume 98, p. 102073
Main Authors: Estrella-Ramírez, Jonathan, de la Calleja, Jorge, Carranza, Juan Carlos Gómez
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.10.2025
ISSN: 2210-6502
Description
Summary: This paper proposes an evolutionary model based on hyper-heuristics to automate the selection of classification methods for text datasets under a bi-objective approach. The model has three nested levels. At the first level, individual methods classify datasets, recording two performance measures that are often in conflict: the number of misclassifications and the computational time. At the second level, hyper-heuristics, expressed as sets of if→then rules, select classification methods for datasets based on 16 meta-features representing the data distribution. The fitness of a hyper-heuristic is evaluated on a training group of datasets by aggregating the two low-level performance measures of the chosen methods. At the third level, the multi-objective evolutionary algorithm Strength Pareto Evolutionary Algorithm 2 (SPEA2) evolves hyper-heuristic populations with the bi-objective goal of minimizing the two aggregated measures. The result is a Pareto-approximated front of hyper-heuristics, which offers a range of solutions from computationally efficient to high classification performance. Finally, the model evaluates the front on an independent test group of datasets and retains the hyper-heuristics that are not dominated. We evaluated the resulting fronts through extensive experiments, measuring several quality indicators, and compared them with a baseline front consisting of non-dominated individual classification methods and four state-of-the-art automated machine learning tools (AutoKeras, AutoGluon, H2O, and TPOT). The proposed model yields larger, more diverse Pareto-approximated fronts that outperform the baseline front, allowing solution selection based on available resources and trade-offs between performance and cost.
Highlights:
•An evolutionary model based on hyper-heuristics (HHs) is presented.
•The HHs select the most adequate classification methods for text datasets.
•The selection is bi-objective: classification performance and computational time.
•SPEA2 is used to produce Pareto-approximated fronts (PAFs) of HHs using the two objectives.
•The model's PAFs outperform the baseline on different quality indicators.
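To make the second-level mechanics concrete, the following Python sketch illustrates a hyper-heuristic as a list of if→then rules over dataset meta-features, together with the bi-objective fitness aggregation (misclassifications, time) and a Pareto-dominance check. It is an illustration under assumed names only: the rule encoding, first-match selection semantics, meta-feature keys (n_classes, avg_doc_length), method labels, and performance values are hypothetical and are not taken from the paper; the 16 meta-features and the SPEA2 evolutionary loop of the third level are not reproduced here.

# Illustrative sketch only: rule encoding, meta-feature names, method labels,
# and performance values are assumptions, not the paper's implementation.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    """An if->then rule: if the condition holds on a dataset's meta-features,
    apply the named classification method."""
    condition: Callable[[dict], bool]
    method: str

def select_method(rules: list[Rule], meta_features: dict, default: str = "naive_bayes") -> str:
    """Return the method chosen by the first matching rule (hypothetical semantics)."""
    for rule in rules:
        if rule.condition(meta_features):
            return rule.method
    return default

def evaluate_hyper_heuristic(rules, datasets, performance_table):
    """Aggregate the two low-level objectives (misclassifications, seconds)
    of the methods the hyper-heuristic selects over a training group."""
    total_errors, total_time = 0.0, 0.0
    for ds in datasets:
        method = select_method(rules, ds["meta_features"])
        errors, seconds = performance_table[(ds["name"], method)]
        total_errors += errors
        total_time += seconds
    return total_errors, total_time  # both objectives are minimized

def dominates(a, b):
    """Pareto dominance between two bi-objective points (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

if __name__ == "__main__":
    # Toy usage with made-up meta-features and a precomputed performance table.
    rules = [
        Rule(lambda mf: mf["n_classes"] > 10, "linear_svm"),
        Rule(lambda mf: mf["avg_doc_length"] < 50, "naive_bayes"),
    ]
    fast_rules = [Rule(lambda mf: True, "naive_bayes")]  # always pick the cheap method
    datasets = [
        {"name": "news", "meta_features": {"n_classes": 20, "avg_doc_length": 120}},
        {"name": "reviews", "meta_features": {"n_classes": 2, "avg_doc_length": 30}},
    ]
    performance_table = {
        ("news", "linear_svm"): (150, 12.0),
        ("news", "naive_bayes"): (400, 2.0),
        ("reviews", "naive_bayes"): (80, 1.5),
    }
    f1 = evaluate_hyper_heuristic(rules, datasets, performance_table)
    f2 = evaluate_hyper_heuristic(fast_rules, datasets, performance_table)
    # Neither rule set dominates the other: one trades accuracy for speed.
    print(f1, f2, dominates(f1, f2), dominates(f2, f1))

In the paper's third level, SPEA2 would evolve a population of such rule sets toward a Pareto-approximated front of the two aggregated objectives; that evolutionary loop is omitted from this sketch.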
DOI: 10.1016/j.swevo.2025.102073