Adaptive sampling for active learning with genetic programming

Bibliographic details
Published in: Cognitive Systems Research, Vol. 65, pp. 23-39
Main authors: Ben Hamida, Sana; Hmida, Hmida; Borgi, Amel; Rukoz, Marta
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.01.2021
ISSN: 1389-0417
Description
Summary: Active learning is a machine learning paradigm in which the learner decides which inputs to use for training. It has been introduced to Genetic Programming (GP) mainly through dynamic data sampling, which is used to address known issues such as computational cost, over-fitting and imbalanced databases. Traditional dynamic sampling for GP gives the algorithm a new sample periodically, often every generation, without considering the state of the evolution. As a result, individuals do not have enough time to extract the hidden knowledge. An alternative approach is to use information about the learning state to adapt how often the training data changes. In this work, we propose an adaptive sampling strategy for classification tasks based on the state of solved fitness cases throughout learning. It is a flexible approach that can be applied with any dynamic sampling method. We implemented several sampling algorithms extended with dynamic and adaptive control of the re-sampling frequency. We applied them to the KDD intrusion detection and Adult income prediction problems with GP. The experimental study shows that controlling the sampling frequency preserves the power of dynamic sampling, with possible improvements in learning time and quality. We also show that adaptive sampling can be an alternative to multi-level sampling. This work opens many relevant paths for extension.
DOI: 10.1016/j.cogsys.2020.08.008
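
The summary describes changing the training sample according to the state of solved fitness cases rather than on a fixed schedule. The record does not give the actual rule, so the following is only a minimal sketch of one plausible reading: the sample is replaced once the best individual solves a given share of its cases, or once the sample has been in use for too many generations. All names and parameters (`solved_rate`, `needs_new_sample`, `run`, `threshold`, `max_age`) are hypothetical, plain random sampling stands in for whichever dynamic sampling method is used, and `evolve` is a user-supplied stand-in for one GP generation.

```python
import random
from typing import Callable, List, Sequence, Tuple

Case = Tuple[Sequence[float], int]             # (feature vector, class label)
Classifier = Callable[[Sequence[float]], int]  # stand-in for an evolved GP program


def solved_rate(population: List[Classifier], sample: List[Case]) -> float:
    """Share of fitness cases in `sample` that the best individual classifies correctly."""
    best = max(sum(ind(x) == y for x, y in sample) for ind in population)
    return best / len(sample)


def needs_new_sample(rate: float, age: int, threshold: float, max_age: int) -> bool:
    """Assumed rule: re-sample once enough cases are solved, or the sample is too old."""
    return rate >= threshold or age >= max_age


def run(dataset, population, evolve, generations, sample_size=4,
        threshold=0.75, max_age=5, seed=0):
    """Return the generations at which the training sample was replaced."""
    rng = random.Random(seed)
    sample = rng.sample(dataset, sample_size)
    age, changes = 0, []
    for gen in range(generations):
        population = evolve(population, sample)  # one GP generation (user-supplied)
        age += 1
        if needs_new_sample(solved_rate(population, sample), age, threshold, max_age):
            sample = rng.sample(dataset, sample_size)  # placeholder sampling method
            age = 0
            changes.append(gen)
    return changes
```

With a fixed schedule the sample would change at every generation regardless of progress. Under the assumed rule, a population that already solves its sample moves on immediately, while one that is still struggling keeps the same sample for up to `max_age` generations.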