Optimal sampling in unbiased active learning

Bibliographic details
Title: Optimal sampling in unbiased active learning
Authors: Imberg, Henrik, 1991; Jonasson, Johan, 1966; Axelson-Fisk, Marina, 1972
Source: Statistical sampling in machine learning 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), Online Proceedings of Machine Learning Research. 108:559-569
Keywords: Optimal design, Weighted loss, Sampling weights, Generalised linear models, Unequal probability sampling, Active learning
Description: A common belief in unbiased active learning is that, in order to capture the most informative instances, the sampling probabilities should be proportional to the uncertainty of the class labels. We argue that this produces suboptimal predictions and present sampling schemes for unbiased pool-based active learning that minimise the actual prediction error, and demonstrate a better predictive performance than competing methods on a number of benchmark datasets. In contrast, both probabilistic and deterministic uncertainty sampling performed worse than simple random sampling on some of the datasets.
File description: electronic
Access URL: https://research.chalmers.se/publication/536361
https://research.chalmers.se/publication/520253
https://research.chalmers.se/publication/519957
http://proceedings.mlr.press/v108/imberg20a/imberg20a.pdf
Database: SwePub
ISSN: 2640-3498
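
Illustration: The description refers to sampling with probabilities proportional to label uncertainty, combined with sampling weights that keep estimates unbiased. The sketch below shows only that baseline idea, not the optimal sampling schemes the authors propose. It assumes uncertainty is measured as p(1 - p) for a predicted class probability p, that instances are drawn with replacement, and that a small floor `eps` keeps every sampling probability positive. All function names and parameters are hypothetical.

```python
import random


def uncertainty_probs(pred_probs, eps=1e-3):
    """Sampling probabilities proportional to p * (1 - p).

    The floor eps is an assumption of this sketch: it keeps every
    probability strictly positive, which unbiased weighting requires.
    """
    scores = [max(p * (1.0 - p), eps) for p in pred_probs]
    total = sum(scores)
    return [s / total for s in scores]


def draw_weighted_sample(sampling_probs, n, rng):
    """Draw n indices with replacement and attach inverse-probability weights.

    Each draw of instance i gets weight 1 / (n * pi_i), so a weighted sum
    over the sample has expectation equal to the sum over the whole pool.
    """
    indices = rng.choices(range(len(sampling_probs)), weights=sampling_probs, k=n)
    return [(i, 1.0 / (n * sampling_probs[i])) for i in indices]


def weighted_total(values, sample):
    """Weighted estimate of sum(values) from (index, weight) pairs."""
    return sum(w * values[i] for i, w in sample)


if __name__ == "__main__":
    rng = random.Random(0)
    pred = [0.5, 0.9, 0.1, 0.99, 0.6]    # hypothetical predicted probabilities
    loss = [0.7, 0.1, 0.2, 0.05, 0.5]    # hypothetical per-instance losses
    pi = uncertainty_probs(pred)
    sample = draw_weighted_sample(pi, n=3, rng=rng)
    print("estimated pool loss:", weighted_total(loss, sample))
    print("true pool loss:     ", sum(loss))
```

A single estimate can be far from the true pool loss. Unbiasedness only says that the estimates average to the true value over repeated draws, which is why the choice of sampling probabilities matters for the variance and hence for the prediction error.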