Pool-Based Sequential Active Learning for Regression

Detailed Bibliography
Published in: IEEE Transactions on Neural Networks and Learning Systems, Vol. 30, No. 5, pp. 1348-1359
Main author: Wu, Dongrui
Medium: Journal Article
Language: English
Publication details: United States: IEEE, 1 May 2019
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 2162-237X, 2162-2388
Description
Summary: Active learning (AL) is a machine-learning approach for reducing the data labeling effort. Given a pool of unlabeled samples, it tries to select the most useful ones to label so that a model built from them can achieve the best possible performance. This paper focuses on pool-based sequential AL for regression (ALR). We first propose three essential criteria that an ALR approach should consider in selecting the most useful unlabeled samples: informativeness, representativeness, and diversity, and compare four existing ALR approaches against them. We then propose a new ALR approach using passive sampling, which considers both the representativeness and the diversity in both the initialization and subsequent iterations. Remarkably, this approach can also be integrated with other existing ALR approaches in the literature to further improve the performance. Extensive experiments on 11 University of California, Irvine, Carnegie Mellon University StatLib, and University of Florida Media Core data sets from various domains verified the effectiveness of our proposed ALR approaches.
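To make the representativeness and diversity criteria concrete, here is a minimal sketch of pool-based sample selection in that spirit: it seeds the labeled set with the sample nearest the pool centroid (representativeness) and then greedily adds the sample farthest from everything selected so far (diversity, via farthest-point sampling). This is only an illustration of the selection criteria, not the paper's actual passive-sampling algorithm; the function name and the use of Euclidean distance are assumptions.

```python
import numpy as np

def greedy_diverse_select(pool, k):
    """Pick k sample indices from an unlabeled pool (n x d array),
    balancing representativeness (start near the centroid) with
    diversity (farthest-point greedy selection thereafter)."""
    # Representativeness: begin with the sample closest to the pool centroid.
    centroid = pool.mean(axis=0)
    first = int(np.argmin(np.linalg.norm(pool - centroid, axis=1)))
    selected = [first]
    # dist[i] = distance from pool[i] to its nearest selected sample.
    dist = np.linalg.norm(pool - pool[first], axis=1)
    for _ in range(k - 1):
        # Diversity: add the sample farthest from everything chosen so far.
        nxt = int(np.argmax(dist))
        selected.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(pool - pool[nxt], axis=1))
    return selected
```

In a sequential ALR loop, the selected samples would be sent for labeling and a regression model retrained after each batch; informativeness criteria (e.g., model uncertainty) could be blended into the greedy score as well.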
DOI: 10.1109/TNNLS.2018.2868649