Empirical investigation of active learning strategies

Bibliographic Details
Published in:	Neurocomputing (Amsterdam) Vol. 326-327; pp. 15-27
Main Authors:	Pereira-Santos, Davi; Prudêncio, Ricardo Bastos Cavalcante; de Carvalho, André C.P.L.F.
Format:	Journal Article
Language:	English
Published:	Elsevier B.V., 31 January 2019
Subjects:	Active learning; Agnostic active learning; Data labeling; Data sampling; Non-agnostic active learning; Partially labeled data
ISSN:	0925-2312, 1872-8286

Description
Summary:	Many predictive tasks require labeled data to induce classification models. The data labeling process may have a high cost. Several strategies have been proposed to optimize the selection of the most relevant examples, a process referred to as active learning. However, a lack of empirical studies comparing different active learning approaches across multiple datasets makes it difficult to identify the most promising strategies, or even to assess the relative gain of active learning over the trivial random selection of instances. In this study, a comprehensive comparison of active learning strategies is presented, with various instance selection criteria, different classification algorithms and a large number of datasets. The experimental results confirm the effectiveness of active learning and provide insights into the relationship between classification algorithms and active learning strategies. Additionally, ranking curves with bands are introduced as a means to summarize in a single chart the performance of each active learning strategy for different classification algorithms and datasets.
DOI:	10.1016/j.neucom.2017.05.105