Empirical investigation of active learning strategies

Bibliographic details
Published in: Neurocomputing (Amsterdam), Vol. 326-327, pp. 15-27
Main authors: Pereira-Santos, Davi; Prudêncio, Ricardo Bastos Cavalcante; de Carvalho, André C.P.L.F.
Format: Journal Article
Language: English
Published: Elsevier B.V., 31 January 2019
ISSN:0925-2312, 1872-8286
Online access: Full text
Description
Abstract: Many predictive tasks require labeled data to induce classification models, and labeling that data can be costly. Several strategies have been proposed to optimize the selection of the most relevant examples, a process referred to as active learning. However, a lack of empirical studies comparing different active learning approaches across multiple datasets makes it difficult to identify the most promising strategies, or even to assess the relative gain of active learning over the trivial random selection of instances. In this study, a comprehensive comparison of active learning strategies is presented, with various instance selection criteria, different classification algorithms and a large number of datasets. The experimental results confirm the effectiveness of active learning and provide insights into the relationship between classification algorithms and active learning strategies. Additionally, ranking curves with bands are introduced as a means to summarize, in a single chart, the performance of each active learning strategy for different classification algorithms and datasets.
DOI:10.1016/j.neucom.2017.05.105