Label augmented and weighted majority voting for crowdsourcing

Detailed bibliography
Published in: Information Sciences, Volume 606, pp. 397-409
Main authors: Chen, Ziqi; Jiang, Liangxiao; Li, Chaoqun
Format: Journal Article
Language: English
Published: Elsevier Inc., 01.08.2022
ISSN:0020-0255, 1872-6291
Online access: Get full text
Description
Summary: Crowdsourcing provides an efficient way to obtain multiple noisy labels from different crowd workers for each unlabeled instance. Label integration methods are designed to infer the unknown true label of each instance from its multiple noisy label set. We argue that when label quality is higher than random classification, the more labels an instance has, the better label integration methods perform. However, in real-world crowdsourcing scenarios, each instance rarely obtains enough labels, because labels are costly to collect. To solve this problem, this paper proposes a novel label integration method called label augmented and weighted majority voting (LAWMV). First, LAWMV uses the K-nearest neighbors (KNN) algorithm to find each instance's K nearest neighbors (including itself) and merges their multiple noisy label sets to obtain its augmented multiple noisy label set. Then, the labels from different neighbors are weighted by the distances and the label similarities between each instance and its neighbors. Finally, the integrated label of each instance is inferred by weighted majority voting (MV). The experimental results on 34 simulated and two real-world crowdsourced datasets show that LAWMV significantly outperforms all the other state-of-the-art label integration methods.
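The three steps summarized above (KNN-based label augmentation, neighbor weighting, weighted majority voting) can be sketched in a few lines of Python. The specific weighting formulas here, inverse-distance weights and a multiset-Jaccard label similarity, are illustrative assumptions, not the paper's exact definitions:

```python
import numpy as np
from collections import Counter

def lawmv(X, noisy_labels, K=5):
    """Sketch of label augmented and weighted majority voting.

    X            : (n, d) feature matrix
    noisy_labels : list of n lists; the multiple noisy label set of each instance
    K            : number of nearest neighbors (each instance is its own neighbor)
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances between all instances.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    integrated = []
    for i in range(n):
        # Step 1: K nearest neighbors of instance i, including i itself.
        nbrs = np.argsort(dists[i])[:K]
        ci = Counter(noisy_labels[i])
        votes = Counter()
        for j in nbrs:
            # Step 2a: distance weight (assumed form): closer neighbors count more.
            w_dist = 1.0 / (1.0 + dists[i, j])
            # Step 2b: label-similarity weight (assumed form): multiset Jaccard
            # overlap between the two noisy label sets.
            cj = Counter(noisy_labels[j])
            inter = sum((ci & cj).values())
            union = sum((ci | cj).values())
            w_sim = inter / union if union else 1.0
            # Merge neighbor j's labels into the augmented, weighted vote tally.
            for lab in noisy_labels[j]:
                votes[lab] += w_dist * w_sim
        # Step 3: weighted majority voting over the augmented label set.
        integrated.append(votes.most_common(1)[0][0])
    return integrated
```

For example, with three one-dimensional instances where the first two are near each other, `lawmv([[0.0], [0.1], [5.0]], [['a', 'a'], ['a', 'b'], ['b', 'b']], K=2)` lets the second instance borrow its close neighbor's labels and resolve its tied noisy label set to `'a'`.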
DOI:10.1016/j.ins.2022.05.066