A multi-objective evolutionary algorithm for robust positive-unlabeled learning

Bibliographic details
Published in: Information Sciences, Volume 678; p. 120992
Main authors: Qiu, Jianfeng, Tang, Qi, Tan, Ming, Li, Kaixuan, Xie, Juan, Cai, Xiaoqiang, Cheng, Fan
Format: Journal Article
Language: English
Published: Elsevier Inc, 1 September 2024
ISSN:0020-0255, 1872-6291
Description
Summary: Positive and unlabeled (PU) learning aims to learn a binary classifier with good generalization ability from PU data. A variety of PU learning algorithms with promising performance have been proposed. However, most of them assume that PU samples are “clean”, which does not hold in real applications because of noisy or redundant samples. Obtaining a robust PU classifier with better performance is therefore a challenging problem. To this end, we propose a novel multi-objective evolutionary algorithm, named BPUSS-MOEA. Specifically, we first transform robust PU learning into a bi-objective PU sample selection (BPUSS) problem with two objectives: the number of selected “clean” PU samples and the PU accuracy. Then, a dual-coding scheme is designed to represent the selected “clean” PU samples and the labels of U samples. Based on the dual-coding scheme, a novel offspring generation strategy is developed to produce high-quality offspring. To further improve the performance of BPUSS-MOEA, an effective population initialization strategy is designed. Experiments on 10 datasets with different noise levels show that, compared with state-of-the-art methods, the proposed algorithm is robust in terms of PU accuracy.
DOI:10.1016/j.ins.2024.120992
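
Illustrative note: the summary names a dual-coding scheme and two objectives but gives no implementation details. The following is a minimal sketch of how such an individual and a Pareto comparison might look, not the authors' BPUSS-MOEA. The class `Individual`, the functions `evaluate`, `dominates` and `non_dominated`, the bit-vector layout, and the caller-supplied `pu_accuracy_fn` are all hypothetical. The summary does not state whether each objective is maximized or minimized, so the direction is left as an explicit `maximize` argument.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Individual:
    # Assumed part 1 of the dual code: one bit per P or U sample,
    # 1 = sample is kept as "clean", 0 = sample is discarded.
    selected: List[int]
    # Assumed part 2 of the dual code: one bit per U sample,
    # 1 = currently guessed positive, 0 = currently guessed negative.
    u_labels: List[int]


def evaluate(ind: Individual,
             pu_accuracy_fn: Callable[[Individual], float]) -> Tuple[int, float]:
    """Return (number of selected samples, PU accuracy) for one individual.

    pu_accuracy_fn is a placeholder supplied by the caller; the summary
    does not define how PU accuracy is computed.
    """
    return sum(ind.selected), pu_accuracy_fn(ind)


def dominates(a: Sequence[float], b: Sequence[float],
              maximize: Sequence[bool]) -> bool:
    """Pareto dominance: a is no worse than b on every objective
    and strictly better on at least one."""
    no_worse, better = True, False
    for x, y, is_max in zip(a, b, maximize):
        if not is_max:          # turn minimization into maximization
            x, y = -x, -y
        if x < y:
            no_worse = False
        if x > y:
            better = True
    return no_worse and better


def non_dominated(points: Sequence[Tuple[float, float]],
                  maximize: Sequence[bool]) -> List[Tuple[float, float]]:
    """Keep the points that no other point dominates (input order preserved)."""
    return [p for p in points
            if not any(dominates(q, p, maximize) for q in points)]
```

A multi-objective evolutionary algorithm would typically apply a comparison of this kind to rank candidate solutions, so that no fixed weighting between sample count and accuracy has to be chosen in advance.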