BiCSA-PUL: binary crow search algorithm for enhancing positive and unlabeled learning

Bibliographic details
Published in: International journal of information technology (Singapore. Online), Vol. 17, No. 3, pp. 1729-1743
Main authors: Azizi, Nabil, Ben Othmane, Mohamed, Hamouma, Moumen, Siam, Abderrahim, Haouassi, Hichem, Ledmi, Makhlouf, Hamdi-Cherif, Aboubekeur
Format: Journal Article
Language: English
Published: Singapore: Springer Nature Singapore, 01.04.2025
Springer Nature B.V
ISSN:2511-2104, 2511-2112
Description
Summary: This paper presents a novel metaheuristic binary crow search algorithm (CSA) designed for positive-unlabeled (PU) learning, a paradigm in which only positive and unlabeled data are available, with applications in fields such as medical diagnosis and fraud detection. The algorithm represents a useful adaptation of CSA, itself inspired by the food-hiding behavior of crows. The proposed BiCSA-PUL (binary crow search algorithm for positive-unlabeled learning) selects reliable negative samples from unlabeled data using binary vectors and updates positions using Hamming distance, guided by a modified F1-score as the fitness function. The algorithm was tested on 30 samples from 10 diverse datasets, outperforming seven state-of-the-art PU learning methods. The results reveal that BiCSA-PUL provides a robust and efficient approach for PU learning, significantly improving fitness and reliability. This work opens new avenues for applying metaheuristic optimization methods to challenging classification problems with limited labeled data. The main limitations are the potentially time-intensive process of parameter tuning and the reliance on initial sampling.
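Illustrative sketch: the summary names the ingredients (binary selection vectors, Hamming-distance position updates, a fitness function) but gives no update equations, so the Python below is only a rough sketch of a generic binary crow search under stated assumptions, not the authors' BiCSA-PUL. The names `hamming`, `move_toward`, `binary_crow_search`, the parameter `awareness`, and the rule "adopt a random share of the differing bits" are all hypothetical. The fitness function is supplied by the caller; the paper's modified F1-score is not reproduced here.

```python
import random


def hamming(a, b):
    """Number of positions where two equal-length binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def move_toward(position, target, rng):
    """Copy a random share of the differing bits from target into position.

    Assumed update rule: the Hamming distance to the target can only
    shrink or stay the same.
    """
    diff = [i for i, (p, t) in enumerate(zip(position, target)) if p != t]
    k = rng.randint(0, len(diff))  # how many differing bits to adopt
    new = list(position)
    for i in rng.sample(diff, k):
        new[i] = target[i]
    return new


def binary_crow_search(fitness, n_bits, n_crows=10, n_iter=50,
                       awareness=0.1, seed=0):
    """Maximise fitness over binary vectors; bit 1 = sample is selected."""
    rng = random.Random(seed)
    positions = [[rng.randint(0, 1) for _ in range(n_bits)]
                 for _ in range(n_crows)]
    memories = [list(p) for p in positions]  # best position seen per crow
    scores = [fitness(m) for m in memories]
    for _ in range(n_iter):
        for i in range(n_crows):
            j = rng.randrange(n_crows)  # crow whose memory is followed
            if rng.random() >= awareness:
                candidate = move_toward(positions[i], memories[j], rng)
            else:
                # Crow j notices it is being followed: i jumps at random.
                candidate = [rng.randint(0, 1) for _ in range(n_bits)]
            positions[i] = candidate
            score = fitness(candidate)
            if score > scores[i]:
                memories[i], scores[i] = list(candidate), score
    best = max(range(n_crows), key=scores.__getitem__)
    return memories[best], scores[best]
```

In a PU-learning setting, each bit would presumably mark whether an unlabeled sample is treated as a reliable negative, and `fitness` would train and score a classifier on that selection; both points are assumptions drawn from the summary's wording.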
DOI:10.1007/s41870-024-02367-y