Evaluation multi label feature selection for text classification using weighted borda count approach

Bibliographic details
Published in: Iranian Joint Congress on Fuzzy and Intelligent Systems (Online), pp. 1-6
Main authors: Miri, Mohsen; Dowlatshahi, Mohammad Bagher; Hashemi, Amin
Format: Conference paper
Language: English
Publication details: IEEE, 02.03.2022
ISSN: 2771-1374
Description
Summary: Because text data is so widespread, multi-label (ML) text classification is an essential task in machine learning. Feature selection is an effective preprocessing step that improves the learning process. Choosing a Multi-Label Feature Selection (MLFS) algorithm is among the most basic, critical, and sensitive decisions in ML classification. A choice based on a single criterion cannot be relied on to always be sound; the best algorithm should be identified by evaluating it against several different criteria, so that it is examined from different aspects. In this article, we treat the problem as an election and use the Weighted Borda Count method for voting. The voting proceeds in three successive stages. In the first stage, subsets of different features do the voting. In the second stage, the methods are voted on using six criteria, and each criterion ranks the methods in order of priority from first to last. Stages 1 and 2 are performed on each of the eighteen text datasets used. In the third and final stage, the methods are evaluated and voted on by the different text datasets. The result of the third stage ranks the MLFS methods by their performance from best to worst. The experiments and their results show that selecting an algorithm based on several different criteria, and on its overall performance, is better than selecting it based on a single criterion.
DOI: 10.1109/CFIS54774.2022.9756467
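
The voting idea described in the summary can be illustrated with a minimal sketch of a weighted Borda count. This is not the authors' implementation: the record does not state how the weights or the per-position points are defined, so the function name `weighted_borda`, the classic "n-1 points for first place" scoring, the per-ballot weights, and the alphabetical tie-break are all assumptions made for illustration.

```python
from collections import defaultdict


def weighted_borda(rankings, weights=None):
    """Aggregate ranked ballots into a single ranking (illustrative sketch).

    rankings: list of ballots; each ballot lists candidates from best to worst.
    weights:  one non-negative weight per ballot (assumed scheme; defaults to 1.0).
    Returns the candidates sorted by total score, best first; ties are broken
    alphabetically (an arbitrary choice made here for determinism).
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    if len(weights) != len(rankings):
        raise ValueError("need exactly one weight per ballot")

    scores = defaultdict(float)
    for ballot, weight in zip(rankings, weights):
        n = len(ballot)
        for position, candidate in enumerate(ballot):
            # Classic Borda points: n-1 for first place down to 0 for last,
            # scaled by the weight of the voter that cast this ballot.
            scores[candidate] += weight * (n - 1 - position)

    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical ballots: three voters (e.g. criteria) rank placeholder methods.
ballots = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["A", "C", "B"],
]
final_ranking = weighted_borda(ballots, weights=[2.0, 1.0, 1.0])
```

Under these assumptions, a multi-stage scheme like the one in the summary could be approximated by feeding the ranking returned at one stage in as a single ballot at the next stage, for example one ballot per criterion and then one ballot per dataset.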