Evaluation multi label feature selection for text classification using weighted borda count approach

Bibliographic Details
Published in:	Iranian Joint Congress on Fuzzy and Intelligent Systems (Online), pp. 1-6
Main Authors:	Miri, Mohsen; Dowlatshahi, Mohammad Bagher; Hashemi, Amin
Format:	Conference Proceeding
Language:	English
Published:	IEEE 02.03.2022
Subjects:	Classification algorithms; Feature extraction; Machine learning; Machine learning algorithms; Multi-label feature selection; Task analysis; Text categorization; Text classification; Voting; Weighted Borda Count
ISSN:	2771-1374

Description
Summary:	Because text data is so abundant, multi-label (ML) text classification is an essential task in machine learning. Feature selection is an effective preprocessing step that improves the learning process. Choosing a Multi-Label Feature Selection (MLFS) algorithm is one of the most basic, critical, and sensitive decisions in ML classification. A choice based on a single criterion cannot be assumed to be sound in every case; candidate algorithms should be evaluated against several criteria so that they are examined from different aspects. In this article, we cast the problem as an election and use the Weighted Borda Count method for voting. The voting proceeds in three consecutive stages. In the first stage, different feature subsets do the voting. In the second stage, the methods are voted on using six criteria, and each criterion ranks the methods in order of priority from first to last. Stages 1 and 2 are performed on each of the eighteen text datasets used. In the third and final stage, the text datasets evaluate and vote on the methods. The result of the third stage ranks the MLFS methods by their performance from first to last. The experiments indicate that selecting an algorithm based on several criteria and its overall performance is better than selecting it based on a single criterion.
DOI:	10.1109/CFIS54774.2022.9756467
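
The summary above describes ranking MLFS methods by Weighted Borda Count voting. This record does not give the paper's weighting scheme, so the following is only a minimal Python sketch of a generic weighted Borda count. It assumes each voter (for example, one evaluation criterion) submits a full ranking and carries one scalar weight. The function name, the method names, and the weights are hypothetical illustrations, not taken from the paper.

```python
from collections import defaultdict


def weighted_borda_count(rankings, weights=None):
    """Aggregate ranked ballots with a weighted Borda count.

    rankings: list of ballots, each a list of candidates ordered best first.
    weights:  one weight per ballot (assumption: defaults to 1.0 for every ballot).
    Returns a list of (candidate, score) pairs, highest score first.
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    if len(weights) != len(rankings):
        raise ValueError("need exactly one weight per ballot")

    scores = defaultdict(float)
    for ballot, weight in zip(rankings, weights):
        n = len(ballot)
        for position, candidate in enumerate(ballot):
            # Classic Borda points: n - 1 for first place, 0 for last place,
            # scaled by the weight of the voter who cast this ballot.
            scores[candidate] += weight * (n - 1 - position)

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical example: three criteria each rank three MLFS methods,
# and the third criterion is given twice the weight of the others.
ballots = [
    ["method_A", "method_B", "method_C"],
    ["method_B", "method_A", "method_C"],
    ["method_A", "method_C", "method_B"],
]
result = weighted_borda_count(ballots, weights=[1, 1, 2])
print(result)
```

Under these assumptions, a multi-stage scheme like the one in the summary could be sketched by taking the candidate order from one call and passing it as a single ballot to the next stage. How the paper actually chains its three stages is not stated in this record.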