Hybrid gene selection approach using XGBoost and multi-objective genetic algorithm for cancer classification

Bibliographic Details
Published in: Medical & biological engineering & computing Vol. 60; no. 3; pp. 663-681
Main Authors: Deng, Xiongshi, Li, Min, Deng, Shaobo, Wang, Lei
Format: Journal Article
Language: English
Published: Berlin/Heidelberg Springer Berlin Heidelberg 01.03.2022
Springer Nature B.V
ISSN: 0140-0118, 1741-0444
Description
Summary: Microarray gene expression datasets typically contain a large number of genes but only a small number of samples. Only a few of these genes are relevant to cancer, which makes gene selection challenging. We therefore propose a two-stage gene selection approach, XGBoost-MOGA, that combines extreme gradient boosting (XGBoost) with a multi-objective optimization genetic algorithm for cancer classification on microarray datasets. In the first stage, the genes are ranked by an ensemble-based feature selection method built on XGBoost. This stage can effectively remove irrelevant genes and yields a group of the genes most relevant to the class. In the second stage, XGBoost-MOGA searches this group for an optimal gene subset using a multi-objective optimization genetic algorithm. We performed comprehensive experiments comparing XGBoost-MOGA with other state-of-the-art feature selection methods, using two well-known learning classifiers on 14 publicly available microarray expression datasets. The experimental results show that XGBoost-MOGA yields significantly better results than previous state-of-the-art algorithms on several evaluation criteria, including accuracy, F-score, precision, and recall.
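
The two-stage procedure described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. The summary does not state the objectives or the genetic operators, so the sketch assumes two objectives (maximize classification accuracy, minimize the number of selected genes), uniform crossover, and single-gene flip mutation. The names `top_k_genes`, `dominates`, `pareto_front`, and `moga` are hypothetical, the importance scores are made-up numbers standing in for XGBoost feature importances, and `toy_accuracy` stands in for a real cross-validated classifier score.

```python
import random

def top_k_genes(importance, k):
    """Stage 1 (assumed form): keep the k genes with the highest importance."""
    ranked = sorted(range(len(importance)), key=lambda g: importance[g], reverse=True)
    return ranked[:k]

def dominates(a, b):
    """a and b are (accuracy, n_genes); higher accuracy and fewer genes are better."""
    return a[0] >= b[0] and a[1] <= b[1] and a != b

def pareto_front(population, objectives):
    """Return the subsets whose objective pair is not dominated by any other."""
    scored = [(s, objectives(s)) for s in population]
    return [s for s, f in scored if not any(dominates(g, f) for _, g in scored)]

def moga(candidates, accuracy, generations=30, pop_size=20, seed=0):
    """Stage 2 (assumed form): small elitist multi-objective GA over gene subsets."""
    rng = random.Random(seed)

    def objectives(s):
        return (accuracy(s), len(s))

    def repair(genes):
        # An empty subset cannot be classified on, so fall back to one random gene.
        return frozenset(genes) or frozenset([rng.choice(candidates)])

    population = [repair(g for g in candidates if rng.random() < 0.5)
                  for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            p1, p2 = rng.choice(population), rng.choice(population)
            # Uniform crossover: each gene's membership comes from a random parent.
            child = {g for g in candidates if g in (p1 if rng.random() < 0.5 else p2)}
            # Mutation: flip the membership of one randomly chosen gene.
            child ^= {rng.choice(candidates)}
            children.append(repair(child))
        # Elitist survival: non-dominated subsets first, then random fill.
        pool = sorted(set(population + children), key=sorted)
        front = pareto_front(pool, objectives)
        rest = [s for s in pool if s not in front]
        rng.shuffle(rest)
        population = (front + rest)[:pop_size]
    return pareto_front(population, objectives)

# Toy usage: made-up importances for 8 genes; genes 3 and 7 are the informative ones.
importance = [0.01, 0.05, 0.0, 0.40, 0.02, 0.0, 0.03, 0.30]
candidates = top_k_genes(importance, k=4)
toy_accuracy = lambda s: 0.5 + 0.25 * len(s & {3, 7})
front = moga(candidates, toy_accuracy)
print(sorted(sorted(s) for s in front))
```

In practice the importance scores would come from a trained XGBoost model and `accuracy` from cross-validating a classifier on the selected genes. The result is a set of trade-off subsets, not a single answer, so one subset still has to be chosen from the front.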
DOI: 10.1007/s11517-021-02476-x