Hybridization of data-driven threshold algorithm with fuzzy particle swarm optimization technique for gene selection in microarray data

Bibliographic Details
Published in:Scientific African Vol. 23; p. e02012
Main Authors: Adebayo, Paul Olujide, Jimoh, Rasheed Gbenga, Yahya, Waheed Babatunde
Format: Journal Article
Language:English
Published: Elsevier B.V., 01.03.2024
ISSN:2468-2276
Description
Summary:Microarrays have revolutionized genomics by enabling the simultaneous measurement of the expression levels of thousands of genes. However, the high dimensionality of microarray data makes it difficult to identify the genes relevant to disease diagnosis and biomarker discovery. This article introduces a novel hybrid approach to gene selection in microarray data that combines a data-driven threshold algorithm with Fuzzy Particle Swarm Optimization (FPSO). The method pursues several objectives at once: minimizing the number of genes selected for model training, reducing computational cost, assessing each gene's contribution to the underlying condition, and improving classifier accuracy. The data-driven threshold algorithm automatically determines an optimal threshold value from the characteristics of the dataset, addressing the often-challenging task of threshold setting in gene selection. FPSO, in turn, uses fuzzy logic to set its parameters during the global search and uses the threshold algorithm as its selection criterion. The synergy between FPSO and the threshold approach forms the core of the method and is what allows these objectives to be pursued simultaneously. In experimental evaluations on real microarray datasets, the hybrid approach outperformed existing solutions in gene selection performance and computational efficiency. The selected genes yield improved classification accuracy and greater biological relevance, which enhances their value for downstream analysis tasks. The hybrid algorithm did, however, face challenges with multi-class microarray datasets; future work will focus on adapting the method to the characteristics of such datasets.
DOI:10.1016/j.sciaf.2023.e02012
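
The summary above names two ingredients: a threshold derived from the data itself, and fuzzy-logic parameter setting inside the particle swarm search. The record does not give the article's formulas, so the following is only a minimal sketch under stated assumptions, not the authors' method. The scoring rule (class-mean difference over spread), the cutoff rule (mean plus one standard deviation of the scores), the two-class setting, and the function names `gene_scores`, `data_driven_threshold`, `select_genes`, and `fuzzy_inertia` are all hypothetical choices made for illustration.

```python
from statistics import mean, pstdev


def gene_scores(X, y):
    """Score each gene by |class-mean difference| / summed class spread.

    X is a list of samples (rows) by genes (columns); y holds 0/1 labels.
    This scoring rule is an assumption, not taken from the article.
    """
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        spread = pstdev(a) + pstdev(b) + 1e-12  # guard against division by zero
        scores.append(abs(mean(a) - mean(b)) / spread)
    return scores


def data_driven_threshold(scores):
    """Hypothetical cutoff: mean of the scores plus one standard deviation."""
    return mean(scores) + pstdev(scores)


def select_genes(X, y):
    """Return indices of genes whose score reaches the data-derived cutoff."""
    scores = gene_scores(X, y)
    cutoff = data_driven_threshold(scores)
    return [g for g, s in enumerate(scores) if s >= cutoff]


def fuzzy_inertia(progress, w_min=0.4, w_max=0.9):
    """Inertia weight for a PSO step from two fuzzy memberships.

    progress is the fraction of iterations completed, clamped to [0, 1].
    'explore' and 'exploit' are complementary memberships; the result is
    their weighted average over the rule outputs w_max and w_min.
    """
    progress = min(max(progress, 0.0), 1.0)
    explore = 1.0 - progress  # degree to which the search is "early"
    exploit = progress        # degree to which the search is "late"
    return (explore * w_max + exploit * w_min) / (explore + exploit)


# Toy data: 4 samples, 5 genes; only gene 0 separates the two classes.
X = [
    [1.0, 5.0, 3.0, 2.0, 4.0],
    [1.2, 5.1, 2.9, 2.1, 4.1],
    [9.0, 5.0, 3.1, 2.0, 4.0],
    [9.2, 5.2, 3.0, 2.2, 3.9],
]
y = [0, 0, 1, 1]
selected = select_genes(X, y)
```

In this sketch the filter step shrinks the candidate set before any swarm search runs, which is one plausible way to reduce the number of genes and the computational cost the summary mentions. Because the two memberships sum to one, `fuzzy_inertia` reduces to a linear schedule; a fuller fuzzy controller would use more inputs and overlapping membership functions.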