Hybridization of data-driven threshold algorithm with fuzzy particle swarm optimization technique for gene selection in microarray data

Bibliographic Details
Published in: Scientific African, Volume 23, p. e02012
Main Authors: Adebayo, Paul Olujide; Jimoh, Rasheed Gbenga; Yahya, Waheed Babatunde
Format: Journal Article
Language: English
Published: Elsevier B.V, 1 March 2024
Elsevier
ISSN: 2468-2276
Description
Summary: Microarrays have revolutionized genomics by enabling the simultaneous measurement of the expression levels of thousands of genes. However, the high dimensionality of microarray data makes it difficult to identify the genes relevant to disease diagnosis and biomarker discovery. This article introduces a novel hybrid approach to gene selection in microarray data that combines a data-driven threshold algorithm with the optimization capabilities of Fuzzy Particle Swarm Optimization (FPSO). The proposed hybrid method serves multiple objectives: minimizing the number of genes selected for model training, reducing computational cost, assessing each gene's contribution to the underlying condition, and improving classifier accuracy. The data-driven threshold algorithm automatically determines an optimal threshold value from the characteristics of the dataset, addressing the often-challenging task of threshold setting in gene selection. FPSO, in turn, uses fuzzy logic to set its parameters during global search and relies on the robustness of the threshold algorithm as its selection criterion. The synergy between FPSO and the threshold approach forms the core of the method and is what allows these objectives to be pursued simultaneously. Experimental evaluations on real microarray datasets show that the hybrid approach outperforms existing solutions in gene selection performance and computational efficiency. The selected genes yield improved classification accuracy and biological relevance, which increases their value for downstream analysis tasks. However, the hybrid algorithm faced challenges when dealing with multi-class microarray datasets. Future work will focus on adapting the method to the unique characteristics of such datasets.
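Note: The summary does not state how the threshold is computed or how FPSO uses it. The following is only a minimal sketch of what a data-driven threshold filter could look like, under three illustrative assumptions that are not taken from the article: a two-class dataset, a signal-to-noise score per gene, and the mean score as the cutoff. The function names `gene_scores`, `data_driven_threshold`, and `select_genes` are hypothetical.

```python
from statistics import mean, pstdev


def gene_scores(expr, labels):
    """Score each gene by a signal-to-noise ratio between two classes.

    expr:   list of samples, each a list of expression values (one per gene).
    labels: list of 0/1 class labels, one per sample.
    The scoring rule is an assumption made for illustration only.
    """
    n_genes = len(expr[0])
    scores = []
    for g in range(n_genes):
        a = [row[g] for row, y in zip(expr, labels) if y == 0]
        b = [row[g] for row, y in zip(expr, labels) if y == 1]
        spread = pstdev(a) + pstdev(b)
        # Guard against genes that are constant within both classes.
        scores.append(abs(mean(a) - mean(b)) / spread if spread > 0 else 0.0)
    return scores


def data_driven_threshold(scores):
    """Assumed rule: the cutoff is the mean score, so it adapts to the dataset."""
    return mean(scores)


def select_genes(expr, labels):
    """Return the indices of genes whose score exceeds the computed cutoff."""
    scores = gene_scores(expr, labels)
    cutoff = data_driven_threshold(scores)
    return [g for g, s in enumerate(scores) if s > cutoff]


# Toy data: 4 samples x 3 genes; only gene 0 separates the two classes.
expr = [
    [1.0, 5.0, 2.0],
    [1.2, 5.1, 2.1],
    [9.0, 5.0, 2.0],
    [9.2, 5.1, 2.1],
]
labels = [0, 0, 1, 1]
selected = select_genes(expr, labels)
```

In this sketch no cutoff is set by hand, which is the property the summary attributes to the threshold algorithm. A swarm-based search such as FPSO could then use a criterion of this kind when evaluating candidate gene subsets, though the article's actual coupling of the two components may differ.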
DOI: 10.1016/j.sciaf.2023.e02012