Gene expression selection for cancer classification using intelligent collaborative filtering and hamming distance guided multi-objective swarm optimization

Bibliographic Details
Published in: Applied soft computing Vol. 170; p. 112654
Main Authors: Agarwalla, Prativa, Mukhopadhyay, Sumitra
Format: Journal Article
Language: English
Published: Elsevier B.V. 01.02.2025
ISSN: 1568-4946
Description
Summary: High-dimensional microarray cancer datasets contain thousands of genes but very few samples. High class imbalance, noisy and redundant genes, and overlapping extracted features among disease classes degrade disease prediction accuracy. An intelligent collaborative filtering (ICF) assisted and Hamming distance guided multi-objective swarm intelligence framework (HIMS) is proposed for efficient selection of an optimal gene set for disease identification. In the framework, ICF is first introduced to improve prediction ability by combining the features from different feature selection tools. Then, a multi-objective multi-population search (MOMPS) algorithm is proposed, which forms the core of HIMS. It generates more diversified solutions by avoiding local trapping. A Hamming distance operator is applied as an alternative to a sorting mechanism for selecting Pareto-optimal solutions, which also helps reduce computational complexity. In addition, a time-varying U-shaped function is introduced for the binary conversion process in feature selection. Extensive experiments were conducted on 16 different single- and multi-class datasets to study the efficacy of HIMS. The experimental results show that HIMS performs favorably in comparison with existing techniques while using fewer genes.
Highlights:
• Intelligent collaborative filtering from multimodal filter source.
• Interactive multi-objective multi-population search (MOMPS) for gene selection.
• Hamming distance operator based dominant gene selection.
• Time-variant U-shaped discretization function for gene selection.
• Adaptively tuned classifier for cancer classification with optimally selected genes.
DOI: 10.1016/j.asoc.2024.112654
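
The summary mentions two building blocks that are easy to picture in code: a Hamming distance between candidate gene subsets, and a time-varying U-shaped function that turns continuous swarm positions into binary gene masks. The record does not give the paper's formulas, so the following is only a rough sketch under stated assumptions, not the authors' method. The U-shaped form alpha * |x|^beta clipped at 1 is one form that appears in the binary swarm optimization literature; the linear schedule for alpha, the parameter values, and all function names are hypothetical.

```python
import numpy as np


def hamming_distance(mask_a, mask_b):
    """Count the positions where two binary gene masks differ."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    return int(np.count_nonzero(a != b))


def u_shaped_transfer(x, t, t_max, alpha_start=1.0, alpha_end=4.0, beta=2.0):
    """Assumed time-varying U-shaped transfer function.

    Returns min(1, alpha * |x|**beta), where alpha moves linearly from
    alpha_start to alpha_end as iteration t goes from 0 to t_max.
    The schedule and default values are illustrative assumptions.
    """
    alpha = alpha_start + (alpha_end - alpha_start) * t / t_max
    return np.minimum(1.0, alpha * np.abs(x) ** beta)


def binarize(x, t, t_max, rng):
    """Select gene i (bit = 1) with probability given by the transfer value."""
    prob = u_shaped_transfer(np.asarray(x, dtype=float), t, t_max)
    return (rng.random(prob.shape) < prob).astype(int)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    position = rng.normal(size=10)          # hypothetical continuous particle
    early = binarize(position, t=0, t_max=100, rng=rng)
    late = binarize(position, t=100, t_max=100, rng=rng)
    print("early mask:", early)
    print("late mask: ", late)
    print("Hamming distance:", hamming_distance(early, late))
```

In this sketch a position component near zero is rarely selected and a large one in either direction is almost always selected, which is what gives the curve its U shape. How the paper uses the Hamming distance to pick Pareto-optimal solutions is not described in this record, so the function above only computes the distance itself.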