Disease biomarker identification based on sample network optimization

Bibliographic Details
Published in:Methods (San Diego, Calif.) Vol. 213; pp. 42 - 49
Main Authors: Wei, Pi-Jing, Ma, Wenwen, Li, Yanxin, Su, Yansen
Format: Journal Article
Language:English
Published: United States Elsevier Inc 01.05.2023
ISSN:1046-2023, 1095-9130
Description
Summary:
• We propose the multi-objective evolution algorithm for identifying biomarkers (MOESIB), which identifies disease biomarkers from sample similarity networks by combining feature selection with multi-objective evolution.
• We construct sample networks to calculate the influence of each sample on each class. This helps classify samples more accurately and overcomes the limitation of ignoring the similarities and associations among disease samples.
• In the multi-objective evolutionary algorithm, classification accuracy and the number of selected genes serve as the optimization objectives for evaluating the individuals in the population.
• We propose two strategies in MOESIB, an elite guidance strategy and a fusion selection strategy, which select key genes for constructing better networks with high classification accuracy.

A large amount of evidence shows that biomarkers are discriminant features related to disease development. The identification of disease biomarkers has therefore become a basic problem in the analysis of complex diseases in medicine, with applications such as disease staging, diagnosis, and treatment. Network-based research has become one of the most popular approaches. Several network-based algorithms have been proposed to identify biomarkers; however, networks of genes or molecules ignore the similarities and associations among samples. It is essential to further understand how to construct and optimize the networks so that the identified biomarkers are more accurate. On this basis, more effective strategies can be developed to improve the performance of biomarker identification. In this study, a multi-objective evolution algorithm based on sample similarity networks is proposed for disease biomarker identification. Specifically, we design sample similarity networks to extract structural information among samples, which is used to calculate the influence of each sample on each class. In addition, based on the networks and the group of biomarkers chosen in each iteration, samples are divided into classes according to their importance for each class. Then, during the population iterations of the evolutionary algorithm, we develop an elite guidance strategy and a fusion selection strategy to select the biomarkers that make sample classification more accurate. Experimental results on five gene expression datasets suggest that the proposed algorithm is superior to some state-of-the-art disease biomarker identification methods.
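
The two optimization objectives described in the summary (classification accuracy and number of selected genes) can be illustrated with a small Python sketch. This is not the MOESIB implementation: the summary does not state the similarity measure, the influence formula, or the details of the elite guidance and fusion selection strategies, so the sketch assumes an inverse-Euclidean-distance similarity, takes "influence" to be the mean edge weight between a sample and the members of a class, and simply filters a few hand-picked gene subsets by Pareto dominance. All function names and the toy data are hypothetical.

```python
import math

def similarity(a, b):
    # Assumed edge weight between two samples: inverse Euclidean distance.
    return 1.0 / (1.0 + math.dist(a, b))

def class_influence(sample, train, labels, genes):
    # Assumed "influence": mean similarity between the sample and the
    # training samples of each class, using only the selected genes.
    s = [sample[g] for g in genes]
    totals, counts = {}, {}
    for row, lab in zip(train, labels):
        w = similarity(s, [row[g] for g in genes])
        totals[lab] = totals.get(lab, 0.0) + w
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: totals[lab] / counts[lab] for lab in totals}

def loo_accuracy(data, labels, genes):
    # Leave-one-out accuracy: each sample is assigned to the class with the
    # largest influence computed from all other samples.
    correct = 0
    for i, sample in enumerate(data):
        infl = class_influence(sample, data[:i] + data[i + 1:],
                               labels[:i] + labels[i + 1:], genes)
        if max(infl, key=infl.get) == labels[i]:
            correct += 1
    return correct / len(data)

def objectives(data, labels, genes):
    # Both objectives are minimized: classification error and gene count.
    return (1.0 - loo_accuracy(data, labels, genes), len(genes))

def dominates(p, q):
    # p dominates q if it is no worse on every objective and better on one.
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))

def pareto_front(candidates, data, labels):
    # Keep the gene subsets that no other candidate dominates.
    scored = [(tuple(g), objectives(data, labels, g)) for g in candidates]
    return [(g, f) for g, f in scored
            if not any(dominates(f2, f) for _, f2 in scored)]

# Toy expression matrix (rows: samples, columns: genes); made-up values in
# which only gene 0 separates the two classes.
data = [
    [0.1, 5.0, 2.0],
    [0.2, 1.0, 9.0],
    [0.0, 3.0, 4.0],
    [0.9, 4.5, 2.5],
    [1.0, 1.5, 8.0],
    [0.8, 3.2, 4.2],
]
labels = ["control"] * 3 + ["disease"] * 3

front = pareto_front([[0], [1], [0, 1, 2]], data, labels)
print(front)
```

In a full evolutionary algorithm, the candidate gene subsets would be generated and refined over many population iterations rather than listed by hand; the sketch only shows how one generation of candidates might be scored and filtered.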
DOI:10.1016/j.ymeth.2023.03.005