An Approach of Epistasis Detection Using Integer Linear Programming Optimizing Bayesian Network

Bibliographic details
Published in: IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 19, No. 5, pp. 2654-2671
Main authors: Yang, Xuan; Yang, Chen; Lei, Jimeng; Liu, Jianxiao
Format: Journal Article
Language: English
Published: New York: IEEE, 1 September 2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1545-5963, 1557-9964
Description
Summary: Developing more effective and accurate methods for detecting epistatic loci in large-scale genomic data is important for applications such as improving crop quality and treating disease. Because Bayesian networks (BNs) are accurate and can model non-linear relationships, they have been widely used to construct networks of SNPs and phenotypic traits and thereby mine epistatic loci. However, BN structure learning easily falls into local optima and cannot handle large numbers of SNPs. In this work, we transform the problem of learning a Bayesian network into an integer linear programming (ILP) optimization problem. We use branch-and-bound and cutting-plane algorithms to obtain the globally optimal Bayesian network (ILPBN), and thus the epistatic loci that influence specific phenotypic traits. To handle large numbers of SNP loci and further improve efficiency, we use a Markov blanket optimization method to reduce the number of candidate parent nodes for each node. In addition, we compute the BN score with α-BIC, which is suited to epistasis mining. We use four properties of decomposable BN scoring functions to further reduce the number of candidate parent sets for each node. Experimental results show that ILPBN can handle not only 2-locus and 3-locus epistasis mining but also multi-locus epistasis detection. Finally, we compare ILPBN with several popular epistasis mining algorithms on simulated data and a real age-related macular disease (AMD) dataset. The results show that ILPBN achieves better epistasis detection accuracy, F1-score, and false positive rate than the other methods while maintaining efficiency. Availability: Code and the dataset are available at http://122.205.95.139/ILPBN/.
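Note: The summary states that Bayesian network learning is recast as an ILP but does not give the model. As a rough sketch, one formulation commonly used for ILP-based structure learning is shown below; whether the paper uses exactly this form is an assumption. The symbols are illustrative and not taken from the record: $V$ is the set of nodes (SNPs and the phenotype), $\mathcal{P}(v)$ is the pruned family of candidate parent sets for node $v$, $s(v, W)$ is the local score of $v$ with parent set $W$ (for example an α-BIC term), and $x_{v,W}$ is a binary variable selecting $W$ as the parents of $v$.

```latex
\begin{align}
\max_{x} \quad & \sum_{v \in V} \sum_{W \in \mathcal{P}(v)} s(v, W)\, x_{v,W} \\
\text{s.t.} \quad
  & \sum_{W \in \mathcal{P}(v)} x_{v,W} = 1
    && \forall v \in V \\
  & \sum_{v \in C} \;\sum_{\substack{W \in \mathcal{P}(v) \\ W \cap C = \emptyset}} x_{v,W} \;\ge\; 1
    && \forall C \subseteq V,\ |C| \ge 2 \\
  & x_{v,W} \in \{0, 1\}
    && \forall v \in V,\ W \in \mathcal{P}(v)
\end{align}
```

The first constraint assigns exactly one parent set to each node. The second (a "cluster" constraint) requires every subset of nodes to contain at least one node with no parents inside that subset, which rules out directed cycles. There are exponentially many such constraints, so in this kind of formulation they are typically added on demand as cutting planes inside branch-and-bound, which matches the solution strategy the summary describes. Reducing the candidate parent sets, as the summary mentions, directly shrinks the number of variables $x_{v,W}$.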
DOI: 10.1109/TCBB.2021.3092719