An Approach of Epistasis Detection Using Integer Linear Programming Optimizing Bayesian Network
| Published in: | IEEE/ACM Transactions on Computational Biology and Bioinformatics, Volume 19, Issue 5, pp. 2654-2671 |
|---|---|
| Main authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Publication details: | New York: IEEE, 1 September 2022; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| ISSN: | 1545-5963, 1557-9964 |
| Summary: | Developing more effective and accurate methods for detecting epistatic loci in large-scale genomic data is important for applications such as improving crop quality and treating disease. Because Bayesian networks (BNs) are accurate and can model non-linear relationships, they have been widely used to construct networks of SNPs and phenotype traits and thereby to mine epistatic loci. However, BN learning easily becomes trapped in local optima and cannot handle large numbers of SNPs. In this work, we transform the problem of learning a Bayesian network into an integer linear programming (ILP) optimization problem. We use branch-and-bound and cutting-plane algorithms to obtain the globally optimal Bayesian network (ILPBN), and from it the epistatic loci that influence specific phenotype traits. To handle large numbers of SNP loci and further improve efficiency, we use a Markov blanket optimization method to reduce the number of candidate parent nodes for each node. In addition, we calculate the BN score with α-BIC, which is suited to epistasis mining. We use four properties of decomposable BN scoring functions to further reduce the number of candidate parent sets for each node. Experimental results show that ILPBN can handle not only 2-locus and 3-locus epistasis mining but also multi-locus epistasis detection. Finally, we compare ILPBN with several popular epistasis mining algorithms on simulated data and a real age-related macular degeneration (AMD) dataset. The results show that ILPBN achieves better epistasis detection accuracy, F1-score, and false positive rate than the other methods while maintaining efficiency. Availability: Code and dataset are available at http://122.205.95.139/ILPBN/. |
|---|---|
| DOI: | 10.1109/TCBB.2021.3092719 |
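
The summary mentions scoring candidate parent sets with a decomposable score and discarding sets that cannot appear in an optimal network. The sketch below is only an illustration of that general idea, not the authors' code: it uses the plain BIC score (the record does not define α-BIC) and one well-known pruning rule for decomposable scores, namely that a parent set can be dropped when one of its proper subsets scores at least as well. The function names `local_bic` and `pruned_parent_sets`, the toy data, and the `max_size` limit are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations, product


def local_bic(data, child, parents, arities):
    """Plain BIC local score of `child` given `parents` (discrete data, rows are tuples)."""
    n = len(data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    joint_counts = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    # Maximised log-likelihood of the child given each observed parent configuration.
    log_lik = sum(c * math.log(c / parent_counts[cfg]) for (cfg, _), c in joint_counts.items())
    # Free parameters: (child states - 1) per parent configuration.
    n_params = (arities[child] - 1) * math.prod(arities[p] for p in parents)
    return log_lik - 0.5 * math.log(n) * n_params


def pruned_parent_sets(data, child, candidates, arities, max_size=2):
    """Score parent sets by increasing size; drop any set beaten or tied by a proper subset."""
    kept = {}
    for size in range(max_size + 1):
        for parents in combinations(candidates, size):
            score = local_bic(data, child, parents, arities)
            dominated = any(set(q) < set(parents) and s >= score for q, s in kept.items())
            if not dominated:
                kept[parents] = score
    return kept


# Hypothetical toy data: three SNPs with genotypes 0/1/2 and a binary phenotype
# (column 3) that depends only on the interaction of SNP 0 and SNP 1.
grid = list(product(range(3), repeat=3))
data = [(a, b, c, (a + b) % 2) for a, b, c in grid] * 10
arities = [3, 3, 3, 2]

kept = pruned_parent_sets(data, child=3, candidates=[0, 1, 2], arities=arities)
for parents, score in sorted(kept.items(), key=lambda kv: -kv[1]):
    print(parents, round(score, 1))
```

In this toy data the phenotype is a deterministic function of the first two SNPs, so the pair of those SNPs is the highest-scoring parent set and every set containing the unrelated third SNP is pruned. In an ILP formulation of the kind the summary describes, the surviving sets would become the binary decision variables for this node.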