Improving the prediction of DNA-protein binding by integrating multi-scale dense convolutional network with fault-tolerant coding
Accurate prediction of DNA-protein binding (DPB) is of great biological significance for studying the regulatory mechanism of gene expression. In recent years, with the rapid development of deep learning techniques, advanced deep neural networks have been introduced into the field and shown to signi...
| Published in: | Analytical biochemistry Vol. 656; p. 114878 |
|---|---|
| Main authors: | Yin, Yu-Hang; Shen, Long-Chen; Jiang, Yuanhao; Gao, Shang; Song, Jiangning; Yu, Dong-Jun |
| Format: | Journal Article |
| Language: | English |
| Published: | 1 November 2022 |
| ISSN: | 0003-2697, 1096-0309 |
| Online access: | Full text |
| Abstract | Accurate prediction of DNA-protein binding (DPB) is of great biological significance for studying the regulatory mechanism of gene expression. In recent years, with the rapid development of deep learning techniques, advanced deep neural networks have been introduced into the field and shown to significantly improve the prediction performance of DPB. However, these methods are primarily based on DNA sequences measured by ChIP-seq technology and fail to consider possible partial variations of the motif sequences and errors of the sequencing technology itself. To address this, we propose a novel computational method, termed MSDenseNet, which combines a new fault-tolerant coding (FTC) scheme with densely connected deep neural networks. Three factors contribute to the success of MSDenseNet. First, MSDenseNet uses a powerful feature representation approach, which transforms the raw DNA sequence into a fusion coding using the fault-tolerant feature sequence. Second, in terms of network structure, MSDenseNet uses a multi-scale convolution within the dense layer and a multi-scale convolution preceding the dense block, which is shown to significantly improve network performance and accelerate convergence. Third, building upon the advanced deep neural network, MSDenseNet can effectively mine the hidden complex relationships among the internal attributes of the fusion sequence features to enhance the prediction of DPB. Benchmarking experiments on 690 ChIP-seq datasets show that MSDenseNet achieves an average AUC of 0.933 and outperforms the state-of-the-art method. The source code of MSDenseNet is available at https://github.com/csbio-njust-edu/msdensenet. The results show that MSDenseNet can effectively predict DPB. We anticipate that MSDenseNet will be exploited as a powerful tool to facilitate a more exhaustive understanding of DNA-binding proteins and help toward their functional characterization. |
|---|---|
| ArticleNumber | 114878 |
| Author | Yin, Yu-Hang; Shen, Long-Chen; Jiang, Yuanhao; Gao, Shang; Song, Jiangning; Yu, Dong-Jun |
| CitedBy_id | crossref_primary_10_1007_s00371_025_03946_1 crossref_primary_10_1016_j_engappai_2023_106353 crossref_primary_10_1016_j_compbiolchem_2024_108183 crossref_primary_10_31083_j_fbl2812346 crossref_primary_10_3390_math11214439 crossref_primary_10_1109_TCBB_2024_3404136 |
| ContentType | Journal Article |
| Copyright | Copyright © 2022 Elsevier Inc. All rights reserved. |
| DOI | 10.1016/j.ab.2022.114878 |
| DatabaseName | CrossRef; MEDLINE - Academic; AGRICOLA; AGRICOLA - Academic |
| Discipline | Anatomy & Physiology; Chemistry |
| EISSN | 1096-0309 |
| ISICitedReferencesCount | 6 |
| ISSN | 0003-2697 1096-0309 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Language | English |
| PublicationDate | 2022-11-01 |
| PublicationTitle | Analytical biochemistry |
| PublicationYear | 2022 |
| StartPage | 114878 |
| SubjectTerms | chromatin immunoprecipitation; data collection; DNA; gene expression; nucleotide sequences; prediction |
| Title | Improving the prediction of DNA-protein binding by integrating multi-scale dense convolutional network with fault-tolerant coding |
| URI | https://www.proquest.com/docview/2709741485 https://www.proquest.com/docview/2718380876 |
| Volume | 656 |