Improving the prediction of DNA-protein binding by integrating multi-scale dense convolutional network with fault-tolerant coding

Detailed description

Bibliographic details
Published in: Analytical Biochemistry, vol. 656, p. 114878
Main authors: Yin, Yu-Hang; Shen, Long-Chen; Jiang, Yuanhao; Gao, Shang; Song, Jiangning; Yu, Dong-Jun
Format: Journal article
Language: English
Published: 1 November 2022
ISSN: 0003-2697 (print), 1096-0309 (electronic)
Abstract: Accurate prediction of DNA-protein binding (DPB) is of great biological significance for studying the regulatory mechanisms of gene expression. In recent years, with the rapid development of deep learning techniques, advanced deep neural networks have been introduced into the field and shown to significantly improve DPB prediction performance. However, these methods rely primarily on DNA sequences measured by ChIP-seq technology and fail to consider possible partial variations of the motif sequences and errors of the sequencing technology itself. To address this, we propose a novel computational method, termed MSDenseNet, which combines a new fault-tolerant coding (FTC) scheme with densely connected deep neural networks. The success of MSDenseNet can be attributed to three factors. First, MSDenseNet uses a powerful feature representation approach that transforms the raw DNA sequence into a fusion coding using the fault-tolerant feature sequence. Second, in terms of network structure, MSDenseNet uses a multi-scale convolution within the dense layer and a multi-scale convolution preceding the dense block, which is shown to significantly improve network performance and accelerate convergence. Third, building upon the advanced deep neural network, MSDenseNet can effectively mine the hidden complex relationships among the internal attributes of the fusion sequence features to enhance DPB prediction. Benchmarking experiments on 690 ChIP-seq datasets show that MSDenseNet achieves an average AUC of 0.933 and outperforms the state-of-the-art method. The source code of MSDenseNet is available at https://github.com/csbio-njust-edu/msdensenet. The results show that MSDenseNet can effectively predict DPB. We anticipate that MSDenseNet will serve as a powerful tool to facilitate a more thorough understanding of DNA-binding proteins and help toward their functional characterization.
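
Note on the reported metric: the abstract summarizes performance as an average AUC over many datasets. The following is a minimal Python sketch of how such a figure could be computed, assuming the average is an unweighted mean of per-dataset ROC AUC values (the record does not state the weighting). The function names `roc_auc` and `mean_auc` and the toy data are illustrative only and are not taken from the MSDenseNet source code.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 1 for bound sequences, 0 for unbound ones.
    scores: predicted binding scores, higher meaning more likely bound.
    Tied scores receive the average of the ranks they span.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative example")

    # Sort indices by score, then assign 1-based ranks with ties averaged.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


def mean_auc(datasets) -> float:
    """Unweighted mean of per-dataset AUCs; datasets is a list of (labels, scores)."""
    aucs = [roc_auc(labels, scores) for labels, scores in datasets]
    return sum(aucs) / len(aucs)


# Toy data only; these are not results from the paper.
toy = [
    ([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.3]),  # positives and negatives fully separated
    ([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.3]),  # one positive ranked below one negative
]
print(mean_auc(toy))
```

An AUC of 1.0 means every bound sequence is scored above every unbound one, and 0.5 corresponds to random ranking. An unweighted mean treats each dataset equally regardless of its size, which is one plausible reading of "average AUC" but remains an assumption here.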
Copyright: © 2022 Elsevier Inc. All rights reserved.
DOI: 10.1016/j.ab.2022.114878
Discipline: Anatomy & Physiology; Chemistry
Peer reviewed: yes
Scholarly: yes
StartPage 114878
SubjectTerms chromatin immunoprecipitation
data collection
DNA
gene expression
nucleotide sequences
prediction
Title Improving the prediction of DNA-protein binding by integrating multi-scale dense convolutional network with fault-tolerant coding
URI https://www.proquest.com/docview/2709741485
https://www.proquest.com/docview/2718380876
Volume 656