EMD: an ensemble algorithm for discovering regulatory motifs in DNA sequences

Published in: BMC Bioinformatics, Volume 7, Issue 1, p. 342
Main authors: Hu, Jianjun; Yang, Yifeng D; Kihara, Daisuke
Format: Journal Article
Language: English
Published: England: BioMed Central (BMC), 13 July 2006
ISSN: 1471-2105
Abstract

Background: Understanding gene regulatory networks has become one of the central research problems in bioinformatics. More than thirty algorithms have been proposed to identify DNA regulatory sites during the past thirty years. However, the prediction accuracy of these algorithms is still quite low. Ensemble algorithms have emerged as an effective strategy in bioinformatics for improving prediction accuracy by exploiting the synergetic prediction capability of multiple algorithms.

Results: We proposed a novel clustering-based ensemble algorithm named EMD for de novo motif discovery, which combines multiple predictions from multiple runs of one or more base component algorithms. The ensemble approach is applied to the motif discovery problem for the first time. The algorithm is tested on a benchmark dataset generated from E. coli RegulonDB. The EMD algorithm achieved a 22.4% improvement in nucleotide-level prediction accuracy over the best stand-alone component algorithm. The advantage of the EMD algorithm is more significant for shorter input sequences, but most importantly, it always outperforms or at least stays at the same performance level as the stand-alone component algorithms, even for longer sequences.

Conclusion: We proposed an ensemble approach for the motif discovery problem that takes advantage of the availability of a large number of motif discovery programs. We have shown that the ensemble approach is an effective strategy for improving both sensitivity and specificity, and thus the accuracy of the prediction. An advantage of the EMD algorithm is its flexibility, in the sense that a new powerful algorithm can easily be added to the system.
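The abstract does not spell out how the predictions are combined or how accuracy is scored. As an illustration only, the following Python sketch shows one simple way to combine site predictions from several runs: each run votes for the nucleotide positions it covers, and the best-supported window per sequence is reported. This position-voting scheme is an assumption made for illustration and is not the authors' clustering procedure. The names `Site`, `vote_counts`, `consensus_sites`, and `nucleotide_accuracy` are hypothetical, and "specificity" is approximated here by the positive predictive value, which is also an assumption.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# A predicted binding site: (sequence_id, start, width). Hypothetical layout.
Site = Tuple[str, int, int]


def covered_positions(sites: List[Site]) -> Set[Tuple[str, int]]:
    """Return the set of (sequence_id, position) pairs covered by the sites."""
    return {(seq, p) for seq, start, w in sites for p in range(start, start + w)}


def vote_counts(runs: List[List[Site]]) -> Dict[str, Counter]:
    """Count, per sequence, how many runs cover each nucleotide position."""
    votes: Dict[str, Counter] = {}
    for run in runs:
        # A run votes at most once per position, even if its sites overlap.
        for seq, pos in covered_positions(run):
            votes.setdefault(seq, Counter())[pos] += 1
    return votes


def consensus_sites(runs: List[List[Site]], width: int) -> List[Site]:
    """Pick, per sequence, the window of `width` with the most votes.

    Illustrative voting scheme only (ties go to the leftmost window).
    """
    votes = vote_counts(runs)
    result: List[Site] = []
    for seq in sorted(votes):
        counter = votes[seq]
        lo, hi = min(counter), max(counter)
        last_start = max(lo, hi - width + 1)
        best_start, best_score = lo, -1
        for start in range(lo, last_start + 1):
            # Counter returns 0 for positions no run covered.
            score = sum(counter[p] for p in range(start, start + width))
            if score > best_score:
                best_start, best_score = start, score
        result.append((seq, best_start, width))
    return result


def nucleotide_accuracy(predicted: List[Site], known: List[Site]) -> Tuple[float, float]:
    """Nucleotide-level sensitivity and positive predictive value."""
    pred, true = covered_positions(predicted), covered_positions(known)
    tp = len(pred & true)
    sensitivity = tp / len(true) if true else 0.0
    ppv = tp / len(pred) if pred else 0.0
    return sensitivity, ppv


# Made-up example: three runs roughly agree, one run is an outlier.
runs = [[("s1", 10, 6)], [("s1", 12, 6)], [("s1", 11, 6)], [("s1", 40, 6)]]
consensus = consensus_sites(runs, width=6)
```

In this toy input the three overlapping predictions outvote the isolated one, which is the intuition behind combining multiple runs; a real ensemble would also need to handle several motifs per sequence and sites of differing widths.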
Author affiliations:
1 Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
2 Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
3 Markey Center for Structural Biology, Purdue University, West Lafayette, IN, 47907, USA
4 The Bindley Bioscience Center, Discovery Park, Purdue University, West Lafayette, IN, 47907, USA
Copyright: © 2006 Hu et al; licensee BioMed Central Ltd.
DOI: 10.1186/1471-2105-7-342
PMCID: PMC1539026
Publication types: Journal Article; Research Support, N.I.H., Extramural; Research Support, U.S. Gov't, Non-P.H.S.
Funding: NIGMS NIH HHS, grants R01 GM075004 and U24 GM077905
License: This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
PMID: 16839417
Subjects:
Algorithms
Amino Acid Motifs
Cluster Analysis
Computational Biology - methods
DNA - chemistry
Escherichia coli
Escherichia coli - metabolism
Gene Expression Profiling
Pattern Recognition, Automated
Phylogeny
Sensitivity and Specificity
Sequence Alignment
Sequence Analysis, DNA - methods
Sequence Analysis, Protein - methods