EMD: an ensemble algorithm for discovering regulatory motifs in DNA sequences
Understanding gene regulatory networks has become one of the central research problems in bioinformatics. More than thirty algorithms have been proposed to identify DNA regulatory sites during the past thirty years. However, the prediction accuracy of these algorithms is still quite low. Ensemble al...
| Published in: | BMC bioinformatics Vol. 7; no. 1; p. 342 |
|---|---|
| Main authors: | Hu, Jianjun; Yang, Yifeng D; Kihara, Daisuke |
| Format: | Journal Article |
| Language: | English |
| Published: | England, BioMed Central / BMC, 13.07.2006 |
| ISSN: | 1471-2105 |
| Abstract | Understanding gene regulatory networks has become one of the central research problems in bioinformatics. More than thirty algorithms have been proposed to identify DNA regulatory sites during the past thirty years. However, the prediction accuracy of these algorithms is still quite low. Ensemble algorithms have emerged as an effective strategy in bioinformatics for improving the prediction accuracy by exploiting the synergetic prediction capability of multiple algorithms.
We proposed a novel clustering-based ensemble algorithm named EMD for de novo motif discovery by combining multiple predictions from multiple runs of one or more base component algorithms. The ensemble approach is applied to the motif discovery problem for the first time. The algorithm is tested on a benchmark dataset generated from E. coli RegulonDB. The EMD algorithm has achieved 22.4% improvement in terms of the nucleotide level prediction accuracy over the best stand-alone component algorithm. The advantage of the EMD algorithm is more significant for shorter input sequences, but most importantly, it always outperforms or at least stays at the same performance level of the stand-alone component algorithms even for longer sequences.
We proposed an ensemble approach for the motif discovery problem by taking advantage of the availability of a large number of motif discovery programs. We have shown that the ensemble approach is an effective strategy for improving both sensitivity and specificity, thus the accuracy of the prediction. The advantage of the EMD algorithm is its flexibility in the sense that a new powerful algorithm can be easily added to the system. |
|---|---|
| ArticleNumber | 342 |
| Author | Hu, Jianjun; Yang, Yifeng D; Kihara, Daisuke |
| AuthorAffiliation | 1 Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA; 2 Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA; 3 Markey Center for Structural Biology, Purdue University, West Lafayette, IN, 47907, USA; 4 The Bindley Bioscience Center, Discovery Park, Purdue University, West Lafayette, IN, 47907, USA |
| Author_xml | – sequence: 1 givenname: Jianjun surname: Hu fullname: Hu, Jianjun – sequence: 2 givenname: Yifeng D surname: Yang fullname: Yang, Yifeng D – sequence: 3 givenname: Daisuke surname: Kihara fullname: Kihara, Daisuke |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/16839417$$D View this record in MEDLINE/PubMed |
| CitedBy_id | crossref_primary_10_3390_genes15060676 crossref_primary_10_1093_bioinformatics_btp090 crossref_primary_10_1186_s12864_018_5148_1 crossref_primary_10_1186_1471_2105_12_238 crossref_primary_10_1002_prot_23169 crossref_primary_10_1155_2013_283129 crossref_primary_10_1186_1471_2105_10_321 crossref_primary_10_1109_TCBB_2015_2496261 crossref_primary_10_1016_j_ress_2012_03_008 crossref_primary_10_1093_bib_bbv022 crossref_primary_10_1111_pbi_13550 crossref_primary_10_1002_prot_22082 crossref_primary_10_1186_1471_2105_9_S9_S6 crossref_primary_10_1093_nar_gkp248 crossref_primary_10_1186_1471_2105_8_S7_S21 crossref_primary_10_1093_nar_gkv300 crossref_primary_10_1109_TR_2015_2427156 crossref_primary_10_1128_MMBR_00037_08 crossref_primary_10_1089_cmb_2018_0113 crossref_primary_10_1186_1471_2105_14_9 crossref_primary_10_1186_1471_2229_13_42 crossref_primary_10_1093_bioinformatics_btn420 crossref_primary_10_1155_2018_3837060 crossref_primary_10_1186_1471_2105_13_317 crossref_primary_10_1145_3382078 |
| ContentType | Journal Article |
| Copyright | Copyright © 2006 Hu et al; licensee BioMed Central Ltd. |
| DOI | 10.1186/1471-2105-7-342 |
| DatabaseName | CrossRef; MEDLINE; MEDLINE (Ovid); PubMed; Biotechnology Research Abstracts; Nucleic Acids Abstracts; Technology Research Database; Engineering Research Database; Biotechnology and BioEngineering Abstracts; MEDLINE - Academic; PubMed Central (Full Participant titles); DOAJ Directory of Open Access Journals |
| DatabaseTitle | CrossRef MEDLINE Medline Complete MEDLINE with Full Text PubMed MEDLINE (Ovid) Engineering Research Database Biotechnology Research Abstracts Technology Research Database Nucleic Acids Abstracts Biotechnology and BioEngineering Abstracts MEDLINE - Academic |
| DatabaseTitleList | MEDLINE Engineering Research Database MEDLINE - Academic |
| Database_xml | – sequence: 1 dbid: DOA name: DOAJ Directory of Open Access Journals url: https://www.doaj.org/ sourceTypes: Open Website – sequence: 2 dbid: NPM name: PubMed url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database – sequence: 3 dbid: 7X8 name: MEDLINE - Academic url: https://search.proquest.com/medline sourceTypes: Aggregation Database |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Biology |
| EISSN | 1471-2105 |
| EndPage | 342 |
| ExternalDocumentID | oai_doaj_org_article_e43149765b74482781e31f50b50b3d59 PMC1539026 16839417 10_1186_1471_2105_7_342 |
| Genre | Research Support, U.S. Gov't, Non-P.H.S.; Journal Article; Research Support, N.I.H., Extramural |
| GrantInformation_xml | – fundername: NIGMS NIH HHS grantid: R01 GM075004 – fundername: NIGMS NIH HHS grantid: U24 GM077905 – fundername: NIGMS NIH HHS grantid: R01 GM-075004 |
| IEDL.DBID | DOA |
| ISICitedReferencesCount | 38 |
| ISSN | 1471-2105 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Language | English |
| License | This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
| OpenAccessLink | https://doaj.org/article/e43149765b74482781e31f50b50b3d59 |
| PMID | 16839417 |
| PQID | 19479591 |
| PQPubID | 23462 |
| PageCount | 1 |
| ParticipantIDs | doaj_primary_oai_doaj_org_article_e43149765b74482781e31f50b50b3d59 pubmedcentral_primary_oai_pubmedcentral_nih_gov_1539026 proquest_miscellaneous_68731279 proquest_miscellaneous_19479591 pubmed_primary_16839417 crossref_citationtrail_10_1186_1471_2105_7_342 crossref_primary_10_1186_1471_2105_7_342 |
| PublicationCentury | 2000 |
| PublicationDate | 2006-07-13 |
| PublicationDateYYYYMMDD | 2006-07-13 |
| PublicationDate_xml | – month: 07 year: 2006 text: 2006-07-13 day: 13 |
| PublicationDecade | 2000 |
| PublicationPlace | England |
| PublicationPlace_xml | – name: England – name: London |
| PublicationTitle | BMC bioinformatics |
| PublicationTitleAlternate | BMC Bioinformatics |
| PublicationYear | 2006 |
| Publisher | BioMed Central BMC |
| Publisher_xml | – name: BioMed Central – name: BMC |
| SSID | ssj0017805 |
| Score | 2.0653965 |
| SourceID | doaj pubmedcentral proquest pubmed crossref |
| SourceType | Open Website Open Access Repository Aggregation Database Index Database Enrichment Source |
| StartPage | 342 |
| SubjectTerms | Algorithms; Amino Acid Motifs; Cluster Analysis; Computational Biology - methods; DNA - chemistry; Escherichia coli; Escherichia coli - metabolism; Gene Expression Profiling; Pattern Recognition, Automated; Phylogeny; Sensitivity and Specificity; Sequence Alignment; Sequence Analysis, DNA - methods; Sequence Analysis, Protein - methods |
| Title | EMD: an ensemble algorithm for discovering regulatory motifs in DNA sequences |
| URI | https://www.ncbi.nlm.nih.gov/pubmed/16839417 https://www.proquest.com/docview/19479591 https://www.proquest.com/docview/68731279 https://pubmed.ncbi.nlm.nih.gov/PMC1539026 https://doaj.org/article/e43149765b74482781e31f50b50b3d59 |
| Volume | 7 |
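
The abstract describes EMD as combining site predictions from multiple runs of one or more component algorithms and reports results at the nucleotide level. A minimal Python sketch of that general idea follows. It assumes a simple per-position vote across runs, which is a stand-in and not the authors' clustering procedure (this record does not describe it in detail). The names `vote_sites`, `nucleotide_metrics`, and `min_votes` are hypothetical, and the sensitivity and specificity formulas are the standard confusion-matrix definitions, which may differ from the exact measures used in the paper.

```python
from typing import Dict, Iterable, List, Set, Tuple

Site = Tuple[int, int]  # half-open interval (start, end) on one sequence


def vote_sites(runs: Iterable[Iterable[Site]], seq_len: int, min_votes: int) -> List[Site]:
    """Merge predictions from several runs by per-position voting (illustrative only)."""
    votes = [0] * seq_len
    for sites in runs:
        covered: Set[int] = set()
        for start, end in sites:
            covered.update(range(max(start, 0), min(end, seq_len)))
        for pos in covered:  # each run votes at most once per position
            votes[pos] += 1

    # Keep maximal stretches of positions supported by at least min_votes runs.
    consensus: List[Site] = []
    start = None
    for pos, count in enumerate(votes):
        if count >= min_votes and start is None:
            start = pos
        elif count < min_votes and start is not None:
            consensus.append((start, pos))
            start = None
    if start is not None:
        consensus.append((start, seq_len))
    return consensus


def _positions(sites: Iterable[Site]) -> Set[int]:
    """Expand intervals into the set of nucleotide positions they cover."""
    return {pos for start, end in sites for pos in range(start, end)}


def nucleotide_metrics(predicted: Iterable[Site], known: Iterable[Site], seq_len: int) -> Dict[str, float]:
    """Nucleotide-level counts with standard sensitivity and specificity (assumed definitions)."""
    pred, true = _positions(predicted), _positions(known)
    tp = len(pred & true)
    fp = len(pred - true)
    fn = len(true - pred)
    tn = seq_len - tp - fp - fn
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


if __name__ == "__main__":
    # Three hypothetical runs on a 60-nucleotide sequence.
    runs = [[(10, 18)], [(12, 20)], [(11, 19), (40, 48)]]
    merged = vote_sites(runs, seq_len=60, min_votes=2)
    print(merged)
    print(nucleotide_metrics(merged, known=[(12, 20)], seq_len=60))
```

In this toy input, the stray prediction at (40, 48) is supported by only one run and is dropped, while the overlapping region is kept as (11, 19). A vote threshold is the simplest way to show why agreement between runs can raise specificity without discarding the sites that most runs recover.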