Randomized algorithms for motif detection
| Published in: | Journal of bioinformatics and computational biology, Volume 3, Issue 5, p. 1039 |
|---|---|
| Main authors: | Wang, Lusheng; Dong, Liang |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Singapore, 1 October 2005 |
| ISSN: | 0219-7200 |
| Abstract | Motif detection for DNA sequences has many important applications in biological studies, e.g. locating binding sites and regulatory signals, designing genetic probes, etc. In this paper, we propose a randomized algorithm, design an improved EM algorithm, and combine them to form a software tool.
(1) We design a randomized algorithm for the consensus pattern problem. We can show that, with high probability, our randomized algorithm finds a pattern in polynomial time with cost error at most ε × l for each string, where l is the length of the motif and ε can be any positive number given by the user. (2) We design an improved EM algorithm that outperforms the original EM algorithm. (3) We develop a software tool, MotifDetector, that uses our randomized algorithm to find good seeds and uses the improved EM algorithm to do local search. We compare MotifDetector with Buhler and Tompa's PROJECTION, which is considered to be the best known software for motif detection. Simulations show that MotifDetector is slower than PROJECTION when the pattern length is relatively small, and outperforms PROJECTION when the pattern length becomes large.
MotifDetector is available for free at http://www.cs.cityu.edu.hk/~lwang/software/motif/index.html, subject to copyright restrictions. |
|---|---|
| Author | Dong, Liang; Wang, Lusheng |
| Author details | 1. Wang, Lusheng (lwang@cs.cityu.edu.hk), Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong, P. R. China; 2. Dong, Liang |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/16278946$$D View this record in MEDLINE/PubMed |
| CitedBy_id | crossref_primary_10_1016_j_jcss_2011_01_003 crossref_primary_10_3390_a6040636 crossref_primary_10_3390_computation9120146 crossref_primary_10_1137_080720401 crossref_primary_10_1109_TCBB_2011_21 crossref_primary_10_1007_s00453_014_9952_y crossref_primary_10_1016_j_dib_2020_105216 crossref_primary_10_1137_080739069 crossref_primary_10_1145_1921659_1921672 |
| ContentType | Journal Article |
| DOI | 10.1142/s0219720005001508 |
| Discipline | Biology |
| ExternalDocumentID | 16278946 |
| Genre | Research Support, Non-U.S. Gov't; Journal Article |
| ISSN | 0219-7200 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 5 |
| Language | English |
| PMID | 16278946 |
| PublicationCentury | 2000 |
| PublicationDate | 2005-Oct 20051001 |
| PublicationDateYYYYMMDD | 2005-10-01 |
| PublicationDate_xml | – month: 10 year: 2005 text: 2005-Oct |
| PublicationDecade | 2000 |
| PublicationPlace | Singapore |
| PublicationPlace_xml | – name: Singapore |
| PublicationTitle | Journal of bioinformatics and computational biology |
| PublicationTitleAlternate | J Bioinform Comput Biol |
| PublicationYear | 2005 |
| StartPage | 1039 |
| SubjectTerms | Algorithms; Conserved Sequence; Data Interpretation, Statistical; DNA - chemistry; DNA - genetics; Likelihood Functions; Models, Genetic; Models, Statistical; Sequence Alignment - methods; Sequence Analysis, DNA - methods; Sequence Homology, Nucleic Acid |
| Title | Randomized algorithms for motif detection |
| URI | https://www.ncbi.nlm.nih.gov/pubmed/16278946 https://www.proquest.com/docview/68783524 |
| Volume | 3 |
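
The abstract describes a two-stage scheme: random sampling produces seed patterns, and a local search then improves each seed. The following is a minimal Python sketch of that general idea for the consensus pattern problem (choose a length-l pattern and one length-l window per sequence so that the total Hamming distance is small). It is an illustration under stated assumptions, not the authors' algorithm and not MotifDetector's code: the function names, the `trials` and `sample_size` parameters, and the refinement step are all hypothetical, and the refinement uses a simple hard-assignment consensus update as a stand-in for the improved EM algorithm, whose details are not given in this record.

```python
import random
from collections import Counter


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def consensus(windows):
    """Column-wise majority string of equal-length windows.

    Ties are broken alphabetically so the result is deterministic.
    """
    return "".join(
        min(Counter(col).items(), key=lambda kv: (-kv[1], kv[0]))[0]
        for col in zip(*windows)
    )


def best_window(seq, pattern):
    """Substring of seq, of len(pattern), closest to pattern in Hamming distance."""
    l = len(pattern)
    return min(
        (seq[i:i + l] for i in range(len(seq) - l + 1)),
        key=lambda w: hamming(w, pattern),
    )


def total_cost(seqs, pattern):
    """Consensus-pattern cost: sum over sequences of the best window's distance."""
    return sum(hamming(best_window(s, pattern), pattern) for s in seqs)


def random_seed_search(seqs, l, trials=200, sample_size=3, rng=None):
    """Hypothetical seed-and-refine search; assumes every sequence has length >= l."""
    rng = rng or random.Random(0)
    best_pattern, best_cost = None, float("inf")
    for _ in range(trials):
        # Seed: consensus of a few windows drawn uniformly at random.
        sample = []
        for _ in range(sample_size):
            s = rng.choice(seqs)
            i = rng.randrange(len(s) - l + 1)
            sample.append(s[i:i + l])
        pattern = consensus(sample)
        # Local search (stand-in for EM): re-align, re-estimate, keep only
        # strict improvements, so the loop terminates.
        while True:
            refined = consensus([best_window(s, pattern) for s in seqs])
            if total_cost(seqs, refined) >= total_cost(seqs, pattern):
                break
            pattern = refined
        cost = total_cost(seqs, pattern)
        if cost < best_cost:
            best_pattern, best_cost = pattern, cost
    return best_pattern, best_cost
```

Calling `random_seed_search(seqs, l)` on a list of DNA strings returns a candidate pattern and its total cost. This sketch carries no approximation guarantee; the ε × l bound quoted in the abstract applies to the authors' algorithm only.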