RNA Sampler: a new sampling based algorithm for common RNA secondary structure prediction and structural alignment
| Published in: | Bioinformatics Vol. 23; No. 15; pp. 1883-1891 |
|---|---|
| Main authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Oxford, Oxford University Press, 1 August 2007, Oxford Publishing Limited (England) |
| Subjects: | |
| ISSN: | 1367-4803, 1367-4811, 1460-2059 |
| Online access: | Full text |
| Abstract: | Motivation: Non-coding RNA genes and RNA structural regulatory motifs play important roles in gene regulation and other cellular functions. They are often characterized by specific secondary structures that are critical to their functions and are often conserved in phylogenetically or functionally related sequences. Predicting common RNA secondary structures in multiple unaligned sequences remains a challenge in bioinformatics research. Methods and Results: We present a new sampling-based algorithm to predict common RNA secondary structures in multiple unaligned sequences. Our algorithm finds the common structure between two sequences by probabilistically sampling aligned stems based on stem conservation, which is calculated from intrasequence base pairing probabilities and intersequence base alignment probabilities. It iteratively updates these probabilities based on the sampled structures and then recalculates stem conservation using the updated probabilities. The iterative process terminates upon convergence of the sampled structures. We extend the algorithm to multiple sequences by a consistency-based method, which iteratively incorporates and reinforces consistent structure information from pairwise comparisons into consensus structures. The algorithm has no limitation on predicting pseudoknots. In extensive testing on real sequence data, our algorithm outperformed other leading RNA structure prediction methods in both sensitivity and specificity while running at a reasonably fast speed. It also generated better structural alignments than other programs for sequences across a wide range of identities; these alignments more accurately represent RNA secondary structure conservation. Availability: The algorithm is implemented in a C program, RNA Sampler, which is available at http://ural.wustl.edu/software.html. Contact: xingxu@ural.wustl.edu and stormo@genetics.wustl.edu. Supplementary information: Supplementary data are available at Bioinformatics online. |
| Notes: | Associate Editor: John Quackenbush. istex:B202C3A5F2A32F4E63822014CC5253C7880199BC; ark:/67375/HXZ-RK9KZ7QH-H |
| DOI: | 10.1093/bioinformatics/btm272 |