RNA Sampler: a new sampling based algorithm for common RNA secondary structure prediction and structural alignment

Detailed description
Bibliographic details
Published in: Bioinformatics, Vol. 23, No. 15, pp. 1883-1891
Main authors: Xu, Xing; Ji, Yongmei; Stormo, Gary D.
Format: Journal Article
Language: English
Published: Oxford: Oxford University Press, 1 August 2007
Oxford Publishing Limited (England)
ISSN: 1367-4803, 1367-4811, 1460-2059
Online access: Full text
Description
Abstract:
Motivation: Non-coding RNA genes and RNA structural regulatory motifs play important roles in gene regulation and other cellular functions. They are often characterized by specific secondary structures that are critical to their functions and are often conserved in phylogenetically or functionally related sequences. Predicting common RNA secondary structures in multiple unaligned sequences remains a challenge in bioinformatics research.
Methods and Results: We present a new sampling-based algorithm to predict common RNA secondary structures in multiple unaligned sequences. Our algorithm finds the common structure between two sequences by probabilistically sampling aligned stems based on stem conservation calculated from intrasequence base pairing probabilities and intersequence base alignment probabilities. It iteratively updates these probabilities based on sampled structures and subsequently recalculates stem conservation using the updated probabilities. The iterative process terminates upon convergence of the sampled structures. We extend the algorithm to multiple sequences by a consistency-based method, which iteratively incorporates and reinforces consistent structure information from pairwise comparisons into consensus structures. The algorithm has no limitation on predicting pseudoknots. In extensive testing on real sequence data, our algorithm outperformed other leading RNA structure prediction methods in both sensitivity and specificity with a reasonably fast speed. It also generated better structural alignments than other programs in sequences of a wide range of identities, which more accurately represent the RNA secondary structure conservations.
Availability: The algorithm is implemented in a C program, RNA Sampler, which is available at http://ural.wustl.edu/software.html
Contact: xingxu@ural.wustl.edu and stormo@genetics.wustl.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
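The sample-then-update loop described in the abstract can be illustrated with a toy sketch. This is not the RNA Sampler implementation (which is a C program) and it does not reproduce the paper's scoring: the conservation scores below are made-up numbers rather than values derived from base pairing and alignment probabilities, and the update rule, the majority-vote convergence test, and every name (`positions`, `sample_structure`, `iterate`) are assumptions chosen only to show the general shape of the idea. A stem is written as a hypothetical `(i, j, n)` triple: `n` stacked pairs whose outermost pair is `(i, j)`. Stems are treated as compatible whenever they share no base, so crossing (pseudoknotted) stems are allowed.

```python
import random


def positions(stem):
    """Return the set of base indices occupied by a stem (i, j, n)."""
    i, j, n = stem
    return set(range(i, i + n)) | set(range(j - n + 1, j + 1))


def sample_structure(scores, rng):
    """Draw mutually compatible stems, each pick weighted by its score."""
    remaining = dict(scores)
    chosen = []
    while remaining and sum(remaining.values()) > 0:
        stems = list(remaining)
        weights = [remaining[s] for s in stems]
        pick = rng.choices(stems, weights=weights, k=1)[0]
        chosen.append(pick)
        used = positions(pick)
        # Keep only stems that do not reuse a base of the picked stem.
        remaining = {s: w for s, w in remaining.items()
                     if s != pick and not (positions(s) & used)}
    return frozenset(chosen)


def iterate(scores, n_samples=200, max_rounds=50, seed=0):
    """Resample until the majority structure stops changing (toy criterion)."""
    rng = random.Random(seed)
    previous = None
    for _ in range(max_rounds):
        counts = {s: 0 for s in scores}
        for _ in range(n_samples):
            for s in sample_structure(scores, rng):
                counts[s] += 1
        # Assumed update rule: reinforce stems that were sampled often.
        scores = {s: scores[s] * counts[s] / n_samples for s in scores}
        current = frozenset(s for s, c in counts.items() if c > n_samples / 2)
        if current == previous:
            return current
        previous = current
    return previous


# Made-up scores: the first two stems overlap, the third is independent.
toy_scores = {(0, 20, 4): 0.9, (2, 18, 3): 0.1, (6, 14, 3): 0.8}
result = iterate(toy_scores)
print(sorted(result))
```

In this toy input the two overlapping stems compete for the same bases, and the reinforcement step makes the higher-scoring one dominate after a round or two, while the independent stem is sampled every time.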
Bibliography: To whom correspondence should be addressed.
istex:B202C3A5F2A32F4E63822014CC5253C7880199BC
ark:/67375/HXZ-RK9KZ7QH-H
Associate Editor: John Quackenbush
DOI: 10.1093/bioinformatics/btm272