RNA Sampler: a new sampling based algorithm for common RNA secondary structure prediction and structural alignment

Bibliographic details
Published in:	Bioinformatics Vol. 23; No. 15; pp. 1883-1891
Main authors:	Xu, Xing; Ji, Yongmei; Stormo, Gary D.
Format:	Journal Article
Language:	English
Published:	Oxford: Oxford University Press, 1 August 2007; Oxford Publishing Limited (England)
Subjects:	Algorithms; Base Sequence; Bioinformatics; Biological and medical sciences; Computer Simulation; Databases, Genetic; Fundamental and applied biological sciences. Psychology; General aspects; Genetics; Information Storage and Retrieval - methods; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects); Models, Chemical; Models, Molecular; Molecular Sequence Data; Nucleic Acid Conformation; RNA - chemistry; RNA - genetics; RNA - ultrastructure; Sample Size; Sequence Alignment - methods; Sequence Analysis, RNA - methods; RNA; Conservation; Prediction; Samplings; Regulator gene; C language; Secondary structure; Algorithm; Original document; Alignment; Regulation(control); Comparative study
ISSN:	1367-4803, 1367-4811, 1460-2059

Description
Abstract:	Motivation: Non-coding RNA genes and RNA structural regulatory motifs play important roles in gene regulation and other cellular functions. They are often characterized by specific secondary structures that are critical to their functions and are often conserved in phylogenetically or functionally related sequences. Predicting common RNA secondary structures in multiple unaligned sequences remains a challenge in bioinformatics research. Methods and Results: We present a new sampling based algorithm to predict common RNA secondary structures in multiple unaligned sequences. Our algorithm finds the common structure between two sequences by probabilistically sampling aligned stems based on stem conservation calculated from intrasequence base pairing probabilities and intersequence base alignment probabilities. It iteratively updates these probabilities based on sampled structures and subsequently recalculates stem conservation using the updated probabilities. The iterative process terminates upon convergence of the sampled structures. We extend the algorithm to multiple sequences by a consistency-based method, which iteratively incorporates and reinforces consistent structure information from pairwise comparisons into consensus structures. The algorithm has no limitation on predicting pseudoknots. In extensive testing on real sequence data, our algorithm outperformed other leading RNA structure prediction methods in both sensitivity and specificity with a reasonably fast speed. It also generated better structural alignments than other programs in sequences of a wide range of identities, which more accurately represent the RNA secondary structure conservations. Availability: The algorithm is implemented in a C program, RNA Sampler, which is available at http://ural.wustl.edu/software.html Contact: xingxu@ural.wustl.edu and stormo@genetics.wustl.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Bibliography:	Associate Editor: John Quackenbush; istex:B202C3A5F2A32F4E63822014CC5253C7880199BC; ark:/67375/HXZ-RK9KZ7QH-H
DOI:	10.1093/bioinformatics/btm272
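
The abstract describes the method only at a high level: stems are sampled probabilistically according to a conservation score, the underlying probabilities are updated from the sampled structures, and the loop stops when the samples converge. The toy sketch below illustrates that general sample-and-update pattern and nothing more. It is not the authors' implementation (RNA Sampler is a C program), and every name and rule in it is an assumption made for illustration: the `score` field standing in for stem conservation, the position-overlap conflict test, the frequency-based update, and the stopping tolerance.

```python
import random


def conflicts(a, b):
    """Assumed rule: two stems conflict if they share any sequence position."""
    return bool(set(a["positions"]) & set(b["positions"]))


def sample_stems(stems, rng):
    """Draw one set of mutually compatible stems.

    Stems are picked one at a time with probability proportional to their
    current score; every stem overlapping the pick is then discarded.
    """
    chosen = []
    candidates = [s for s in stems if s["score"] > 0]
    while candidates:
        weights = [s["score"] for s in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.append(pick)
        # The pick overlaps itself, so it is removed here as well.
        candidates = [s for s in candidates if not conflicts(s, pick)]
    return chosen


def iterate(stems, n_samples=200, tol=0.01, max_rounds=50, seed=0):
    """Repeat sampling and replace each score by its sampled frequency.

    Stops when no score changes by more than `tol`, or after `max_rounds`.
    This update rule is a hypothetical stand-in for the probability update
    mentioned in the abstract.
    """
    rng = random.Random(seed)
    for _ in range(max_rounds):
        counts = {s["name"]: 0 for s in stems}
        for _ in range(n_samples):
            for s in sample_stems(stems, rng):
                counts[s["name"]] += 1
        delta = 0.0
        for s in stems:
            freq = counts[s["name"]] / n_samples
            delta = max(delta, abs(freq - s["score"]))
            s["score"] = freq
        if delta < tol:
            break
    return stems


# Hypothetical input: stems A and B compete for position 3; C is independent.
example_stems = [
    {"name": "A", "positions": [1, 2, 3, 20, 21, 22], "score": 0.8},
    {"name": "B", "positions": [3, 4, 5, 15, 16, 17], "score": 0.3},
    {"name": "C", "positions": [30, 31, 40, 41], "score": 0.6},
]
```

Calling `iterate(example_stems)` updates the scores in place. Because A and B overlap, every sample contains exactly one of them, while C, which conflicts with nothing, appears in every sample.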