Energy landscape analysis for regulatory RNA finding using scalable distributed cyberinfrastructure

Bibliographic details
Published in: Concurrency and computation Vol. 23; No. 17; pp. 2292-2304
Main authors: Kim, Joohyun, Huang, Wei, Maddineni, Sharath, Aboul-ela, Fareed, Jha, Shantenu
Format: Journal Article
Language: English
Published: Chichester, UK: John Wiley & Sons, Ltd, 10 December 2011
ISSN: 1532-0626, 1532-0634
Description
Summary: We investigate the folding energy landscape (EL) of a given RNA sequence through Boltzmann ensemble (BE) sampling of RNA secondary structures. The ensemble of sampled structures is used to derive distributions of energies and of base‐pair distances between pairs of configurations. We identify structural features that can be utilized for RNA gene finding. Characterizing the EL through BE sampling of secondary structures is computationally demanding and involves multiple heterogeneous stages. We develop the Distributed Adaptive Runtime Environment (DARE) to address these computational requirements. DARE is built upon an extensible and interoperable pilot‐job and supports the concurrent execution of a broad range of task sizes across a range of infrastructure. It is used to investigate two RNA systems of different sizes: S‐adenosyl methionine (SAM) binding RNA sequences known as SAM‐I riboswitches, and the S gene of the bovine coronavirus (BCoV) RNA genome. We demonstrate how the implementation reduces the total time to solution as RNA length, the number of sequences investigated, and the number of sampled structures increase. The distributions of energies and base‐pair distances reveal variations in folding dynamics and pathways among the SAM riboswitch sequences. Our results for BCoV RNA genome sequences also indicate that folding is sensitive to coding‐neutral variations in sequence. We search for a characteristic motif from within the SAM‐I consensus structure, a four‐way junction, among BE sampled structures for all 2910 SAM‐I sequences identified from Rfam (the curated ncRNA family database). We find that BE sampling provides insight into the variations in conformational distribution among sequences of the same ncRNA family. Therefore, BE sampling of secondary structures is a viable pre‐processing or post‐processing tool to complement comparative sequence analysis. The understanding gained shows how appropriately designed cyberinfrastructure can provide new insight into RNA folding and structure formation. Copyright © 2011 John Wiley & Sons, Ltd.
Bibliography: ark:/67375/WNG-N2C6JDBX-3
istex:C06975D57FCA7300C0DE70650D16AFE0922E3B1C
ArticleID:CPE1796
DOI: 10.1002/cpe.1796