Enabling Large-Scale Bioinformatics Data Analysis with Cloud Computing

Bibliographic Details
Published in:	2012 IEEE 10th International Symposium on Parallel and Distributed Processing with Applications, pp. 640-645
Main authors:	Karlsson, J., Torreno, O., Ramet, D., Klambauer, G., Cano, M., Trelles, O.
Format:	Conference proceeding
Language:	English
Published:	IEEE, 1 July 2012
Subjects:	Benchmark testing, Big Data, Bioinformatics, Cloud computing, Computer architecture, Educational institutions, Map Reduce, Schedules
ISBN:	1467316318, 9781467316316
ISSN:	2158-9178
Online access:	Full text

Description
Abstract:	The petabyte scale of the Big Data generation in bioinformatics requires the introduction of advanced computational techniques to enable efficient knowledge discovery from data. Many data analysis tools in bioinformatics have been developed but few have been adapted to take advantage of high performance computing (HPC) resources. For some of these tools, an attractive option is to employ a map/reduce strategy. On the other hand, Cloud Computing could be an important platform to run such tools in parallel because it provides on-demand, elastic computational resources. This paper presents a software suite for Microsoft Azure which supports legacy software (without modifications of the algorithm). We demonstrate the feasibility of the approach by benchmarking a typical bioinformatics tool, namely dotplot.
DOI:	10.1109/ISPA.2012.95