Enabling Big Geoscience Data Analytics with a Cloud-Based, MapReduce-Enabled and Service-Oriented Workflow Framework

Bibliographic details
Published in: PLoS ONE, Volume 10, Issue 3, p. e0116781
Main authors: Li, Zhenlong; Yang, Chaowei; Jin, Baoxuan; Yu, Manzhu; Liu, Kai; Sun, Min; Zhan, Matthew
Format: Journal Article
Language: English
Published: United States, Public Library of Science, 5 March 2015
Public Library of Science (PLoS)
ISSN: 1932-6203
Description
Summary: Geoscience observations and model simulations are generating vast amounts of multi-dimensional data. Effectively analyzing these data is essential for geoscience studies. However, the task is challenging for geoscientists: processing such massive volumes of data is both computing intensive and data intensive, and the analytics require complex procedures and multiple tools. To tackle these challenges, a scientific workflow framework is proposed for big geoscience data analytics. The framework leverages cloud computing, MapReduce, and Service-Oriented Architecture (SOA). Specifically, HBase is adopted for storing and managing big geoscience data across distributed computers, a MapReduce-based algorithm framework is developed to support parallel processing of geoscience data, and a service-oriented workflow architecture is built to support on-demand complex data analytics in the cloud environment. A proof-of-concept prototype tests the performance of the framework. Results show that the framework significantly improves the efficiency of big geoscience data analytics by reducing data processing time and simplifying data analytical procedures for geoscientists.
Competing Interests: The authors have declared that no competing interests exist.
Conceived and designed the experiments: CY ZL BJ. Performed the experiments: ZL KL MY MZ. Analyzed the data: ZL CY BJ KL MY. Contributed reagents/materials/analysis tools: ZL CY BJ MS KL. Wrote the paper: ZL CY MY BJ.
DOI: 10.1371/journal.pone.0116781