A Survey of Data-Intensive Scientific Workflow Management

Bibliographic Details
Published in: Journal of Grid Computing, Vol. 13, no. 4, pp. 457-493
Main Authors: Liu, Ji; Pacitti, Esther; Valduriez, Patrick; Mattoso, Marta
Format: Journal Article
Language: English
Published: Dordrecht, Springer Netherlands, 01.12.2015
Springer Nature B.V
Springer Verlag
ISSN:1570-7873, 1572-9184
Description
Summary: Nowadays, more and more computer-based scientific experiments need to handle massive amounts of data. Their data processing consists of multiple computational steps and the dependencies among them. A data-intensive scientific workflow is useful for modeling such a process. Since the sequential execution of data-intensive scientific workflows may take a long time, Scientific Workflow Management Systems (SWfMSs) should enable the parallel execution of data-intensive scientific workflows and exploit the resources distributed across different infrastructures such as grid and cloud. This paper provides a survey of data-intensive scientific workflow management in SWfMSs and their parallelization techniques. Based on a SWfMS functional architecture, we give a comparative analysis of the existing solutions. Finally, we identify research issues for improving the execution of data-intensive scientific workflows in a multisite cloud.
DOI:10.1007/s10723-015-9329-8