Scientific workflows for computational reproducibility in the life sciences: Status, challenges and opportunities

Bibliographic Details
Published in:Future Generation Computer Systems Vol. 75; pp. 284-298
Main Authors: Cohen-Boulakia, Sarah; Belhajjame, Khalid; Collin, Olivier; Chopard, Jérôme; Froidevaux, Christine; Gaignard, Alban; Hinsen, Konrad; Larmande, Pierre; Le Bras, Yvan; Lemoine, Frédéric; Mareuil, Fabien; Ménager, Hervé; Pradal, Christophe; Blanchet, Christophe
Format: Journal Article
Language:English
Published: Elsevier B.V., 01.10.2017
ISSN:0167-739X, 1872-7115
Description
Summary:With the development of new experimental technologies, biologists are faced with an avalanche of data to be computationally analyzed for scientific advancements and discoveries to emerge. Faced with the complexity of analysis pipelines, the large number of computational tools, and the enormous amount of data to manage, there is compelling evidence that many if not most scientific discoveries will not stand the test of time: increasing the reproducibility of computed results is of paramount importance. The objective we set out in this paper is to place scientific workflows in the context of reproducibility. To do so, we define several kinds of reproducibility that can be reached when scientific workflows are used to perform experiments. We characterize and define the criteria that need to be catered for by reproducibility-friendly scientific workflow systems, and use such criteria to place several representative and widely used workflow systems and companion tools within such a framework. We also discuss the remaining challenges posed by reproducible scientific workflows in the life sciences. Our study was guided by three use cases from the life science domain involving in silico experiments.

• Use cases from the Life Sciences highlighting reproducibility and reuse needs.
• Terminology to describe reproducibility levels in scientific workflows.
• Criteria to define reproducible-friendly workflow systems and evaluation of systems.
• Challenges and opportunities in scientific workflows reproducibility.
DOI:10.1016/j.future.2017.01.012