Scientific workflows for computational reproducibility in the life sciences: Status, challenges and opportunities

With the development of new experimental technologies, biologists are faced with an avalanche of data to be computationally analyzed for scientific advancements and discoveries to emerge. Faced with the complexity of analysis pipelines, the large number of computational tools, and the enormous amount of data to manage, there is compelling evidence that many if not most scientific discoveries will not stand the test of time: increasing the reproducibility of computed results is of paramount importance. The objective we set out in this paper is to place scientific workflows in the context of reproducibility. To do so, we define several kinds of reproducibility that can be reached when scientific workflows are used to perform experiments. We characterize and define the criteria that need to be catered for by reproducibility-friendly scientific workflow systems, and use such criteria to place several representative and widely used workflow systems and companion tools within such a framework. We also discuss the remaining challenges posed by reproducible scientific workflows in the life sciences. Our study was guided by three use cases from the life science domain involving in silico experiments.

Bibliographic Details
Main Authors: Cohen-Boulakia, Sarah, Belhajjame, Khalid, Collin, Olivier, Chopard, Jérôme, Froidevaux, Christine, Gaignard, Alban, Hinsen, Konrad, Larmande, Pierre, Le Bras, Yvan, Lemoine, Frédéric, Mareuil, Fabien, Ménager, Hervé, Pradal, Christophe, Blanchet, Christophe
Format: article biblioteca
Language: eng
Subjects: C30 - Documentation and information, U30 - Research methods, computer science, data analysis, text mining, http://aims.fao.org/aos/agrovoc/c_27769, http://aims.fao.org/aos/agrovoc/c_15962, http://aims.fao.org/aos/agrovoc/c_dca12b72
Online Access: http://agritrop.cirad.fr/583255/
http://agritrop.cirad.fr/583255/1/scientific-workflows-computational%283%29.pdf
id dig-cirad-fr-583255
record_format koha
spelling dig-cirad-fr-583255 2024-01-29T05:38:00Z http://agritrop.cirad.fr/583255/ Scientific workflows for computational reproducibility in the life sciences: Status, challenges and opportunities. Cohen-Boulakia Sarah, Belhajjame Khalid, Collin Olivier, Chopard Jérôme, Froidevaux Christine, Gaignard Alban, Hinsen Konrad, Larmande Pierre, Le Bras Yvan, Lemoine Frédéric, Mareuil Fabien, Ménager Hervé, Pradal Christophe, Blanchet Christophe. 2017. Future Generation Computer Systems, 75: 284-298. https://doi.org/10.1016/j.future.2017.01.012 eng article info:eu-repo/semantics/article Journal Article info:eu-repo/semantics/acceptedVersion http://agritrop.cirad.fr/583255/1/scientific-workflows-computational%283%29.pdf text Cirad license info:eu-repo/semantics/openAccess https://agritrop.cirad.fr/mention_legale.html 10.1016/j.future.2017.01.012 info:eu-repo/semantics/altIdentifier/doi/10.1016/j.future.2017.01.012 info:eu-repo/semantics/altIdentifier/purl/https://doi.org/10.1016/j.future.2017.01.012
institution CIRAD FR
collection DSpace
country France
countrycode FR
component Bibliographic
access Online
databasecode dig-cirad-fr
tag biblioteca
region Western Europe
libraryname CIRAD Library, France
language eng
topic C30 - Documentation and information
U30 - Research methods
computer science
data analysis
text mining
http://aims.fao.org/aos/agrovoc/c_27769
http://aims.fao.org/aos/agrovoc/c_15962
http://aims.fao.org/aos/agrovoc/c_dca12b72
format article
author Cohen-Boulakia, Sarah
Belhajjame, Khalid
Collin, Olivier
Chopard, Jérôme
Froidevaux, Christine
Gaignard, Alban
Hinsen, Konrad
Larmande, Pierre
Le Bras, Yvan
Lemoine, Frédéric
Mareuil, Fabien
Ménager, Hervé
Pradal, Christophe
Blanchet, Christophe
author_sort Cohen-Boulakia, Sarah
title Scientific workflows for computational reproducibility in the life sciences: Status, challenges and opportunities
url http://agritrop.cirad.fr/583255/
http://agritrop.cirad.fr/583255/1/scientific-workflows-computational%283%29.pdf