The EOSC-Life Workflow Collaboratory for the Life Sciences
DOI:
https://doi.org/10.52825/cordi.v1i.352
Keywords:
FAIR data, digital objects, Computational Workflows, Data Intensive Bio-Science, Fair Workflows, Fair Software
Abstract
Workflows have become a major tool for processing research data, for example in data collection and data cleaning pipelines, data analytics, and data update feeds that populate public archives. The EOSC-Life Research Infrastructure Cluster project brought together Europe’s Life Science Research Infrastructures to create an Open, Digital and Collaborative space for biological and medical research and to develop a cloud-based Workflow Collaboratory. Because adopting FAIR practices extends beyond data, the Workflow Collaboratory drives the implementation of FAIR computational workflows and tools. It fosters tool-focused collaborations and reuse through the sharing of data analysis workflows, and it offers an ecosystem of services for researchers and workflow specialists to find, use and reuse workflows. Its web-friendly Digital Object Metadata Framework, based on RO-Crate and Bioschemas, supports the description and exchange of workflows across the services.
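As a rough illustration of the kind of workflow description that an RO-Crate and Bioschemas based framework exchanges, the sketch below builds a minimal RO-Crate metadata document for a single workflow file. It is not the Collaboratory's own implementation. The function name, the file name `sort-and-count.ga`, and the workflow name are hypothetical, and the entity layout follows the RO-Crate 1.1 specification and the Bioschemas ComputationalWorkflow type as commonly documented.

```python
import json


def build_workflow_crate(workflow_path: str, name: str, language: str) -> dict:
    """Return a minimal RO-Crate metadata document for one workflow file.

    Hypothetical helper for illustration only; real crates usually carry
    more metadata (authors, licence, versions, test definitions).
    """
    language_id = "#" + language.lower()
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                # Metadata descriptor: identifies this file as RO-Crate metadata.
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {
                # Root dataset: the crate itself, pointing at its main workflow.
                "@id": "./",
                "@type": "Dataset",
                "name": name,
                "hasPart": [{"@id": workflow_path}],
                "mainEntity": {"@id": workflow_path},
            },
            {
                # The workflow file, typed with the Bioschemas workflow type.
                "@id": workflow_path,
                "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
                "name": name,
                "programmingLanguage": {"@id": language_id},
            },
            {
                # The workflow language as a contextual entity.
                "@id": language_id,
                "@type": "ComputerLanguage",
                "name": language,
            },
        ],
    }


crate = build_workflow_crate(
    "sort-and-count.ga",                      # hypothetical workflow file
    "Sort and count (hypothetical example)",  # hypothetical workflow name
    "Galaxy",
)
metadata_json = json.dumps(crate, indent=2)
print(metadata_json)
```

Saved as `ro-crate-metadata.json` next to the workflow file, a document of this shape lets a consuming service identify the main workflow and its language without parsing the workflow itself.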
License
Copyright (c) 2023 Carole Goble, Finn Bacall, Stian Soiland-Reyes, Stuart Owen, Ignacio Eguinoa, Bert Droesbeke, Hervé Ménager, Laura Rodriguez-Navas, José M. Fernández, Björn Grüning, Simone Leo, Luca Pireddu, Michael Crusoe, Johan Gustafsson, Salvador Capella-Gutierrez, Frederik Coppens
This work is licensed under a Creative Commons Attribution 4.0 International License.
Accepted 2023-06-29
Published 2023-09-07
Funding data
- H2020 Research Infrastructures, grant numbers H2020-INFRAEDI-02-2018 823830; H2020-INFRAEOSC-2018-2 824087
- Bioplatforms Australia