FAIR Research Data With NOMAD
FAIRmat's Distributed, Schema-based Research-data Infrastructure to Harmonize RDM in Materials Science
DOI:
https://doi.org/10.52825/cordi.v1i.376
Keywords:
Materials Science, Research Data Management, Metadata, FAIR data, electronic lab notebook
Abstract
Scientific research is becoming increasingly data-centric, which requires more effort to manage, share, and publish data.
NOMAD is a web-based platform that provides research data management (RDM) for materials-science data. In addition to core RDM functions such as uploading and sharing files, NOMAD automatically extracts structured data from supported file formats, then normalizes and converts it. NOMAD provides an extensible framework for managing not just files but structured, machine-actionable, harmonized, and interoperable data. This is the basis for a faceted search with domain-specific filters, a comprehensive API, structured data entry via customizable electronic lab notebooks (ELNs), and integrated data-analysis and machine-learning tools. NOMAD is run as a free public service and can additionally be operated by research institutes. Connecting NOMAD installations through the public service will enable a federated data infrastructure for sharing data between research institutes and will further harmonize RDM within a large research domain such as materials science.
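As an illustration of how the faceted search might be reached through the API, here is a minimal Python sketch. It is not taken from the abstract: the base URL, the `entries/query` endpoint, the `results.material.elements` filter name, and the helper names `build_entries_query` and `search_entries` are assumptions that should be checked against the current NOMAD API documentation before use.

```python
import json
import urllib.request

# Assumed base URL of the public NOMAD API; verify against the current documentation.
NOMAD_API = "https://nomad-lab.eu/prod/v1/api/v1"


def build_entries_query(elements, page_size=10):
    """Build a JSON-serializable body for a search filtered by chemical elements."""
    return {
        # Assumed filter name meaning "material contains all of these elements".
        "query": {"results.material.elements": {"all": sorted(elements)}},
        "pagination": {"page_size": page_size},
    }


def search_entries(elements, page_size=10, base_url=NOMAD_API):
    """POST the query and return the decoded JSON response (needs network access)."""
    body = json.dumps(build_entries_query(elements, page_size)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/entries/query",  # assumed endpoint path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `search_entries(["Ti", "O"], page_size=5)` would, under these assumptions, return the first page of entries whose material contains both elements. Building the request body in a separate function keeps the filter logic inspectable without a network connection.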
License
Copyright (c) 2023 Markus Scheidgen, Sebastian Brückner, Sandor Brockhauser, Luca M. Ghiringhelli, Felix Dietrich, Ahmed E. Mansour, Martin Albrecht, Heiko B. Weber, Silvana Botti, Martin Aeschlimann, Claudia Draxl
This work is licensed under a Creative Commons Attribution 4.0 International License.
Accepted 2023-06-29
Published 2023-09-07
Funding data
- Deutsche Forschungsgemeinschaft, grant number 460197019
- Horizon 2020 Framework Programme, grant numbers 951786 and 676580