Science funders and researchers need to recognize the time, resources and effort required to curate open data (see Nature 537, 138; 2016). Although organizations such as the US National Science Foundation and the European Commission are aiming to make data repositories financially self-sustaining, this is unlikely to happen within one or two funding cycles.
There is no reliable business model to finance the curation and maintenance of data repositories. Databases therefore often restrict access to subscribers (see, for example, go.nature.com/2dzc59o), curtailing opportunities for interoperability and collaboration.
Curation is not fully automated for most data types. In the life sciences, for example, this means that many popular databases must resort to time-consuming manual curation to check data quality, reliability, provenance, format and metadata (S. Leonelli Data-Centric Biology Chicago Univ. Press; 2016).
Crowdsourcing models are promising in this respect because data producers ensure that the deposited data are accurate and reusable, but these models are still not widely deployed (see go.nature.com/2d6p9kc).
To make open data effective as a research tool, computational and field-specific skills need to mesh. This will ensure that data infrastructures are user-friendly and resilient in the face of dizzyingly rapid developments.
Cite this article
Leonelli, S. Open data: curation is under-resourced. Nature 538, 41 (2016). https://doi.org/10.1038/538041d