Practical costs of data sharing

Aside from the ethics and etiquette of fully open data-sharing (Nature 507, 140; 2014), there are practical issues that journals still need to address.

One is the cost of sharing data. Both the Public Library of Science and the UK Royal Society recommend the storage repository Dryad, which currently charges US$15 for the first gigabyte of data over its 10-gigabyte limit, and $10 per gigabyte thereafter. However, studies in areas such as neuroscience can generate terabytes of raw data (1 terabyte is 1,000 gigabytes) — a quantity that few labs could afford to upload.
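To give a sense of scale, the fee schedule quoted above can be worked through as a short calculation. This is only a minimal sketch, not Dryad's actual billing logic: the function name `dryad_excess_fee` is hypothetical, and rounding partial gigabytes up to the next whole gigabyte is an assumption. The only figures taken from the text are the 10-gigabyte limit, the $15 charge for the first excess gigabyte, the $10 charge for each gigabyte after that, and 1 terabyte being 1,000 gigabytes.

```python
import math

# Figures quoted in the text (2014 fee schedule).
FREE_LIMIT_GB = 10         # data up to this size incurs no excess charge
FIRST_EXCESS_GB_FEE = 15   # US$ for the first gigabyte over the limit
ADDITIONAL_GB_FEE = 10     # US$ for each further gigabyte


def dryad_excess_fee(size_gb: float) -> int:
    """Estimate the excess-storage charge in US$ for a dataset of size_gb.

    Assumption (not from the text): partial gigabytes are rounded up.
    """
    excess_gb = math.ceil(max(0.0, size_gb - FREE_LIMIT_GB))
    if excess_gb == 0:
        return 0
    return FIRST_EXCESS_GB_FEE + ADDITIONAL_GB_FEE * (excess_gb - 1)


if __name__ == "__main__":
    # 1 terabyte = 1,000 gigabytes, as stated in the text.
    for size_gb in (10, 11, 1000):
        print(f"{size_gb} GB -> US${dryad_excess_fee(size_gb):,}")
```

On this reading, a single terabyte of raw data has 990 gigabytes over the limit and would cost 15 + 989 × 10 = US$9,905 to deposit, and the charge grows linearly with each additional terabyte.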

And, given that searching Dryad for 'neuroscience' yields just three papers, compared with 2,286 for 'ecology', a 'one-size-fits-all' data-sharing policy may not work across all disciplines.

Another concern is the availability of new computer code. Researchers often write their own data-analysis code for each new study, but do not always document it fully. Making code usable by others may therefore require considerable extra work — particularly given the diversity of computing platforms and software versions (see also N. Barnes Nature 467, 753; 2010).

These challenges also vary by discipline: analysis may comprise a few lines of code in some fields but thousands in others, as dictated by the requirements of individual papers.

Goodhill, G. Practical costs of data sharing. Nature 509, 33 (2014).

