Data sharing is not only good citizenship for researchers, but is also required by funding agencies and many journals. The scientific community needs to develop better incentives to encourage compliance and reward those who share.
In June, David Van Essen and the Society for Neuroscience brought together a group of people who are not usually found at the same meetings. Their aim was to explore how journals and databases could interact to improve the usefulness and sharing of neuroscience data. Attendees included editors and publishers from various journals, database organizers, informaticists, librarians and funding agency representatives. The discussion revealed considerable confusion about journals' existing policies for data sharing and availability in connection with publication.
In theory, obtaining published data from another researcher for reanalysis or to test a computational model should be as simple as asking for it. In practice, it's not always so straightforward. Some of the problems are technical: obstacles include incompatible file formats, complex datasets and the need to develop minimal information standards that allow other scientists to understand how the data were collected. Other difficulties are cultural. Requirements that researchers share data systematically are fairly recent, and the institutions and infrastructure to facilitate this process are not yet fully formed.
Data sharing is a common requirement of funding or publication, though this obligation may come as a surprise to some authors—and to their colleagues who have had trouble acquiring data from other laboratories. Many granting agencies, including the Wellcome Trust in the United Kingdom and the National Institutes of Health and the Howard Hughes Medical Institute in the United States, require grantees to share data whenever possible, so as to maximize the usefulness of the data whose collection they have funded.
Similarly, journals have a responsibility to ensure that other researchers can replicate and build on the studies that they have published. Publication in any Nature journal carries an obligation to make the materials, data and associated protocols underlying the paper available, as detailed in the guide to authors (http://www.nature.com/authors/editorial_policies/availability.html). Most authors are aware that they are required to deposit gene sequences or microarray data into standard databases before publication, but they may not realize that they must also make other data reported in the paper available to any interested reader after the date of publication. For commonly requested datasets, the easiest way to meet this requirement is usually to deposit the data in an appropriate database, if one exists. The Neuroinformatics Committee of the Society for Neuroscience has an extensive listing of neuroscience databases at http://ndg.sfn.org.
What should a researcher do if a request for data is met with refusal or persistent stonewalling? For data related to a Nature publication, the scientist who wants the data should contact the chief editor of the journal that published the paper. The editor will attempt to resolve the complaint directly with the authors. If necessary, the editor may refer the complaint to the authors' funding agency or attach a note to the paper online stating that readers have been unable to obtain the data or materials necessary to replicate the work.
Another area of confusion concerns intellectual property rights to published data. Whether authors sign a license to allow a journal to publish their paper (as they do for the Nature journals) or give up their copyright, journals have rights only to the published paper, not to the data on which the paper is based. In particular, depositing data into a publicly accessible database does not interfere with the authors' ability to publish future papers based on the data—though it may also allow others to publish papers based on the same data, depending on the policies of the particular database. (Any such publication should cite the source of the data, of course.)
Similarly, data that are made available online as supplementary information connected with a publication are part of the public record, and so are available for further research or reanalysis. In most cases, including for all Nature journals, supplementary information is freely available and can be accessed without a subscription or site license, so posting data as supplementary information does not cause them to become proprietary.
If data sharing is to become a routine part of academic life, universities and funding agencies will need to make further efforts to encourage it. One major step forward would be for universities to give credit for good citizenship, as reflected in data sharing, during hiring and promotion decisions. This process would be facilitated by a system to track the downloading and use of shared data. Similarly, funding agencies could give preference in awarding grants to scientists who can demonstrate that they have provided easy access to the data collected in connection with their previous grants.
The community will also need to overcome a major infrastructure problem with database sustainability. It is much easier to obtain funding to create or improve a database than to maintain one. This situation creates a chicken-and-egg problem because scientists are less likely to submit data to a precariously funded database, and underused databases are correspondingly more likely to encounter funding difficulties.
Some researchers have expressed concerns that data sharing may add little value for most of the scientific community. One entry-level benefit is that sharing data increases citation of the original paper (Piwowar, H.A. et al. PLoS ONE 2, e308, 2007; doi: 10.1371/journal.pone.0000308). Does anyone want your data? That's hard to predict, but the easier it becomes to request data and to receive credit for sharing it, the more likely people are to ask. After all, no one ever knocked on your door asking to buy those figurines collecting dust in your cabinet before you listed them on eBay. Your data, too, may simply be awaiting an effective matchmaker.
Got data? Nat Neurosci 10, 931 (2007). https://doi.org/10.1038/nn0807-931