After thousands of hours of investigation, three clinical trials at Duke University in Durham, North Carolina, were suspended in late 2009 because of the irreproducibility of the genomic 'signatures' used to select cancer therapies for patients. Journals have a duty to help the community by maintaining reproducibility as a cornerstone of the scientific process.
The independent reanalysis of these signatures took so long because the information accompanying the associated publications was incomplete. Unfortunately, this is common: for example, a survey of 18 published microarray gene-expression analyses found that the results of only two were exactly reproducible (J. P. Ioannidis et al. Nature Genet. 41, 149–155; 2009). Inadequate information meant that 10 could not be reproduced.
To counter this problem, journals should demand that authors submit sufficient detail for the independent assessment of their paper's conclusions. We recommend that all primary data are backed up with adequate documentation and sample annotation; all primary data sources, such as database accessions or URLs, are presented; and all scripts and software source code are supplied, with instructions. Analytical (non-scriptable) protocols should be described step by step, and the research protocol, including any plans for research and analysis, should be provided (see http://go.nature.com/UaF2Kv). Files containing such information could be stored as supplements by the journal.
Some situations may preclude authors from supplying complete data or code, such as the need to protect patient confidentiality. In such cases, authors should justify the omission and ensure independent reproducibility by alternative means.
The quality of scientific output will benefit from setting these standards. As a community, we owe it to patients and to the public to do what we can to ensure the validity of the research we publish.
Supplementary information
This document contains supplementary information. (PDF 41 kb)
Baggerly, K. Disclose all data in publications. Nature 467, 401 (2010). https://doi.org/10.1038/467401b