Correspondence


Archive computer code with raw data

Nature volume 534, page 326 (16 June 2016)

As the leader of a young research group, I recognize the need to archive more than just the raw data that underpin scientific papers. Archiving computer code is also important for safeguarding scientific integrity and for facilitating ongoing projects.

Most scientific journals demand that researchers make their primary data publicly available in the interest of reproducibility. Access to the associated computer code enables statistical analyses and calculations to be validated (see Nature 514, 536; 2014). The more explicit the links between the data, the code and the resulting outputs (including tables and figures), the easier it is to reproduce the findings.

Software tools such as knitr and R Markdown allow the description and code of a statistical analysis to be combined into a single document, providing a pipeline from the raw data to the final results and figures. Outputs are updated by re-running the scripts, and changes to the code can be tracked using version-control tools such as Git and GitHub.
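The idea of a single pipeline from raw data to reported results can be illustrated with a minimal sketch. It is written in Python rather than R purely for illustration and is not the workflow used by knitr or R Markdown themselves; the data values, the column name `length_mm` and the function names `analyse` and `render_report` are all hypothetical. The point is only that every number in the report text is computed from the raw data at the moment the script runs, so re-running it after the data change updates the output.

```python
import csv
import io
import statistics

# Hypothetical raw data; in practice this would be the archived data file.
RAW_DATA = """sample,length_mm
a,10.0
b,12.0
c,14.0
"""


def analyse(raw_csv: str) -> dict:
    """Compute summary statistics directly from the raw data."""
    rows = list(csv.DictReader(io.StringIO(raw_csv)))
    lengths = [float(row["length_mm"]) for row in rows]
    return {"n": len(lengths), "mean": statistics.mean(lengths)}


def render_report(raw_csv: str) -> str:
    """Build the report text from freshly computed results."""
    result = analyse(raw_csv)
    return (
        f"We measured {result['n']} samples. "
        f"Mean length was {result['mean']:.1f} mm."
    )


if __name__ == "__main__":
    print(render_report(RAW_DATA))
```

Because the report contains no hand-typed numbers, anyone holding the archived data and this script can regenerate the same sentence and check it against the published one.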

My group has elected to use these tools and to include R Markdown files as supplementary information to our publications (see, for example, M. A. Stoffel et al. Proc. Natl Acad. Sci. USA 112, E5005–E5012; 2015). I suggest that journals encourage this practice to help to fight the reproducibility crisis.

Author information


  1. University of Bielefeld, Germany.

    • Joseph I. Hoffman

Corresponding author

Correspondence to Joseph I. Hoffman.
