Separate authorship categories to recognize data collectors and code developers

Ewers, Robert M.; Barlow, Jos; Banks-Leite, Cristina; Rahbek, Carsten

doi:10.1038/s41559-019-1033-9

Correspondence
Published: 04 November 2019

Nature Ecology & Evolution volume 3, page 1610 (2019)

To the Editor: The current, authorship-based system for recognizing individual contributions to science only patchily recognizes the primary data collection that underpins, and the code development that supports, the entire discipline. Data collectors and code developers (scientific resource generators) are progressively being forced to donate the grant income, time and effort invested in generating, curating and documenting data and code to the discipline as a whole^1,2,3. Yet resource users, those who re-use previously published data and code to generate new knowledge and publications, benefit from that time and effort but are not required to recognize it in any standardized manner. We need a new way to quantify and value what is currently anonymous: the fundamental contribution to scientific progress that generating scientific resources provides.

Many scientists agree that authorship is the ultimate reward for collecting data or developing code. However, the Vancouver Protocol tellingly states that “Participation solely in the … collection of data does not justify authorship.” Citations are routinely raised as the obvious solution to this dilemma^4,5, but they are not enough. Citations carry less value to a scientist than authorship. Moreover, citations to scientific resources are agnostic to the impact of the papers that used those resources; resource citations are commonly buried in supplementary material, where citation-tracking software does not pick them up; and published resources not associated with a published manuscript do not contribute to a scientist’s citation indices.

We suggest that one solution is to divorce authorship of a manuscript from authorship of the resources used in the manuscript, which can be achieved by creating separate categories of authorship: manuscript and resource authors. Under this system, a published paper would come with two separate author lists. Manuscript authors are those who developed the question, analysed and interpreted the data, and wrote the paper: “authorship for authors”^6. Resource authors are those who contributed some or all of the data that were analysed or the code that was used. In this system, a resource generator can receive credit for contributing to a paper without implying that they agree with, understand, or have even seen the analysis and conclusions the manuscript authors have presented.

Membership of the two author lists need not be mutually exclusive, as a single person could reasonably contribute both resources and to the manuscript. The set of resource authors from a publication presenting new data or code would be repeated on any subsequent publication re-using those resources, whereas the manuscript authors would change to reflect the identity of the team members conducting the new analysis. This approach extends naturally to meta-analyses. The set of resource authors on a meta-analysis would include the resource authors, not the manuscript authors, from the publications presenting the original data, along with the authors of unpublished datasets or datasets published in online repositories. Manuscript authorship on a meta-analysis would be restricted to those who conducted the analysis and developed the publication.
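
The inheritance rule described above can be made concrete with a small sketch. It is an illustration only, not part of the proposal itself: the `Paper` record, the `reuse` helper and all author names are hypothetical, and the sketch assumes that resource authors are simply carried over in first-seen order with repeats dropped.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Paper:
    """Hypothetical record of a paper with two separate author lists."""
    title: str
    manuscript_authors: Tuple[str, ...]
    resource_authors: Tuple[str, ...] = ()


def merge_unique(*groups: Iterable[str]) -> Tuple[str, ...]:
    """Concatenate author groups, keeping first-seen order and dropping repeats."""
    seen = {}
    for group in groups:
        for name in group:
            seen.setdefault(name, None)
    return tuple(seen)


def reuse(title, manuscript_authors, sources, extra_resource_authors=()):
    """Build a paper that re-uses resources from earlier papers.

    Resource authors of every source are inherited; the manuscript
    authors of the sources are not. `extra_resource_authors` stands in
    for authors of unpublished or repository-only datasets.
    """
    inherited = merge_unique(
        *(paper.resource_authors for paper in sources),
        extra_resource_authors,
    )
    return Paper(title, tuple(manuscript_authors), inherited)


# Two original papers; B. Collector is on both lists of the first one.
survey = Paper("Field survey", ("A. Analyst", "B. Collector"),
               ("B. Collector", "C. Coder"))
census = Paper("Plot census", ("D. Writer",), ("E. Fieldworker",))

# A meta-analysis inherits resource authors only.
meta = reuse("Meta-analysis", ("F. Synthesist",), [survey, census],
             extra_resource_authors=("G. Repository",))
print(meta.resource_authors)
# ('B. Collector', 'C. Coder', 'E. Fieldworker', 'G. Repository')
```

In this sketch A. Analyst and D. Writer receive no credit on the meta-analysis, because they contributed only to the original manuscripts, while B. Collector is carried forward through the resource list.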

Resource authorship provides a path to quantify the value of a scientist’s provision of resources to the wider community, and could be implemented within the framework of the existing, citation-based recognition system. Resource contributions could reasonably be tracked through the use of exactly the same citation indices already in widespread use, but applied to resource rather than manuscript authorship. This would ensure scientists contributing data or code that are frequently re-used in highly cited, influential papers will have higher resource citation metrics than those contributing resources that are infrequently used and published in low-impact papers.
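
As a minimal sketch of how an existing index could be applied to each author list separately, the example below computes an h-index per person, once over manuscript authorship and once over resource authorship. The choice of the h-index is an assumption made for illustration (the text does not name a specific index), and the paper records, author names and citation counts are invented.

```python
from collections import defaultdict


def h_index(citation_counts):
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def author_h_indices(papers, role):
    """Compute an h-index per person for one author list.

    `role` is the key of the author list to score, either
    "manuscript_authors" or "resource_authors" (hypothetical field names).
    """
    per_author = defaultdict(list)
    for paper in papers:
        for name in paper[role]:
            per_author[name].append(paper["citations"])
    return {name: h_index(counts) for name, counts in per_author.items()}


# Invented records: citation counts belong to the paper, not the dataset.
papers = [
    {"manuscript_authors": ["A. Analyst"],
     "resource_authors": ["R. Collector"], "citations": 50},
    {"manuscript_authors": ["B. Writer"],
     "resource_authors": ["R. Collector", "S. Coder"], "citations": 12},
    {"manuscript_authors": ["A. Analyst"],
     "resource_authors": ["R. Collector"], "citations": 3},
    {"manuscript_authors": ["C. Author"],
     "resource_authors": ["S. Coder"], "citations": 1},
]

print(author_h_indices(papers, "resource_authors"))
# {'R. Collector': 3, 'S. Coder': 1}
print(author_h_indices(papers, "manuscript_authors"))
# {'A. Analyst': 2, 'B. Writer': 1, 'C. Author': 1}
```

Because the counts are those of the papers that used the resource, a dataset re-used in highly cited papers raises its generator's resource index, which is the behaviour the paragraph above describes.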

Separating the impact of generating scientific resources from the impact of using those resources provides a way out of the resource generator–resource user tension. The two are complementary aspects of a shared scientific enterprise. Data and reproducible code represent empirical truth: quantitative, repeatable measurements of the world around us against which we test our understanding. The papers we write are our qualitative interpretation of what those data and code tell us; they are ephemeral position statements that implicitly embed the sum of our experiences, knowledge and biases to date. Both are important contributions to the advancement of science, and both need to be represented when quantifying the contribution that individuals make to that advance.

References

1. Whitlock, M. C. Trends Ecol. Evol. 26, 61–65 (2011).
2. Mislan, K. A. S., Heer, J. M. & White, E. P. Trends Ecol. Evol. 31, 4–7 (2016).
3. Peng, R. D. Science 334, 1226–1227 (2011).
4. Amann, R. I. et al. Science 363, 350–352 (2019).
5. Pierce, H. H., Dev, A., Statham, E. & Bierer, B. E. Nature 570, 30–32 (2019).
6. Baskin, T. I. Nature 562, 494 (2018).

Author information

Authors and Affiliations

Silwood Park Campus, Imperial College London, Ascot, UK
Robert M. Ewers, Cristina Banks-Leite & Carsten Rahbek
Lancaster Environment Centre, Lancaster University, Lancaster, UK
Jos Barlow
Center for Macroecology, Evolution and Climate, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
Carsten Rahbek

Contributions

R.M.E., J.B., C.B.-L. and C.R. co-developed the ideas. R.M.E. wrote the first draft of the manuscript and J.B., C.B.-L. and C.R. contributed to manuscript editing.

Corresponding author

Correspondence to Robert M. Ewers.

Ethics declarations

Competing interests

The authors declare no competing interests.

About this article

Cite this article

Ewers, R.M., Barlow, J., Banks-Leite, C. et al. Separate authorship categories to recognize data collectors and code developers. Nat Ecol Evol 3, 1610 (2019). https://doi.org/10.1038/s41559-019-1033-9

Published: 04 November 2019
Issue Date: December 2019
DOI: https://doi.org/10.1038/s41559-019-1033-9
