Abstract
In registered reports (RRs), initial peer review and in-principle acceptance occur before the research outcomes are known. This combats publication bias and distinguishes planned from unplanned research. The rationale for how RRs could improve the credibility of research findings is straightforward, but there is little empirical evidence. There could also be unintended costs, such as reduced novelty. Here, 353 researchers peer reviewed a pair of papers from 29 published RRs from psychology and neuroscience and 57 non-RR comparison papers. RRs numerically outperformed comparison papers on all 19 criteria (mean difference 0.46, scale range −4 to +4), with effects ranging from RRs being statistically indistinguishable from comparison papers in novelty (0.13, 95% credible interval [−0.24, 0.49]) and creativity (0.22, [−0.14, 0.58]) to sizeable improvements in rigour of methodology (0.99, [0.62, 1.35]) and analysis (0.97, [0.60, 1.34]) and in overall paper quality (0.66, [0.30, 1.02]). RRs could improve research quality while reducing publication bias and ultimately improve the credibility of the published literature.
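The effects above are posterior mean differences with 95% credible intervals. As a minimal sketch of how such a summary is read off posterior draws, the snippet below simulates draws for a single RR-minus-comparison difference score and reports the posterior mean and the central 95% interval; the draws and their parameters are invented for illustration and are not the study's fitted model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior draws for one RR-minus-comparison difference score.
# The location and scale here are illustrative, not fitted estimates.
draws = rng.normal(loc=0.66, scale=0.18, size=4000)

# Posterior mean and central 95% credible interval.
posterior_mean = draws.mean()
ci_low, ci_high = np.percentile(draws, [2.5, 97.5])

print(f"mean difference: {posterior_mean:.2f}")
print(f"95% credible interval: [{ci_low:.2f}, {ci_high:.2f}]")
```

A positive interval that excludes zero (as for methodological rigour) indicates a credible advantage for RRs, whereas an interval straddling zero (as for novelty) does not.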
Data availability
All data files are available on OSF: https://osf.io/aj4zr/.
Code availability
All files and scripts are available on OSF: https://osf.io/aj4zr/.
Acknowledgements
The authors thank L. Hummer for help with study planning, A. Denis and Z. Loomas for help preparing survey materials, B. Bouza and N. Buttrick for help with implementing the survey in Qualtrics, and A. Allard for help coding the articles. This research was funded by grants from Arnold Ventures and the James S. McDonnell Foundation (grant no. 220020498) to B.A.N. and supported by the National Science Foundation Graduate Research Fellowship Program (grant no. 1247392 awarded to S.R.S.). The funders had no role in study design, analysis, decision to publish or preparation of the manuscript.
Author information
Contributions
Conceptualization: Survey: C.K.S., T.M.E. and B.A.N. Article coding: S.R.S., J.B. and S.V. Data curation: Survey: C.K.S. Article coding: S.R.S. and J.B. Formal analysis: Survey: C.K.S. and K.M.E. Article coding: S.V., S.R.S. and J.B. Investigation: Survey: C.K.S. and T.M.E. Article coding: S.R.S., J.B. and S.V. Methodology: Survey: C.K.S., T.M.E., K.M.E. and B.A.N. Article coding: S.R.S., J.B. and S.V. Software: Article coding: S.R.S. and J.B. Visualization: Survey: C.K.S. Article coding: S.R.S. and J.B. Validation: Survey: K.M.E. Article coding: S.V. Project administration: T.M.E. Resources: T.M.E. and F.S.T. Supervision: T.M.E. and B.A.N. Funding acquisition: T.M.E. and B.A.N. Writing, original draft: C.K.S., T.M.E., K.M.E. and B.A.N. Writing, review and editing: C.K.S., T.M.E., S.R.S., J.B., F.S.T., S.V., K.M.E. and B.A.N.
Ethics declarations
Competing interests
T.M.E., C.K.S. and B.A.N. are employees of the nonprofit Center for Open Science (COS), which has a mission to increase openness, integrity and reproducibility of research. COS offers support to journals, editors and researchers in adopting and conducting RRs. The remaining authors declare no conflicts of interest.
Additional information
Peer review information Nature Human Behaviour thanks Balazs Aczel, Marcel van Assen and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Plot of correlations between all difference score outcome variables.
Correlation matrix of the 19 outcome variables; larger, darker blue circles indicate stronger positive correlations than smaller, lighter blue circles.
Extended Data Fig. 2 Posterior probability distributions for parameter estimates for each DV and each level of Familiar comparing the difference of RRs and comparison articles.
Horizontal lines indicate 80% (thick) and 95% (thin) credible intervals, and dots show the means of the posteriors. Positive values indicate a performance advantage for RRs; negative values indicate a performance advantage for comparison articles.
Extended Data Fig. 3 Posterior probability distributions for parameter estimates for each DV and each level of Improve comparing the difference of RRs and comparison articles.
Horizontal lines indicate 80% (thick) and 95% (thin) credible intervals, and dots show the means of the posteriors. Positive values indicate a performance advantage for RRs; negative values indicate a performance advantage for comparison articles.
Extended Data Fig. 4 Posterior probability distributions for parameter estimates for each DV and each level of ‘Guessed Right’ comparing the difference of RRs and comparison articles.
Horizontal lines indicate 80% (thick) and 95% (thin) credible intervals, and dots show the means of the posteriors. Positive values indicate a performance advantage for RRs; negative values indicate a performance advantage for comparison articles.
Supplementary information
Supplementary Methods, Supplementary Discussion, Supplementary Figs. 1–6, Supplementary Tables 1–10 and Supplementary References.
About this article
Cite this article
Soderberg, C.K., Errington, T.M., Schiavone, S.R. et al. Initial evidence of research quality of registered reports compared with the standard publishing model. Nat Hum Behav 5, 990–997 (2021). https://doi.org/10.1038/s41562-021-01142-4