  • Resource

Specification curve analysis

A Publisher Correction to this article was published on 09 October 2020

This article has been updated

Abstract

Empirical results hinge on analytical decisions that are defensible, arbitrary and motivated. These decisions probably introduce bias (towards the narrative put forward by the authors), and they certainly involve variability not reflected by standard errors. To address this source of noise and bias, we introduce specification curve analysis, which consists of three steps: (1) identifying the set of theoretically justified, statistically valid and non-redundant specifications; (2) displaying the results graphically, allowing readers to identify consequential specification decisions; and (3) conducting joint inference across all specifications. We illustrate the use of this technique by applying it to three findings from two different papers, one investigating discrimination based on distinctively Black names, the other investigating the effect of assigning female versus male names to hurricanes. Specification curve analysis reveals that one finding is robust, one is weak and one is not robust at all.
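The three steps can be illustrated with a minimal sketch. It is not the authors' code (which is deposited at OSF), and everything in it is an assumption made for illustration: the synthetic data, the control names `age` and `income`, the 2.5 SD outlier rule, the helper names, and the choice of the median estimate as the joint test statistic. It also assumes the predictor `x` was randomly assigned, so that shuffling it is a reasonable way to simulate the null.

```python
import itertools
import numpy as np

def ols_slope(y, x, control_columns):
    """OLS coefficient on x, with an intercept and the given control columns."""
    X = np.column_stack([np.ones(len(y)), x] + list(control_columns))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

def specification_curve(y, x, controls, trim_options=(False, True)):
    """Step 1: enumerate specifications (every control subset x outlier rule).
    Step 2: return the estimates sorted, i.e. the curve one would plot."""
    estimates = []
    names = list(controls)
    for k in range(len(names) + 1):
        for subset in itertools.combinations(names, k):
            for trim in trim_options:
                keep = np.ones(len(y), dtype=bool)
                if trim:
                    # Hypothetical exclusion rule: outcomes beyond 2.5 SD of the mean.
                    keep = np.abs(y - y.mean()) <= 2.5 * y.std()
                cols = [controls[name][keep] for name in subset]
                estimates.append(ols_slope(y[keep], x[keep], cols))
    return np.sort(np.array(estimates))

def joint_permutation_test(y, x, controls, n_perm=200, seed=0):
    """Step 3: compare the observed median estimate with its distribution
    when x is shuffled (assumed test statistic; others are possible)."""
    rng = np.random.default_rng(seed)
    observed = np.median(specification_curve(y, x, controls))
    null = np.array([
        np.median(specification_curve(y, rng.permutation(x), controls))
        for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(np.abs(null) >= abs(observed))) / (n_perm + 1)
    return observed, p_value

# Synthetic example data (purely illustrative).
rng = np.random.default_rng(1)
n = 300
x = rng.integers(0, 2, n).astype(float)
controls = {"age": rng.normal(size=n), "income": rng.normal(size=n)}
y = 0.5 * x + 0.3 * controls["age"] + rng.normal(size=n)

curve = specification_curve(y, x, controls)
observed, p_value = joint_permutation_test(y, x, controls)
print(len(curve), "specifications; median estimate:", round(float(observed), 3))
print("permutation p-value:", round(float(p_value), 3))
```

Plotting `curve` against its rank gives a descriptive specification curve. The permutation step re-estimates every specification on each shuffled dataset, so the cost grows with the number of specifications times `n_perm`.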

Fig. 1: Sets of possible specifications as perceived by researchers.
Fig. 2: Descriptive specification curve.
Fig. 3: Observed and expected under-the-null specification curves for the hurricanes and racial discrimination studies.

Data availability

The datasets used for both demonstrations have been deposited at OSF: https://osf.io/9rvps/

Code availability

The code used to generate all figures and calculations, including those in the Supplementary information, has been deposited at OSF: https://osf.io/9rvps/

Change history

  • 09 October 2020

    An amendment to this paper has been published and can be accessed via a link at the top of the paper.

References

  1. Leamer, E. E. Let’s take the con out of econometrics. Am. Econ. Rev. 73, 31–43 (1983).

  2. Ioannidis, J. P. A. Why most published research findings are false. PLoS Med. 2, 696–701 (2005).

  3. Simmons, J. P., Nelson, L. D. & Simonsohn, U. False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychol. Sci. 22, 1359–1366 (2011).

  4. Glaeser, E. L. Researcher incentives and empirical methods. NBER Technical Working Paper Series https://doi.org/10.3386/t0329 (2006).

  5. Efron, B. Estimation and accuracy after model selection. J. Am. Stat. Assoc. 109, 991–1007 (2014).

  6. White, H. A reality check for data snooping. Econometrica 68, 1097–1126 (2000).

  7. Athey, S. & Imbens, G. A measure of robustness to misspecification. Am. Econ. Rev. 105, 476–480 (2015).

  8. Sala-i-Martin, X. X. I just ran two million regressions. Am. Econ. Rev. 87, 178–183 (1997).

  9. Muñoz, J. & Young, C. We ran 9 billion regressions: eliminating false positives through computational model robustness. Sociol. Methodol. 48, 1–33 (2018).

  10. Young, C. & Holsteen, K. Model uncertainty and robustness: a computational framework for multimodel analysis. Sociol. Methods Res. 46, 3–40 (2017).

  11. Miguel, E. et al. Promoting transparency in social science research. Science 343, 30–31 (2014).

  12. Moore, D. A. Preregister if you want to. Am. Psychol. 71, 238–239 (2016).

  13. Bhargava, S., Kassam, K. S. & Loewenstein, G. A reassessment of the defense of parenthood. Psychol. Sci. 25, 299–302 (2014).

  14. DellaVigna, S. & Malmendier, U. Paying not to go to the gym. Am. Econ. Rev. 96, 694–719 (2006).

  15. Stevenson, B. & Wolfers, J. Economic growth and subjective well-being: reassessing the Easterlin Paradox. Brookings Pap. Econ. Act. 2008, 1–87 (2008).

  16. Card, D. & Krueger, A. B. Minimum wages and employment: a case study of the fast-food industry in New Jersey and Pennsylvania. Am. Econ. Rev. 84, 772–793 (1994).

  17. Jung, K., Shavitt, S., Viswanathan, M. & Hilbe, J. M. Female hurricanes are deadlier than male hurricanes. Proc. Natl Acad. Sci. USA 111, 8782–8787 (2014).

  18. Malter, D. Female hurricanes are not deadlier than male hurricanes. Proc. Natl Acad. Sci. USA 111, E3496 (2014).

  19. Maley, S. Statistics show no evidence of gender bias in the public’s hurricane preparedness. Proc. Natl Acad. Sci. USA 111, E3834 (2014).

  20. Bakkensen, L. & Larson, W. Population matters when modeling hurricane fatalities. Proc. Natl Acad. Sci. USA 111, E5331 (2014).

  21. Christensen, B. & Christensen, S. Are female hurricanes really deadlier than male hurricanes? Proc. Natl Acad. Sci. USA 111, E3497–E3498 (2014).

  22. Jung, K., Shavitt, S., Viswanathan, M. & Hilbe, J. M. Reply to Christensen and Christensen and to Malter: pitfalls of erroneous analyses of hurricanes names. Proc. Natl Acad. Sci. USA 111, E3499–E3500 (2014).

  23. Bertrand, M. & Mullainathan, S. Are Emily and Greg more employable than Lakisha and Jamal? A field experiment on labor market discrimination. Am. Econ. Rev. 94, 991–1013 (2004).

  24. Boos, D. D. Introduction to the bootstrap world. Stat. Sci. 18, 168–174 (2003).

  25. Bickel, P. J. & Ren, J.-J. The bootstrap in hypothesis testing. Proj. Euclid 36, 91–112 (2001).

  26. MacKinnon, J. G. in Handbook of Computational Econometrics (eds Belsley, D. A. & Kontoghiorghes, E. J.) 183–213 (Wiley, 2009).

  27. Paparoditis, E. & Politis, D. N. Bootstrap hypothesis testing in regression models. Stat. Probab. Lett. 74, 356–365 (2005).

  28. Romano, J. P. Bootstrap and randomization tests of some nonparametric hypotheses. Ann. Stat. 17, 141–159 (1989).

  29. Pitman, E. J. G. Significance tests which may be applied to samples from any populations. J. R. Stat. Soc. 4, 119–130 (1937).

  30. Fisher, R. A. The Design of Experiments (Oliver and Boyd, 1935).

  31. Pesarin, F. & Salmaso, L. Permutation Tests for Complex Data: Theory, Applications and Software (John Wiley & Sons, 2010).

  32. Ernst, M. D. Permutation methods: a basis for exact inference. Stat. Sci. 19, 676–685 (2004).

  33. Flachaire, E. A better way to bootstrap pairs. Econ. Lett. 64, 257–262 (1999).

  34. Lancaster, H. Significance tests in discrete distributions. J. Am. Stat. Assoc. 56, 223–234 (1961).

Acknowledgements

The authors received no specific funding for this work.

Author information

Contributions

U.S., J.P.S. and L.D.N. jointly developed the ideas surrounding specification curve analysis and wrote the manuscript. U.S. developed and implemented the inferential approach to specification curve analysis and conducted all analyses.

Corresponding author

Correspondence to Uri Simonsohn.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Primary handling editor: Stavroula Kousta

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Notes 1–5, Supplementary Figs. 1–10 and references.

Reporting Summary

About this article

Cite this article

Simonsohn, U., Simmons, J.P. & Nelson, L.D. Specification curve analysis. Nat Hum Behav 4, 1208–1214 (2020). https://doi.org/10.1038/s41562-020-0912-z

  • DOI: https://doi.org/10.1038/s41562-020-0912-z
