Statistical significance sets a convenient obstacle to unfounded claims. In my view, removing the obstacle (V. Amrhein et al. Nature 567, 305–307; 2019) could promote bias. Irrefutable nonsense would rule.
More stringent thresholds of significance are needed for most fields, which currently declare statistical significance when P values are less than 0.05 (see, for example, D. J. Benjamin et al. Nature Hum. Behav. 2, 6–10; 2018; J. P. A. Ioannidis J. Am. Med. Assoc. 319, 1429–1430; 2018).
Dichotomous conclusions can be useful for pinning down discoveries of gene variants for osteoporosis, new bosons or carcinogens, say. But focusing on effect sizes can often be better than determining whether an effect exists. That said, I find the “compatibility interval” proposed by Valentin Amrhein et al. potentially confusing — and biases could render the entire interval incompatible with truth.
If rules are set before data collection and analysis, then statistical guidance that is based on appropriate thresholds is helpful. However, post hoc and subjective statistical inference is susceptible to conflicts of interest. A company could, for example, claim that any results somehow support licensing of its product.
Careful thought before a study starts should identify the best fit-for-purpose statistical inference tool, whether frequentist, Bayesian or other, and pre-specify the rules of the game. So, although the obstacle of statistical significance can be surmounted by trickery, removing it altogether is worse.
Nature 567, 461 (2019)