In my view, the proposal to retire statistical significance conflates two problems (V. Amrhein et al. Nature 567, 305–307; 2019). These should be addressed separately.
One problem concerns the value of having a term that signifies whether an experiment provides evidence of an effect, that is, whether it achieves ‘statistical significance’.
The second problem involves defining statistical significance as, say, P < 0.05. Many scientists object to this threshold because it can prevent publication of experiments when P > 0.05 (see, for example, D. Lakens et al. Nature Hum. Behav. 2, 168–171; 2018). I have the opposite concern. Careful analysis of P values close to 0.05 shows that they don’t provide strong evidence for a genuine association (D. J. Benjamin et al. Nature Hum. Behav. 2, 6–10; 2018). At best, they provide only weak evidence against the null hypothesis of no association.
By focusing on the term ‘statistical significance’, we ignore the more important issue of what constitutes sufficient evidence of a true association. Let’s have that discussion and redefine what we mean by a statistically significant finding.
Nature 567, 461 (2019)