Test for reliability of results ‘too easy to pass’, say editors.
A controversial statistical test has finally met its end, at least in one journal. Earlier this month, the editors of Basic and Applied Social Psychology (BASP) announced that the journal would no longer publish papers containing P values because the statistics were too often used to support lower-quality research (ref. 1).
Authors are still free to submit papers to BASP with P values and other statistical measures that form part of ‘null hypothesis significance testing’ (NHST), but the numbers will be removed before publication. Nerisa Dozo, a PhD student in psychology at the University of Queensland in Brisbane, Australia, tweeted:
Jan de Ruiter, a cognitive scientist at Bielefeld University in Germany, tweeted: “NHST is really problematic”, but added that banning all inferential statistics is “throwing away the baby with the p-value”.
P values are widely used in science to test null hypotheses. For example, in a medical study looking at smoking and cancer, the null hypothesis could be that there is no link between the two. Many researchers interpret a lower P value as stronger evidence that the null hypothesis is false. Many also accept findings as ‘significant’ if the P value comes in at less than 0.05. But P values are slippery, and sometimes, significant P values vanish when experiments and statistical analyses are repeated (see Nature 506, 150–152; 2014).
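To make the idea concrete, here is a minimal sketch of how a P value can be computed. It is not drawn from the BASP editorial or from any study mentioned here; it assumes a hypothetical coin-flip experiment in which the null hypothesis is that the coin is fair. The function name `two_sided_p_value` and the numbers used are illustrative assumptions only.

```python
from math import comb


def two_sided_p_value(heads: int, flips: int) -> float:
    """Exact two-sided P value for a fair-coin null hypothesis.

    Returns the probability, assuming the coin is fair, of seeing a
    head count at least as far from flips/2 as the one observed.
    """
    # Distance from the expected count, doubled so it stays an integer.
    observed_gap = abs(2 * heads - flips)
    # Count the equally likely flip sequences that are at least as extreme.
    extreme_sequences = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(2 * k - flips) >= observed_gap
    )
    return extreme_sequences / 2 ** flips


# Hypothetical data: 15 heads in 20 flips falls just under the 0.05 bar.
print(round(two_sided_p_value(15, 20), 4))  # 0.0414
# One head fewer, and the result is no longer 'significant'.
print(round(two_sided_p_value(14, 20), 4))  # 0.1153
```

In this toy case a single flip separates a 'significant' result from a non-significant one, which hints at why a finding that sits near the 0.05 threshold may not survive a repeat of the experiment.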
In an editorial explaining the new policy, editor David Trafimow and associate editor Michael Marks, who are psychologists at New Mexico State University in Las Cruces, say that P values have become a crutch for scientists dealing with weak data. “We believe that the p < .05 bar is too easy to pass and sometimes serves as an excuse for lower quality research,” they write.
Speaking to Nature, Trafimow says that he would be happy if null hypothesis testing disappeared from all published research: “If scientists are depending on a process that’s blatantly invalid, we should get rid of it.” He admits, however, that he does not know which statistical approach should take its place.
Some puzzled over how scientists are supposed to judge whether work has validity without some statistical rules, and the suggestion that scientists could do away entirely with P values met with some derision online. Sanjay Srivastava, a psychologist at the University of Oregon in Eugene, wryly tweeted that conclusions should be the next thing to be banned. But Srivastava also sees a serious side to the new policy. In another tweet, he said:
Srivastava told Nature that he was pleased to see that several psychology journals — including Psychological Science and the Journal of Research in Personality — recently adopted different standards for data analysis, and that he is keeping an open mind about BASP’s change of course. “A pessimistic prediction is that it will become a dumping ground for results that people couldn’t publish elsewhere,” he says. “An optimistic prediction is that it might become an outlet for good, descriptive research that was undervalued under the traditional criteria.”
De Ruiter says that he doesn’t harbour much love for P values, mostly because they don’t accurately reflect the quality of evidence and can lead to false positives. But he is still “baffled” by the move to get rid of them completely. “I predict this will go wrong,” he says. “You can’t do science without some sort of inferential statistics.”
Trafimow responds that experiments and hypothesis testing had been around for centuries before P values were invented. “I’d rather not have any inferential statistics at all than have some that we know aren’t valid,” he says.
Change history
09 March 2015
This story originally asserted that “The closer to zero the P value gets, the greater the chance the null hypothesis is false.” P values do not give the probability that a null hypothesis is false; they give the probability of obtaining data at least as extreme as those observed if the null hypothesis were true. It is by convention that smaller P values are interpreted as stronger evidence that the null hypothesis is false. The text has been changed to reflect this.
References
Trafimow, D. & Marks, M. Basic Appl. Soc. Psych. 37, 1–2 (2015).
Cite this article
Woolston, C. Psychology journal bans P values. Nature 519, 9 (2015). https://doi.org/10.1038/519009f