Psychology journal bans P values

Woolston, Chris

doi:10.1038/519009f


Social Selection
Published: 26 February 2015

Nature volume 519, page 9 (2015)

09 March 2015: This story originally asserted that “The closer to zero the P value gets, the greater the chance the null hypothesis is false.” P values do not give the probability that a null hypothesis is false; they give the probability of obtaining data at least as extreme as those observed if the null hypothesis were true. It is by convention that smaller P values are interpreted as stronger evidence that the null hypothesis is false. The text has been changed to reflect this.

Test for reliability of results ‘too easy to pass’, say editors.

A controversial statistical test has finally met its end, at least in one journal. Earlier this month, the editors of Basic and Applied Social Psychology (BASP) announced that the journal would no longer publish papers containing P values because the statistics were too often used to support lower-quality research¹.

Authors are still free to submit papers to BASP with P values and other statistical measures that form part of ‘null hypothesis significance testing’ (NHST), but the numbers will be removed before publication. Nerisa Dozo, a PhD student in psychology at the University of Queensland in Brisbane, Australia, tweeted:

Jan de Ruiter, a cognitive scientist at Bielefeld University in Germany, tweeted: “NHST is really problematic”, but added that banning all inferential statistics is “throwing away the baby with the p-value”.

P values are widely used in science to test null hypotheses. For example, in a medical study looking at smoking and cancer, the null hypothesis could be that there is no link between the two. Many researchers interpret a lower P value as stronger evidence that the null hypothesis is false. Many also accept findings as ‘significant’ if the P value comes in at less than 0.05. But P values are slippery: significant P values sometimes vanish when experiments and statistical analyses are repeated (see Nature 506, 150–152; 2014).
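To make the 0.05 threshold concrete, here is a minimal simulation sketch. It is an illustration only, not a method from the BASP editorial or from any study mentioned here. It assumes normally distributed data with a known standard deviation, so a simple z-test applies, and the function names `two_sided_p_value` and `false_positive_rate` are hypothetical. Both groups are drawn from the same distribution, so the null hypothesis is true in every simulated experiment, yet roughly one experiment in twenty should still come in below 0.05.

```python
import random
from statistics import NormalDist, fmean


def two_sided_p_value(sample_a, sample_b, sigma=1.0):
    """P value for a difference in means, assuming a known common
    standard deviation `sigma` (a simplifying assumption)."""
    n_a, n_b = len(sample_a), len(sample_b)
    standard_error = sigma * (1 / n_a + 1 / n_b) ** 0.5
    z = (fmean(sample_a) - fmean(sample_b)) / standard_error
    # Probability of a |z| at least this large if the null hypothesis is true.
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_experiments=2000, n_per_group=30, alpha=0.05, seed=0):
    """Share of simulated experiments with P < alpha when there is no real effect."""
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_experiments):
        # Both groups come from the same distribution: the null is true.
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if two_sided_p_value(group_a, group_b) < alpha:
            significant += 1
    return significant / n_experiments


if __name__ == "__main__":
    rate = false_positive_rate()
    print(f"Share of null experiments with P < 0.05: {rate:.3f}")
```

In this setup a ‘significant’ result says nothing about a real effect, because none exists. That is one way such a finding can vanish when the experiment is repeated.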

Based on data from Altmetric.com. Altmetric is supported by Macmillan Science and Education, which owns Nature Publishing Group.

In an editorial explaining the new policy, editor David Trafimow and associate editor Michael Marks, who are psychologists at New Mexico State University in Las Cruces, say that P values have become a crutch for scientists dealing with weak data. “We believe that the p < .05 bar is too easy to pass and sometimes serves as an excuse for lower quality research,” they write.

Speaking to Nature, Trafimow says that he would be happy if null hypothesis testing disappeared from all published research: “If scientists are depending on a process that’s blatantly invalid, we should get rid of it.” He admits, however, that he does not know which statistical approach should take its place.

Some puzzled over how scientists are supposed to judge whether work has validity without some statistical rules, and the suggestion that scientists could do away entirely with P values met with some derision online. Sanjay Srivastava, a psychologist at the University of Oregon in Eugene, wryly tweeted that conclusions should be the next thing to be banned. But Srivastava also sees a serious side to the new policy. In another tweet, he said:

Srivastava told Nature that he was pleased to see that several psychology journals — including Psychological Science and the Journal of Research in Personality — recently adopted different standards for data analysis, and that he is keeping an open mind about BASP’s change of course. “A pessimistic prediction is that it will become a dumping ground for results that people couldn’t publish elsewhere,” he says. “An optimistic prediction is that it might become an outlet for good, descriptive research that was undervalued under the traditional criteria.”

De Ruiter says that he doesn’t harbour much love for P values, mostly because they don’t accurately reflect the quality of evidence and can lead to false positives. But he is still “baffled” by the move to get rid of them completely. “I predict this will go wrong,” he says. “You can’t do science without some sort of inferential statistics.”

Trafimow responds that experiments and hypothesis testing had been around for centuries before P values were invented. “I’d rather not have any inferential statistics at all than have some that we know aren’t valid,” he says.


References

1. Trafimow, D. & Marks, M. Basic Appl. Soc. Psych. 37, 1–2 (2015).


About this article

Cite this article

Woolston, C. Psychology journal bans P values. Nature 519, 9 (2015). https://doi.org/10.1038/519009f

Issue Date: 05 March 2015

This article is cited by

Agile software development one year into the COVID-19 pandemic
- Pernilla Ågren
- Eli Knoph
- Richard Berntsson Svensson
Empirical Software Engineering (2022)
Are the statistical tests the best way to deal with the biomarker selection problem?
- Ari Urkullu
- Aritz Pérez
- Borja Calvo
Knowledge and Information Systems (2022)
Viewing “p” through the lens of the philosophy of medicine
- Sara Asato
- James Giordano
Philosophy, Ethics, and Humanities in Medicine (2019)
Statistical inference in abstracts of major medical and epidemiology journals 1975–2014: a systematic review
- Andreas Stang
- Markus Deckert
- Kenneth J. Rothman
European Journal of Epidemiology (2017)
Confidence limits, error bars and method comparison in molecular modeling. Part 2: comparing methods
- A. Nicholls
Journal of Computer-Aided Molecular Design (2016)
