Nature | News

Statisticians issue warning over misuse of P values

Policy statement aims to halt missteps in the quest for certainty.

Misuse of the P value, a common test for judging the strength of scientific evidence, is contributing to the number of research findings that cannot be reproduced, the American Statistical Association (ASA) warns in a statement released today [1]. The group has taken the unusual step of issuing principles to guide use of the P value, which it says cannot determine whether a hypothesis is true or whether results are important.

This is the first time that the 177-year-old ASA has made explicit recommendations on such a foundational matter in statistics, says executive director Ron Wasserstein. The society’s members had become increasingly concerned that the P value was being misapplied in ways that cast doubt on statistics generally, he adds.

In its statement, the ASA advises researchers to avoid drawing scientific conclusions or making policy decisions based on P values alone. Researchers should describe not only the data analyses that produced statistically significant results, the society says, but all statistical tests and choices made in calculations. Otherwise, results may seem falsely robust.
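
One way to see why undisclosed tests can make a result seem falsely robust is a short calculation. The sketch below is not from the ASA statement. It assumes that every test is independent and that every null hypothesis is true, and the function name is hypothetical. Under those assumptions, it gives the chance that at least one test crosses the 0.05 threshold by luck alone.

```python
def chance_of_any_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of `num_tests` tests gives P <= alpha.

    Assumes (for illustration only) that the tests are independent and
    that every null hypothesis is true, so each test has probability
    `alpha` of a false positive.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


for k in (1, 5, 20):
    print(k, round(chance_of_any_false_positive(k), 3))
# prints:
# 1 0.05
# 5 0.226
# 20 0.642
```

Under these assumptions, a researcher who runs 20 such tests and reports only the one with P below 0.05 had roughly a two-in-three chance of finding something to report, which is why the full list of analyses matters.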

Véronique Kiermer, executive editor of the Public Library of Science journals, says that the ASA’s statement lends weight and visibility to longstanding concerns over undue reliance on the P value. “It is also very important in that it shows statisticians, as a profession, engaging with the problems in the literature outside of their field,” she adds.

Weighing the evidence

P values are commonly used to test (and dismiss) a ‘null hypothesis’, which generally states that there is no difference between two groups, or that there is no correlation between a pair of characteristics. The smaller the P value, the less likely an observed set of values would occur by chance — assuming that the null hypothesis is true. A P value of 0.05 or less is generally taken to mean that a finding is statistically significant and warrants publication. But that is not necessarily true, the ASA statement notes.

A P value of 0.05 does not mean that there is a 95% chance that a given hypothesis is correct. Instead, it signifies that if the null hypothesis is true, and all other assumptions made are valid, there is a 5% chance of obtaining a result at least as extreme as the one observed. And a P value cannot indicate the importance of a finding; for instance, a drug can have a statistically significant effect on patients’ blood glucose levels without having a therapeutic effect.
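
The definition above can be made concrete with a permutation test, which is only one of several ways to obtain a P value and is not singled out by the ASA. The sketch below assumes that, under the null hypothesis, group labels are interchangeable. The function name and the glucose readings are hypothetical and invented for illustration.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, num_shuffles=10_000, seed=0):
    """Estimate a two-sided P value for a difference in group means.

    Under the null hypothesis the group labels carry no information,
    so we shuffle them repeatedly and count how often the shuffled
    difference is at least as extreme as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(num_shuffles):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (num_shuffles + 1)


# Hypothetical blood glucose readings (mmol/L), invented for illustration.
treated = [5.1, 5.4, 4.9, 5.0, 5.3, 5.2]
control = [5.6, 5.8, 5.5, 5.9, 5.4, 5.7]
print(permutation_p_value(treated, control))
```

The number printed is the share of label shufflings that produce a gap at least as large as the observed one. It is a statement about the data given the null hypothesis, not the probability that any hypothesis is correct, and it says nothing about whether a gap of that size matters therapeutically.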

Giovanni Parmigiani, a biostatistician at the Dana-Farber Cancer Institute in Boston, Massachusetts, says that misunderstandings about what information a P value provides often crop up in textbooks and practice manuals. A course correction is long overdue, he adds. “Surely if this happened twenty years ago, biomedical research could be in a better place now.”

Frustration abounds

Criticism of the P value is nothing new. In 2011, researchers trying to raise awareness about false positives gamed an analysis to reach a statistically significant finding: that listening to music by the Beatles makes undergraduates younger [2]. More controversially, in 2015, a set of documentary filmmakers published conclusions from a purposely shoddy clinical trial, supported by a robust P value, to show that eating chocolate helps people to lose weight. (The article has since been retracted.)

But Simine Vazire, a psychologist at the University of California, Davis, and editor of the journal Social Psychological and Personality Science, thinks that the ASA statement could help to convince authors to disclose all of the statistical analyses that they run. “To the extent that people might be sceptical, it helps to have statisticians saying, ‘No, you can’t interpret P values without this information,’” she says.

More drastic steps, such as the ban on publishing papers that contain P values instituted by at least one journal, could be counter-productive, says Andrew Vickers, a biostatistician at Memorial Sloan Kettering Cancer Center in New York City. He compares attempts to bar the use of P values to addressing the risk of automobile accidents by warning people not to drive — a message that many in the target audience would probably ignore. Instead, Vickers says that researchers should be instructed to “treat statistics as a science, and not a recipe”.

But a better understanding of the P value will not take away the human impulse to use statistics to create an impossible level of confidence, warns Andrew Gelman, a statistician at Columbia University in New York City.

“People want something that they can't really get,” he says. “They want certainty.”

Journal name: Nature
Volume: 531
Pages: 151
DOI: doi:10.1038/nature.2016.19503

References

  1. Wasserstein, R. L. & Lazar, N. A. The American Statistician, advance online publication (2016).

  2. Simmons, J. P., Nelson, L. D. & Simonsohn, U. Psychol. Sci. 22, 1359–1366 (2011).
