Many animal studies of neurological disease appear to overstate the significance of their results. Credit: Redmond O. Durrell/Alamy

A statistical analysis of more than 4,000 data sets from animal studies of neurological diseases has found that almost 40% of studies reported statistically significant results — nearly twice as many as would be expected on the basis of the number of animal subjects. The results suggest that the published work — some of which was used to justify human clinical trials — is biased towards reporting positive results.

This bias could partly explain why success in preclinical studies so rarely translates into success in human patients, says John Ioannidis, a physician who studies research methodology at Stanford University in California, and a co-author of the study, published today in PLoS Biology [1]. “The results are too good to be true,” he says.

Ioannidis’s team is not the first to find fault with animal studies: others have highlighted how small sample sizes and unblinded studies can skew results. Another key factor is the tendency of researchers to publish only positive results, leaving negative findings buried in lab notebooks. Creative analyses, such as selecting whichever statistical technique gives the best result, are also likely culprits, says Ioannidis.

These problems can affect patient care in hospitals, cautions Matthias Briel, an epidemiologist at University Hospital in Basel, Switzerland. Preclinical studies influence clinical guidelines when human data are lacking, he says. “A lot of clinical researchers are not aware that animal studies are not as well planned as clinical trials,” he adds.

Significantly skewed

Ioannidis and his colleagues mined a database of meta-analyses — analyses of data from multiple studies — of neurological disease research on animals. They focused on 160 meta-analyses of Alzheimer’s disease, Parkinson’s disease and spinal-cord injury, among others.

The researchers then estimated the expected number of statistically significant findings in each meta-analysis, using its largest study as a reference. Studies with the largest sample sizes are considered the most precise, so the assumption was that these would best approximate the true effectiveness of a given intervention. Given that reference effect, each study's statistical power gives the probability that it would produce a significant result, and summing those probabilities across studies yields the expected count.

Of the 4,445 studies, 919 were expected to be significant. But nearly twice as many — 1,719 — reported significant findings. Among the groups most likely to report an inflated number of significant findings were studies with the smallest sample sizes, and those with a corresponding author who reported a financial conflict of interest.
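The logic of that comparison can be sketched in a few lines of code. The snippet below is a minimal illustration, not the authors' actual analysis: it assumes two-sided z-tests at a 0.05 significance level and normally distributed effect estimates, and the study data and function names are made up for the example.

```python
# A minimal sketch of the excess-significance idea described above, not the
# authors' actual code. Assumptions: two-sided z-tests at alpha = 0.05,
# normally distributed effect estimates, and the most precise study's
# effect treated as the "true" effect. All numbers are hypothetical.
from statistics import NormalDist

def power(true_effect, se, alpha=0.05):
    """Probability that a study with standard error `se` finds p < alpha
    when the true effect is `true_effect` (two-sided z-test)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # 1.96 for alpha = 0.05
    z = true_effect / se
    # Chance the observed z-statistic lands beyond either critical value.
    return (1 - NormalDist().cdf(z_crit - z)) + NormalDist().cdf(-z_crit - z)

def expected_significant(studies, alpha=0.05):
    """`studies` is a list of (effect_estimate, standard_error) pairs.
    Take the most precise study (smallest SE) as the reference effect,
    then sum each study's power to get the expected number of
    statistically significant results."""
    ref_effect = min(studies, key=lambda s: s[1])[0]
    return sum(power(ref_effect, se, alpha) for _, se in studies)

# Hypothetical meta-analysis: (effect estimate, standard error) per study.
studies = [(0.4, 0.10), (0.9, 0.35), (0.7, 0.30), (1.1, 0.40)]
observed = sum(abs(effect) / se > 1.96 for effect, se in studies)
expected = expected_significant(studies)
print(f"observed significant: {observed}, expected: {expected:.2f}")
# -> observed significant: 4, expected: 1.62
```

In this toy example all four studies cross the significance threshold, yet only about 1.6 would be expected to if the most precise study reflects the true effect, mirroring in miniature the 1,719-versus-919 gap the team reported.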

The study does not mean that animal studies are meaningless, says Ioannidis, but rather that they should be better controlled and reported. He and his co-authors advocate a registry for animal studies, akin to clinical-trial registries, as a way to publicize negative findings and detailed research protocols.

Briel would also like to see the standards of clinical research applied to preclinical studies. Clinical trials are often blinded, use predetermined sample sizes and analysis methods, and are stopped early only at pre-specified interim-analysis points. Animal studies, by contrast, rarely follow these practices, says Briel.

“These quality-control methods should be introduced more often into preclinical research,” he says. “Preclinical researchers have to realize that their experiments can already have implications for clinical decisions.”