Nature | Comment

Preclinical research: Make mouse studies work

More investment to characterize animal models can boost the ability of preclinical work to predict drug effects in humans, says Steve Perrin.

Illustration by Claire Welsh/Nature

Mice take the blame for one of the most uncomfortable truths in translational research. Even after animal studies suggest that a treatment will be safe and effective, more than 80% of potential therapeutics fail when tested in people. Animal models of disease are frequently condemned as poor predictors of whether an experimental drug can become an effective treatment. Often, though, the real reason is that the preclinical experiments were not rigorously designed1, 2.

The series of clinical trials for a potential therapy can cost hundreds of millions of dollars. The human costs are even greater: patients with progressive terminal illnesses may have just one shot at an unproven but promising treatment. Clinical trials typically require patients to commit to a year or more of treatment, during which they are precluded from pursuing other experimental options. Launching a clinical trial without the backing of robust animal data keeps patients out of tests for therapies that may have a better chance of success.

One such group of patients is those with amyotrophic lateral sclerosis (ALS), the fatal neurodegenerative condition also known as Lou Gehrig's or motor neuron disease. Over the past decade, about a dozen experimental treatments have made their way into human trials for ALS. All had been shown to ameliorate disease in an established animal model. All but one failed in the clinic, and the survival benefits of that one are marginal.

At the ALS Therapy Development Institute (TDI) in Cambridge, Massachusetts, we have tested more than 100 potential drugs in an established mouse model of this disease (mostly unpublished work). Many of these drugs had been reported to slow down disease in that same mouse model; none was found to be beneficial in our experiments (see 'Due diligence, overdue'). Eight of these compounds ultimately failed in clinical trials, which together involved thousands of people. One needs to look no further than potential blockbuster indications such as Alzheimer's and cancer to see that the problem persists across diseases.

After nearly a decade of validation work, the ALS TDI introduced guidelines that should reduce the number of false positives in preclinical studies and so prevent unwarranted clinical trials. The recommendations, which pertain to other diseases too, include: rigorously assessing animals' physical and biochemical traits in terms of human disease; characterizing when disease symptoms and death occur and being alert to unexpected variation; and creating a mathematical model to aid experimental design, including how many mice must be included in a study. It is astonishing how often such straightforward steps are overlooked. It is hard to find a publication, for example, in which a preclinical animal study is backed by statistical models to minimize experimental noise.

The experiments necessary for this type of characterization are expensive, time-consuming and will not, in themselves, lead to new treatments. But without this upfront investment, financial resources for clinical trials are being wasted and lives are being lost.

Know your animals

Investigations at the ALS TDI exemplify how initial physiological descriptions of an animal model rarely encompass all salient features, including how closely the model captures what is observed in patients. Such models are often inadequate for studying how a drug affects various aspects of disease.

ALS progression is characterized by a deterioration in the neurons that innervate skeletal muscles. Sequencing and genetic studies implicate RNA-binding proteins as crucial for maintaining the health of motor neurons3. Mouse models expressing a mutant form of the RNA-binding protein TDP43 show hallmark features of ALS: loss of motor neurons, protein aggregation and progressive muscle atrophy4.

But further study of these mice revealed key differences. In patients (and in established mouse models), paralysis progresses over time. However, we did not observe this progression in TDP43-mutant mice. Measurements of gait and grip strength showed that their muscle deficits were in fact mild, and post-mortem examination found that the animals died not of progressive muscle atrophy, but of acute bowel obstruction caused by deterioration of smooth muscles in the gut5. Although the existing TDP43-mutant mice may be useful for studying drugs' effects on certain disease mechanisms, a drug's ability to extend survival would most probably be irrelevant to people.

Scientists who use animal models for translational research must proceed with caution, and be prepared to do further characterizations themselves.

Cancel the noise

ALS TDI scientists performed a meta-analysis on nearly 5,500 mice that had been used in treatment or control groups over four years1. All mice expressed a specific defective version of the SOD1 gene, which is mutated in about 10% of people with inherited ALS. This work, and that of others6, revealed both unexpected variation in the animals, and ways to control for it.

Almost 90% of the mice had an average lifespan of 134 days, give or take 10 days. Careful inspection of animals that lived shorter or longer revealed four factors that produced considerable noise in the data and could have led to spurious conclusions (see ‘Four ways to fight noise’). Crucially, understanding such variation requires careful monitoring of hundreds of mice over several generations.

Four ways to fight noise

Simple steps to avoid spurious conclusions

  • Exclude irrelevant animals. As is often done in clinical trials, subjects that die for reasons unrelated to disease (such as mishandling) should not be counted in results. Reasons for exclusion should be well documented.
  • Balance for gender. Males and females can show differences in symptoms that obscure modest drug effects.
  • Split littermates among experimental groups. Putting siblings into the same treatment group can bias results.
  • Track genes. Genes that induce disease are often not inherited reliably. When copies are lost, symptoms can be less severe and drugs can seem more effective than they are.
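
The littermate and gender steps in this box can be made concrete with a small allocation routine. The Python sketch below is one possible approach under stated assumptions, not the ALS TDI's procedure: the function name `allocate`, the record keys "id", "litter" and "sex", and the example colony are all hypothetical. It shuffles animals within each litter-and-sex stratum and then deals them out to the groups in turn, so siblings of the same sex are split as evenly as the numbers allow.

```python
import random
from collections import Counter, defaultdict


def allocate(mice, groups=("control", "treatment"), seed=0):
    """Assign mice to groups, splitting each (litter, sex) stratum evenly.

    `mice` is a list of dicts with the hypothetical keys "id", "litter", "sex".
    """
    rng = random.Random(seed)  # fixed seed keeps the allocation reproducible
    strata = defaultdict(list)
    for mouse in mice:
        strata[(mouse["litter"], mouse["sex"])].append(mouse["id"])

    assignment = {}
    position = 0  # running count, so leftover animals alternate between groups
    for key in sorted(strata):
        ids = strata[key]
        rng.shuffle(ids)  # random order within the stratum
        for mouse_id in ids:
            assignment[mouse_id] = groups[position % len(groups)]
            position += 1
    return assignment


# Example colony: three litters of mixed sex (all values are made up).
colony = [
    {"id": f"m{i}", "litter": litter, "sex": sex}
    for i, (litter, sex) in enumerate(
        [("A", "F"), ("A", "F"), ("A", "M"), ("A", "M"),
         ("B", "F"), ("B", "M"), ("B", "M"), ("B", "M"),
         ("C", "F"), ("C", "F"), ("C", "F"), ("C", "M")]
    )
]
assignment = allocate(colony)
group_sizes = Counter(assignment.values())
```

Because the groups are dealt in a single running sequence, the group counts within any one stratum differ by at most one animal, and so do the overall group sizes. Recording the seed alongside the results would let another team reproduce the allocation.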

One factor is the failure to exclude animals whose deaths are unrelated to the disease being studied. Other factors are failing to split littermates between control and treatment groups, and not taking gender into account. Male SOD1 mice show symptoms as much as a week before females and die about a week earlier. Given that a week is a 4% variability in survival, such differences could easily be misconstrued as a drug effect.

The fourth factor regards the genes introduced to induce disease. All too often, a disease phenotype is lost as a colony of breeding mice is built up. For many diseases, including ALS, animal models carry multiple copies of the disease-causing gene, and these repeated genes are often not passed on in a stable fashion as cells divide to make gametes. Regular genotyping assays are essential to make sure that mice in subsequent generations do not have fewer copies of the transgene, and therefore less severe disease.

At the ALS TDI we have seen this several times. When first described in 2010, all TDP43-mutant mice died within 200 days7. When we ordered mice from a breeding colony established from those used in this initial publication, the mice lived for up to 400 days without showing signs of disease. To perform the characterization work on TDP43 described above, we first spent several months backcrossing the strain to create a stable phenotype.
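
A routine check of genotyping results can catch this kind of drift before animals enter a study. The Python sketch below only illustrates the idea: the function name `flag_copy_loss`, the 20% tolerance, the reference value of 25 copies and the per-animal estimates are all assumptions made for the example, not values from our colony or a validated cutoff.

```python
def flag_copy_loss(copy_estimates, reference_copies, tolerance=0.2):
    """Return IDs of animals whose transgene copy estimate has dropped.

    copy_estimates: dict of animal ID -> estimated copy number (hypothetical
        values, e.g. from a quantitative genotyping assay).
    reference_copies: copy number expected for the founder line.
    tolerance: fractional drop tolerated before an animal is flagged
        (an illustrative default, not a validated cutoff).
    """
    cutoff = reference_copies * (1 - tolerance)
    return sorted(
        animal for animal, copies in copy_estimates.items() if copies < cutoff
    )


# Made-up estimates for four breeders from a line expected to carry 25 copies.
breeders = {"b01": 24.6, "b02": 17.9, "b03": 25.3, "b04": 12.1}
flagged = flag_copy_loss(breeders, reference_copies=25)
```

Animals flagged in this way would be candidates for exclusion from breeding and from efficacy studies; the appropriate tolerance depends on the assay's precision and would need to be set for each model.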

Characterization can flag more subtle potential problems for translation. This is exemplified by a study showing that lithium can boost survival of SOD1 mice by 30 days, an astoundingly long time8. A small clinical trial showed that it also extended life in people with ALS8. Lithium is already sold to treat bipolar disorder, and many people with ALS began taking the drug off label in the hope of slowing their disease progression. Three separate phase III clinical trials were launched in parallel to assess the drug's effects. These enrolled hundreds of patients at a total cost of well over US$100 million. None of the three trials showed any therapeutic benefit9, 10, 11.

Concurrently, other groups attempted to reproduce the preclinical data and could not12, 13. Although it is difficult to determine why the first study showed such a dramatic effect, its initial results are curious. The median survival time of untreated animals was 20 days shorter than that observed elsewhere, suggesting other anomalies.

For studies that aim to predict treatment benefits, such as extended survival or a delay of symptom progression, a mathematical simulation is in order. This incorporates the variation typically observed in an animal model to calculate how many animals should be assigned to the experimental groups. According to our calculations, highly variable animal models could require hundreds of animals per group; even homogeneous ones require as many as ten.
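
To show the kind of calculation involved, the Python sketch below estimates by simulation how often a study of a given size would detect a true survival benefit. It is a deliberately simple stand-in, not the model used at the ALS TDI. Its assumptions are ours: survival times are drawn from a normal distribution around a 134-day baseline, the standard deviation is treated as known from prior characterization of the colony, groups are compared with a two-sided z-test, and the 10-day effect and 10-day standard deviation in the example are hypothetical inputs. The function names `estimate_power` and `smallest_group_size` are illustrative.

```python
import random
from statistics import NormalDist, fmean


def estimate_power(n_per_group, effect_days, sd_days,
                   baseline_days=134.0, alpha=0.05, n_sims=2000, seed=0):
    """Fraction of simulated studies that detect a true survival shift.

    Survival is drawn from a normal distribution, and the two group means
    are compared with a two-sided z-test that treats sd_days as known
    (for example, from prior characterization of the colony).
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    std_err = sd_days * (2 / n_per_group) ** 0.5
    hits = 0
    for _ in range(n_sims):
        control = [rng.gauss(baseline_days, sd_days) for _ in range(n_per_group)]
        treated = [rng.gauss(baseline_days + effect_days, sd_days)
                   for _ in range(n_per_group)]
        z = (fmean(treated) - fmean(control)) / std_err
        if abs(z) > z_crit:
            hits += 1
    return hits / n_sims


def smallest_group_size(effect_days, sd_days, target_power=0.8, max_n=500):
    """Smallest n per group whose simulated power reaches target_power."""
    for n in range(2, max_n + 1):
        if estimate_power(n, effect_days, sd_days) >= target_power:
            return n
    return None  # target not reached within max_n


# Hypothetical inputs: a 10-day benefit in a colony with a 10-day SD.
n_needed = smallest_group_size(effect_days=10, sd_days=10)
```

Setting `effect_days` to zero gives the false-positive rate of the same design, which should sit near `alpha`. A realistic model would also need to account for censoring, the sex and litter effects described above and non-normal survival times, all of which tend to push the required group size upwards.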

And before assessing a drug's efficacy, researchers should investigate what dose animals can tolerate, whether the drug reaches the relevant tissue at the required dose and how quickly the drug is metabolized or degraded by the body. We estimate that it takes about $30,000 and 6–9 months to characterize the toxicity of a molecule and assess whether enough reaches the relevant tissue and has a sufficient half-life at the target to be potentially effective.

If those results are promising, then experiments to test whether a drug can extend an animal's survival are warranted — this will cost about $100,000 per dose and take around 12 months. At least three doses of the molecule should be tested; this will help to establish that any drug responses are real and suggest what a reasonable dosing level might be.

Thus, even assuming the model has been adequately characterized, an investment of $330,000 is necessary just to determine whether a single drug has reasonable potential to treat disease in humans. This seems worthwhile given that it could take thousands of patients, several years and hundreds of millions of dollars to move a drug through the clinical development process.
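
The arithmetic behind that figure is simple enough to write out. The short Python snippet below just restates the estimates given above (about $30,000 for the initial characterization of a molecule and about $100,000 per dose for survival studies, at three doses); the variable names are ours.

```python
CHARACTERIZATION_COST = 30_000    # tolerability, tissue exposure and half-life work
SURVIVAL_COST_PER_DOSE = 100_000  # one survival study at one dose
DOSES_TESTED = 3                  # minimum number of doses recommended

total_cost = CHARACTERIZATION_COST + SURVIVAL_COST_PER_DOSE * DOSES_TESTED
print(f"${total_cost:,}")  # prints $330,000
```

Testing more than three doses would add roughly $100,000 for each extra dose under the same estimates.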

Community effort

As academic labs shift their focus to translational research, the burden to characterize animal models will fall on them. Although the costs are meagre compared with those of clinical trials, the investment required in time and funds is far beyond what any one lab should be expected to do. This burden and the resulting mouse models should be shared. At the very least, researchers should place new animal models in a public repository so that other teams can repeat the characterization, and share the costs of doing it well.

Public and private agencies should fund characterization studies as a specific project. A good example is the Alzheimer's Disease Neuroimaging Initiative, a large, collaborative study to find diagnostic biomarkers of the disease. Competitive bidding and milestone-driven payments could persuade qualified groups to perform the necessary experiments and to make results publicly available. This is unglamorous work that will never directly lead to a breakthrough or therapy, and is hard to mesh with the aims of a typical grant proposal or graduate student training programme. However, without these investments, more patients and funds will be squandered on clinical trials that are uninformative and disappointing.

Journal name: Nature
Volume: 507
Pages: 423–425
Date published: ()
DOI: 10.1038/507423a

References

  1. Scott, S. et al. Amyotroph. Lateral Scler. 9, 4–15 (2008).

  2. Begley, C. G. & Ellis, L. M. Nature 483, 531–533 (2012).

  3. Ling, S.-C., Polymenidou, M. & Cleveland, D. W. Neuron 79, 416–438 (2013).

  4. Wegorzewska, I., Bell, S., Cairns, N. J., Miller, T. M. & Baloh, R. H. Proc. Natl Acad. Sci. USA 106, 18809–18814 (2009).

  5. Hatzipetros, T. et al. Brain Res. http://dx.doi.org/10.1016/j.brainres.2013.10.013 (2013).

  6. Ludolph, A. C. et al. Amyotroph. Lateral Scler. 11, 38–45 (2010).

  7. Stallings, N. R., Puttaparthi, K., Luther, C. M., Burns, D. K. & Elliott, J. L. Neurobiol. Dis. 40, 404–414 (2010).

  8. Fornai, F. et al. Proc. Natl Acad. Sci. USA 105, 2052–2057 (2008).

  9. UKMND-LiCALS Study Group Lancet Neurol. 12, 339–345 (2013).

  10. Verstraete, E. et al. J. Neurol. Neurosurg. Psychiatr. 83, 557–564 (2012).

  11. Aggarwal, S. P. et al. Lancet Neurol. 9, 481–488 (2010).

  12. Gill, A., Kidd, J., Vieira, F., Thompson, K. & Perrin, S. PLoS ONE 4, e6489 (2009).

  13. Pizzasegola, C. et al. Amyotroph. Lateral Scler. 10, 221–228 (2009).

Author information

Affiliations

  1. Steve Perrin is chief scientific officer at the ALS Therapy Development Institute in Cambridge, Massachusetts, USA.

Comments

5 comments

  1. Ray Greek
    The article by Steve Perrin (Nature 507, 423–425) and the accompanying news item by Erika Check Hayden (http://www.nature.com/news/misleading-mouse-studies-waste-medical-resources-1.14938) regarding mouse models were refreshing in that both acknowledged certain limitations of mouse models when used to study human disease. However both also ignored the more fundamental problem with animal modeling in general: animals and humans are examples of evolved, complex systems. Greek and Hansen summarize this in Progress in Biophysics and Molecular Biology, 2013 (http://www.sciencedirect.com/science/article/pii/S0079610713000539): “While trans-species extrapolation is possible when perturbations concern lower levels of organization or when studying morphology and function on the gross level, one evolved complex system will not be of predictive value for another when the perturbation affects higher levels of organization.” In part, this is because complex systems are exceedingly dependent on initial conditions and evolution dramatically alters the initial condition of an organism. The initial conditions also change through sexual recombination, which is why individual humans respond differently to the same drug and disease. An understanding of humans and animals as evolved complex systems clarifies why animals can be used to model humans in terms of general morphology but fail as models for neuroprotection after stroke, a vaccine against HIV, and drug development in general.
  2. David Hicks
    I congratulate Steve Perrin on his willingness to look for ways of improving medical research. I would like to propose a couple of extra possibilities for him to consider. The design of preclinical experiments is important, but the design is not the only thing. There is a need to understand the disease process in individual phenotypes before conducting preclinical experiments. There is a need to know your enemy before you enter the fight. ALS/MND is made up of multiple phenotypes of disease requiring multiple treatments. For example, TDP-43 mutant mice may have relevance to bulbar onset ALS, but not to LMN onset ALS. An Mn SOD 2 mutant mouse may produce the most relevant results for LMN onset sporadic ALS, but not bulbar onset sALS. This example is based on the proposal that sporadic LMN ALS is initiated in skeletal muscle and spreads to motor neurons. It is time to drop the "one size fits all" approach in ALS research.
  3. Miriam Meisler
    Dr. Perrin makes several good suggestions. However, it should be pointed out that most of the patients in the ALS trials did not carry mutations in SOD1, while most or all of the mice were SOD1 mutants. That difference is probably relevant to the lack of success of the clinical trials.
  4. Harsha Radhakrishnan
    Remove the unnecessary burden of "publish or perish", the vicious cycle of needing funding to write papers but needing papers to get funding, from the shoulders of these researchers, and everyone will have the time to carry out a complete, rigorous study spanning many years. If PIs must constantly be on the lookout for more sources of money (grants), then they are forced to spend time writing grants and, to be evaluated well (only about the top 5-10% get funded these days), will need to publish a lot of papers. Academic research is unforgiving. Maybe this needs to be tweaked first before we look into more mouse-based trials. Till then the industry has to do the heavy lifting. The diseased few will always be caught in the middle, for now.
  5. Sijumon Kunjachan
    late revelation!
