Abstract
The scientific literature is full of articles discussing the poor reproducibility of findings from animal experiments as well as failures to translate results from preclinical animal studies to clinical trials in humans. Critics even go so far as to talk about a “reproducibility crisis” in the life sciences, a new buzzword that increasingly finds its way into high-impact journals. Viewed from a cynical perspective, Fett's law of the lab, “Never replicate a successful experiment”, has thus taken on a completely new meaning. So far, poor reproducibility and translational failures in animal experimentation have mostly been attributed to biased animal data, methodological pitfalls, current publication ethics and animal welfare constraints. More recently, the concept of standardization has also been identified as a potential source of these problems. By reducing within-experiment variation, rigorous standardization regimes limit the inference to the specific experimental conditions. As a consequence, individual phenotypic plasticity is largely neglected, resulting in statistically significant but possibly irrelevant findings that are not reproducible under slightly different conditions. By contrast, systematic heterogenization has been proposed as a concept to improve the representativeness of study populations, contributing to improved external validity and hence improved reproducibility. While some first heterogenization studies are indeed very promising, it is still not clear how this approach can be transferred into practice in a logistically feasible and effective way. Thus, further research is needed to explore different heterogenization strategies as well as alternative routes toward better reproducibility in animal experimentation.
Acknowledgements
Current research on heterogenization and reproducibility is funded by the German Research Foundation (DFG, RI 2488/3-1). Furthermore, I would like to thank Norbert Sachser, Hanno Würbel, Sara Hintze, Niklas Kästner, and Vanessa von Kortzfleisch for their helpful comments on earlier drafts of this manuscript.
Ethics declarations
Competing interests
The author declares no competing financial interests.
About this article
Cite this article
Richter, S. Systematic heterogenization for better reproducibility in animal experimentation. Lab Anim 46, 343–349 (2017). https://doi.org/10.1038/laban.1330
DOI: https://doi.org/10.1038/laban.1330