This article has been updated


Although the gut microbiome plays important roles in host physiology, health and disease¹, we lack understanding of how the complex interplay between host genetics and early life environment shapes the microbial and metabolic composition of the gut. We used the genetically diverse Collaborative Cross mouse system² to discover that early life history impacts the microbiome composition, whereas dietary changes have only a moderate effect. By contrast, the gut metabolome was shaped mostly by diet, with specific non-dietary metabolites explained by microbial metabolism. Quantitative trait analysis identified mouse genetic trait loci (QTL) that impact the abundances of specific microbes. Human orthologues of genes in the mouse QTL are implicated in gastrointestinal cancer. Additionally, genes located in mouse QTL for Lactobacillales abundance are implicated in arthritis, rheumatic disease and diabetes. Furthermore, Lactobacillales abundance was predictive of higher host T-helper cell counts, suggesting an important link between Lactobacillales and host adaptive immunity.


Change history

  • 14 July 2017

    In the PDF version of this article previously published, the year of publication provided in the footer of each page and in the 'How to cite' section was erroneously given as 2017; it should have been 2016. This error has now been corrected. The HTML version of the article was not affected.


References

  1. The impact of the gut microbiota on human health: an integrative view. Cell 148, 1258–1270 (2012).
  2. Collaborative Cross Consortium. The genome architecture of the Collaborative Cross mouse genetic reference population. Genetics 190, 389–401 (2012).
  3. et al. MHC variation sculpts individualized microbial communities that control susceptibility to enteric infection. Nat. Commun. 6, 8642 (2015).
  4. et al. Human genetics shape the gut microbiome. Cell 159, 789–799 (2014).
  5. et al. Murine gut microbiota is defined by host genetics and modulates variation of metabolic traits. PLoS ONE 7, e39191 (2012).
  6. et al. Individuality in gut microbiota composition is a complex polygenic trait shaped by multiple environmental and host genetic factors. Proc. Natl Acad. Sci. USA 107, 18933–18938 (2010).
  7. Host genetic architecture and the landscape of microbiome composition: humans weigh in. Genome Biol. 16, 203 (2015).
  8. Yogurt containing probiotic Lactobacillus rhamnosus GR-1 and L. reuteri RC-14 helps resolve moderate diarrhea and increases CD4 count in HIV/AIDS patients. J. Clin. Gastroenterol. 42, 239–243 (2008).
  9. Use of probiotics in HIV-infected children: a randomized double-blind controlled study. J. Trop. Pediatr. 54, 19–24 (2008).
  10. et al. Ingestion of Lactobacillus strain regulates emotional behavior and central GABA receptor expression in a mouse via the vagus nerve. Proc. Natl Acad. Sci. USA 108, 16050–16055 (2011).
  11. et al. Lactobacilli activate human dendritic cells that skew T cells toward T helper 1 polarization. Proc. Natl Acad. Sci. USA 102, 2880–2885 (2005).
  12. et al. Genome-wide trans-ancestry meta-analysis provides insight into the genetic architecture of type 2 diabetes susceptibility. Nat. Genet. 46, 234–244 (2014).
  13. et al. PROX1 gene variant is associated with fasting glucose change after antihypertensive treatment. Pharmacotherapy 34, 123–130 (2014).
  14. et al. Genome-wide association study of a heart failure related metabolomic profile among African Americans in the Atherosclerosis Risk in Communities (ARIC) study. Genet. Epidemiol. 37, 840–845 (2013).
  15. et al. Combined linkage and association analyses identify a novel locus for obesity near PROX1 in Asians. Obesity 21, 2405–2412 (2013).
  16. et al. A genome-wide approach accounting for body mass index identifies genetic variants influencing fasting glycemic traits and insulin resistance. Nat. Genet. 44, 659–669 (2012).
  17. et al. Effects of Lactobacillus casei supplementation on disease activity and inflammatory cytokines in rheumatoid arthritis patients: a randomized double-blind clinical trial. Int. J. Rheum. Dis. 17, 519–527 (2014).
  18. et al. Clinical application of probiotics in diabetes mellitus: therapeutics and new perspectives. Crit. Rev. Food Sci. Nutr. (2015).
  19. et al. A Catalog of Published Genome-Wide Association Studies;
  20. et al. FiehnLib: mass spectral and retention index libraries for metabolomics based on quadrupole and time-of-flight gas chromatography/mass spectrometry. Anal. Chem. 81, 10038–10048 (2009).
  21. et al. Metabolic model-based integration of microbiome taxonomic and metabolomic profiles elucidates mechanistic links between ecological and metabolic variation. mSystems 1, e00013-15 (2016).
  22. et al. Status and access to the Collaborative Cross population. Mamm. Genome 23, 706–712 (2012).
  23. The Collaborative Cross, developing a resource for mammalian systems genetics: a status report of the Wellcome Trust cohort. Mamm. Genome 19, 379–381 (2008).
  24. Establishment of ‘The Gene Mine’: a resource for rapid identification of complex trait genes. Mamm. Genome 19, 390–393 (2008).
  25. et al. The Collaborative Cross at Oak Ridge National Laboratory: developing a powerful resource for systems genetics. Mamm. Genome 19, 382–389 (2008).
  26. et al. Ultra-high-throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms. ISME J. 6, 1621–1624 (2012).
  27. et al. Improved bacterial 16S rRNA gene (V4 and V4–5) and fungal internal transcribed spacer marker gene primers for microbial community surveys. mSystems 1, e00009-15 (2015).
  28. et al. QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 7, 335–336 (2010).
  29. ea-utils: Command-Line Tools for Processing Biological Sequencing Data (Expression Analysis, 2011);
  30. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460–2461 (2010).
  31. UCHIME improves sensitivity and speed of chimera detection. Bioinformatics 27, 2194–2200 (2011).
  32. vsearch: VSEARCH Version 1.1.3 (2015);
  33. et al. An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea. ISME J. 6, 610–618 (2012).
  34. et al. PyNAST: a flexible tool for aligning sequences to a template alignment. Bioinformatics 26, 266–267 (2010).
  35. FastTree 2—approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).
  36. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
  37. UniFrac: a new phylogenetic method for comparing microbial communities. Appl. Environ. Microbiol. 71, 8228–8235 (2005).
  38. Phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data. PLoS ONE 8, e61217 (2013).
  39. ggplot2: Elegant Graphics for Data Analysis (Springer, 2010).
  40. R-Core-Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2016);
  41. Creating reference gene annotation for the mouse C57BL6/J genome assembly. Mamm. Genome 26, 366–378 (2015).
  42. et al. The Mouse Genome Database (MGD): facilitating mouse as a model for human biology and disease. Nucleic Acids Res. 43, D726–D736 (2015).
  43. Ggbio: an R package for extending the grammar of graphics for genomic data. Genome Biol. 13, R77 (2012).
  44. et al. Circos: an information aesthetic for comparative genomics. Genome Res. 19, 1639–1645 (2009).
  45. et al. Identification of genetic factors that modify motor performance and body weight using Collaborative Cross mice. Sci. Rep. 5, 16247 (2015).
  46. Classification and regression by randomForest. R News 2, 18–22 (2002).
  47. et al. Importance of sulfur-containing metabolites in discriminating fecal extracts between normal and type-2 diabetic mice. J. Proteome Res. 13, 4220–4231 (2014).
  48. et al. Salmonella modulates metabolism during growth under conditions that induce expression of virulence genes. Mol. Biosyst. 9, 1522–1534 (2013).
  49. et al. MetaboliteDetector: comprehensive analysis tool for targeted and nontargeted GC/MS based metabolome analysis. Anal. Chem. 81, 3429–3439 (2009).
  50. et al. Proposed minimum reporting standards for chemical analysis. Metabolomics 3, 211–221 (2007).
  51. et al. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat. Biotechnol. 31, 814–821 (2013).
  52. MUSiCC: a marker genes based framework for metagenomic normalization and accurate profiling of gene abundances in the microbiome. Genome Biol. 16, 27 (2015).



Acknowledgements

The authors thank S.E. Cates, N.N. Robinson and G.D. Shaw in the Systems Genetics Core at UNC for technical assistance and M.H. Stoiber for helpful discussions, especially regarding statistical analysis. This work was primarily supported by funding from the Office of Naval Research under ONR contract N0001415IP00021 (J.J., J.H.M. and A.M.S.). Additional support was provided by the Low Dose Scientific Focus Area, Office of Biological and Environmental Research, US Department of Energy (G.K., J.H.M. and A.M.S.) and the Lawrence Berkeley National Laboratory Directed Research and Development (LDRD) program funding under the Microbes to Biomes (M2B) initiative (S.C., B.B., G.K., J.H.M. and A.M.S.). C.N. was supported by an NSF IGERT DGE-1258485 fellowship and in part by New Innovator Award DP2 AT007802-01 to E.B. Partial support was also provided under the Microbiomes in Transition (MinT) Initiative as part of the Laboratory Directed Research and Development Program at PNNL. Metabolomic measurements were performed in the Environmental Molecular Sciences Laboratory, a national scientific user facility sponsored by the US DOE OBER and located at PNNL in Richland, Washington. PNNL and LBNL are multi-program national laboratories operated by Battelle for the DOE under contract DE-AC05-76RLO 1830 and the University of California for the DOE under contract DE-AC02-05CH11231, respectively.

Author information

Author notes

    Antoine M. Snijders, Sasha A. Langley and Young-Mo Kim contributed equally to this work.


Affiliations

  1. Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA

    Antoine M. Snijders, Sasha A. Langley, Yurong Huang, Gary H. Karpen & Jian-Hua Mao

  2. Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, USA

    Young-Mo Kim, Colin J. Brislawn, Erika M. Zink, Sarah J. Fansler, Cameron P. Casey, Janet K. Jansson & Thomas O. Metz

  3. Department of Genome Sciences, University of Washington, Seattle, Washington 98105, USA

    Cecilia Noecker & Elhanan Borenstein

  4. Systems Genetics Core Facility, Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA

    Darla R. Miller

  5. Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA

    Gary H. Karpen

  6. Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA

    Susan E. Celniker & James B. Brown

  7. Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, USA

    Elhanan Borenstein

  8. Santa Fe Institute, Santa Fe, New Mexico 87501, USA

    Elhanan Borenstein




Contributions

A.M.S., J.-H.M. and J.K.J. conceived and designed the study. A.M.S. and J.-H.M. performed the mouse experiments, acquired the data, performed data analysis, interpreted results and co-wrote the manuscript. S.A.L. performed data analysis, interpreted results and co-wrote the manuscript. T.O.M. and Y.-M.K. performed metabolome data analysis, interpreted results and co-wrote the manuscript. C.J.B. performed microbiome data analysis and interpreted results. C.N. performed metabolic modelling-based taxonomic and metabolomics integration. E.M.Z. prepared microbiome samples and performed GC–MS-based metabolomics analysis. S.J.F. carried out microbiome sequencing. C.P.C. performed metabolome data analysis and interpreted results. D.R.M. acquired data. Y.H. performed in vivo experiments and collected data. G.H.K. and S.E.C. interpreted results and co-wrote the manuscript. J.B.B. supervised the integrative data analysis, interpreted results and co-wrote the manuscript. E.B. performed data analysis, interpreted results and co-wrote the manuscript. All authors read and approved the final manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Janet K. Jansson, Thomas O. Metz or Jian-Hua Mao.

Supplementary information

PDF files

  1. Supplementary Information

    Supplementary Figures 1–10, Supplementary Tables 1–12

Excel files

  1. Supplementary Table 2

    Normalized amplicon abundance.

  2. Supplementary Table 3

    Differentially abundant fecal operational taxonomic units (OTUs) between animal facility built environments (BE1 vs BE2).

  3. Supplementary Table 4

    P-values for each genetic locus obtained using the Mann-Whitney U test for all OTUs.

  4. Supplementary Table 5

    Joint QTL intervals and candidate genes.

  5. Supplementary Table 6

    Linkage analysis of microbial families.

  6. Supplementary Table 7

    Candidate genes in genetic loci associated with specific microbial families.

  7. Supplementary Table 10

    Metabolomics data, including the original intensities of the metabolites detected in murine feces and their z-score-transformed values, in separate tabs.

  8. Supplementary Table 11

    Metabolite profiles in fecal samples of four CC strains maintained on different diets.

  9. Supplementary Table 12

    A list of all metabolites assayed and analyzed in terms of community metabolic potential for each subset of the data, detailing correlations between metabolomics data and community metabolic potential scores and potential taxonomic contributors.
