Association of host genome with intestinal microbial composition in a large healthy cohort


The intestinal microbiota is known to be important in health and disease. Its composition is influenced by both environmental and host factors. Few large-scale studies have evaluated the association between host genetic variation and the composition of the microbiota. We recruited a cohort of 1,561 healthy individuals, of whom 270 belonged to 123 families, and found that almost one-third of fecal bacterial taxa were heritable. In addition, we identified 58 SNPs associated with the relative abundance of 33 taxa in 1,098 discovery subjects. Among these, four loci were replicated in a second cohort of 463 subjects: rs62171178 (nearest gene UBR3) associated with Rikenellaceae, rs1394174 (CNTN6) associated with Faecalibacterium, rs59846192 (DMRTB1) associated with Lachnospira, and rs28473221 (SALL3) associated with Eubacterium. After correction for multiple testing, 6 of the 58 associations remained significant, one of which replicated. These results identify associations between specific genetic variants and the gut microbiome.

Figure 1: Taxonomic representation of heritable or SNP-associated taxa.
Figure 2: Beeswarm plots of replicated taxon–SNP associations.
Figure 3: Regional association plot of rs62171178 with Rikenellaceae.


Acknowledgements

We thank the members of the GEM Global Project Office, C. Bravi, D. Couchman, N. Ganeswaren, A. Keludjian, K. Ow, R. Caplan, M. Greaves, A. Craig-Neil, A. Olteanu, N. Allam, A. Garrioch, D. Ng, V. Onay, and I. Yeadon, for administrative support. We thank D. Cvitkovitch for his helpful scientific discussion of the project. All authors disclose no potential conflicts (financial, professional, or personal) that are relevant to the manuscript. This study was supported by grants from Crohn's and Colitis Canada, Canadian Institutes of Health Research (CIHR) grant CMF108031 and the Helmsley Charitable Trust. W.T. is the recipient of a Postdoctoral Fellowship Research Award from CIHR Fellowship/Canadian Association of Gastroenterology (CAG)/Ferring Pharmaceuticals, Inc., and a fellowship from the Department of Medicine, Mount Sinai Hospital, Toronto. M.S.S. is supported in part by the Gale and Graham Wright Chair in Digestive Diseases. D.K. is the recipient of a CIHR/CAG/Abbvie IBD Fellowship Award. L.X. is the recipient of a CIHR STAGE fellowship. K.S. is the recipient of an Ontario Graduate Scholarship.

Author information

A.D.P. and K.C. contributed equally. A.D.P. and K.C. jointly supervised research. W.T., M.S.S., A.D.P., K.C., and the GEM Project Steering Committee conceived and designed the experiments. W.T., W.X., L.X., and K.C. performed the experiments. W.T., O.E.-G., W.X., L.X., and A.D.P. performed statistical analysis. W.T., M.S.S., M.I.S., W.X., G.M.-H., D.K., K.S., O.E.-G., D.S.G., L.X., A.D.P., and K.C. analyzed the data. W.X., L.X., A.D.P., A.G., R.P., A.O., and the GEM Project Consortium contributed reagents, materials, and/or analysis tools and contributed significant subject recruitment. W.T., A.D.P., and K.C. wrote the manuscript.

Corresponding authors

Correspondence to Andrew D Paterson or Kenneth Croitoru.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–8 and Supplementary Note. (PDF 3576 kb)

Supplementary Table 1

Geographic origin of individuals in the discovery and replication cohorts. (XLSX 18 kb)

Supplementary Table 2

Heritability assessment of bacterial taxa, the Chao1, Shannon, and PD whole-tree α diversity indices, and the microbial dysbiosis index in 123 independent families. (XLSX 42 kb)

Supplementary Table 3

P values of association of the bacterial α diversity measure with SNPs in the discovery cohort. (XLS 102 kb)

Supplementary Table 4

P values of association of SNPs with the bacterial microbial dysbiosis index in the discovery cohort. (XLSX 18 kb)

Supplementary Table 5

P values of association of the bacterial taxa with 123 IBD-related SNPs in the discovery cohort. (XLS 10046 kb)

Supplementary Table 6

Significant genome-wide associated taxa. (XLSX 55 kb)

Supplementary Table 7

Results of colocalization analysis of rs62171178 with Rikenellaceae relative abundance and PHOSPHO2 expression in the stomach. (XLSX 222 kb)

Supplementary Table 8

P values for association of bacterial COG function with rs62171178. (XLSX 530 kb)

About this article

Cite this article

Turpin, W., Espin-Garcia, O., Xu, W. et al. Association of host genome with intestinal microbial composition in a large healthy cohort. Nat Genet 48, 1413–1417 (2016).
