Host genetics are known to influence the gut microbiome, yet their role remains poorly understood. To robustly characterize these effects, we performed a genome-wide association study of 207 taxa and 205 pathways representing microbial composition and function in 7,738 participants of the Dutch Microbiome Project. Two robust, study-wide significant (P < 1.89 × 10⁻¹⁰) signals near the LCT and ABO genes were found to be associated with multiple microbial taxa and pathways and were replicated in two independent cohorts. The LCT locus associations appeared to be modulated by lactose intake, whereas those at ABO could be explained by participant secretor status determined by their FUT2 genotype. Twenty-two other loci showed suggestive evidence (P < 5 × 10⁻⁸) of association with microbial taxa and pathways. At a more lenient threshold, the number of loci we identified strongly correlated with trait heritability, suggesting that much larger sample sizes are needed to elucidate the remaining effects of host genetics on the gut microbiome.
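The two P-value thresholds quoted above can be related with a little arithmetic. The sketch below assumes that the study-wide threshold is a Bonferroni-style division of the genome-wide threshold by a number of effectively independent traits. That reading is an assumption made for illustration and is not stated in this text. The function names are hypothetical.

```python
# Thresholds quoted in the abstract.
GENOME_WIDE_P = 5e-8      # "suggestive evidence" threshold
STUDY_WIDE_P = 1.89e-10   # "study-wide significant" threshold


def implied_effective_tests(genome_wide_p: float, study_wide_p: float) -> float:
    """Number of independent tests implied IF the study-wide threshold is
    genome_wide_p divided by that number (an assumption, see lead-in)."""
    return genome_wide_p / study_wide_p


def classify(p_value: float) -> str:
    """Label an association P value using the two thresholds above."""
    if p_value < STUDY_WIDE_P:
        return "study-wide significant"
    if p_value < GENOME_WIDE_P:
        return "suggestive"
    return "not significant"


if __name__ == "__main__":
    n_eff = implied_effective_tests(GENOME_WIDE_P, STUDY_WIDE_P)
    print(f"Implied number of independent tests: {n_eff:.1f}")
    for p in (1e-12, 1e-9, 1e-3):
        print(p, classify(p))
```

Under this assumption the implied number of independent tests is well below the 412 traits analyzed (207 taxa plus 205 pathways), which would be consistent with many of the microbial traits being correlated with one another.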
Raw sequencing microbiome data are available at the European Genome-Phenome Archive (accession number EGAS00001005027). Genotyping data and participant metadata are not publicly available to protect participants’ privacy, and neither can be deposited in public repositories to respect the research agreements in the informed consent. The data can be accessed by all bona fide researchers with a scientific proposal by contacting the LifeLines Biobank (instructions at https://www.lifelines.nl/researcher/how-to-apply). Researchers will need to fill in an application form that will be reviewed within 2 weeks. If the proposed research complies with LifeLines regulations, such as noncommercial use and warranty of participants’ privacy, then researchers will receive a financial offer and a data and material transfer agreement to sign. In general, data will be released within 2 weeks after signing the offer and data and material transfer agreement. The data will be released in a remote system (the LifeLines workspace) running on a high-performance computer cluster to ensure data quality and security. The full GWAS summary statistics for all 207 taxa and 205 pathways are instead available for direct download at the NHGRI-EBI GWAS Catalog (https://www.ebi.ac.uk/gwas/) under the study accession numbers GCST90027446-GCST90027857 (accession numbers for each specific taxon and pathway can be found in Supplementary Table 13) or at https://dutchmicrobiomeproject.molgeniscloud.org. The processed microbiome data (taxonomy and pathway abundance per individual) can also be downloaded after filling in a request form available at the same website and after signing a data access agreement.
This study also used the following databases: UniRef90 v.0.1.1 protein database and the ChocoPhlAn pangenome databases available within the Humann2 pipeline (https://huttenhower.sph.harvard.edu/humann2/), the Genome Taxonomy Database (https://gtdb.ecogenomic.org/) and the IEU GWAS database (https://gwas.mrcieu.ac.uk/). All other data supporting the findings of this study are available within the paper and Supplementary Note.
We acknowledge the services of the LifeLines Cohort Study, the contributing research centers delivering data to LifeLines and all the study participants. The LifeLines initiative was made possible by subsidy from the Dutch Ministry of Health, Welfare and Sport; the Dutch Ministry of Economic Affairs; the University Medical Center Groningen (UMCG); the University of Groningen (UG) and the Provinces of the North of the Netherlands (Drenthe, Friesland and Groningen). This project was carried out under LifeLines project number OV18_0464. We thank Mathieu Plateel and Jody Geelderloos-Arends for their contribution in genotyping the LifeLines samples, Kate McIntyre for help developing the manuscript, Marije van der Geest for setting up the website for sharing summary statistics and Patrick Deelen for discussion of results. We also thank the UMCG Genomics Coordination Center, the UG Center for Information Technology and their sponsors (BBMRI-NL and TarGet) for storage and computational infrastructure and Novogene for providing gut metagenome sequencing of all DMP samples. Finally, we thank the UK Biobank for making their resource available. Analyses of UK Biobank data described in this work were carried out under project number 48548 to C.W. The generation and management of genotype data for the LifeLines Cohort Study was supported by the UMCG Genetics LifeLines Initiative. Genotyping quality control was supported by UMCG (HAP grant CD017.0031/ronde 2017-2/nr 324). Metagenomics sequencing of the cohort was mainly funded by the CardioVasculair Onderzoek Nederland (CVON) (grant CVON 2012-03) to M. Hofken (who died in 2016), J.F. and A.Z., as well as other grants to R.K.W. and C.W. (listed below). 
This work was further supported by the collaborative TIMID project LSHM18057-SGF financed by the allowance made available by Top Sector Life Sciences & Health to Samenwerkende Gezondheidsfondsen to stimulate public/private partnerships and cofinancing by health foundations that are part of the Samenwerkende Gezondheidsfondsen (R.K.W.); the Seerave Foundation (R.K.W.); European Research Council (ERC) starting grant 715772 (S.Z.), consolidator grant 101001678 (J.F.) and advanced grant ERC-671274 (C.W.); Netherlands Organization for Scientific Research VIDI grant 016.178.056 (A.Z.), gravitation grant ExposomeNL 024.004.017 (A.Z.), VICI grant VI.C.202.022 (J.F.), gravitation grant The Netherlands Organ-on-Chip Initiative 024.003.001 (C.W.) and Spinoza award NWOSPI 92-266 (C.W.); CVON grant 2018-27 (A.Z. and J.F.); the EurHealth-1Health INTERREG V A 202085 project (H.J.M.H.); Foundation De Cock-Hadders grant 20:20-13 (L.C.); a joint fellowship from the UMCG and China Scholarship Council (CSC201708320268 to L.C.); and Colciencias fellowship ed.783 (E.A.L.-M.).
The authors declare no competing interests.
Peer review information
Nature Genetics thanks A. Franke and T. Zhang for their contribution to the peer review of this work.
Extended Data Fig. 1 Cladogram plot tree of taxonomic relations between bacteria of the class Actinobacteria and their associations with host genetics.
Each node shows a taxonomic level (from outside to inside: phylogenetic group, phylum, class, order, family, genus and species). Note that branch lengths do not represent phylogenetic distance. Inner labels represent the genetic locus. External labels represent the clade. Nodes with dotted lines indicate that the GWAS was not performed for that taxon. Node color corresponds to different levels of significance as described in the legend. a, Associations detected at the MCM6/LCT locus with each taxon, using the most significant P value observed between rs4988235 and rs182549. b, Associations at the ABO locus with each taxon, using the most significant P value observed between rs8176645 and rs550057.
Extended Data Fig. 2 Association at the LCT locus and interaction with lactose intake in other members of family Bifidobacteriaceae.
Relative abundances of taxa, natural log–transformed and adjusted by age and sex, compared between LP (rs182549 C/T or T/T) and LI (rs182549 C/C) participants and among individuals with low or high daily lactose intake levels. The y axis represents the relative abundance of the microbial feature, natural log–transformed and adjusted by age and sex. Density distribution is displayed with violin plots, while boxplots represent summary statistics: the center line represents the median, the box hinges represent the lower and upper quartiles (percentiles 25 and 75) of the distribution, the upper whisker extends to the maximum value no further than 1.5*IQR (where IQR is the interquartile range) from the upper hinge, the lower whisker extends to the minimum value no further than 1.5*IQR from the lower hinge, and data beyond the end of the whiskers are outliers plotted as individual points. a and c, Relative abundances for the taxa between LP and LI participants. b and d, Comparisons of abundance between lactose intake levels, low (<first quartile) and high (≥ first quartile), stratified by lactose persistence status. The distributions are shown for s. Bifidobacterium adolescentis (top) and s. Bifidobacterium longum (bottom). P-values were obtained with a two-sided Wilcoxon rank test. n: number of participants.
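The box plot convention described in this caption (median, quartile hinges, whiskers reaching the most extreme values within 1.5 × IQR of the hinges, and outliers beyond them) can be sketched as follows. This is a minimal illustration and not the authors' plotting code. The function name `boxplot_summary` and the input values are hypothetical, and the quartile interpolation method (`"inclusive"`) is an assumption, since the caption does not specify one.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Summary statistics for a box plot with 1.5 * IQR whiskers."""
    data = sorted(values)
    # Quartiles by linear interpolation; the interpolation rule is an assumption.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations that lie within the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "lower_whisker": min(inside),
        "upper_whisker": max(inside),
        # Points beyond the whiskers are drawn individually.
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


if __name__ == "__main__":
    # Hypothetical values, not data from the study.
    print(boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30]))
```

With the hypothetical input above, the value 30 falls outside the upper fence and is reported as an outlier, while the upper whisker stops at 9.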
Extended Data Fig. 3 Graphical representation of MR results with a Benjamini–Hochberg FDR q value < 0.1.
a, Effect size in standard deviation units of 3 variants associated with Alistipes abundance changes that were used as instrumental variables (effects estimated on 7,728 independent samples) (x axis) versus effect size in standard deviation units of the same variants for salt intake (effects estimated on 462,630 independent samples) (y axis). Error bars represent standard errors (SE) of each effect size (beta + SE and beta − SE). The orange and blue lines represent lines whose slope is the causal estimate from the MR methods IVW and Egger, respectively. b, A plot similar to a, but the x axis shows the effect size in standard deviation units of the instrumental variants selected for Collinsella abundance (effects estimated on 7,210 independent samples) and the y axis shows the effect size for triglyceride levels (effects estimated on 343,992 independent individuals).
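The slopes of the two lines described in this caption can be illustrated with the textbook forms of the estimators. IVW is a weighted regression of the outcome effects on the exposure effects through the origin. MR-Egger fits the same weighted regression with an intercept. The sketch below is not the authors' pipeline. The function names and effect sizes are hypothetical, weights are taken as the inverse variance of the outcome effects, and variants are assumed to be oriented so that exposure effects are positive.

```python
def ivw_slope(beta_x, beta_y, se_y):
    """Inverse-variance weighted estimate: weighted regression through the origin."""
    weights = [1.0 / s ** 2 for s in se_y]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_x, beta_y))
    den = sum(w * bx * bx for w, bx in zip(weights, beta_x))
    return num / den


def egger_fit(beta_x, beta_y, se_y):
    """MR-Egger style fit: weighted regression with an intercept.

    Returns (intercept, slope); the slope is the causal estimate and the
    intercept reflects average directional pleiotropy.
    """
    weights = [1.0 / s ** 2 for s in se_y]
    total = sum(weights)
    mean_x = sum(w * bx for w, bx in zip(weights, beta_x)) / total
    mean_y = sum(w * by for w, by in zip(weights, beta_y)) / total
    sxx = sum(w * (bx - mean_x) ** 2 for w, bx in zip(weights, beta_x))
    sxy = sum(w * (bx - mean_x) * (by - mean_y)
              for w, bx, by in zip(weights, beta_x, beta_y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


if __name__ == "__main__":
    # Hypothetical per-variant effects, not values from the study.
    bx = [0.10, 0.15, 0.20]        # variant effects on the exposure (x axis)
    by = [0.021, 0.029, 0.041]     # variant effects on the outcome (y axis)
    se = [0.010, 0.012, 0.011]     # standard errors of the outcome effects
    print("IVW slope:", ivw_slope(bx, by, se))
    print("Egger (intercept, slope):", egger_fit(bx, by, se))
```

With only three instruments, as in panel a, the Egger fit has a single residual degree of freedom, so its slope is expected to be much less stable than the IVW estimate.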
Cite this article
Lopera-Maya, E.A., Kurilshikov, A., van der Graaf, A. et al. Effect of host genetics on the gut microbiome in 7,738 participants of the Dutch Microbiome Project. Nat Genet 54, 143–151 (2022). https://doi.org/10.1038/s41588-021-00992-y