Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals

Abstract

The gut microbiome is shaped by diet and influences host metabolism; however, these links are complex and can be unique to each individual. We performed deep metagenomic sequencing of 1,203 gut microbiomes from 1,098 individuals enrolled in the Personalised Responses to Dietary Composition Trial (PREDICT 1) study, whose detailed long-term diet information, as well as hundreds of fasting and same-meal postprandial cardiometabolic blood marker measurements were available. We found many significant associations between microbes and specific nutrients, foods, food groups and general dietary indices, which were driven especially by the presence and diversity of healthy and plant-based foods. Microbial biomarkers of obesity were reproducible across external publicly available cohorts and in agreement with circulating blood metabolites that are indicators of cardiovascular disease risk. While some microbes, such as Prevotella copri and Blastocystis spp., were indicators of favorable postprandial glucose metabolism, overall microbiome composition was predictive for a large panel of cardiometabolic blood markers including fasting and postprandial glycemic, lipemic and inflammatory indices. The panel of intestinal species associated with healthy dietary habits overlapped with those associated with favorable cardiometabolic and postprandial markers, indicating that our large-scale resource can potentially stratify the gut microbiome into generalizable health levels in individuals without clinically manifest disease.

Fig. 1: The PREDICT 1 study associates gut microbiome structure with habitual diet and blood cardiometabolic markers.
Fig. 2: Food quality, regardless of source, is linked to overall and feature-level composition of the gut microbiome.
Fig. 3: Random forest machine learning models trained on microbial or functional profiles can predict obesity phenotypic markers, even on independent cohorts.
Fig. 4: Fasting and postprandial cardiometabolic responses to standardized test meals associated with the microbiome.
Fig. 5: Species-level segregation into healthy and unhealthy microbial signatures of fasting and postprandial cardiometabolic markers.
Fig. 6: Panel of the 30 species showing the strongest overall correlations with a selection of markers of nutritional and cardiometabolic health.

Data availability

The metagenomes are deposited in the European Nucleotide Archive of the European Bioinformatics Institute under accession no. PRJEB39223. The non-metagenomic data used for analysis in this study are held by the Department of Twin Research at King’s College London. The data can be released to bona fide researchers using our normal procedures overseen by the Wellcome Trust and its guidelines as part of our core funding; this means that data need to be anonymized and conform to GDPR standards. We receive around 100 requests per year for our datasets and have three meetings per month with independent members to assess proposals. The application can be found at https://twinsuk.ac.uk/resources-for-researchers/access-our-data/.

Code availability

Computational analyses were performed using the bioBakery suite of tools; species-level microbial abundances were computed using MetaPhlAn v.3.0 (https://github.com/biobakery/MetaPhlAn). Functional potential profiling was carried out with HUMAnN v.2.0 (https://github.com/biobakery/humann; Methods).


Acknowledgements

We thank the participants of the PREDICT 1 study. We thank N. Atabaki-Pasdar for generating the liver fat score. We thank the staff of Zoe Global, the Department of Twin Research and the Massachusetts General Hospital and all the members of the Segata, Berry and Spector laboratories for their tireless work in contributing to the running of the study, data collection and data processing. We thank Nightingale Health and Affinity Biomarker Laboratories for their support and analytical work. This work was supported by Zoe Global and received support from grants from the Wellcome Trust (no. 212904/Z/18/Z) and Medical Research Council/British Heart Foundation Ancestry and Biological Informative Markers for Stratification of Hypertension (no. MR/M016560/1). The work was also supported by the European Research Council (ERC-STG project MetaPG-716575 to N.S.), MIUR ‘Futuro in Ricerca’ (grant no. RBFR13EWWI_001 to N.S.), the European H2020 program (ONCOBIOME-825410 and MASTER-818368 projects to N.S.), the National Cancer Institute of the National Institutes of Health (grant no. 1U01CA230551 to N.S.) and the Premio Internazionale Lombardia e Ricerca 2019 to N.S. S.E.B. was supported in part by a grant funded by the Biotechnology and Biological Sciences Research Council (grant no. BB/NO12739/1). P.W.F. was supported in part by grants from the European Research Council (grant no. CoG-2015_681742_NASCENT), Swedish Research Council (grant no. IRC15-0067) and Novo Nordisk Foundation. A.T.C. was supported in part as a Stuart and Suzanne Steele MGH Research Scholar. TwinsUK is funded by the Wellcome Trust, Medical Research Council, European Union, Chronic Disease Research Foundation, Zoe Global and the National Institute for Health Research-funded BioResource, Clinical Research Facility and Biomedical Research Centre based at Guy’s and St Thomas’ NHS Foundation Trust in partnership with King’s College London.

Author information

Contributions

J.W., G.H. and T.D.S. obtained the funding. S.E.B., A.M.V., J.W., G.H., H.A.K., R.D., A.T.C., N.S., P.W.F. and T.D.S. designed the study and developed the concept. S.E.B., N.S., F.A., H.A.K., A.T.C., D.A.D. and T.D.S. collected the data. F.A., S.E.B., N.S., L.F., E.L., R.G., M.M., O.M., G.P., C.L.R., M.V.-C., S.O., F.G., A.T., F.B., C.M., A.K., L.D., D.B., A.M.T., C.B., L.W., L.G., J.C.P., S.D. and R.H. analyzed the data. S.E.B., H.A.K., D.A.D., G.H., J.W. and N.S. coordinated the study. F.A., S.E.B., A.M.V., L.H.N., D.A.D., E.L., R.G., J.W., C.G., J.M.O., C.H., P.W.F., T.D.S. and N.S. wrote the manuscript. All authors reviewed and revised the final manuscript.

Corresponding authors

Correspondence to Sarah E. Berry or Nicola Segata.

Ethics declarations

Competing interests

T.D.S., S.E.B., A.M.V., F.A., P.W.F., C.H. and N.S. are consultants to Zoe Global. J.W., G.H., R.D., J.C.P., C.B., R.H., L.F., F.G. and S.D. are or have been employees of Zoe Global. The other authors declare no competing interests.

Additional information

Peer review information Jennifer Sargent was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Alpha diversity linked with personal factors, habitual diet, fasting, and postprandial markers.

a, Microbiome alpha diversity computed using the Shannon index correlated with markers from the four categories: personal, habitual diet, fasting and postprandial. Reported are the five strongest positive and negative Spearman correlations for each category with p < 0.05. All correlations and p-values are available in Supplementary Table 1. b, Inter-sample microbiome distances (beta diversity) were substantially lower, that is, closer, among samples from the same individuals (two weeks apart) than among samples from different individuals. Gut microbial communities in monozygotic twins were slightly more similar than in dizygotic twins (two-sided Mann–Whitney U test, p = 0.06), which, in turn, were more similar than in unrelated individuals (p < 1e-12), even after adjusting for age (p < 1e-12). c, After excluding twin status (that is, non-twin versus monozygotic versus dizygotic twins) from the model, personal factors still accounted for the greatest proportion of variance explained in overall microbial diversity, followed by dietary habits, fasting and postprandial cardiometabolic blood markers (by cumulative stepwise dbRDA). d, Cumulative (left bars) and individual (right bars) contributions for each metadata variable based on Bray–Curtis dissimilarity. Box plots show first and third quartiles (boxes) and the median (middle line); whiskers extend up to 1.5× the interquartile range.
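As a reading aid for the two diversity measures named in this caption, the following minimal Python sketch computes a Shannon index for one sample and a Bray–Curtis dissimilarity between two samples. It is an illustration under stated assumptions, not the study's pipeline: the function names are hypothetical, the natural logarithm is assumed (the caption does not state the log base), and the inputs are plain lists of species abundances.

```python
import math


def shannon_index(abundances):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with non-zero abundance.

    Assumption: natural logarithm; the caption does not state the base.
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length.

    0 means identical composition; 1 means no shared taxa.
    """
    if len(x) != len(y):
        raise ValueError("abundance vectors must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    # Two empty samples are treated as identical (a convention of this sketch).
    return numerator / denominator if denominator > 0 else 0.0
```

A perfectly even community of four species gives a Shannon index of ln 4, and two samples with no species in common give a Bray–Curtis dissimilarity of 1.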

Extended Data Fig. 2 Species-level correlation with single foods.

The figure shows the species-level correlations (Spearman) with single food quantities as estimated from the food frequency questionnaires. Only foods with at least 5 significant associations (q-value ≤ 0.2) are displayed. Species are sorted by the number of significant associations, and the top 30 are reported in the figure.

Extended Data Fig. 3 Top foods, food groups, nutrients, and dietary patterns validated in the PREDICT 1 US cohort.

The RF regression model trained on the PREDICT 1 UK cohort was applied to the PREDICT 1 US participants, validating the associations with food-related variables found in the PREDICT 1 UK cohort.

Extended Data Fig. 4 Performance of random forest regression and classification on microbiome functional potential in predicting fasting measurements, total cholesterol and triglycerides in different lipoproteins.

The figure shows the performance of both RF regression and classification tasks trained on microbiome gene family profiles. a, Prediction of the fasting measurements presented in Fig. 4a, sorted as in Fig. 4a. b,c, Prediction performance for total cholesterol (b) and triglycerides (c) in lipoproteins of different sizes. For each lipoprotein, we considered its concentration values at both fasting and postprandial (6 h) time points, and also the difference (rise) between the postprandial concentration and the fasting one. Box plots show the distribution of the Spearman correlations (left axis) between real and predicted values using RF regression. Box plots show first and third quartiles (boxes) and the median (middle line); whiskers extend up to 1.5× the interquartile range. Circles show the median AUC (right axis) of RF classification in predicting the bottom quartile of the distribution versus the top quartile.
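The classification task in this caption (bottom quartile versus top quartile, scored by AUC) can be illustrated with a short Python sketch. It covers only the labelling and the scoring step, not the random forest itself, and rests on assumptions: the helper names are hypothetical, quartile cut-offs use the inclusive method of Python's `statistics.quantiles`, and values falling between the two cut-offs are discarded.

```python
import statistics


def quartile_labels(values):
    """Return (index, label) pairs: 0 for bottom-quartile values, 1 for top-quartile.

    Values strictly between the first and third quartile are dropped.
    Assumption: 'inclusive' quartile method; the caption does not specify one.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    labelled = []
    for i, v in enumerate(values):
        if v <= q1:
            labelled.append((i, 0))
        elif v >= q3:
            labelled.append((i, 1))
    return labelled


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

In this sketch the scores passed to `auc` would be the classifier's predicted probabilities for the top-quartile class on held-out samples.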

Extended Data Fig. 5 Distributions of BMI in each curatedMetagenomicData dataset.

The figure shows the distributions of BMI values for the datasets available in curatedMetagenomicData. These distributions were used to select datasets whose range of values (interquartile range, IQR, between 3.5 and 7.5) was comparable to that of the PREDICT 1 UK dataset (IQR of 5.5); the selected datasets served as validation datasets for the associations found. Box plots show first and third quartiles (boxes) and the median (middle line); whiskers extend up to 1.5× the interquartile range.
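
The IQR-based selection described above can be sketched as a simple filter. The cohort names and BMI values below are hypothetical, the quartile interpolation method is an assumption (the caption does not specify one), and the bounds are treated as inclusive.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range (Q3 - Q1). The 'inclusive' interpolation method is
    an assumption; the caption does not say how quartiles were computed."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def select_comparable(datasets, low=3.5, high=7.5):
    """Keep the names of datasets whose BMI IQR lies within [low, high]."""
    return [name for name, bmi in datasets.items() if low <= iqr(bmi) <= high]


# Hypothetical BMI values per cohort (not taken from curatedMetagenomicData).
datasets = {
    "cohort_a": [20, 22, 25, 28, 30],  # IQR 6: comparable, kept
    "cohort_b": [21, 22, 23, 24, 25],  # IQR 2: too narrow, dropped
    "cohort_c": [18, 20, 27, 35, 40],  # IQR 15: too wide, dropped
}

selected = select_comparable(datasets)
print(selected)
```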

Extended Data Fig. 6 Pairwise partial Spearman correlations between bacterial species and total lipids and cholesterol in lipoproteins.

a, The heatmap shows the species-level correlations with total lipids in lipoprotein variables at fasting, postprandially (6 h), and for the difference (rise) between the postprandial and fasting concentrations. The 30 species with the highest number of significant associations (FDR ≤ 0.2) are shown. The asterisk indicates a significant correlation between species and metadata variable using a two-sided t-test, FDR-corrected, with q < 0.2. b, The heatmap shows the species-level correlations with total cholesterol in lipoprotein variables at fasting, postprandially (6 h), and for the difference (rise) between the postprandial and fasting concentrations. The 30 species with the highest number of significant associations (FDR ≤ 0.2) are shown. The asterisk indicates a significant correlation between species and metadata variable using a two-sided t-test, FDR-corrected, with q < 0.2. All correlations, p-values and q-values are available in Supplementary Table 6.
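
To make the FDR step concrete, the sketch below converts a list of p-values into q-values and flags those below 0.2. It assumes the Benjamini-Hochberg procedure, which the caption does not name explicitly, and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q-values in the same order as the input.

    Using Benjamini-Hochberg is an assumption: the caption only says that
    p-values were 'corrected with FDR'.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical p-values for one species against four metadata variables.
pvalues = [0.01, 0.04, 0.03, 0.5]
qvalues = benjamini_hochberg(pvalues)
starred = [q < 0.2 for q in qvalues]  # would receive an asterisk in the heatmap
print([round(q, 4) for q in qvalues], starred)
```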

Extended Data Fig. 7 Species-level correlations with triglycerides in lipoproteins.

The heatmap shows the species-level correlations with triglycerides in lipoprotein variables at fasting, postprandially (6 h), and for the difference (rise) between the postprandial and fasting concentrations. The 30 species with the highest number of significant associations (FDR ≤ 0.2) are shown. The asterisk indicates a significant correlation between species and metadata variable using a two-sided t-test, FDR-corrected, with q < 0.2. All correlations, p-values and q-values are available in Supplementary Table 6.

Extended Data Fig. 8 Pairwise partial Spearman correlations between bacterial gene families and pathway abundances with clinical and metabolic risk scores, glycaemic and inflammatory measures, and lipoproteins.

a, The heatmap shows gene family correlations with the set of metadata presented in Fig. 5a–c, reporting the top 2,000 genes, selected among those with at least 20% prevalence by their number of significant correlations (q < 0.2). Gene family correlations show the same clusters as the species-level correlations in Fig. 5a–c. b, The heatmap shows pathway abundance correlations with the set of metadata presented in Fig. 5a–c, reporting all the pathways at 20% prevalence (349 in total). Pathway abundance correlations show the same cluster structure as the species-level correlations in Fig. 5a–c.
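
A partial Spearman correlation of the kind named in this figure title can be sketched with the first-order formula r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)), applied to rank-transformed data. This is a minimal sketch with a single covariate; the covariate actually adjusted for is not stated in this section, so `z` and all data values below are hypothetical.

```python
import math


def midranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the midranks."""
    return pearson(midranks(x), midranks(y))


def partial_spearman(x, y, z):
    """Spearman correlation of x and y, controlling for one covariate z."""
    rxy, rxz, ryz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical data: gene family abundance, a blood marker, and a covariate.
abundance = [1, 2, 3, 4, 5]
marker = [2, 1, 4, 3, 5]
covariate = [3, 1, 5, 2, 4]

print(f"Spearman = {spearman(abundance, marker):.3f}")
print(f"partial Spearman = {partial_spearman(abundance, marker, covariate):.3f}")
```

With several covariates, the same quantity is usually obtained by residualizing the ranks on all covariates or from the inverse of the rank correlation matrix; the single-covariate formula is used here only to keep the sketch short.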

Extended Data Fig. 9 Concordance of Random Forest scores with species-level partial correlations.

Volcano plots of the scores assigned to each species by Random Forest and their partial correlation, showing an overall concordance between the two independent approaches. We considered the top 5 metadata variables for each of the six metadata categories: a, Foods, bacon (g) (corr. 0.49), garlic (g) (corr. 0.424), unsalted nuts (g) (corr. 0.422), dairy dessert (g) (corr. 0.421), salted nuts (g) (corr. 0.395). b, Food groups, nuts (corr. 0.468), tea and coffee (corr. 0.436), meat (corr. 0.42), legumes (corr. 0.374), vegetables (corr. 0.371). c, Nutrients, lactose (corr. 0.442), niacin (corr. 0.381), maltose (corr. 0.361), sucrose (corr. 0.344), total carbohydrates (corr. 0.324). d, Nutrients normalized by daily energy intake, magnesium (corr. 0.472), starch (corr. 0.436), total carbohydrates (corr. 0.422), non-starch polysaccharides (NSP) (corr. 0.421), lactose (corr. 0.414). e, Dietary patterns, healthy plant percentage (corr. 0.492), healthy PDI (corr. 0.472), HEI score (corr. 0.47), HFD (corr. 0.408), total plants percentage (corr. 0.388). f, Lipoproteins, M-HDL-L 6 h rise (corr. 0.406), IDL-C 6 h (corr. 0.4), HDL-L 6 h rise (corr. 0.397), XL-HDL-C 0 h (corr. 0.395), Total Cholesterol 4 h rise (corr. 0.391).

Extended Data Fig. 10 Prevotella copri and/or Blastocystis presence are indicators of a more favourable postprandial glucose response to meals.

a–c, Differential analysis of visceral fat, HFD and glucose iAUC 2 h after standardised breakfast according to presence-absence of one or both of P. copri and Blastocystis. The analysis reveals that both these species are indicators of reduced visceral fat, good cholesterol and meal-driven increase of glucose. d,e, Differential analysis of C-peptide and triglycerides at different time points according to presence-absence of one or both of P. copri and Blastocystis. The distributions of the concentrations for C-peptide and triglycerides were typically lower when one or both were absent. An asterisk between two box plots represents a significant p-value (p < 0.05) according to the two-sided Mann–Whitney U test. Box plots show first and third quartiles (boxes) and the median (middle line); whiskers extend up to 1.5× the interquartile range. P-values are available in Supplementary Table 8.
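
The presence versus absence comparison can be illustrated with a small two-sided Mann–Whitney U test, together with the ratio of medians that Supplementary Table 8 reports as an effect size. This is a sketch under stated assumptions: the p-value uses a normal approximation with tie correction and no continuity correction (an exact test may be preferable for small groups), and the marker values are hypothetical.

```python
import math
from collections import Counter
from statistics import median


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test.

    Returns (U statistic for x, p-value). The p-value comes from a normal
    approximation with tie correction and no continuity correction, which is
    an assumption of this sketch rather than the study's exact procedure.
    """
    n1, n2 = len(x), len(y)
    # U counts the pairs in which the x value exceeds the y value (ties 0.5).
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(list(x) + list(y)).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0  # all values identical: no evidence of a difference
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return u, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical marker values (arbitrary units), not data from the study.
present = [1.1, 1.3, 1.5, 1.8, 2.0, 2.2]  # species detected
absent = [2.5, 2.7, 3.0, 3.1, 3.4, 3.6]   # species not detected

u_stat, p_value = mann_whitney_u(present, absent)
median_ratio = median(present) / median(absent)  # effect size, presence/absence
print(f"U = {u_stat}, p = {p_value:.4f}, ratio of medians = {median_ratio:.3f}")
```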

Supplementary information

Reporting Summary

Supplementary Table 1

Alpha diversity measures and their correlations with personal factors, habitual diet, fasting and postprandial markers

Supplementary Table 2

List of foods with their assigned food groups and health classification, and nutrients normalized by daily energy intake (that is, calorie adjusted)

Supplementary Table 3

Meal descriptions

Supplementary Table 4

Plant-based Diet Index, Healthy Food Diversity index, animal groups, and Alternate Mediterranean score description

Supplementary Table 5

Species-level partial correlations with food groups, nutrients normalized by daily energy intake, dietary patterns, and fasting and postprandial measures with the species identified in the PREDICT 1 UK participants. Partial correlations were computed using pcor.test (two-sided) with the parameter method=spearman and corrected for multiple-hypothesis testing with FDR

Supplementary Table 6

Species-level partial correlations with food groups, nutrients normalized by daily energy intake, dietary patterns, and fasting and postprandial measures with the species identified in the PREDICT 1 US participants. Partial correlations were computed using pcor.test (two-sided) with the parameter method=spearman and corrected for multiple-hypothesis testing with FDR

Supplementary Table 7

Random forest regression and classification performances, measured as Pearson and Spearman correlations for the regression task and AUC for the classification task, for models trained and tested with random 80/20 training/testing splits over 100 folds, for foods, food groups, nutrients, nutrients normalized by average energy intake, dietary patterns, and fasting and postprandial measures

Supplementary Table 8

P values from the Mann–Whitney U-test between presence and absence of Prevotella copri, Blastocystis and P. copri and Blastocystis (first tab). Effect size measured as the ratio of the medians for P. copri and Blastocystis presence/absence (second tab)

Supplementary Table 9

Correlations, ranks, and average ranks for determining the two sets of positive and negative bacterial species according to their correlations with a balanced set of personal, habitual diet, fasting and postprandial metadata

Cite this article

Asnicar, F., Berry, S.E., Valdes, A.M. et al. Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals. Nat Med (2021). https://doi.org/10.1038/s41591-020-01183-8
