Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals

Abstract

The gut microbiome is shaped by diet and influences host metabolism; however, these links are complex and can be unique to each individual. We performed deep metagenomic sequencing of 1,203 gut microbiomes from 1,098 individuals enrolled in the Personalised Responses to Dietary Composition Trial (PREDICT 1) study, whose detailed long-term diet information, as well as hundreds of fasting and same-meal postprandial cardiometabolic blood marker measurements were available. We found many significant associations between microbes and specific nutrients, foods, food groups and general dietary indices, which were driven especially by the presence and diversity of healthy and plant-based foods. Microbial biomarkers of obesity were reproducible across external publicly available cohorts and in agreement with circulating blood metabolites that are indicators of cardiovascular disease risk. While some microbes, such as Prevotella copri and Blastocystis spp., were indicators of favorable postprandial glucose metabolism, overall microbiome composition was predictive for a large panel of cardiometabolic blood markers including fasting and postprandial glycemic, lipemic and inflammatory indices. The panel of intestinal species associated with healthy dietary habits overlapped with those associated with favorable cardiometabolic and postprandial markers, indicating that our large-scale resource can potentially stratify the gut microbiome into generalizable health levels in individuals without clinically manifest disease.

Fig. 1: The PREDICT 1 study associates gut microbiome structure with habitual diet and blood cardiometabolic markers.
Fig. 2: Food quality, regardless of source, is linked to overall and feature-level composition of the gut microbiome.
Fig. 3: Random forest machine learning models trained on microbial or functional profiles can predict obesity phenotypic markers, even on independent cohorts.
Fig. 4: Fasting and postprandial cardiometabolic responses to standardized test meals associated with the microbiome.
Fig. 5: Species-level segregation into healthy and unhealthy microbial signatures of fasting and postprandial cardiometabolic markers.
Fig. 6: Panel of the 30 species showing the strongest overall correlations with a selection of markers of nutritional and cardiometabolic health.

Data availability

The metagenomes are deposited in the European Nucleotide Archive of the European Bioinformatics Institute under accession no. PRJEB39223. The non-metagenomic data used for analysis in this study are held by the Department of Twin Research at King’s College London. The data can be released to bona fide researchers using our normal procedures overseen by the Wellcome Trust and its guidelines as part of our core funding; released data need to be anonymized and conform to GDPR standards. We receive around 100 requests per year for our datasets and have three meetings per month with independent members to assess proposals. The application can be found at https://twinsuk.ac.uk/resources-for-researchers/access-our-data/.

Code availability

Computational analyses were performed using the bioBakery suite of tools; species-level microbial abundances were computed using MetaPhlAn v.3.0 (https://github.com/biobakery/MetaPhlAn). Functional potential profiling was carried out with HUMAnN v.2.0 (https://github.com/biobakery/humann; Methods).

Acknowledgements

We thank the participants of the PREDICT 1 study. We thank N. Atabaki-Pasdar for generating the liver fat score. We thank the staff of Zoe Global, the Department of Twin Research and the Massachusetts General Hospital and all the members of the Segata, Berry and Spector laboratories for their tireless work in contributing to the running of the study, data collection and data processing. We thank Nightingale Health and Affinity Biomarker Laboratories for their support and analytical work. This work was supported by Zoe Global and received support from grants from the Wellcome Trust (no. 212904/Z/18/Z) and Medical Research Council/British Heart Foundation Ancestry and Biological Informative Markers for Stratification of Hypertension (no. MR/M016560/1). The work was also supported by the European Research Council (ERC-STG project MetaPG-716575 to N.S.), MIUR ‘Futuro in Ricerca’ (grant no. RBFR13EWWI_001 to N.S.), the European H2020 program (ONCOBIOME-825410 and MASTER-818368 projects to N.S.), the National Cancer Institute of the National Institutes of Health (grant no. 1U01CA230551 to N.S.) and the Premio Internazionale Lombardia e Ricerca 2019 to N.S. S.E.B. was supported in part by a grant funded by the Biotechnology and Biological Sciences Research Council (grant no. BB/NO12739/1). P.W.F. was supported in part by grants from the European Research Council (grant no. CoG-2015_681742_NASCENT), Swedish Research Council (grant no. IRC15-0067) and Novo Nordisk Foundation. A.T.C. was supported in part as a Stuart and Suzanne Steele MGH Research Scholar. TwinsUK is funded by the Wellcome Trust, Medical Research Council, European Union, Chronic Disease Research Foundation, Zoe Global and the National Institute for Health Research-funded BioResource, Clinical Research Facility and Biomedical Research Centre based at Guy’s and St Thomas’ NHS Foundation Trust in partnership with King’s College London.

Author information

Contributions

J.W., G.H. and T.D.S. obtained the funding. S.E.B., A.M.V., J.W., G.H., H.A.K., R.D., A.T.C., N.S., P.W.F. and T.D.S. designed the study and developed the concept. S.E.B., N.S., F.A., H.A.K., A.T.C., D.A.D. and T.D.S. collected the data. F.A., S.E.B., N.S., L.F., E.L., R.G., M.M., O.M., G.P., C.L.R., M.V.-C., S.O., F.G., A.T., F.B., C.M., A.K., L.D., D.B., A.M.T., C.B., L.W., L.G., J.C.P., S.D. and R.H. analyzed the data. S.E.B., H.A.K., D.A.D., G.H., J.W. and N.S. coordinated the study. F.A., S.E.B., A.M.V., L.H.N., D.A.D., E.L., R.G., J.W., C.G., J.M.O., C.H., P.W.F., T.D.S. and N.S. wrote the manuscript. All authors reviewed and revised the final manuscript.

Corresponding authors

Correspondence to Sarah E. Berry or Nicola Segata.

Ethics declarations

Competing interests

T.D.S., S.E.B., A.M.V., F.A., P.W.F., C.H. and N.S. are consultants to Zoe Global. J.W., G.H., R.D., J.C.P., C.B., R.H., L.F., F.G. and S.D. are or have been employees of Zoe Global. The other authors declare no competing interests.

Additional information

Peer review information Jennifer Sargent was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Alpha diversity linked with personal factors, habitual diet, fasting, and postprandial markers.

a, Microbiome alpha diversity, computed using the Shannon index, correlated with markers from the four categories: personal, habitual diet, fasting and postprandial. Reported are the five strongest positive and negative Spearman correlations with p < 0.05 for each category. All correlations and p-values are available in Supplementary Table 1. b, Inter-sample microbiome distances (beta diversity) were substantially lower, that is, closer, among samples from the same individual (two weeks apart) than among samples from different individuals. Gut microbial communities in monozygotic twins were slightly more similar than in dizygotic twins (two-sided Mann–Whitney U test, p = 0.06), which, in turn, were more similar than those of unrelated individuals (p < 1e-12), even after adjusting for age (p < 1e-12). c, After excluding twin status (that is, non-twin versus monozygotic versus dizygotic twin) from the model, personal factors still accounted for the greatest proportion of variance explained in overall microbial diversity, followed by dietary habits, fasting and postprandial cardiometabolic blood markers (by cumulative stepwise dbRDA). d, Cumulative (left bars) and individual (right bars) contributions of each metadata variable based on Bray-Curtis dissimilarity. Box plots show the first and third quartiles (boxes) and the median (middle line); whiskers extend up to 1.5× the interquartile range.
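The two diversity measures named in this caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names are hypothetical, the natural logarithm is an assumption (the caption does not state the log base), and the inputs are plain lists of per-species abundances.

```python
import math


def shannon_index(abundances):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with non-zero abundance.

    The natural log is an assumption; the caption does not give the base.
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    return -sum((a / total) * math.log(a / total) for a in abundances if a > 0)


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum(|u_i - v_i|) / sum(u_i + v_i)."""
    if len(u) != len(v):
        raise ValueError("profiles must have the same length")
    denominator = sum(u) + sum(v)
    if denominator == 0:
        return 0.0  # two empty profiles are treated as identical
    return sum(abs(a - b) for a, b in zip(u, v)) / denominator
```

A community of four equally abundant species gives H = ln 4, and two profiles that share no species give a Bray-Curtis dissimilarity of 1.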

Extended Data Fig. 2 Species-level correlation with single foods.

The figure shows the species-level correlations (Spearman) with single food quantities as estimated from the food frequency questionnaires. Only foods with at least 5 significant associations (q-value ≤ 0.2) are displayed. Species are sorted by the number of significant associations, and the top 30 are reported in the figure.
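The display rule in this caption (q-value ≤ 0.2, at least 5 significant associations per food) can be sketched as follows. The caption does not name the multiple-testing procedure, so the Benjamini-Hochberg adjustment used here is an assumption, and `benjamini_hochberg` and `foods_to_display` are hypothetical helper names.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as p_values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum
    # of p * m / rank over all ranks at or above its own.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


def foods_to_display(q_by_food, q_max=0.2, min_hits=5):
    """Keep foods with at least min_hits species associations at q <= q_max."""
    return [
        food
        for food, q_values in q_by_food.items()
        if sum(q <= q_max for q in q_values) >= min_hits
    ]
```

Here `q_by_food` maps each food name to the q-values of its correlations with all species; both thresholds default to the values quoted in the caption.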

Extended Data Fig. 3 Top foods, food groups, nutrients, and dietary patterns validated in the PREDICT 1 US cohort.

The RF regression model trained on the PREDICT 1 UK cohort was applied to the PREDICT 1 US participants, validating the associations with food-related variables found in the PREDICT 1 UK cohort.

Extended Data Fig. 4 Performance of random forest regression and classification on microbiome functional potential in predicting fasting measurements, total cholesterol and triglycerides in different lipoproteins.

The figure shows the performance of RF regression and classification models trained on microbiome gene family profiles in predicting: a, the fasting measurements presented in Fig. 4a, sorted as in Fig. 4a; b, total cholesterol; and c, triglycerides in lipoproteins of different sizes. For each lipoprotein, we considered its concentration at both fasting and postprandial (6 h) time points, as well as the difference (rise) between the postprandial and the fasting concentration. Box plots show the distribution of the Spearman correlations (left axis) between real and predicted values using RF regression. Box plots show the first and third quartiles (boxes) and the median (middle line); whiskers extend up to 1.5× the interquartile range. Circles show the median AUC (right axis) of RF classification in predicting the bottom quartile of the distribution versus the top quartile.
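The bottom-versus-top-quartile evaluation described in this caption can be illustrated with a short sketch. It covers only the scoring step, not the random forest itself: `scores` stands in for any model output, `quartile_auc` is a hypothetical name, and the quartile boundaries rely on the default method of Python's `statistics.quantiles`, which may differ from the authors' definition.

```python
import statistics


def quartile_auc(true_values, scores):
    """AUC for separating the top quartile of true_values from the bottom one.

    Samples in the middle two quartiles are ignored. The AUC is computed as
    the fraction of (top, bottom) pairs in which the top-quartile sample has
    the higher score, counting ties as one half.
    """
    q1, _, q3 = statistics.quantiles(true_values, n=4)
    bottom = [s for t, s in zip(true_values, scores) if t <= q1]
    top = [s for t, s in zip(true_values, scores) if t >= q3]
    if not bottom or not top:
        raise ValueError("need at least one sample in each extreme quartile")
    wins = sum((hi > lo) + 0.5 * (hi == lo) for hi in top for lo in bottom)
    return wins / (len(top) * len(bottom))
```

A score that ranks every top-quartile sample above every bottom-quartile sample yields an AUC of 1.0, and a constant score yields 0.5.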

Extended Data Fig. 5 Distributions of BMI in each curatedMetagenomicData dataset.

The figure shows the distributions of BMI values for the datasets available in curatedMetagenomicData. These distributions were used to select, as validation datasets for the associations found, those datasets with a range of values (interquartile range between 3.5 and 7.5) comparable to that of the PREDICT 1 UK dataset (IQR of 5.5). Box plots show the first and third quartiles (boxes) and the median (middle line); whiskers extend up to 1.5× the interquartile range.
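The dataset selection rule stated in this caption amounts to a filter on the BMI interquartile range. The sketch below assumes inclusive bounds and the "inclusive" quantile method of Python's `statistics` module, neither of which is specified in the caption; the dataset names in the usage example are made up.

```python
import statistics


def iqr(values):
    """Interquartile range (third quartile minus first quartile)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def comparable_datasets(bmi_by_dataset, low=3.5, high=7.5):
    """Names of datasets whose BMI IQR lies within [low, high]."""
    return [
        name
        for name, bmi_values in bmi_by_dataset.items()
        if low <= iqr(bmi_values) <= high
    ]


# Hypothetical datasets for illustration only.
example = {
    "cohort_a": [20, 22, 24, 26, 28],  # IQR 4, kept
    "cohort_b": [20, 21, 22, 23, 24],  # IQR 2, too narrow
    "cohort_c": [18, 24, 30, 36, 42],  # IQR 12, too wide
}
selected = comparable_datasets(example)
```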

Extended Data Fig. 6 Pairwise partial Spearman correlations between bacterial species and total lipids and cholesterol in lipoproteins.

a, The heatmap shows the species-level correlations with total lipids in lipoprotein variables at fasting, postprandial (6 h), and the difference (rise) between the postprandial and fasting concentrations. The 30 species with the highest number of significant associations (FDR ≤ 0.2) are shown. The asterisk indicates a significant correlation between species and metadata variable using a two-sided t-test, corrected with FDR with q < 0.2. b, The heatmap shows the species-level correlations with total cholesterol in lipoprotein variables at fasting, postprandial (6 h), and the difference (rise) between the postprandial and fasting concentrations. The 30 species with the highest number of significant associations (FDR ≤ 0.2) are shown. The asterisk indicates a significant correlation between species and metadata variable using a two-sided t-test, corrected with FDR with q < 0.2. All correlations, p-values and q-values are available in Supplementary Table 6.
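A partial Spearman correlation of the kind named in the figure title can be sketched for the simplest case of a single covariate. The covariates actually adjusted for are not listed in this caption, so `z` is a placeholder, and the first-order partial correlation formula applied to ranks is one common way to compute the statistic, not necessarily the authors' implementation.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def partial_spearman(x, y, z):
    """Spearman correlation of x and y after adjusting for one covariate z."""
    rx, ry, rz = average_ranks(x), average_ranks(y), average_ranks(z)
    r_xy, r_xz, r_yz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("x or y is perfectly rank-correlated with z")
    return (r_xy - r_xz * r_yz) / denominator
```

With several covariates, an equivalent approach is to regress the ranks of `x` and `y` on the ranks of the covariates and correlate the residuals.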

Extended Data Fig. 7 Species-level correlations with triglycerides in lipoproteins.

The heatmap shows the species-level correlations with triglycerides in lipoprotein variables at fasting, postprandially (6 h), and for the difference (rise) between the postprandial and fasting concentrations. The 30 species with the highest number of significant associations (FDR ≤ 0.2) are shown. The asterisk indicates a significant correlation between species and metadata variable using a two-sided t-test, FDR-corrected, with q < 0.2. All correlations, p-values, and q-values are available in Supplementary Table 6.
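The q < 0.2 threshold used for the asterisks can be illustrated with a short sketch of FDR adjustment. The legends do not name the FDR procedure, so the Benjamini-Hochberg method used here is an assumption, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q

# Hypothetical p-values for four species-variable pairs.
p = [0.01, 0.04, 0.03, 0.5]
q = benjamini_hochberg(p)
significant = [qi < 0.2 for qi in q]
print(q, significant)
```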

Extended Data Fig. 8 Pairwise partial Spearman correlations between bacterial gene families and pathway abundances with clinical and metabolic risk scores, glycaemic and inflammatory measures, and lipoproteins.

a, The heatmap shows gene-family correlations with the set of metadata presented in Fig. 5a–c, reporting the top 2,000 gene families, selected among those with at least 20% prevalence according to their number of significant correlations (q < 0.2). Gene-family correlations show the same clusters as the species-level correlations in Fig. 5a–c. b, The heatmap shows pathway-abundance correlations with the set of metadata presented in Fig. 5a–c, reporting all the pathways with at least 20% prevalence (349 in total). Pathway-abundance correlations show the same cluster structure as the species-level correlations in Fig. 5a–c.

Extended Data Fig. 9 Concordance of Random Forest scores with species-level partial correlations.

Volcano plots of the scores assigned to each species by Random Forest and their partial correlation, showing an overall concordance between the two independent approaches. We considered the top 5 metadata variables for each of the six metadata categories: a, Foods, bacon (g) (corr. 0.49), garlic (g) (corr. 0.424), unsalted nuts (g) (corr. 0.422), dairy dessert (g) (corr. 0.421), salted nuts (g) (corr. 0.395). b, Food groups, nuts (corr. 0.468), tea and coffee (corr. 0.436), meat (corr. 0.42), legumes (corr. 0.374), vegetables (corr. 0.371). c, Nutrients, lactose (corr. 0.442), niacin (corr. 0.381), maltose (corr. 0.361), sucrose (corr. 0.344), total carbohydrates (corr. 0.324). d, Nutrients normalized by daily energy intake, magnesium (corr. 0.472), starch (corr. 0.436), total carbohydrates (corr. 0.422), non-starch polysaccharides (NSP) (corr. 0.421), lactose (corr. 0.414). e, Dietary patterns, healthy plant percentage (corr. 0.492), healthy PDI (corr. 0.472), HEI score (corr. 0.47), HFD (corr. 0.408), total plants percentage (corr. 0.388). f, Lipoproteins, M-HDL-L 6 h rise (corr. 0.406), IDL-C 6 h (corr. 0.4), HDL-L 6 h rise (corr. 0.397), XL-HDL-C 0 h (corr. 0.395), Total Cholesterol 4 h rise (corr. 0.391).

Extended Data Fig. 10 Prevotella copri and/or Blastocystis presence are indicators of a more favourable postprandial glucose response to meals.

a–c, Differential analysis of visceral fat, HFD and glucose iAUC 2 h after standardised breakfast according to the presence or absence of one or both of P. copri and Blastocystis. The analysis reveals that both these species are indicators of reduced visceral fat, good cholesterol and meal-driven increase of glucose. d,e, Differential analysis of C-peptide and triglycerides at different time points according to the presence or absence of one or both of P. copri and Blastocystis. The distributions of the concentrations for C-peptide and triglycerides were typically lower when one or both were absent. An asterisk between two box plots represents a significant p-value (p < 0.05) according to the two-sided Mann-Whitney U test. Box plots show first and third quartiles (boxes) and the median (middle line); whiskers extend up to 1.5× the interquartile range. P-values are available in Supplementary Table 8.
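The presence-versus-absence comparison in this legend can be sketched with a two-sided Mann-Whitney U test and a ratio-of-medians effect size. This is a minimal standard-library version under stated assumptions: the p-value uses the normal approximation without tie or continuity correction, which is crude for small groups, and the measurements are hypothetical.

```python
from math import erfc, sqrt
from statistics import median

def mann_whitney_u(a, b):
    """U statistic for sample a and a two-sided normal-approximation p-value."""
    u = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    n1, n2 = len(a), len(b)
    mean = n1 * n2 / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    # Two-sided tail probability of a standard normal variable.
    p = erfc(abs(z) / sqrt(2))
    return u, p

# Hypothetical marker values in participants with and without a given microbe.
present = [3.1, 2.8, 3.5, 2.9, 3.3]
absent = [4.0, 3.9, 4.4, 3.6, 4.2]

u, p = mann_whitney_u(present, absent)
median_ratio = median(present) / median(absent)
print(u, p, median_ratio)
```

A ratio of medians below 1 means the marker is lower in the first group; an exact or tie-corrected test from a statistics library would be preferable for real data.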

Supplementary information

Reporting Summary

Supplementary Table 1

Alpha diversity measures and their correlations with personal factors, habitual diet, fasting and postprandial markers

Supplementary Table 2

List of foods with their assigned food groups and health classification, and nutrients normalized by daily energy intake (that is, calorie adjusted)

Supplementary Table 3

Meal descriptions

Supplementary Table 4

Plant-based Diet Index, Healthy Food Diversity index, animal groups, and Alternate Mediterranean score description

Supplementary Table 5

Species-level partial correlations with food groups, nutrients normalized by daily energy intake, dietary patterns, and fasting and postprandial measures with the species identified in the PREDICT 1 UK participants. Partial correlations were computed using pcor.test (two-sided) with the parameter method=spearman and corrected for multiple-hypothesis testing with FDR

Supplementary Table 6

Species-level partial correlations with food groups, nutrients normalized by daily energy intake, dietary patterns, and fasting and postprandial measures with the species identified in the PREDICT 1 US participants. Partial correlations were computed using pcor.test (two-sided) with the parameter method=spearman and corrected for multiple-hypothesis testing with FDR

Supplementary Table 7

Random forest regression and classification performances measured as Pearson and Spearman correlations for the regression task and AUC for the classification task for the model trained and tested with 80/20 training and testing random splitting over 100 folds for foods, food groups, nutrients, nutrients normalized by average energy intake, dietary patterns, and fasting and postprandial measures

Supplementary Table 8

P values from the Mann–Whitney U-test between presence and absence of Prevotella copri, of Blastocystis, and of both P. copri and Blastocystis (first tab). Effect size measured as the ratio of the medians for P. copri and Blastocystis presence/absence (second tab)

Supplementary Table 9

Correlations, ranks, and average ranks for determining the two sets of positive and negative bacterial species according to their correlations with a balanced set of personal, habitual diet, fasting and postprandial metadata

About this article

Cite this article

Asnicar, F., Berry, S.E., Valdes, A.M. et al. Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals. Nat Med 27, 321–332 (2021). https://doi.org/10.1038/s41591-020-01183-8

  • DOI: https://doi.org/10.1038/s41591-020-01183-8
