The human gut microbiome matures towards the adult composition during the first years of life and is implicated in early immune development. Here, we investigate the effects of microbial genomic diversity on gut microbiome development using integrated early childhood data sets collected in the DIABIMMUNE study in Finland, Estonia and Russian Karelia. We show that gut microbial diversity is associated with household location and linear growth of children. Single nucleotide polymorphism- and metagenomic assembly-based strain tracking revealed large and highly dynamic microbial pangenomes, especially in the genus Bacteroides, in which we identified evidence of variability deriving from Bacteroides-targeting bacteriophages. Our analyses revealed functional consequences of strain diversity; only 10% of Finnish infants harboured Bifidobacterium longum subsp. infantis, a subspecies specialized in human milk metabolism, whereas Russian infants commonly maintained a probiotic Bifidobacterium bifidum strain in infancy. Groups of bacteria contributing to diverse, characterized metabolic pathways converged to highly subject-specific configurations over the first two years of life. This longitudinal study extends the current view of early gut microbial community assembly based on strain-level genomic variation.


Data availability

All 16S rRNA and metagenomic sequencing data are available in the NCBI Sequence Read Archive under BioProject PRJNA497734 and through the DIABIMMUNE microbiome website at https://pubs.broadinstitute.org/diabimmune/.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.




Acknowledgements

The authors thank T. Poon and S. Steelman (Broad Institute) for help with sequence production and sample management, A. Rahnavard for help with HMP SNP haplotype analysis, D. Shungin for discussions and connections regarding the use of infant milk products in Russia, K. Koski and M. Koski (University of Helsinki) for the coordination and database work in the DIABIMMUNE study and T. Reimels for editorial help with writing and figure generation. T.V. was supported by funding from the Juvenile Diabetes Research Foundation (JDRF). A.B.H. is a Merck Fellow of the Helen Hay Whitney Foundation. P.C.M. received funding from the German Research Foundation (grant no. 315980449). C.H. was supported by funding from the JDRF (3-SRA-2016–141-Q-R) and the National Institutes of Health (R24DK110499). M.K. was supported by the European Union Seventh Framework Programme FP7/2007–2013 (202063) and the Academy of Finland Centre of Excellence in Molecular Systems Immunology and Physiology Research (250114). R.J.X. was supported by funding from JDRF (2-SRA-2016–247-S-B and 2-SRA-2018–548-S-B), the National Institutes of Health (DK43351 and AI110498) and the Center for Microbiome Informatics and Therapeutics.

Author information


  1. Broad Institute of MIT and Harvard, Cambridge, MA, USA

    • Tommi Vatanen
    • Damian R. Plichta
    • Timothy D. Arthur
    • Andrew Brantley Hall
    • Xiaobo Ke
    • Raivo Kolde
    • Moran Yassour
    • Hera Vlamakis
    • Curtis Huttenhower
    • Ramnik J. Xavier
  2. Department of Computer Science, Aalto University, Espoo, Finland

    • Juhi Somani
    • Harri Lähdesmäki
  3. Department for Computational Biology of Infection Research, Helmholtz Center for Infection Research, Brunswick, Germany

    • Philipp C. Münch
    • Alice C. McHardy
  4. Max von Pettenkofer-Institute for Hygiene and Clinical Microbiology, Ludwig-Maximilian University of Munich, Munich, Germany

    • Philipp C. Münch
  5. Analytical Sciences and Imaging, Novartis Institutes for BioMedical Research, Basel, Switzerland

    • Sabine Rudolf
  6. Chemical Biology and Therapeutics, Novartis Institutes for BioMedical Research, Cambridge, MA, USA

    • Edward J. Oakeley
    • Xiaobo Ke
    • Rachel A. Young
    • Henry J. Haiser
    • Jeffrey A. Porter
  7. Center for Computational and Integrative Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA

    • Moran Yassour
    • Ramnik J. Xavier
  8. Children’s Hospital, University of Helsinki and Helsinki University Hospital, Helsinki, Finland

    • Kristiina Luopajärvi
    • Heli Siljander
    • Mikael Knip
  9. Research Programs Unit, Diabetes and Obesity, University of Helsinki, Helsinki, Finland

    • Kristiina Luopajärvi
    • Heli Siljander
    • Mikael Knip
  10. Department of Pediatrics, Tampere University Hospital, Tampere, Finland

    • Heli Siljander
    • Mikael Knip
  11. Department of Public Health Solutions, National Institute for Health and Welfare, Helsinki, Finland

    • Suvi M. Virtanen
  12. Faculty of Social Sciences/Health Sciences, University of Tampere, Tampere, Finland

    • Suvi M. Virtanen
  13. Science Centre, Pirkanmaa Hospital District and Research Center for Child Health, University Hospital, Tampere, Finland

    • Suvi M. Virtanen
  14. Immunogenetics Laboratory, University of Turku, Turku, Finland

    • Jorma Ilonen
  15. Clinical Microbiology, Turku University Hospital, Turku, Finland

    • Jorma Ilonen
  16. Department of Immunology, Institute of Biomedicine and Translational Medicine, University of Tartu, Tartu, Estonia

    • Raivo Uibo
  17. Department of Pediatrics, University of Tartu and Tartu University Hospital, Tartu, Estonia

    • Vallo Tillmann
  18. Ministry of Health and Social Development, Karelian Republic of the Russian Federation, Petrozavodsk, Russia

    • Sergei Mokurov
  19. Petrozavodsk State University, Department of Family Medicine, Petrozavodsk, Russia

    • Natalya Dorshakova
  20. Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA

    • Curtis Huttenhower
  21. Folkhälsan Research Center, Helsinki, Finland

    • Mikael Knip
  22. Gastrointestinal Unit, and Center for the Study of Inflammatory Bowel Disease, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA

    • Ramnik J. Xavier
  23. Center for Microbiome Informatics and Therapeutics, MIT, Cambridge, MA, USA

    • Ramnik J. Xavier




Contributions

T.V., D.R.P., J.S. and P.C.M. analysed the sequencing data. T.D.A., S.R., E.J.O., X.K., R.A.Y., H.J.H. and J.A.P. contributed to B. dorei isolate sequencing. A.B.H. and R.K. contributed to bioinformatic analysis. M.Y., K.L. and H.S. contributed to study design. J.I., S.M.V., R.U., V.T., S.M. and N.D. collected clinical samples. A.C.M., H.L., H.V., C.H., M.K. and R.J.X. served as principal investigators. T.V., D.R.P., J.S., P.C.M., H.V., C.H., M.K. and R.J.X. drafted the manuscript. All authors discussed the results, contributed to critical revisions and approved the final manuscript.

Competing interests

The authors declare no competing interests.

Corresponding author

Correspondence to Ramnik J. Xavier.

Supplementary information

  1. Supplementary Information

    Supplementary Notes, Supplementary References.

  2. Reporting Summary

  3. Supplementary Table 1

    Cohort metadata.

  4. Supplementary Table 2

    PERMANOVA results.

  5. Supplementary Table 3

    Microbial alpha-diversity.

  6. Supplementary Table 4

    Taxonomic associations.

  7. Supplementary Table 5

    Strain diversity of gut microbial species.

  8. Supplementary Table 6

    Extended B. dorei pangenome.

  9. Supplementary Table 7

    Tentative circular genomic elements in the sequenced B. dorei isolates.

  10. Supplementary Table 8

    CRISPR Spacer mapping to virome contigs and DIABIMMUNE assembly.

  11. Supplementary Table 9

    Most frequent taxa assigned to CRISPR spacer carrier contigs with matches to virome contigs of the DIABIMMUNE assembly.

  12. Supplementary Table 10

    Bacterial species by body site.

  13. Supplementary Table 11

    Contributional diversities of biological process GO terms.
