The human distal gut harbours a vast ensemble of microbes (the microbiota) that provide important metabolic capabilities, including the ability to extract energy from otherwise indigestible dietary polysaccharides [1–6]. Studies of a few unrelated, healthy adults have revealed substantial diversity in their gut communities, as measured by sequencing 16S rRNA genes [6–8], yet how this diversity relates to function and to the rest of the genes in the collective genomes of the microbiota (the gut microbiome) remains obscure. Studies of lean and obese mice suggest that the gut microbiota affects energy balance by influencing the efficiency of calorie harvest from the diet, and how this harvested energy is used and stored [3–5]. Here we characterize the faecal microbial communities of adult female monozygotic and dizygotic twin pairs concordant for leanness or obesity, and their mothers, to address how host genotype, environmental exposure and host adiposity influence the gut microbiome. Analysis of 154 individuals yielded 9,920 near full-length and 1,937,461 partial bacterial 16S rRNA sequences, plus 2.14 gigabases from their microbiomes. The results reveal that the human gut microbiome is shared among family members, but that each person’s gut microbial community varies in the specific bacterial lineages present, with a comparable degree of co-variation between adult monozygotic and dizygotic twin pairs. However, there was a wide array of shared microbial genes among sampled individuals, comprising an extensive, identifiable ‘core microbiome’ at the gene, rather than at the organismal lineage, level. Obesity is associated with phylum-level changes in the microbiota, reduced bacterial diversity and altered representation of bacterial genes and metabolic pathways.
These results demonstrate that a diversity of organismal assemblages can nonetheless yield a core microbiome at a functional level, and that deviations from this core are associated with different physiological states (obese compared with lean).
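
The distinction between a core at the gene level and a core at the lineage level can be illustrated with a minimal Python sketch. This is an illustration only and not the analysis used in the study: the function name `core_features`, the `min_fraction` parameter, and all sample, lineage and gene labels are hypothetical, and the data are made up. It treats each individual as a set of observed features and asks which features are shared by every individual.

```python
from typing import Dict, Set


def core_features(samples: Dict[str, Set[str]], min_fraction: float = 1.0) -> Set[str]:
    """Return the features present in at least `min_fraction` of the samples.

    Hypothetical helper for illustration; with min_fraction=1.0 a feature
    must be observed in every sample to count as "core".
    """
    if not samples:
        return set()
    counts: Dict[str, int] = {}
    for features in samples.values():
        for feature in features:
            counts[feature] = counts.get(feature, 0) + 1
    threshold = min_fraction * len(samples)
    return {feature for feature, n in counts.items() if n >= threshold}


# Made-up presence/absence data: no lineage is shared by all three people...
lineages = {
    "person_A": {"lineage_1", "lineage_2"},
    "person_B": {"lineage_3", "lineage_4"},
    "person_C": {"lineage_2", "lineage_5"},
}

# ...yet the same three people share two gene families.
genes = {
    "person_A": {"gene_x", "gene_y", "gene_z"},
    "person_B": {"gene_x", "gene_y"},
    "person_C": {"gene_x", "gene_y", "gene_w"},
}

core_lineages = core_features(lineages)
core_genes = core_features(genes)

print(sorted(core_lineages))  # []
print(sorted(core_genes))     # ['gene_x', 'gene_y']
```

In this toy case the lineage-level core is empty while the gene-level core is not, which is the qualitative pattern the abstract describes. Real data would also require decisions about sequencing depth and abundance thresholds that this sketch ignores.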




Data deposits

This Whole Genome Shotgun project is deposited in DDBJ/EMBL/GenBank under accession number 32089. 454 pyrosequencing reads are deposited in the NCBI Short Read Archive. Nearly full-length 16S rRNA gene sequences are deposited in GenBank under accession numbers FJ362604–FJ372382. Annotated sequences are also available in MG-RAST (http://metagenomics.nmpdr.org/). 454-generated 16S rRNA sequences with sample identifiers are also available at http://gordonlab.wustl.edu/SuppData.html.


References

  1. et al. The human microbiome project. Nature 449, 804–810 (2007)
  2. et al. Obesity alters gut microbial ecology. Proc. Natl Acad. Sci. USA 102, 11070–11075 (2005)
  3. et al. An obesity-associated gut microbiome with increased capacity for energy harvest. Nature 444, 1027–1031 (2006)
  4. Diet-induced obesity is linked to marked but reversible alterations in the mouse distal gut microbiome. Cell Host Microbe 3, 213–223 (2008)
  5. et al. The gut microbiota as an environmental factor that regulates fat storage. Proc. Natl Acad. Sci. USA 101, 15718–15723 (2004)
  6. Human gut microbes associated with obesity. Nature 444, 1022–1023 (2006)
  7. et al. Diversity of the human intestinal microbial flora. Science 308, 1635–1638 (2005)
  8. et al. Molecular-phylogenetic characterization of microbial community imbalances in human inflammatory bowel diseases. Proc. Natl Acad. Sci. USA 104, 13780–13785 (2007)
  9. et al. Phenotypically concordant and discordant monozygotic twins display different DNA copy-number-variation profiles. Am. J. Hum. Genet. 82, 763–771 (2008)
  10. et al. The response to long-term overfeeding in identical twins. N. Engl. J. Med. 322, 1477–1482 (1990)
  11. Genetic and environmental factors in relative body weight and human adiposity. Behav. Genet. 27, 325–351 (1997)
  12. et al. Ascertainment of a mid-western US female adolescent twin cohort for alcohol studies: assessment of sample representativeness using birth record data. Twin Res. 5, 107–112 (2002)
  13. Error-correcting barcoded primers for pyrosequencing hundreds of samples in multiplex. Nature Methods 5, 235–237 (2008)
  14. et al. Microbial diversity in the deep sea and the underexplored “rare biosphere”. Proc. Natl Acad. Sci. USA 103, 12115–12120 (2006)
  15. UniFrac-an online tool for comparing microbial community diversity in a phylogenetic context. BMC Bioinformatics 7, 371 (2006)
  16. Patterns in phytoplankton taxonomic composition across temperate lakes of differing nutrient status. Limnol. Oceanogr. 42, 487–495 (1997)
  17. The KEGG resource for deciphering the genome. Nucleic Acids Res. 32, D277–D280 (2004)
  18. et al. STRING 7-recent developments in the integration and prediction of protein interactions. Nucleic Acids Res. 35, D358–D362 (2007)
  19. Open source clustering software. Bioinformatics 20, 1453–1454 (2004)
  20. et al. Metagenomic analysis of the human distal gut microbiome. Science 312, 1355–1359 (2006)
  21. et al. Comparative metagenomics revealed commonly enriched gene sets in human gut microbiomes. DNA Res. 14, 169–181 (2007)
  22. et al. Functional metagenomic profiling of nine biomes. Nature 452, 629–632 (2008)
  23. An application of statistics to comparative metagenomics. BMC Bioinformatics 7, 162 (2006)
  24. The host genotype affects the bacterial community in the human gastrointestinal tract. Microb. Ecol. Health Dis. 13, 129–134 (2001)
  25. Development of the human infant intestinal microbiota. PLoS Biol. 5, e177 (2007)
  26. Investigations into the influence of host genetics on the predominant eubacteria in the faecal microflora of children. J. Med. Microbiol. 54, 1239–1242 (2005)
  27. et al. Evolution of mammals and their gut microbes. Science 320, 1647–1651 (2008)
  28. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658–1659 (2006)
  29. et al. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl. Environ. Microbiol. 72, 5069–5072 (2006)
  30. A greedy algorithm for aligning DNA sequences. J. Comput. Biol. 7, 203–214 (2000)
  31. Clearcut: a fast implementation of relaxed neighbor joining. Bioinformatics 22, 2823–2824 (2006)
  32. Conservation evaluation and phylogenetic diversity. Biol. Conserv. 61, 1–10 (1992)
  33. Comparison of DNA sequences with protein sequences. Genomics 46, 24–36 (1997)
  34. et al. PyCogent: a toolkit for making sense from sequence. Genome Biol. 8, R171 (2007)



We thank: S. Wagoner and J. Manchester for technical support; S. Marion and D. Hopper for recruitment of participants and sample collection; A. Goodman, B. Muegge, and M. Mahowald for suggestions; S. Huse (Marine Biological Laboratory), F. Niazi and S. Attiya (454 Life Sciences), C. Markovic, L. Fulton, B. Fulton, E. Mardis and R. Wilson (Washington University Genome Sequencing Center) and S. Macmil, G. Wiley, C. Qu, and P. Wang (University of Oklahoma) for their assistance with sequencing; and P. M. Coutinho (Université de Provence, France) for help with the CAZy analysis. Deep draft assemblies of reference gut genomes were generated as part of a National Human Genome Research Institute (NHGRI)-sponsored human gut microbiome initiative (http://genome.wustl.edu/pub/organism/Microbes/Human_Gut_Microbiome/). This work was supported in part by the National Institutes of Health (DK78669/ES012742/AA09022/HD049024), the National Science Foundation (OCE0430724), the W.M. Keck Foundation, and the Crohn’s and Colitis Foundation of America.

Author Contributions P.J.T., A.C.H., R.K. and J.I.G. designed the experiments. P.J.T., T.Y., A.D., R.E.L., M.L.S., W.J.J., B.A.R., J.P.A. and M.E. generated the data. P.J.T., M.H., M.L.S., B.L.C., A.D., B.H., A.C.H., R.K. and J.I.G. analysed the data. P.J.T., A.C.H., R.K. and J.I.G. wrote the manuscript with input from the other members of the team.

Author information


  1. Center for Genome Sciences

    • Peter J. Turnbaugh
    • Tanya Yatsunenko
    • Ruth E. Ley
    • Jeffrey I. Gordon
  2. Department of Psychiatry, Washington University School of Medicine, St Louis, Missouri 63108, USA

    • Alexis Duncan
    • Andrew C. Heath
  3. Department of Computer Science

    • Micah Hamady
  4. Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado 80309, USA

    • Rob Knight
  5. CNRS, UMR6098, Marseille, France

    • Brandi L. Cantarel
    • Bernard Henrissat
  6. Josephine Bay Paul Center, Marine Biological Laboratory, Woods Hole, Massachusetts 02543, USA

    • Mitchell L. Sogin
  7. Environmental Genomics Core Facility, University of South Carolina, Columbia, South Carolina 29208, USA

    • William J. Jones
  8. Department of Chemistry and Biochemistry and the Advanced Center for Genome Technology, University of Oklahoma, Norman, Oklahoma 73019, USA

    • Bruce A. Roe
  9. 454 Life Sciences, Branford, Connecticut 06405, USA

    • Jason P. Affourtit
    • Michael Egholm



Corresponding author

Correspondence to Jeffrey I. Gordon.

Supplementary information

PDF files

  1. Supplementary Information

    This file contains Supplementary Results, Supplementary Methods, Supplementary References, Supplementary Figures 1–20 and Supplementary Tables 1–15.
