Enterotypes of the human gut microbiome


Our knowledge of species and functional composition of the human gut microbiome is rapidly increasing, but it is still based on very few cohorts and little is known about variation across the world. By combining 22 newly sequenced faecal metagenomes of individuals from four countries with previously published data sets, here we identify three robust clusters (referred to as enterotypes hereafter) that are not nation or continent specific. We also confirmed the enterotypes in two published, larger cohorts, indicating that intestinal microbiota variation is generally stratified, not continuous. This indicates further the existence of a limited number of well-balanced host–microbial symbiotic states that might respond differently to diet and drug intake. The enterotypes are mostly driven by species composition, but abundant molecular functions are not necessarily provided by abundant species, highlighting the importance of a functional analysis to understand microbial communities. Although individual host properties such as body mass index, age, or gender cannot explain the observed enterotypes, data-driven marker genes or functional modules can be identified for each of these host properties. For example, twelve genes significantly correlate with age and three functional modules with the body mass index, hinting at a diagnostic potential of microbial markers.

At a glance


    Figure 1: Functional and phylogenetic profiles of human gut microbiome.

    a, Simulation of the detection of distinct orthologous groups when increasing the number of individuals (samples). Complete genomes were classified by habitat information and the orthologous groups divided into those that occur in known gut species (red) and those that have not yet been associated with gut (blue). The former are close to saturation when sampling 35 individuals (excluding infants) whereas functions from non-gut (probably rare and transient) species are not. b, Genus abundance variation box plot for the 30 most abundant genera as determined by read abundance. Genera are coloured by their respective phylum (see inset for colour key). Inset shows phylum abundance box plot. Genus and phylum level abundances were measured using reference-genome-based mapping with 85% and 65% sequence similarity cutoffs. Unclassified genera under a higher rank are marked by asterisks. c, Orthologous group abundance variation box plot for the 30 most abundant orthologous groups as determined by assignment to eggNOG (ref. 12). Orthologous groups are coloured by their respective functional category (see inset for colour key). Inset shows abundance box plot of 24 functional categories. Boxes represent the interquartile range (IQR) between first and third quartiles and the line inside represents the median. Whiskers denote the lowest and highest values within 1.5 × IQR from the first and third quartiles, respectively. Circles represent outliers beyond the whiskers.
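The box-plot convention described in this caption (a box from the first to the third quartile, whiskers at the most extreme values within 1.5 × IQR of the quartiles, remaining points drawn as outliers) can be sketched in a few lines of Python. This is an illustration only, not the authors' plotting code: the function name `box_plot_stats`, the example values, and the choice of the "inclusive" quartile interpolation are assumptions, because the caption does not say how the quartiles were computed.

```python
from statistics import median, quantiles


def box_plot_stats(values):
    """Summarise values the way the caption describes a box plot.

    Hypothetical helper; quartiles use Python's 'inclusive' method,
    which is an assumption (the caption does not name a method).
    """
    data = sorted(values)
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median(data),
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Made-up abundances of one genus across ten samples.
example = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(example)
```

In this made-up example the value 100 lies beyond the upper fence and would be drawn as a circle, while the upper whisker stops at 9.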

    Figure 2: Phylogenetic differences between enterotypes.

    a–c, Between-class analysis, which visualizes results from PCA and clustering, of the genus compositions of 33 Sanger metagenomes estimated by mapping the metagenome reads to 1,511 reference genome sequences using an 85% similarity threshold (a), Danish subset containing 85 metagenomes from a published Illumina data set (ref. 8) (b) and 154 pyrosequencing-based 16S sequences (ref. 5) (c) reveal three robust clusters that we call enterotypes. IBD, inflammatory bowel disease. Two principal components are plotted using the ade4 package in R with each sample represented by a filled circle. The centre of gravity for each cluster is marked by a rectangle and the coloured ellipse covers 67% of the samples belonging to the cluster. d, Abundances of the main contributors of each enterotype from the Sanger metagenomes. See Fig. 1 for definition of box plot. e, Co-occurrence networks of the three enterotypes from the Sanger metagenomes. Unclassified genera under a higher rank are marked by asterisks in b and e.

    Figure 3: Functional differences between enterotypes.

    a, Between-class analysis (see Fig. 2) of orthologous group abundances showing only minor disagreements with enterotypes (unfilled circles indicate the differing samples). The blue cloud represents the local density estimated from the coordinates of orthologous groups; positions of selected orthologous groups are highlighted. b, Four enzymes in the biotin biosynthesis pathway (COG0132, COG0156, COG0161 and COG0502) are overrepresented in enterotype 1. c, Four enzymes in the thiamine biosynthesis pathway (COG0422, COG0351, COG0352 and COG0611) are overrepresented in enterotype 2. d, Six enzymes in the haem biosynthesis pathway (COG0007, COG0276, COG0407, COG0408, COG0716 and COG1648) are overrepresented in enterotype 3.

    Figure 4: Correlations with host properties.

    a, Pairwise correlation of RNA polymerase facultative σ24 subunit (COG1595) with age (P = 0.03, rho = −0.59). b, Pairwise correlation of SusD, a family of proteins that bind glycan molecules before they are transported into the cell, and BMI (P = 0.27, rho = −0.29, weak correlation). c, Multiple orthologous groups (OGs) (COG0085, COG0086, COG0438 and COG0739; see Supplementary Table 18) significantly correlating with age when combined into a linear model (see Supplementary Methods section 13 and ref. 40 for details; P = 2.75 × 10⁻⁵, adjusted R² = 0.57). d, Two modules, ATPase complex and ectoine biosynthesis (M00051), significantly correlating with BMI when combined into a linear model (P = 6.786 × 10⁻⁶, adjusted R² = 0.82).
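The two statistics quoted in this caption, a correlation coefficient (rho) for single orthologous groups and an adjusted R² for the multi-feature linear models, can be illustrated with a small Python sketch. It assumes that rho denotes Spearman's rank correlation, which the caption does not state explicitly, and it uses the standard adjusted-R² formula. The function names and the toy numbers are invented for illustration and are not taken from the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def adjusted_r_squared(r_squared, n_samples, n_predictors):
    """Penalise R^2 for the number of predictors in a linear model."""
    return 1 - (1 - r_squared) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Toy data (assumed): abundance of a hypothetical gene falls as age rises.
ages = [25, 34, 41, 52, 63]
abundance = [0.9, 0.7, 0.6, 0.4, 0.1]
print(spearman_rho(ages, abundance))
print(adjusted_r_squared(0.5, n_samples=11, n_predictors=2))
```

Tied values receive their mean rank, as is conventional for Spearman's rho. A constant input would make the denominator zero and is not handled in this sketch.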

Change history

Corrected online 08 June 2011
An author was omitted. His name has been added to the HTML and PDF and described in the accompanying Corrigendum.


  1. Eckburg, P. B. et al. Diversity of the human intestinal microbial flora. Science 308, 1635–1638 (2005)
  2. Hayashi, H., Sakamoto, M. & Benno, Y. Phylogenetic analysis of the human gut microbiota using 16S rDNA clone libraries and strictly anaerobic culture-based methods. Microbiol. Immunol. 46, 535–548 (2002)
  3. Lay, C. et al. Colonic microbiota signatures across five northern European countries. Appl. Environ. Microbiol. 71, 4153–4155 (2005)
  4. Gill, S. R. et al. Metagenomic analysis of the human distal gut microbiome. Science 312, 1355–1359 (2006)
  5. Turnbaugh, P. J. et al. A core gut microbiome in obese and lean twins. Nature 457, 480–484 (2009)
  6. Kurokawa, K. et al. Comparative metagenomics revealed commonly enriched gene sets in human gut microbiomes. DNA Res. 14, 169–181 (2007)
  7. Zoetendal, E. G., Rajilic-Stojanovic, M. & de Vos, W. M. High-throughput diversity and functionality analysis of the gastrointestinal tract microbiota. Gut 57, 1605–1615 (2008)
  8. Qin, J. et al. A human gut microbial gene catalogue established by metagenomic sequencing. Nature 464, 59–65 (2010)
  9. Raes, J. & Bork, P. Molecular eco-systems biology: towards an understanding of community function. Nature Rev. Microbiol. 6, 693–699 (2008)
  10. Nelson, K. E. et al. A catalog of reference genomes from the human microbiome. Science 328, 994–999 (2010)
  11. MetaHIT Consortium. MetaHIT Draft Bacterial Genomes at the Sanger Institute. http://www.sanger.ac.uk/resources/downloads/bacteria/metahit/ (9 July 2010)
  12. Muller, J. et al. eggNOG v2.0: extending the evolutionary genealogy of genes with enhanced non-supervised orthologous groups, species and functional annotations. Nucleic Acids Res. 38, D190–D195 (2010)
  13. Palmer, C., Bik, E. M., Digiulio, D. B., Relman, D. A. & Brown, P. O. Development of the human infant intestinal microbiota. PLoS Biol. 5, e177 (2007)
  14. Tap, J. et al. Towards the human intestinal microbiota phylogenetic core. Environ. Microbiol. 11, 2574–2584 (2009)
  15. Jensen, L. J. et al. STRING 8—a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res. 37, D412–D416 (2009)
  16. Dethlefsen, L., Huse, S., Sogin, M. L. & Relman, D. A. The pervasive effects of an antibiotic on the human gut microbiota, as revealed by deep 16S rRNA sequencing. PLoS Biol. 6, e280 (2008)
  17. Walker, A. Say hello to our little friends. Nature Rev. Microbiol. 5, 572–573 (2007)
  18. Krogfelt, K. A. Bacterial adhesion: genetics, biogenesis, and role in pathogenesis of fimbrial adhesins of Escherichia coli. Rev. Infect. Dis. 13, 721–735 (1991)
  19. Salonen, A. et al. Comparative analysis of fecal DNA extraction methods with phylogenetic microarray: effective recovery of bacterial and archaeal DNA using mechanical cell lysis. J. Microbiol. Methods 81, 127–134 (2010)
  20. Rajilic-Stojanovic, M. et al. Development and application of the human intestinal tract chip, a phylogenetic microarray: analysis of universally conserved phylotypes in the abundant microbiota of young and elderly adults. Environ. Microbiol. 11, 1736–1751 (2009)
  21. Rousseeuw, P. J. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987)
  22. Vanhoutte, T., Huys, G., De Brandt, E. & Swings, J. Temporal stability analysis of the microbiota in human feces by denaturing gradient gel electrophoresis using universal and group-specific 16S rRNA gene primers. FEMS Microbiol. Ecol. 48, 437–446 (2004)
  23. Tannock, G. W. et al. Analysis of the fecal microflora of human subjects consuming a probiotic product containing Lactobacillus rhamnosus DR20. Appl. Environ. Microbiol. 66, 2578–2588 (2000)
  24. Seksik, P. et al. Alterations of the dominant faecal bacterial groups in patients with Crohn’s disease of the colon. Gut 52, 237–242 (2003)
  25. Costello, E. K. et al. Bacterial community variation in human body habitats across space and time. Science 326, 1694–1697 (2009)
  26. Martens, E. C., Koropatkin, N. M., Smith, T. J. & Gordon, J. I. Complex glycan catabolism by the human gut microbiota: the Bacteroidetes Sus-like paradigm. J. Biol. Chem. 284, 24673–24677 (2009)
  27. Wright, D. P., Rosendale, D. I. & Roberton, A. M. Prevotella enzymes involved in mucin oligosaccharide degradation and evidence for a small operon of genes expressed during growth on mucin. FEMS Microbiol. Lett. 190, 73–79 (2000)
  28. Derrien, M., Vaughan, E. E., Plugge, C. M. & de Vos, W. M. Akkermansia muciniphila gen. nov., sp. nov., a human intestinal mucin-degrading bacterium. Int. J. Syst. Evol. Microbiol. 54, 1469–1476 (2004)
  29. Ley, R. E., Turnbaugh, P. J., Klein, S. & Gordon, J. I. Microbial ecology: human gut microbes associated with obesity. Nature 444, 1022–1023 (2006)
  30. Schwiertz, A. et al. Microbiota and SCFA in lean and overweight healthy subjects. Obesity 18, 190–195 (2009)
  31. Woodmansey, E. J. Intestinal bacteria and ageing. J. Appl. Microbiol. 102, 1178–1186 (2007)
  32. Kovacikova, G. & Skorupski, K. The alternative sigma factor σE plays an important role in intestinal survival and virulence in Vibrio cholerae. Infect. Immun. 70, 5355–5362 (2002)
  33. Fujihashi, K. & Kiyono, H. Mucosal immunosenescence: new developments and vaccines to control infectious diseases. Trends Immunol. 30, 334–343 (2009)
  34. Turnbaugh, P. J. et al. An obesity-associated gut microbiome with increased capacity for energy harvest. Nature 444, 1027–1031 (2006)
  35. Raes, J., Korbel, J. O., Lercher, M. J., von Mering, C. & Bork, P. Prediction of effective genome size in metagenomic samples. Genome Biol. 8, R10 (2007)
  36. Gibson, G. R. et al. Alternative pathways for hydrogen disposal during fermentation in the human colon. Gut 31, 679–683 (1990)
  37. Godon, J. J., Zumstein, E., Dabert, P., Habouzit, F. & Moletta, R. Molecular microbial diversity of an anaerobic digestor as determined by small-subunit rDNA sequence analysis. Appl. Environ. Microbiol. 63, 2802–2813 (1997)
  38. Arumugam, M., Harrington, E. D., Foerstner, K. U., Raes, J. & Bork, P. Smash Community: a metagenomic annotation and analysis tool. Bioinformatics 26, 2977–2978 (2010)
  39. Wang, Q., Garrity, G. M., Tiedje, J. M. & Cole, J. R. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl. Environ. Microbiol. 73, 5261–5267 (2007)
  40. Gianoulis, T. A. et al. Quantifying environmental adaptation of metabolic pathways in metagenomics. Proc. Natl Acad. Sci. USA 106, 1374–1379 (2009)

Author information

  1. These authors contributed equally to this work.

    • Manimozhiyan Arumugam &
    • Jeroen Raes


  1. European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany

    • Manimozhiyan Arumugam,
    • Jeroen Raes,
    • Takuji Yamada,
    • Daniel R. Mende,
    • Gabriel R. Fernandes,
    • Julien Tap &
    • Peer Bork
  2. VIB—Vrije Universiteit Brussel, 1050 Brussels, Belgium

    • Jeroen Raes
  3. Commissariat à l’Energie Atomique, Genoscope, 91000 Evry, France

    • Eric Pelletier,
    • Denis Le Paslier,
    • Thomas Bruls,
    • Julie Poulain,
    • Edgardo Ugarte &
    • Jean Weissenbach
  4. Centre National de la Recherche Scientifique, UMR8030, 91000 Evry, France

    • Eric Pelletier,
    • Denis Le Paslier,
    • Thomas Bruls &
    • Jean Weissenbach
  5. Université d'Evry Val d'Essonne, 91000 Evry, France

    • Eric Pelletier,
    • Denis Le Paslier,
    • Thomas Bruls &
    • Jean Weissenbach
  6. Department of Biochemistry and Immunology, Universidade Federal de Minas Gerais, Av. Antônio Carlos 6627, 31270-901 Belo Horizonte, Minas Gerais, Brazil

    • Gabriel R. Fernandes
  7. Institut National de la Recherche Agronomique, 78350 Jouy en Josas, France

    • Julien Tap,
    • Jean-Michel Batto,
    • Marion Leclerc,
    • Florence Levenez,
    • Nicolas Pons,
    • Joel Doré &
    • S. Dusko Ehrlich
  8. Center for Biological Sequence Analysis, Technical University of Denmark, DK-2800 Lyngby, Denmark

    • Marcelo Bertalan,
    • Laurent Gautier,
    • H. Bjørn Nielsen,
    • Thomas Sicheritz-Ponten &
    • Søren Brunak
  9. Digestive System Research Unit, University Hospital Vall d’Hebron, Ciberehd, 08035 Barcelona, Spain

    • Natalia Borruel,
    • Francesc Casellas,
    • Chaysavanh Manichanh &
    • Francisco Guarner
  10. Barcelona Supercomputing Center, Jordi Girona 31, 08034 Barcelona, Spain

    • Leyden Fernandez &
    • David Torrents
  11. Marie Krogh Center for Metabolic Research, Section of Metabolic Genetics, Faculty of Health Sciences, University of Copenhagen, DK-2100 Copenhagen, Denmark

    • Torben Hansen,
    • Trine Nielsen &
    • Oluf Pedersen
  12. Faculty of Health Sciences, University of Southern Denmark, DK-5000 Odense, Denmark

    • Torben Hansen
  13. Computational Biology Laboratory Bld, The University of Tokyo Kashiwa Campus, Kashiwa-no-ha 5-1-5, Kashiwa, Chiba, 277-8561, Japan

    • Masahira Hattori
  14. Division of Bioenvironmental Science, Frontier Science Research Center, University of Miyazaki, 5200 Kiyotake, Miyazaki 889-1692, Japan

    • Tetsuya Hayashi
  15. Laboratory of Microbiology, Wageningen University, 6710BA Ede, The Netherlands

    • Michiel Kleerebezem,
    • Sebastian Tims,
    • Erwin G. Zoetendal &
    • Willem M. de Vos
  16. Tokyo Institute of Technology, Graduate School of Bioscience and Biotechnology, Department of Biological Information, 4259 Nagatsuta-cho, Midori-ku, Yokohama-shi, Kanagawa Pref. 226-8501, Japan

    • Ken Kurokawa
  17. BGI-Shenzhen, Shenzhen 518083, China

    • Junjie Qin &
    • Jun Wang
  18. Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, DK-2800 Lyngby, Denmark

    • Thomas Sicheritz-Ponten
  19. Institució Catalana de Recerca i Estudis Avançats (ICREA), Pg. Lluís Companys 23, 08010 Barcelona, Spain

    • David Torrents
  20. Department of Biology, University of Copenhagen, DK-2200 Copenhagen, Denmark

    • Jun Wang
  21. Institute of Biomedical Science, Faculty of Health Sciences, University of Copenhagen, DK-2200 Copenhagen, Denmark

    • Oluf Pedersen
  22. Hagedorn Research Institute, DK-2820 Gentofte, Denmark

    • Oluf Pedersen
  23. Faculty of Health Sciences, University of Aarhus, DK-8000 Aarhus, Denmark

    • Oluf Pedersen
  24. University of Helsinki, FI-00014 Helsinki, Finland

    • Willem M. de Vos
  25. Max Delbrück Centre for Molecular Medicine, D-13092 Berlin, Germany

    • Peer Bork
  26. Digestive System Research Unit, University Hospital Vall d’Hebron, Ciberehd, 08035 Barcelona, Spain.

    • María Antolín,
    • Antonio Torrejon &
    • Encarna Varela
  27. Commissariat à l’Energie Atomique, Genoscope, 91000 Evry, France.

    • François Artiguenave &
    • Raquel Melo Minardi
  28. Institut National de la Recherche Agronomique, 78350 Jouy en Josas, France.

    • Hervé M. Blottiere,
    • Mathieu Almeida,
    • Antonella Cultrone,
    • Christine Delorme,
    • Rozenn Dervyn,
    • Maarten van de Guchte,
    • Eric Guedon,
    • Florence Haimet,
    • Alexandre Jamet,
    • Catherine Juste,
    • Ghalia Kaci,
    • Omar Lakhdari,
    • Severine Layec,
    • Karine Le Roux,
    • Emmanuelle Maguin,
    • Pierre Renault,
    • Nicolas Sanchez,
    • Gaetana Vandemeulebrouck &
    • Yohanan Winogradsky
  29. UCB Pharma SA, 28046 Madrid, Spain.

    • Carlos Cara
  30. Danone Research, 91120 Palaiseau, France.

    • Christian Chervaux,
    • Gérard Denariaz,
    • Johan van Hylckama-Vlieg,
    • Jan Knol &
    • Raish Oozeer
  31. European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany.

    • Konrad U. Foerstner,
    • Wolfgang Huber,
    • Shinichi Sunagawa &
    • Georg Zeller
  32. Heidelberger Strasse 24, 64285 Darmstadt, Germany.

    • Konrad U. Foerstner
  33. Center for Biological Sequence Analysis, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark.

    • Carsten Friss
  34. Institute of Genetics and Molecular and Cellular Biology, CNRS, INSERM, University of Strasbourg, 67404 Illkirch, France.

    • Jean Muller
  35. The Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.

    • Julian Parkhill &
    • Keith Turner
  36. Istituto Europeo di Oncologia, 20100 Milan, Italy.

    • Maria Rescigno
  37. Institut Mérieux, 17 rue Burgelat, 69002 Lyon, France.

    • Christian Brechot,
    • Alexandre Mérieux &
    • Christine M'rini
  38. Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 Copenhagen, Denmark.

    • Karsten Kristiansen


  1. MetaHIT Consortium (additional members)

    • María Antolín,
    • François Artiguenave,
    • Hervé M. Blottiere,
    • Mathieu Almeida,
    • Christian Brechot,
    • Carlos Cara,
    • Christian Chervaux,
    • Antonella Cultrone,
    • Christine Delorme,
    • Gérard Denariaz,
    • Rozenn Dervyn,
    • Konrad U. Foerstner,
    • Carsten Friss,
    • Maarten van de Guchte,
    • Eric Guedon,
    • Florence Haimet,
    • Wolfgang Huber,
    • Johan van Hylckama-Vlieg,
    • Alexandre Jamet,
    • Catherine Juste,
    • Ghalia Kaci,
    • Jan Knol,
    • Karsten Kristiansen,
    • Omar Lakhdari,
    • Severine Layec,
    • Karine Le Roux,
    • Emmanuelle Maguin,
    • Alexandre Mérieux,
    • Raquel Melo Minardi,
    • Christine M'rini,
    • Jean Muller,
    • Raish Oozeer,
    • Julian Parkhill,
    • Pierre Renault,
    • Maria Rescigno,
    • Nicolas Sanchez,
    • Shinichi Sunagawa,
    • Antonio Torrejon,
    • Keith Turner,
    • Gaetana Vandemeulebrouck,
    • Encarna Varela,
    • Yohanan Winogradsky &
    • Georg Zeller


All authors are members of the Metagenomics of the Human Intestinal Tract (MetaHIT) Consortium. Jun W., F.G., O.P., W.M.d.V., S.B., J.D., Jean W., S.D.E. and P.B. managed the project. N.B., F.C., T.H., C.M. and T.N. performed clinical analyses. M.L. and F.L. performed DNA extraction. E.P., D.L.P., T.B., J.P. and E.U. performed DNA sequencing. M.A., J.R., S.D.E. and P.B. designed the analyses. M.A., J.R., T.Y., D.R.M., G.R.F., J.T., J.-M.B., M.B., L.F., L.G., M.K., H.B.N., N.P., J.Q., T.S.-P., S.T., D.T., E.G.Z., S.D.E. and P.B. performed the analyses. M.A., J.R., P.B. and S.D.E. wrote the manuscript. M.H., T.H., K.K. and the MetaHIT Consortium members contributed to the design and execution of the study.

Competing financial interests

The authors declare no competing financial interests.

Raw Sanger read data from the European faecal metagenomes have been deposited in the NCBI Trace Archive with the following project identifiers: MH6 (33049), MH13 (33053), MH12 (33055), MH30 (33057), CD1 (33059), CD2 (33061), UC4 (33113), UC6 (33063), NO1 (33305), NO3 (33307), NO4 (33309), NO8 (33311), OB2 (33313), OB1 (38231), OB6 (38233), OB8 (45929), A (63073), B (63075), C (63077), D (63079), E (63081), G (63083). Contigs, genes and annotations are available to download from http://www.bork.embl.de/Docu/Arumugam_et_al_2011/.

Supplementary information

PDF files

  1. Supplementary Information (503K)

    The file contains Supplementary Methods, Supplementary Notes and Supplementary References. A minor error in Supplementary Information section 2.2 was corrected on 02 June 2011.

  2. Supplementary Figures (3M)

    This file contains Supplementary Figures 1–27 with legends.

  3. Supplementary Tables (520K)

    The file contains Supplementary Tables 1–2 and 4–24 (see separate file for Supplementary Table 3).

  4. Supplementary Table 3 (1.1M)

    The file contains Supplementary Table 3.

