Tracking microbial evolution in the human gut using Hi-C reveals extensive horizontal gene transfer, persistence and adaptation

Abstract

Despite the importance of horizontal gene transfer for rapid bacterial evolution, reliable assignment of mobile genetic elements to their microbial hosts in natural communities such as the human gut microbiota is lacking. We used high-throughput chromosomal conformation capture coupled with probabilistic modelling of experimental noise to resolve 88 strain-level metagenome-assembled genomes of distal gut bacteria from two participants, including 12,251 accessory elements. Comparisons of two samples collected 10 years apart for each of the participants revealed extensive in situ exchange of accessory elements as well as evidence of adaptive evolution in core genomes. Accessory elements were predominantly promiscuous and prevalent in the distal gut metagenomes of 218 adult individuals. This research provides a foundation and approach for studying microbial evolution in natural environments.

Fig. 1: Genomic configuration space and an anchor–union representation.
Fig. 2: Genotyping complex microbial communities using Hi-C.
Fig. 3: Core and accessory divergence from species-level reference genomes.
Fig. 4: Attributes of accessory genes.
Fig. 5: Ten-year community evolution.
Fig. 6: Population-based perspective on accessory genes for the two participants.

Data availability

Unprocessed DNA sequence reads and recovered MAGs are available in the NCBI database under project PRJNA505354. MAGs can be downloaded from https://purl.stanford.edu/fd871xp9063.

Code availability

HPIPE is available for download as an open-source tool at https://github.com/eitanyaffe/hpipe.

References

  1. Soucy, S. M., Huang, J. & Gogarten, J. P. Horizontal gene transfer: building the web of life. Nat. Rev. Genet. 16, 472–482 (2015).

  2. von Wintersdorff, C. J. H. et al. Dissemination of antimicrobial resistance in microbial ecosystems through horizontal gene transfer. Front. Microbiol. 7, 173 (2016).

  3. Allen, H. K. et al. Call of the wild: antibiotic resistance genes in natural environments. Nat. Rev. Microbiol. 8, 251–259 (2010).

  4. Smillie, C. S. et al. Ecology drives a global network of gene exchange connecting the human microbiome. Nature 480, 241–244 (2011).

  5. Maiques, E. et al. β-Lactam antibiotics induce the SOS response and horizontal transfer of virulence factors in Staphylococcus aureus. J. Bacteriol. 188, 2726–2729 (2006).

  6. Zhang, X. et al. Quinolone antibiotics induce Shiga toxin-encoding bacteriophages, toxin production, and death in mice. J. Infect. Dis. 181, 664–670 (2000).

  7. Modi, S. R., Lee, H. H., Spina, C. S. & Collins, J. J. Antibiotic treatment expands the resistance reservoir and ecological network of the phage metagenome. Nature 499, 219–222 (2013).

  8. Stecher, B. et al. Gut inflammation can boost horizontal gene transfer between pathogenic and commensal Enterobacteriaceae. Proc. Natl Acad. Sci. USA 109, 1269–1274 (2012).

  9. Faith, J. J. et al. The long-term stability of the human gut microbiota. Science 341, 1237439 (2013).

  10. Koonin, E. V. & Wolf, Y. I. Genomics of bacteria and archaea: the emerging dynamic view of the prokaryotic world. Nucleic Acids Res. 36, 6688–6719 (2008).

  11. Tettelin, H., Riley, D., Cattuto, C. & Medini, D. Comparative genomics: the bacterial pan-genome. Curr. Opin. Microbiol. 11, 472–477 (2008).

  12. Nielsen, H. B. et al. Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes. Nat. Biotechnol. 32, 822–828 (2014).

  13. Truong, D. T., Tett, A., Pasolli, E., Huttenhower, C. & Segata, N. Microbial strain-level population structure and genetic diversity from metagenomes. Genome Res. 27, 626–638 (2017).

  14. Brown Kav, A. et al. Insights into the bovine rumen plasmidome. Proc. Natl Acad. Sci. USA 109, 5452–5457 (2012).

  15. Jørgensen, T. S., Xu, Z., Hansen, M. A., Sørensen, S. J. & Hansen, L. H. Hundreds of circular novel plasmids and DNA elements identified in a rat cecum metamobilome. PLoS ONE 9, e87924 (2014).

  16. Alneberg, J. et al. Binning metagenomic contigs by coverage and composition. Nat. Methods 11, 1144–1146 (2014).

  17. Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).

  18. Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).

  19. Sexton, T. et al. Three-dimensional folding and functional organization principles of the Drosophila genome. Cell 148, 458–472 (2012).

  20. Le, T. B. K., Imakaev, M. V., Mirny, L. A. & Laub, M. T. High-resolution mapping of the spatial organization of a bacterial chromosome. Science 342, 731–734 (2013).

  21. Duan, Z. et al. A three-dimensional model of the yeast genome. Nature 465, 363–367 (2010).

  22. Rao, S. S. P. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).

  23. Burton, J. N. et al. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nat. Biotechnol. 31, 1119–1125 (2013).

  24. Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95 (2017).

  25. Marie-Nelly, H. et al. High-quality genome (re)assembly using chromosomal contact data. Nat. Commun. 5, 5695 (2014).

  26. Putnam, N. H. et al. Chromosome-scale shotgun assembly using an in vitro method for long-range linkage. Genome Res. 26, 342–350 (2016).

  27. Marbouty, M. et al. Metagenomic chromosome conformation capture (meta3C) unveils the diversity of chromosome organization in microorganisms. eLife 3, e03318 (2014).

  28. Burton, J. N., Liachko, I., Dunham, M. J. & Shendure, J. Species-level deconvolution of metagenome assemblies with Hi-C-based contact probability maps. G3-Genes Genom. Genet. 4, 1339–1346 (2014).

  29. Beitel, C. W. et al. Strain- and plasmid-level deconvolution of a synthetic metagenome by sequencing proximity ligation products. PeerJ 2, e415–e419 (2014).

  30. Marbouty, M., Baudry, L., Cournac, A. & Koszul, R. Scaffolding bacterial genomes and probing host-virus interactions in gut microbiome by proximity ligation (chromosome capture) assay. Sci. Adv. 3, e1602105 (2017).

  31. Press, M. O. et al. Hi-C deconvolution of a human gut microbiome yields high-quality draft genomes and reveals plasmid-genome interactions. Preprint at https://doi.org/10.1101/198713 (2017).

  32. Stalder, T., Press, M. O., Sullivan, S., Liachko, I. & Top, E. M. Linking the resistome and plasmidome to the microbiome. ISME J. 13, 2437–2446 (2019).

  33. Mukherjee, S. et al. Genomes OnLine Database (GOLD) v.6: data updates and feature enhancements. Nucleic Acids Res. 45, D446–D456 (2017).

  34. Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043–1055 (2015).

  35. Kang, D. D., Froula, J., Egan, R. & Wang, Z. MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities. PeerJ 3, e1165 (2015).

  36. DeMaere, M. Z. & Darling, A. E. bin3C: exploiting Hi-C sequencing data to accurately resolve metagenome-assembled genomes. Genome Biol. 20, 46 (2019).

  37. Duchêne, S. et al. Genome-scale rates of evolutionary change in bacteria. Microb. Genom. 2, e000094 (2016).

  38. Puigbò, P., Lobkovsky, A. E., Kristensen, D. M., Wolf, Y. I. & Koonin, E. V. Genomes in turmoil: quantification of genome dynamics in prokaryote supergenomes. BMC Biol. 12, 66 (2014).

  39. McDonald, J. H. & Kreitman, M. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351, 652–654 (1991).

  40. Bishara, A. et al. High-quality genome sequences of uncultured microbes by assembly of read clouds. Nat. Biotechnol. 36, 1067–1075 (2018).

  41. Kuleshov, V. et al. Synthetic long-read sequencing reveals intraspecies diversity in the human microbiome. Nat. Biotechnol. 34, 64–69 (2016).

  42. Zhao, S. et al. Adaptive evolution within gut microbiomes of healthy people. Cell Host Microbe 25, 656–667 (2019).

  43. Garud, N. R., Good, B. H., Hallatschek, O. & Pollard, K. S. Evolutionary dynamics of bacteria in the gut microbiome within and across hosts. PLoS Biol. 17, e3000102 (2019).

  44. Sickle v.1.33 (GitHub, 2011); https://github.com/najoshi/sickle

  45. SeqPrep (GitHub, 2011); https://github.com/jstjohn/SeqPrep

  46. Schmieder, R. & Edwards, R. Fast identification and removal of sequence contamination from genomic and metagenomic datasets. PLoS ONE 6, e17288 (2011).

  47. Li, D. et al. MEGAHIT v1.0: a fast and scalable metagenome assembler driven by advanced methodologies and community practices. Methods 102, 3–11 (2016).

  48. Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595 (2010).

  49. Kurtz, S. et al. Versatile and open software for comparing large genomes. Genome Biol. 5, R12 (2004).

  50. Zhu, W., Lomsadze, A. & Borodovsky, M. Ab initio gene identification in metagenomic sequences. Nucleic Acids Res. 38, e132 (2010).

  51. Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).

  52. Benson, D. A. et al. GenBank. Nucleic Acids Res. 41, D36–D42 (2013).

Acknowledgements

We thank M. Kennedy for help in processing clinical samples and the members of the Relman and Holmes laboratories for discussion and feedback. This work was supported by NIH R01AI112401 and NIH R56AI147023 (D.A.R.), EMBO Long-Term Fellowship ALTF 772-2014 (E.Y.), the Chan Zuckerberg Biohub Microbiome Initiative (D.A.R.) and the Thomas C. and Joan M. Merigan Endowment at Stanford University (D.A.R.).

Author information

Contributions

E.Y. and D.A.R. designed the study. E.Y. developed the methodology, performed and supervised the experiments, and performed the analysis. E.Y. and D.A.R. reviewed the analysis and wrote the manuscript.

Corresponding author

Correspondence to David A. Relman.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Hi-C contact density as a function of linear distance.

Intra-contig read density as a function of the distance between mapped read sides, colored according to the relative strand orientation of the two read sides.

Extended Data Fig. 2 Validation on a simulated microbial community.

The genomes of 55 common gut microbes (GOLD database) were downloaded, and 120 M simulated shotgun reads and 100 M simulated Hi-C reads were generated, with relative representation ranging from 1 to 1,000. HPIPE identified 32 MAGs. Shown is the density plot of the relative abundance of the entire metagenomic assembly (contigs >1 kb), as in Fig. 1d. The abundance is the enrichment of the read coverage over a uniform distribution of reads. White/gray stripes denote chunks of 10 Mb. The fraction of the assembly that was included in any recovered MAG (‘anchored contigs’) is depicted with a red line.
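The abundance measure described in this legend can be illustrated with a minimal Python sketch. It assumes that "enrichment over a uniform distribution of reads" means a contig's read density (reads per bp) divided by the density expected if all mapped reads were spread evenly over the assembly. The function name and input layout are hypothetical and are not taken from HPIPE.

```python
def relative_abundance(contigs):
    """Return per-contig coverage enrichment over a uniform read distribution.

    contigs: dict mapping contig name -> (length_bp, mapped_reads).
    Assumption: enrichment = (reads / length) / (total_reads / total_length).
    """
    if any(length <= 0 for length, _ in contigs.values()):
        raise ValueError("contig lengths must be positive")
    total_length = sum(length for length, _ in contigs.values())
    total_reads = sum(reads for _, reads in contigs.values())
    if total_reads <= 0:
        raise ValueError("at least one mapped read is required")
    # Read density expected if reads were spread uniformly over the assembly.
    uniform_density = total_reads / total_length
    return {
        name: (reads / length) / uniform_density
        for name, (length, reads) in contigs.items()
    }


# Hypothetical toy assembly: two contigs of equal length, unequal coverage.
example = relative_abundance({"c1": (1000, 300), "c2": (1000, 100)})
```

A value above 1 indicates a contig covered more deeply than the assembly average, and a value below 1 indicates the opposite.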

Extended Data Fig. 3 Validation on a synthetic microbial community.

The community was composed of Pediococcus pentosaceus (ATCC 25745), Lactobacillus brevis (ATCC 367), Burkholderia thailandensis (E264) and two strains of Escherichia coli (BL21 and K-12), as described in Beitel et al. (ref. 29). The pipeline recovered 4 anchor/union pairs. Shown is a pairwise gene alignment between the 4 inferred MAGs (genome unions) and the 5 reference genomes.

Extended Data Fig. 4 Contig-anchor contact enrichments over all anchors.

On the x-axis is the observed number of contacts between the contig and the anchor, and on the y-axis is the enrichment score over the background model. Anchor contigs are colored red, contigs belonging to other anchors are colored blue, and all other contigs are colored gray. Anchors are extended into MAGs (genome unions) by including contigs with ≥10-fold contact enrichment (dashed horizontal line), ≥8 contacts (dashed vertical line), and a false positive probability of 10⁻⁶ assuming a binomial distribution (transition between vertical and horizontal line).
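The three extension criteria in this legend can be sketched in Python as follows. The thresholds (10-fold enrichment, 8 contacts, false positive probability of 10⁻⁶) come from the legend. Everything else is an assumption made for illustration: the binomial is parameterized by the contig's total contact count and a hypothetical background probability of contacting the anchor, and the function names are not those of HPIPE.

```python
import math


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summing terms in log space."""
    if k <= 0:
        return 1.0
    if k > n or p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_choose = (math.lgamma(n + 1) - math.lgamma(i + 1)
                      - math.lgamma(n - i + 1))
        total += math.exp(log_choose + i * log_p + (n - i) * log_q)
    return min(total, 1.0)


def passes_extension(observed, total_contacts, background_p,
                     min_enrichment=10.0, min_contacts=8, max_fp=1e-6):
    """Apply the three criteria to one contig-anchor pair.

    Assumptions (not from the source): expected contacts are
    total_contacts * background_p, and the false positive probability is the
    binomial upper tail at the observed count.
    """
    expected = total_contacts * background_p
    if expected <= 0:
        return False
    enrichment = observed / expected
    fp_probability = binomial_upper_tail(observed, total_contacts, background_p)
    return (observed >= min_contacts
            and enrichment >= min_enrichment
            and fp_probability <= max_fp)
```

Under these assumptions, a contig with 50 observed contacts out of 1,000, against a background probability of 0.001, satisfies all three criteria, whereas a contig with only 5 contacts fails the contact-count threshold regardless of its enrichment.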

Extended Data Fig. 5 Examples of 2 putative novel MAGs.

On top, 68% of the genes of MAG a27 align to the Ruminococcaceae family (mean identity 74.3%), suggesting it is a novel species in that family. On the bottom, 88% of the genes of MAG a70 align to the Clostridiales order (mean identity 74.5%), indicating it is a novel genome within Lachnospiraceae or Eubacteriaceae. Each taxon is colored according to the mean amino acid identity, and the colored fraction of each rectangle represents the percentage of the aligned genes.

Extended Data Fig. 6 Comparison of HPIPE to alternative metagenomic binning methods.

Single-copy gene estimates of genome completeness percentage (in black) and contamination percentage (in red) with HPIPE, metaBAT2, and bin3C, sorted according to completeness. Minimal completeness (50%) and maximal contamination (10%) thresholds are depicted with dashed horizontal lines. Our results (HPIPE, as in Fig. 2c), are compared to metaBAT2 (tool based on abundance and tetranucleotide frequency), and bin3C (tool based on clustering of Hi-C data).

Extended Data Fig. 7 Comparison of anchors and cores.

(a) Shown for all 44 MAGs (genome unions) is the breakdown of genes into ‘core-only’, ‘anchor-only’, ‘both’ or ‘neither’, sorted according to the ‘both’ fraction. (b) The fraction of the 4 gene classifications, colored as in (a), is averaged over all 44 MAGs. Core-only genes (29%) are present due to the stringent selection of anchors, which considers only long contigs (>10 kb).

Extended Data Fig. 8 Species-level reference genomes for participant B.

Shown are the core and accessory fractions for the 44 MAGs that had a species-level reference for participant B. For both the recovered MAGs (left) and the matching species-level reference genomes (right), the core fraction is depicted using a colored rectangle, and the accessory fraction (that is, strain-specific genes) is depicted using a gray rectangle. Cores are colored according to genome similarity (nucleotide sequence identity) between MAG cores and matching reference cores.

Extended Data Fig. 9 Polymorphism and 10-year divergence patterns for participant B.

(a) Polymorphism levels, estimated using the density of intermediate alleles (SNPs with a frequency in the range 20–80%), are shown for 35 MAGs of participant B that had at least 10× coverage. (b) Host classification for the 44 MAGs of participant B. (c) The distribution among element classes, stratified according to element type (shared and non-shared). Data are normalized so that each type sums to 100%. (d) The distribution among element classes, stratified according to host class. Data are normalized so that each host class sums to 100%. Standard deviations are depicted using error bars.
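The polymorphism estimate in panel (a) can be illustrated with a short Python sketch. It assumes the density is the number of SNP sites whose allele frequency lies in the 20–80% range, divided by the number of genotyped base pairs. The choice of denominator and the function name are assumptions made for illustration.

```python
def intermediate_allele_density(allele_freqs, genome_length_bp,
                                low=0.2, high=0.8):
    """Return the number of intermediate-frequency SNPs per bp.

    allele_freqs: per-SNP allele frequencies as fractions in [0, 1].
    genome_length_bp: number of genotyped positions (assumed denominator).
    """
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    # Count sites whose frequency falls inside the intermediate range.
    n_intermediate = sum(1 for f in allele_freqs if low <= f <= high)
    return n_intermediate / genome_length_bp


# Hypothetical toy input: three of the five sites fall within 20-80%.
density = intermediate_allele_density([0.05, 0.2, 0.5, 0.8, 0.95], 1000)
```

Restricting the count to intermediate frequencies excludes near-fixed and very rare variants, which are more likely to reflect sequencing error or fixed differences than standing polymorphism.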

Extended Data Fig. 10 Attributes of the 12 MAGs classified as persistent over the 10-year period.

Columns indicate the participant (or Subject) in whom the MAG was found, the number of non-persistent accessory genes (HGT column), the number of non-synonymous (#Pn) and synonymous (#Ps) sites that were polymorphic within the genotyped sample, and the number of non-synonymous (#Dn) and synonymous (#Ds) sites that were divergent between the genotyped sample and the 10-year sample. Matching site densities (Pn, Ps, Dn and Ds) equal the number of sites divided by the total number of sites of each type (synonymous or non-synonymous). P-values are for the McDonald–Kreitman test (χ²), which examines whether the ratios Pn/Ps and Dn/Ds differ significantly.
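A χ² form of the McDonald–Kreitman test on the four counts described in this legend can be sketched in Python as follows. This is a minimal illustration that assumes a standard Pearson χ² test on the 2×2 table with one degree of freedom and no continuity correction. The source does not state whether a correction was applied, and the function name is hypothetical.

```python
import math


def mcdonald_kreitman_chi2(dn, ds, pn, ps):
    """Return (chi2, p_value) for the 2x2 table [[dn, ds], [pn, ps]].

    dn, ds: divergent non-synonymous and synonymous site counts.
    pn, ps: polymorphic non-synonymous and synonymous site counts.
    Assumption: Pearson chi-square, 1 degree of freedom, no correction.
    """
    n = dn + ds + pn + ps
    row_div, row_poly = dn + ds, pn + ps
    col_non, col_syn = dn + pn, ds + ps
    if min(row_div, row_poly, col_non, col_syn) <= 0:
        raise ValueError("every row and column total must be positive")
    chi2 = n * (dn * ps - ds * pn) ** 2 / (row_div * row_poly * col_non * col_syn)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Illustrative counts only; they are not taken from this study.
chi2, p_value = mcdonald_kreitman_chi2(dn=7, ds=17, pn=2, ps=42)
```

A small P-value indicates that Dn/Ds and Pn/Ps differ. For tables with small counts, an exact test may be more appropriate than the χ² approximation.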

Supplementary information

Supplementary Information

Supplementary Notes 1 and 2, and Supplementary Figs. 1–6.

Reporting Summary

Supplementary Table 1

This table includes three Supplementary Tables: Supplementary Table 1 contains gut genomes used for simulated data; Supplementary Table 2 contains public gut microbiome samples used in this study; and Supplementary Table 3 contains gene ontology enrichment tables a–e.

About this article

Cite this article

Yaffe, E., Relman, D.A. Tracking microbial evolution in the human gut using Hi-C reveals extensive horizontal gene transfer, persistence and adaptation. Nat Microbiol 5, 343–353 (2020). https://doi.org/10.1038/s41564-019-0625-0

  • DOI: https://doi.org/10.1038/s41564-019-0625-0
