Short-read metagenomic sequencing and de novo genome assembly of the human gut microbiome can yield draft bacterial genomes without isolation and culture. However, bacterial genomes assembled from short-read sequencing are often fragmented. Furthermore, these metagenome-assembled genomes often exclude repeated genomic elements, such as mobile genetic elements, compromising our understanding of the contribution of these elements to important bacterial phenotypes. Although long-read sequencing has been applied successfully to the assembly of contiguous bacterial isolate genomes, extraction of DNA of sufficient molecular weight, purity and quantity for metagenomic sequencing from stool samples can be challenging. Here, we present a protocol for the extraction of microgram quantities of high-molecular-weight DNA from human stool samples that is suitable for downstream long-read sequencing applications. We also present Lathe (www.github.com/bhattlab/lathe), a computational workflow for long-read basecalling, assembly, consensus refinement with long reads or Illumina short reads, and genome circularization. Altogether, this protocol can yield high-quality contiguous or circular bacterial genomes from a complex human gut sample in approximately 10 d, with 2 d of hands-on bench and computational effort.
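The genome circularization step named above can be pictured with a toy example. When a circular chromosome is assembled as a linear contig, the same sequence often appears at both ends, and trimming one copy gives a closed genome. The sketch below is an illustration only and does not reproduce Lathe's implementation: the function names `find_end_overlap` and `trim_circular` and the `min_overlap` parameter are hypothetical, and it assumes an exact, error-free match between the two ends. Real long-read assemblies contain errors, so practical workflows generally rely on alignment-based overlap detection instead.

```python
from typing import Optional


def find_end_overlap(contig: str, min_overlap: int = 4) -> Optional[int]:
    """Return the length of the longest exact suffix/prefix match, or None.

    Hypothetical helper: assumes error-free sequence, unlike real long reads.
    """
    # Limit the search to half the contig so the two ends stay distinct.
    max_len = len(contig) // 2
    for k in range(max_len, min_overlap - 1, -1):
        if contig[-k:] == contig[:k]:
            return k
    return None


def trim_circular(contig: str, min_overlap: int = 4) -> str:
    """Drop the duplicated end of a contig whose ends overlap exactly."""
    k = find_end_overlap(contig, min_overlap)
    # No qualifying overlap: return the contig unchanged (treated as linear).
    return contig[:-k] if k else contig


if __name__ == "__main__":
    # A 12-bp "circular genome" assembled with its first 5 bases repeated.
    linear = "ACGTTGCAGGTC" + "ACGTT"
    print(find_end_overlap(linear))
    print(trim_circular(linear))
```

The search runs from the longest candidate overlap downward so that the full duplicated region is removed rather than a shorter coincidental match.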
No new data were generated or analyzed for this manuscript.
Lathe is available at https://github.com/bhattlab/lathe. Post-assembly binning workflows can be found at https://github.com/bhattlab/metagenomics_workflows.
We thank all members of the Bhatt laboratory for experimental advice and discussions. We thank Brayon Fremin for making suggestions for the abbreviated DNA extraction protocol, and Matthew Grieshop, Keenan Manpearl, David Sanchez Godinez and Alexandra I. Strom for helpful comments on the manuscript. D.G.M. was supported by the Stanford Graduate Fellowships in Science and Engineering program. E.L.M. was supported by the National Science Foundation Graduate Research Fellowship no. DGE-114747. This work was supported by the Damon Runyon Clinical Investigator Award, grant nos. NIH R01AI148623 and NIH R01AI143757 to the Bhatt laboratory and grant no. NIH P30 AG047366, which supports the Stanford ADRC. Computational work was supported by NIH S10 Shared Instrumentation grant no. 1S10OD02014101 and by NIH grant no. P30 CA124435, which supports the Genetics Bioinformatics Service Center, a Stanford Cancer Institute Shared Resource.
The authors declare no competing interests.
Peer review information Nature Protocols thanks Stephen Nayfach and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Key reference using this protocol
Moss, E. L., Maghini, D. G. & Bhatt, A. S. Nat. Biotechnol. 38, 701–707 (2020): https://doi.org/10.1038/s41587-020-0422-6
About this article
Cite this article
Maghini, D.G., Moss, E.L., Vance, S.E. et al. Improved high-molecular-weight DNA extraction, nanopore sequencing and metagenomic assembly from the human gut microbiome. Nat Protoc 16, 458–471 (2021). https://doi.org/10.1038/s41596-020-00424-x