  • Review Article

Unveiling microbial diversity: harnessing long-read sequencing technology

Abstract

Long-read sequencing has recently transformed metagenomics, enhancing strain-level pathogen characterization, enabling accurate and complete metagenome-assembled genomes, and improving microbiome taxonomic classification and profiling. These advances stem not only from improvements in sequencing accuracy but also from rapidly evolving analysis methods. In this Review, we explore long-read sequencing’s profound impact on metagenomics, focusing on computational pipelines for genome assembly, taxonomic characterization and variant detection. We summarize recent advances in the field and provide an overview of the analytical methods available to fully leverage long reads. We provide insights into the advantages and disadvantages of long reads relative to short reads, and trace their evolution from the early days of long-read sequencing to their recent impact on metagenomics and clinical diagnostics. We further point out remaining challenges for the field, such as the integration of methylation signals in sub-strain analysis and the lack of benchmarks.

Fig. 1: Overview of long reads in metagenomics.
Fig. 2: A generalized decision tree for metagenomic studies.
Fig. 3: Graph representation of a metagenome assembly.

Acknowledgements

We thank D. M. Portik for important feedback and for kindly authorizing the use of Fig. 3. We also thank A. F. Pomerantz and K. D. Curry for their helpful feedback. This work was supported by the National Institute of Allergy and Infectious Diseases (grant nos. 1U19AI144297 and P01-AI152999) and by the National Science Foundation (grant nos. EF-2126387 and IIS-2239114). Y.F. and T.T. were supported in part by the National Institutes of Health NIAID award P01-AI152999. T.T. was also supported by the National Science Foundation grants EF-2126387, IIS-2239114 and CNS-1338099.

Author information

Corresponding author

Correspondence to Fritz J. Sedlazeck.

Ethics declarations

Competing interests

F.J.S. has received research funding from Illumina, PacBio, Genentech and Oxford Nanopore. V.K.M. is an employee of Genentech. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Methods thanks Ami Bhatt and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Lin Tang, in collaboration with the Nature Methods team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Agustinho, D.P., Fu, Y., Menon, V.K. et al. Unveiling microbial diversity: harnessing long-read sequencing technology. Nat Methods (2024). https://doi.org/10.1038/s41592-024-02262-1

  • DOI: https://doi.org/10.1038/s41592-024-02262-1
