Modern high-throughput metagenomics is producing hundreds of thousands of metagenome-assembled genomes (MAGs), which is overwhelming traditional sequence-similarity search methods. We present a computational method, skani, that efficiently compares MAGs on a terabyte scale while being robust to the inherent noise in MAGs, enabling larger and more accurate analyses.
Nayfach, S. et al. A genomic catalog of Earth’s microbiomes. Nat. Biotechnol. 39, 499–509 (2021). This paper highlights the scale of modern collections of MAGs, which number in the hundreds of thousands.
Ondov, B. D. et al. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol. 17, 132 (2016). This paper reports one of the first sketching methods for the rapid analysis of genomes.
Belbasi, M., Blanca, A., Harris, R. S., Koslicki, D. & Medvedev, P. The minimizer Jaccard estimator is biased and inconsistent. Bioinformatics 38, i169–i176 (2022). This paper shows that certain k-mer seeding schemes give theoretically incorrect estimates of ANI.
Hera, M. R., Pierce-Ward, N. T. & Koslicki, D. Deriving confidence intervals for mutation rates across a wide range of evolutionary distances using FracMinHash. Genome Res. https://doi.org/10.1101/gr.277651.123 (2023). Our paper uses this k-mer seeding scheme, which has almost no ANI bias.
Olm, M. R., Brown, C. T., Brooks, B. & Banfield, J. F. dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication. ISME J. 11, 2864–2868 (2017). This paper shows that Mash gives incorrect estimates of ANI in the presence of MAG incompleteness.
This is a summary of: Shaw, J. & Yu, Y. W. Fast and robust metagenomic sequence comparison through sparse chaining with skani. Nat. Methods, https://doi.org/10.1038/s41592-023-02018-3 (2023).
Skani enables accurate and efficient genome comparison for modern metagenomic datasets. Nat. Methods 20, 1633–1634 (2023). https://doi.org/10.1038/s41592-023-02019-2