  • Research Briefing

Skani enables accurate and efficient genome comparison for modern metagenomic datasets

Modern high-throughput metagenomics is producing hundreds of thousands of metagenome-assembled genomes (MAGs), which is overwhelming traditional sequence-similarity search methods. We present a computational method, skani, that efficiently compares MAGs on a terabyte scale while being robust to the inherent noise in MAGs, enabling larger and more accurate analyses.

Fig. 1: Skani gives improved clustering and speed over competing methods.

References

  1. Nayfach, S. et al. A genomic catalog of Earth’s microbiomes. Nat. Biotechnol. 39, 499–509 (2021). This paper highlights the scale of modern collections of MAGs, which number in the hundreds of thousands.

  2. Ondov, B. D. et al. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol. 17, 132 (2016). This paper reports one of the first sketching methods for the rapid analysis of genomes.

  3. Belbasi, M., Blanca, A., Harris, R. S., Koslicki, D. & Medvedev, P. The minimizer Jaccard estimator is biased and inconsistent. Bioinformatics 38, i169–i176 (2022). This paper shows that certain k-mer seeding schemes give theoretically incorrect estimates of ANI.

  4. Hera, M. R., Pierce-Ward, N. T. & Koslicki, D. Deriving confidence intervals for mutation rates across a wide range of evolutionary distances using FracMinHash. Genome Res. https://doi.org/10.1101/gr.277651.123 (2023). Our paper uses this k-mer seeding scheme, which has almost no ANI bias.

  5. Olm, M. R., Brown, C. T., Brooks, B. & Banfield, J. F. dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication. ISME J. 11, 2864–2868 (2017). This paper shows that Mash gives incorrect estimates of ANI in the presence of MAG incompleteness.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This is a summary of: Shaw, J. & Yu, Y. W. Fast and robust metagenomic sequence comparison through sparse chaining with skani. Nat. Methods, https://doi.org/10.1038/s41592-023-02018-3 (2023).

About this article

Cite this article

Skani enables accurate and efficient genome comparison for modern metagenomic datasets. Nat Methods 20, 1633–1634 (2023). https://doi.org/10.1038/s41592-023-02019-2

  • DOI: https://doi.org/10.1038/s41592-023-02019-2
