Brief Communication

Fast and sensitive protein alignment using DIAMOND

Nature Methods volume 12, pages 59–60 (2015)

Abstract

The alignment of sequencing reads against a protein reference database is a major computational bottleneck in metagenomics and data-intensive evolutionary projects. Although recent tools offer improved performance over the gold standard BLASTX, they exhibit only a modest speedup or low sensitivity. We introduce DIAMOND, an open-source algorithm based on double indexing that is 20,000 times faster than BLASTX on short reads and has a similar degree of sensitivity.

Acknowledgements

This research was partially supported by the National Research Foundation and Ministry of Education Singapore under its Research Centre of Excellence Programme, and by the A*STAR Computational Resource Centre through the use of its high-performance computing facilities.

Author information

Affiliations

  1. Department of Computer Science and Center for Bioinformatics, University of Tübingen, Tübingen, Germany.

    • Benjamin Buchfink
    • Daniel H Huson
  2. Singapore Centre on Environmental Life Sciences Engineering, School of Biological Sciences, Nanyang Technological University, Singapore.

    • Chao Xie
    • Daniel H Huson
  3. Life Sciences Institute, National University of Singapore, Singapore.

    • Chao Xie

Contributions

B.B. designed and implemented the algorithm. C.X. performed the experimental study. C.X. and D.H.H. initiated and guided the project. D.H.H. and B.B. wrote the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Benjamin Buchfink or Daniel H Huson.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–3 and Supplementary Tables 1–3

Zip files

  1. Supplementary Software

    DIAMOND v0.4.7 source code

About this article

DOI

https://doi.org/10.1038/nmeth.3176
