HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment

Abstract

Sequence-based protein function and structure prediction depends crucially on sequence-search sensitivity and accuracy of the resulting sequence alignments. We present an open-source, general-purpose tool that represents both query and database sequences by profile hidden Markov models (HMMs): 'HMM-HMM–based lightning-fast iterative sequence search' (HHblits; http://toolkit.genzentrum.lmu.de/hhblits/). Compared to the sequence-search tool PSI-BLAST, HHblits is faster owing to its discretized-profile prefilter, has 50–100% higher sensitivity and generates more accurate alignments.
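The abstract credits the speed gain to a discretized-profile prefilter. As a rough illustration only, and not the HHblits implementation, the sketch below shows one way a profile could be discretized: each profile column (a probability distribution over residues) is replaced by the index of the closest column in a small fixed set of prototype columns. The toy three-letter alphabet, the prototype values, the function name `discretize_profile`, and the choice of Kullback-Leibler divergence as the distance are all assumptions made for this example.

```python
import math

def discretize_profile(profile, prototypes):
    """Map each profile column to the index of its closest prototype column.

    Hypothetical helper for illustration; the distance used here (KL divergence)
    is an assumption, not the measure used by HHblits.
    """
    def kl(p, q, eps=1e-9):
        # KL divergence D(p || q) with a small epsilon to avoid log(0).
        return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

    return [
        min(range(len(prototypes)), key=lambda k: kl(col, prototypes[k]))
        for col in profile
    ]

# Toy prototype columns over a 3-letter alphabet (real profiles use 20 amino acids).
prototypes = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]

# Toy profile: one probability distribution per alignment column.
profile = [
    [0.7, 0.2, 0.1],
    [0.2, 0.2, 0.6],
    [0.1, 0.9, 0.0],
]

states = discretize_profile(profile, prototypes)
print(states)  # → [0, 2, 1]
```

Once a profile is reduced to a sequence of discrete states like this, it can in principle be compared using fast sequence-style scoring, which is the general motivation for discretizing before running a more expensive alignment.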

Figure 1: Workflow and benchmark comparison.
Figure 2: Structure predictions for Pfam families and the modeling of human Pip49 (also known as FAM69B).

Acknowledgements

We acknowledge financial support by the Deutsche Forschungsgemeinschaft (grant SFB646) and by a Gastprofessur grant from Ludwig-Maximilians Universität Munich financed through the Excellence Initiative of the Bundesministerium für Bildung und Forschung.

Author information

Contributions

M.R. performed research, J.S. initiated and guided research, A.B. generated the profile-column alphabet, A.H. contributed code for fast file access, and M.R. and J.S. wrote the manuscript.

Corresponding author

Correspondence to Johannes Söding.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–10, Supplementary Tables 1 and 2 (PDF 3828 kb)

Supplementary Data 1

100 random sequences from the nr database used for the run-time benchmark. (TXT 55 kb)

Supplementary Data 2

List of query-template pairs for the alignment benchmark. (TXT 84 kb)

Supplementary Data 3

3D homology model of PIP49/FAM69B. (TXT 161 kb)

Supplementary Data 4

Training and test sets of SCOP domain sequences for the sensitivity benchmark. (TXT 174 kb)

Supplementary Data 5

FASTA-formatted multiple sequence alignment for human PIP49/FAM69B built by HHblits. (TXT 701 kb)

About this article

Cite this article

Remmert, M., Biegert, A., Hauser, A. et al. HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat Methods 9, 173–175 (2012). https://doi.org/10.1038/nmeth.1818

  • DOI: https://doi.org/10.1038/nmeth.1818
