I-TASSER: a unified platform for automated protein structure and function prediction

Abstract

The iterative threading assembly refinement (I-TASSER) server is an integrated platform for automated protein structure and function prediction based on the sequence-to-structure-to-function paradigm. Starting from an amino acid sequence, I-TASSER first generates three-dimensional (3D) atomic models from multiple threading alignments and iterative structural assembly simulations. The function of the protein is then inferred by structurally matching the 3D models with other known proteins. The output from a typical server run contains full-length secondary and tertiary structure predictions, and functional annotations on ligand-binding sites, Enzyme Commission numbers and Gene Ontology terms. An estimate of the accuracy of the predictions is provided, based on the confidence score of the modeling. This protocol provides new insights and guidelines for designing online server systems for state-of-the-art protein structure and function prediction. The server is available at http://zhanglab.ccmb.med.umich.edu/I-TASSER.
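
To make "structurally matching the 3D models with other known proteins" more concrete, the sketch below computes a TM-score, one widely used measure of structural similarity, for two sets of already-aligned and already-superposed C-alpha coordinates. The abstract does not say which score the server uses at this step, so the choice of TM-score here is an assumption. The function names (`tm_d0`, `tm_score_fixed`) are hypothetical, as is the 0.5 Å floor on the distance scale for short chains. A full implementation would also search for the superposition that maximizes the score; this sketch evaluates a single fixed superposition only.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (in angstroms) of the TM-score.

    The 0.5 floor for short chains is an assumption of this sketch.
    """
    if target_length <= 15:
        return 0.5
    return max(1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8, 0.5)


def tm_score_fixed(
    model: Sequence[Coord],
    native: Sequence[Coord],
    target_length: int,
) -> float:
    """TM-score for aligned residue pairs under one fixed superposition.

    model[i] and native[i] are the C-alpha coordinates of the i-th aligned
    pair. No search over superpositions is performed, so the value is a
    lower bound on the true (maximized) TM-score.
    """
    if len(model) != len(native):
        raise ValueError("model and native must contain the same number of aligned pairs")
    if target_length <= 0:
        raise ValueError("target_length must be positive")

    d0 = tm_d0(target_length)
    total = 0.0
    for a, b in zip(model, native):
        d = math.dist(a, b)  # Euclidean distance between the paired atoms
        total += 1.0 / (1.0 + (d / d0) ** 2)
    # Normalizing by the target length penalizes partial alignments.
    return total / target_length


if __name__ == "__main__":
    # Toy example: a straight 100-residue "chain" compared with itself.
    native = [(float(i), 0.0, 0.0) for i in range(100)]
    print(tm_score_fixed(native, native, 100))
```

Because each pair contributes 1/(1 + (d/d0)^2), pairs that are far apart add almost nothing rather than dominating the total, which is why the score is less sensitive to local errors than a plain RMSD.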

Figure 1: A schematic representation of the I-TASSER protocol for protein structure and function predictions.
Figure 2: Example of external restraint files.
Figure 3: An illustrative example of the I-TASSER result page.
Figure 4: An illustrative example of the I-TASSER result page.
Figure 5: Illustrative examples of the I-TASSER function predictions.

Acknowledgements

We thank Dr. A. Szilagyi for reading the manuscript. This work was supported in part by the Alfred P. Sloan Foundation; the National Science Foundation (Career award 0746198); and the National Institute of General Medical Sciences (GM083107, GM084222).

Author information

Contributions

Y.Z. conceived and supervised the project. A.R., A.K. and Y.Z. designed and performed the experiments. A.R. and Y.Z. wrote the manuscript.

Corresponding author

Correspondence to Yang Zhang.

About this article

Cite this article

Roy, A., Kucukural, A. & Zhang, Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc 5, 725–738 (2010). https://doi.org/10.1038/nprot.2010.5

  • DOI: https://doi.org/10.1038/nprot.2010.5
