Protocol | Published:

I-TASSER: a unified platform for automated protein structure and function prediction

Nature Protocols volume 5, pages 725–738 (2010)

Abstract

The iterative threading assembly refinement (I-TASSER) server is an integrated platform for automated protein structure and function prediction based on the sequence-to-structure-to-function paradigm. Starting from an amino acid sequence, I-TASSER first generates three-dimensional (3D) atomic models from multiple threading alignments and iterative structural assembly simulations. The function of the protein is then inferred by structurally matching the 3D models with other known proteins. The output from a typical server run contains full-length secondary and tertiary structure predictions, and functional annotations on ligand-binding sites, Enzyme Commission numbers and Gene Ontology terms. An estimate of the accuracy of the predictions is provided based on the confidence score of the modeling. This protocol provides new insights and guidelines for designing online server systems for state-of-the-art protein structure and function prediction. The server is available at http://zhanglab.ccmb.med.umich.edu/I-TASSER.
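
Because the pipeline starts from an amino acid sequence, a quick local check of the input can catch problems before a job is submitted. The following is a minimal sketch, assuming the sequence is held as FASTA-formatted text and that only the 20 standard one-letter amino acid codes are wanted; the function names are hypothetical and are not part of I-TASSER, and the server's actual input rules are not stated in this abstract.

```python
# Illustrative only: these names are hypothetical and do not come from I-TASSER.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def read_fasta_sequence(text):
    """Return the sequence of the first record in FASTA-formatted text."""
    residues = []
    seen_header = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seen_header or residues:
                break  # a second record starts here; keep only the first
            seen_header = True
            continue
        residues.append(line.upper())
    return "".join(residues)


def nonstandard_residues(sequence):
    """Return 1-based (position, letter) pairs outside the 20 standard codes."""
    return [(i, aa) for i, aa in enumerate(sequence, start=1)
            if aa not in STANDARD_AMINO_ACIDS]


# Made-up example record, used only to exercise the two helpers.
example = ">query hypothetical example\nMKTAYIAKQR\nQISFVKSHFS\n"
sequence = read_fasta_sequence(example)
print(len(sequence), nonstandard_residues(sequence))
```

An empty list from `nonstandard_residues` means every letter is one of the 20 standard codes; any reported positions can be inspected by hand before submission.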

Acknowledgements

We thank Dr. A. Szilagyi for reading the manuscript. This work was supported in part by the Alfred P. Sloan Foundation; the National Science Foundation (Career award 0746198); and the National Institute of General Medical Sciences (GM083107, GM084222).

Author information

Affiliations

  1. Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA.

    • Ambrish Roy & Yang Zhang

  2. Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, Lawrence, Kansas, USA.

    • Ambrish Roy, Alper Kucukural & Yang Zhang

Contributions

Y.Z. conceived and supervised the project. A.R., A.K. and Y.Z. designed and performed the experiments. A.R. and Y.Z. wrote the manuscript.

Corresponding author

Correspondence to Yang Zhang.

About this article

DOI: https://doi.org/10.1038/nprot.2010.5
