Abstract
The iterative threading assembly refinement (I-TASSER) server is an integrated platform for automated protein structure and function prediction based on the sequence-to-structure-to-function paradigm. Starting from an amino acid sequence, I-TASSER first generates three-dimensional (3D) atomic models from multiple threading alignments and iterative structural assembly simulations. The function of the protein is then inferred by structurally matching the 3D models with other known proteins. The output from a typical server run contains full-length secondary and tertiary structure predictions, and functional annotations on ligand-binding sites, Enzyme Commission numbers and Gene Ontology terms. An estimate of the accuracy of the predictions is provided, based on the confidence score of the modeling. This protocol provides new insights and guidelines for designing online server systems for state-of-the-art protein structure and function prediction. The server is available at http://zhanglab.ccmb.med.umich.edu/I-TASSER.
Acknowledgements
We thank Dr. A. Szilagyi for reading the manuscript. This work was supported in part by the Alfred P. Sloan Foundation; the National Science Foundation (Career award 0746198); and the National Institute of General Medical Sciences (GM083107, GM084222).
Author information
Contributions
Y.Z. conceived and supervised the project. A.R., A.K. and Y.Z. designed and performed the experiments. A.R. and Y.Z. wrote the manuscript.
About this article
Cite this article
Roy, A., Kucukural, A. & Zhang, Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc 5, 725–738 (2010). https://doi.org/10.1038/nprot.2010.5
DOI: https://doi.org/10.1038/nprot.2010.5