The iterative threading assembly refinement (I-TASSER) server is an integrated platform for automated protein structure and function prediction based on the sequence-to-structure-to-function paradigm. Starting from an amino acid sequence, I-TASSER first generates three-dimensional (3D) atomic models from multiple threading alignments and iterative structural assembly simulations. The function of the protein is then inferred by structurally matching the 3D models with other known proteins. The output from a typical server run contains full-length secondary and tertiary structure predictions, and functional annotations of ligand-binding sites, Enzyme Commission numbers and Gene Ontology terms. An estimate of the accuracy of the predictions is provided based on the confidence score of the modeling. This protocol provides new insights and guidelines for designing online server systems for state-of-the-art protein structure and function prediction. The server is available at http://zhanglab.ccmb.med.umich.edu/I-TASSER.
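Because the only input to the workflow described above is an amino acid sequence, it can help to check that sequence before submitting it. The following is a minimal sketch of such a check, not part of I-TASSER itself: the function names `read_single_fasta` and `nonstandard_positions` are hypothetical, and the assumption that the input is a single FASTA record restricted to the 20 standard one-letter amino acid codes is ours rather than a documented server requirement.

```python
# Hypothetical pre-submission check; these helpers are illustrative only
# and are not provided by the I-TASSER server.

# The 20 standard amino acids in one-letter code.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def read_single_fasta(text):
    """Return (header, sequence) from a string holding one FASTA record."""
    header = ""
    seen_header = False
    residues = []
    for raw_line in text.strip().splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if seen_header:
                raise ValueError("expected exactly one FASTA record")
            seen_header = True
            header = line[1:].strip()
        else:
            # Sequence may be wrapped over several lines; join them later.
            residues.append(line.upper())
    return header, "".join(residues)


def nonstandard_positions(sequence):
    """List (1-based position, letter) for every non-standard residue."""
    return [
        (index + 1, letter)
        for index, letter in enumerate(sequence)
        if letter not in STANDARD_AMINO_ACIDS
    ]


if __name__ == "__main__":
    header, sequence = read_single_fasta(">demo protein\nMKT\nAYX\n")
    print(header, len(sequence), nonstandard_positions(sequence))
```

Reporting positions rather than silently dropping unexpected letters leaves the decision about ambiguous residues (for example `X`) to the user.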
We thank Dr. A. Szilagyi for reading the manuscript. This work was supported in part by the Alfred P. Sloan Foundation; the National Science Foundation (Career award 0746198); and the National Institute of General Medical Sciences (GM083107, GM084222).
Roy, A., Kucukural, A. & Zhang, Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc 5, 725–738 (2010). https://doi.org/10.1038/nprot.2010.5