  • Protocol

I-TASSER-MTD: a deep-learning-based platform for multi-domain protein structure and function prediction

Abstract

Most proteins in cells are composed of multiple folding units (or domains) to perform complex functions in a cooperative manner. Relative to the rapid progress in single-domain structure prediction, there are few effective tools available for multi-domain protein structure assembly, mainly due to the complexity of modeling multi-domain proteins, which involves higher degrees of freedom in domain-orientation space and various levels of continuous and discontinuous domain assembly and linker refinement. To meet the challenge and the high demand of the community, we developed I-TASSER-MTD to model the structures and functions of multi-domain proteins through a progressive protocol that combines sequence-based domain parsing, single-domain structure folding, inter-domain structure assembly and structure-based function annotation in a fully automated pipeline. Advanced deep-learning models have been incorporated into each of the steps to enhance both the domain modeling and inter-domain assembly accuracy. The protocol allows for the incorporation of experimental cross-linking data and cryo-electron microscopy density maps to guide the multi-domain structure assembly simulations. I-TASSER-MTD is built on I-TASSER but substantially extends its ability and accuracy in modeling large multi-domain protein structures and provides meaningful functional insights for the targets at both the domain- and full-chain levels from the amino acid sequence alone.
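The four stages named in the abstract (domain parsing, single-domain folding, inter-domain assembly and function annotation) can be pictured as a simple data flow. The sketch below is illustrative only and is not the I-TASSER-MTD code or its interface: the names `Domain`, `domain_sequence` and `run_pipeline`, and the `fold`, `assemble` and `annotate` callables, are hypothetical placeholders. It assumes a domain is described by one or more 1-based, inclusive residue ranges, so that a discontinuous domain is simply a domain with more than one range.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical representation: a segment is a 1-based, inclusive residue range.
Segment = Tuple[int, int]


@dataclass(frozen=True)
class Domain:
    """A parsed domain; more than one segment means it is discontinuous."""
    name: str
    segments: Tuple[Segment, ...]

    @property
    def is_discontinuous(self) -> bool:
        return len(self.segments) > 1


def domain_sequence(full_seq: str, domain: Domain) -> str:
    """Concatenate the residues of every segment that belongs to the domain."""
    return "".join(full_seq[start - 1:end] for start, end in domain.segments)


def run_pipeline(
    full_seq: str,
    domains: Sequence[Domain],
    fold: Callable[[str], object],
    assemble: Callable[[str, Sequence[Domain], Dict[str, object]], object],
    annotate: Callable[[object], object],
):
    """Chain the stages: fold each domain, assemble the chain, annotate it.

    The three callables are placeholders for the real modelling steps,
    which are not reproduced here.
    """
    domain_models = {d.name: fold(domain_sequence(full_seq, d)) for d in domains}
    full_model = assemble(full_seq, domains, domain_models)
    return full_model, annotate(full_model)
```

In this toy representation, a domain with segments `((1, 3), (8, 10))` on a ten-residue chain is discontinuous, and `domain_sequence` joins residues 1–3 and 8–10 into a single sequence before folding.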

Fig. 1: Overview of the I-TASSER-MTD protocol for multi-domain protein structure and function prediction.
Fig. 2: Comparison between I-TASSER-MTD and other methods.
Fig. 3: Example of the I-TASSER-MTD results page (Sections 2 and 3).
Fig. 4: Example of the I-TASSER-MTD results page (Sections 4–6).
Fig. 5: Example of the I-TASSER-MTD results page (Sections 7 and 8).
Fig. 6: Example of the I-TASSER-MTD results page (Sections 9–12).

Data availability

The raw data and example files are available at https://zhanggroup.org/I-TASSER-MTD/ or from the corresponding author upon reasonable request.

Code availability

The I-TASSER-MTD standalone package is freely available for academic use at https://zhanggroup.org/I-TASSER-MTD/.

References

  1. Sali, A. & Blundell, T. L. Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 234, 779–815 (1993).

  2. Simons, K. T., Kooperberg, C., Huang, E. & Baker, D. Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. J. Mol. Biol. 268, 209–225 (1997).

  3. Xu, D. & Zhang, Y. Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field. Proteins 80, 1715–1735 (2012).

  4. Yang, J. et al. The I-TASSER Suite: protein structure and function prediction. Nat. Methods 12, 7–8 (2015).

  5. Weigt, M., White, R. A., Szurmant, H., Hoch, J. A. & Hwa, T. Identification of direct residue contacts in protein–protein interaction by message passing. Proc. Natl Acad. Sci. USA 106, 67–72 (2009).

  6. Marks, D. S. et al. Protein 3D structure computed from evolutionary sequence variation. PLoS ONE 6, e28766 (2011).

  7. Mortuza, S. et al. Improving fragment-based ab initio protein structure assembly using low-accuracy contact-map predictions. Nat. Commun. 12, 5011 (2021).

  8. Wang, S., Sun, S., Li, Z., Zhang, R. & Xu, J. Accurate de novo prediction of protein contact map by ultra-deep learning model. PLoS Comput. Biol. 13, e1005324 (2017).

  9. Senior, A. W. et al. Improved protein structure prediction using potentials from deep learning. Nature 577, 706–710 (2020).

  10. Li, Y., Hu, J., Zhang, C., Yu, D.-J. & Zhang, Y. ResPRE: high-accuracy protein contact prediction by coupling precision matrix with deep residual neural networks. Bioinformatics 35, 4647–4655 (2019).

  11. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

  12. Kryshtafovych, A., Schwede, T., Topf, M., Fidelis, K. & Moult, J. Critical assessment of methods of protein structure prediction (CASP)—Round XIV. Proteins 89, 1607–1617 (2021).

  13. Chothia, C., Gough, J., Vogel, C. & Teichmann, S. A. Evolution of the protein repertoire. Science 300, 1701–1703 (2003).

  14. Apic, G., Huber, W. & Teichmann, S. A. Multi-domain protein families and domain pairs: comparison with known structures and a random model of domain recombination. J. Struct. Funct. Genomics 4, 67–78 (2003).

  15. Han, J.-H., Batey, S., Nickson, A. A., Teichmann, S. A. & Clarke, J. The folding and evolution of multidomain proteins. Nat. Rev. Mol. Cell Biol. 8, 319 (2007).

  16. Zhou, X. G., Hu, J., Zhang, C. X., Zhang, G. J. & Zhang, Y. Assembling multidomain protein structures through analogous global structural alignments. Proc. Natl Acad. Sci. USA 116, 15930–15938 (2019).

  17. Xu, D., Jaroszewski, L., Li, Z. & Godzik, A. AIDA: ab initio domain assembly for automated multi-domain protein structure prediction and domain–domain interaction prediction. Bioinformatics 31, 2098–2105 (2015).

  18. Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Res. 28, 235–242 (2000).

  19. Zhang, Y. & Skolnick, J. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 33, 2302–2309 (2005).

  20. Xue, Z., Xu, D., Wang, Y. & Zhang, Y. ThreaDom: extracting protein domain boundary information from multiple threading alignments. Bioinformatics 29, i247–i256 (2013).

  21. Hong, S. H., Joo, K. & Lee, J. ConDo: protein domain boundary prediction using coevolutionary information. Bioinformatics 35, 2411–2417 (2019).

  22. Zheng, W. et al. FUpred: detecting protein domains through deep-learning based contact map prediction. Bioinformatics 36, 3749–3757 (2020).

  23. Wollacott, A. M., Zanghellini, A., Murphy, P. & Baker, D. Prediction of structures of multidomain proteins from structures of the individual domains. Protein Sci. 16, 165–175 (2007).

  24. Zhang, C., Zheng, W., Freddolino, P. L. & Zhang, Y. MetaGO: predicting Gene Ontology of non-homologous proteins through low-resolution protein structure prediction and protein–protein network mapping. J. Mol. Biol. 430, 2256–2265 (2018).

  25. Yao, S. et al. NetGO 2.0: improving large-scale protein function prediction with massive sequence, text, domain, family and network information. Nucleic Acids Res. 49, W469–W475 (2021).

  26. Piovesan, D. & Tosatto, S. C. INGA 2.0: improving protein function prediction for the dark proteome. Nucleic Acids Res. 47, W373–W378 (2019).

  27. Koo, D. C. E. & Bonneau, R. Towards region-specific propagation of protein functions. Bioinformatics 35, 1737–1744 (2019).

  28. Gligorijević, V. et al. Structure-based protein function prediction using graph convolutional networks. Nat. Commun. 12, 1–14 (2021).

  29. Pearce, R. & Zhang, Y. Toward the solution of the protein structure prediction problem. J. Biol. Chem. 297, 100870 (2021).

  30. Zheng, W. et al. Protein structure prediction using deep learning distance and hydrogen‐bonding restraints in CASP14. Proteins 89, 1734–1751 (2021).

  31. Zheng, W. et al. Deep-learning contact-map guided protein structure prediction in CASP13. Proteins 87, 1149–1164 (2019).

  32. Battey, J. N. et al. Automated server predictions in CASP7. Proteins 69 (Suppl.), 68–82 (2007).

  33. Croll, T. I., Sammito, M. D., Kryshtafovych, A. & Read, R. J. Evaluation of template-based modeling in CASP13. Proteins 87, 1113–1127 (2019).

  34. Zhang, Y. Template-based modeling and free modeling by I-TASSER in CASP7. Proteins 69 (Suppl.), 108–117 (2007).

  35. Zhang, Y. I-TASSER: fully automated protein structure prediction in CASP8. Proteins 77 (Suppl.), 100–113 (2009).

  36. Xu, D., Zhang, J., Roy, A. & Zhang, Y. Automated protein structure modeling in CASP9 by I-TASSER pipeline combined with QUARK-based ab initio folding and FG-MD-based structure refinement. Proteins 79 (Suppl.), 147–160 (2011).

  37. Zhang, Y. Interplay of I-TASSER and QUARK for template-based and ab initio protein structure prediction in CASP10. Proteins 82 (Suppl.), 175–187 (2014).

  38. Zhang, W. et al. Integration of QUARK and I-TASSER for ab initio protein structure prediction in CASP11. Proteins 84 (Suppl.), 76–86 (2016).

  39. Zhang, C., Mortuza, S. M., He, B., Wang, Y. & Zhang, Y. Template-based and free modeling of I-TASSER and QUARK pipelines using predicted contact maps in CASP12. Proteins 86 (Suppl.), 136–151 (2018).

  40. Roy, A., Kucukural, A. & Zhang, Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat. Protoc. 5, 725–738 (2010).

  41. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 770–778 (2016).

  42. Zheng, W. et al. LOMETS2: improved meta-threading server for fold-recognition and structure-based function annotation for distant-homology proteins. Nucleic Acids Res. 47, W429–W436 (2019).

  43. Wang, Y. et al. ThreaDomEx: a unified platform for predicting continuous and discontinuous protein domains by multiple-threading and segment assembly. Nucleic Acids Res. 45, W400–W407 (2017).

  44. Li, Y. et al. Protein inter‐residue contact and distance prediction by coupling complementary coevolution features with deep residual networks in CASP14. Proteins 89, 1911–1921 (2021).

  45. Zhang, C., Freddolino, P. L. & Zhang, Y. COFACTOR: improved protein function prediction by combining structure, sequence and protein–protein interaction information. Nucleic Acids Res. 45, W291–W299 (2017).

  46. Sillitoe, I. et al. CATH: increased structural coverage of functional space. Nucleic Acids Res. 49, D266–D273 (2021).

  47. Xu, Y., Xu, D. & Gabow, H. N. Protein domain decomposition using a graph-theoretic approach. Bioinformatics 16, 1091–1104 (2000).

  48. Steinegger, M. & Söding, J. Clustering huge protein sequence sets in linear time. Nat. Commun. 9, 2542 (2018).

  49. Steinegger, M., Mirdita, M. & Söding, J. Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold. Nat. Methods 16, 603–606 (2019).

  50. Mitchell, A. L. et al. MGnify: the microbiome analysis resource in 2020. Nucleic Acids Res. 48, D570–D578 (2020).

  51. Chen, I.-M. A. et al. The IMG/M data management and analysis system v. 6.0: new tools and advanced capabilities. Nucleic Acids Res. 49, D751–D763 (2021).

  52. Mirdita, M. et al. Uniclust databases of clustered and deeply annotated protein sequences and alignments. Nucleic Acids Res. 45, D170–D176 (2017).

  53. Suzek, B. E. et al. UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics 31, 926–932 (2015).

  54. Zhang, C., Zheng, W., Mortuza, S., Li, Y. & Zhang, Y. DeepMSA: constructing deep multiple sequence alignment to improve contact prediction and fold-recognition for distant-homology proteins. Bioinformatics 36, 2105–2112 (2020).

  55. Yan, R., Xu, D., Yang, J., Walker, S. & Zhang, Y. A comparative assessment and analysis of 20 representative sequence alignment methods for protein structure prediction. Sci. Rep. 3, 1–9 (2013).

  56. Ekeberg, M., Hartonen, T. & Aurell, E. Fast pseudolikelihood maximization for direct-coupling analysis of protein structure from many homologous amino-acid sequences. J. Comput. Phys. 276, 341–356 (2014).

  57. Yang, J. et al. Improved protein structure prediction using predicted interresidue orientations. Proc. Natl Acad. Sci. USA 117, 1496–1503 (2020).

  58. Thrun, S. in Advances in Neural Information Processing Systems 640–646 (Morgan Kaufmann Publishers, 1996).

  59. Zheng, W., Zhang, C., Bell, E. W. & Zhang, Y. I-TASSER gateway: a protein structure and function prediction server powered by XSEDE. Future Gener. Comput. Syst. 99, 73–85 (2019).

  60. Zhang, Y. I-TASSER server for protein 3D structure prediction. BMC Bioinforma. 9, 40 (2008).

  61. Zheng, W. et al. Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations. Cell Rep. Methods 1, 100014 (2021).

  62. Li, Y., Zhang, C., Bell, E. W., Yu, D. J. & Zhang, Y. Ensembling multiple raw coevolutionary features with deep residual neural networks for contact-map prediction in CASP13. Proteins 87, 1082–1091 (2019).

  63. Li, Y. et al. Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks. PLOS Comput. Biol. 17, e1008865 (2021).

  64. He, B., Mortuza, S., Wang, Y., Shen, H.-B. & Zhang, Y. NeBcon: protein contact map prediction using neural network training coupled with naïve Bayes classifiers. Bioinformatics 33, 2296–2306 (2017).

  65. Zhang, Y. & Skolnick, J. SPICKER: a clustering approach to identify near‐native protein folds. J. Comput. Chem. 25, 865–871 (2004).

  66. Huang, X., Pearce, R. & Zhang, Y. FASPR: an open-source tool for fast and accurate protein side-chain packing. Bioinformatics 36, 3758–3765 (2020).

  67. Zhang, J., Liang, Y. & Zhang, Y. Atomic-level protein structure refinement using fragment-guided molecular dynamics conformation sampling. Structure 19, 1784–1795 (2011).

  68. Zhang, Y. & Skolnick, J. Scoring function for automated assessment of protein structure template quality. Proteins 57, 702–710 (2004).

  69. Ramachandran, G. N. & Sasisekharan, V. in Advances in Protein Chemistry Vol. 23, 283–437 (Elsevier, 1968).

  70. Roy, A., Yang, J. & Zhang, Y. COFACTOR: an accurate comparative algorithm for structure-based protein function annotation. Nucleic Acids Res. 40, W471–W477 (2012).

  71. Yang, J., Roy, A. & Zhang, Y. BioLiP: a semi-manually curated database for biologically relevant ligand–protein interactions. Nucleic Acids Res. 41, D1096–D1103 (2012).

  72. Zhou, X. G., Peng, C. X., Liu, J., Zhang, Y. & Zhang, G. J. Underestimation-assisted global-local cooperative differential evolution and the application to protein structure prediction. IEEE Trans. Evol. Comput. 24, 536–550 (2020).

  73. Zhou, X. G. & Zhang, G. J. Abstract convex underestimation assisted multistage differential evolution. IEEE Trans. Cybern. 47, 2730–2741 (2017).

  74. Zhou, X. G. & Zhang, G. J. Differential evolution with underestimation-based multimutation strategy. IEEE Trans. Cybern. 49, 1353–1364 (2018).

  75. Yang, J., Wang, Y. & Zhang, Y. ResQ: an approach to unified estimation of B-factor and residue-specific error in protein structure prediction. J. Mol. Biol. 428, 693–701 (2016).

  76. Glaeser, R. M. How good can cryo-EM become? Nat. Methods 13, 28–32 (2016).

  77. Zhou, X. G. et al. Progressive assembly of multi-domain protein structures from cryo-EM density maps. Nat. Comput. Sci. 2, 265–275 (2022).

  78. Mistry, J. et al. Pfam: the protein families database in 2021. Nucleic Acids Res. 49, D412–D419 (2021).

  79. Lu, S. et al. CDD/SPARCLE: the conserved domain database in 2020. Nucleic Acids Res. 48, D265–D268 (2020).

  80. Eickholt, J., Deng, X. & Cheng, J. DoBo: protein domain boundary prediction by integrating evolutionary signals and machine learning. BMC Bioinforma. 12, 1–8 (2011).

  81. Tai, C. H., Lee, W. J., Vincent, J. J. & Lee, B. Evaluation of domain prediction in CASP6. Proteins 61, 183–192 (2005).

  82. Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876 (2021).

  83. Pearce, R. & Zhang, Y. Deep learning techniques have significantly impacted protein structure prediction and protein design. Curr. Opin. Struct. Biol. 68, 194–207 (2021).

  84. Born, A., Henen, M. A. & Vögeli, B. Activity and affinity of Pin1 variants. Molecules 25, 36 (2020).

  85. Born, A. et al. Reconstruction of coupled intra- and interdomain protein motion from nuclear and electron magnetic resonance. J. Am. Chem. Soc. 143, 16055–16067 (2021).

  86. Chandonia, J.-M., Fox, N. K. & Brenner, S. E. SCOPe: manual curation and artifact removal in the structural classification of proteins—extended database. J. Mol. Biol. 429, 348–355 (2017).

  87. Lam, S. D. et al. Gene3D: expanding the utility of domain assignments. Nucleic Acids Res. 44, D404–D409 (2015).

  88. Yu, L. et al. Grammar of protein domain architectures. Proc. Natl Acad. Sci. USA 116, 3636–3645 (2019).

  89. Chothia, C. & Lesk, A. M. The relation between the divergence of sequence and structure in proteins. EMBO J. 5, 823–826 (1986).

  90. Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of Coot. Acta Crystallogr. D. Biol. Crystallogr. 66, 486–501 (2010).

  91. DiMaio, F. et al. Atomic-accuracy models from 4.5-Å cryo-electron microscopy data with density-guided iterative local refinement. Nat. Methods 12, 361–365 (2015).

  92. Zhang, C. et al. Functions of essential genes and a scale-free protein interaction network revealed by structure-based function and interaction prediction for a minimal genome. J. Proteome Res. 20, 1178–1189 (2021).

  93. Zhang, C., Wei, X., Omenn, G. S. & Zhang, Y. Structure and protein interaction-based gene ontology annotations reveal likely functions of uncharacterized proteins on human chromosome 17. J. Proteome Res. 17, 4186–4196 (2018).

  94. Zhang, C., Lane, L., Omenn, G. S. & Zhang, Y. Blinded testing of function annotation for uPE1 proteins by I-TASSER/COFACTOR pipeline using the 2018–2019 additions to neXtProt and the CAFA3 challenge. J. Proteome Res. 18, 4154–4166 (2019).

  95. Iyer, S., Subramanian, V. & Acharya, K. R. C9orf72, a protein associated with amyotrophic lateral sclerosis (ALS) is a guanine nucleotide exchange factor. PeerJ 6, e5815 (2018).

  96. Skotnicová, P. et al. The cyanobacterial protoporphyrinogen oxidase HemJ is a new b-type heme protein functionally coupled with coproporphyrinogen III oxidase. J. Biol. Chem. 293, 12394–12404 (2018).

  97. Hanson, R. M., Prilusky, J., Renjian, Z., Nakane, T. & Sussman, J. L. JSmol and the next‐generation web‐based representation of 3D molecular structure as applied to proteopedia. Isr. J. Chem. 53, 207–216 (2013).

  98. Hiranuma, N. et al. Improved protein structure refinement guided by deep learning based accuracy estimation. Nat. Commun. 12, 1–11 (2021).

  99. Guo, S.-S., Liu, J., Zhou, X. & Zhang, G. DeepUMQA: ultrafast shape recognition-based protein model quality assessment using deep learning. Bioinformatics 38, 1895–1903 (2022).

  100. Xu, J. & Zhang, Y. How significant is a protein structure similarity with TM-score = 0.5? Bioinformatics 26, 889–895 (2010).

  101. Ellson, J., Gansner, E.R., Koutsofios, E., North, S.C. & Woodhull, G. in Graph Drawing Software 127–148 (Springer, 2004).

  102. Towns, J. et al. XSEDE: accelerating scientific discovery. Comput. Sci. Eng. 16, 62–74 (2014).

  103. Xu, J. Distance-based protein folding powered by deep learning. Proc. Natl Acad. Sci. USA 116, 16856–16865 (2019).

  104. Källberg, M. et al. Template-based protein structure modeling using the RaptorX web server. Nat. Protoc. 7, 1511–1522 (2012).

  105. Du, Z. et al. The trRosetta server for fast and accurate protein structure prediction. Nat. Protoc. 16, 5634–5651 (2021).

  106. Lobley, A., Sadowski, M. I. & Jones, D. T. pGenTHREADER and pDomTHREADER: new methods for improved protein fold recognition and superfamily discrimination. Bioinformatics 25, 1761–1767 (2009).

  107. Zimmermann, L. et al. A completely reimplemented MPI bioinformatics toolkit with a new HHpred server at its core. J. Mol. Biol. 430, 2237–2243 (2018).

  108. Kelley, L. A., Mezulis, S., Yates, C. M., Wass, M. N. & Sternberg, M. J. The Phyre2 web portal for protein modeling, prediction and analysis. Nat. Protoc. 10, 845–858 (2015).

Acknowledgements

This work is supported in part by the National Institute of General Medical Sciences (GM136422 and S10OD026825 to Y.Z.), the National Institute of Allergy and Infectious Diseases (AI134678 to Y.Z.), the National Science Foundation (IIS1901191 and DBI2030790 to Y.Z.), the National Natural Science Foundation of China (62173304 and 61773346 to G.Z.), the ‘New Generation Artificial Intelligence’ major project of Science and Technology Innovation 2030 of the Ministry of Science and Technology of China (2021ZD0150100 to G.Z.) and the Key Project of Zhejiang Provincial Natural Science Foundation of China (LZ20F030002 to G.Z.). This work used the Extreme Science and Engineering Discovery Environment (XSEDE; ref. 102), which is supported by the National Science Foundation (ACI1548562).

Author information

Contributions

Y.Z. conceived and designed the project. X.Z. developed the pipeline and performed the tests. W.Z. developed the method for domain boundary prediction. Y.L. developed the method for contact and distance prediction. C.Z. developed the method for protein function prediction. Y.Z., W.Z., Y.L., C.Z. and R.P. developed the method for individual domain modeling. X.Z. developed the method for multi-domain protein structure assembly. X.Z. and E.B. tested the server. G.Z. helped supervise the research. X.Z. and Y.Z. wrote the manuscript, and all authors read and approved the final manuscript.

Corresponding author

Correspondence to Yang Zhang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Protocols thanks Ruben Sánchez-García, Beat R. Vogeli and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

Key references using this protocol

Zhou, X. et al. Proc. Natl Acad. Sci. USA 116, 15930–15938 (2019): https://doi.org/10.1073/pnas.1905068116

Zhang, C. et al. Nucleic Acids Res. 45, W291–W299 (2017): https://doi.org/10.1093/nar/gkx366

Zheng, W. et al. Cell Rep. Methods 1, 100014 (2021): https://doi.org/10.1016/j.crmeth.2021.100014

Hermes, C. et al. Nat. Commun. 12, 144 (2021): https://doi.org/10.1038/s41467-020-20418-3

Supplementary information

Supplementary Figs. 1–19, Tables 1 and 2, Notes 1–6, Equations 1–25 and References.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhou, X., Zheng, W., Li, Y. et al. I-TASSER-MTD: a deep-learning-based platform for multi-domain protein structure and function prediction. Nat Protoc 17, 2326–2353 (2022). https://doi.org/10.1038/s41596-022-00728-0

  • DOI: https://doi.org/10.1038/s41596-022-00728-0
