I-TASSER-MTD: a deep-learning-based platform for multi-domain protein structure and function prediction

Abstract

Most proteins in cells are composed of multiple folding units (or domains) to perform complex functions in a cooperative manner. Relative to the rapid progress in single-domain structure prediction, there are few effective tools available for multi-domain protein structure assembly, mainly due to the complexity of modeling multi-domain proteins, which involves higher degrees of freedom in domain-orientation space and various levels of continuous and discontinuous domain assembly and linker refinement. To meet the challenge and the high demand of the community, we developed I-TASSER-MTD to model the structures and functions of multi-domain proteins through a progressive protocol that combines sequence-based domain parsing, single-domain structure folding, inter-domain structure assembly and structure-based function annotation in a fully automated pipeline. Advanced deep-learning models have been incorporated into each of the steps to enhance both the domain modeling and inter-domain assembly accuracy. The protocol allows for the incorporation of experimental cross-linking data and cryo-electron microscopy density maps to guide the multi-domain structure assembly simulations. I-TASSER-MTD is built on I-TASSER but substantially extends its ability and accuracy in modeling large multi-domain protein structures and provides meaningful functional insights for the targets at both the domain- and full-chain levels from the amino acid sequence alone.
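The first stage named above, sequence-based domain parsing, has to represent both continuous and discontinuous domains before the later folding and assembly stages can run. The following is a minimal sketch of one way such a parse could be held in memory. It is not the I-TASSER-MTD code or file format: the `Domain` class and the `domain_sequence` and `linker_regions` helpers are hypothetical names introduced here for illustration, and residue ranges are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: a segment is a 1-based, inclusive residue range (start, end).
Segment = Tuple[int, int]


@dataclass
class Domain:
    """Hypothetical container for one parsed domain."""
    name: str
    segments: List[Segment]

    @property
    def is_discontinuous(self) -> bool:
        # A domain built from more than one sequence segment is discontinuous.
        return len(self.segments) > 1


def domain_sequence(sequence: str, domain: Domain) -> str:
    """Concatenate the residues covered by the domain's segments, in sequence order."""
    return "".join(sequence[start - 1:end] for start, end in sorted(domain.segments))


def linker_regions(length: int, domains: List[Domain]) -> List[Segment]:
    """Return the residue ranges that are not assigned to any domain."""
    covered = sorted(seg for d in domains for seg in d.segments)
    linkers: List[Segment] = []
    cursor = 1  # first residue not yet accounted for
    for start, end in covered:
        if start > cursor:
            linkers.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= length:
        linkers.append((cursor, length))
    return linkers


if __name__ == "__main__":
    seq = "ACDEFGHIKLMNPQRSTVWY"  # toy 20-residue sequence, not real data
    domains = [
        Domain("D1", [(1, 5), (14, 18)]),  # discontinuous: two segments
        Domain("D2", [(7, 12)]),           # continuous: one segment
    ]
    for d in domains:
        print(d.name, d.is_discontinuous, domain_sequence(seq, d))
    print("linkers:", linker_regions(len(seq), domains))
```

In this sketch each domain sequence would be handed to the single-domain folding stage, while the linker ranges mark the residues left to be placed during inter-domain assembly.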

Fig. 1: Overview of the I-TASSER-MTD protocol for multi-domain protein structure and function prediction.
Fig. 2: Comparison between I-TASSER-MTD and other methods.
Fig. 3: Example of the I-TASSER-MTD results page (Sections 2 and 3).
Fig. 4: Example of the I-TASSER-MTD results page (Sections 4–6).
Fig. 5: Example of the I-TASSER-MTD results page (Sections 7 and 8).
Fig. 6: Example of the I-TASSER-MTD results page (Sections 9–12).

Data availability

The raw data and example files are available at https://zhanggroup.org/I-TASSER-MTD/ or from the corresponding author upon reasonable request.

Code availability

The I-TASSER-MTD standalone package is freely available for academic use at https://zhanggroup.org/I-TASSER-MTD/.

References

  1. Sali, A. & Blundell, T. L. Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 234, 779–815 (1993).

  2. Simons, K. T., Kooperberg, C., Huang, E. & Baker, D. Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. J. Mol. Biol. 268, 209–225 (1997).

  3. Xu, D. & Zhang, Y. Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field. Proteins 80, 1715–1735 (2012).

  4. Yang, J. et al. The I-TASSER Suite: protein structure and function prediction. Nat. Methods 12, 7–8 (2015).

  5. Weigt, M., White, R. A., Szurmant, H., Hoch, J. A. & Hwa, T. Identification of direct residue contacts in protein–protein interaction by message passing. Proc. Natl Acad. Sci. USA 106, 67–72 (2009).

  6. Marks, D. S. et al. Protein 3D structure computed from evolutionary sequence variation. PLoS ONE 6, e28766 (2011).

  7. Mortuza, S. et al. Improving fragment-based ab initio protein structure assembly using low-accuracy contact-map predictions. Nat. Commun. 12, 5011 (2021).

  8. Wang, S., Sun, S., Li, Z., Zhang, R. & Xu, J. Accurate de novo prediction of protein contact map by ultra-deep learning model. PLoS Comput. Biol. 13, e1005324 (2017).

  9. Senior, A. W. et al. Improved protein structure prediction using potentials from deep learning. Nature 577, 706–710 (2020).

  10. Li, Y., Hu, J., Zhang, C., Yu, D.-J. & Zhang, Y. ResPRE: high-accuracy protein contact prediction by coupling precision matrix with deep residual neural networks. Bioinformatics 35, 4647–4655 (2019).

  11. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

  12. Kryshtafovych, A., Schwede, T., Topf, M., Fidelis, K. & Moult, J. Critical assessment of methods of protein structure prediction (CASP)—Round XIV. Proteins 89, 1607–1617 (2021).

  13. Chothia, C., Gough, J., Vogel, C. & Teichmann, S. A. Evolution of the protein repertoire. Science 300, 1701–1703 (2003).

  14. Apic, G., Huber, W. & Teichmann, S. A. Multi-domain protein families and domain pairs: comparison with known structures and a random model of domain recombination. J. Struct. Funct. Genomics 4, 67–78 (2003).

  15. Han, J.-H., Batey, S., Nickson, A. A., Teichmann, S. A. & Clarke, J. The folding and evolution of multidomain proteins. Nat. Rev. Mol. Cell Biol. 8, 319 (2007).

  16. Zhou, X. G., Hu, J., Zhang, C. X., Zhang, G. J. & Zhang, Y. Assembling multidomain protein structures through analogous global structural alignments. Proc. Natl Acad. Sci. USA 116, 15930–15938 (2019).

  17. Xu, D., Jaroszewski, L., Li, Z. & Godzik, A. AIDA: ab initio domain assembly for automated multi-domain protein structure prediction and domain–domain interaction prediction. Bioinformatics 31, 2098–2105 (2015).

  18. Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Res. 28, 235–242 (2000).

  19. Zhang, Y. & Skolnick, J. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 33, 2302–2309 (2005).

  20. Xue, Z., Xu, D., Wang, Y. & Zhang, Y. ThreaDom: extracting protein domain boundary information from multiple threading alignments. Bioinformatics 29, i247–i256 (2013).

  21. Hong, S. H., Joo, K. & Lee, J. ConDo: protein domain boundary prediction using coevolutionary information. Bioinformatics 35, 2411–2417 (2019).

  22. Zheng, W. et al. FUpred: detecting protein domains through deep-learning based contact map prediction. Bioinformatics 36, 3749–3757 (2020).

  23. Wollacott, A. M., Zanghellini, A., Murphy, P. & Baker, D. Prediction of structures of multidomain proteins from structures of the individual domains. Protein Sci. 16, 165–175 (2007).

  24. Zhang, C., Zheng, W., Freddolino, P. L. & Zhang, Y. MetaGO: predicting Gene Ontology of non-homologous proteins through low-resolution protein structure prediction and protein–protein network mapping. J. Mol. Biol. 430, 2256–2265 (2018).

  25. Yao, S. et al. NetGO 2.0: improving large-scale protein function prediction with massive sequence, text, domain, family and network information. Nucleic Acids Res. 49, W469–W475 (2021).

  26. Piovesan, D. & Tosatto, S. C. INGA 2.0: improving protein function prediction for the dark proteome. Nucleic Acids Res. 47, W373–W378 (2019).

  27. Koo, D. C. E. & Bonneau, R. Towards region-specific propagation of protein functions. Bioinformatics 35, 1737–1744 (2019).

  28. Gligorijević, V. et al. Structure-based protein function prediction using graph convolutional networks. Nat. Commun. 12, 1–14 (2021).

  29. Pearce, R. & Zhang, Y. Toward the solution of the protein structure prediction problem. J. Biol. Chem. 297, 100870 (2021).

  30. Zheng, W. et al. Protein structure prediction using deep learning distance and hydrogen‐bonding restraints in CASP14. Proteins 89, 1734–1751 (2021).

  31. Zheng, W. et al. Deep-learning contact-map guided protein structure prediction in CASP13. Proteins 87, 1149–1164 (2019).

  32. Battey, J. N. et al. Automated server predictions in CASP7. Proteins 69 (Suppl.), 68–82 (2007).

  33. Croll, T. I., Sammito, M. D., Kryshtafovych, A. & Read, R. J. Evaluation of template-based modeling in CASP13. Proteins 87, 1113–1127 (2019).

  34. Zhang, Y. Template-based modeling and free modeling by I-TASSER in CASP7. Proteins 69 (Suppl.), 108–117 (2007).

  35. Zhang, Y. I-TASSER: fully automated protein structure prediction in CASP8. Proteins 77 (Suppl.), 100–113 (2009).

  36. Xu, D., Zhang, J., Roy, A. & Zhang, Y. Automated protein structure modeling in CASP9 by I-TASSER pipeline combined with QUARK-based ab initio folding and FG-MD-based structure refinement. Proteins 79 (Suppl.), 147–160 (2011).

  37. Zhang, Y. Interplay of I-TASSER and QUARK for template-based and ab initio protein structure prediction in CASP10. Proteins 82 (Suppl.), 175–187 (2014).

  38. Zhang, W. et al. Integration of QUARK and I-TASSER for ab initio protein structure prediction in CASP11. Proteins 84 (Suppl.), 76–86 (2016).

  39. Zhang, C., Mortuza, S. M., He, B., Wang, Y. & Zhang, Y. Template-based and free modeling of I-TASSER and QUARK pipelines using predicted contact maps in CASP12. Proteins 86 (Suppl.), 136–151 (2018).

  40. Roy, A., Kucukural, A. & Zhang, Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat. Protoc. 5, 725–738 (2010).

  41. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 770–778 (2016).

  42. Zheng, W. et al. LOMETS2: improved meta-threading server for fold-recognition and structure-based function annotation for distant-homology proteins. Nucleic Acids Res. 47, W429–W436 (2019).

  43. Wang, Y. et al. ThreaDomEx: a unified platform for predicting continuous and discontinuous protein domains by multiple-threading and segment assembly. Nucleic Acids Res. 45, W400–W407 (2017).

  44. Li, Y. et al. Protein inter‐residue contact and distance prediction by coupling complementary coevolution features with deep residual networks in CASP14. Proteins 89, 1911–1921 (2021).

  45. Zhang, C., Freddolino, P. L. & Zhang, Y. COFACTOR: improved protein function prediction by combining structure, sequence and protein–protein interaction information. Nucleic Acids Res. 45, W291–W299 (2017).

  46. Sillitoe, I. et al. CATH: increased structural coverage of functional space. Nucleic Acids Res. 49, D266–D273 (2021).

  47. Xu, Y., Xu, D. & Gabow, H. N. Protein domain decomposition using a graph-theoretic approach. Bioinformatics 16, 1091–1104 (2000).

  48. Steinegger, M. & Söding, J. Clustering huge protein sequence sets in linear time. Nat. Commun. 9, 2542 (2018).

  49. Steinegger, M., Mirdita, M. & Söding, J. Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold. Nat. Methods 16, 603–606 (2019).

  50. Mitchell, A. L. et al. MGnify: the microbiome analysis resource in 2020. Nucleic Acids Res. 48, D570–D578 (2020).

  51. Chen, I.-M. A. et al. The IMG/M data management and analysis system v. 6.0: new tools and advanced capabilities. Nucleic Acids Res. 49, D751–D763 (2021).

  52. Mirdita, M. et al. Uniclust databases of clustered and deeply annotated protein sequences and alignments. Nucleic Acids Res. 45, D170–D176 (2017).

  53. Suzek, B. E. et al. UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics 31, 926–932 (2015).

  54. Zhang, C., Zheng, W., Mortuza, S., Li, Y. & Zhang, Y. DeepMSA: constructing deep multiple sequence alignment to improve contact prediction and fold-recognition for distant-homology proteins. Bioinformatics 36, 2105–2112 (2020).

  55. Yan, R., Xu, D., Yang, J., Walker, S. & Zhang, Y. A comparative assessment and analysis of 20 representative sequence alignment methods for protein structure prediction. Sci. Rep. 3, 1–9 (2013).

  56. Ekeberg, M., Hartonen, T. & Aurell, E. Fast pseudolikelihood maximization for direct-coupling analysis of protein structure from many homologous amino-acid sequences. J. Comput. Phys. 276, 341–356 (2014).

  57. Yang, J. et al. Improved protein structure prediction using predicted interresidue orientations. Proc. Natl Acad. Sci. USA 117, 1496–1503 (2020).

  58. Thrun, S. in Advances in Neural Information Processing Systems 640–646 (Morgan Kaufmann Publishers, 1996).

  59. Zheng, W., Zhang, C., Bell, E. W. & Zhang, Y. I-TASSER gateway: a protein structure and function prediction server powered by XSEDE. Future Gener. Comput. Syst. 99, 73–85 (2019).

  60. Zhang, Y. I-TASSER server for protein 3D structure prediction. BMC Bioinforma. 9, 40 (2008).

  61. Zheng, W. et al. Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations. Cell Rep. Methods 1, 100014 (2021).

  62. Li, Y., Zhang, C., Bell, E. W., Yu, D. J. & Zhang, Y. Ensembling multiple raw coevolutionary features with deep residual neural networks for contact-map prediction in CASP13. Proteins 87, 1082–1091 (2019).

  63. Li, Y. et al. Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks. PLoS Comput. Biol. 17, e1008865 (2021).

  64. He, B., Mortuza, S., Wang, Y., Shen, H.-B. & Zhang, Y. NeBcon: protein contact map prediction using neural network training coupled with naïve Bayes classifiers. Bioinformatics 33, 2296–2306 (2017).

  65. Zhang, Y. & Skolnick, J. SPICKER: a clustering approach to identify near‐native protein folds. J. Comput. Chem. 25, 865–871 (2004).

  66. Huang, X., Pearce, R. & Zhang, Y. FASPR: an open-source tool for fast and accurate protein side-chain packing. Bioinformatics 36, 3758–3765 (2020).

  67. Zhang, J., Liang, Y. & Zhang, Y. Atomic-level protein structure refinement using fragment-guided molecular dynamics conformation sampling. Structure 19, 1784–1795 (2011).

  68. Zhang, Y. & Skolnick, J. Scoring function for automated assessment of protein structure template quality. Proteins 57, 702–710 (2004).

  69. Ramachandran, G. N. & Sasisekharan, V. in Advances in Protein Chemistry Vol. 23, 283–437 (Elsevier, 1968).

  70. Roy, A., Yang, J. & Zhang, Y. COFACTOR: an accurate comparative algorithm for structure-based protein function annotation. Nucleic Acids Res. 40, W471–W477 (2012).

  71. Yang, J., Roy, A. & Zhang, Y. BioLiP: a semi-manually curated database for biologically relevant ligand–protein interactions. Nucleic Acids Res. 41, D1096–D1103 (2012).

  72. Zhou, X. G., Peng, C. X., Liu, J., Zhang, Y. & Zhang, G. J. Underestimation-assisted global-local cooperative differential evolution and the application to protein structure prediction. IEEE Trans. Evol. Comput. 24, 536–550 (2020).

  73. Zhou, X. G. & Zhang, G. J. Abstract convex underestimation assisted multistage differential evolution. IEEE Trans. Cybern. 47, 2730–2741 (2017).

  74. Zhou, X. G. & Zhang, G. J. Differential evolution with underestimation-based multimutation strategy. IEEE Trans. Cybern. 49, 1353–1364 (2018).

  75. Yang, J., Wang, Y. & Zhang, Y. ResQ: an approach to unified estimation of B-factor and residue-specific error in protein structure prediction. J. Mol. Biol. 428, 693–701 (2016).

  76. Glaeser, R. M. How good can cryo-EM become? Nat. Methods 13, 28–32 (2016).

  77. Zhou, X. G. et al. Progressive assembly of multi-domain protein structures from cryo-EM density maps. Nat. Comput. Sci. 2, 265–275 (2022).

  78. Mistry, J. et al. Pfam: the protein families database in 2021. Nucleic Acids Res. 49, D412–D419 (2021).

  79. Lu, S. et al. CDD/SPARCLE: the conserved domain database in 2020. Nucleic Acids Res. 48, D265–D268 (2020).

  80. Eickholt, J., Deng, X. & Cheng, J. DoBo: protein domain boundary prediction by integrating evolutionary signals and machine learning. BMC Bioinforma. 12, 1–8 (2011).

  81. Tai, C. H., Lee, W. J., Vincent, J. J. & Lee, B. Evaluation of domain prediction in CASP6. Proteins 61, 183–192 (2005).

  82. Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876 (2021).

  83. Pearce, R. & Zhang, Y. Deep learning techniques have significantly impacted protein structure prediction and protein design. Curr. Opin. Struct. Biol. 68, 194–207 (2021).

  84. Born, A., Henen, M. A. & Vögeli, B. Activity and affinity of Pin1 variants. Molecules 25, 36 (2020).

  85. Born, A. et al. Reconstruction of coupled intra-and interdomain protein motion from nuclear and electron magnetic resonance. J. Am. Chem. Soc. 143, 16055–16067 (2021).

  86. Chandonia, J.-M., Fox, N. K. & Brenner, S. E. SCOPe: manual curation and artifact removal in the structural classification of proteins—extended database. J. Mol. Biol. 429, 348–355 (2017).

  87. Lam, S. D. et al. Gene3D: expanding the utility of domain assignments. Nucleic Acids Res. 44, D404–D409 (2015).

  88. Yu, L. et al. Grammar of protein domain architectures. Proc. Natl Acad. Sci. USA 116, 3636–3645 (2019).

  89. Chothia, C. & Lesk, A. M. The relation between the divergence of sequence and structure in proteins. EMBO J. 5, 823–826 (1986).

  90. Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of Coot. Acta Crystallogr. D. Biol. Crystallogr. 66, 486–501 (2010).

  91. DiMaio, F. et al. Atomic-accuracy models from 4.5-Å cryo-electron microscopy data with density-guided iterative local refinement. Nat. Methods 12, 361–365 (2015).

  92. Zhang, C. et al. Functions of essential genes and a scale-free protein interaction network revealed by structure-based function and interaction prediction for a minimal genome. J. Proteome Res. 20, 1178–1189 (2021).

  93. Zhang, C., Wei, X., Omenn, G. S. & Zhang, Y. Structure and protein interaction-based gene ontology annotations reveal likely functions of uncharacterized proteins on human chromosome 17. J. Proteome Res. 17, 4186–4196 (2018).

  94. Zhang, C., Lane, L., Omenn, G. S. & Zhang, Y. Blinded testing of function annotation for uPE1 proteins by I-TASSER/COFACTOR pipeline using the 2018–2019 additions to neXtProt and the CAFA3 challenge. J. Proteome Res. 18, 4154–4166 (2019).

  95. Iyer, S., Subramanian, V. & Acharya, K. R. C9orf72, a protein associated with amyotrophic lateral sclerosis (ALS) is a guanine nucleotide exchange factor. PeerJ 6, e5815 (2018).

  96. Skotnicová, P. et al. The cyanobacterial protoporphyrinogen oxidase HemJ is a new b-type heme protein functionally coupled with coproporphyrinogen III oxidase. J. Biol. Chem. 293, 12394–12404 (2018).

  97. Hanson, R. M., Prilusky, J., Renjian, Z., Nakane, T. & Sussman, J. L. JSmol and the next‐generation web‐based representation of 3D molecular structure as applied to proteopedia. Isr. J. Chem. 53, 207–216 (2013).

  98. Hiranuma, N. et al. Improved protein structure refinement guided by deep learning based accuracy estimation. Nat. Commun. 12, 1–11 (2021).

  99. Guo, S.-S., Liu, J., Zhou, X. & Zhang, G. DeepUMQA: ultrafast shape recognition-based protein model quality assessment using deep learning. Bioinformatics 38, 1895–1903 (2022).

  100. Xu, J. & Zhang, Y. How significant is a protein structure similarity with TM-score = 0.5? Bioinformatics 26, 889–895 (2010).

  101. Ellson, J., Gansner, E. R., Koutsofios, E., North, S. C. & Woodhull, G. in Graph Drawing Software 127–148 (Springer, 2004).

  102. Towns, J. et al. XSEDE: accelerating scientific discovery. Comput. Sci. Eng. 16, 62–74 (2014).

  103. Xu, J. Distance-based protein folding powered by deep learning. Proc. Natl Acad. Sci. USA 116, 16856–16865 (2019).

  104. Källberg, M. et al. Template-based protein structure modeling using the RaptorX web server. Nat. Protoc. 7, 1511–1522 (2012).

  105. Du, Z. et al. The trRosetta server for fast and accurate protein structure prediction. Nat. Protoc. 16, 5634–5651 (2021).

  106. Lobley, A., Sadowski, M. I. & Jones, D. T. pGenTHREADER and pDomTHREADER: new methods for improved protein fold recognition and superfamily discrimination. Bioinformatics 25, 1761–1767 (2009).

  107. Zimmermann, L. et al. A completely reimplemented MPI bioinformatics toolkit with a new HHpred server at its core. J. Mol. Biol. 430, 2237–2243 (2018).

  108. Kelley, L. A., Mezulis, S., Yates, C. M., Wass, M. N. & Sternberg, M. J. The Phyre2 web portal for protein modeling, prediction and analysis. Nat. Protoc. 10, 845–858 (2015).

Acknowledgements

This work is supported in part by the National Institute of General Medical Sciences (GM136422 and S10OD026825 to Y.Z.), the National Institute of Allergy and Infectious Diseases (AI134678 to Y.Z.), the National Science Foundation (IIS1901191 and DBI2030790 to Y.Z.), the National Natural Science Foundation of China (62173304 and 61773346 to G.Z.), the ‘New Generation Artificial Intelligence’ major project of Science and Technology Innovation 2030 of the Ministry of Science and Technology of China (2021ZD0150100 to G.Z.) and the Key Project of Zhejiang Provincial Natural Science Foundation of China (LZ20F030002 to G.Z.). This work used the Extreme Science and Engineering Discovery Environment (XSEDE; ref. 102), which is supported by the National Science Foundation (ACI1548562).

Author information

Contributions

Y.Z. conceived and designed the project. X.Z. developed the pipeline and performed the tests. W.Z. developed the method for domain boundary prediction. Y.L. developed the method for contact and distance prediction. C.Z. developed the method for protein function prediction. Y.Z., W.Z., Y.L., C.Z. and R.P. developed the method for individual domain modeling. X.Z. developed the method for multi-domain protein structure assembly. X.Z. and E.B. tested the server. G.Z. helped supervise the research. X.Z. and Y.Z. wrote the manuscript, and all authors read and approved the final manuscript.

Corresponding author

Correspondence to Yang Zhang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Protocols thanks Ruben Sánchez-García, Beat R. Vogeli and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

Key references using this protocol

Zhou, X. et al. Proc. Natl Acad. Sci. USA 116, 15930–15938 (2019): https://doi.org/10.1073/pnas.1905068116

Zhang, C. et al. Nucleic Acids Res. 45, W291–W299 (2017): https://doi.org/10.1093/nar/gkx366

Zheng, W. et al. Cell Rep. Methods 1, 100014 (2021): https://doi.org/10.1016/j.crmeth.2021.100014

Hermes, C. et al. Nat. Commun. 12, 144 (2021): https://doi.org/10.1038/s41467-020-20418-3

Supplementary information

Supplementary Figs. 1–19, Tables 1 and 2, Notes 1–6, Equations 1–25 and References.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhou, X., Zheng, W., Li, Y. et al. I-TASSER-MTD: a deep-learning-based platform for multi-domain protein structure and function prediction. Nat Protoc (2022). https://doi.org/10.1038/s41596-022-00728-0

  • DOI: https://doi.org/10.1038/s41596-022-00728-0
