Article

PDB-wide identification of biological assemblies from conserved quaternary structure geometry

  • Nature Methods volume 15, pages 67–72 (2018)
  • doi:10.1038/nmeth.4510

Abstract

Protein structures are key to understanding biomolecular mechanisms and diseases, yet their interpretation is hampered by limited knowledge of their biologically relevant quaternary structure (QS). A critical challenge in inferring QS information from crystallographic data is distinguishing biological interfaces from fortuitous crystal-packing contacts. Here, we tackled this problem by developing strategies for aligning and comparing QS states across both homologs and data repositories. QS conservation across homologs proved remarkably strong at predicting biological relevance and is implemented in two methods, QSalign and anti-QSalign, for annotating homo-oligomers and monomers, respectively. QS conservation across repositories is implemented in QSbio (http://www.QSbio.org), which approaches the accuracy of manual curation and allowed us to predict >100,000 QS states across the Protein Data Bank. Based on this high-quality data set, we analyzed pairs of structurally conserved interfaces, and this analysis revealed a striking plasticity whereby evolutionarily distant interfaces maintain similar interaction geometries through widely divergent chemical properties.

Acknowledgements

We thank H. Greenblatt for valued help with operating the computer cluster, and we thank O. Dym and S. Rogotner for providing the photo of a protein crystal used in Figure 1. We thank J. Sussman for feedback on the work and D. Fass for comments on the manuscript. This work was supported by a VATAT fellowship to S.D., by the Israel Science Foundation and the I-CORE Program of the Planning and Budgeting Committee (grant nos. 1775/12 and 2179/14), by the Marie Curie CIG Program to E.D.L. (project no. 711715), by the HFSP Career Development Award to E.D.L. (award no. CDA00077/2015), and by a research grant from A.-M. Boucher. E.D.L. is the incumbent of the Recanati Career Development Chair of Cancer Research.

Author information

Affiliations

  1. Weizmann Institute of Science, Department of Structural Biology, Rehovot, Israel.

    • Sucharita Dey
    • Emmanuel D Levy

  2. Inria Nancy, Villers-les-Nancy, France.

    • David W Ritchie

Contributions

S.D. and E.D.L. designed and performed the experiments. D.W.R. adapted the Kpax algorithm to enable the calculations. S.D. and E.D.L. wrote the manuscript with input from D.W.R. All authors corrected and approved the final manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Emmanuel D Levy.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–7, Supplementary Tables 1–2 and Supplementary Note 1

  2. Life Sciences Reporting Summary

Excel files

  1. Supplementary Data 1

    Prediction details of PISA, EPPIC, QSalign/anti-QSalign and QSbio on the different datasets.

  2. Supplementary Data 2

    QSbio results; for the most up-to-date information see www.QSbio.org.