Abstract

Visualizations of biomolecular structures empower us to gain insights into biological functions, generate testable hypotheses, and communicate biological concepts. Typical visualizations (such as ball and stick) primarily depict covalent bonds. In contrast, non-covalent contacts between atoms, which govern normal physiology, pathogenesis, and drug action, are seldom visualized. We present the Protein Contacts Atlas, an interactive resource of non-covalent contacts from over 100,000 PDB crystal structures. We developed multiple representations for visualization and analysis of non-covalent contacts at different scales of organization: atoms, residues, secondary structure, subunits, and entire complexes. The Protein Contacts Atlas enables researchers from different disciplines to investigate diverse questions in the framework of non-covalent contacts, including the interpretation of allostery, disease mutations and polymorphisms, by exploring individual subunits, interfaces, and protein–ligand contacts and by mapping external information. The Protein Contacts Atlas is available at http://www.mrc-lmb.cam.ac.uk/pca/ and also through PDBe.
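As a rough illustration of what moving between scales of organization can mean, the sketch below collapses atom-level contacts into residue-level and chain-level counts. It is a minimal sketch, not the Atlas pipeline: the input list, the atom identifiers, and the function names `residue_contacts` and `chain_contacts` are hypothetical, and the criterion used to decide that two atoms are in contact is not stated in this section, so the contacts are taken as given.

```python
from collections import Counter

# Hypothetical atomic contacts: each entry pairs two atoms, each identified
# by (chain, residue number, atom name). The values are illustrative only.
atomic_contacts = [
    (("A", 10, "CB"), ("A", 54, "CG")),
    (("A", 10, "CG"), ("A", 54, "CD1")),
    (("A", 10, "O"), ("B", 7, "N")),
    (("A", 10, "CA"), ("A", 10, "CB")),  # same residue: skipped below
]


def residue_contacts(contacts):
    """Count atomic contacts for each unordered pair of distinct residues."""
    counts = Counter()
    for atom1, atom2 in contacts:
        res1, res2 = atom1[:2], atom2[:2]  # (chain, residue number)
        if res1 == res2:
            continue  # atoms within one residue are not a residue contact
        counts[tuple(sorted((res1, res2)))] += 1
    return counts


def chain_contacts(res_counts):
    """Sum atomic contacts over residue pairs that span two chains."""
    counts = Counter()
    for (res1, res2), n in res_counts.items():
        if res1[0] != res2[0]:
            counts[tuple(sorted((res1[0], res2[0])))] += n
    return counts


res_counts = residue_contacts(atomic_contacts)
iface_counts = chain_contacts(res_counts)
```

Keeping the atomic count as the weight of each residue pair means the same table can feed a contact matrix or a network drawing in which edge thickness reflects the number of atomic contacts.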


Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Acknowledgements

We thank A. Lesk, C. Chothia, S. Balaji, H. Harbrecht, I. Huppertz, M. Ouédraogo, G. Chalancon, A. Morgunov, A. Murzin, A. Andreeva, G. de Baets, R. Peer, S. Chavali, A. Sente, N. S. Latysheva, A. Gunnarsson, and A. Hauser for their comments on this work. We thank T. Nakane for his input on structure visualization using WebGLMol. This work was supported by the Medical Research Council (MC_U105185859; M.M.B., M.K., C.N.J.R., T.F., A.J.V., and J.S.-B.), the LMB-Cambridge scholarship (A.J.V.), the St. John's College Benefactor scholarship (A.J.V.), the AFR scholarship from the Luxembourg National Research Fund (C.N.J.R.), and the Boehringer Ingelheim Fonds (T.F.). T.F. is a Research Fellow of Fitzwilliam College, University of Cambridge, UK. M.M.B. is a Lister Institute Research Prize Fellow.

Author information

Author notes

    • Melis Kayikci

    Present address: Genomics England, London, UK

    • A. J. Venkatakrishnan

    Present address: Department of Molecular and Cellular Physiology, Department of Computer Science, and Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA, USA

    • James Scott-Brown

    Present address: University of Oxford, Oxford, UK

    • Tilman Flock

    Present address: Paul Scherrer Institute, Villigen, Switzerland

  1. Melis Kayikci and A. J. Venkatakrishnan contributed equally to this work.

Affiliations

  1. MRC Laboratory of Molecular Biology, Cambridge, UK

    • Melis Kayikci
    • A. J. Venkatakrishnan
    • James Scott-Brown
    • Charles N. J. Ravarani
    • Tilman Flock
    • M. Madan Babu
  2. Fitzwilliam College, University of Cambridge, Cambridge, UK

    • Tilman Flock

Authors

  1. Melis Kayikci

  2. A. J. Venkatakrishnan

  3. James Scott-Brown

  4. Charles N. J. Ravarani

  5. Tilman Flock

  6. M. Madan Babu

Contributions

M.K. collected the data, developed the computational pipeline, and built the web server. A.J.V. designed the prototype of the representations with M.M.B. A.J.V., M.K., J.S.-B., and M.M.B. optimized the representations, and M.K. and J.S.-B. implemented them. M.K. and A.J.V. performed the GPCR analyses. J.S.-B. made the prototype of the web server. C.N.J.R. and T.F. helped with the web server and with analyzing examples. M.K., C.N.J.R., and A.J.V. independently wrote separate drafts of the manuscript. M.K., A.J.V., and M.M.B. wrote the final manuscript with critical input from C.N.J.R., J.S.-B., and T.F. M.K., A.J.V., and C.N.J.R. prepared the figures. A.J.V. and M.M.B. conceived and planned the project. M.K., A.J.V., and M.M.B. executed the project. M.M.B. supervised the project.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Melis Kayikci or A. J. Venkatakrishnan or M. Madan Babu.

Integrated supplementary information

  1. Supplementary Figure 1 Plot of all PDB structures analyzed

    a, The number of residues (x-axis) vs the number of non-covalent residue contacts (y-axis) in each PDB. b, The number of atoms (x-axis) vs the number of non-covalent atomic contacts (y-axis) in each PDB. Each dot represents a PDB structure. On average, each of these structures contains 5,677 atoms, 45,042 atomic contacts, 684 residues, 3,043 residue contacts, 3 chains, 2,441 atomic contacts between chains, and 125 residue contacts between chains.

  2. Supplementary Figure 2 Visualization and analysis page

    The biomolecular complex network panel, sequence panel, 3D structure panel, and contacts panel are shown for the β2 adrenergic receptor-G protein complex structure (PDB ID: 3SN6). The sequence panel shows the secondary structure elements (SSEs) and the amino acid sequence. In the sequence panel, some positions are highlighted with a line on top, bold typeface, or red color. A line on top of a residue denotes availability of disease mutation data via UniProt, which can be accessed by moving the mouse over the residue. Bold letters denote residues that were previously hovered over. Red letters denote currently selected residues. Helices are represented as rectangles and sheets as arrows. The top line shows all SSEs of the chain, and the line below shows a zoomed-in version with the loops as a thin black line. The sizes of the rectangles and arrows are proportional to the lengths of the SSEs. The contacts panel shows the secondary structure view, in which the chord plot can be seen in the top left corner with the selected secondary structures (Helix 38 in blue and Helix 40 in orange). In the middle, the residue contact matrix is shown with all the contacting residues of the selected SSEs. The residues belonging to each SSE are highlighted in their respective colors. The number of atomic contacts is shown within the matrix. On the right is the 3D structure panel, which shows the 3D view of the receptor (grey cartoon view).

  3. Supplementary Figure 3 Mapping disease mutations in rhodopsin

    a, Selected residues of rhodopsin (1GZM; shown in the text field) are highlighted on the scatter plot in red. b, 3D structure view of rhodopsin (1GZM, chain A) is shown in network view (grey) with contacts highlighted in blue. The thickness of the residue contact denotes the number of atomic contacts made by the interacting residues.

  4. Supplementary Figure 4 Mapping external information for residue-level analysis of structures

    a, Scatter plot of residues in the G protein after mapping their stability data (experimentally inferred binding affinity) using a color spectrum (min: cyan; max: magenta). Stability was measured as ΔTm = Tm of Gαi1(Ala) − Tm of Gαi1(WT) with 1 mM GDP. Residue F336G.H5.8 is shown in red and the rest of the residues in cyan and magenta according to their ΔTm values. b, Network view and asteroid plot of F336G.H5.8 show the extensive nature of the contacts this residue mediates in the GDP-bound structure of Gαi1 (1GDD).

  5. Supplementary Figure 5 Average of the maximum number of atomic contacts made by a residue across all non-redundant crystal structures

    These values are used to normalize the contacts with respect to the size of the amino acid (Methods).
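A minimal sketch of how such a size normalization might be applied is given below. It assumes the normalization is a simple division of a residue's atomic contact count by the reference value for its amino acid type; the exact formula is described in the Methods and is not reproduced in this section. The reference numbers and the names `reference_max_contacts` and `normalized_contacts` are hypothetical placeholders, not values from Supplementary Figure 5.

```python
# Hypothetical per-amino-acid reference values (average maximum number of
# atomic contacts). Real values would be read from Supplementary Figure 5.
reference_max_contacts = {"GLY": 40.0, "PHE": 100.0}


def normalized_contacts(residue_name, n_atomic_contacts,
                        reference=reference_max_contacts):
    """Scale a residue's atomic contact count by its amino acid's reference.

    Assumes plain division; raises KeyError for an amino acid type that has
    no reference value rather than guessing one.
    """
    return n_atomic_contacts / reference[residue_name]


# The same raw count weighs more for a small residue than for a large one.
gly_score = normalized_contacts("GLY", 30)
phe_score = normalized_contacts("PHE", 30)
```

Under this assumption, a glycine and a phenylalanine with the same raw contact count receive different scores, which is the point of correcting for amino acid size.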

Supplementary information

  1. Supplementary Text and Figures

    Supplementary Figures 1–5 and Supplementary Notes 1 and 2

  2. Life Sciences Reporting Summary

  3. Supplementary Data Set 1

    Sample downloadable statistics for PDB 3SN6.

  4. Supplementary Data Set 2

    Sample downloadable list of contacts for PDB 3SN6.

  5. Supplementary Data Set 3

    Sample downloadable structure report for PDB 5C1M.

  6. Supplementary Data Set 4

    Raw data used for analysis of Rhodopsin conformational cycle.

  7. Supplementary Data Set 5

    Contact statistics for all structures in the PDB.

  8. Supplementary Data Set 6

    Source Data, Supplementary Fig. 5a,d.

About this article


DOI

https://doi.org/10.1038/s41594-017-0019-z
