

Mapping the energetic and allosteric landscapes of protein binding domains

Abstract

Allosteric communication between distant sites in proteins is central to biological regulation but still poorly characterized, limiting understanding, engineering and drug development (refs. 1–6). An important reason for this is the lack of methods to comprehensively quantify allostery in diverse proteins. Here we address this shortcoming and present a method that uses deep mutational scanning to globally map allostery. The approach uses an efficient experimental design to infer en masse the causal biophysical effects of mutations by quantifying multiple molecular phenotypes (here, binding and protein abundance) in multiple genetic backgrounds and fitting thermodynamic models using neural networks. We apply the approach to two of the most common protein interaction domains found in humans, an SH3 domain and a PDZ domain, to produce comprehensive atlases of allosteric communication. Allosteric mutations are abundant, with a large mutational target space of network-altering ‘edgetic’ variants. Mutations are more likely to be allosteric closer to binding interfaces, at glycine residues and at specific residues connecting to an opposite surface within the PDZ domain. This general approach of quantifying mutational effects for multiple molecular phenotypes and in multiple genetic backgrounds should enable the energetic and allosteric landscapes of many proteins to be rapidly and comprehensively mapped.
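As an illustration of the kind of thermodynamic model the abstract refers to, the following minimal sketch assumes a two-state folding equilibrium (unfolded, folded) and a three-state binding equilibrium (unfolded, folded, bound), with the free energy changes of mutations combining additively. The function names, the temperature of 30 °C and the example energies are assumptions made for illustration only. The sketch does not model the mapping from these fractions to measured fitness and may differ from the authors' exact formulation.

```python
import math

R_KCAL = 0.001987   # gas constant in kcal/(mol*K)
TEMP_K = 303.15     # assumed temperature (30 degrees C); illustrative only
RT = R_KCAL * TEMP_K


def fraction_folded(dg_fold):
    """Two-state model: fraction of molecules in the folded state."""
    return 1.0 / (1.0 + math.exp(dg_fold / RT))


def fraction_bound(dg_fold, dg_bind):
    """Three-state model: fraction of molecules that are folded and bound."""
    return 1.0 / (1.0 + math.exp(dg_bind / RT) * (1.0 + math.exp(dg_fold / RT)))


def combine(wild_type_dg, ddgs):
    """Assume free energy changes of individual mutations add up."""
    return wild_type_dg + sum(ddgs)


# Hypothetical wild-type energies and mutation effects, in kcal/mol.
WT_FOLD, WT_BIND = -2.0, -1.5
examples = {
    "wild type": (0.0, 0.0),
    "destabilising (folding only)": (1.5, 0.0),
    "binding only": (0.0, 1.0),
}

for name, (ddg_fold, ddg_bind) in examples.items():
    dg_fold = combine(WT_FOLD, [ddg_fold])
    dg_bind = combine(WT_BIND, [ddg_bind])
    print(f"{name}: folded={fraction_folded(dg_fold):.3f}, "
          f"bound={fraction_bound(dg_fold, dg_bind):.3f}")
```

In this sketch a folding-only mutation lowers both the folded and the bound fraction, whereas a binding-only mutation lowers the bound fraction alone. Measuring both phenotypes is therefore what allows the two kinds of effect to be separated.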


Fig. 1: ddPCA quantifies the effects of mutations on protein abundance and binding.
Fig. 2: From molecular phenotypes to free energy changes.
Fig. 3: Binding and folding free energy landscapes of the SH3 and PDZ domains.
Fig. 4: Mutational effects on protein stability.
Fig. 5: Major allosteric sites in protein binding domains.
Fig. 6: Protein surfaces are frequent sites of binding affinity modulation.


Data availability

All DNA sequencing data have been deposited in the Gene Expression Omnibus with accession number GSE184042. Protein structures were obtained from the Protein Data Bank with the following accessions: GRB2-SH3, 2VWF; PSD95-PDZ3, 1BE9; GB1, 1FCC; GRB2 homodimer, 1GRI. The AlphaFold prediction for PSD95 was obtained from the AlphaFold Protein Structure Database with accession P78352.

Code availability

Source code used to fit thermodynamic models, perform all downstream analyses and reproduce all figures in this work is available at https://github.com/lehner-lab/doubledeepms.

References

  1. Guarnera, E. & Berezovsky, I. N. Allosteric drugs and mutations: chances, challenges, and necessity. Curr. Opin. Struct. Biol. 62, 149–157 (2020).


  2. Arkin, M. R., Tang, Y. & Wells, J. A. Small-molecule inhibitors of protein-protein interactions: progressing toward the reality. Chem. Biol. 21, 1102–1114 (2014).


  3. Motlagh, H. N., Wrabl, J. O., Li, J. & Hilser, V. J. The ensemble nature of allostery. Nature 508, 331–339 (2014).


  4. Xie, J. & Lai, L. Protein topology and allostery. Curr. Opin. Struct. Biol. 62, 158–165 (2020).


  5. Kuriyan, J. & Eisenberg, D. The origin of protein interactions and allostery in colocalization. Nature 450, 983–990 (2007).


  6. Nussinov, R. & Tsai, C.-J. Allostery in disease and in drug discovery. Cell 153, 293–305 (2013).


  7. Monod, J., Changeux, J. P. & Jacob, F. Allosteric proteins and cellular control systems. J. Mol. Biol. 6, 306–329 (1963).


  8. Ullmann, A. In memoriam: Jacques Monod (1910–1976). Genome Biol. Evol. 3, 1025–1033 (2011).


  9. Halabi, N., Rivoire, O., Leibler, S. & Ranganathan, R. Protein sectors: evolutionary units of three-dimensional structure. Cell 138, 774–786 (2009).


  10. Dionne, U. et al. Protein context shapes the specificity of SH3 domain-mediated interactions in vivo. Nat. Commun. 12, 1597 (2021).


  11. McCormick, J. W., Russo, M. A., Thompson, S., Blevins, A. & Reynolds, K. A. Structurally distributed surface sites tune allosteric regulation. eLife 10, e68346 (2021).


  12. Bandaru, P. et al. Deconstruction of the Ras switching cycle through saturation mutagenesis. eLife 6, e27810 (2017).


  13. Reynolds, K. A., McLaughlin, R. N. & Ranganathan, R. Hot spots for allosteric regulation on protein surfaces. Cell 147, 1564–1575 (2011).


  14. Oakes, B. L. et al. Profiling of engineering hotspots identifies an allosteric CRISPR–Cas9 switch. Nat. Biotechnol. 34, 646–651 (2016).


  15. Leander, M., Yuan, Y., Meger, A., Cui, Q. & Raman, S. Functional plasticity and evolutionary adaptation of allosteric regulation. Proc. Natl Acad. Sci. USA 117, 25445–25454 (2020).


  16. Tack, D. S. et al. The genotype-phenotype landscape of an allosteric protein. Mol. Syst. Biol. 17, e10179 (2021).


  17. Coyote-Maestas, W., He, Y., Myers, C. L. & Schmidt, D. Domain insertion permissibility-guided engineering of allostery in ion channels. Nat. Commun. 10, 290 (2019).


  18. Li, X. & Lehner, B. Biophysical ambiguities prevent accurate genetic prediction. Nat. Commun. 11, 4923 (2020).


  19. Otwinowski, J. Biophysical inference of epistasis and the effects of mutations on protein stability and function. Mol. Biol. Evol. 35, 2345–2354 (2018).


  20. Fowler, D. M. & Fields, S. Deep mutational scanning: a new style of protein science. Nat. Methods 11, 801–807 (2014).


  21. Woodsmith, J. et al. Protein interaction perturbation profiling at amino-acid resolution. Nat. Methods 14, 1213–1221 (2017).


  22. Cagiada, M. et al. Understanding the origins of loss of protein function by analyzing the effects of thousands of variants on activity and abundance. Mol. Biol. Evol. 38, 3235–3246 (2021).


  23. Domingo, J., Baeza-Centurion, P. & Lehner, B. The causes and consequences of genetic interactions (epistasis). Annu. Rev. Genomics Hum. Genet. 20, 433–460 (2019).


  24. Diss, G. & Lehner, B. The genetic landscape of a physical interaction. eLife 7, e32472 (2018).


  25. Levy, E. D., Kowarzyk, J. & Michnick, S. W. High-resolution mapping of protein concentration reveals principles of proteome architecture and adaptation. Cell Rep. 7, 1333–1340 (2014).


  26. Pelletier, J. N., Arndt, K. M., Plückthun, A. & Michnick, S. W. An in vivo library-versus-library selection of optimized protein-protein interactions. Nat. Biotechnol. 17, 683–690 (1999).


  27. Campbell-Valois, F.-X., Tarassov, K. & Michnick, S. W. Massive sequence perturbation of a small protein. Proc. Natl Acad. Sci. USA 102, 14988–14993 (2005).


  28. Tokuriki, N. & Tawfik, D. S. Stability effects of mutations and protein evolvability. Curr. Opin. Struct. Biol. 19, 596–604 (2009).


  29. Wei, X. et al. A massively parallel pipeline to clone DNA variants and examine molecular phenotypes of human disease mutations. PLoS Genet. 10, e1004819 (2014).


  30. Horovitz, A., Fleisher, R. C. & Mondal, T. Double-mutant cycles: new directions and applications. Curr. Opin. Struct. Biol. 58, 10–17 (2019).


  31. Calosci, N. et al. Comparison of successive transition states for folding reveals alternative early folding pathways of two homologous proteins. Proc. Natl Acad. Sci. USA 105, 19241–19246 (2008).


  32. Kellogg, E. H., Leaver-Fay, A. & Baker, D. Role of conformational sampling in computing mutation-induced changes in protein structure and stability. Proteins 79, 830–838 (2011).


  33. Nisthal, A., Wang, C. Y., Ary, M. L. & Mayo, S. L. Protein stability engineering insights revealed by domain-wide comprehensive mutagenesis. Proc. Natl Acad. Sci. USA 116, 16367–16377 (2019).


  34. Laursen, L., Kliche, J., Gianni, S. & Jemth, P. Supertertiary protein structure affects an allosteric network. Proc. Natl Acad. Sci. USA 117, 24294–24304 (2020).


  35. Olson, C. A., Wu, N. C. & Sun, R. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr. Biol. 24, 2643–2651 (2014).


  36. Shoichet, B. K., Baase, W. A., Kuroki, R. & Matthews, B. W. A relationship between protein stability and protein function. Proc. Natl Acad. Sci. USA 92, 452–456 (1995).


  37. Redler, R. L., Das, J., Diaz, J. R. & Dokholyan, N. V. Protein destabilization as a common factor in diverse inherited disorders. J. Mol. Evol. 82, 11–16 (2016).


  38. Mosca, R., Céol, A. & Aloy, P. Interactome3D: adding structural details to protein networks. Nat. Methods 10, 47–53 (2013).


  39. McLaughlin, R. N. Jr et al. The spatial architecture of protein function and adaptation. Nature 491, 138–142 (2012).


  40. Wang, J. et al. Mapping allosteric communications within individual proteins. Nat. Commun. 11, 3862 (2020).


  41. Zhong, Q. et al. Edgetic perturbation models of human inherited disorders. Mol. Syst. Biol. 5, 321 (2009).


  42. Sahni, N. et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell 161, 647–660 (2015).


  43. Kinney, J. B., Murugan, A., Callan, C. G. Jr & Cox, E. C. Using deep sequencing to characterize the biophysical mechanism of a transcriptional regulatory sequence. Proc. Natl Acad. Sci. USA 107, 9158–9163 (2010).


  44. Forcier, T. L. et al. Measuring cis-regulatory energetics in living cells using allelic manifolds. eLife 7, e40618 (2018).


  45. Tareen, A. et al. MAVE-NN: learning genotype–phenotype maps from multiplex assays of variant effect. Preprint at bioRxiv https://doi.org/10.1101/2020.07.14.201475 (2020).

  46. Adams, R. M., Mora, T., Walczak, A. M. & Kinney, J. B. Measuring the sequence-affinity landscape of antibodies with massively parallel titration curves. eLife 5, e23156 (2016).


  47. Kinney, J. B. & McCandlish, D. M. Massively parallel assays and quantitative sequence–function relationships. Annu. Rev. Genomics Hum. Genet. 20, 99–127 (2019).


  48. Skoulidis, F. et al. Sotorasib for lung cancers with KRAS p.G12C mutation. N. Engl. J. Med. 384, 2371–2381 (2021).


  49. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).



Acknowledgements

This work was funded by European Research Council (ERC) Advanced (883742) and Consolidator (616434) grants, the Spanish Ministry of Science and Innovation (PID2020-118723GB-I00, BFU2017-89488-P, EMBL Partnership, Severo Ochoa Centre of Excellence), the Bettencourt Schueller Foundation, the AXA Research Fund, Agencia de Gestio d’Ajuts Universitaris i de Recerca (AGAUR, 2017 SGR 1322), and the CERCA Program/Generalitat de Catalunya. J.M.S. was supported by an EMBO Long-Term Fellowship (ALTF 857-2016) and a Marie Skłodowska-Curie Fellowship (752809, EU Commission Horizon 2020). We thank M. Dias, J. Frazer and D. Marks for providing EVE and EVmutation predictions, J. Taipale for motivation and discussion, and all members of the Lehner laboratory for helpful discussions and suggestions, especially P. Baeza-Centurion, X. Li and A. M. New.

Author information


Contributions

J.D., J.M.S., G.D. and B.L. conceived the project and designed the experiments. J.D., J.M.S. and C.H.-C. constructed the mutant libraries. J.D. performed the yeast competition experiments with help from C.H.-C. J.D. constructed the sequencing libraries for next-generation sequencing. A.J.F. led the data analysis with help from J.D. and J.M.S. A.J.F., J.M.S. and B.L. formulated the thermodynamic model. A.J.F. wrote the code to implement and fit the model. B.L., A.J.F. and J.D. wrote the manuscript with input from all authors.

Corresponding author

Correspondence to Ben Lehner.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks the anonymous reviewers for their contribution to the peer review of this work. Peer review reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Performance of thermodynamic models.

a, Distribution of the number of double aa substitutions comprising the same single aa substitution in the AbundancePCA (blue) or BindingPCA (red) assays for the GRB2-SH3 (left) and PSD95-PDZ3 (right) protein domains. Median indicated with a dashed line and text label. b–d, 2D density plots comparing the ddPCA observed fitness and the model-predicted fitness of single (left panels) and double aa substitutions (right panels) for the binding (top panels) and, where available, folding assays (bottom panels) of the GB1 (b), GRB2-SH3 (c) and PSD95-PDZ3 (d) domains. R² = proportion of variance explained. e–g, Same as (b–d) but using validation data comprising 10% of double mutants held out during model fitting.

Extended Data Fig. 2 Performance of thermodynamic models after restricting data to a single phenotype or a single genetic background.

a, 2D density plots comparing the observed and predicted fitness of the binding (top panels) and abundance (bottom panels) assays when only the BindingPCA data are used for training the model for GRB2-SH3 (left panels) and PSD95-PDZ3 (right panels). b, Same as in (a), but only using single mutant data from both binding and abundance assays to fit the models. R² = proportion of variance explained. c, d, Comparisons of inferred free energy changes to previously reported PSD95-PDZ3 mutant in vitro measurements where only BindingPCA data (c) or single mutants (d) were used to fit thermodynamic models. Free energies are from a single model; error bars indicate 95% CI from a Monte Carlo simulation approach (n = 10 experiments) and the regression error bands indicate 95% CI for predictions from a linear model (panel c top: n = 22, bottom: n = 25, panel d top: n = 32, bottom: n = 29). r = Pearson correlation coefficient.

Extended Data Fig. 3 Performance of thermodynamic models after downsampling and comparisons of inferred free energy changes to smaller-scale datasets of in vitro measurements.

a, Dashed lines indicate the relationship between the percentage of fitness variance explained by model predictions with respect to held out validation data (10% of doubles) and the percentage of randomly retained double aa mutants used to train the model in the abundance (blue) or binding (red) assay. Results are shown separately for all protein domains. Solid lines indicate the relationship between the percentage variance explained by inferred free energies with respect to previously reported in vitro measurements for GB1 (Nisthal et al. 2019, ref. 33) and PSD95-PDZ3 (Laursen et al. 2020, ref. 34, for ΔΔG binding, red; Calosci et al. 2008, ref. 31, for ΔΔG folding, blue), where models were trained using varying fractions of randomly downsampled double mutants (x-axis). The top scale indicates the median number of double aa mutants per single aa mutant in the full dataset. b, Comparisons of the model-inferred free energy changes to previously reported in vitro measurements for GRB2-SH3 (Malagrinò et al. 2019, ref. 56, for ΔΔG binding and Troilo et al. 2018, ref. 57, for ΔΔG folding) and PSD95-PDZ3 (Chi et al. 2008, ref. 58). Note the modest effect sizes of variants assayed in Malagrinò et al. 2019. Free energies are from a single model; error bars indicate 95% CI from a Monte Carlo simulation approach (n = 10 experiments, in vitro error measurement not provided) and the regression error bands indicate 95% CI for predictions from a linear model (top left: n = 11, bottom left: n = 15, top right: n = 11, bottom right: n = 12). r = Pearson correlation coefficient.

Extended Data Fig. 4 Correlation of folding free energy changes with computational predictions of mutational effects.

a, High confidence inferred folding free energy changes versus corresponding FoldX (ref. 59) predictions upon mutation (“PositionScan” command), excluding substitutions involving potentially large increases in mass/volume (at wild-type Glycine, Alanine, Valine) or the replacement of Histidine (whose charge depends on the pH and local chemical environment). b, High confidence inferred folding free energy changes versus corresponding PolyPhen2 (ref. 60) predictions for amino acid substitutions reachable by single nucleotide substitutions (SNPs). c, High confidence inferred folding free energy changes versus corresponding EVE pathogenicity scores (ref. 61). d, Same as in (c), but scores are based on evolutionary couplings (ref. 62). r = Pearson correlation coefficient.

Extended Data Fig. 5 Binding and folding free energy landscapes of the GB1 domain and biophysical mechanism of mutations that affect binding.

a, b, Heatmaps showing inferred changes in free energies of binding (a) and folding (b) for the GB1 domain. The final row in each heatmap indicates the minimal distance to the ligand (considering the side chain heavy atoms or the alpha carbon atoms in the case of glycine). Free energy changes of ligand-proximal residues (ligand distance < 5 Å) are boxed. Low confidence estimates are indicated with dots (95% CI ≥ 1 kcal/mol). Free energy changes more extreme than ±2.5 were set to this limit. c, Scatter plot comparing binding and folding free energy changes of mutations in the core, surface and binding interface. Contours indicate estimates of 2D densities with 6 contour bins. d, Distribution of binding (red) and folding (blue) free energy changes. e, Percentage of mutations that significantly decrease (top) or increase (bottom) fitness in the binding assay (FDR = 0.05) categorised by their biophysical mechanism. Pleiotropic mutations have significant changes in free energies of both folding and binding (FDR = 0.05) and are classified as either synergistic or antagonistic depending on whether their effects are in the same or different direction respectively. f, Changes in free energy of binding (blue) or folding (red) of single aa substitutions with different fitness effects in the binding assay for the three protein domains. g, Percentage of core, surface or ligand binding mutations that significantly decrease (top) or increase (bottom) fitness in the binding assay (FDR = 0.05) categorised by their biophysical mechanism. Pleiotropic mutations have significant changes in free energies of both folding and binding (FDR = 0.05) and are classified as either synergistic or antagonistic depending on whether their effects are in the same or different direction respectively.
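The classification of mutations by biophysical mechanism described in panels e and g can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and category labels are hypothetical, and the significance calls (FDR = 0.05 in the caption) are taken as precomputed inputs rather than derived here. The clipping helper mirrors the ±2.5 display limit used in the heatmaps.

```python
def classify_mechanism(ddg_fold, ddg_bind, fold_significant, bind_significant):
    """Label a mutation by which free energies it significantly changes.

    fold_significant / bind_significant are assumed to come from an
    upstream multiple-testing step that is not shown in this sketch.
    """
    if fold_significant and bind_significant:
        # Pleiotropic: both energies change; compare the signs of the effects.
        same_direction = (ddg_fold > 0) == (ddg_bind > 0)
        if same_direction:
            return "pleiotropic (synergistic)"
        return "pleiotropic (antagonistic)"
    if fold_significant:
        return "folding"
    if bind_significant:
        return "binding"
    return "none"


def clip_for_display(ddg, limit=2.5):
    """Set free energy changes more extreme than +/-limit to the limit."""
    return max(-limit, min(limit, ddg))


# Hypothetical mutations: (folding ddG, binding ddG, fold sig., bind sig.)
toy_mutations = {
    "mut_A": (1.2, 0.8, True, True),
    "mut_B": (1.0, -0.6, True, True),
    "mut_C": (1.4, 0.1, True, False),
    "mut_D": (0.1, 3.1, False, True),
}

for name, args in toy_mutations.items():
    print(name, classify_mechanism(*args), clip_for_display(args[1]))
```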

Extended Data Fig. 6 GB1 mutational effects on protein stability and characterisation of surface de-stabilizing residues.

a, 3D structure of GB1 (PDB entry 1FCC) where residue atoms are coloured by the position-wise average change in the free energy of folding. The FC domain of the human Immunoglobulin G is shown as black sticks. b, Violin plots indicating distributions of confident changes in free energy of folding (n = 898; ***P < 2.2e–16, two-sided Mann-Whitney U test comparing mutations in the core versus the remainder). c, Anti-correlation between the position-wise average change in free energy of folding and the solvent exposure of the corresponding residue (RSASA) in GB1. Error bars indicate 95% CI (n = 19). r = Pearson correlation coefficient. d, Percentage of core, surface or binding-interface residues in GB1 shown separately for de-stabilizing residues (positions with ≥ 5 stabilizing mutations, folding ∆∆G < 0, FDR = 0.05) and the remainder. Inset numbers are total counts. e, Violin plots indicating evolutionary conservation scores (from a multiple sequence alignment of 185, 8,852 and 276,481 homologous sequences of the GB1, GRB2-SH3 and PSD95-PDZ3 domains, respectively) shown separately for surface de-stabilizing residues and remaining surface or core residues. f, Violin plots indicating hydrophobicity score distributions shown separately for surface de-stabilizing residues and remaining surface or core residues. g, 3D structures of the GRB2-SH3 and PSD95-PDZ3 domains (grey cartoons) with the side-chains of surface de-stabilizing residues highlighted in green sticks. Ligands are shown as black sticks. In the insets, the SH2 domains of the second monomer of GRB2 when found in dimeric form (left, PDB entry 1GRI; ref. 63) and relevant proximal portions of PSD95 C-terminal to the PDZ3 domain (middle and right, PDB entry 1BE9 and AlphaFold Protein Structure Database entry P78352) are shown in yellow.

Extended Data Fig. 7 Major allosteric sites in the GB1 domain and changes in free energy of binding in ligand binding interfaces.

a, 3D structures of the protein G B1 domain where residue atoms are coloured by the position-wise average absolute change in the free energy of binding. The FC domain of the human Immunoglobulin G is shown as black sticks. b, GB1 domain structure with binding-interface residues (ligand distance < 5 Å) highlighted in red and major allosteric site residues highlighted in orange. c, Relationship between the position-wise average absolute change in free energy of binding and the distance to the ligand (minimal side chain heavy atom distance) in the GB1 domain. Major allosteric sites (yellow) are defined as non-binding-interface residues with weighted average absolute change in free energy of binding higher than the average of binding-interface residue mutations (red). d, ROC curves for predicting ligand-contacting residues (ligand distance < 5 Å) using (weighted) mean absolute binding ∆∆G considering all variants or those with confident inferred free energies (conf.). AUC = Area Under the Curve. e, Inferring changes in free energy of binding provides insights into the interactions that mediate binding between GRB2-SH3 and GAB2 peptide, and how mutations disrupt binding. F7 and Y51 of the GRB2-SH3 domain contact P3 and P4 of the GAB2 peptide through aromatic-proline interactions (left heatmap). In these two positions, only mutations to Y, F, Q and H, which can interact with proline through aromatic-proline or amino-aromatic interactions, are tolerated, while all other amino acid substitutions result in decreased binding affinity (positive binding ∆∆G). Residue M46 can tolerate all amino acid substitutions except to positively charged residues (right heatmap). The closest residue of GAB2 is a lysine, and so a repulsive electrostatic interaction likely occurs when a positively charged amino acid occupies position 46 of the SH3 domain (binding ∆∆G of 2.1 and 1.99 for M46K and M46R respectively). f, ROC curves for predicting ligand-contacting residues using (weighted) mean BindingPCA or AbundancePCA fitness.
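The definition of major allosteric sites in panel c can be illustrated with a short sketch. It assumes inverse-variance weighting for the weighted average (the caption does not state which weights were used) and takes the threshold as the weighted mean absolute binding ∆∆G over all binding-interface mutations. The function names, input layout and toy values are hypothetical.

```python
from collections import defaultdict


def weighted_mean_abs(effects):
    """Inverse-variance weighted mean of |ddG| for (ddg, sigma) pairs.

    The inverse-variance weighting is an assumption of this sketch.
    """
    numerator = sum(abs(ddg) / sigma ** 2 for ddg, sigma in effects)
    denominator = sum(1.0 / sigma ** 2 for _, sigma in effects)
    return numerator / denominator


def major_allosteric_sites(mutations, ligand_distance, cutoff=5.0):
    """Return non-interface positions whose weighted mean |binding ddG|
    exceeds the weighted mean over all binding-interface mutations.

    mutations: iterable of (position, ddg_bind, sigma) tuples.
    ligand_distance: dict mapping position -> minimal ligand distance (A).
    """
    by_position = defaultdict(list)
    for position, ddg, sigma in mutations:
        by_position[position].append((ddg, sigma))

    # Pool every mutation at residues within the cutoff of the ligand.
    interface_effects = [
        effect
        for position, effects in by_position.items()
        if ligand_distance[position] < cutoff
        for effect in effects
    ]
    threshold = weighted_mean_abs(interface_effects)

    return sorted(
        position
        for position, effects in by_position.items()
        if ligand_distance[position] >= cutoff
        and weighted_mean_abs(effects) > threshold
    )


# Toy data: position 1 contacts the ligand, positions 2 and 3 do not.
toy_mutations = [
    (1, 1.0, 0.1), (1, 2.0, 0.1),
    (2, 2.0, 0.1), (2, -1.8, 0.1),
    (3, 0.1, 0.1), (3, 0.2, 0.1),
]
toy_distances = {1: 3.0, 2: 8.0, 3: 12.0}
print(major_allosteric_sites(toy_mutations, toy_distances))
```

In the toy data the interface threshold is 1.5, position 2 averages 1.9 and position 3 averages 0.15, so only position 2 is returned.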

Extended Data Fig. 8 Changes in fitness and free energy of binding and folding of major allosteric sites and allosteric mutations.

a, Scatter plots of single aa substitutions’ changes in free energy of binding and folding for the GB1 (left panel), GRB2-SH3 (middle panel) and PSD95-PDZ3 (right panel) protein domains. Variants are coloured by aa position if found in a major allosteric site. Free energies are from a single model; error bars indicate 95% CI from a Monte Carlo simulation approach (n = 10 experiments). b, Scatter plots comparing abundance and binding fitness of single aa substitutions in GRB2-SH3 (left panel) and PSD95-PDZ3 (right panel). Variants are coloured by aa position if found in a major allosteric site. Data are presented as mean values and error bars indicate 95% CI (n = 3 biological replicates). The red line indicates the model-derived relationship between abundance and binding fitness in the absence of a change in the free energy of binding. c, Scatter plots of single aa substitutions’ changes in free energy of binding and folding for GB1 (left panel), GRB2-SH3 (middle panel) and PSD95-PDZ3 (right panel). Variants are coloured by aa position if found in a major allosteric site (yellow) or in a position that has allosteric mutations (green). Free energies are from a single model; error bars indicate 95% CI from a Monte Carlo simulation approach (n = 10 experiments). d, Scatter plots comparing abundance and binding fitness of single aa substitutions in GRB2-SH3 (left panel) and PSD95-PDZ3 (right panel). Variants are coloured by aa position if found in a major allosteric site (yellow) or in a position that has allosteric mutations (green). Data are presented as mean values and error bars indicate 95% CI (n = 3 biological replicates). The red line indicates the model-derived relationship between abundance and binding fitness in the absence of a change in the free energy of binding.

Extended Data Fig. 9 Allosteric mutations in GB1 and enrichment of allosteric mutations in literature allosteric networks and specific residue types and classes.

a, Domain structure of GB1 with surface allosteric sites and surface residues with allosteric mutations highlighted in orange and green respectively. The FC domain of the human Immunoglobulin G is shown as black sticks. b, Scatter plot showing the binding free energy changes of all mutations and coloured according to residue position: allosteric site (orange), orthosteric site/mutation (red), core allosteric mutation (blue), surface allosteric mutation (green). c, Percentage of allosteric mutations per residue versus ligand proximity, excluding sites within the binding interface. Points are coloured according to residue position and major allosteric sites are indicated (see legend). ρ = Spearman rank correlation coefficient. d, Total numbers of mutations decreasing or increasing binding fitness (i.e. the fraction of bound protein complex) beyond the indicated minimum or maximum thresholds (x-axis; two-sided Z-test P < 0.05) respectively. e, Enrichment of allosteric mutations in sets of residues defined by previously reported allosteric networks in PSD95-PDZ3: McLaughlin et al. 2012 (ref. 39), Salinas et al. 2018 (ref. 64), Gerek et al. 2011 (ref. 65), Kumawat et al. 2017 (ref. 66), Gianni et al. 2011 (ref. 67), Kalescky et al. 2015 (ref. 68), Du et al. 2010 (ref. 69), Kaya et al. 2013 (ref. 70). The enrichment (log2 odds ratio) corresponding to a 2×2 contingency table is shown on the x-axis and the associated P value from a two-sided Fisher’s Exact Test is indicated. Residues within the binding interface (ligand distance < 5 Å) were ignored. Original literature allosteric network sizes are shown in parentheses. f, g, Same as (e) except sets of residues are defined by the identity of the WT or mutant amino acid (see legend) or their physicochemical properties (hydrophobic i.e. A, V, I, L, M, F, Y, W or charged i.e. R, H, K, D, E). Results are shown for all residues outside the binding interface (f) and further restricted to those residues in beta strands or helices i.e. not within loops/turns (g). Sets are ranked by their mean effect across the three protein domains.
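The enrichment statistic in panels e to g (log2 odds ratio of a 2×2 contingency table with a two-sided Fisher's exact test) can be computed as in the following sketch. It uses only the Python standard library; the function names and the example counts are hypothetical, and the two-sided P value follows the common convention of summing the probabilities of all tables no more likely than the observed one, which may differ in detail from the implementation the authors used.

```python
import math


def log2_odds_ratio(a, b, c, d):
    """log2 odds ratio of the table [[a, b], [c, d]].

    Assumes all four counts are non-zero; no pseudocount is applied.
    """
    return math.log2((a * d) / (b * c))


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def table_probability(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return (math.comb(row1, x) * math.comb(row2, col1 - x)
                / math.comb(total, col1))

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over tables at least as extreme (no more probable) than observed.
    p_value = sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Hypothetical counts: rows = allosteric / non-allosteric mutations,
# columns = inside / outside a literature-defined residue set.
a, b, c, d = 8, 2, 1, 5
print(f"log2 odds ratio = {log2_odds_ratio(a, b, c, d):.3f}")
print(f"two-sided P = {fisher_exact_two_sided(a, b, c, d):.5f}")
```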

Extended Data Fig. 10 Comparisons to computationally predicted allosteric coupling scores and mutational biases towards increased or decreased binding given the position in the domain structure.

a, Percentage of allosteric mutations per residue versus allosteric coupling scores estimated by a network-based perturbation propagation algorithm (ref. 40), where residues in the binding interface (ligand distance < 5 Å) are omitted as they represent the query set. Residues immediately adjacent to binding-interface residues in the linear aa sequence (i.e. backbone-backbone contacts which are disregarded by the Ohm algorithm) were given the maximum allosteric coupling score (1.0). Major allosteric sites (in yellow) and Spearman rank correlation coefficients (ρ) are indicated. b, Total numbers of mutations decreasing or increasing the free energy of binding beyond the indicated minimum or maximum thresholds (x-axis; two-sided Z-test P < 0.05) respectively, stratified by position in the structure considering all variants (regardless of the confidence of inferred free energies).

Supplementary information

Supplementary Methods

This file contains Methods, supplementary text, equations and additional references.

Reporting Summary

Peer Review File

Supplementary Table 1

Primers used in this study.

Supplementary Table 2

Gene blocks used in this study.

Supplementary Table 3

Experimental details and numbers of the mutagenesis libraries in this study.

Supplementary Table 4

Illumina indexed primers combinations used in this study to demultiplex samples after deep sequencing.

Supplementary Table 5

Degenerate NNK oligonucleotides used for the GRB2-SH3 and PSD95-PDZ3 nicking mutagenesis libraries.

Supplementary Table 6

Fitness estimates for GB1, GRB2-SH3 and PSD95-PDZ3.

Supplementary Table 7

Inferred folding and binding free energy changes and associated annotations for GB1, GRB2-SH3 and PSD95-PDZ3.


About this article


Cite this article

Faure, A.J., Domingo, J., Schmiedel, J.M. et al. Mapping the energetic and allosteric landscapes of protein binding domains. Nature 604, 175–183 (2022). https://doi.org/10.1038/s41586-022-04586-4


  • DOI: https://doi.org/10.1038/s41586-022-04586-4
