Mapping the energetic and allosteric landscapes of protein binding domains

Faure, Andre J.; Domingo, Júlia; Schmiedel, Jörn M.; Hidalgo-Carcedo, Cristina; Diss, Guillaume; Lehner, Ben

doi:10.1038/s41586-022-04586-4

Article
Published: 06 April 2022

Nature volume 604, pages 175–183 (2022)

Abstract

Allosteric communication between distant sites in proteins is central to biological regulation but still poorly characterized, limiting understanding, engineering and drug development¹,²,³,⁴,⁵,⁶. An important reason for this is the lack of methods to comprehensively quantify allostery in diverse proteins. Here we address this shortcoming and present a method that uses deep mutational scanning to globally map allostery. The approach uses an efficient experimental design to infer en masse the causal biophysical effects of mutations by quantifying multiple molecular phenotypes (here we examine binding and protein abundance) in multiple genetic backgrounds and fitting thermodynamic models using neural networks. We apply the approach to two of the most common protein interaction domains found in humans, an SH3 domain and a PDZ domain, to produce comprehensive atlases of allosteric communication. Allosteric mutations are abundant, with a large mutational target space of network-altering ‘edgetic’ variants. Mutations are more likely to be allosteric closer to binding interfaces, at glycine residues and at specific residues connecting to an opposite surface within the PDZ domain. This general approach of quantifying mutational effects for multiple molecular phenotypes and in multiple genetic backgrounds should enable the energetic and allosteric landscapes of many proteins to be rapidly and comprehensively mapped.
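The abstract names thermodynamic models without giving their form. As a minimal illustrative sketch, the code below assumes a three-state equilibrium (unfolded, folded, bound) in which free energy changes of mutations combine additively across genetic backgrounds. The function names, the temperature and the example energies are assumptions made for illustration, not values or code taken from the article.

```python
import math

R_KCAL = 0.001987   # gas constant in kcal/(mol*K)
TEMP_K = 303.0      # assumed temperature (30 degrees C); illustrative only
RT = R_KCAL * TEMP_K


def fraction_folded(dg_fold, rt=RT):
    """Two-state folding equilibrium: unfolded <-> folded.

    dg_fold is the folding free energy (kcal/mol); negative values favour
    the folded state.
    """
    return 1.0 / (1.0 + math.exp(dg_fold / rt))


def fraction_bound(dg_fold, dg_bind, rt=RT):
    """Three-state equilibrium: unfolded <-> folded <-> bound.

    Only folded protein can bind, so destabilising the fold also lowers
    the bound fraction even when dg_bind is unchanged.
    """
    return 1.0 / (1.0 + math.exp(dg_bind / rt) * (1.0 + math.exp(dg_fold / rt)))


def variant_energies(wt_dg_fold, wt_dg_bind, ddgs):
    """Add per-mutation (ddg_fold, ddg_bind) pairs to the wild-type energies.

    Additivity of free energy changes is the assumption that lets double
    mutants constrain the single-mutant effects.
    """
    dg_fold = wt_dg_fold + sum(d[0] for d in ddgs)
    dg_bind = wt_dg_bind + sum(d[1] for d in ddgs)
    return dg_fold, dg_bind


if __name__ == "__main__":
    # Hypothetical wild type and a purely destabilising mutation.
    wt = (-2.0, -1.0)
    mut = variant_energies(*wt, ddgs=[(1.5, 0.0)])
    print("abundance-like phenotype:", fraction_folded(wt[0]), fraction_folded(mut[0]))
    print("binding-like phenotype:  ", fraction_bound(*wt), fraction_bound(*mut))
```

In this sketch a mutation that changes only the folding energy moves both phenotypes, whereas one that changes only the binding energy leaves the folded fraction untouched. That asymmetry is why measuring two phenotypes helps separate the two energies.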

**Fig. 1: ddPCA quantifies the effects of mutations on protein abundance and binding.**

**Fig. 2: From molecular phenotypes to free energy changes.**

**Fig. 3: Binding and folding free energy landscapes of the SH3 and PDZ domains.**

**Fig. 4: Mutational effects on protein stability.**

**Fig. 5: Major allosteric sites in protein binding domains.**

**Fig. 6: Protein surfaces are frequent sites of binding affinity modulation.**

Data availability

All DNA sequencing data have been deposited in the Gene Expression Omnibus with accession number GSE184042. Protein structures were obtained from the Protein Data Bank with the following accessions: GRB2-SH3, 2VWF; PSD95-PDZ3, 1BE9; GB1, 1FCC; GRB2 homodimer, 1GRI. The AlphaFold prediction for PSD95 was obtained from the AlphaFold Protein Structure Database with accession P78352.

Code availability

Source code used to fit thermodynamic models, perform all downstream analyses and to reproduce all figures in this work is available at https://github.com/lehner-lab/doubledeepms.

References

1. Guarnera, E. & Berezovsky, I. N. Allosteric drugs and mutations: chances, challenges, and necessity. Curr. Opin. Struct. Biol. 62, 149–157 (2020).
2. Arkin, M. R., Tang, Y. & Wells, J. A. Small-molecule inhibitors of protein-protein interactions: progressing toward the reality. Chem. Biol. 21, 1102–1114 (2014).
3. Motlagh, H. N., Wrabl, J. O., Li, J. & Hilser, V. J. The ensemble nature of allostery. Nature 508, 331–339 (2014).
4. Xie, J. & Lai, L. Protein topology and allostery. Curr. Opin. Struct. Biol. 62, 158–165 (2020).
5. Kuriyan, J. & Eisenberg, D. The origin of protein interactions and allostery in colocalization. Nature 450, 983–990 (2007).
6. Nussinov, R. & Tsai, C.-J. Allostery in disease and in drug discovery. Cell 153, 293–305 (2013).
7. Monod, J., Changeux, J. P. & Jacob, F. Allosteric proteins and cellular control systems. J. Mol. Biol. 6, 306–329 (1963).
8. Ullmann, A. In memoriam: Jacques Monod (1910–1976). Genome Biol. Evol. 3, 1025–1033 (2011).
9. Halabi, N., Rivoire, O., Leibler, S. & Ranganathan, R. Protein sectors: evolutionary units of three-dimensional structure. Cell 138, 774–786 (2009).
10. Dionne, U. et al. Protein context shapes the specificity of SH3 domain-mediated interactions in vivo. Nat. Commun. 12, 1597 (2021).
11. McCormick, J. W., Russo, M. A., Thompson, S., Blevins, A. & Reynolds, K. A. Structurally distributed surface sites tune allosteric regulation. eLife 10, e68346 (2021).
12. Bandaru, P. et al. Deconstruction of the Ras switching cycle through saturation mutagenesis. eLife 6, e27810 (2017).
13. Reynolds, K. A., McLaughlin, R. N. & Ranganathan, R. Hot spots for allosteric regulation on protein surfaces. Cell 147, 1564–1575 (2011).
14. Oakes, B. L. et al. Profiling of engineering hotspots identifies an allosteric CRISPR–Cas9 switch. Nat. Biotechnol. 34, 646–651 (2016).
15. Leander, M., Yuan, Y., Meger, A., Cui, Q. & Raman, S. Functional plasticity and evolutionary adaptation of allosteric regulation. Proc. Natl Acad. Sci. USA 117, 25445–25454 (2020).
16. Tack, D. S. et al. The genotype-phenotype landscape of an allosteric protein. Mol. Syst. Biol. 17, e10179 (2021).
17. Coyote-Maestas, W., He, Y., Myers, C. L. & Schmidt, D. Domain insertion permissibility-guided engineering of allostery in ion channels. Nat. Commun. 10, 290 (2019).
18. Li, X. & Lehner, B. Biophysical ambiguities prevent accurate genetic prediction. Nat. Commun. 11, 4923 (2020).
19. Otwinowski, J. Biophysical inference of epistasis and the effects of mutations on protein stability and function. Mol. Biol. Evol. 35, 2345–2354 (2018).
20. Fowler, D. M. & Fields, S. Deep mutational scanning: a new style of protein science. Nat. Methods 11, 801–807 (2014).
21. Woodsmith, J. et al. Protein interaction perturbation profiling at amino-acid resolution. Nat. Methods 14, 1213–1221 (2017).
22. Cagiada, M. et al. Understanding the origins of loss of protein function by analyzing the effects of thousands of variants on activity and abundance. Mol. Biol. Evol. 38, 3235–3246 (2021).
23. Domingo, J., Baeza-Centurion, P. & Lehner, B. The causes and consequences of genetic interactions (epistasis). Annu. Rev. Genom. Hum. Genet. 20, 433–460 (2019).
24. Diss, G. & Lehner, B. The genetic landscape of a physical interaction. eLife 7, e32472 (2018).
25. Levy, E. D., Kowarzyk, J. & Michnick, S. W. High-resolution mapping of protein concentration reveals principles of proteome architecture and adaptation. Cell Rep. 7, 1333–1340 (2014).
26. Pelletier, J. N., Arndt, K. M., Plückthun, A. & Michnick, S. W. An in vivo library-versus-library selection of optimized protein-protein interactions. Nat. Biotechnol. 17, 683–690 (1999).
27. Campbell-Valois, F.-X., Tarassov, K. & Michnick, S. W. Massive sequence perturbation of a small protein. Proc. Natl Acad. Sci. USA 102, 14988–14993 (2005).
28. Tokuriki, N. & Tawfik, D. S. Stability effects of mutations and protein evolvability. Curr. Opin. Struct. Biol. 19, 596–604 (2009).
29. Wei, X. et al. A massively parallel pipeline to clone DNA variants and examine molecular phenotypes of human disease mutations. PLoS Genet. 10, e1004819 (2014).
30. Horovitz, A., Fleisher, R. C. & Mondal, T. Double-mutant cycles: new directions and applications. Curr. Opin. Struct. Biol. 58, 10–17 (2019).
31. Calosci, N. et al. Comparison of successive transition states for folding reveals alternative early folding pathways of two homologous proteins. Proc. Natl Acad. Sci. USA 105, 19241–19246 (2008).
32. Kellogg, E. H., Leaver-Fay, A. & Baker, D. Role of conformational sampling in computing mutation-induced changes in protein structure and stability. Proteins 79, 830–838 (2011).
33. Nisthal, A., Wang, C. Y., Ary, M. L. & Mayo, S. L. Protein stability engineering insights revealed by domain-wide comprehensive mutagenesis. Proc. Natl Acad. Sci. USA 116, 16367–16377 (2019).
34. Laursen, L., Kliche, J., Gianni, S. & Jemth, P. Supertertiary protein structure affects an allosteric network. Proc. Natl Acad. Sci. USA 117, 24294–24304 (2020).
35. Olson, C. A., Wu, N. C. & Sun, R. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr. Biol. 24, 2643–2651 (2014).
36. Shoichet, B. K., Baase, W. A., Kuroki, R. & Matthews, B. W. A relationship between protein stability and protein function. Proc. Natl Acad. Sci. USA 92, 452–456 (1995).
37. Redler, R. L., Das, J., Diaz, J. R. & Dokholyan, N. V. Protein destabilization as a common factor in diverse inherited disorders. J. Mol. Evol. 82, 11–16 (2016).
38. Mosca, R., Céol, A. & Aloy, P. Interactome3D: adding structural details to protein networks. Nat. Methods 10, 47–53 (2013).
39. McLaughlin, R. N. Jr et al. The spatial architecture of protein function and adaptation. Nature 491, 138–142 (2012).
40. Wang, J. et al. Mapping allosteric communications within individual proteins. Nat. Commun. 11, 3862 (2020).
41. Zhong, Q. et al. Edgetic perturbation models of human inherited disorders. Mol. Syst. Biol. 5, 321 (2009).
42. Sahni, N. et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell 161, 647–660 (2015).
43. Kinney, J. B., Murugan, A., Callan, C. G. Jr & Cox, E. C. Using deep sequencing to characterize the biophysical mechanism of a transcriptional regulatory sequence. Proc. Natl Acad. Sci. USA 107, 9158–9163 (2010).
44. Forcier, T. L. et al. Measuring cis-regulatory energetics in living cells using allelic manifolds. eLife 7, e40618 (2018).
45. Tareen, A. et al. MAVE-NN: learning genotype–phenotype maps from multiplex assays of variant effect. Preprint at bioRxiv https://doi.org/10.1101/2020.07.14.201475 (2020).
46. Adams, R. M., Mora, T., Walczak, A. M. & Kinney, J. B. Measuring the sequence-affinity landscape of antibodies with massively parallel titration curves. eLife 5, e23156 (2016).
47. Kinney, J. B. & McCandlish, D. M. Massively parallel assays and quantitative sequence–function relationships. Annu. Rev. Genomics Hum. Genet. 20, 99–127 (2019).
48. Skoulidis, F. et al. Sotorasib for lung cancers with KRAS p.G12C mutation. N. Engl. J. Med. 384, 2371–2381 (2021).
49. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

Acknowledgements

This work was funded by European Research Council (ERC) Advanced (883742) and Consolidator (616434) grants, the Spanish Ministry of Science and Innovation (PID2020-118723GB-I00, BFU2017-89488-P, EMBL Partnership, Severo Ochoa Centre of Excellence), the Bettencourt Schueller Foundation, the AXA Research Fund, Agencia de Gestio d’Ajuts Universitaris i de Recerca (AGAUR, 2017 SGR 1322), and the CERCA Program/Generalitat de Catalunya. J.M.S. was supported by an EMBO Long-Term Fellowship (ALTF 857-2016) and a Marie Skłodowska-Curie Fellowship (752809, EU Commission Horizon 2020). We thank M. Dias, J. Frazer and D. Marks for providing EVE and EVmutation predictions, J. Taipale for motivation and discussion, and all members of the Lehner laboratory for helpful discussions and suggestions, especially P. Baeza-Centurion, X. Li and A. M. New.

Author information

These authors contributed equally: Andre J. Faure, Júlia Domingo, Jörn M. Schmiedel

Authors and Affiliations

Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
Andre J. Faure, Júlia Domingo, Jörn M. Schmiedel, Cristina Hidalgo-Carcedo, Guillaume Diss & Ben Lehner
Universitat Pompeu Fabra (UPF), Barcelona, Spain
Ben Lehner
Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
Ben Lehner
New York Genome Center (NYGC), New York, NY, USA
Júlia Domingo
Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
Guillaume Diss

Contributions

J.D., J.M.S., G.D. and B.L. conceived the project and designed the experiments. J.D., J.M.S. and C.H.-C. constructed the mutant libraries. J.D. performed the yeast competition experiments with help from C.H.-C. J.D. constructed the sequencing libraries for next-generation sequencing. A.J.F. led the data analysis with help from J.D. and J.M.S. A.J.F., J.M.S. and B.L. formulated the thermodynamic model. A.J.F. wrote the code to implement and fit the model. B.L., A.J.F. and J.D. wrote the manuscript with input from all authors.

Corresponding author

Correspondence to Ben Lehner.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks the anonymous reviewers for their contribution to the peer review of this work. Peer review reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Performance of thermodynamic models.

a, Distribution of the number of double aa substitutions comprising the same single aa substitution in the AbundancePCA (blue) or BindingPCA (red) assays for the GRB2-SH3 (left) and PSD95-PDZ3 (right) protein domains. Median indicated with a dashed line and text label. b–d, 2d density plots comparing the ddPCA observed fitness and the model predicted fitness of single (left panels) and double aa substitutions (right panels) for the binding (top panels) and when existing, folding assays (bottom panels) of the GB1 (b), GRB2-SH3 (c) and PSD95-PDZ3 (d) domains. R² = proportion variance explained. e–g, Same as (b–d) but using validation data comprising 10% of double mutants held out during model fitting.

Extended Data Fig. 2 Performance of thermodynamic models after restricting data to a single phenotype or a single genetic background.

a, 2d density plots comparing the observed and predicted fitness of the binding (top panels) and abundance (bottom panels) assays when only the BindingPCA data is used for training the model for GRB2-SH3 (left panels) and PSD95-PDZ3 (right panels). b, Same as in (a), but only using single mutant data from both binding and abundance assays to fit the models. R² = proportion variance explained. c, d, Comparisons of inferred free energy changes to previously reported PSD95-PDZ3 mutant in vitro measurements where only BindingPCA data (c) or single mutants (d) were used to fit thermodynamic models. Free energies are from a single model; error bars indicate 95% CI from a Monte Carlo simulation approach (n = 10 experiments) and the regression error bands indicate 95% CI for predictions from a linear model (panel c top: n = 22, bottom: n = 25, panel d top: n = 32, bottom: n = 29). r = Pearson correlation coefficient.

Extended Data Fig. 3 Performance of thermodynamic models after downsampling and comparisons of inferred free energy changes to smaller-scale datasets of in vitro measurements.

a, Dashed lines indicate the relationship between the percentage of fitness variance explained by model predictions with respect to held out validation data (10% of doubles) and the percentage of randomly retained double aa mutants used to train the model in the abundance (blue) or binding (red) assay. Results are shown separately for all protein domains. Solid lines indicate the relationship between the percentage variance explained by inferred free energies with respect to previously reported in vitro measurements for GB1 (Nisthal et al. 2019³³) and PSD95-PDZ3 (Laursen et al. 2020³⁴ for ΔΔG binding, red; Calosci et al. 2008³¹ for ΔΔG folding, blue), where models were trained using varying fractions of randomly downsampled double mutants (x-axis). The top scale indicates the median number of double aa mutants per single aa mutant in the full dataset. b, Comparisons of the model-inferred free energy changes to previously reported in vitro measurements for GRB2-SH3 (Malagrinò et al. 2019⁵⁶ for ΔΔG binding and Troilo et al. 2018⁵⁷ for ΔΔG folding) and PSD95-PDZ3 (Chi et al. 2008⁵⁸). Note the modest effect sizes of variants assayed in Malagrinò et al. 2019. Free energies are from a single model; error bars indicate 95% CI from a Monte Carlo simulation approach (n = 10 experiments, in vitro error measurement not provided) and the regression error bands indicate 95% CI for predictions from a linear model (top left: n = 11, bottom left: n = 15, top right: n = 11, bottom right: n = 12). r = Pearson correlation coefficient.

Extended Data Fig. 4 Correlation of folding free energy changes with computational predictions of mutational effects.

a, High confidence inferred folding free energy changes versus corresponding FoldX⁵⁹ predictions upon mutation (“PositionScan” command), excluding substitutions involving potentially large increases in mass/volume (at wild-type Glycine, Alanine, Valine) or the replacement of Histidine (whose charge depends on the pH and local chemical environment). b, High confidence inferred folding free energy changes versus corresponding PolyPhen2⁶⁰ predictions for amino acid substitutions reachable by single nucleotide substitutions (SNPs). c, High confidence inferred folding free energy changes versus corresponding EVE pathogenicity scores⁶¹. d, Same as in (c), but scores are based on evolutionary couplings⁶². r = Pearson correlation coefficient.

Extended Data Fig. 5 Binding and folding free energy landscapes of the GB1 domain and biophysical mechanism of mutations that affect binding.

a, b, Heatmaps showing inferred changes in free energies of binding (a) and folding (b) for the GB1 domain. The final row in each heatmap indicates the minimal distance to the ligand (considering the side chain heavy atoms or the alpha carbon atoms in the case of glycine). Free energy changes of ligand-proximal residues (ligand distance < 5 Å) are boxed. Low confidence estimates are indicated with dots (95% CI ≥ 1 kcal/mol). Free energy changes more extreme than ±2.5 were set to this limit. c, Scatter plot comparing binding and folding free energy changes of mutations in the core, surface and binding interface. Contours indicate estimates of 2D densities with 6 contour bins. d, Distribution of binding (red) and folding (blue) free energy changes. e, Percentage of mutations that significantly decrease (top) or increase (bottom) fitness in the binding assay (FDR = 0.05) categorised by their biophysical mechanism. Pleiotropic mutations have significant changes in free energies of both folding and binding (FDR = 0.05) and are classified as either synergistic or antagonistic depending on whether their effects are in the same or different direction respectively. f, Changes in free energy of binding (blue) or folding (red) of single aa substitutions with different fitness effects in the binding assay for the three protein domains. g, Percentage of core, surface or ligand binding mutations that significantly decrease (top) or increase (bottom) fitness in the binding assay (FDR = 0.05) categorised by their biophysical mechanism. Pleiotropic mutations have significant changes in free energies of both folding and binding (FDR = 0.05) and are classified as either synergistic or antagonistic depending on whether their effects are in the same or different direction respectively.
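The mechanism categories in panels e and g can be sketched as follows. This is an illustrative reading of the caption, not the authors' code: it assumes that significance comes from a two-sided z-test on each inferred free energy change (estimate divided by its standard error) followed by Benjamini-Hochberg control at FDR = 0.05. The caption states only the FDR level, so the test, the correction method and all function names here are assumptions.

```python
import math


def two_sided_z_pvalue(estimate, std_error):
    """P value for the null hypothesis that the true effect is zero."""
    z = abs(estimate) / std_error
    return math.erfc(z / math.sqrt(2.0))


def benjamini_hochberg(pvalues, fdr=0.05):
    """Return a list of booleans marking discoveries at the given FDR."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= fdr * rank / m:
            largest_rank = rank  # keep the largest rank passing its threshold
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            significant[idx] = True
    return significant


def classify_mechanism(ddg_fold, ddg_bind, sig_fold, sig_bind):
    """Assign a biophysical mechanism from significance flags and signs."""
    if sig_fold and sig_bind:
        same_direction = (ddg_fold > 0) == (ddg_bind > 0)
        return "synergistic pleiotropy" if same_direction else "antagonistic pleiotropy"
    if sig_fold:
        return "folding"
    if sig_bind:
        return "binding"
    return "unclassified"


if __name__ == "__main__":
    # Hypothetical mutations: (ddg_fold, se_fold, ddg_bind, se_bind) in kcal/mol.
    muts = [(1.2, 0.1, 0.9, 0.1), (1.0, 0.1, -0.8, 0.1), (0.05, 0.2, 1.1, 0.1)]
    sig_f = benjamini_hochberg([two_sided_z_pvalue(m[0], m[1]) for m in muts])
    sig_b = benjamini_hochberg([two_sided_z_pvalue(m[2], m[3]) for m in muts])
    for m, sf, sb in zip(muts, sig_f, sig_b):
        print(m, classify_mechanism(m[0], m[2], sf, sb))
```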

Extended Data Fig. 6 GB1 mutational effects on protein stability and characterisation of surface de-stabilizing residues.

a, 3D structure of GB1 (PDB entry 1FCC) where residue atoms are coloured by the position-wise average change in the free energy of folding. The FC domain of the human Immunoglobulin G is shown as black sticks. b, Violin plots indicating distributions of confident changes in free energy of folding (n = 898; ***P < 2.2e–16, two-sided Mann-Whitney U test comparing mutations in the core versus the remainder). c, Anti-correlation between the position-wise average change in free energy of folding and the solvent exposure of the corresponding residue (RSASA) in GB1. Error bars indicate 95% CI (n = 19). r = Pearson correlation coefficient. d, Percentage of core, surface or binding-interface residues in GB1 shown separately for de-stabilizing residues (positions with ≥ 5 stabilizing mutations, folding ∆∆G < 0, FDR = 0.05) and the remainder. Inset numbers are total counts. e, Violin plots indicating evolutionary conservation scores (from multiple sequence alignments of 185, 8,852 and 276,481 homologous sequences of the GB1, GRB2-SH3 and PSD95-PDZ3 domains, respectively) shown separately for surface de-stabilizing residues and remaining surface or core residues. f, Violin plots indicating hydrophobicity score distributions shown separately for surface de-stabilizing residues and remaining surface or core residues. g, 3D structures of the GRB2-SH3 and PSD95-PDZ3 domains (grey cartoons) with the side-chains of surface de-stabilizing residues highlighted in green sticks. Ligands are shown as black sticks. The insets show, in yellow, the SH2 domains of the second monomer of GRB2 when found in dimeric form (left, PDB entry 1GRI)⁶³ and the relevant proximal portions of PSD95 C-terminal to the PDZ3 domain (middle and right, PDB entry 1BE9 and AlphaFold Protein Structure Database entry P78352).

Extended Data Fig. 7 Major allosteric sites in the GB1 domain and changes in free energy of binding in ligand binding interfaces.

a, 3D structures of the protein G B1 domain where residue atoms are coloured by the position-wise average absolute change in the free energy of binding. The FC domain of the human Immunoglobulin G is shown as black sticks. b, GB1 domain structure with binding-interface residues (ligand distance < 5 Å) highlighted in red and major allosteric site residues highlighted in orange. c, Relationship between the position-wise average absolute change in free energy of binding and the distance to the ligand (minimal side chain heavy atom distance) in the GB1 domain. Major allosteric sites (yellow) are defined as non-binding-interface residues with weighted average absolute change in free energy of binding higher than the average of binding-interface residue mutations (red). d, ROC curves for predicting ligand-contacting residues (ligand distance < 5 Å) using (weighted) mean absolute binding ∆∆G considering all variants or those with confident inferred free energies (conf.). AUC = Area Under the Curve. e, Inferring changes in free energy of binding provides insights into the interactions that mediate binding between GRB2-SH3 and GAB2 peptide, and how mutations disrupt binding. F7 and Y51 of the GRB2-SH3 domain contact P3 and P4 of the GAB2 peptide through aromatic-proline interactions (left heatmap). In these two positions, only mutations to Y, F, Q and H, which can interact with proline through aromatic-proline or amino-aromatic interactions, are tolerated, while all other amino acid substitutions result in decreased binding affinity (positive binding ∆∆G). Residue M46 can tolerate all amino acid substitutions except to positively charged residues (right heatmap). The closest residue of GAB2 is a lysine, and so a repulsive electrostatic interaction likely occurs when a positively charged amino acid occupies position 46 of the SH3 domain (binding ∆∆G of 2.1 and 1.99 for M46K and M46R respectively). f, ROC curves for predicting ligand contacting residues using (weighted) mean BindingPCA or AbundancePCA fitness.
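The definition of major allosteric sites in panel c can be sketched as below. The caption says the average is weighted but does not say how, so this sketch assumes inverse-variance weights (1/σ², with σ the uncertainty of each inferred ∆∆G). The record layout, field names and toy numbers are hypothetical.

```python
def weighted_mean_abs(ddgs, sigmas):
    """Inverse-variance weighted mean of |ddG| (the weighting is an assumption)."""
    weights = [1.0 / (s * s) for s in sigmas]
    return sum(w * abs(d) for w, d in zip(weights, ddgs)) / sum(weights)


def major_allosteric_sites(mutations, interface_cutoff=5.0):
    """Return positions outside the binding interface whose weighted mean
    |binding ddG| exceeds the weighted mean over all interface mutations.

    Each mutation is a dict with keys: pos, ddg_bind, sigma, ligand_dist.
    """
    interface = [m for m in mutations if m["ligand_dist"] < interface_cutoff]
    if not interface:
        raise ValueError("no binding-interface mutations to set the threshold")
    threshold = weighted_mean_abs(
        [m["ddg_bind"] for m in interface], [m["sigma"] for m in interface]
    )

    by_position = {}
    for m in mutations:
        by_position.setdefault(m["pos"], []).append(m)

    sites = []
    for pos in sorted(by_position):
        muts = by_position[pos]
        if muts[0]["ligand_dist"] < interface_cutoff:
            continue  # interface residues are orthosteric, not allosteric
        score = weighted_mean_abs(
            [m["ddg_bind"] for m in muts], [m["sigma"] for m in muts]
        )
        if score > threshold:
            sites.append(pos)
    return sites


if __name__ == "__main__":
    # Toy data: position 1 contacts the ligand, positions 2 and 3 do not.
    toy = [
        {"pos": 1, "ddg_bind": 1.0, "sigma": 0.1, "ligand_dist": 3.0},
        {"pos": 1, "ddg_bind": 2.0, "sigma": 0.1, "ligand_dist": 3.0},
        {"pos": 2, "ddg_bind": 2.0, "sigma": 0.1, "ligand_dist": 10.0},
        {"pos": 2, "ddg_bind": -1.8, "sigma": 0.1, "ligand_dist": 10.0},
        {"pos": 3, "ddg_bind": 0.1, "sigma": 0.1, "ligand_dist": 12.0},
        {"pos": 3, "ddg_bind": 0.2, "sigma": 0.1, "ligand_dist": 12.0},
    ]
    print(major_allosteric_sites(toy))
```

Using the absolute value means that positions where mutations strengthen binding count as much as positions where they weaken it, matching the caption's "average absolute change".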

Extended Data Fig. 8 Changes in fitness and free energy of binding and folding of major allosteric sites and allosteric mutations.

a, Scatter plots of single aa substitutions’ changes in free energy of binding and folding for the GB1 (left panel), GRB2-SH3 (middle panel) and PSD95-PDZ3 (right panel) protein domains. Variants are coloured by aa position if found in a major allosteric site. Free energies are from a single model; error bars indicate 95% CI from a Monte Carlo simulation approach (n = 10 experiments). b, Scatter plots comparing abundance and binding fitness of single aa substitutions in GRB2-SH3 (left panel) and PSD95-PDZ3 (right panel). Variants are coloured by aa position if found in a major allosteric site. Data are presented as mean values and error bars indicate 95% CI (n = 3 biological replicates). The red line indicates the model-derived relationship between abundance and binding fitness in the absence of a change in the free energy of binding. c, Scatter plots of single aa substitutions’ changes in free energy of binding and folding for GB1 (left panel), GRB2-SH3 (middle panel) and PSD95-PDZ3 (right panel). Variants are coloured by aa position if found in a major allosteric site (yellow) or in a position that has allosteric mutations (green). Free energies are from a single model; error bars indicate 95% CI from a Monte Carlo simulation approach (n = 10 experiments). d, Scatter plots comparing abundance and binding fitness of single aa substitutions in GRB2-SH3 (left panel) and PSD95-PDZ3 (right panel). Variants are coloured by aa position if found in a major allosteric site (yellow) or in a position that has allosteric mutations (green). Data are presented as mean values and error bars indicate 95% CI (n = 3 biological replicates). The red line indicates the model-derived relationship between abundance and binding fitness in the absence of a change in the free energy of binding.

Extended Data Fig. 9 Allosteric mutations in GB1 and enrichment of allosteric mutations in literature allosteric networks and specific residue types and classes.

a, Domain structure of GB1 with surface allosteric sites and surface residues with allosteric mutations highlighted in orange and green respectively. The FC domain of the human Immunoglobulin G is shown as black sticks. b, Scatter plot showing the binding free energy changes of all mutations and coloured according to residue position: allosteric site (orange), orthosteric site/mutation (red), core allosteric mutation (blue), surface allosteric mutation (green). c, Percentage of allosteric mutations per residue versus ligand proximity, excluding sites within the binding interface. Points are coloured according to residue position and major allosteric sites are indicated (see legend). ρ = Spearman rank correlation coefficient. d, Total numbers of mutations decreasing or increasing binding fitness (i.e. the fraction of bound protein complex) beyond the indicated minimum or maximum thresholds (x-axis; two-sided Z-test P < 0.05) respectively. e, Enrichment of allosteric mutations in sets of residues defined by previously reported allosteric networks in PSD95-PDZ3: McLaughlin et al. 2012³⁹, Salinas et al. 2018⁶⁴, Gerek et al. 2011⁶⁵, Kumawat et al. 2017⁶⁶, Gianni et al. 2011⁶⁷, Kalescky et al. 2015⁶⁸, Du et al. 2010⁶⁹, Kaya et al. 2013⁷⁰. The enrichment (log₂ odds ratio) corresponding to a 2x2 contingency table is shown on the x-axis and the associated P value from a two-sided Fisher’s Exact Test is indicated. Residues within the binding interface (ligand distance < 5 Å) were ignored. Original literature allosteric network sizes are shown in parentheses. f, g, Same as (e) except sets of residues are defined by the identity of the WT or mutant amino acid (see legend) or their physicochemical properties (hydrophobic i.e. A, V, I, L, M, F, Y, W or charged i.e. R, H, K, D, E). Results are shown for all residues outside the binding interface (f) and further restricted to those residues in beta strands or helices i.e. not within loops/turns (g). Sets are ranked by their mean effect across the three protein domains.
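The enrichment statistic in panels e to g (a log₂ odds ratio from a 2x2 contingency table with a two-sided Fisher's exact test) can be illustrated with the standard-library sketch below. The table layout (allosteric or not, in the residue set or not), the function names and the example counts are hypothetical. The caption does not say how zero cells were handled, so this sketch simply refuses to compute an odds ratio for them.

```python
from math import comb, log2


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1.0 + 1e-7)  # tolerance for float ties
    )
    return min(1.0, p_value)


def log2_odds_ratio(a, b, c, d):
    """log2 of (a*d)/(b*c); undefined when any cell is zero."""
    if 0 in (a, b, c, d):
        raise ValueError("odds ratio is undefined with a zero cell")
    return log2((a * d) / (b * c))


if __name__ == "__main__":
    # Hypothetical counts: rows = mutation in set / not in set,
    # columns = allosteric / not allosteric.
    table = (3, 1, 1, 3)
    print("log2 odds ratio:", log2_odds_ratio(*table))
    print("two-sided P:", fisher_exact_two_sided(*table))
```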

Extended Data Fig. 10 Comparisons to computationally predicted allosteric coupling scores and mutational biases towards increased or decreased binding given the position in the domain structure.

a, Percentage of allosteric mutations per residue versus allosteric coupling scores estimated by a network-based perturbation propagation algorithm⁴⁰, where residues in the binding interface (ligand distance < 5 Å) are omitted as they represent the query set. Residues immediately adjacent to binding-interface residues in the linear aa sequence (i.e. backbone-backbone contacts which are disregarded by the Ohm algorithm) were given the maximum allosteric coupling score (1.0). Major allosteric sites (in yellow) and Spearman rank correlation coefficients (ρ) are indicated. b, Total numbers of mutations decreasing or increasing the free energy of binding beyond the indicated minimum or maximum thresholds (x-axis; two-sided Z-test P < 0.05) respectively, stratified by position in the structure considering all variants (regardless of the confidence of inferred free energies).

Supplementary information

Supplementary Methods

This file contains Methods, supplementary text, equations and additional references.

Reporting Summary

Peer Review File

Supplementary Table 1

Primers used in this study.

Supplementary Table 2

Gene blocks used in this study.

Supplementary Table 3

Experimental details and numbers of the mutagenesis libraries in this study.

Supplementary Table 4

Illumina indexed primers combinations used in this study to demultiplex samples after deep sequencing.

Supplementary Table 5

Degenerate NNK oligonucleotides used for the GRB2-SH3 and PSD95-PDZ3 nicking mutagenesis libraries.

Supplementary Table 6

Fitness estimates for GB1, GRB2-SH3 and PSD95-PDZ3.

Supplementary Table 7

Inferred folding and binding free energy changes and associated annotations for GB1, GRB2-SH3 and PSD95-PDZ3.

About this article

Cite this article

Faure, A.J., Domingo, J., Schmiedel, J.M. et al. Mapping the energetic and allosteric landscapes of protein binding domains. Nature 604, 175–183 (2022). https://doi.org/10.1038/s41586-022-04586-4

Received: 14 September 2021
Accepted: 25 February 2022
Published: 06 April 2022
Issue Date: 07 April 2022
DOI: https://doi.org/10.1038/s41586-022-04586-4
