Abstract
Allosteric communication between distant sites in proteins is central to biological regulation but still poorly characterized, limiting understanding, engineering and drug development (refs. 1–6). An important reason for this is the lack of methods to comprehensively quantify allostery in diverse proteins. Here we address this shortcoming and present a method that uses deep mutational scanning to globally map allostery. The approach uses an efficient experimental design to infer en masse the causal biophysical effects of mutations by quantifying multiple molecular phenotypes (here, binding and protein abundance) in multiple genetic backgrounds and fitting thermodynamic models using neural networks. We apply the approach to two of the most common protein interaction domains found in humans, an SH3 domain and a PDZ domain, to produce comprehensive atlases of allosteric communication. Allosteric mutations are abundant, with a large mutational target space of network-altering ‘edgetic’ variants. Mutations are more likely to be allosteric closer to binding interfaces, at glycine residues and at specific residues connecting to an opposite surface within the PDZ domain. This general approach of quantifying mutational effects for multiple molecular phenotypes and in multiple genetic backgrounds should enable the energetic and allosteric landscapes of many proteins to be rapidly and comprehensively mapped.
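The thermodynamic modelling idea in the abstract can be illustrated with a minimal sketch. It assumes a three-state equilibrium (unfolded, folded, bound), additive free energy changes across mutations and an arbitrary temperature. The function names, the temperature and the example energies are illustrative assumptions, not code or values from the study.

```python
import math

# Assumed constants for illustration only.
R_KCAL = 1.98720425864083e-3   # gas constant in kcal/(mol*K)
TEMP_K = 303.15                # hypothetical assay temperature (30 degrees C)
RT = R_KCAL * TEMP_K


def mutant_energy(dg_wildtype, ddgs):
    """Additive model: mutant free energy = wild-type energy + sum of changes."""
    return dg_wildtype + sum(ddgs)


def fraction_folded(dg_fold, rt=RT):
    """Two-state folding: fraction of molecules in the folded state."""
    return 1.0 / (1.0 + math.exp(dg_fold / rt))


def fraction_bound(dg_fold, dg_bind, rt=RT):
    """Three-state model (unfolded, folded, bound): fraction in the bound state.

    Only folded molecules can bind, so destabilising the fold also lowers
    the bound fraction even when the binding energy itself is unchanged.
    """
    return 1.0 / (1.0 + math.exp(dg_bind / rt) * (1.0 + math.exp(dg_fold / rt)))


if __name__ == "__main__":
    dg_fold_wt, dg_bind_wt = -2.0, -1.5          # made-up wild-type energies
    dg_fold_mut = mutant_energy(dg_fold_wt, [1.2])  # a destabilising mutation
    print(fraction_folded(dg_fold_wt), fraction_folded(dg_fold_mut))
    print(fraction_bound(dg_fold_wt, dg_bind_wt),
          fraction_bound(dg_fold_mut, dg_bind_wt))
```

In this sketch a single binding measurement cannot tell a folding change from a binding change, which is why the abstract stresses measuring both abundance and binding.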
Data availability
All DNA sequencing data have been deposited in the Gene Expression Omnibus with accession number GSE184042. Protein structures were obtained from the Protein Data Bank with the following accessions: GRB2-SH3, 2VWF; PSD95-PDZ3, 1BE9; GB1, 1FCC; GRB2 homodimer, 1GRI. The AlphaFold prediction for PSD95 was obtained from the AlphaFold Protein Structure Database with accession P78352.
Code availability
Source code used to fit thermodynamic models, perform all downstream analyses and reproduce all figures in this work is available at https://github.com/lehner-lab/doubledeepms.
References
Guarnera, E. & Berezovsky, I. N. Allosteric drugs and mutations: chances, challenges, and necessity. Curr. Opin. Struct. Biol. 62, 149–157 (2020).
Arkin, M. R., Tang, Y. & Wells, J. A. Small-molecule inhibitors of protein-protein interactions: progressing toward the reality. Chem. Biol. 21, 1102–1114 (2014).
Motlagh, H. N., Wrabl, J. O., Li, J. & Hilser, V. J. The ensemble nature of allostery. Nature 508, 331–339 (2014).
Xie, J. & Lai, L. Protein topology and allostery. Curr. Opin. Struct. Biol. 62, 158–165 (2020).
Kuriyan, J. & Eisenberg, D. The origin of protein interactions and allostery in colocalization. Nature 450, 983–990 (2007).
Nussinov, R. & Tsai, C.-J. Allostery in disease and in drug discovery. Cell 153, 293–305 (2013).
Monod, J., Changeux, J. P. & Jacob, F. Allosteric proteins and cellular control systems. J. Mol. Biol. 6, 306–329 (1963).
Ullmann, A. In memoriam: Jacques Monod (1910–1976). Genome Biol. Evol. 3, 1025–1033 (2011).
Halabi, N., Rivoire, O., Leibler, S. & Ranganathan, R. Protein sectors: evolutionary units of three-dimensional structure. Cell 138, 774–786 (2009).
Dionne, U. et al. Protein context shapes the specificity of SH3 domain-mediated interactions in vivo. Nat. Commun. 12, 1597 (2021).
McCormick, J. W., Russo, M. A., Thompson, S., Blevins, A. & Reynolds, K. A. Structurally distributed surface sites tune allosteric regulation. eLife 10, e68346 (2021).
Bandaru, P. et al. Deconstruction of the Ras switching cycle through saturation mutagenesis. eLife 6, e27810 (2017).
Reynolds, K. A., McLaughlin, R. N. & Ranganathan, R. Hot spots for allosteric regulation on protein surfaces. Cell 147, 1564–1575 (2011).
Oakes, B. L. et al. Profiling of engineering hotspots identifies an allosteric CRISPR–Cas9 switch. Nat. Biotechnol. 34, 646–651 (2016).
Leander, M., Yuan, Y., Meger, A., Cui, Q. & Raman, S. Functional plasticity and evolutionary adaptation of allosteric regulation. Proc. Natl Acad. Sci. USA 117, 25445–25454 (2020).
Tack, D. S. et al. The genotype-phenotype landscape of an allosteric protein. Mol. Syst. Biol. 17, e10179 (2021).
Coyote-Maestas, W., He, Y., Myers, C. L. & Schmidt, D. Domain insertion permissibility-guided engineering of allostery in ion channels. Nat. Commun. 10, 290 (2019).
Li, X. & Lehner, B. Biophysical ambiguities prevent accurate genetic prediction. Nat. Commun. 11, 4923 (2020).
Otwinowski, J. Biophysical inference of epistasis and the effects of mutations on protein stability and function. Mol. Biol. Evol. 35, 2345–2354 (2018).
Fowler, D. M. & Fields, S. Deep mutational scanning: a new style of protein science. Nat. Methods 11, 801–807 (2014).
Woodsmith, J. et al. Protein interaction perturbation profiling at amino-acid resolution. Nat. Methods 14, 1213–1221 (2017).
Cagiada, M. et al. Understanding the origins of loss of protein function by analyzing the effects of thousands of variants on activity and abundance. Mol. Biol. Evol. 38, 3235–3246 (2021).
Domingo, J., Baeza-Centurion, P. & Lehner, B. The causes and consequences of genetic interactions (epistasis). Annu. Rev. Genomics Hum. Genet. 20, 433–460 (2019).
Diss, G. & Lehner, B. The genetic landscape of a physical interaction. eLife 7, e32472 (2018).
Levy, E. D., Kowarzyk, J. & Michnick, S. W. High-resolution mapping of protein concentration reveals principles of proteome architecture and adaptation. Cell Rep. 7, 1333–1340 (2014).
Pelletier, J. N., Arndt, K. M., Plückthun, A. & Michnick, S. W. An in vivo library-versus-library selection of optimized protein-protein interactions. Nat. Biotechnol. 17, 683–690 (1999).
Campbell-Valois, F.-X., Tarassov, K. & Michnick, S. W. Massive sequence perturbation of a small protein. Proc. Natl Acad. Sci. USA 102, 14988–14993 (2005).
Tokuriki, N. & Tawfik, D. S. Stability effects of mutations and protein evolvability. Curr. Opin. Struct. Biol. 19, 596–604 (2009).
Wei, X. et al. A massively parallel pipeline to clone DNA variants and examine molecular phenotypes of human disease mutations. PLoS Genet. 10, e1004819 (2014).
Horovitz, A., Fleisher, R. C. & Mondal, T. Double-mutant cycles: new directions and applications. Curr. Opin. Struct. Biol. 58, 10–17 (2019).
Calosci, N. et al. Comparison of successive transition states for folding reveals alternative early folding pathways of two homologous proteins. Proc. Natl Acad. Sci. USA 105, 19241–19246 (2008).
Kellogg, E. H., Leaver-Fay, A. & Baker, D. Role of conformational sampling in computing mutation-induced changes in protein structure and stability. Proteins 79, 830–838 (2011).
Nisthal, A., Wang, C. Y., Ary, M. L. & Mayo, S. L. Protein stability engineering insights revealed by domain-wide comprehensive mutagenesis. Proc. Natl Acad. Sci. USA 116, 16367–16377 (2019).
Laursen, L., Kliche, J., Gianni, S. & Jemth, P. Supertertiary protein structure affects an allosteric network. Proc. Natl Acad. Sci. USA 117, 24294–24304 (2020).
Olson, C. A., Wu, N. C. & Sun, R. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr. Biol. 24, 2643–2651 (2014).
Shoichet, B. K., Baase, W. A., Kuroki, R. & Matthews, B. W. A relationship between protein stability and protein function. Proc. Natl Acad. Sci. USA 92, 452–456 (1995).
Redler, R. L., Das, J., Diaz, J. R. & Dokholyan, N. V. Protein destabilization as a common factor in diverse inherited disorders. J. Mol. Evol. 82, 11–16 (2016).
Mosca, R., Céol, A. & Aloy, P. Interactome3D: adding structural details to protein networks. Nat. Methods 10, 47–53 (2013).
McLaughlin, R. N. Jr et al. The spatial architecture of protein function and adaptation. Nature 491, 138–142 (2012).
Wang, J. et al. Mapping allosteric communications within individual proteins. Nat. Commun. 11, 3862 (2020).
Zhong, Q. et al. Edgetic perturbation models of human inherited disorders. Mol. Syst. Biol. 5, 321 (2009).
Sahni, N. et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell 161, 647–660 (2015).
Kinney, J. B., Murugan, A., Callan, C. G. Jr & Cox, E. C. Using deep sequencing to characterize the biophysical mechanism of a transcriptional regulatory sequence. Proc. Natl Acad. Sci. USA 107, 9158–9163 (2010).
Forcier, T. L. et al. Measuring cis-regulatory energetics in living cells using allelic manifolds. eLife 7, e40618 (2018).
Tareen, A. et al. MAVE-NN: learning genotype–phenotype maps from multiplex assays of variant effect. Preprint at bioRxiv https://doi.org/10.1101/2020.07.14.201475 (2020).
Adams, R. M., Mora, T., Walczak, A. M. & Kinney, J. B. Measuring the sequence-affinity landscape of antibodies with massively parallel titration curves. eLife 5, e23156 (2016).
Kinney, J. B. & McCandlish, D. M. Massively parallel assays and quantitative sequence–function relationships. Annu. Rev. Genomics Hum. Genet. 20, 99–127 (2019).
Skoulidis, F. et al. Sotorasib for lung cancers with KRAS p.G12C mutation. N. Engl. J. Med. 384, 2371–2381 (2021).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Acknowledgements
This work was funded by European Research Council (ERC) Advanced (883742) and Consolidator (616434) grants, the Spanish Ministry of Science and Innovation (PID2020-118723GB-I00, BFU2017-89488-P, EMBL Partnership, Severo Ochoa Centre of Excellence), the Bettencourt Schueller Foundation, the AXA Research Fund, Agencia de Gestio d’Ajuts Universitaris i de Recerca (AGAUR, 2017 SGR 1322), and the CERCA Program/Generalitat de Catalunya. J.M.S. was supported by an EMBO Long-Term Fellowship (ALTF 857-2016) and a Marie Skłodowska-Curie Fellowship (752809, EU Commission Horizon 2020). We thank M. Dias, J. Frazer and D. Marks for providing EVE and EVmutation predictions, J. Taipale for motivation and discussion, and all members of the Lehner laboratory for helpful discussions and suggestions, especially P. Baeza-Centurion, X. Li and A. M. New.
Author information
Contributions
J.D., J.M.S., G.D. and B.L. conceived the project and designed the experiments. J.D., J.M.S. and C.H.-C. constructed the mutant libraries. J.D. performed the yeast competition experiments with help from C.H.-C. J.D. constructed the sequencing libraries for next-generation sequencing. A.J.F. led the data analysis with help from J.D. and J.M.S. A.J.F., J.M.S. and B.L. formulated the thermodynamic model. A.J.F. wrote the code to implement and fit the model. B.L., A.J.F. and J.D. wrote the manuscript with input from all authors.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature thanks the anonymous reviewers for their contribution to the peer review of this work. Peer review reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Performance of thermodynamic models.
a, Distribution of the number of double aa substitutions comprising the same single aa substitution in the AbundancePCA (blue) or BindingPCA (red) assays for the GRB2-SH3 (left) and PSD95-PDZ3 (right) protein domains. Median indicated with a dashed line and text label. b–d, 2D density plots comparing the ddPCA observed fitness and the model-predicted fitness of single (left panels) and double aa substitutions (right panels) for the binding (top panels) and, where available, folding assays (bottom panels) of the GB1 (b), GRB2-SH3 (c) and PSD95-PDZ3 (d) domains. R² = proportion of variance explained. e–g, Same as (b–d) but using validation data comprising 10% of double mutants held out during model fitting.
Extended Data Fig. 2 Performance of thermodynamic models after restricting data to a single phenotype or a single genetic background.
a, 2D density plots comparing the observed and predicted fitness of the binding (top panels) and abundance (bottom panels) assays when only the BindingPCA data are used for training the model for GRB2-SH3 (left panels) and PSD95-PDZ3 (right panels). b, Same as in (a), but only using single mutant data from both binding and abundance assays to fit the models. R² = proportion of variance explained. c, d, Comparisons of inferred free energy changes to previously reported PSD95-PDZ3 mutant in vitro measurements where only BindingPCA data (c) or single mutants (d) were used to fit thermodynamic models. Free energies are from a single model; error bars indicate 95% CI from a Monte Carlo simulation approach (n = 10 experiments) and the regression error bands indicate 95% CI for predictions from a linear model (panel c top: n = 22, bottom: n = 25, panel d top: n = 32, bottom: n = 29). r = Pearson correlation coefficient.
Extended Data Fig. 3 Performance of thermodynamic models after downsampling and comparisons of inferred free energy changes to smaller-scale datasets of in vitro measurements.
a, Dashed lines indicate the relationship between the percentage of fitness variance explained by model predictions with respect to held out validation data (10% of doubles) and the percentage of randomly retained double aa mutants used to train the model in the abundance (blue) or binding (red) assay. Results are shown separately for all protein domains. Solid lines indicate the relationship between the percentage variance explained by inferred free energies with respect to previously reported in vitro measurements for GB1 (Nisthal et al. 2019, ref. 33) and PSD95-PDZ3 (Laursen et al. 2020, ref. 34, for ΔΔG binding, red; Calosci et al. 2008, ref. 31, for ΔΔG folding, blue), where models were trained using varying fractions of randomly downsampled double mutants (x-axis). The top scale indicates the median number of double aa mutants per single aa mutant in the full dataset. b, Comparisons of the model-inferred free energy changes to previously reported in vitro measurements for GRB2-SH3 (Malagrinò et al. 2019, ref. 56, for ΔΔG binding and Troilo et al. 2018, ref. 57, for ΔΔG folding) and PSD95-PDZ3 (Chi et al. 2008, ref. 58). Note the modest effect sizes of variants assayed in Malagrinò et al. 2019. Free energies are from a single model; error bars indicate 95% CI from a Monte Carlo simulation approach (n = 10 experiments, in vitro error measurement not provided) and the regression error bands indicate 95% CI for predictions from a linear model (top left: n = 11, bottom left: n = 15, top right: n = 11, bottom right: n = 12). r = Pearson correlation coefficient.
Extended Data Fig. 4 Correlation of folding free energy changes with computational predictions of mutational effects.
a, High confidence inferred folding free energy changes versus corresponding FoldX (ref. 59) predictions upon mutation (“PositionScan” command), excluding substitutions involving potentially large increases in mass/volume (at wild-type Glycine, Alanine, Valine) or the replacement of Histidine (whose charge depends on the pH and local chemical environment). b, High confidence inferred folding free energy changes versus corresponding PolyPhen2 (ref. 60) predictions for amino acid substitutions reachable by single nucleotide substitutions (SNPs). c, High confidence inferred folding free energy changes versus corresponding EVE pathogenicity scores (ref. 61). d, Same as in (c), but scores are based on evolutionary couplings (ref. 62). r = Pearson correlation coefficient.
Extended Data Fig. 5 Binding and folding free energy landscapes of the GB1 domain and biophysical mechanism of mutations that affect binding.
a, b, Heatmaps showing inferred changes in free energies of binding (a) and folding (b) for the GB1 domain. The final row in each heatmap indicates the minimal distance to the ligand (considering the side chain heavy atoms or the alpha carbon atoms in the case of glycine). Free energy changes of ligand-proximal residues (ligand distance < 5 Å) are boxed. Low confidence estimates are indicated with dots (95% CI ≥ 1 kcal/mol). Free energy changes more extreme than ±2.5 were set to this limit. c, Scatter plot comparing binding and folding free energy changes of mutations in the core, surface and binding interface. Contours indicate estimates of 2D densities with 6 contour bins. d, Distribution of binding (red) and folding (blue) free energy changes. e, Percentage of mutations that significantly decrease (top) or increase (bottom) fitness in the binding assay (FDR = 0.05) categorised by their biophysical mechanism. Pleiotropic mutations have significant changes in free energies of both folding and binding (FDR = 0.05) and are classified as either synergistic or antagonistic depending on whether their effects are in the same or different direction respectively. f, Changes in free energy of binding (blue) or folding (red) of single aa substitutions with different fitness effects in the binding assay for the three protein domains. g, Percentage of core, surface or ligand binding mutations that significantly decrease (top) or increase (bottom) fitness in the binding assay (FDR = 0.05) categorised by their biophysical mechanism. Pleiotropic mutations have significant changes in free energies of both folding and binding (FDR = 0.05) and are classified as either synergistic or antagonistic depending on whether their effects are in the same or different direction respectively.
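The classification rule described in panels e and g can be sketched as follows. The pleiotropic, synergistic and antagonistic definitions follow the legend; the labels for non-pleiotropic cases ("folding", "binding", "unclassified") and the function name are hypothetical, and significance flags are assumed to have been computed elsewhere (for example at FDR = 0.05).

```python
def classify_mechanism(ddg_fold, ddg_bind, fold_significant, bind_significant):
    """Label a mutation by which free energy terms change significantly.

    Pleiotropic mutations change both folding and binding energies; they are
    synergistic when both changes have the same sign and antagonistic when
    the signs differ. Other labels are illustrative placeholders.
    """
    if fold_significant and bind_significant:
        same_direction = (ddg_fold > 0) == (ddg_bind > 0)
        if same_direction:
            return "pleiotropic_synergistic"
        return "pleiotropic_antagonistic"
    if fold_significant:
        return "folding"
    if bind_significant:
        return "binding"
    return "unclassified"


if __name__ == "__main__":
    # Made-up example values in kcal/mol.
    print(classify_mechanism(1.2, 0.8, True, True))
    print(classify_mechanism(1.2, -0.6, True, True))
    print(classify_mechanism(1.2, 0.1, True, False))
```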
Extended Data Fig. 6 GB1 mutational effects on protein stability and characterisation of surface de-stabilizing residues.
a, 3D structure of GB1 (PDB entry 1FCC) where residue atoms are coloured by the position-wise average change in the free energy of folding. The FC domain of the human Immunoglobulin G is shown as black sticks. b, Violin plots indicating distributions of confident changes in free energy of folding (n = 898; ***P < 2.2e–16, two-sided Mann-Whitney U test comparing mutations in the core versus the remainder). c, Anti-correlation between the position-wise average change in free energy of folding and the solvent exposure of the corresponding residue (RSASA) in GB1. Error bars indicate 95% CI (n = 19). r = Pearson correlation coefficient. d, Percentage of core, surface or binding-interface residues in GB1 shown separately for de-stabilizing residues (positions with ≥ 5 stabilizing mutations, folding ∆∆G < 0, FDR = 0.05) and the remainder. Inset numbers are total counts. e, Violin plots indicating evolutionary conservation scores (from multiple sequence alignments of 185, 8,852 and 276,481 homologous sequences of the GB1, GRB2-SH3 and PSD95-PDZ3 domains, respectively) shown separately for surface de-stabilizing residues and remaining surface or core residues. f, Violin plots indicating hydrophobicity score distributions shown separately for surface de-stabilizing residues and remaining surface or core residues. g, 3D structures of the GRB2-SH3 and PSD95-PDZ3 domains (grey cartoons) with the side-chains of surface de-stabilizing residues highlighted in green sticks. Ligands are shown as black sticks. In the insets, the SH2 domains of the second monomer of GRB2 when found in dimeric form (left, PDB entry 1GRI; ref. 63) and relevant proximal portions of PSD95 C-terminal to the PDZ3 domain (middle and right, PDB entry 1BE9 and AlphaFold Protein Structure Database entry P78352) are shown in yellow.
Extended Data Fig. 7 Major allosteric sites in the GB1 domain and changes in free energy of binding in ligand binding interfaces.
a, 3D structures of the protein G B1 domain where residue atoms are coloured by the position-wise average absolute change in the free energy of binding. The FC domain of the human Immunoglobulin G is shown as black sticks. b, GB1 domain structure with binding-interface residues (ligand distance < 5 Å) highlighted in red and major allosteric site residues highlighted in orange. c, Relationship between the position-wise average absolute change in free energy of binding and the distance to the ligand (minimal side chain heavy atom distance) in the GB1 domain. Major allosteric sites (yellow) are defined as non-binding-interface residues with weighted average absolute change in free energy of binding higher than the average of binding-interface residue mutations (red). d, ROC curves for predicting ligand-contacting residues (ligand distance < 5 Å) using (weighted) mean absolute binding ∆∆G considering all variants or those with confident inferred free energies (conf.). AUC = Area Under the Curve. e, Inferring changes in free energy of binding provides insights into the interactions that mediate binding between GRB2-SH3 and GAB2 peptide, and how mutations disrupt binding. F7 and Y51 of the GRB2-SH3 domain contact P3 and P4 of the GAB2 peptide through aromatic-proline interactions (left heatmap). In these two positions, only mutations to Y, F, Q and H, which can interact with proline through aromatic-proline or amino-aromatic interactions, are tolerated, while all other amino acid substitutions result in decreased binding affinity (positive binding ∆∆G). Residue M46 can tolerate all amino acid substitutions except to positively charged residues (right heatmap). The closest residue of GAB2 is a lysine, and so a repulsive electrostatic interaction likely occurs when a positively charged amino acid occupies position 46 of the SH3 domain (binding ∆∆G of 2.1 and 1.99 for M46K and M46R respectively). f, ROC curves for predicting ligand-contacting residues using (weighted) mean BindingPCA or AbundancePCA fitness.
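The definition of major allosteric sites in panel c can be sketched in a few lines. The legend does not state the weighting scheme, so inverse-variance weights are an assumption here, as are the record layout (`pos`, `ddg_bind`, `sigma`, `ligand_dist`) and the function names. The 5 Å interface cut-off is taken from the legend.

```python
from collections import defaultdict


def weighted_mean_abs(values, sigmas):
    """Inverse-variance weighted mean of absolute values (assumed weighting)."""
    weights = [1.0 / (s ** 2) for s in sigmas]
    total = sum(w * abs(v) for w, v in zip(weights, values))
    return total / sum(weights)


def major_allosteric_sites(mutations, interface_cutoff=5.0):
    """Return positions outside the interface whose weighted mean |ddG binding|
    exceeds the weighted mean over all binding-interface mutations.

    Raises ZeroDivisionError if no mutation lies within the interface cut-off.
    """
    by_position = defaultdict(list)
    for mut in mutations:
        by_position[mut["pos"]].append(mut)

    interface = [m for m in mutations if m["ligand_dist"] < interface_cutoff]
    threshold = weighted_mean_abs([m["ddg_bind"] for m in interface],
                                  [m["sigma"] for m in interface])

    sites = []
    for pos in sorted(by_position):
        muts = by_position[pos]
        if muts[0]["ligand_dist"] < interface_cutoff:
            continue  # binding-interface residues are excluded by definition
        score = weighted_mean_abs([m["ddg_bind"] for m in muts],
                                  [m["sigma"] for m in muts])
        if score > threshold:
            sites.append(pos)
    return sites


if __name__ == "__main__":
    toy = [  # made-up values, not data from the study
        {"pos": 1, "ddg_bind": 1.0, "sigma": 0.1, "ligand_dist": 3.0},
        {"pos": 1, "ddg_bind": -1.0, "sigma": 0.1, "ligand_dist": 3.0},
        {"pos": 2, "ddg_bind": 1.5, "sigma": 0.1, "ligand_dist": 10.0},
        {"pos": 2, "ddg_bind": -1.5, "sigma": 0.1, "ligand_dist": 10.0},
        {"pos": 3, "ddg_bind": 0.1, "sigma": 0.1, "ligand_dist": 12.0},
    ]
    print(major_allosteric_sites(toy))
```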
Extended Data Fig. 8 Changes in fitness and free energy of binding and folding of major allosteric sites and allosteric mutations.
a, Scatter plots of single aa substitutions’ changes in free energy of binding and folding for the GB1 (left panel), GRB2-SH3 (middle panel) and PSD95-PDZ3 (right panel) protein domains. Variants are coloured by aa position if found in a major allosteric site. Free energies are from a single model; error bars indicate 95% CI from a Monte Carlo simulation approach (n = 10 experiments). b, Scatter plots comparing abundance and binding fitness of single aa substitutions in GRB2-SH3 (left panel) and PSD95-PDZ3 (right panel). Variants are coloured by aa position if found in a major allosteric site. Data are presented as mean values and error bars indicate 95% CI (n = 3 biological replicates). The red line indicates the model-derived relationship between abundance and binding fitness in the absence of a change in the free energy of binding. c, Scatter plots of single aa substitutions’ changes in free energy of binding and folding for GB1 (left panel), GRB2-SH3 (middle panel) and PSD95-PDZ3 (right panel). Variants are coloured by aa position if found in a major allosteric site (yellow) or in a position that has allosteric mutations (green). Free energies are from a single model; error bars indicate 95% CI from a Monte Carlo simulation approach (n = 10 experiments). d, Scatter plots comparing abundance and binding fitness of single aa substitutions in GRB2-SH3 (left panel) and PSD95-PDZ3 (right panel). Variants are coloured by aa position if found in a major allosteric site (yellow) or in a position that has allosteric mutations (green). Data are presented as mean values and error bars indicate 95% CI (n = 3 biological replicates). The red line indicates the model-derived relationship between abundance and binding fitness in the absence of a change in the free energy of binding.
Extended Data Fig. 9 Allosteric mutations in GB1 and enrichment of allosteric mutations in literature allosteric networks and specific residue types and classes.
a, Domain structure of GB1 with surface allosteric sites and surface residues with allosteric mutations highlighted in orange and green respectively. The FC domain of the human Immunoglobulin G is shown as black sticks. b, Scatter plot showing the binding free energy changes of all mutations and coloured according to residue position: allosteric site (orange), orthosteric site/mutation (red), core allosteric mutation (blue), surface allosteric mutation (green). c, Percentage of allosteric mutations per residue versus ligand proximity, excluding sites within the binding interface. Points are coloured according to residue position and major allosteric sites are indicated (see legend). ρ = Spearman rank correlation coefficient. d, Total numbers of mutations decreasing or increasing binding fitness (i.e. the fraction of bound protein complex) beyond the indicated minimum or maximum thresholds (x-axis; two-sided Z-test P < 0.05) respectively. e, Enrichment of allosteric mutations in sets of residues defined by previously reported allosteric networks in PSD95-PDZ3: McLaughlin et al. 2012 (ref. 39), Salinas et al. 2018 (ref. 64), Gerek et al. 2011 (ref. 65), Kumawat et al. 2017 (ref. 66), Gianni et al. 2011 (ref. 67), Kalescky et al. 2015 (ref. 68), Du et al. 2010 (ref. 69), Kaya et al. 2013 (ref. 70). The enrichment (log2 odds ratio) corresponding to a 2×2 contingency table is shown on the x-axis and the associated P value from a two-sided Fisher’s Exact Test is indicated. Residues within the binding interface (ligand distance < 5 Å) were ignored. Original literature allosteric network sizes are shown in parentheses. f, g, Same as (e) except sets of residues are defined by the identity of the WT or mutant amino acid (see legend) or their physicochemical properties (hydrophobic i.e. A, V, I, L, M, F, Y, W or charged i.e. R, H, K, D, E). Results are shown for all residues outside the binding interface (f) and further restricted to those residues in beta strands or helices i.e. not within loops/turns (g). Sets are ranked by their mean effect across the three protein domains.
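The enrichment statistic in panels e to g (log2 odds ratio from a 2×2 contingency table with a two-sided Fisher's exact test) can be sketched with the standard library alone. The table layout (a, b / c, d) and the function names are assumptions for illustration; the two-sided P value here sums the probabilities of all tables that are no more likely than the observed one, which is one common convention.

```python
import math


def log2_odds_ratio(a, b, c, d):
    """log2 of (a*d)/(b*c) for the table [[a, b], [c, d]].

    Undefined when b, c, a or d is zero; any pseudocount handling is left
    to the caller.
    """
    return math.log2((a * d) / (b * c))


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denominator = math.comb(total, col1)

    def table_probability(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denominator

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        p for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-7)  # tolerance for floating-point ties
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Made-up counts: rows = in set / not in set, columns = allosteric / not.
    print(log2_odds_ratio(3, 1, 1, 3))
    print(fisher_exact_two_sided(3, 1, 1, 3))
```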
Extended Data Fig. 10 Comparisons to computationally predicted allosteric coupling scores and mutational biases towards increased or decreased binding given the position in the domain structure.
a, Percentage of allosteric mutations per residue versus allosteric coupling scores estimated by a network-based perturbation propagation algorithm (ref. 40), where residues in the binding interface (ligand distance < 5 Å) are omitted as they represent the query set. Residues immediately adjacent to binding-interface residues in the linear aa sequence (i.e. backbone-backbone contacts which are disregarded by the Ohm algorithm) were given the maximum allosteric coupling score (1.0). Major allosteric sites (in yellow) and Spearman rank correlation coefficients (ρ) are indicated. b, Total numbers of mutations decreasing or increasing the free energy of binding beyond the indicated minimum or maximum thresholds (x-axis; two-sided Z-test P < 0.05) respectively, stratified by position in the structure considering all variants (regardless of the confidence of inferred free energies).
Supplementary information
Supplementary Methods
This file contains Methods, supplementary text, equations and additional references.
Supplementary Table 1
Primers used in this study.
Supplementary Table 2
Gene blocks used in this study.
Supplementary Table 3
Experimental details and numbers of the mutagenesis libraries in this study.
Supplementary Table 4
Illumina indexed primers combinations used in this study to demultiplex samples after deep sequencing.
Supplementary Table 5
Degenerate NNK oligonucleotides used for the GRB2-SH3 and PSD95-PDZ3 nicking mutagenesis libraries.
Supplementary Table 6
Fitness estimates for GB1, GRB2-SH3 and PSD95-PDZ3.
Supplementary Table 7
Inferred folding and binding free energy changes and associated annotations for GB1, GRB2-SH3 and PSD95-PDZ3.
About this article
Cite this article
Faure, A.J., Domingo, J., Schmiedel, J.M. et al. Mapping the energetic and allosteric landscapes of protein binding domains. Nature 604, 175–183 (2022). https://doi.org/10.1038/s41586-022-04586-4
DOI: https://doi.org/10.1038/s41586-022-04586-4