Most proteins assemble into multisubunit complexes [1]. The persistence of these complexes across evolutionary time is usually explained as the result of natural selection for functional properties that depend on multimerization, such as intersubunit allostery or the capacity to do mechanical work [2]. In many complexes, however, multimerization does not enable any known function [3]. An alternative explanation is that multimers could become entrenched if substitutions accumulate that are neutral in multimers but deleterious in monomers; purifying selection would then prevent reversion to the unassembled form, even if assembly per se does not enhance biological function [3–7]. Here we show that a hydrophobic mutational ratchet systematically entrenches molecular complexes. By applying ancestral protein reconstruction and biochemical assays to the evolution of steroid hormone receptors, we show that an ancient hydrophobic interface, conserved for hundreds of millions of years, is entrenched because exposure of this interface to solvent reduces protein stability and causes aggregation, even though the interface makes no detectable contribution to function. Using structural bioinformatics, we show that a universal mutational propensity drives sites that are buried in multimeric interfaces to accumulate hydrophobic substitutions to levels that are not tolerated in monomers. In a database of hundreds of families of multimers, most show signatures of long-term hydrophobic entrenchment. It is therefore likely that many protein complexes persist because a simple ratchet-like mechanism entrenches them across evolutionary time, even when they are functionally gratuitous.
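The ratchet logic summarized above can be illustrated with a deliberately simple toy model. This is a sketch under stated assumptions, not the analysis used in the paper: each buried interface site is treated as an independent two-state process (polar or hydrophobic), substitutions at buried sites are assumed neutral in the multimer, and the monomer is assumed to tolerate at most a fixed number of exposed hydrophobic residues. The rates `u` and `v`, the site count and the tolerance threshold are all hypothetical values chosen for illustration.

```python
from math import comb, exp


def hydrophobic_prob(t, u, v, p0=0.0):
    """Probability that one buried interface site is hydrophobic at time t.

    Toy two-state model (an assumption): polar -> hydrophobic at rate u,
    hydrophobic -> polar at rate v, starting from probability p0.
    """
    p_eq = u / (u + v)  # equilibrium set by the mutational propensity alone
    return p_eq + (p0 - p_eq) * exp(-(u + v) * t)


def monomer_tolerated(t, n_sites, max_hydrophobic, u, v, p0=0.0):
    """Probability that at most max_hydrophobic of n_sites are hydrophobic,
    i.e. that the interface could still be exposed to solvent at time t."""
    p = hydrophobic_prob(t, u, v, p0)
    return sum(
        comb(n_sites, k) * p**k * (1 - p) ** (n_sites - k)
        for k in range(max_hydrophobic + 1)
    )


if __name__ == "__main__":
    # Hypothetical parameters: 10 interface sites, monomer tolerates <= 3.
    for t in (0, 1, 5, 20):
        prob = monomer_tolerated(t, n_sites=10, max_hydrophobic=3, u=0.3, v=0.2)
        print(f"t={t:>2}  P(reversion tolerated)={prob:.3f}")
```

In this toy setting the probability that reversion to a monomer is tolerated starts at 1 and declines as buried sites drift toward the mutational equilibrium, which is the qualitative behaviour the abstract describes as entrenchment.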
Data have been deposited in the Open Science Framework (https://osf.io/) under accession GTJ86, including alignment, phylogeny, sequences and posterior probability of ancestral reconstructions; list of PDB identifiers for coordinates of dimers and monomers in our structural database; and molecular dynamics trajectories.
Scripts and code for structural bioinformatics analysis have been deposited at GitHub (https://github.com/JoeThorntonLab).
Marsh, J. A. & Teichmann, S. A. Structure, dynamics, assembly, and evolution of protein complexes. Annu. Rev. Biochem. 84, 551–575 (2015).
Goodsell, D. S. & Olson, A. J. Structural symmetry and protein function. Annu. Rev. Biophys. Biomol. Struct. 29, 105–153 (2000).
Lynch, M. Evolutionary diversification of the multimeric states of proteins. Proc. Natl Acad. Sci. USA 110, E2821–E2828 (2013).
Lukeš, J., Archibald, J. M., Keeling, P. J., Doolittle, W. F. & Gray, M. W. How a neutral evolutionary ratchet can build cellular complexity. IUBMB Life 63, 528–537 (2011).
Manhart, M. & Morozov, A. V. Protein folding and binding can emerge as evolutionary spandrels through structural coupling. Proc. Natl Acad. Sci. USA 112, 1797–1802 (2015).
Schank, J. C. & Wimsatt, W. C. Generative entrenchment and evolution. PSA: Proc. Biennial Meeting Philos. Sci. Assoc. 1986, 33–60 (1986).
Muller, H. J. Genetic variability, twin hybrids and constant hybrids, in a case of balanced lethal factors. Genetics 3, 422–499 (1918).
Moody, A. D., Miura, M. T., Connaghan, K. D. & Bain, D. L. Thermodynamic dissection of estrogen receptor-promoter interactions reveals that steroid receptors differentially partition their self-association and promoter binding energetics. Biochemistry 51, 739–749 (2012).
Tamrazi, A., Carlson, K. E., Daniels, J. R., Hurth, K. M. & Katzenellenbogen, J. A. Estrogen receptor dimerization: ligand binding regulates dimer affinity and dimer dissociation rate. Mol. Endocrinol. 16, 2706–2719 (2002).
Robblee, J. P., Miura, M. T. & Bain, D. L. Glucocorticoid receptor-promoter interactions: energetic dissection suggests a framework for the specificity of steroid receptor-mediated gene regulation. Biochemistry 51, 4463–4472 (2012).
Alroy, I. & Freedman, L. P. DNA binding analysis of glucocorticoid receptor specificity mutants. Nucleic Acids Res. 20, 1045–1052 (1992).
McKeown, A. N. et al. Evolution of DNA specificity in a transcription factor family produced a new gene regulatory module. Cell 159, 58–68 (2014).
Harms, M. J. et al. Biophysical mechanisms for large-effect mutations in the evolution of steroid hormone receptors. Proc. Natl Acad. Sci. USA 110, 11475–11480 (2013).
Eick, G. N., Colucci, J. K., Harms, M. J., Ortlund, E. A. & Thornton, J. W. Evolution of minimal specificity and promiscuity in steroid hormone receptors. PLoS Genet. 8, e1003072 (2012).
Fagart, J. et al. Crystal structure of a mutant mineralocorticoid receptor responsible for hypertension. Nat. Struct. Mol. Biol. 12, 554–555 (2005).
Kauppi, B. et al. The three-dimensional structures of antagonistic and agonistic forms of the glucocorticoid receptor ligand-binding domain: RU-486 induces a transconformation that leads to active antagonism. J. Biol. Chem. 278, 22748–22754 (2003).
Sack, J. S. et al. Crystallographic structures of the ligand-binding domains of the androgen receptor and its T877A mutant complexed with the natural agonist dihydrotestosterone. Proc. Natl Acad. Sci. USA 98, 4904–4909 (2001).
Williams, S. P. & Sigler, P. B. Atomic structure of progesterone complexed with its receptor. Nature 393, 392–396 (1998).
Bowie, J. U., Reidhaar-Olson, J. F., Lim, W. A. & Sauer, R. T. Deciphering the message in protein sequences: tolerance to amino acid substitutions. Science 247, 1306–1310 (1990).
Pakula, A. A. & Sauer, R. T. Reverse hydrophobic effects relieved by amino-acid substitutions at a protein surface. Nature 344, 363–364 (1990).
Valentine, J. E., Kalkhoven, E., White, R., Hoare, S. & Parker, M. G. Mutations in the estrogen receptor ligand binding domain discriminate between hormone-dependent transactivation and transrepression. J. Biol. Chem. 275, 25322–25329 (2000).
Ince, B. A., Zhuang, Y., Wrenn, C. K., Shapiro, D. J. & Katzenellenbogen, B. S. Powerful dominant negative mutants of the human estrogen receptor. J. Biol. Chem. 268, 14026–14032 (1993).
Xu, J., Nawaz, Z., Tsai, S. Y., Tsai, M. J. & O’Malley, B. W. The extreme C terminus of progesterone receptor contains a transcriptional repressor domain that functions through a putative corepressor. Proc. Natl Acad. Sci. USA 93, 12195–12199 (1996).
Zhang, S., Liang, X. & Danielsen, M. Role of the C terminus of the glucocorticoid receptor in hormone binding and agonist/antagonist discrimination. Mol. Endocrinol. 10, 24–34 (1996).
Ahnert, S. E., Marsh, J. A., Hernández, H., Robinson, C. V. & Teichmann, S. A. Principles of assembly reveal a periodic table of protein complexes. Science 350, aaa2245 (2015).
Finnigan, G. C., Hanson-Smith, V., Stevens, T. H. & Thornton, J. W. Evolution of increased complexity in a molecular machine. Nature 481, 360–364 (2012).
Force, A. et al. Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151, 1531–1545 (1999).
Gray, M. W., Lukeš, J., Archibald, J. M., Keeling, P. J. & Doolittle, W. F. Irremediable complexity? Science 330, 920–921 (2010).
Lynch, M. The frailty of adaptive hypotheses for the origins of organismal complexity. Proc. Natl Acad. Sci. USA 104 (Suppl. 1), 8597–8604 (2007).
Stoltzfus, A. On the possibility of constructive neutral evolution. J. Mol. Evol. 49, 169–181 (1999).
Hershberg, R. & Petrov, D. A. Evidence that mutation is universally biased towards AT in bacteria. PLoS Genet. 6, e1001115 (2010).
Hochberg, G. K. A. et al. Structural principles that enable oligomeric small heat-shock protein paralogs to evolve distinct functions. Science 359, 930–935 (2018).
Kaltenegger, E. & Ober, D. Paralogue interference affects the dynamics after gene duplication. Trends Plant Sci. 20, 814–821 (2015).
Edgar, R. C. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5, 113 (2004).
Darriba, D., Taboada, G. L., Doallo, R. & Posada, D. ProtTest 3: fast selection of best-fit models of protein evolution. Bioinformatics 27, 1164–1165 (2011).
Guindon, S. et al. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321 (2010).
Katsu, Y. et al. A second estrogen receptor from Japanese lamprey (Lethenteron japonicum) does not have activities for estrogen binding and transcription. Gen. Comp. Endocrinol. 236, 105–114 (2016).
Simakov, O. et al. Deeply conserved synteny resolves early events in vertebrate evolution. Nat. Ecol. Evol. 4, 820–830 (2020).
Philippe, H. et al. Acoelomorph flatworms are deuterostomes related to Xenoturbella. Nature 470, 255–258 (2011).
Cannon, J. T. et al. Xenacoelomorpha is the sister group to Nephrozoa. Nature 530, 89–93 (2016).
Bridgham, J. T. et al. Protein evolution by molecular tinkering: diversification of the nuclear receptor superfamily from a ligand-dependent ancestor. PLoS Biol. 8, e1000497 (2010).
Lemoine, F. et al. Renewing Felsenstein’s phylogenetic bootstrap in the era of big data. Nature 556, 452–456 (2018).
Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).
Cong, X. et al. Determining membrane protein-lipid binding thermodynamics using native mass spectrometry. J. Am. Chem. Soc. 138, 4346–4349 (2016).
Marty, M. T. et al. Bayesian deconvolution of mass and ion mobility spectra: from binary interactions to polydisperse ensembles. Anal. Chem. 87, 4370–4376 (2015).
Mazurenko, S. et al. CalFitter: a web server for analysis of protein thermal denaturation data. Nucleic Acids Res. 46 (W1), W344–W349 (2018).
Abraham, M. J. et al. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 1, 19–25 (2015).
Lindorff-Larsen, K. et al. Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins 78, 1950–1958 (2010).
Jorgensen, W. L. et al. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 79, 926–935 (1983).
Wang, J., Wolf, R. M., Caldwell, J. W., Kollman, P. A. & Case, D. A. Development and testing of a general amber force field. J. Comput. Chem. 25, 1157–1174 (2004).
Feenstra, K. A. et al. Improving efficiency of large time-scale molecular dynamics simulations of hydrogen-rich systems. J. Comput. Chem. 20, 786–798 (1999).
Larsson, P., Kneiszl, R. C. & Marklund, E. G. MkVsites: A tool for creating GROMACS virtual sites parameters to increase performance in all-atom molecular dynamics simulations. J. Comput. Chem. 41, 1564–1569 (2020).
Hess, B. et al. LINCS: a linear constraint solver for molecular simulations. J. Comput. Chem. 18, 1463–1472 (1997).
Miyamoto, S. & Kollman, P. A. Settle: An analytical version of the SHAKE and RATTLE algorithm for rigid water models. J. Comput. Chem. 13, 952–962 (1992).
Berendsen, H. J. C. et al. Molecular dynamics with coupling to an external bath. J. Chem. Phys. 81, 3684–3690 (1984).
Parrinello, M. & Rahman, A. Polymorphic transitions in single crystals: a new molecular dynamics method. J. Appl. Phys. 52, 7182–7190 (1981).
Bussi, G., Donadio, D. & Parrinello, M. Canonical sampling through velocity rescaling. J. Chem. Phys. 126, 014101 (2007).
Winn, M. D. et al. Overview of the CCP4 suite and current developments. Acta Crystallogr. D Biol. Crystallogr. 67, 235–242 (2011).
Tsai, C. J., Lin, S. L., Wolfson, H. J. & Nussinov, R. Studies of protein-protein interfaces: a statistical analysis of the hydrophobic effect. Protein Sci. 6, 53–64 (1997).
Tien, M. Z., Meyer, A. G., Sydykova, D. K., Spielman, S. J. & Wilke, C. O. Maximum allowed solvent accessibilites of residues in proteins. PLoS One 8, e80635 (2013).
Degiacomi, M. T., Schmidt, C., Baldwin, A. J. & Benesch, J. L. P. Accommodating protein dynamics in the modeling of chemical crosslinks. Structure 25, 1751–1757.e5 (2017).
Wang, D. GCevobase: an evolution-based database for GC content in eukaryotic genomes. Bioinformatics 34, 2129–2131 (2018).
Zhu, Y. O., Siegal, M. L., Hall, D. W. & Petrov, D. A. Precise estimates of mutation rate and spectrum in yeast. Proc. Natl Acad. Sci. USA 111, E2310–E2318 (2014).
Lee, H., Popodi, E., Tang, H. & Foster, P. L. Rate and molecular spectrum of spontaneous mutations in the bacterium Escherichia coli as determined by whole-genome sequencing. Proc. Natl Acad. Sci. USA 109, E2774–E2783 (2012).
Dumont, B. L. Significant strain variation in the mutation spectra of inbred laboratory mice. Mol. Biol. Evol. 36, 865–874 (2019).
Dettman, J. R., Sztepanacz, J. L. & Kassen, R. The properties of spontaneous mutations in the opportunistic pathogen Pseudomonas aeruginosa. BMC Genomics 17, 27 (2016).
We thank J. Bridgham for cell culture training and advice, A. Pillai for assistance with experiments, and members of the Thornton Laboratory for comments. Molecular dynamics computations were performed on resources provided by SNIC through Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX) under Projects SNIC 2019/8-36 and SNIC 2019/3-189. Supported by a Chicago Fellowship (G.K.A.H.), NIH R01GM131128 (J.W.T.) and R01GM121931 (J.W.T.).
The authors declare no competing interests.
Peer review information Nature thanks Douglas Theobald, Claus Wilke and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
a, Phylogeny of steroid receptors and related nuclear receptor family members. AR, androgen receptors; PR, progesterone receptors; GR, glucocorticoid receptors; MR, mineralocorticoid receptors. Sequence identifiers are in brackets. This topology corresponds to the ‘Chordate tree’ in Extended Data Fig. 2. Scale bar, expected substitutions per site. b, Sequence alignment of the human ER and GR LBDs, with the MAP sequences of AncSR1 and AncSR2. Green, C-terminal extension. Most ERs contain additional sequence on the C terminus that is unalignable, even among ERs.
a,b, Distribution of posterior probabilities (PP) of the maximum a posteriori (MAP) state at each site in reconstructed LBDs (top) and DBDs (bottom) of AncSR1 (a) and AncSR2 (b). c, Stoichiometry of purified alternative LBD reconstructions (AltAll) of AncSR1 (pink) and AncSR2 (green), as measured by SEC-MALS. AncSR1 is a dimer, AncSR2 a monomer. AltAll reconstructions contain the MAP state at unambiguously reconstructed sites and the state with the next highest PP at all ambiguously reconstructed sites. d, The ‘chordate’ phylogeny (top) was used for primary ancestral reconstructions; it places the gene duplication yielding ERs and kSRs within the chordates. An alternative, less parsimonious tree (‘Bilaterian’ because it places the duplication deep in the Bilateria, bottom) has very slightly higher likelihood but requires two additional gene losses (dashed lines). The Bilaterian topology was used for alternative reconstructions (AltPhy). Node labels, approximate likelihood ratio test statistic and transfer bootstrap value. lnl, log-likelihood. e, Distribution of per-site posterior probabilities for reconstructed LBDs on the Bilaterian topology for AncSR1 (top) and AncSR2 (bottom). f, Stoichiometry of purified AltPhy versions of AncSR1 (pink) and AncSR2 (green) LBDs, as measured by SEC-MALS. The average molar mass and elution time of AltPhy-AncSR1-LBD are between those of a dimer and a monomer, indicating that it is a fast-exchanging, weaker dimer than other AncSR1-LBD versions.
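The AltAll construction described in panel c can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `map_and_altall`, the input layout (one dictionary of posterior probabilities per site) and the ambiguity cutoff `alt_threshold=0.2` are assumptions, since the caption does not state the criterion used to call a site ambiguous.

```python
def map_and_altall(site_posteriors, alt_threshold=0.2):
    """Build MAP and AltAll sequences from per-site posterior probabilities.

    site_posteriors: list of dicts mapping amino acid -> posterior probability.
    alt_threshold: hypothetical cutoff; a site counts as ambiguous when its
    second-best state has a posterior probability above this value.
    Returns (map_sequence, altall_sequence).
    """
    map_seq, alt_seq = [], []
    for pp in site_posteriors:
        # Rank states from most to least probable at this site.
        ranked = sorted(pp.items(), key=lambda kv: kv[1], reverse=True)
        best_state = ranked[0][0]
        map_seq.append(best_state)
        if len(ranked) > 1 and ranked[1][1] > alt_threshold:
            # Ambiguous site: AltAll takes the next most probable state.
            alt_seq.append(ranked[1][0])
        else:
            # Unambiguous site: AltAll keeps the MAP state.
            alt_seq.append(best_state)
    return "".join(map_seq), "".join(alt_seq)


if __name__ == "__main__":
    # Made-up posteriors for three sites, for illustration only.
    example = [
        {"L": 0.98, "I": 0.02},
        {"A": 0.55, "S": 0.40, "T": 0.05},
        {"K": 1.0},
    ]
    print(map_and_altall(example))
```

Because AltAll changes every ambiguous site at once, it serves as a conservative test of whether a measured property, here stoichiometry, is robust to uncertainty in the reconstruction.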
Extended Data Fig. 3 Concentration-dependence of activation and dimerization by AncSR1-LBD and mutants.
a, Activation of AncSR1 from 40 ng ERE response element plasmid as a function of the AncSR1 plasmid concentration. Grey bar, concentration at which assays in Fig. 2f were performed. b, Molar fraction in the dimeric form measured by nMS as a function of LBD concentration for AncSR1-LBD (purple) and dimerization-interface mutants SR1-LBD(+3) (black) and SR1-LBD(L184E) (grey). Dissociation constant (Kd) estimated by nonlinear regression is indicated next to each curve. c, Dimeric fraction as a function of LBD concentration for AncSR1-LBD (purple) and activation-helix mutant SR1-LBD(L126Q) (grey), which affects activation but not dimerization.
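For readers who want to see how a dissociation constant can be extracted from curves like those in panels b and c, the sketch below fits a simple monomer–dimer equilibrium. It is an illustration under assumptions, not the authors' procedure: the dimeric fraction is taken to be the fraction of subunits in dimers, with Kd = [M]²/[D] and total subunit concentration [M] + 2[D]; concentrations and Kd share the same (unspecified) units; and a golden-section search on log10(Kd) stands in for whatever nonlinear regression routine was actually used. The function names are hypothetical.

```python
from math import log10, sqrt


def dimer_fraction(c_total, kd):
    """Fraction of subunits in dimers at total subunit concentration c_total.

    Solves 2[M]^2/Kd + [M] - c_total = 0 for the free monomer [M].
    """
    monomer = kd * (sqrt(1.0 + 8.0 * c_total / kd) - 1.0) / 4.0
    return 1.0 - monomer / c_total


def fit_kd(concs, fractions, lo=1e-6, hi=1e6, iters=200):
    """Least-squares estimate of Kd by golden-section search on log10(Kd)."""

    def sse(log_kd):
        kd = 10.0**log_kd
        return sum(
            (dimer_fraction(c, kd) - f) ** 2 for c, f in zip(concs, fractions)
        )

    a, b = log10(lo), log10(hi)
    ratio = (sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        left = b - ratio * (b - a)
        right = a + ratio * (b - a)
        if sse(left) < sse(right):
            b = right  # minimum lies in [a, right]
        else:
            a = left  # minimum lies in [left, b]
    return 10.0 ** ((a + b) / 2.0)


if __name__ == "__main__":
    # Synthetic, noise-free data generated with Kd = 2.0 (arbitrary units).
    concs = [0.1, 0.5, 1.0, 2.0, 5.0, 10.0, 50.0]
    observed = [dimer_fraction(c, 2.0) for c in concs]
    print(fit_kd(concs, observed))
```

Under this definition half of the subunits are dimeric when the total subunit concentration equals Kd, which gives a quick visual check of a fitted value against the measured curve.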
a, SEC of AncSR2 LBD (top) and mutants that delete the CTE (ΔCTE) or contain point mutations that impair CTE-LBD interactions (bottom), when fused to MBP. The mutants elute in the same fraction as AncSR2, demonstrating that they are monomeric and that re-exposing the patch does not re-establish dimerization. b, TEV cleavage of AncSR2 mutants in the absence (left) and presence (right) of 2% Triton X-100. The positions of bands corresponding to the uncleaved construct, cleaved MBP, cleaved LBD, and TEV protease are indicated. This experiment was performed twice, with similar results. See Supplementary Fig. 1 for uncropped gels. c, Average root mean square deviation (r.m.s.d.) from replicate 2-μs molecular dynamics simulations of AncSR2-LBD (WT) and ΔCTE mutant. The average Cα r.m.s.d. in pairwise comparisons of all simulations is shown as a heatmap. d, SEC-MALS trace of AncSR1-LBD fused to the CTE of AncSR2-LBD. The LBD is still dimeric.
Extended Data Fig. 5 Observed hydrophobicity of interfaces compared to expected hydrophobicity from mutation.
a, Difference between the fraction of residues that are hydrophobic in dimer interfaces versus that on solvent-exposed surfaces of the same proteins. The histogram shows the distribution of this difference across every protein in our structural database. b, Fraction of hydrophobic residues in dimer interfaces as a function of the number of interface residues. The variation in the fraction is caused mostly by very small interfaces. c, Expected equilibrium fraction of hydrophobic amino acids from mutation alone. Black: expectation based on GC content and the genetic code. Red dots and lines: mean and standard deviation of the hydrophobic fraction of residues observed in 200 replicate simulations using mutational spectra from mutation accumulation experiments (Fig. 4b), plotted against GC content of the organism tested. d, GC content of organisms represented by proteins in our database.
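One way to obtain an expectation of the kind shown by the black curve in panel c is sketched below. It is a simplified reading of "GC content and the genetic code", not necessarily the calculation used in the paper: codon frequencies are assumed to be products of independent base frequencies with G = C and A = T, stop codons are excluded, and the set of residues counted as hydrophobic (`HYDROPHOBIC`) is an assumption made for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: which residues count as hydrophobic is a choice, not a given.
HYDROPHOBIC = set("AVLIMFW")


def expected_hydrophobic_fraction(gc):
    """Expected hydrophobic fraction among sense codons at GC content gc.

    Assumes each codon position is drawn independently with
    P(G) = P(C) = gc / 2 and P(A) = P(T) = (1 - gc) / 2.
    """
    p = {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}
    hydrophobic_weight = 0.0
    sense_weight = 0.0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # stop codons do not encode a residue
        weight = p[codon[0]] * p[codon[1]] * p[codon[2]]
        sense_weight += weight
        if aa in HYDROPHOBIC:
            hydrophobic_weight += weight
    return hydrophobic_weight / sense_weight


if __name__ == "__main__":
    for gc in (0.3, 0.5, 0.7):
        print(gc, round(expected_hydrophobic_fraction(gc), 3))
```

With this residue set the expected hydrophobic fraction falls as GC content rises, because most hydrophobic codons carry T at the second position.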
Supplemental Data 1: Raw gel images. Uncropped gels for data presented in Extended Data Fig. 4b. Boxes are drawn around lanes that were used for the figure.
Supplemental Data 2: Scaled Q matrices based on mutation accumulation experiments. Row indicates the initial state, column the mutated state. a, M. musculus. b, S. cerevisiae. c, E. coli. d, P. aeruginosa.
Cite this article
Hochberg, G.K.A., Liu, Y., Marklund, E.G. et al. A hydrophobic ratchet entrenches molecular complexes. Nature 588, 503–508 (2020). https://doi.org/10.1038/s41586-020-3021-2