Understanding how chance historical events shape evolutionary processes is a central goal of evolutionary biology1,2,3,4,5,6,7. Direct insights into the extent and causes of evolutionary contingency have been limited to experimental systems7,8,9, because it is difficult to know what happened in the deep past and to characterize other paths that evolution could have followed. Here we combine ancestral protein reconstruction, directed evolution and biophysical analysis to explore alternative ‘might-have-been’ trajectories during the ancient evolution of a novel protein function. We previously found that the evolution of cortisol specificity in the ancestral glucocorticoid receptor (GR) was contingent on permissive substitutions, which had no apparent effect on receptor function but were necessary for GR to tolerate the large-effect mutations that caused the shift in specificity6. Here we show that alternative mutations that could have permitted the historical function-switching substitutions are extremely rare in the ensemble of genotypes accessible to the ancestral GR. In a library of thousands of variants of the ancestral protein, we recovered historical permissive substitutions but no alternative permissive genotypes. Using biophysical analysis, we found that permissive mutations must satisfy at least three physical requirements—they must stabilize specific local elements of the protein structure, maintain the correct energetic balance between functional conformations, and be compatible with the ancestral and derived structures—thus revealing why permissive mutations are rare. These findings demonstrate that GR evolution depended strongly on improbable, non-deterministic events, and this contingency arose from intrinsic biophysical properties of the protein.
Monod, J. Chance and Necessity: An Essay on the Natural Philosophy of Modern Biology (Vintage Books, 1972)
Gould, S. J. Wonderful Life: The Burgess Shale and the Nature of History (W. W. Norton & Company, 1990)
Losos, J. B., Jackman, T. R., Larson, A., Queiroz, K. & Rodríguez-Schettino, L. Contingency and determinism in replicated adaptive radiations of island lizards. Science 279, 2115–2118 (1998)
Morris, S. C. The Crucible of Creation: The Burgess Shale and the Rise of Animals (Oxford Univ. Press, 2000)
Beatty, J. Replaying life’s tape. J. Phil. 103, 336–362 (2006)
Ortlund, E. A., Bridgham, J. T., Redinbo, M. R. & Thornton, J. W. Crystal structure of an ancient protein: evolution by conformational epistasis. Science 317, 1544–1548 (2007)
Blount, Z. D., Borland, C. Z. & Lenski, R. E. Historical contingency and the evolution of a key innovation in an experimental population of Escherichia coli. Proc. Natl Acad. Sci. USA 105, 7899–7906 (2008)
Travisano, M., Mongold, J. A., Bennett, A. F. & Lenski, R. E. Experimental tests of the roles of adaptation, chance, and history in evolution. Science 267, 87–90 (1995)
Meyer, J. R. et al. Repeatability and contingency in the evolution of a key innovation in phage lambda. Science 335, 428–432 (2012)
Fisher, R. A. The Genetical Theory of Natural Selection (Oxford Univ. Press, 1958)
Martin, R. E. et al. Chloroquine transport via the malaria parasite’s chloroquine resistance transporter. Science 325, 1680–1682 (2009)
Field, S. F. & Matz, M. V. Retracing evolution of red fluorescence in GFP-like proteins from faviina corals. Mol. Biol. Evol. 27, 225–233 (2010)
Bloom, J. D., Gong, L. I. & Baltimore, D. Permissive secondary mutations enable the evolution of influenza oseltamivir resistance. Science 328, 1272–1275 (2010)
Lynch, V. J., May, G. & Wagner, G. P. Regulatory evolution through divergence of a phosphoswitch in the transcription factor CEBPB. Nature 480, 383–386 (2011)
Gong, L. I., Suchard, M. A. & Bloom, J. D. Stability-mediated epistasis constrains the evolution of an influenza protein. eLife 2, e00631 (2013)
Peisajovich, S. G. & Tawfik, D. S. Protein engineers turned evolutionists. Nature Methods 4, 991–994 (2007)
Romero, P. A. & Arnold, F. H. Exploring protein fitness landscapes by directed evolution. Nature Rev. Mol. Cell Biol. 10, 866–876 (2009)
Bledsoe, R. K., Stewart, E. L. & Pearce, K. H. Structure and function of the glucocorticoid receptor ligand binding domain. Vitamins Hormones 68, 49–91 (2004)
Moras, D. & Gronemeyer, H. The nuclear receptor ligand-binding domain: structure and function. Curr. Opin. Cell Biol. 10, 384–391 (1998)
Smith, J. M. Natural selection and the concept of a protein space. Nature 225, 563–564 (1970)
Carroll, S. M., Ortlund, E. A. & Thornton, J. W. Mechanisms for the evolution of a derived function in the ancestral glucocorticoid receptor. PLoS Genet. 7, e1002117 (2011)
Ding, X. F. et al. Nuclear receptor-binding sites of coactivators glucocorticoid receptor interacting protein 1 (GRIP1) and steroid receptor coactivator 1 (SRC-1): multiple motifs with different binding specificities. Mol. Endocrinol. 12, 302–313 (1998)
Chen, Z., Katzenellenbogen, B. S., Katzenellenbogen, J. A. & Zhao, H. Directed evolution of human estrogen receptor variants with significantly enhanced androgen specificity and affinity. J. Biol. Chem. 279, 33855–33864 (2004)
Bershtein, S., Segal, M., Bekerman, R., Tokuriki, N. & Tawfik, D. S. Robustness–epistasis link shapes the fitness landscape of a randomly drifting protein. Nature 444, 929–932 (2006)
Bloom, J. D., Labthavikul, S. T., Otey, C. R. & Arnold, F. H. Protein stability promotes evolvability. Proc. Natl Acad. Sci. USA 103, 5869–5874 (2006)
Bridgham, J. T., Ortlund, E. A. & Thornton, J. W. An epistatic ratchet constrains the direction of glucocorticoid receptor evolution. Nature 461, 515–519 (2009)
Tokuriki, N., Stricher, F., Schymkowitz, J., Serrano, L. & Tawfik, D. S. The stability effects of protein mutations appear to be universally distributed. J. Mol. Biol. 369, 1318–1332 (2007)
Bloom, J. D., Arnold, F. H. & Wilke, C. O. Breaking proteins with mutations: threads and thresholds in evolution. Mol. Syst. Biol. 3, 76 (2007)
Drummond, D. A., Iverson, B. L., Georgiou, G. & Arnold, F. H. Why high-error-rate random mutagenesis libraries are enriched in functional and improved proteins. J. Mol. Biol. 350, 806–816 (2005)
Polz, M. F. & Cavanaugh, C. M. Bias in template-to-product ratios in multitemplate PCR. Appl. Environ. Microbiol. 64, 3724–3730 (1998)
Harju, S., Fedosyuk, H. & Peterson, K. R. Rapid isolation of yeast genomic DNA: bust n’ grab. BMC Biotechnol. 4, 8 (2004)
Picard, D. & Yamamoto, K. R. Two signals mediate hormone-dependent nuclear localization of the glucocorticoid receptor. EMBO J. 6, 3333–3340 (1987)
R Development Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2011)
Schmidt, M. et al. General atomic and molecular electronic structure system (Gamess). J. Comput. Chem. 14, 1347–1363 (1993)
Granovsky, A. A. Firefly Version 7.1.G. http://classic.chem.msu.su/gran/firefly/index.html
Dupradeau, F.-Y. et al. The R.E.D. tools: advances in RESP and ESP charge derivation and force field library building. Phys. Chem. Chem. Phys. 12, 7821–7839 (2010)
Zoete, V., Cuendet, M. A., Grosdidier, A. & Michielin, O. SwissParam: a fast force field generation tool for small organic molecules. J. Comput. Chem. 32, 2359–2368 (2011)
Brooks, B. R. et al. CHARMM: a program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem. 4, 187–217 (1983)
Bjelkmar, P., Larsson, P., Cuendet, M. A., Hess, B. & Lindahl, E. Implementation of the CHARMM force field in GROMACS: analysis of protein stability effects from correction maps, virtual interaction sites, and water models. J. Chem. Theory Comput. 6, 459–466 (2010)
Spoel, D. V. D. et al. GROMACS: fast, flexible, and free. J. Comput. Chem. 26, 1701–1718 (2005)
Hess, B., Bekker, H., Berendsen, H. J. C. & Fraaije, J. G. E. M. LINCS: a linear constraint solver for molecular simulations. J. Comput. Chem. 18, 1463–1472 (1997)
Darden, T., York, D. & Pedersen, L. Particle mesh Ewald: an N log(N) method for Ewald sums in large systems. J. Chem. Phys. 98, 10089–10092 (1993)
Nosé, S. A molecular dynamics method for simulations in the canonical ensemble. Mol. Phys. 52, 255–268 (1984)
Hoover, W. G. Canonical dynamics: equilibrium phase-space distributions. Phys. Rev. A 31, 1695–1697 (1985)
Parrinello, M. & Rahman, A. Polymorphic transitions in single crystals: a new molecular dynamics method. J. Appl. Phys. 52, 7182–7190 (1981)
Nosé, S. & Klein, M. L. Constant pressure molecular dynamics for molecular systems. Mol. Phys. 50, 1055–1076 (1983)
Humphrey, W., Dalke, A. & Schulten, K. VMD: visual molecular dynamics. J. Mol. Graph. 14, 33–38 (1996)
Valley, C. C. et al. The methionine-aromatic motif plays a unique role in stabilizing protein structure. J. Biol. Chem. 287, 34979–34991 (2012)
We thank J. Bridgham, members of the Thornton laboratory, and B. Buckley McAllister for technical assistance and fruitful discussions. We thank M. Stallcup for sharing plasmids and the University of Oregon ACISS cluster for computing resources (National Science Foundation (NSF) OCI-0960354). This work was supported by National Institutes of Health (NIH) F32-GM090650 (M.J.H.), NIH R01-GM081592 (J.W.T.) and R01-GM104397 (J.W.T.), NSF IOB-0546906 (J.W.T.) and a Howard Hughes Medical Institute Early Career Scientist award (J.W.T.).
The authors declare no competing financial interests.
Extended data figures and tables
Extended Data Figure 1
a, Relationship between the amount of DNA in the mutagenesis reaction and the final mutation rate. Each point is an independently generated library, with its mutation rate estimated by sequencing between 5 and 24 clones. The error bar shows the expected standard error for an estimate of the mean of a Poisson distribution with the observed mean given the number of clones sequenced, calculated using the epicalc package in the R statistical environment. The library used for the screen is highlighted in red. b, Table showing the frequency of each possible nucleotide transition as a proportion of all mutations in the library (empirical) and predicted by the manufacturer (published). The standard error for an estimate of a proportion p given n samples was calculated as s.e. = √[p(1 − p)/n]. c, Fraction of clones in a library containing 0, 1, 2, 3 or more amino acid replacements given varying total mutation rates; points show experimentally measured fractions, and lines show the Poisson prediction. Error bars show standard errors, calculated as in a (for mutation rate) and b (for fraction). The box highlights frequencies of each class in the library used for the screen. d, Calculated library coverage for single (black) and double (red) substitutions for the library boxed in c. The dashed line shows the screening depth and completeness used in this study.
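The two standard errors and the Poisson prediction described in this caption can be illustrated with a minimal Python sketch. It is not the authors' code (they report using the epicalc package in R); the function names and any numbers passed to them are hypothetical, and the Poisson standard error is taken to be the textbook form √(mean/n).

```python
import math


def poisson_mean_se(mean_mutations, n_clones):
    """Standard error of a Poisson mean estimated from n_clones clones.

    Assumes the textbook form sqrt(mean / n); the caption does not spell
    out the formula used by the epicalc package.
    """
    return math.sqrt(mean_mutations / n_clones)


def proportion_se(p, n):
    """Standard error of a proportion p from n samples: sqrt(p(1 - p)/n)."""
    return math.sqrt(p * (1.0 - p) / n)


def poisson_class_fractions(rate, max_class=3):
    """Expected fractions of clones with 0, 1, ..., max_class-1 replacements,
    followed by the pooled fraction with max_class or more replacements."""
    fractions = [
        math.exp(-rate) * rate**k / math.factorial(k) for k in range(max_class)
    ]
    fractions.append(1.0 - sum(fractions))
    return fractions


if __name__ == "__main__":
    # Hypothetical library: 1.5 replacements per clone, 24 clones sequenced.
    print(poisson_mean_se(1.5, 24))
    print(proportion_se(0.25, 40))
    print(poisson_class_fractions(1.5))
```

The pooled last class mirrors the "3 or more" category in panel c; the other classes are exact Poisson probabilities.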
Extended Data Figure 2
a, The number of clones containing X amino acid replacements in a 95-clone sample of the variant library: ‘experimental’ shows the number of clones in each class observed in the actual library by sequencing, and ‘expected’ shows the number recovered in simulations of samples of clones produced in silico by a Poisson mutation process with the same mutation frequency and spectrum as the experimental library (see Methods for details). A χ² test (three degrees of freedom) was used to determine whether the observations deviated from the Poisson expectations. Classes of clones with three or four replacements were pooled to maintain adequate counts per cell; no observations were made or predicted with more replacements. b, Comparison of the number of unique amino acid replacements in classes defined by the number of clones X containing that replacement in a 95-clone sample of the experimental and Poisson-simulated libraries. Because of the low expected counts, we employed Fisher’s exact test for deviation from the Poisson expectation. c, Calculated probability of not observing a replacement in four or more clones out of 95 clones sampled (as occurred in our experimental sample of the library), given variable amounts of bias in the library, where bias ranges from 0.0 (no bias compared with Poisson expectation) to 1.0 (the same replacement is present in every clone). The probability drops below 5% at a bias of 0.064, providing a reasonable upper-bound estimate for the degree of bias in the library, given our observations.
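The goodness-of-fit comparison in panel a can be sketched as follows. This is an illustration under stated assumptions, not the published analysis: the expected counts here come from the analytic Poisson distribution rather than from in silico simulated libraries, the four classes are assumed to be 0, 1, 2 and a pooled tail of 3 or more replacements (which gives the three degrees of freedom mentioned in the caption), and the observed counts and mutation rate are hypothetical.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k events under a Poisson process."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def expected_counts(rate, n_clones):
    """Expected clones with 0, 1, 2 and 3+ replacements (tail pooled)."""
    probs = [poisson_pmf(k, rate) for k in range(3)]
    probs.append(1.0 - sum(probs))
    return [n_clones * p for p in probs]


def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic summed over classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of
    freedom, using the closed form available for this specific case."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(
        -x / 2.0
    )


if __name__ == "__main__":
    observed = [30, 34, 20, 11]          # hypothetical counts, 95 clones
    expected = expected_counts(1.2, 95)  # hypothetical mutation rate
    stat = chi_square_statistic(observed, expected)
    print(stat, chi_square_sf_df3(stat))
```

A large p-value would indicate no detectable departure from the Poisson expectation for the chosen classes.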
Extended Data Figure 3
a, Diagram of the two-hybrid primary screen for cortisol-specific activation of a mutant library of receptor LBDs. Each LBD is fused to the GAL4-DBD and transformed into yeast along with the GAL4-AD activation domain fused to the SRC-1 coactivator peptide (which binds to the active conformation of the LBD) and a selective reporter construct expressing the HIS3 gene, which is required for growth in the absence of histidine. b, LBD genotypes with different cortisol sensitivities can be distinguished by their growth in the two-hybrid primary screen. The plot shows D600 for yeast cultures as a function of cortisol concentration for AncGR1+F (black) and AncGR1+FP (red). Inset: colonies of AncGR1+F and AncGR1+FP grown on plates with no hormone/vehicle only (top panel, ethanol (EtOH)) or 1 μM cortisol (bottom panel). Points and error bars are mean and standard error from three technical replicates. This experimental result was reproduced many times with independent cultures. c–f, Full screen pipeline. Arrows denote the pipeline, with the number of positive clones recovered at each step shown in red. c, Representative plate from the primary screen for mutations that rescue AncGR1+F at 1 μM cortisol. d, Representative clones tested in the secondary screen for dose-responsive growth with increasing cortisol concentration. Each row shows the growth of six different clones from the primary screen and two reference clones; different rows show growth at increasing cortisol concentrations. The bottom row shows growth with no selection for receptor activity when histidine is supplied. Clone 1 grows better than genotype AncGR1+FP containing historical permissive mutations (green arrows); clone 6 grows worse than AncGR1+FP (yellow arrows). e, Two quality control steps were employed after the secondary screen to decrease false positives. f, Fold change in cortisol sensitivity measured with a luciferase reporter assay in mammalian cells for the 26 clones identified in the multistage screen. Sensitivity is defined as the ratio of the mutant and wild-type EC50 values. Columns and error bars indicate the mean and standard error of experimental replicates (grey circles). Historical P substitutions are shown with green bars; reversal of a historical F substitution is in red. Rescuing mutations are coloured by their location on the protein structure: near the ligand pocket (blue) or activation function helix AF-H (pink) (see Fig. 3c). Mutations that did not improve cortisol sensitivity in this assay are grey. Dots show statistical significance of the difference in fold activation relative to AncGR1+F (one dot, P < 0.05; two dots, P < 0.01).
Extended Data Figure 4 Single substitutions explain the sensitivity of clones with multiple substitutions.
Bars show fold improvement in cortisol sensitivity for every multi-substitution clone recovered from the library and engineered variants containing the individual substitutions. Columns and error bars indicate the mean and standard error of experimental replicates (grey points). Fold improvement is relative to AncGR1+F. Stars indicate the result of a one-tailed t-test (P < 0.01) assessing the difference between each mutant and AncGR1+F. Colours indicate the class of the clone: historical permissive substitutions (green) and rescuing mutations in the screen that are near the ligand pocket (blue), or activation function helix AF-H (pink).
Extended Data Figure 5 Molecular dynamics simulations reveal stabilization mechanisms of historical P mutations.
a, Snapshot of trajectory from AncGR2 simulation showing the hydrogen bond from atom OG1 of derived state Thr 26 to Val 214-O and packing of Leu 105 against the protein. b, Historical substitution n26T allows the formation of a new hydrogen bond. Radial distribution function of the distance to Val 214-O from ancestral residue Asn 26 (atom ND2) in simulation of AncGR2p (black) and from derived residue Thr 26 (atom OG1) in simulation of AncGR2 (red). Numbers show the fraction of time that a hydrogen bond was formed over each simulation using a 3.0-Å, 30° geometric criterion. The change in hydrogen-bond frequency was used to calculate ΔΔGHbond, the favourable effect of this historical substitution on hydrogen bond energy at 310 K. c, Historical substitution q105L improves packing interactions. Histogram of van der Waals contacts (3.5 Å cutoff) between residue 105 and other protein atoms for ancestral state Gln 105 in the AncGR2p simulations (black) and derived state Leu 105 in the AncGR2 simulations (red). d, Mutations have the same functional effects in the AncGR1+F and AncGR2p (AncGR2/N26t/Q105l) backgrounds, allowing interpretation of experiments in AncGR1+F using MD simulations starting from the AncGR2 crystal structure. Paired bars are changes in cortisol sensitivity for each mutation measured in the AncGR1+F (left) or AncGR2p (right) background. Columns and error bars indicate the mean and standard error of experimental replicates (grey points). There was no statistically significant (P < 0.05) difference in the effect of each mutation introduced in either the AncGR1+F or AncGR2p backgrounds, as assessed by a two-tailed t-test. To minimize type II errors, no multiple testing correction was performed.
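One way a change in hydrogen-bond frequency could be converted into a free-energy difference is sketched below. The caption does not give the formula, so the two-state (bonded versus not bonded) model used here is an assumption, as are the function name and the example fractions; only the temperature of 310 K comes from the caption.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872  # kcal/(mol*K)


def hbond_ddg(frac_ancestral, frac_derived, temperature=310.0):
    """Free-energy change (kcal/mol) implied by a shift in hydrogen-bond
    occupancy, ASSUMING a two-state model in which the bonded/non-bonded
    population ratio follows a Boltzmann distribution.

    Negative values mean the derived state favours the hydrogen bond.
    """
    for frac in (frac_ancestral, frac_derived):
        if not 0.0 < frac < 1.0:
            raise ValueError("occupancies must lie strictly between 0 and 1")
    odds_ancestral = frac_ancestral / (1.0 - frac_ancestral)
    odds_derived = frac_derived / (1.0 - frac_derived)
    return -GAS_CONSTANT_KCAL * temperature * math.log(odds_derived / odds_ancestral)


if __name__ == "__main__":
    # Hypothetical occupancies: 20% in the ancestral run, 50% in the derived run.
    print(hbond_ddg(0.2, 0.5))
```

Under this model, occupancies of exactly 0 or 1 give an undefined energy, which is why the sketch rejects them rather than returning infinity.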
Extended Data Figure 6 F and P mutations have opposite effects on melting temperature but do not affect expression.
a, Change in Tm induced by F and P mutations in the AncGR1 background. Colours indicate P (green) or F (red) substitutions. Bars indicate the mean change in Tm for triplicate measurements; error bars are standard error. We were unable to express and purify soluble AncGR1/f98I (n/a); comparing n26T/q105L with n26T/q105L/f98I shows that this substitution has a very strong destabilizing effect. b, Rescuing mutations do not alter LBD expression in AncGR1+F background. Figure shows a western blot of soluble proteins extracted from CHO-K1 cells, revealed with a polyclonal GAL4DBD antibody. Expression is similar for all constructs. The small amount of variation does not correlate with sensitivity or fold activation; for example, the non-functional protein AncGR1+F exhibits expression comparable to those of the highly active AncGR1+F+M222I and AncGR1+F+L231M proteins. Molecular masses (determined by standard marker) are indicated on the right. Red arrows highlight the expected molecular masses of the GAL4DBD and GAL4DBD–LBD fusion protein products. The background band (top) is a high-molecular-mass cross-reactive protein that indicates a similar global protein expression level across samples.
Extended Data Figure 7 In MD simulations, rescuing mutation Met 231 forms a sulphur–π interaction with Phe 206.
a, Snapshot from an MD simulation showing the location of the Met 231–Phe 206 stack at the C-terminal end of the AF-H (slate). b, Alternative view of the same snapshot, showing the relative orientation of Met 231 and Phe 206 as sticks. θ is defined as the angle between A (the vector normal to the Phe plane, extending from its centroid) and B (the vector connecting the Phe centroid to the Met 231 sulphur). The distance R is the length of vector B. c, Distribution of observed R over three independent 100-ns trajectories (9,200 snapshots in all). d, Distribution of observed θ over the same trajectories. The percentage at the top shows the fraction of time in which the interaction is formed by simple geometric criteria (R < 6 Å and 20° < θ < 60°).
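The geometric definition of R and θ in panel b can be made concrete with a short Python sketch. The coordinate inputs, function names and the way the ring normal is built (cross product of two centroid-to-atom vectors) are illustrative assumptions. Because a ring normal has no intrinsic sign, the sketch also assumes θ is folded into the range 0° to 90°; the caption does not say how the authors oriented vector A.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def sulphur_pi_geometry(ring_atoms, sulphur):
    """Return (R, theta_degrees) for a Met sulphur relative to a Phe ring.

    ring_atoms: six (x, y, z) tuples for the aromatic carbons, in ring order.
    R is the centroid-to-sulphur distance (length of vector B); theta is the
    angle between the ring normal (vector A) and B, folded into 0-90 degrees.
    """
    n = len(ring_atoms)
    centroid = tuple(sum(atom[i] for atom in ring_atoms) / n for i in range(3))
    # Normal from two non-parallel in-plane vectors (atoms 0 and 2).
    normal = _cross(_sub(ring_atoms[0], centroid), _sub(ring_atoms[2], centroid))
    b_vec = _sub(sulphur, centroid)
    r = math.sqrt(_dot(b_vec, b_vec))
    cos_theta = abs(_dot(normal, b_vec)) / (math.sqrt(_dot(normal, normal)) * r)
    theta = math.degrees(math.acos(min(1.0, cos_theta)))  # clamp rounding error
    return r, theta


def interaction_formed(r, theta):
    """Geometric criterion from the caption: R < 6 A and 20 < theta < 60."""
    return r < 6.0 and 20.0 < theta < 60.0


if __name__ == "__main__":
    # Idealized planar hexagon in the xy-plane and a hypothetical sulphur position.
    ring = [
        (1.4 * math.cos(math.radians(60 * k)), 1.4 * math.sin(math.radians(60 * k)), 0.0)
        for k in range(6)
    ]
    r, theta = sulphur_pi_geometry(ring, (3.0, 0.0, 3.0))
    print(r, theta, interaction_formed(r, theta))
```

Applying `interaction_formed` to every snapshot and averaging the result would give the kind of occupancy percentage reported in panel d.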
Extended Data Figure 8 In MD simulations, rescuing pair Q114L/M197I improves packing between H7 and H10.
a, A representative snapshot from the trajectory of AncGR2p+Q114L/M197I shows the favourable interaction of derived states Leu 114 and Ile 197 (spheres). Helices 7 (grey) and 10 (blue) are shown as solvent-accessible surfaces. b, A histogram of all van der Waals contacts (3.5 Å cutoff) between H7 and H10 for trajectories of AncGR2p (black) and AncGR2p+Q114L/M197I (red).
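A contact histogram of the kind shown in panel b could be assembled as in the sketch below. It is a simplified illustration: contacts are counted as atom pairs within the 3.5 Å cutoff (treating the cutoff as inclusive is an assumption), coordinates are plain (x, y, z) tuples rather than trajectory objects, and the function names are hypothetical.

```python
import math


def count_contacts(atoms_a, atoms_b, cutoff=3.5):
    """Count atom pairs, one atom from each group, within cutoff (in Å)."""
    return sum(
        1 for a in atoms_a for b in atoms_b if math.dist(a, b) <= cutoff
    )


def contact_histogram(snapshots, cutoff=3.5):
    """Map each contact count to the number of snapshots showing it.

    snapshots: iterable of (atoms_a, atoms_b) pairs, one per trajectory frame,
    for example the atoms of helix 7 and helix 10.
    """
    histogram = {}
    for atoms_a, atoms_b in snapshots:
        n_contacts = count_contacts(atoms_a, atoms_b, cutoff)
        histogram[n_contacts] = histogram.get(n_contacts, 0) + 1
    return histogram


if __name__ == "__main__":
    # Two hypothetical frames with made-up coordinates.
    frame1 = ([(0.0, 0.0, 0.0)], [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)])
    frame2 = ([(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)])
    print(contact_histogram([frame1, frame2]))
```

Comparing the histograms from two sets of trajectories, as the caption does for AncGR2p and AncGR2p+Q114L/M197I, then shows whether one variant samples more closely packed configurations.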
About this article
Cite this article
Harms, M., Thornton, J. Historical contingency and its biophysical basis in glucocorticoid receptor evolution. Nature 512, 203–207 (2014). https://doi.org/10.1038/nature13410