Hsp90 shapes protein and RNA evolution to balance trade-offs between protein stability and aggregation

Acquisition of mutations is central to evolution; however, the detrimental effects of most mutations on protein folding and stability limit protein evolvability. Molecular chaperones, which suppress aggregation and facilitate polypeptide folding, may alleviate the effects of destabilizing mutations thus promoting sequence diversification. To illuminate how chaperones can influence protein evolution, we examined the effect of reduced activity of the chaperone Hsp90 on poliovirus evolution. We find that Hsp90 offsets evolutionary trade-offs between protein stability and aggregation. Lower chaperone levels favor variants of reduced hydrophobicity and protein aggregation propensity but at a cost to protein stability. Notably, reducing Hsp90 activity also promotes clusters of codon-deoptimized synonymous mutations at inter-domain boundaries, likely to facilitate cotranslational domain folding. Our results reveal how a chaperone can shape the sequence landscape at both the protein and RNA levels to harmonize competing constraints posed by protein stability, aggregation propensity, and translation rate on successful protein biogenesis.

A cquisition of mutations is central to evolution but the detrimental effects of most mutations on protein folding and stability limit protein evolvability [1][2][3][4] . Molecular chaperones, which suppress aggregation and facilitate polypeptide folding 5 , are proposed to promote sequence diversification by buffering against the deleterious effects of destabilizing mutations [6][7][8][9] . However, whether and how chaperones control protein evolution remains poorly understood. RNA viruses offer an attractive model to examine such molecular mechanisms of evolution due to their relative simplicity and extreme evolutionary capacity. RNA viral polymerases generate mutations at almost the highest theoretically allowed rates 10,11 . The resulting genetic diversity of viral populations affords rapid adaptation to changing environments 12 . Recent findings link the high population diversity of RNA viruses to pathogenesis, likely due to the need to evolve rapidly in the infected organism 13 . As viral proteins meet folding and stability challenges akin to those of host proteins, and utilize the host-cell translation and folding machineries, viruses offer a unique opportunity to examine how chaperones shape protein evolution.
Poliovirus is a non-enveloped enterovirus that replicates in the cytosol of human cells. Infection leads to shutoff of host protein synthesis; thus the host translation and protein folding machineries are entirely dedicated to produce viral proteins 14 . Hsp90, an abundant ATP-dependent chaperone that facilitates the folding and maturation of many cellular metastable proteins [15][16][17] , only interacts with a single poliovirus protein, the P1 capsid precursor 18 . Hsp90 activity is required for P1 folding and subsequent proteolytic maturation into capsid proteins VP0, VP1, and VP3 18 . Importantly, P1 folding is the only process in the poliovirus replication cycle that requires Hsp90. Pharmacological inhibition of Hsp90 activity with the highly selective Hsp90 inhibitor Geldanamycin 19 (GA) specifically impairs P1 folding and maturation but does not affect viral protein translation nor the folding or function of other viral proteins 18 . The high diversity and evolutionary capacity of poliovirus, together with the fact that it harbors only a single protein that requires Hsp90 for folding, the capsid precursor P1, provide an ideal system to examine the role of Hsp90 in protein evolution (Fig. 1a).
Herein, we capitalize on this system to address how Hsp90 influences the evolution of the poliovirus P1 protein. We find that Hsp90 can modulate the fitness of variants in P1 under normal conditions, selecting against particular mutations. Moreover, we observe that under conditions of reduced chaperone function, poliovirus populations harbor more variants that destabilize P1 and less variants that increase aggregation propensity and hydrophobicity. Hence, Hsp90 appears to mediate a trade-off between key conflicting aspects of protein folding, stability, and aggregation propensity. Strikingly, we observe clusters of deoptimized synonymous codons at domain boundaries under conditions of low Hsp90 activity, linking chaperone function with translation kinetics, and cotranslational folding. Our results reveal that a key eukaryotic chaperone influences the sequence landscape of its client protein at both the protein and RNA levels, mediating the competing constraints posed by protein stability, aggregation propensity, and translation rate.

Results
Hsp90 influences the repertoire of antibody escape mutants. During infection, viral capsids are subject to intense selective pressure by the host immune system 20 . Viral protein diversity is the key to virus survival as it creates a reservoir of capsid variants enabling escape from neutralization by circulating antibodies. To assess whether Hsp90 influences viral capsid evolution, we examined the spectrum of neutralizing antibody escape variants in viral populations that were generated under normal conditions (Hsp90 N ) or following Hsp90 inhibition by treatment of infected cells with Geldanamycin (GA) (hsp90 i ; Fig. 1b). We speculated that if Hsp90 has a role in influencing the evolution of its client proteins, we would observe a difference in the type or frequency of mutations giving rise to antibody resistance between Hsp90 N and hsp90 i populations. Two neutralizing monoclonal antibodies (mAb423 and mAb427) targeting different epitopes in the poliovirus capsid 21 were employed, followed by sequencing across their respective epitope. Escape variants conferring resistance to neutralization were identified in both conditions, in two independent experiments (Fig. 1b). For mAb423, resistance in the Hsp90 N population mapped in both replicates to several different substitutions in two P1 sites: K402 and R412 (Fig. 1c, Supplementary Fig. 1a, Supplementary Table 1). In contrast, mAb423 resistance in the hsp90 i population was dominated by a single substitution in both replicates, K402E (p < 0.0001 by Fisher's test; Fig. 1c, Supplementary Fig. 1a, Supplementary Table 1). The R412W mutation, often encountered in the Hsp90 N population, was significantly disfavored in the hsp90 i population (p < 0.0001 by Fisher's test; Fig. 1c, Supplementary Fig. 1a, Supplementary Table 1).
For the other antibody, mAb427, escape neutralization variants observed in both Hsp90 N and hsp90 i populations were distributed across multiple sites in the epitope. Several distinct variants at position 235 (N235) were significantly enriched in the Hsp90 N population in both replicates (p = 0.0014 by Fisher's test; Fig. 1d, Supplementary Fig. 1b, Supplementary Table 2). In contrast, mutations at positions P239 and N216 were observed at high frequency in the hsp90 i population in replicate 1 and 2, respectively, but were nearly absent from the Hsp90 N population (p < 0.05 by Fisher's test for both N216 and P239; Fig. 1d, Supplementary Fig. 1b, Supplementary Table 2). These results suggest that Hsp90 influences the types of P1 mutations that can exist in poliovirus populations (P1 sequence space) by selecting both for and against specific sequence variants.
Hsp90 modulates the fitness of mutations in its client protein.
The near absence from the Hsp90 N population of mAb427 escape variants at sites N216 and P239 suggests that normal Hsp90 levels reduce the fitness of these sequence variants. To directly test the notion that Hsp90 negatively selects certain sequence variants, we assessed the relative fitness of viruses harboring either the hsp90 i selected variants at P1 sites 216 and 239, or the Hsp90 N variants observed under Hsp90-normal conditions. Viruses encoding the Hsp90 N sequences N216 and P239 were competed with viruses encoding the hsp90 i variants K216 or S239, under conditions of either normal or low Hsp90 activity (Fig. 1e). A mix with an initial ratio of 1:5 Hsp90 N to hsp90 i variant was serially passaged ten times with or without Hsp90 inhibitor in two independent experiments. Sequence analysis of passage 10 populations (Fig. 1e, Supplementary Fig. 2) showed that under normal Hsp90 conditions, the Hsp90 N variants N216 and P239 were strongly selected for in both independent experiments, completely dominating the population after 10 passages despite the initial excess of hsp90 i virus (Fig. 1e, normal Hsp90). In contrast, when passaged in the presence of Hsp90 inhibitor (Fig. 1e, low Hsp90), hsp90 i variant K216 was present in~50% of the population (Fig. 1e,  Supplementary Fig. 2a) and variant S239 completely dominated the population (Fig. 1e, Supplementary Fig. 2b). Importantly, the hsp90 i variants K216 and S239 do not show resistance to Hsp90 inhibition ( Supplementary Fig. 3), in agreement with previous findings that Hsp90 inhibitors are refractory to the development of drug resistance 18 . These findings demonstrate that Hsp90 levels influences the fitness of particular sequence variants in a client protein. Given the growing interest in Hsp90 inhibitors as broad-spectrum antivirals 18,22,23 , the effect of Hsp90 on the fitness of escape mutants from immune selection may have important therapeutic implications.
Hsp90 influences the diversity of poliovirus populations. To gain a better understanding of how Hsp90 shapes virus evolution, we next examined the diversity of poliovirus populations at unprecedented resolution using recent technological advancements in ultra-deep sequencing that provide a detailed description of the viral population mutation composition 24 . We considered three possible scenarios by which Hsp90 influences the functional sequence space of a protein (Fig. 2a). The first possibility, that Hsp90 has no role shaping sequence space (no-effect; Fig. 2a    Hsp90 activity influences viral capsid fitness. a Does Hsp90 activity influence the sequence space of its client, poliovirus capsid protein P1? The schematic indicates a theoretical sequence space for P1 in viral populations, with each branch indicating two sequence variants linked by a point mutation. Hsp90 N and hsp90 i represent viral populations generated under normal Hsp90 activity, or under reduced Hsp90 activity by treatment with Geldanamycin (GA). WT capsids are indicated in gray, mutant capsids in color, and question marks indicate unknown capsid mutants. b Outline of capsid variants selection assay using neutralizing antibodies mAb423 and mAb427. c, d Abundance and distribution of mAb423 (c) or mAb427 (d) escape variants identified in Hsp90 N and hsp90 i populations. Each graph represents the sum of all mAb-resistant variants identified in two independent experiments (see Supplementary Fig. 1), with the total number of mutants (n) indicated. e Evaluation of the relative fitness of the hsp90 i -enriched variants N216K and P239S identified in d under normal or Hsp90-inhibited conditions. An initial infection with a 1:5 ratio of Hsp90 N (N216 or P239) to hsp90 i variant (K216 or S239) virus was used, and viruses serially passaged under normal or Hsp90-inhibited conditions for 10 passages. The results of the competition assays were assessed by Sanger sequencing of the population, and is shown for passage 10. Significance: *p < 0.05, **p < 0.01, ***p < 0.005 by two-tail Fisher's exact test after multiple test correction using the false discovery rate (FDR) method subject less stable mutations to purifying selection due to the reduced fitness of these mutants in the absence of the chaperone 32,33 . This should favor a smaller sub-population of stable variants, resulting in reduced population diversity. Finally, a third possible model is that Hsp90 shapes the sequence space of its client proteins in a manner that has yet to be appreciated (dictates landscape; Fig. 2a, right panel).
Starting with a single viral clone, we sampled virus populations from eight different passages under conditions of normal Hsp90 activity (Hsp90 N ) or low Hsp90 activity (hsp90 i ) by treatment with the Hsp90 inhibitor GA 19 in two independent replicates (Fig. 2b). We then employed high fidelity CirSeq ultra-deep sequencing 24 to define the sequence composition of these poliovirus populations (Fig. 2b). Extensive sequence coverage was obtained for both replicas, with most sites in the coding sequence covered over 100,000 times (Fig. 2c, Supplementary   Fig. 4a). The consensus sequence was unchanged in all samples, as previously observed upon passaging poliovirus in the presence of Hsp90 inhibitors 18 . Furthermore, no global differences in the overall mutation rates of the Hsp90 N and hsp90 i virus populations were observed (Fig. 2d, p = 0.71 by Mann-Whitney-Wilcoxon (MWW) test, and Supplementary Fig. 4b). Similarly, no significant differences in overall evolutionary rates were observed for both synonymous (dS) and non-synonymous mutations (dN) as a function of Hsp90 activity level, either for Hsp90-client protein P1, or across the entire genome (Fig. 2e, dS: p = 0.21 and dN: p = 0.25 by MWW test; and Supplementary Fig. 4c). Thus, global evolutionary rates were not affected by Hsp90 activity, consistent with previous findings that Hsp90 inhibition does not affect the viral polymerase 18 . Interestingly however, protein mutational diversity in P1 was significantly higher in the hsp90 i condition, in contrast to the expectation from the buffering model   Fig. 2 Globally assessing the influence of Hsp90 on protein sequence space. a Hypothetical models for how Hsp90 influences the functional sequence space of poliovirus: no-effect (left), buffers mutations (center), or dictates landscape (right). Schemes illustrate the respective predicted outcome of Hsp90 inhibition on sequence space for each model. Gray lines indicate available mutational networks while black lines indicated those that are present under the different proposed scenarios. b Experimental strategy to examine the effects of Hsp90 inhibition on poliovirus sequence space. Poliovirus was propagated over eight passages under normal or low Hsp90 conditions via GA treatment. Population samples containing >10 8 variants each were collected from passages 2-8 and subjected to CirSeq deep sequencing and analysis. c Average sequence coverage obtained with CirSeq across the poliovirus genome for all populations Hsp90 N (blue trace) and hsp90 i (red traces). d Mutation rates for the full virus genome in Hsp90 N and hsp90 i populations. e Evolutionary rates for P1 in Hsp90 N and hsp90 i populations (full genome in Supplementary Fig. 4c). f Does Hsp90 buffering lead to reduced P1 sequence diversity in hsp90 i populations? Sequence entropy (left panel) and mutational diversity (right panel) for P1 coding region is higher for hsp90 i than for Hsp90 N populations analyzed by CirSeq. g Representation of distinct sequence diversity for P1 from Hsp90 N and hsp90 i virus populations. The original master sequence (red, middle circle) gives rise to variants (different colors and symbols) through mutation. For boxplots, the center line represents the median, the bound box the interquartile range, and the whiskers 1.5× the interquartile range. Significance: ns, not significant (p > 0.05); ***p < 0.005 by the Mann-Whitney-Wilcoxon test (Fig. 2f). Two alternative metrics, sequence entropy (Fig. 2f, p < 8 × 10 −26 by MWW test; and Supplementary Fig. 4d) and mutational diversity (1 − D) with D as Simpson's index of diversity (Fig. 2f, p < 8.7 × 10 −21 by MWW test; and Supplementary Fig. 4d) both indicated that reducing Hsp90 activity enhances population diversity (Fig. 2g). Together with the data from Fig. 1, these analyses suggest that Hsp90 influences the sequence landscape within the population (Fig. 2a, right panel).
Hsp90 influences the capsid sequence space. Molecular chaperones assist protein homeostasis by promoting folding upon translation, maintaining stability, and also preventing protein aggregation 5,34,35 (Fig. 3a). These processes are particularly relevant to viral capsids, which must be produced in high levels and assembled from numerous subunits, while avoiding aggregation 36,37 . Furthermore, once assembled, capsids must be sufficiently stable to protect the viral genome in harsh environments. To further understand the role of Hsp90 in shaping protein evolution, we compared the viral populations generated under normal (Hsp90 N ) or Hsp90-inhibited (hsp90 i ) conditions to identify variants that differed significantly in their effect on protein stability or aggregation propensity across passages (Fig. 3b). First, we used the experimentally-rooted and wellvalidated FoldX approach [38][39][40] to calculate the effects of all possible single point mutations on protein stability in the three viral proteins for which crystal structures are available, the capsid P1, the protease 3C and the RNA polymerase 3D (FoldX algorithm 38 see Methods and Supplementary Fig. 5a). We then compared Hsp90 N and hsp90 i populations to identify P1 sites that significantly differ in their destabilizing variant frequencies across passages (Site destab ). Next, we calculated the effect of mutations on sequence aggregation propensity employing the widely used TANGO algorithm [41][42][43] (see Methods and Supplementary  Fig. 5b) in these same viral proteins to uncover P1 sites that differ in their ability to tolerate variants of increased aggregation propensity (Sites agg ) between the Hsp90 N and hsp90 i population. For each of these analyses, we employed a robust statistical approach that examines the differences in the ability of each codon position to consistently accommodate destabilizing or aggregation-prone mutations across all passages (see Methods). Surprisingly, there were significantly more Site destab in the Hsp90client P1 under low Hsp90 activity ( Fig. 3c; p = 0.0009 by Cochran-Mantel-Haenszel (CMH) test; Supplementary Fig. 5d). Conversely, the number of Sites agg in P1 was significantly reduced under low Hsp90 activity (Fig. 3d, p = 0.00036 by CMH test; Supplementary Fig. 5e). The incidence of Sites agg and Site destab in the non-Hsp90 dependent viral proteins 3C and 3D was relatively unaffected by Hsp90 inhibition (Fig. 3c, d: p = 0.21, 3D: p = 1 by CMH test). These analyses indicate that Hsp90 preferentially affects the sequence space of its client P1. Paradoxically, unlike what we expected from the 'buffering' model, reducing Hsp90 activity increases the incidence of variants harboring destabilizing mutations as well as variants that reduce aggregation propensity, in support of a 'dictating landscape' model.
We next examined the distribution of Sites agg and Site destab with respect to the folded P1 structure, namely whether they map to buried (Brd), surface-exposed (Sfc), or interface (Int) positions (Fig. 3e). For both Sites agg and Site destab , major differences between Hsp90 N and hsp90 i mapped to buried (Brd) positions within the structure of the viral capsid ( Fig. 3e and Supplementary Fig. 5d, e). Under low Hsp90 conditions, buried (Brd) destabilizing variants were increased (Site destab ; p = 0.006 by Fisher's test) and aggregation-prone sites decreased (Sites agg ; p < 0.004 by Fisher's test) compared to surface-exposed ( Fig. 3e; Sfc) or interface regions ( Fig. 3e; Int). As buried regions are primarily exposed in non-native conformations, such as those generated during cotranslational folding, these findings indicate that Hsp90 activity may be critical to prevent aggregation during initial protein folding, rather than during assembly, for which interface residues are likely of greater relevance.   Hsp90 supports variants of increased hydrophobicity.
Disfavoring aggregation-prone variants should be beneficial for the virus, while favoring destabilizing mutations should be detrimental. How can the contradictory effects of low chaperone activity on these two key protein-folding properties be reconciled? We reasoned that hydrophobicity, a driving force in protein folding that influences both protein stability and aggregation 44,45 may explain these effects (Fig. 4a). Enhancing buried hydrophobicity can increase the stability of the folded core, but can also increase aggregation propensity. We thus examined if Hsp90 N and hsp90 i populations differ in their ability to accommodate sites with enhanced hydrophobicity (Sites hydro ; Fig. 4b). Strikingly, inhibiting Hsp90 significantly reduced the number of Sites hydro in P1 and to a lesser extent also in non-Hsp90 clients 3C and 3D (Fig. 4b). The effect on P1 was dramatic for both replicas (Fig. 4b, p = 5 × 10 −9 by CMH test, and Supplementary Fig. 5f) and observed both in buried, surface-exposed and interface sites ( Supplementary Fig. 5c). These results indicate that Hsp90 enables sequence variants of increased hydrophobicity to survive in the population, which can promote protein stability but also increase aggregation propensity.
To better understand the surprising finding that reduced chaperone function results in an increase of destabilizing variants, we next examined the properties of individual Site destab variants in Hsp90 N and hsp90 i viral populations. Site destab variants in P1 were examined with respect to their ability to change aggregation propensity (ΔAP) and hydrophobicity (ΔHyd) (Fig. 4c, d). Destabilizing variants observed under normal Hsp90 conditions (Hsp90 N ) were significantly more aggregation-prone than destabilizing variants present at reduced chaperone activity (hsp90 i ; Fig. 4c, d; p = 0.002 by MWW test, and Supplementary  Fig. 5g). Furthermore, Site destab variants in hsp90 i were significantly less hydrophobic than in Hsp90 N or those expected by chance (i.e., randomly selected mutations; Fig. 4c, p < 10 −7 by Location in P1 folded structure and effects on stability, aggregation propensity, and hydrophobicity are indicated. Mutations A195V and P263L (blue) observed in Hsp90 N , and L269Q (red) in hsp90 i . Structure from 2PLV, with VP2 monomer, and assembled capsid indicated. f Schematic effect of Hsp90 on sequence space. Hsp90 allows variants to explore a region in sequence space of increased hydrophobicity and aggregation propensity, but also of increased stability. Lower Hsp90 constrains sequence space variants to a region of reduced stability, hydrophobicity, and aggregation propensity. For boxplots, the center line represents the median, the bound box the interquartile range, and the whiskers 1.5× the interquartile range. Significance: *p < 0.05; **p < 0.01; ***p < 0.005 by two-tail Fisher's exact test or Mann-Whitney-Wilcoxon test MWW test, and Supplementary Fig. 5g). This indicates that Hsp90 N and hsp90 i accommodate different kinds of destabilizing mutations. Thus, destabilizing mutations in Hsp90 N have significantly increased sequence hydrophobicity and aggregation propensity compared to that expected by chance (Fig. 4c, left panel, p < 10 −5 by MWW test and Supplementary  Fig. 5g). In contrast, destabilizing mutations in hsp90 i have significantly decreased sequence hydrophobicity and aggregation propensity (Fig. 4c, right panel, p = 0.0007 by MWW test, and Supplementary Fig. 5g). More specifically, many destabilizing mutations in the Hsp90 N population resulted from the loss of proline residues (Fig. 4d, Supplementary Fig. 5h).
In contrast, destabilizing variants in the hsp90 i population were more diverse and reduced hydrophobicity by mutation to polar or charged residues (Fig. 4d, Supplementary Fig. 5h). We conclude that Hsp90 enables accumulation of destabilizing variants with increased aggregation propensity and hydrophobicity, while reducing Hsp90 activity promotes destabilizing variants that reduce hydrophobicity and aggregation propensity.
The interplay between Hsp90 action and individual mutations affecting protein stability, aggregation propensity and hydrophobicity is illustrated by a set of variants found at the interface between core beta strands in VP0, the N-terminal folding unit of P1 (Fig. 4e). For instance, a stabilizing mutation found in Hsp90 N , A195V, should be beneficial to protein folding and stability; however, this mutation also increases P1 aggregation propensity and hydrophobicity, and is thus strongly selected against in the hsp90 i population (p < 1 × 10 −4 by CMH test). Another Hsp90 N variant, P263L, also increases aggregation propensity and hydrophobicity but, in this case, reduces protein stability. This very detrimental mutation is strongly depleted under hsp90 i conditions (p < 1 × 10 −4 by CMH test). In contrast, L269Q, which reduces sequence hydrophobicity and does not alter aggregation propensity, is favored in hsp90 i , despite being a strongly destabilizing mutation (p < 10 −10 by CMH test). These analyses further illustrate that, for poliovirus P1 capsid protein, the dominant selective force under low Hsp90 activity is the reduction of hydrophobicity and aggregation propensity, even at a cost to protein stability. Our analysis exposes the role of Hsp90 in balancing an intrinsic conflict between stability and aggregation, shaping protein evolution (Fig. 4f). While increased hydrophobicity benefits the stability of the folded state, it also increases the aggregation propensity of folding intermediates, such as those generated during protein synthesis. Thus, chaperone dependence, which reduces aggregation in vivo, may allow exploration of regions within sequence space leading to more stable, and thus more versatile, proteins (Fig. 4f). This is consistent with the high aggregation propensity of obligate chaperone substrates in the cell 46,47 . Importantly, by favoring more stabilizing but also more aggregation-prone sequence variants, chaperones appear to restrict the sequence space and render proteins vulnerable to aggregation under conditions of chaperone impairment, such as cellular stress or aging.
Hsp90 activity can influence synonymous codon choice. Our analysis points to the importance of Hsp90 in preventing aggregation during initial protein folding (Fig. 3e). The vectorial nature of translation renders proteins particularly susceptible to aggregation 48 , since emerging domains that cannot yet fold will expose hydrophobic elements to the crowded cellular milieu 49,50 . Importantly, chaperones have a key role protecting nascent chains and facilitating their folding 51 . Since the speed of translation is proposed to be directly attuned to optimize cotranslational folding and processing of nascent chains 52-55 , we hypothesized that virus adaptation to low Hsp90 levels may involve optimization of the translation elongation rate. Some tRNAs are much more abundant than others and codon choice can have important effects on translation kinetics [56][57][58] (Fig. 5a). Accordingly, viruses have been found under direct selection for their synonymous codon usage 59 whose deregulation often brings direct fitness consequences [60][61][62] . We thus examined whether reducing Hsp90 levels affected the synonymous choice of codons in P1 using the established tRNA adaptation index 63 (tAI) for both the human genome and for HeLa cells used for virus passaging. Of note, the tAI has been shown to correlate with in vivo translation speed 57 . Overall, both Hsp90 N and hsp90 i populations demonstrate a clear trend to generally optimize overall codon usage to match that of the host-cell ( Fig. 5b; Hsp90 N : p < 10 −6 by MWW test; hsp90 i : p < 10 −6 ; Supplementary Fig. 6a,b). This indicates that poliovirus adapts to the host-cell to enhance overall protein production, establishing a direct link between viral infection and adaptation to the host-cell tRNA pool.
However, under conditions of Hsp90 inhibition, we also observed clusters of sites harboring synonymous mutations that deoptimized codons at specific locations along the P1 coding sequence (Fig. 5c and Supplementary Fig. 6c, d). Specifically, there were two major clusters of deoptimized codons in the hsp90 i population (Fig. 5c, highlighted in blue). Each cluster mapped to the N-terminal coding region of VP3 and VP1, respectively. The P1 polyprotein encompasses three domains that, upon completion of folding, are processed proteolytically to generate processed capsid subunits VP0, VP3, and VP1 37 . Interestingly, accounting for the length of the ribosomal exit tunnel, the deoptimized codon clusters were positioned to promote a local slow-down of translation elongation upon emergence of the complete VP0 and VP3 folding domains from the ribosomal exit site (Fig. 5c,  bottom panels). Importantly, pulse chase experiments show that Hsp90 inhibition does not affect global translation speeds in poliovirus infected cells 18 , supporting the idea that synonymous changes affect the translation rate locally. We propose that synonymous mutations are selected at lower Hsp90 activity to locally reduce translation speed at inter-domain boundaries to favor cotranslational folding. Consistent with this notion, the only member of the picornavirus family whose P1 protein does not require Hsp90 for folding, Hepatitis A virus, relies on highly deoptimized codon usage for the capsid coding region for fitness 64 . These findings suggest that achieving overall higher translation rates also requires a trade-off with efficient cotranslational folding, which can be balanced by chaperone action or in its absence, by selective codon de-optimization at specific sites (Fig. 5d).

Discussion
Although clearly essential for protein folding 15 , the role of Hsp90 in evolution has remained unclear 31 . Our study demonstrates that the chaperone Hsp90 helps shape the evolutionary path of a viral protein at the polypeptide and RNA levels. Hsp90 alters the protein sequence space by modulating trade-offs between protein stability, aggregation and hydrophobicity at the polypeptide level, and the trade-offs between translation speed and cotranslational folding at the RNA level. Hsp90 allows exploration of variants that increase aggregation propensity and hydrophobicity, which may also lead to increased protein stability (Fig. 6). In low Hsp90, the viral sequence space expands to explore variants that reduce aggregation propensity at the expense of stability.
Our experiments provide insights into how chaperones may have shaped protein evolution by modulating the energy landscape of protein folding. Upon synthesis, many globular proteins populate non-native states as well as aggregation-proneintermediates; of note, aggregates are often more stable than the native folded state 65 (Fig. 6). Chaperones guide the folding polypeptide through this landscape to prevent aggregation and favor the energetically stable folded conformation. Without chaperones, alternative solutions within sequence space must be explored that alleviate the pressure to reduce intrinsic aggregation propensity; this may preclude the emergence of more hydrophobic variants that stabilize the native state. These considerations may be particularly relevant in the crowded environment of the cell, as well as for topologically complex proteins such as viral capsids, which must assemble elaborate oligomeric structures 66  Energy Energy Fig. 6 Hsp90 modulates the energy landscape of its client protein. Role of Hsp90 in shaping protein evolution by modulating the energy landscape of protein folding and balancing trade-offs between protein stability, aggregation propensity, and folding. Hsp90 allows exploration of variants that increase aggregation propensity and hydrophobicity, which may also lead to increased protein stability. Under reduced Hsp90 conditions, the viral sequence space expands to explore variants that reduce aggregation propensity at the expense of stability. See text for details aggregation propensity is a property of all proteins 65 and we speculate that our results have general implications for further understanding how chaperones shaped the evolution of proteins. Our experiments also suggest that chaperone activity influences the RNA sequence space. Low Hsp90 activity promotes deoptimizing codons following translation of domains. Such modulation of local translation kinetics may enhance cotranslational folding by allowing more time for early folding intermediates to adopt correct conformations, avoid misfolding and aggregation, or by allowing more time for recruiting and interacting with chaperones. These findings highlight the close relationship between protein and RNA sequence space, and show that the evolutionary trajectory of both may be shaped by chaperones.
Not all proteins require chaperone assistance for folding 67 . Our results suggest that one benefit of acquiring chaperone dependence is to resolve evolutionary trade-offs that allow proteins to be translated faster and to occupy specific regions of sequence space. Chaperone dependence thus likely supported acquisition of complex folds that expand protein function, including metastable structures 17,68 , multimeric assemblies 5,69 , and multi-domain architectures 5,69 . On the other hand, chaperone substrates are constrained by chaperone-specific preferences and become sensitive to the levels and activity of that chaperone in different cells and environmental conditions. Chaperone activity can vary with stress, cell type, and even age 34 , and these changes may yield variance in the folding and activity of a chaperone-dependent protein.
In the particular case of viral proteins, where survival and adaptation is directly dependent on population diversity 13 , chaperones likely have a fundamental role in virus protein evolution, adaptation to specific environments, like tissues and stress conditions, and ultimately pathogenesis.

Methods
Strains and reagents. All experiments were conducted in HeLa S3 cells (ATCC, CCL-2.2) cultured in DMEM/F12 (Invitrogen) containing 10% FCS, penicillin and streptomycin and at 37°C. The Mahoney type 1 (WT) and Sabin type 1 strains of poliovirus were generated by electroporation of in vitro transcribed RNA into Hela S3 cells 18 . For viral infections, cells were incubated with the virus for 30 min at 37°C , after which media was added and infection allowed to continue until all cells were dead. Subsequently, two freeze-thaw cycles were performed to release intracellular viruses, cellular debris were removed by low speed centrifugation, and virus production was assessed by plaque assay. Geldanamycin (GA; LC labs) was diluted in DMSO and used at the indicated concentration. Monoclonal poliovirus antibodies were generously provided by Dr. Morag Ferguson at NIBSC, Blanche Lane, South Mimms, Potters Bar, Hertfordshire.
Antibody neutralizations and competition assays. For selection of antibody escape mutants,~10 5 (mAb427) or 10 6 -10 7 (mAb423) plaque forming units (PFU) from virus populations generated in the absence or presence of 1 µM GA were mixed with 1 µl of antibody for 1 h at 25°C and then used to infect confluent cells in 6 cm plates for 30 min. Cells were then washed and overlaid with DMEM/F12 containing 1% agar and 1:2000 dilution of antibody. Twenty-four to 30 h later, plaques were picked and amplified by inoculating onto fresh cells in 12 well plates. Upon CPE, viral RNA was extracted by Trizol or ZR Viral RNA kit (Zymo Research), RT-PCR performed with Thermoscript reverse transcriptase (Invitrogen) using random primers, and PCR performed using primers F676 (5′-GCTCCATTGAGTGTGTTTACTCTA-3′) and R3463 (5′-ATCATCCTGAGT GGCCAAGT-3′) and PfuUltra II Fusion HS (Stratagene). For competition experiments, 10 5 PFU of mutants were mixed with 2 × 10 4 pfu of WT poliovirus and passaged as above. p-values were calculated using a two-tailed Fisher's test and multiple testing correction using the false discovery rate (FDR) method 70 .
CirSeq sequencing. CirSeq was performed according to published protocols 24,71 . Briefly, Hela S3 cells were serially passaged at a multiplicity of infection (m.o.i.) of 0.1 with 10 6 PFU for 8 h (one replication cycle) in the presence (hsp90 i ) or absence (Hsp90 N ) of 0.3 µM Hsp90 inhibitor Geldanamycin. Each passage was amplified in Hela S3 cells at high m.o.i. and total cellular RNA was extracted at 6 h post infection. Viral RNA was enriched by poly(A) purification, fragmented, circularized and reverse transcribed to generate cDNAs containing tandemly repeated viral sequences. Sequencing libraries were generated from cDNA by second strand synthesis, end repair, dA-tailing and ligation of Illumina TruSeq indexed adaptors to the DNA ends. Sequencing was performed on a HiSeq2500 using HiSeq Rapid SBS V2 reagents and 300-310 bp single-end reads with indexing as appropriate. Reads were processed with the CirSeq software 24,71 , available at https://github.com/ashleyacevedo/CirSeq, to generate a consensus sequence of tandem repeats within each read. Duplicate reads with identical break points were filtered, and aligned reads converted to codon counts at each position in the coding region.
Cirseq analysis. The virus mutation rate was computed by analyzing the rate of nonsense mutations to stop codons 72 . The evolutionary rates dN and dS were computed with established count model of codon sequence evolution by Nei and Gojobori 73 . The notation of dN and dS was chosen to reflect our focus on the molecular evolution of poliovirus proteins even though we analyze virus populations 74 . To evaluate mutational diversity, two independent measures that quantify diversity were used. First, Shannon's Entropy, a metric from information theory that quantifies information content, was computed of the mutational space per position as where p i is the probability of observing mutation i among the mutations. Second, Simpson's Index of diversity 1 − D, a biodiversity measure from ecology, was used to assess mutational diversity per position as where n is the number of mutations of each type i, and N is the total number of all mutations. For both metrics, diversity profiles of the poliovirus coding sequence, considering all sites with at least 1 mutation, were computed for Hsp90 N and hsp90 i . Upon validation that the observed differences between Hsp90 N and hsp90 i are consistent for all passages, average profiles across passages are reported. Global differences in mutation rate, evolutionary rates, and mutational diversity were tested for statistical significance with the Wilcoxon-Mann-Whitney test.
To test how host-cell chaperone levels influence mutations in the virus populations that impact protein folding and stability, we considered three classes of mutations namely destabilizing mutations, mutations that increase the protein aggregation propensity, and mutations that significantly alter sequence hydrophobicity. The effect of all amino acid mutations on protein stability was predicted with FoldX 38 (foldxsuite.crg.es) for poliovirus proteins P1 (PDB structure 2PLV), 3C (4DCD), and 3D (4NLR). The effect of amino acid mutations on protein aggregation propensity was predicted with Tango 41 (tango.crg.es), and changes in amino acid hydrophobicity upon mutation were assessed by the Kyte-Doolittle scale normalized to zero-mean and unitary standard deviation 75 . Mutations were subsequently classified into effect (E) and no-effect (NE), i.e., destabilizing if the predicted stability effect ΔΔG > 1 kcal/mol, and otherwise non-destabilizing; as aggregation-prone if ΔAgg > 2, and otherwise non-aggregation-prone; and as hydrophobic if ΔHyd > 4, and otherwise non-hydrophobic else (Extended Data Fig. 4a, b). Solvent accessible surface area (ASA) for each residue was computed from the PDB structures with the DSSP program 76,77 . Residues were considered buried with ASA < 30, at the interface if the residue contributes ASA > 30 to the monomer interface in the assembled capsid, and as surface residue if ASA > 50 and not at the monomer interface.
To identify sites along the poliovirus sequence that differ significantly in their ability to accommodate destabilizing, aggregation-prone, or hydrophobic mutations, we next tested whether the mutation class E was statistically associated with the condition of either Hsp90 N or hsp90 i by constructing a 2 × 2 contingency table for each site composed of the mutation counts of E_Hsp90 N , NE_Hsp90 N , E_hsp90 i , and NE_hsp90 i . Finally, sites that are systematically enriched in destabilizing, aggregation-prone, or hydrophobic mutations across passages were identified by the CMH test on the 2 × 2 × 7 contingency table per site and all seven sequenced passages. We found this the most robust approach considering experimental design, the nature of population sequencing and the possibility of spurious adaptation due to laboratory passaging 78 . After correcting p-values with the FDR method 70 , significant sites are defined through p < 0.05 and counted Hsp90 N if the odd's ratio (OR) > 1, and hsp90 i for OR < 1. This identifies sites that are systematically, across passages, under selective pressure to differentially accommodate specific sets of mutations, namely destabilizing, aggregation-prone, or hydrophobic mutations as function of host-cell chaperone levels.
To identify individual mutations that give rise to significant sites as defined above, CMH test on the 2 × 2 × 7 contingency tables were performed for every possible codon mutation along the poliovirus codon sequence comparing the mutation count and reference sequence codon count between Hsp90 N and hsp90 i as well as across passages. After p-value correction with the FDR method, we identified individual significant mutations through p < 0.05, as well as the requirement of them being at a significant site and following the same trend as the significant site, i.e., if a site is enriched in destabilizing mutations at Hsp90 N , then the individual mutations at the site also has to be destabilizing and enriched at Hsp90 N .
The mutational effect on codon optimality was tested based on the established tRNA adaptation index (tAI) 63 that quantifies how much coding sequences are adapted to the cellular tRNA pool for efficient translation. Here the tAI for HeLa cells based on quantification of tRNA levels through tRNA microarrays 79 was used to compute a host-cell tAI. As a control, results were compared to those obtained with a tAI index computed from tRNA gene copy numbers in the human genome 80 ( Supplementary Fig. 6). To assess global changes in codon optimality, we computed the ΔtAI = tAI mutation − tAI reference for each sequenced mutation in Hsp90 N and hsp90 i and analyzed the distributions of ΔtAI. To specifically test for an enrichment in mutations to more slowly translated non-optimal codons, we considered the bottom 20% codons with the lowest tAI values as non-optimal. The CMH test was used to identify sequence positions associated with mutations to non-optimal codons at Hsp90 N and hsp90 i across passages.