Simulations of a protein fold switch reveal crowding-induced population shifts driven by disordered regions

Macromolecular crowding effects on globular proteins, which usually adopt a single stable fold, have been widely studied. However, little is known about crowding effects on fold-switching proteins, which reversibly switch between distinct folds. Here we study the mutationally driven switch between the folds of GA and GB, the two 56-amino acid binding domains of protein G, using a structure-based dual-basin model. We show that, in the absence of crowders, the fold populations PA and PB can be controlled by the strengths of contacts in the two folds, κA and κB. A population balance, PA ≈ PB, is obtained for κB/κA = 0.92. The resulting model protein is subject to crowding at different packing fractions, ϕc. We find that crowding increases the GB population and reduces the GA population, reaching PB/PA ≈ 4 at ϕc = 0.44. We analyze the ϕc-dependence of the crowding-induced GA-to-GB switch using scaled particle theory, which provides a qualitative, but not quantitative, fit of our data, suggesting effects beyond a spherical description of the folds. We show that the terminal regions of the protein chain, which are intrinsically disordered only in GA, play a dominant role in the response of the fold switch to crowding effects.

2 Supplementary Notes

As a starting point for the development of our dual-basin structure-based model, which we apply in this work to the $G_A$/$G_B$ switch, we take the single-basin model for protein folding developed in Ref. [2]. We start by describing this single-basin model along with a modification introduced here to enhance the conformational specificity of native contact interactions. We find that the enhanced contact specificity is necessary to make the two folds structurally well defined in the dual-basin model.

Geometrically, the protein is represented by beads located at the $C_\alpha$ atom positions. The conformation of an $N$-amino-acid chain can therefore be described by the bead positions $\mathbf{r}_i$, where $i = 1, ..., N$. Alternatively, a conformation can be described by the bond lengths, $b_i$, bond angles, $\theta_i$, and dihedral angles, $\phi_i$, defined by the $N - 1$ (pseudo) $C_\alpha$-$C_\alpha$ bonds of the chain. We denote by $b^0_i$, $\theta^0_i$, and $\phi^0_i$ the values of $b_i$, $\theta_i$ and $\phi_i$ in the native conformation. The potential energy $E$ can be written as a sum of five terms (Eq. 1), where $\epsilon$ sets the energy scale of the model and $r_{ij} = |\mathbf{r}_j - \mathbf{r}_i|$. The first three terms represent bonded interactions with strengths set to … $K^{(1)}_\phi = \epsilon$ and $K^{(3)}_\phi = 0.5\epsilon$. The fourth term represents steric repulsions between bead pairs that do not form a contact in the native structure. The repulsion range is set to $\sigma = 4$ Å. These first four terms in Eq. 1 are identical to Ref. [2].

Figure N1: A contact between two non-terminal positions, $i$ and $j$, (thick dashed line) has four different nearest neighbor-nearest neighbor distances (thin dashed lines): (A) $r_{i-1,j-1}$ and $r_{i+1,j+1}$ and (B) $r_{i-1,j+1}$ and $r_{i+1,j-1}$. In evaluating the factor …
The final term in Eq. 1 represents native contact interactions, which in the previous model [2] were described by the Lennard-Jones potential, $f_{\rm LJ}$. Here we separate the interaction into a repulsive part ($h_{ij}$) and an attractive part ($f_{ij}$), such that they can be independently controlled. The repulsive part is described by a Weeks-Chandler-Andersen type function, where $r^0_{ij}$ is the distance between beads $i$ and $j$ in the native structure. The attractive part takes the form given in Eq. 3. With the construct in Eq. 3, the distance $r_{ij}$ as well as the two nearest neighbor distances, $r'_{ij}$ and $r''_{ij}$ (see Figure N1), must assume their respective native values $r^0_{ij}$, $r'^{0}_{ij}$ and $r''^{0}_{ij}$ for $ij$ to become a fully formed native contact, which then contributes $-\epsilon$ towards the total potential energy $E$. The parameter $\xi_1$ sets the width of the attractive well $-\epsilon g_{\xi_1}(r_{ij})$. The combination of this attractive well and the repulsive part of the interaction results in a function, $h_{ij} - g_{\xi_1}$, with gross features similar to a Lennard-Jones potential (see Figure N2).

Figure N2: The potentials $h_{ij} - g_{\xi_1}$ and $f_{\rm LJ}$ (see text) as functions of $r_{ij}$ using $r^0_{ij} = 6$ Å.
The factor $g_{\xi_2}(r'_{ij})\,g_{\xi_2}(r''_{ij})$ is included in $f_{ij}$ in order to increase the conformational specificity of the native interactions. For a contact between residues $i$ and $j$, this factor promotes the local chain segments $(i-1, i, i+1)$ and $(j-1, j, j+1)$ to adopt relative orientations close to those found in the native structure. The strength of this effect is controlled by the parameter $\xi_2$. It is weak when $\xi_2 \gg \xi_1$ and becomes strong when $\xi_2 \approx \xi_1$. Test simulations on a few small single domain proteins show that decreasing $\xi_2$ leads to increased co-operativity in the folding transition (data not shown). We picked $\xi_1 = 1.0$ Å and $\xi_2 = 5.0$ Å. We note also that there are terms in Eq. 3 for which $r'_{ij}$ or $r''_{ij}$ is undefined because $i$ or $j$ is a terminal bead. In those cases, we set the corresponding factor $g$ equal to unity.
The effect from the factor $g_{\xi_2}(r'_{ij})\,g_{\xi_2}(r''_{ij})$ in Eq. 3 is similar to so-called local-nonlocal coupling [3], which also leads to increased folding co-operativity. Our effect is not exactly the same, however, because it does not provide a direct constraint on the local internal conformation around beads $i$ and $j$. Such a constraint does exist in local-nonlocal coupling.
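To make the role of the specificity factor concrete, the following minimal Python sketch evaluates an attractive contact term of the product form $g_{\xi_1}(r_{ij})\,g_{\xi_2}(r'_{ij})\,g_{\xi_2}(r''_{ij})$. The functional form of $g_\xi$ is not reproduced in this note, so a Gaussian well of width $\xi$ centered on the native distance is assumed here purely for illustration. The names `g` and `contact_attraction` are hypothetical, and the choice of which neighbor distances enter as $r'_{ij}$ and $r''_{ij}$ is left to the caller.

```python
import math

def g(r, r0, xi):
    """ASSUMED well shape: Gaussian of width xi centered at the native distance r0."""
    return math.exp(-((r - r0) ** 2) / (2.0 * xi ** 2))

def contact_attraction(r, r0, neighbors=(), xi1=1.0, xi2=5.0):
    """Return f_ij (between 0 and 1) for one native contact.

    r, r0     : current and native distance between beads i and j (Angstrom)
    neighbors : iterable of (current, native) values for r'_ij and r''_ij;
                a distance that is undefined for a terminal bead is left out,
                which is equivalent to setting its factor g to unity
    xi1, xi2  : widths taken from the text (1.0 and 5.0 Angstrom)
    """
    f = g(r, r0, xi1)
    for r_nb, r0_nb in neighbors:
        f *= g(r_nb, r0_nb, xi2)
    return f

# A fully native contact versus one whose local segments are misoriented.
native = contact_attraction(6.0, 6.0, [(7.0, 7.0), (8.0, 8.0)])
twisted = contact_attraction(6.0, 6.0, [(10.0, 7.0), (8.0, 8.0)])
terminal = contact_attraction(6.0, 6.0)  # both neighbor factors set to unity
```

In this sketch the contact at its native distance only reaches full strength when the neighbor distances are native as well, so `twisted` is smaller than `native`.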

Dual-basin structure-based model for fold switching
Next we extend the model of the previous section to a dual-basin (db) model, which provides bias towards two different reference structures "(a)" and "(b)". Such a bias can be achieved by first obtaining the two single-basin energy potentials $E^{(a)}$ and $E^{(b)}$ using Eq. 1, and thereafter merging them into a single energy surface, $E^{(db)}$. Naively, one may attempt to put $E^{(db)} = E^{(a)} + E^{(b)}$. However, this strategy is problematic for some types of interactions, as pointed out by Ramirez-Sarmiento et al. [4]. For example, the sum of two quadratic bond terms, proportional to $(b_i - b^{(a)}_i)^2 + (b_i - b^{(b)}_i)^2$, where $b^{(a)}_i$ and $b^{(b)}_i$ denote the native bond lengths in (a) and (b), is another quadratic function with minimum at $(b^{(a)}_i + b^{(b)}_i)/2$. Hence, this would abolish both minima. We combine the two single-basin potentials $E^{(a)}$ and $E^{(b)}$ using the procedure described below, which avoids these problems. This procedure is then applied to the $G_A$ and $G_B$ folds to produce the dual-basin potential used in this work.
Bonded terms. The bonded interactions are represented by the first three terms in Eq. 1. Consider two individual energy terms, $e^{(a)}(x)$ and $e^{(b)}(x)$, with global minimum at $x = x_a$ and $x = x_b$, respectively. The functions $e^{(a)}(x)$ and $e^{(b)}(x)$ could be, e.g., the bond angle terms corresponding to a particular bond, in which case $x = \theta_i$. To "mix" $e^{(a)}(x)$ and $e^{(b)}(x)$ into a single function $e(x)$, we use [5]

$$e(x) = -\beta_{\rm mix}^{-1} \ln\left[ e^{-\beta_{\rm mix} e^{(a)}(x)} + e^{-\beta_{\rm mix} e^{(b)}(x)} \right], \qquad (4)$$

where $\beta_{\rm mix}$ is a parameter controlling the smoothness of the mixing. We pick $\beta_{\rm mix} = 10$ for the bond term, and $\beta_{\rm mix} = 5$ for the angle and torsion terms. Examples of three different terms for the $G_A$ and $G_B$ folds are given in Figure N3.
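The difference between naively adding two bonded terms and mixing them can be checked numerically. The sketch below is an illustration only: it uses two hypothetical harmonic angle terms with an assumed stiffness of 20 (in units of $\epsilon$) and assumed minima at 1.5 and 2.0 rad, none of which are taken from the model. The mixing is written as a negative log-sum-exp, which is the smooth-minimum form of the mixing expression with the leading minus sign.

```python
import numpy as np

def mix(e_a, e_b, beta_mix):
    """Smooth minimum: -(1/beta) * ln(exp(-beta*e_a) + exp(-beta*e_b))."""
    return -np.logaddexp(-beta_mix * e_a, -beta_mix * e_b) / beta_mix

# Hypothetical harmonic terms (stiffness and minima are assumptions).
x = np.linspace(1.0, 2.5, 1501)
x_a, x_b, stiffness = 1.5, 2.0, 20.0
e_a = stiffness * (x - x_a) ** 2
e_b = stiffness * (x - x_b) ** 2

e_sum = e_a + e_b            # naive sum of the two terms
e_mix = mix(e_a, e_b, 5.0)   # mixed term with beta_mix = 5

# Location of the single minimum of the naive sum.
x_min_sum = x[np.argmin(e_sum)]

# Local minima of the mixed curve: interior points lower than both neighbors.
is_min = (e_mix[1:-1] < e_mix[:-2]) & (e_mix[1:-1] < e_mix[2:])
x_min_mix = x[1:-1][is_min]
```

With these assumed parameters the naive sum has its only minimum midway between the two reference values, whereas the mixed term keeps one minimum near each of them. The mixed term also never exceeds the lower of the two input terms, and it approaches that lower envelope as `beta_mix` grows.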
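Non-bonded terms. For the native contact term, we include all contact interactions present in either $E^{(a)}$ or $E^{(b)}$. Although this is straightforward in principle, care must be taken to avoid double counting interactions for common contacts, i.e., contacts that occur in both (a) and (b). Moreover, we want to insert parameters $\kappa_A$ and $\kappa_B$ such that the strengths of the attractive wells, $-\epsilon f^{(a)}_{ij}$ and $-\epsilon f^{(b)}_{ij}$, can be controlled. The native contact term becomes

$$\epsilon \sum_{ij \in (a)} \left[ h^{(a)}_{ij} - \kappa_A f^{(a)}_{ij} \right] + \epsilon \sum_{ij \in (b)} \left[ h^{(b)}_{ij} - \kappa_B f^{(b)}_{ij} \right] + \epsilon \sum_{ij \in (a) \cap (b)} \left[ \tilde{h}_{ij} - \max\left( \kappa_A f^{(a)}_{ij}, \kappa_B f^{(b)}_{ij} \right) \right],$$

where the first two sums are taken over native contacts in (a) and native contacts in (b), respectively, excluding all common contacts, and the final sum is taken over these common contacts. Note that, for each common contact, only the energetically most favorable attraction is retained. The repulsive part, $\tilde{h}_{ij}$, is evaluated as $h_{ij}$ using the smallest of the two reference distances, i.e., $r^0_{ij} = \min\left( r^{0(a)}_{ij}, r^{0(b)}_{ij} \right)$.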

Figure S1: (a) $G_A$, $G_B$, and total $G_A$ and $G_B$ ($P_{\rm tot}$) fold populations for the protein $G^*_{AB}$, i.e., $\kappa_B = 0.92$, as a function of temperature $T$. Smooth curves are obtained from MBAR analysis [1]. (b) $\kappa^*$ as a function of $T$, obtained from simulations with $\kappa_B$ = 0.915, 0.917, 0.920, 0.923 and 0.925. (c) $P_{\rm tot}$ as a function of $T$, for the five $\kappa_B$ values in (b). Shown in (a) is also $P_{\rm tot}$ for $\kappa_B = 0.915$ (short dashed curve) and $\kappa_B = 0.925$ (long dashed curve).

Figure S2: Crowder concentration ($\phi_c$) dependence of the free energy of fold switching ($\Delta F_{\rm switch}$) for the protein $G^*_{AB}$ using four different potential energy functions: $E^{(db)}$ (original dual-basin model; see section 2.2 of this document), $E^{(db)}_{\rm mod,1-7}$ (crowder-bead interactions turned off for the segment 1-7), $E^{(db)}_{\rm mod,53-56}$ (crowder-bead interactions turned off for the segment 53-56), and $E^{(db)}_{\rm mod}$ (crowder-bead interactions turned off for both 1-7 and 53-56). Also shown, for $\phi_c > 0$, are the results obtained by adding the changes in $\Delta F_{\rm switch}$ for $E^{(db)}_{\rm mod,1-7}$ …

Figure S3: Free energy as a function of the number of $G_A$ contacts and the number of $G_B$ contacts, obtained for $G^*_{AB}$ at $k_B T_0 = 0.88$. In determining fold populations, we consider the $G_A$ fold formed if the number of $G_A$ contacts is > 58, and we consider the $G_B$ fold formed if the number of $G_B$ contacts is > 76 (shaded areas).
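The contact-count criteria in the Figure S3 caption translate directly into a population estimate. The following minimal sketch assumes equally weighted conformations sampled at a single temperature, so it does not include the MBAR reweighting used for the smooth curves. The function name and the input arrays are hypothetical, and the total population is taken as the sum of the two, which assumes that no conformation satisfies both criteria.

```python
import numpy as np

def fold_populations(n_a, n_b, cut_a=58, cut_b=76):
    """Fractions of sampled conformations assigned to the G_A and G_B folds.

    n_a, n_b : number of formed G_A and G_B contacts for each conformation
    A conformation counts as G_A if n_a > cut_a and as G_B if n_b > cut_b.
    Returns (P_A, P_B, P_tot), with P_tot = P_A + P_B (assumes no overlap).
    """
    n_a = np.asarray(n_a)
    n_b = np.asarray(n_b)
    p_a = float(np.mean(n_a > cut_a))
    p_b = float(np.mean(n_b > cut_b))
    return p_a, p_b, p_a + p_b

# Four made-up conformations: two in G_A, one in G_B, one in neither fold.
p_a, p_b, p_tot = fold_populations([60, 10, 59, 20], [5, 80, 10, 30])
```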

Figure S4: Pair correlation function, $g(r)$, obtained from a simulation of 1755 crowder particles in a cubic box with side 300 Å and periodic boundary conditions. The temperature is held fixed at $T_0$ ($k_B T_0 = 0.88$). An effective radius of the crowders ($R_c = 12.5$ Å) was determined based on the location of the first peak in $g(r)$, which occurs at the interparticle distance $r = 25$ Å (vertical dashed line).

2.1 Development of computational model for the $G_A$/$G_B$ fold switch

Single-basin structure-based model for protein folding

As a starting point for the development of our dual-basin structure-based model, which we …

Figure N3: Examples of the merging of different bonded potentials for $G_A$ and $G_B$ (thin black solid/dashed curves) into a single potential (thick solid green curves) using the "mixing" equation, Eq. 4.

Picking $r^0_{ij}$ this way is necessary to guarantee that both conformations (a) and (b) can be formed without suffering a strong steric repulsion in one of the conformations, which would otherwise happen when the reference distances in (a) and (b) differ. Note also that $r^0_{ij}$ for common contacts can be calculated before a simulation and that $\tilde{h}_{ij}$ does not change form during the simulation. The nonnative repulsive energy term, i.e., the fourth term in Eq. 1, is evaluated over all pairs $ij$ that are not contacts in either (a) or (b).
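The bookkeeping for the merged native contact term can be sketched as follows: contacts unique to (a) or (b) are scaled by $\kappa_A$ or $\kappa_B$, each common contact keeps only the more favorable of the two attractions, and the repulsive reference distance of a common contact is the smaller of the two native distances. This is an illustrative sketch under stated assumptions, not the simulation code. The function names are hypothetical, the attractive factors and repulsive values are assumed to be precomputed for the current conformation and passed in as dictionaries keyed by bead pair, and a common prefactor $\epsilon$ on both parts is assumed.

```python
def merged_reference_distance(r0_a, r0_b):
    """Reference distance used for the repulsive part of a common contact."""
    return min(r0_a, r0_b)

def merged_contact_energy(f_a, f_b, h, kappa_a, kappa_b, eps=1.0):
    """Dual-basin native contact energy for one conformation.

    f_a, f_b : {(i, j): attractive factor} for native contacts of (a) and (b)
    h        : {(i, j): repulsive value} for every pair in f_a or f_b; for
               common contacts it is assumed to be evaluated with
               merged_reference_distance
    """
    common = f_a.keys() & f_b.keys()
    energy = 0.0
    for pair in f_a.keys() - common:   # contacts present only in (a)
        energy += eps * (h[pair] - kappa_a * f_a[pair])
    for pair in f_b.keys() - common:   # contacts present only in (b)
        energy += eps * (h[pair] - kappa_b * f_b[pair])
    for pair in common:                # counted once, strongest attraction kept
        attraction = max(kappa_a * f_a[pair], kappa_b * f_b[pair])
        energy += eps * (h[pair] - attraction)
    return energy

# Made-up example: one contact unique to each fold and one common contact.
f_a = {(1, 5): 1.0, (2, 8): 0.5}
f_b = {(2, 8): 1.0, (3, 9): 1.0}
h = {(1, 5): 0.0, (2, 8): 0.0, (3, 9): 0.0}
energy = merged_contact_energy(f_a, f_b, h, kappa_a=1.0, kappa_b=0.92)
r0_common = merged_reference_distance(6.0, 5.2)
```

Because the common pair (2, 8) enters only once, strengthening one fold's contacts through $\kappa_A$ or $\kappa_B$ does not double count its attraction.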