The mechanochemistry of copper reports on the directionality of unfolding in model cupredoxin proteins

Understanding the directionality and sequence of protein unfolding is crucial to elucidate the underlying folding free energy landscape. An extra layer of complexity is added in metalloproteins, where a metal cofactor participates in the correct, functional fold of the protein. However, the precise mechanisms by which organometallic interactions are dynamically broken and reformed on (un)folding are largely unknown. Here we use single molecule force spectroscopy AFM combined with protein engineering and MD simulations to study the individual unfolding pathways of the blue-copper proteins azurin and plastocyanin. Using the nanomechanical properties of the native copper centre as a structurally embedded molecular reporter, we demonstrate that both proteins unfold via two independent, competing pathways. Our results provide experimental evidence of a novel kinetic partitioning scenario whereby the protein can stochastically unfold through two distinct main transition states placed at the N and C termini that dictate the direction in which unfolding occurs.

is calculated by subtracting the initial folded length (Fl = 1.12 nm, which is the distance between the N--and C--termini obtained from the crystal structure) to the protein extension, (Ex = # of amino acids * 0.38 nm/aa -Fl] and adding the length of the formed disulfide bond (ss). (B) When the unfolding process occurs via an intermediate, the contribution of the disulfide bond length is dictated by which termini unfolds. In addition, the remaining folded portion of the protein must be accounted for. As the disulfide bond is located between the N--terminus and His--46, when the N--terminus unfolds the bond length is accounted for in the initial unfolding event (orange equation). (C) Conversely, when the C--terminus unfolds first, the disulfide bond remains in the folded region of the protein, and is only extended once the metallo--bond has been ruptured (blue equation).  Figure 4C) is demonstrated (D) by the results from the Hartigans' diptest, D =0.0511, p = 4.62 e --5 . Hartigan's diptest statistic for unimodality/multimodality provided with a test with simulation--based p--values was applied thanks to a package compiled in R (http://cran.rproject.org/web/packages/diptest/diptest.pdf). Raw data trajectory of three azurin unfolding events (Orange WLC) followed by the unfolding of two I27 marker proteins. Trajectories are typically acquired using a high Nyquist fraction (~0.5) and a high number of points per trace (~5,000--9,000). (B) Reducing the Nyquist fraction down to (0.1) and applying a low pass filter, which removes small fluctuations to reveal the unaltered form of the data, provides further evidence of the presence of a clear mechanical intermediate (red arrow) that occurs after the main protein unfolding event. (C) Representing the trajectory as markers instead of a continuous line, where each marker corresponds to an acquired data point, reveals a high density of data points defining the intermediate. The localisation of data points indicates that the cantilever gets temporarily 'held' at a mechanically resilient event, signature of barrier crossing. In the case of azurin, such mechanical resilience originates from one of the metallobonds coordinating the copper ion.    His89 / Cys92 I27 Supplementary Figure 11. Increment in contour length hallmarking the intermediates encountered upon unfolding plastocyanin from the N--and C-termini. (A) Histogram of contour lengths corresponding to the protein unfolding from the N--terminus up to the His39-Cu bond (brown, 13.8 ± 1.4 nm, n=82) and to the remaining extension once the bond has ruptured up to the total protein contour length (light blue, 23.3 ± 1.9 nm, n=82). Similarly, when the C-terminus unfolds up to either the Cys89 or His92-Cu bond, an increment of 6nm is obtained (green, 5.9 ± 1.1 nm n=66). Once this bond has been ruptured, the remaining 30.5 ± 1.5 nm is measured (dark blue, n=66). (B) Histogram of increment in contour length for the I27 marker protein (28.0 ± 0.6 nm, n=326) and (C) its corresponding unfolding force (230 ± 24 pN , n=326).   Tables   Supplementary Table 1 Azurin (PDB:1AZU) -Formed disulfide bond 11.6 10 18 The fact that the results obtained by the Ellman's assay (red) clearly follow the trend of the expected number of free cysteines present in the sample, and in particular that the wt protein (with the expected disulfide bond) exhibits a lower value than the C26 mutant (where the disulfide bond is not present) strongly suggests that the disulfide bond is preserved in the wt--azurin construct. b) ICP--OES experiments: Measurement of the copper content in the analysed protein samples provides quantification of the % of copper uptake for each sample (assuming that 100% of the proteins are correctly folded). (I27--Azu)4: 38.58% (I27--AzuC112A)4: 11.38 % (I27--Plastocyanin--wt)4: 55.94% The copper uptake for each of these proteins is, as expected, lower than 100%. This implies that some of the proteins within the polyprotein context did not bind the copper cofactor. This is likely to explain the two--state trajectories, lacking the mechanical intermediate, which we observed in the single molecule mechanical experiments. Crucially, the C112A mutation, while showing a lower percentage of copper uptake, does not completely abolish copper binding. These results are in line with our experimental observations showing that the C112A mutant exhibited a larger proportion of two--state trajectories. Quantitative correlation between the % of copper uptake and the % of trajectories showing a mechanical intermediate is difficult, since the copper uptake fluctuates within different individual purifications up to ±25%.

Single Molecule Force--clamp experiments:
When working under force--clamp conditions, individual polyproteins were fished by pushing the cantilever onto the surface exerting a contact force of 500--1500 pN so as to promote the non--specific adhesion of the proteins on the cantilever surface. The piezoelectric actuator was then retracted to produce a set deflection (force), which was set constant throughout the experiment (~15--30 seconds) thanks to an external, active feedback mechanism while the extension was recorded. The force feedback was based on a proportional, integral and differential amplifier (PID) whose output was fed to the piezoelectric positioner. The feedback response is limited to 3--5 ms. Thanks to the high--resolution piezoelectric actuator, our measurements of protein length have a peak--to--peak resolution of 0.5 nm. Data of the force traces was filtered using a pole Bessel filter at 1 kHz.

Circular Dichroism experiments:
Experiments were conducted in the facilities of King's College London according to the following procedure: UV & CD spectra of the 4 polyproteins in phosphate buffer were acquired on the Applied Photophysics Chirascan Plus spectrometer (Leatherhead, UK). 10mm and 0.5mm Quartz Suprasil rectangular cells (Hellma UK Ltd) were employed in the region 400--190nm. The instrument was flushed continuously with pure evaporated nitrogen throughout the experiment. The following parameters were employed: 2nm spectral bandwidth, 1nm stepsize and 1.5s instrument time per point. UV & CD spectra were buffer baseline corrected and measured at 23ºC. The far--UV CD spectra were smoothed with a window factor of 4 using the Savitzky-Golay method for better presentation. The far-UV CD spectra of the 4 polyproteins were then corrected for concentration and path-length and expressed in terms of ∆ε (M -1 cm -1 ) per amino acid residue (MWt =113). Protein secondary structure contents were assessed using the Principle Component Regression method based on 16 known protein structures.

Ellmann's Reagent experiments (Determination of free protein sulfhydryls):
Protein samples were dissolved in 0.1M phosphate buffer, pH 7.4, containing 4M guanidine hydrochloride and adjusted to a final volume of 950 µl. A reagent blank (without sample) was prepared in a similar manner. After the addition of 50 µL of 20 mM 5,5'--dithiobis(2--nitrobenzoic acid) (DTNB) the sample was mixed, incubated at RT for 15 minutes and the absorbance at 412 nm was recorded. The amount of sulfhydryl was calculated using the extinction coefficient of DTNB in the presence of GuHCl (13,700 M --1 cm --1 ). The number of free cysteines was determined using the protein and sulfhydryl molar concentration from each sample.

ICP--OES (Determination of copper content in the protein samples).
Experiments were conducted in the facilities of the University of Barcelona and the University of Edinburgh according to the following procedure: The content of each sample eppendorf tube was totally digested by using a 1mL HNO3 and 0.5mL H2O2 in a closed, Teflon reactor overnight at 90ºC. Samples were then cooled down to RT and 15 ml of ultrapure water were added to each individual eppendorf tube containing the protein samples. The Cu content of the resulting solution was performed using an ICP--OES instrument (Perkin Elmer, Optima 8300) at working wavelengths of 324.752 nm and 327.393 nm. Calibration was performed using 5 standard solutions prepared by dilution of a 1000 ppm certified standard solution.

MD simulations •
Generalities Constant--force steered molecular dynamics (MD) were performed on azurin starting from the PDB structure 4AZU solvated in a neutralized water box large enough in the direction of pulling to accommodate the partially unfolded protein.
Simulations were performed both on the native, WT protein and on an oxidised form lacking the disulfide bond between Cys3 and Cys26. In the former case, the box size was approximately 6*6*18 nm and contained 61,482 atoms. In the latter case, the 6*6*21--nm box contained 71,879 atoms. The NAMD 2.10 software was used to run the simulations on pure CPU or CPU/GPU nodes. Atomic parameters correspond to those in the CHARM36 force--field 1 , while atomic charges for copper (II) and atoms of the 5 ligands were taken from quantum calculations 2 . We also constrained the organometallic bond distances following the procedure described elsewhere 2 .
• Simulation procedure The atomic positions were first minimized for 2,000 steps using the steepest-descent algorithm of NAMD. All subsequent simulations were run in the NPT ensemble, using the Langevin thermostat (0.2/ps) and the Langevin barostat of NAMD. Long--range electrostatics were treated using particle--mesh Ewald with a grid size of 1.2 Å --1 . Simulation timestep was 2 fs and bonds between hydrogen and heavy atoms were maintained rigid. After the minimization, the water was equilibrated around the protein, which was maintained fixed for 1 ns. All protein atoms except terminal Cα 's were then allowed to move and equilibrated for 1 more nanosecond.
Starting from this previously equilibrated protein whose C--N axis was aligned along the longer edge of the box (direction of pulling), the Cα atom of the first residue was fixed while a constant force (250, 310 or 350 pN) was applied to the Cα atom of the last residue. In half of the trajectories, Cα from residue #1 was fixed and Cα from residue #128 was pulled; for the other half, Cα of residue #128 was fixed and Cα from residue #1 was pulled in the opposite direction. In the case of azurin, 20 such trajectories, lasting between 6 and 20 ns (until full possible extension was reached), were generated for the protein containing the disulfide bond (10 in each direction) in a 18--nm long box; 9 trajectories total were generated in the absence of disulfide bond, in a slightly larger box (~21 nm). Simulation results are presented in Supplementary Tables 3--7.
• Description of the unfolding dynamics in azurin MD simulations clearly revealed that, for both transition states, the rupture of the hydrogen bonds (HBs) between the involved β--sheets occurred through an unzipping mechanism because the force is applied parallel to the HBs axes (Fig. 4A). In contrast to a protein containing such HBs perpendicular to the force, HB rupture occurs here sequentially, which explains why a relatively low force is sufficient to partially unfold the protein from one side. In the C--terminal unfolding scenario, the unfolding involves the rupture of the β7--β8 and β2b--β8 hydrogen--bonds, which occur at the same time; when the protein unfolds on the N--terminal side, it unzips by rupturing the hydrogen--bonds between β2b--β8, then β3--β6 and β1--β3 partially, before the remaining hydrogen--bonds of β1--β3 break. Interestingly, after one of these partial unfolding events, the protein flips and now exhibits β--sheets HBs perpendicular to the force on the side that is still folded (snapshots in Fig. 4C). Crucially, the key HBs whose rupture would lead to unfolding are now maintained perpendicular to the force and can resist to it in a cooperative manner. This suggests an explanation to the small occurrence (10% of the cases) of consecutive 6--and 9--nm steps. Indeed the low mechanical resistance of the organometallic bonds (Cu--Cys112 and Cu--His46, 45 pN) makes them more likely to break before the HBs between β--sheets as long as they can cooperatively resist to the force.
The unfolding of the protein lacking its disulfide bond occurs preferentially from the N--terminal side. Interestingly, the time--course of β--sheets rupture events leading to unfolding slightly differs from that of the regular WT protein (detailed above). In this case, the protein generally unfolds as follows: first the hydrogen--bonds between β1--β3 break, followed by β1--β2a, β2b--β8, and finally β3--β6. This suggests that the presence of the SS bond locally rigidifies the protein and prevents the rupture of β1--β3, which clearly initiates the unzipping of the all the other β--sheets when the disulfide bond is not present. Instead, the main unfolding event in wt--Azu is the rupture of β3--β6 (which gives rise in the simulations to the presence of a short lived conformation around L=10 nm; Figure 4 in the main text).

• MD simulations protocol of plastocyanin (holo--and apo--forms)
Simulations of plastocyanin (PDB ID 3BQV) and apo--plastocyanin (#2PCY) were performed as previously described for azurin. As the active site of plastocyanin is very similar to that of azurin, with the exception of the fifth weak interaction with Gly45 of azurin being replaced by Pro38 in plastocyanin, we employed the same atomic charges as previously determined for azurin for His39, Cys89, His92 and Met97, and we took the charges on the carbonyl group and the Cα of Gly45 for the corresponding atoms of Pro38. This approximation is not expected to have any impact on the mechanical stability of the β--sheets located on the other side of the protein. As ligand--metal bonds display slightly different lengths than in azurin, we employed the equilibrium distances of the PDB structure to constrain these bonds during the simulations, with the exception of Pro38--Cu bond, which, since it is very long in the PDB (4.32 Å), we did not constrain as the interaction with the metal is probably extremely weak.
The rest of the simulation protocol was identical to that of azurin; solvation in a 6*6*18 nm box, minimization and equilibration, and then pulling at a constant force in the range of 250--310 pN by fixing the Cα from the first residue and pulling on the Cα at the other end of the protein in the z--direction, and vice versa. 10 such trajectories (5/5) for plastocyanin and 10 for apo-plastocyanin were generated following this protocol, lasting a total of 5 or 10 ns each. • Effect of the organometallic bonds In classical MD simulations, organometallic bonds are challenging to describe as they usually have a covalent character that is very sensitive to the local geometry and to the environment. In usual forcefields, an interaction between 2 atoms is either described by non--bonded interactions (electrostatic and vdW forces) or by a bonded term, usually as an harmonic potential if they form a covalent bond. However, using electrostatic and van der Waals potentials to describe the interactions of the metal with its environment (and in particular with the ligands) would cause the metal to leave the active site as it would be better solvated in water: we have checked that this happens very quickly in the simulations in the case of azurin. Therefore, additional harmonic potentials are necessary to give some covalent character to these bonds and thus to keep the geometry of the active site correct, the main drawback being that they cannot be broken during the simulation, nor changed in energy due to local changes of the environment or the ligand--field geometry.
For the purpose of the current work, such an approximation is very reasonable, for 2 main reasons. First, the mechanical unfolding events studied here all start with the breaking of key sets of hydrogen--bonds that are located on β--sheets placed on the other side of the protein, next to both termini. The breaking of metallic bonds is only involved at the very end of the unfolding events, when one portion of the protein is extended. This bond--breaking is not possible in the simulations but can be artificially accounted for when comparing the simulation observed extensions to that of the experiments (see Simulation section of azurin).
Second, we have simulated plastocyanin in its regular, metal--bonded form, but also in its apo--structure, lacking the metal. As seen in Supplementary Tables 5 and 6, in both cases we observe a very similar ratio of N--terminal to C-terminal unfolding events (around 80/20%, respectively, Supplementary Fig.  13), suggesting that the metal active site has actually none or very little influence on the unfolding mechanism of plastocyanin. Moreover, the first step of the mechanical unfolding in both scenarios is identical in each case (rupture of β1--β3 for N--terminal unfolding; rupture of β2b--β8 and β7--β8 for C--terminal unfolding). The main role of the metal in the current investigation is thus to reveal the intermediate states of the protein unfolding that would normally totally unfold in the absence of metal.
• Predictions of released contour lengths from MD trajectories MD simulations can be used to predict the increase in contour lengths of unfolding in the two scenarios (N--terminal and C--terminal unfolding). To this end, we have followed the following procedure (Supplementary table 7): we first measured, for each simulation, the extension of the protein during unfolding, i.e., the difference between the final and original end--to--end distances projected onto the z--axis. Then, since in the simulations unfolding does not occur up to Cys112 for C--terminal unfolding (rather, up to Met121) and up to His46 for N--terminal unfolding (rather up to Gly45), we artificially added the extra--lengths corresponding to the release of the amino--acids in between (Cys112--Met121 and Gly45--His46), at each simulation force, using the worm--like chain model with a