From light-harvesting to photoprotection: structural basis of the dynamic switch of the major antenna complex of plants (LHCII)

Light-Harvesting Complex II (LHCII) is largely responsible for light absorption and excitation energy transfer in plants in light-limiting conditions, while in high-light it participates in photoprotection. It is generally believed that LHCII can change its function by switching between different conformations. However, the underlying molecular picture has not been elucidated yet. The available crystal structures represent the quenched form of the complex, while solubilized LHCII has the properties of the unquenched state. To determine the structural changes involved in the switch and to identify potential quenching sites, we have explored the structural dynamics of LHCII, by performing a series of microsecond Molecular Dynamics simulations. We show that LHCII in the membrane differs substantially from the crystal and has the signatures that were experimentally associated with the light-harvesting state. Local conformational changes at the N-terminus and at the xanthophyll neoxanthin are found to strongly correlate with changes in the interactions energies of two putative quenching sites. In particular conformational disorder is observed at the terminal emitter resulting in large variations of the excitonic coupling strength of this chlorophyll pair. Our results strongly support the hypothesis that light-harvesting regulation in LHCII is coupled with structural changes.

. To model our homogeneous 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) bilayer we used parameters from the same set of lipid topologies 11 , which are available on http://lipidbook.bioch.ox.ac.uk. LHCII-membrane system modeling. Embedding of the LHCII complex into a pre-equilibrated POPC bilayer was performed via a multi-step protocol. Relaxation of the bilayer prior to inserting the whole protein-cofactor complex allows a shorter equilibration time for the final system (LHCII + bilayer + explicit solvent). Therefore, our first step was to create a pre-equilibrated lipid bilayer satisfying the following features: the membrane should be wide enough to separate LHCII complex from its periodic image with at least ~6 lipids, and should present a "pore" large enough for the insertion of LHCII. To embed LHCII in the membrane we used genbox (tool of GROMACS 1 ), which deletes any lipid or solvent molecules within van der Waals distance from the solute (in our case LHCII).
We first converted the LHCII-apoprotein structure (without cofactors) to a coarse grained force-field model (MARTINI 12 ) via martinize.py tool (available at http://md.chem.rug.nl/cgmartini/). The MARTINI-LHCII apoprotein was later embedded into a MARTINI-POPC bilayer (two homogeneous layers of pure POPC in the ratio 204:204 per monolayer) with 5376 MARTINI-water beads via the insane.py tool 13 (available at http://md.chem.rug.nl/cgmartini/). After 500 steps of minimization (steepest descent), we then relaxed the system via 30 ns of NPT simulation (at a temperature of 323 K to equilibrate the bilayer faster). During all these steps, strong isotropic position restraints (10000 kJ mol -1 nm -2 ) were applied to the whole protein to avoid shrinking of the protein structure (which in this simulation lacks all the cofactors). From the last snapshot of the MDs, we retrieved the coordinates of the bilayer alone (therefore presenting a pore in the membrane of the size of the CG-LHCII previously embedded). Finally, POPC bilayer was backmapped to the united atom resolution (GROMOS 53a6) 11 via the SUGARPIE tool 14 .
Finally, after alignment of the GROMOS-LHCII with the pore in the GROMOS-POPC bilayer (via editconf tool, GROMACS 1 ), we embedded the full LHCII in the membrane (via genbox tool, GROMACS 1 ), and consequently solvated the system with water (via genbox) and ions (via genion tool, GROMACS 1 ).
As described in the main manuscript, the final system consisted of one monomer of LHCII (including all the cofactors) embedded in a bilayer formed by 344 POPC molecules (a few lipids in van der Waals radius from LHCII were automatically deleted by genbox as previously explained), and solvated by more than 15k water molecules at neutral physiologic conditions (10 mM Na + Cl -) 15 .
Simulations. During minimization (steepest descent), NVT relaxation (10 ps), and the first part of the NPT equilibration period (40 ns), isotropic strong position restraints were applied to the protein and to its ligands. In this way, we aimed at relaxing the membrane and the solvent, minimizing perturbation of the protein and of the ligands crystal positions 3 , as done by Ogata and coworkers for Photosystem II 16 . More in detail, position restraints were applied to the protein backbone, the chlorophyll-tetrapyrroles (but not the phytol tails), the carotenoid molecules and the DPPG lipid. Position restraints were set to a starting value of 10000 kJ mol -1 nm -2 and then gradually reduced to zero every 10 ns of NPT simulation (for a total of 40 ns simulated time), following the sequence 10000à 1000à 500à 200à 0 kJ mol -1 nm -2 . This equilibration protocol was repeated for the control simulation missing the initial crystallographic water molecules and the DPPG molecule (see main manuscript and Table S1). The final snapshot retrieved after the last 10 ns at 200 kJ mol -1 nm -2 (for the simulation started with the complete LHCII structure, including crystallographic water and DPPG), was used as starting conformation for the principal simulation A, B, and C (which have all been started from different random velocities). As anticipated in the main manuscript, from this conformation we started also the three control simulations (A-, B-, C-N-term) where the N-terminus (first 39 protein residues) was allowed to relax for additional 100 ns while the other cofactors and the protein backbone were kept constrained (force constant kept at 200 kJ mol -1 nm -2 ), prior to removal of the remaining position restraints. Finally, a total of 7 independent NPT simulations were run up to ~1.1 µs (for the full set of MDs see Table S1). An integration time step of 2 fs was used for all the simulations, applying constraints on all the bonds (LINCS algorithm 17 ). Particle Mesh Ewald (PME) scheme was used to treat longrange electrostatics, and cutoff values of 1 nm and 1.4 nm were selected respectively for short-range Coulomb and van der Waals interactions. Pressure was set to 1 bar under semi-isotropic coupling to a Parrinello-Rahman barostat 18 , with relaxation time constant of 5 ps and compressibility of 4.5·1o -5 bar -1 . Temperature was kept at 300 K via a Nose-Hoover thermostat 19 -scheme, with 0.5 ps time constant and with solvent, membrane and LHCIIcomplex coupled to the thermostat. All the simulations were run with Periodic Boundary Conditions (PBC).
In one of the control simulations (simulation No Water), the buried water molecules were absent and only the bulk water molecules were present in the starting simulation box. Using this control, we tested if the water is necessary for the stability of LHCII, since the water molecules are different in number and location in the two available LHCII crystal structures 3,20 ( Figure S2.A). The absence of most of the water molecules in the latest crystal 20 might be due to limits in resolution of x-ray diffraction 21 , and not to a functional reason. MDs represent a powerful tool to investigate the effective water sites 16,22 . Our simulation initially missing all the interstitial water molecules, converged to a re-hydrated LHCII in less than 100 ns (Video S1). In all the MDs, water molecules stably occupied the same sites ( Figure S2.C), corresponding to the ones expected from the crystal of Liu et al 3 , testifying that equilibrium is reached over the explored time-range. As hypothesized before 3 , we find that most of the water molecules buried inside the protein are involved in chlorophyll coordination and stabilization of the helices ( Figure  S2.B).
Analyses. Unless otherwise stated, the analyses were run over all the simulations except for the control simulation missing the crystallographic water molecules (MD No water, see Table S1). In most of the analyses, calculations have been performed on the whole trajectories after the first 400 ns of thermal equilibration. The description of the analyses is directly embedded in the caption of the respective figure.      (   Table S1. List of the simulations performed.

Figure S2.A-C. Water occupancy in LHCII.
Here we calculated the water occupancy volume maps (at 1 Å resolution and combining all the frames by averaging) corresponding to the different populations of water molecules within 3 Å from LHCII protein, pigments and DPPG. Analysis of water occupancy in LHCII was performed via the Volmap tool (VMD) 2 , over the full trajectory of the simulations after the first 400 ns. In C the volume maps are represented as a mesh and correspond to the positions occupied by water molecules respectively for 10%, 20% and 40% of the total simulation time (after the first 400 ns). Three structures are shown: the Crystal structure (black), the simulation "No Water" (blue) and the principal simulation A (green). In panel A the structural alignment of the protein (cartoon), water and DPPG molecules from the two available high resolution crystal structures of LHCII is shown: PDB-1RWT 3 in black with the water molecules described as a grid and PDB-2BHW 20 in yellow with the water molecules described by their van der Waals surfaces. In B the final protein structure from simulation A is shown, together with the water volume map corresponding to 40% occupancy in this simulation (the same as the green structure in C bottom right). The chlorophylls found within 3 Å from the reported water volumes are also shown. We found that DPPG helps to prevent large movements of the N-terminal domain which, in the absence of this lipid (simulation "No Water"), partially inserted in the membrane (as seen from the blue structure in C and in Video S1). The observed influence of DPPG on the N-terminus folding is particularly interesting considering that DPPG was shown to be necessary for LHC trimerization 23 , which in turn is controlled by the N-terminus 24 .     (  Table S2 and Figure S5. Neoxanthin H-bond to TYR112 and Chla604. H-bond occupancy is defined as the percentage of time that the H-bond is present between two partner atoms, over the total time of the simulation, excluding from the analysis the first 400 ns of simulation. The analyses were performed for each simulation. Via the HBonds plugin of VMD 2 a search was carried out over the atoms of the whole Neo molecule for binding partners amongst any LHCII residue and LHCII cofactor (pigments and DPPG) using the following criteria: Donor-Acceptor distance should be less than 3.5 Å (cut-off distance), Donor-H-Acceptor angle should be less than 30 0 (cut-off angle) and an occupancy higher than 10% should be present at least in four simulations. The is divided in two sections describing H-bond occupancy at the Lumen side and Stroma side of the membrane. The values of occupancy (% of the total simulated time after the first 400 ns), the average value (over the simulations) and the standard deviation associated to the average (%) are given per each simulation. At the bottom of each section the atoms involved in the H-bond with the hydroxyl group of each carotenoid, as determined by the analysis, are listed. In the Figure, the Neo carotenoid binding sites is shown. For each pair (carotenoid-residue) all the atoms possibly involved in the H-bond during the MD trajectory (see Table) are shown. Figure 3 in the main manuscript, Table S3 in the SI). Carotenoid transition dipole moment (transition S 2 ßS 0 ) was considered oriented parallel to the central part of the polyene chain, as indicated by the green arrow in Figure 3 of the main manuscript and as modeled in previous studies 3,25,26 . We first employed a rotational + translational fit of the trajectories onto the protein backbone of the central helices A and B, for a scheme see Figure S2, taking as reference for the fit the same structure used for the RMSD analysis, see Figure S4. The z-axis of the box roughly coincides throughout the resulting trajectories with the 2-fold symmetry axis of the LHCII crystal (indicated in Figure 1.C, main manuscript). The angle between the dipole moment vector and the z-axis of the simulation box (often referred to as protein axis in the text), was calculated via dot product by using g_sgangle tool (GROMACS 1 ). Time evolution of the angles respect to the protein axis is reported in Figure 3, of the main manuscript.

Carotenoid transition dipole moment (
The final average angle !" has been computed for each carotenoid and for each simulation (by averaging over the whole trajectory starting from 400 ns). The average variations of the angle respect to the crystal, Δ (expressed in percentage), were calculated per each carotenoid as average over the !" computed on the different simulations as: where !"#$%&' is the value of the angle in the crystal. Additionally the final average angle, !" , averaged over the final angles ( !" ) of each simulation, and the standard deviation for this average were calculated. Also, we report the average variation Δ (expressed in percentage), respect to the crystal value which was calculated as: • 100   Table we report per each  carotenoid the average angle , calculated over the full set of average angles , the associated standard deviation (both expressed in degrees) and the average variation from the crystal value (expressed in percentage) calculated as described above.

Excitonic coupling (Figure 4 in the main manuscript, Figure S6-7 and Table S4-5 in the SI).
Q Y chlorophyll transition dipole moments were taken parallel to the nitrogen ND-NB axis, as in 3,25,27 . Carotenoid transition dipole moments were calculated as described above. Interaction energy (in cm -1 ), also known as excitonic coupling strength, was calculated using the following formula: where ! is the local field correction factor, the module of the transition dipole moment and the normalized transition dipole moment vector, ! the relative dielectric constant (here equal to 2.4 27 ), R the module of the distance between the center of the dipole moment vector and !" the normalized distance vector (in nm), is often referred to as orientation factor, k 28 . Dipole moment values were taken as 4, 3.4 and 4.5 Debye respectively for Chla, Chlb and Cars 3,25,27 . Angles between dipole vectors and distance vector (necessary for the dot products listed above) were calculated via g_sgangle tool (GROMACS 1 ). Calculations of the final coupling strengths for the strongest coupled clusters, as in 3,25 (listed in Table S4-5 together with their standard deviation), were run via homebuilt codes over the full set of trajectories, and the final average value, !" (and the associated standard deviation) was calculated discarding the first 400 ns of each simulation. Per each !" obtained in each simulation and per each Chl-Chl pair, we then calculated the variations of the coupling strength respect to the crystal (Δ ), which are expressed in percentages, similarly to what explained above for Δ . Also, the final average variation, Δ , was calculated for each Chl-Chl pair similarly to Δ , based on the final average coupling value (average over the various !" ).      30 where the partial charge q is equal to = ! ! (with being the dipole taken equal to 4 Debye as in the point-dipole approximation and d the distance between the NB-ND nitrogens of the same chlorophyll) and ! ! − ! ! is the distance between the partial charges located in this approximation on the NB and ND nitrogens of each chlorophyll. ! is the vacuum permittivity. It should be noted that, in this approximation, partial charges have been placed on the nitrogens, which might result in an underestimation of the actual dipole extent. As reported in the main manuscript, the fluctuations in coupling are reduced compared to the point-dipole method (see Figure 4 and Supplementary Figure  S6). However, the trend over time computed via the point-dipole method is the same, confirming that the rearrangement of this chlorophyll pair leads to a strong reduction of their interaction energy. Table S4. Chl-Chl excitonic coupling strengths. In the Table we report the results of the analysis described above. For each Chl-Chl pair and for each simulation, we list the average excitonic coupling strength, , the associated standard deviation (both expressed in cm -1 ) and in brackets the variation in the coupling strength ( ) respect to the crystal, expressed in percentage. On the bottom part of the Table, we report the final average coupling , calculated over the whole set of , and the associated standard deviation (both expressed in cm -1 ). Finally, we report the average variation in coupling strength (expressed in percentage).  Table S4.
The distance between the most flexible part of the N-terminus (residues 14-44, see Figure S4) and the DPPG headgroup, and the distance between the Chlb607-formyl hydrogen and the GLN131 oxygen were calculated using the g_dist tool of GROMACS 1 . The associated final average values (d N-terminus-DPPG , d Chlb607-GLN131 ) and the associated standard deviations (calculated starting from 400 ns of trajectory), the relative variations per each simulation Δd N-terminus-DPPG and Δd Chlb607-GLN131 and the final average variations Δ !!!"#$%&'(!!""# and Δ !!!"!"#!!"#!"! were calculated as described above for the carotenoid dipole moment analysis. The results are summarized in Table S6.       Table S5).    (  Table S7. Correlation between conformational changes and excitonic coupling. Here the correlation coefficients (Pearson) between a set of conformational changes and the variations of the excitonic couplings (Δ ) are reported. With this analysis we tested the hypothesis that a correlation exists between the ensemble of possible coupling states obtained from the relaxed structures of LHCII in the membrane (in the different simulations), and the various conformations reached after the systematic conformational changes reported. The conformational changes that we considered are the decrease of distance between the N-terminus and the DPPG headgroup, Δd N-terminus-DPPG , the Neo distorsion calculated as variation in the dipole moment angle respect to the z-axis of the protein, Δϑ neo , and the increase of distance between the Chlb607-formyl hydrogen and the GLN131 oxygen (H-bond loss), Δd Chlb607-GLN131 . Per each single Chl-Chl or Chl-Car exciton dimer, we calculated the Pearson coefficient over the ensemble of different Δ and Δd N-terminus-DPPG , Δϑ neo or Δd Chlb607-GLN131 reported from the simulations A, B, C and A-, B-, C-Nterm. The values used in this analysis are reported in Table S3-7. As reported in the main manuscript ( Figure 5), the (0,0) point per each serie of data represents the crystal state. Figure S11 and Table S8. Correlation between the Neo bending motion and the putative quenching site Chla603-Lut 2. For each single simulation the correlation over the full trajectory between the Neo tilt angle (reported in Figure 3) and the Chla603-Lut 2 time evolutions (reported in Figure 4 and Figure S7) is shown. For each simulation we calculated the Pearson coefficients, which are then summarized in Table S8.  Video V1. LHCII re-hydration. In this video, the full trajectory of the MD "No Water" is shown (for a total of ~1.1 µs, see Table S1). This simulation of the LHCII complex has been started in absence of the crystallographic waters (which are then present in the bulk water for a total of ~15K water molecules) and of the lipid DPPG. In the video the structural alignment of the MD-protein (red) and LHCII from the crystal 3 (1RWT PDB, chain A, black) is shown. The water molecules are shown as van der Waals-glossy beads (MD No water) and as black wireframe (crystal). The water molecules within 3 Å from the protein are shown. The re-hydration of LHCII apoprotein takes place in less than 100 ns in this simulation.
Video V2. Violaxanthin detachment and Neoxanthin distortion. In this video we show the full trajectories, after fitting them onto the C α carbon atoms of the proteins, of the three principal simulations (A, B and C). The protein structure is represented in black for simulation A, magenta for simulation B, green for simulation C. On the left side of the protein structure the Violaxanthin molecule (left) and the DPPG lipid (right) for each simulation are shown. On the right side of the protein the carotenoid Neoxanthin is shown. The color code for the cofactors is the same as for the protein. In the Video two main events are shown: the detachment of Violaxanthin (Figure 3 and Table S3) and distortion of Neoxanthin ( Figure 3 and Table S3).
Video V3. H-bond loss at the Chlb607-Chlb606 site. In the Video, extracted from the full trajectory of simulation A, loss of the H-bond between Chlb607-formyl group and the GLN131 oxygen is shown (see Figure  S8). Also, it is possible to see the switch of the GLN131 oxygen that during the simulation becomes the ligand for the central magnesium of Chlb606 (see Figure S8). Chlb607 and Chlb606 are respectively in purple and yellow. GLN131 and the formyl group of the two Chls are colored based on their atoms (red for oxygen, white for hydrogen, cyan for carbon and blue for nitrogen). The protein in the background is rendered in transparent green. Water molecules within 3 Å from the Chl607-Chl606 group and from GLN131 are also shown.
Video V4. Energy disorder at the Chl611-Chl612 site. In the Video we report the event registered at the Chla611-Chla612 site showing the full trajectory of simulation A. In this simulation a strong deviation from the crystal value of the excitonic coupling strength for this cluster of Chls is observed (see Figure 4 and S6, and Table  S4). The Video shows that this disorder is originated by a variation of the relative distance and orientation of the two Chls. It is shown that DPPG interacts with Chl611 and with various residues at the N-terminus of the protein.
Chla612 is in magenta and Chla611 is in gold. DPPG lipid is colored based on each constituent atom (see Video V3 caption). In the background the LHCII apoprotein is in transparent green. All the residues of the N-terminus within 3 Å from the DPPG headgroup are shown. It is then possible to see that going toward the end of the simulation (1.04 µs, see Table S1) the network of interactions of the DPPG with the N-terminus largely increases.