Fine control of chlorophyll-carotenoid interactions defines the functionality of light-harvesting proteins in plants

Photosynthetic antenna proteins can be thought of as “programmed solvents”, which bind pigments at specific mutual orientations, thus tuning the overall energetic landscape and ensuring highly efficient light-harvesting. While positioning of chlorophyll cofactors is well understood and rationalized by the principle of an “energy funnel”, the carotenoids still pose many open questions. Particularly, their short excited state lifetime (<25 ps) renders them potential energy sinks able to compete with the reaction centers and drastically undermine light-harvesting efficiency. Exploration of the orientational phase-space revealed that the placement of central carotenoids minimizes their interaction with the nearest chlorophylls in the plant antenna complexes LHCII, CP26, CP29 and LHCI. At the same time we show that this interaction is highly sensitive to structural perturbations, which has a profound effect on the overall lifetime of the complex. This links the protein dynamics to the light-harvesting regulation in plants by the carotenoids.

The lowest excited singlet states of the planar pigments were calculated by a full Configuration Interaction calculation within a Complete Active Space, using the semi-empirical AM1 Hamiltonian (AM1-CAS-CI) as implemented in the package MOPAC2016 2 . This method has been thoroughly tested by Kusumoto et al. 3 and shown to be consistent with more expensive methods. The method gives the lowest excited singlet state of Lut that is contributed 57.2% by double-excitation (2x HOMO-LUMO) configurations of the CI expansion. The lowest Chla excited singlet state is 81.7% HOMO-LUMO excitation. Calculated properties of the two molecules are summarized in Supplementary Table 1 20,753 14,220 a 0.49 ---a a Direct one-photon absorption of the S1 state is not observable. Energy for Lut S1 can be inferred indirectly, e.g., via two-photon absorption 6 and subsequent modelling as described in the 3.1 Section.
For subsequent coupling calculations, the transition density distribution cloud was represent by Transition Density Cubes (TDCs) using custom code 7 . The TDCs are a discretization of the continuous transition density onto a finite 3D grid of transition charge elements (cubes): where and * are the ground and excited state wave-functions, respectively, and , , define the grid size of the cube. The resultant transition densities are shown in Supplementary Figure 1. The TDCs are individually re-calculated for each pigment re-orientation in the study. there is no transition density in the phytol tail. Color coding (red and blue) is used to highlight the opposite sign of the transition charges in the corresponding regions. (b,c) Transition densities of both Lut and Chla in the mutual arrangement within LHCII (front and side views, respectively). Transition densities only marginally extend to the end groups of Lut. Note that the transition density of Lut is several orders of magnitude smaller than that of Chla, and appears similar here only because of different scaling. (d,e) The zoom-in of the region of closest proximity to demonstrate the subtle difference of the L1 (d) and L2 (e) sites (the transition density cloud is essentially the same, only the orientations differ). The VMD package 8 was used for plotting molecular structures here and in the subsequent Supplementary Figures.

Couplings form the transition density cube method.
We approximate the electronic coupling by its Coulombic part, which was calculated using the obtained TDCs via pairwise interaction between pigments m and n: The coupling calculation is carried out using the code by Krueger et al. 9 . To gauge the calculated coupling, we re-scaled the transition dipole moments (TDM) of the pigments. The Chla Qy TDM was re-scaled to 4.49 D (a recommended value given by Knox and Springs 5 ), while the TDM of Lut S1 (importantly, nonzero) was re-scaled to 0.767 D (a value reported in the literature reviewing more sophisticated approaches 10 ).
We note that at such small distances as considered in this study the coupling is sensitive to the precise intermolecular arrangement, where particular atom groups of one molecule appear in the immediate vicinity of the neighboring molecule. The fine features in the Interaction Energy Surface (IES) of the L1 site ( Fig. 4a of the main text), in comparison to the monotonic surface of L2 site reflect this sensitivity. The difference in the transition density distribution of the two sites is shown in Supplementary Fig. 1d,e, where Lut in site L1 is appears notably closer to Chla. However, it is also possible that such sensitivity to the details is overly expressed in the TDC method.
1.3 Coupling dependence on co-faciality of the pigment pair.
To study which degrees of freedom have largest influence over the Chla-Lut interaction, we investigated, among others, the rotation of Lut around its backbone axis (azimuthal rotation in Supplementary Figure  2a). This angle effectively describes the facing of the polyene plane (or co-faciality of Lut and Chla). As can be seen from Supplementary Figure 2b,c the effect of azimuthal rotations is almost negligible, especially compared to the change of Lut inclination towards or away from Chla. We note that in the Chla reference frame (the reference frame of the main article) the inclination rotation is effectively decomposed into the changes of ( , ) angles. Hence, we have excluded the planar tilting from the considerations of the main article.

Simulation procedure and parameters.
Crystal structure preparation. The high resolution X-ray crystal structure of LHCII from spinach (PDB: 1RWT 11 ) was used for MD simulation. We selected the trimer made of chains C, H, and E. The structure was cleaned by retaining the protein, the pigments, the 1,2-dipalmitoyl-sn-glycero-3-phosphoglycerol (DPPG) lipid molecule and the interstitial water molecules. The N-terminus chains are not completely resolved in the crystal structure and the first 13 residues are missing, therefore we capped Ser-14 with an acetyl group. Following the Poisson-Boltzmann calculations by Müh et al. 4 , all the titratable amino acids were considered in their standard protonation state at physiologic pH, except for His-120, which was considered protonated due to its interaction with Asp-111. All the other histidine residues axially coordinating the Mg atom of Chls were considered in δ conformation. Missing heavy atoms in Chls a604, b605, b606 and a614 phytol tails were added according to the Chl's structure by manually adjusting the torsion angles to avoid structural clashes. Finally, the position of the added heavy atoms and all the hydrogen atoms were minimized with 500 steps of steepest descent followed by 500 steps of conjugate gradient.

Membrane equilibration.
A DOPC (1,2-dioleoyl-sn-glycero-3-phosphocholine) bilayer membrane was generated by the CHARMM-Gui interface 12 with 450 lipids in each layer. The membrane was generated in a rectangular box with upper and lower water layers containing 37 water molecules per lipid molecule. The ionic strength was set at 0.1 M by randomly placing Na + and Cl -. Before placing the complex into the lipid bilayer, a pre-equilibration of the solvated membrane alone was performed according to the following protocol: i) minimization of the whole system (5000 steps of steepest descent followed by 5000 step of conjugate gradient) with periodic boundary condition; ii) first heating run (5 ps from 0K to 100K) in the NVT ensemble, under positional harmonic constraints (10 KCal mol -1 Å -2 ) on lipid molecules; iii) second heating run (100 ps from 100K to 300K) in the NPT ensemble by applying positional harmonic constraints (10 KCal mol -1 Å -2 ) on lipid molecules; iv) membrane equilibration run (55 ns at 300K) in NPT ensemble without any constraint.
Insertion of the complex in the membrane. The atom coordinates taken from the last step of the membrane equilibration run were translated in order to put the centre of mass of the lipid membrane in the origin of the Cartesian system. A LHCII shape cavity was generated at the centre of the membrane by removing the lipids that were closer than 1.5 Å from the complex. Then, the LHCII complex was inserted into the pore according to the orientation suggested by the protein membrane (OPM) database 13 . The final system contains a total of 209728 atoms.
MD protocol. The molecular dynamics of the LHCII embedded in the membrane was performed following the protocol described by Ogata and coworkers 14 in the simulation of the PSII supercomplex: 1) Three-step minimization: i) minimization of all hydrogen atoms; ii) minimization of ions and noncrystal water molecules (upper and lower water layers added to the system); iii) all atoms minimization.
1. Heating: 30 ps of heating run up to 300K in NVT ensemble. The crystallographic structure (protein, pigments, DPPG, interstitial water molecules) was constrained by a harmonic potential with a 5 KCal mol -1 Å -2 force constant.

Equilibration:
The equilibration run was performed gradually releasing the harmonic constraints (from 5 to 0 KCal mol -1 Å -2 , reducing by -0.25 KCal mol -1 Å -2 every 10 ps). This step was run in NPT ensemble with anisotropic pressure coupling. The 300 K temperature was controlled by the Langevin thermostat. When the force constraint became equal to zero, we completed the equilibration with an additional run of 2 ns without any constraint.
3. Production: 1µs of production run in the NPT ensemble at 300K was performed without any constraint. From this MD simulation, 1000 snapshots captured every 1 ns were used for the geometrical analysis (following Section 2.2) and QM calculations (Section 1.1).

Details of the MD simulations.
All MD runs were performed with the Amber14 suite of programs 15 , by using the GPU version of the code. An integration time step of 2 fs was used in combination with the SHAKE algorithm applied to the hydrogen atoms. Temperature and pressure were controlled by using the Langevin thermostat (collision frequency of 1.0 ps -1 ) and the anisotropic Berendsen weak-coupling barostat (coupling constant of 1.0 ps -1 ) respectively. Thanks to the new Lipid14 force field (see below for details), no additional superficial tension is needed 16 . The Particle Mesh Ewald method was used to treat the electrostatics, by applying a cut-off of 10 Å for long-range interactions.
Force Fields. The Amber ff14SB force field was used to describe the protein 17 . All carotenoids were modelled by an ad hoc force field described in Prandi et al. 18 Chlorophylls a were modelled with the set of parameters reported in Ceccarelli et al. 19 and modified by Zhang and coauthors 20 ; chlorophylls b were described with the same set of parameters of chlorophylls a except for the aldehyde group on porphyrin ring, whose parameters was taken from the General Amber Force Field 21 . The DOPC membrane was described with the Lipid14 forcefield 16 . Since this force field for lipid does not contain parameters for the internal DPPG lipid molecule present in the crystal structure, the previous version of the force field (Lipid11) was used for DPPG 22 . Water molecules were described through the TIP3P model 23 , and ions' parameters were taken from Joung and Cheatham 24 .

Superimposition of the planar structures onto the structures from the MD trajectory.
In order to obtain most reliable results, we calculate the couplings using the vacuum-optimized (Section 1.1) "model" Lut and Chla molecules superimposed on the originals, rather than the actual structures from either crystallography or MD simulations, which have distorted bond structure. The superimposition is performed using pair fitting routine of the PyMOL software 25 . While the four nitrogens of the chlorin ring are used for pair fitting the Chlas, for Car the four backbone carbons having attached methyl groups and the four methyl carbons (8 atoms altogether) are used. The results of the procedure are shown in Supplementary Figure 3 for a randomly selected snapshot of both L1 and L2 sites from the single monomer.
While Chla demonstrates nearly-perfect coincidence (except for the unimportant phytol tail), the MD Cars slightly deviate from the planar structures (more pronounced S-shape not only within the molecular plane, but also away from the plane) and have their β-and ε-end-groups free to rotate.

Angular distribution among the L1/L2 sites and C/E/H monomers.
The distribution of inclination angles and (see the main text) shows some deviation from the normal distribution (conf., Figure 3d,e). Therefore we analyzed several cuts of the distributions evaluating the temporal and monomeric differences. Some initial transient behavior was identified within the trajectory. In Supplementary Figure 4a-d two subpopulations within C monomer are shown separately: the sub-200 ns values (red) and the subsequent values (blue). While for the L1 site both initial and subsequent values essentially overlap, a shift can be seen among the L2 site sub-distributions. We also show the corresponding angles determined for the two sites in the LHCII crystal structure (shown in black; PDB: 1RWT 11 ). Interestingly, it appears that, while the L1 site relaxed values in the membrane roughly coincide with the distribution in the crystal structure, the L2 site crystal values coincide with the initial distribution, but not the subsequent one. This shift might indicate some reorganization within the membrane with respect to the crystal structure upon relaxation. The effect is observed to a similar or lesser degree in the other two monomers (not shown explicitly).
The distributions among all three monomers are shown in Supplementary Fig. 4g-h FRET rates between pigments m and n are given by where is the resonance interaction between the pigment pair and and are the acceptor absorption and donor fluorescence time-domain response functions, and are the molecular energies (measured energies in Supplementary Table 1), and 2 is the reorganization energy (Stokes shift). The line-shape function is given by: Here, ″ ( ) denotes the spectral density. For the Chla, the response functions correspond to physical spectra defined by the spectral density suggested by Renger et al. 26 : where 0 = 0.5, 1 = 0.8, 2 = 0.5, 1 = 0.56 cm −1 and 2 = 1.94 cm −1 .
The carotenoid spectral density is generated to fit the two-photon absorption spectra by Walla et al. 6 . It is given by a decomposition into two terms for the underdamped modes (corresponding to two highfrequency carbon-carbon stretching modes) and one overdamped mode (to account for the rest of the thermal bath): Here, the values of the double-and single-bond stretching frequencies 27 are 1 = 1500 cm −1 and 2 = 1150 cm −1 , accordingly. The rest of the parameters are determined by fitting the two-photon absorption spectrum: 1 = 2 = 900 cm −1 , 1 = 2 = 300 fs for the underdamped modes; 0 = 450 cm −1 , 0 = 53 fs for the overdamped bath.
The measured transition energies are used: 14,900 cm −1 for Qy state of Chla; 14,220 cm −1 for the S1 state of Lut from the aforementioned fitting.
To examine the sensitivity of the FRET rates in the current setup to the energy gap between the pigments, we employ the inter-pigment coupling dependence given in Fig. 2 of the main article (i.e., the dependence of the interaction energy on the rotation of Chla around its z axis). In Supplementary Figure 5 the results are shown for the main set of parameters (dark-green line) as well as for the limiting values (light-green band). As can be seen, even in the regions of fairly strong coupling the change of energy gap over 1000 cm -1 yields no significant effect. This is the joint effect of small resonance interaction and large reorganization energy of Lut.

Supplementary Figure 5 |Chla-to-Car transfer times by FRET calculation.
Values of the coupling are given in the main article (Fig. 2). The upper boundary corresponds to Chla Qy state being just above the S1 state, while the lower boundary corresponds to Qy value for chlorophyll b (15,500 cm -1 ) 4 .
The insensitivity to the energy gap has several important biological implications. Firstly, the dissipationvia-S1 channel cannot be opened or closed in LHCs by modulating the energy gap. Secondly, there is no substantial difference between L1 and L2 sites, as sometimes is anticipated based on the fact Chla of L1 site belongs to the trio of lowest-energy Chls (so-called "terminal emitter") 4 . The dynamical arguments thus do not support the conjecture of L1 pair being the preferential quenching site. And last but not least, it is important to note that the established rate independence on energy gap actually provides the robustness and stability of the functioning the pairs within the fluctuating environments.
where A is the following matrix of rates: The rates of the matrix are defined as follows: = (4 ns) −1 is the decay rate of Chla in solution, = 1 ps −1 is an approximate effective rate of hopping from Chla to Chla, which is weighted for a pool-to-Chla transfer by an entropic factor = 6, which is the number of pool (non-L1/L2) Chlas; L1/L2 is the FRET rate from Chla to Lut in the corresponding L1/L2 site (back transfer is too slow to be considered), while −1 = 14 ps is the lifetime of Lut in solution.
The eigendecomposition of matrix A, given as = −1 (where is the vector of eigenvalues and C is the diagonalizing matrix), provides us with the solution to the Equation (9), and the lifetime of the whole complex is given by where Tr denotes a trace, and (0) = (

Coupling and lifetime dependence on the inter-pigment distance
To fully explore the inter-pigment configuration phase-space, we have also studied the distance dependence of the coupling. To that end we have displaced Chla by a vector (-1 Å,+1 Å,+1 Å) from its original position within L1 site (for directions see main text Fig. 1e,f). This parallel shift effectively increased the inter-pigment distance by ~1.73 Å. The resultant IES is shown in Supplementary Figure 7b. The coupling is reduced and somewhat loses the directional sensitivity as compared to the original IES (Supp. Fig. 7a), however, it does not fall off directly to zero. As in the main text we translate this difference into the LHC lifetime maps (Supp. Fig. 7c-e): the original map from the main text is repeated (c), and the re-calculated IES (panel b) is used either for L1 site (d) or both sites (e). In the latter case the LHC lifetime approaches the 2 ns, however, the significant part of the map still covers sub-2-ns lifetimes.