## Introduction

Light-harvesting (LH) antennas and photo-chemical reaction centers (RC) provide the elementary building blocks of the photosynthetic apparatus of plants, algae, and bacteria1. Primarily these molecular aggregates consist of absorbing molecules (pigments) complexed with specific proteins to form a PPC. Despite its fundamental importance to biology, the dynamical characterization of these complexes to a degree that can reproduce all reported spectroscopic data in a single microscopic model remains an outstanding challenge.

Reduced models of excitonic dynamics subject to purely thermal fluctuations can achieve reasonable agreement with linear optical spectra2,3,4,5,6,7,8,9. The quantitative explanation of all relevant aspects of multi-dimensional nonlinear spectroscopy though requires a more detailed model of the system-environment interaction that takes into account the full complexity of the environmental structure10. Indeed, spectroscopic studies of PPCs at low temperatures11,12,13,14 reveal the presence of vibrational environments that consist of a broad spectrum of low-frequency protein modes with room temperature energy scales, and several tens of discrete high-frequency modes that originate mainly from intra-pigment dynamics11,12,15. Nonlinear optical experiments on monomer pigments in solution at both 77 K16,17 and room temperature18,19, as well as first-principles calculations20,21 further corroborate the underdamped nature of intra-pigment vibrational modes with picosecond lifetimes.

Recently, a range of vibronic models in which pigments are subject to the combined influence of a broad unstructured bosonic environment and a small number of vibrational modes with frequencies in the vicinity of excitonic transitions have been formulated22,23,24,25,26,27,28,29,30,31. In this picture, vibrational lifetime borrowing can lead to long-lasting oscillatory dynamics of coherences between excitonic states, and observations of long-lasting oscillatory features in multi-dimensional spectroscopy32,33,34,35,36,37 have been attributed to this effect38,39,40,41,42,43. Notwithstanding, the identification of a universally accepted origin of these long-lived oscillations remains a subject of active discussion34,44,45,46,47.

An important obstacle that prevents the conclusive resolution of this debate is the fact that the interpretation of spectroscopic data and their underpinning dynamical features can be influenced significantly by the specific choice of electronic and vibrational parameters that enter the PPC models. We will demonstrate that by accounting for the full environmental spectral density, involving more than 50 intra-pigment modes per site in addition to a broad background, the presence of high-frequency long-lived vibrational modes can lead to quantitatively significant modification of the calculated linear spectra of PPCs and consequently the estimated values of electronic parameters to recover a best fit with actual measurements. These corrections do not appear when considering only selected resonant modes and go well beyond predictions obtained by using conventional line shape theory48,49,50,51.

To present our results, we provide an analytical theory of renormalization effects due to multi-mode vibronic mixing in model excitonic systems of two prototypical PPCs, namely the water-soluble chlorophyll-binding protein (WSCP) of cauliflower and the special pair (SP) of bacterial reaction centers, depicted in Fig. 1. By considering realistic environmental spectral densities, we corroborate our predictions using two independent numerically exact methods (the temperature-dependent time evolving density matrix using orthogonal polynomials algorithm, T-TEDOPA22,52,53,54, and the hierarchical equations of motion, HEOM55). We show that the hybridization of electronic and vibrational degrees of freedom requires a significant renormalization of electronic couplings. Importantly, this renormalization of electronic parameters, in turn, is shown to have a significant impact on the dynamics of excitonic coherences, notably the lifetimes of their oscillatory dynamics.

## Results

Electronic and vibronic couplings of PPCs. Absorption spectra of PPCs are determined by the electronic energy-level structure of pigments, their mutual electronic interactions and the coupling of the resulting excitons to vibrational degrees of freedom of the pigment’s environment. In the following, we will restrict our analysis to the Qy transition between electronic ground and first excited states of the pigments, which suffices for the evaluation of the low-energy part of absorption spectra and is relevant for photosynthetic energy transfer1. For the dimeric WSCP and SP, the electronic Hamiltonian is then described by (see Supplementary Note 1)

$${H}_{e}=\mathop{\sum }\limits_{i=1}^{2}{\varepsilon }_{i}|{\varepsilon }_{i}\rangle \langle {\varepsilon }_{i}|+V(|{\varepsilon }_{1}\rangle \langle {\varepsilon }_{2}| +| {\varepsilon }_{2}\rangle \langle {\varepsilon }_{1}|).$$
(1)

Here $$|{\varepsilon }_{i}\rangle$$ denotes the singly excited state of site i with on-site energy εi that is in the visible (WSCP) or in the near infrared spectrum (SP). The on-site energies depend on their local environment and therefore suffer from static disorder inducing ensemble dephasing that will be included in our numerical treatment. The electronic coupling V leads to delocalized electronic eigenstates (excitons), $${H}_{e}\left|{E}_{\pm }\right\rangle ={E}_{\pm }\left|{E}_{\pm }\right\rangle$$, and an excitonic splitting $${{\Delta }}={E}_{+}-{E}_{-}=\sqrt{4{V}^{2}+{({\varepsilon }_{1}-{\varepsilon }_{2})}^{2}}$$. In WSCP, the mean site energies are identical, 〈ε1〉 = 〈ε2〉, due to the symmetry of molecular structure, while in SP, the mean site energies are different as pigments are surrounded by nonidentical local protein environments. Another difference concerns the electronic coupling strength, which is stronger in SP due to electron exchange giving rise to short-range Dexter type contributions56,57.

The exciton dynamics of PPCs is driven by vibrational modes that induce fluctuations in the transition energies εi of pigments. The full electronic-vibrational interaction, induced by N vibrational modes per site, is described by the Hamiltonian H = He + Hv + Hev where

$${H}_{v}=\mathop{\sum }\limits_{i=1}^{2}\mathop{\sum }\limits_{k=1}^{N}{\omega }_{k}{b}_{i,k}^{{{{\dagger}}} }{b}_{i,k},$$
(2)
$${H}_{e-v}=\mathop{\sum }\limits_{i=1}^{2}|{\varepsilon }_{i}\rangle \langle {\varepsilon }_{i}|\mathop{\sum }\limits_{k=1}^{N}{\omega }_{k}\sqrt{{s}_{k}}({b}_{i,k}+{b}_{i,k}^{{{{\dagger}}} }).$$
(3)

Here the annihilation (creation) operator bi,k ($${b}_{i,k}^{{{{\dagger}}} }$$) describes a local vibrational mode of frequency ωk coupled to site i with a strength quantified by the Huang-Rhys (HR) factor sk. For an environment initially in a thermal state, the ensuing dynamics is fully determined by the environmental spectral density $$J(\omega )={\sum }_{k}{\omega }_{k}^{2}{s}_{k}\delta (\omega -{\omega }_{k})$$ whose structure needs to be determined experimentally or theoretically.

Structure of the environmental spectral density. Generally, in PPCs the spectral density J(ω) consists of a broad background and multiple sharp peaks distributed across a broad range of frequencies. These can be determined by fluorescence line-narrowing (FLN) and hole burning experiments which reveal that the environmental spectral densities of WSCP and SP consist of low-frequency broad features originating from protein motions, and 55 intra-pigment modes resulting in multiple narrow peaks in the high-frequency part of the spectrum. The contribution of the protein modes of WSCP may be described by log-normal distribution functions of the form $${J}_{l}^{{{{{{{{\rm{WSCP}}}}}}}}}(\omega )={\sum }_{m}(\omega {c}_{m}/{\sigma }_{m})\,\exp (-{[\ln (\omega /{{{\Omega }}}_{m})]}^{2}/2{\sigma }_{m}^{2})$$, which provides a satisfactory description of the low-energy part of experimentally measured FLN spectra of WSCP58. Alternatively, the protein motions of WSCP have been modeled by the following functional form: $${J}_{l}^{{{{{{{{\rm{B777}}}}}}}}}(\omega )=\frac{S}{{s}_{1}+{s}_{2}}\mathop{\sum }\nolimits_{i = 1}^{2}\frac{{s}_{i}}{7!2{\omega }_{i}^{4}}{\omega }^{5}{{{{{\rm{e}}}}}}^{-{(\omega /{\omega }_{i})}^{1/2}}$$ that has been extracted from FLN spectra of B777 photosynthetic complexes59 and considered in the simulations of WSCP60. Every underdamped intra-pigment mode contributes a Lorentzian of width γk ~ 1 ps−1, resulting in J(ω) = Jl(ω) + Jh(ω) where

$${J}_{h}(\omega )=\mathop{\sum }\limits_{k=1}^{55}\frac{4{\omega }_{k}{s}_{k}{\gamma }_{k}({\omega }_{k}^{2}+{\gamma }_{k}^{2})\omega }{\pi ({(\omega +{\omega }_{k})}^{2}+{\gamma }_{k}^{2})({(\omega -{\omega }_{k})}^{2}+{\gamma }_{k}^{2})},$$
(4)

and the reorganization energy of the high-frequency modes is given by $${\lambda }_{h}=\int\nolimits_{0}^{\infty }{{{{{\rm{d}}}}}}\omega {J}_{h}(\omega )/\omega =\mathop{\sum }\nolimits_{k = 1}^{55}{\omega }_{k}{s}_{k}$$. The reorganization energy of the 55 intra-pigment modes of WSCP13 (SP15) is 660 cm−1 (379 cm−1), which is several times larger than that of quasi-continuous protein spectrum58,61 and quasi-resonant intra-pigment modes with ωk ≈ Δ (see Supplementary Note 5). The presence of underdamped vibrational modes can lead to long-lived correlations between electronic and vibrational degrees of freedom that make the rigorous numerical treatment of the ensuing vibronic dynamics very costly. In non-perturbative HEOM simulations, where experimentally or theoretically estimated spectral densities are fitted by the sum of Drude–Lorentz peaks21,62, the simulation cost of a dimeric system exceeds several hundreds of terabytes when 55 intra-pigment modes are considered per site (see Supplementary Note 4) and, therefore, is infeasible with current computer architectures. In this work, we employ T-TEDOPA method where an experimentally estimated vibrational spectral density is mapped to a one-dimensional chain of quantum harmonic oscillators whose complexity is unaffected by the number of long-lived intra-pigment modes in the spectral density. We also employ optimized HEOM method where simulation parameters are determined by fitting the bath correlation function of highly structured environments for a finite time window corresponding to the line width of experimentally measured absorption spectra. These two methods enable one to consider the full environmental structures of WSCP and SP with a moderate simulation cost of the order of a few gigabytes or less (see Supplementary Notes 3 and 4). In addition, numerically exact results obtained by these two independent methods coincide, demonstrating the high accuracy and reliability of our simulated data (see Supplementary Note 6).

WSCP homodimer. The electronic parameters of PPCs have been estimated based on a comparison of experimentally measured spectroscopic data with approximate theoretical results where environmental structures are coarse-grained or vibronic couplings are treated perturbatively. Based on a coarse-grained spectral density $${J}_{l}^{{{{{{{{\rm{B777}}}}}}}}}(\omega )$$, shown in red in Fig. 2a, a best fit to the experimental absorption spectra of WSCP homodimers implies an electronic coupling strength estimate of V ≈ 70 cm−1 61, as shown in red in Fig. 2b. Such an electronic coupling results in an excitonic splitting Δ ≈ 2V ≈ 140 cm−1 which is consistent with the experimentally observed energy-gap between two absorption peaks at 656 and 662 nm, respectively. Since all the high frequency intra-pigment modes are neglected in the coarse-grained spectral density and the energy-gap between absorption peaks is smaller than the vibrational frequencies of the intra-pigment modes (Δ < ωk), the estimated value could be interpreted as the effective coupling V00 between $$\left|{\varepsilon }_{1},0\right\rangle$$ and $$\left|{\varepsilon }_{2},0\right\rangle$$ where $$\left|0\right\rangle$$ denotes the common vibrational ground state of the intra-pigment modes in the electronic excited state manifold. As shown in Fig. 3a, the transition dipole strength between $$\left|g,0\right\rangle$$ and $$\left|{\varepsilon }_{i},0\right\rangle$$ (0-0 transition) of a mononer is reduced by a factor of $$\exp (-{\sum }_{k}{s}_{k}/2)$$, as the total transition dipole strength of the monomer is redistributed to 0-1 transitions between $$\left|g,0\right\rangle$$ and $$\left|{\varepsilon }_{i},{1}_{k}\right\rangle$$ where only the k-th mode is singly excited (see Supplementary Note 10). As a result, the effective coupling between 0-0 transitions, shown in Fig. 3b, is reduced to $${V}_{00}=V\exp (-{\sum }_{k}{s}_{k})$$ depending on the HR factors sk of the intra-pigment modes. This implies that V00 ≈ 70 cm−1 corresponds to a bare electronic coupling $$V={V}_{00}\exp (\mathop{\sum }\nolimits_{k = 1}^{55}{s}_{k})\approx 2{V}_{00}\approx 140\,{{{{{{{{\rm{cm}}}}}}}}}^{-1}$$ under the full environmental spectral density $${J}_{l}^{{{{{{{{\rm{WSCP}}}}}}}}}(\omega )+{J}_{h}(\omega )$$, including the 55 intra-pigment modes shown in black in Fig. 2a. The renormalised electronic coupling V ≈ 140 cm−1 yields a best fit to experimentally measured absorption spectra, as shown in black in Fig. 2c, when all the M = 55 intra-pigment modes are considered in simulations. The energy-gap between absorption peaks is gradually reduced from excitonic splitting Δ ≈ 2V ≈ 280 cm−1 to $${{\Delta }}^{\prime} \approx 2{V}_{00}\approx 140\,{{{{{{{{\rm{cm}}}}}}}}}^{-1}$$, as the number M of the lowest-frequency intra-pigment modes considered in simulations is increased from 20 via 40 to 55 (see Fig. 2a, c). The electronic coupling V ≈ 70 cm−1 estimated based on the coarse-grained low-frequency spectral density cannot reproduce the experimental results when the full spectral density is considered in simulations, as shown in Fig. 2d. The energy-gap between absorption peaks shown in Fig. 2c, d can be quantitatively well described by the splitting of 0-0 transitions, $$2{V}_{00}=2V\exp (-\mathop{\sum }\nolimits_{k = 1}^{M}{s}_{k})$$, implying that the effective couplings V01 between 0-0 and 0-1 transitions, schematically shown in Fig. 3b, are not strong enough to modify the energy-gap between low-energy absorption peaks of WSCP. However, the weak V01 couplings can redistribute the transition dipole strength from 0-0 to 0-1 transitions and significantly modify the high-energy part of absorption spectra, which cannot be described by conventional line shape theory (see Supplementary Note 10).

Multi-mode vibronic mixing in exciton basis. In contrast to WSCP, the bare excitonic splitting of SP is of the order of the typical vibrational frequencies of the intra-pigment modes and the resulting redistribution of oscillator strengths and shifts of optical lines are much more difficult to predict. To qualitatively estimate these effects, we consider second-order perturbation theory starting from the full Hamiltonian H = He + Hv + Hev in the single-exciton manifold. In that case, the vibronic mixing is induced by the relative motion of the intra-pigment modes with identical frequency ωk, described by $${b}_{k}=({b}_{1,k}-{b}_{2,k})/\sqrt{2}$$, as the center of mass motion, described by $${B}_{k}=({b}_{1,k}+{b}_{2,k})/\sqrt{2}$$, merely induces the homogeneous broadening of absorption line shapes without affecting exciton dynamics (see Supplementary Note 1). Hence, we can discard the center-of-mass part of the total Hamiltonian to find H = H0 + HI where

$${H}_{0}={H}_{e}+{H}_{v}+\cos (\theta )\,{\sigma }_{z}\mathop{\sum }\limits_{k=1}^{55}{\omega }_{k}\sqrt{{s}_{k}/2}({b}_{k}+{b}_{k}^{{{{\dagger}}} }),$$
(5)

with $${H}_{v}={\sum }_{k}{\omega }_{k}{b}_{k}^{{{{\dagger}}} }{b}_{k}$$, and

$${H}_{I}=-\!\sin (\theta )\,{\sigma }_{x}\mathop{\sum }\limits_{k=1}^{55}{\omega }_{k}\sqrt{{s}_{k}/2}({b}_{k}+{b}_{k}^{{{{\dagger}}} }).$$
(6)

Here $$\theta ={\tan }^{-1}[2V/({\varepsilon }_{1}-{\varepsilon }_{2})]$$, while $${\sigma }_{x}=\left|{E}_{+}\right\rangle \langle {E}_{-}| +| {E}_{-}\rangle \left\langle {E}_{+}\right|$$ and $${\sigma }_{z}=\left|{E}_{+}\right\rangle \langle {E}_{+}| -| {E}_{-}\rangle \left\langle {E}_{-}\right|$$ are the Pauli matrices in the exciton basis. The Hamiltonian H0 is diagonalised by the polaron transformation in the exciton basis, $$U=\left|{E}_{+}\right\rangle \left\langle {E}_{+}\right|{D}_{\theta }+\left|{E}_{-}\right\rangle \left\langle {E}_{-}\right|{D}_{\theta }^{{{{\dagger}}} }$$ with $${D}_{\theta }=\exp [\cos (\theta ){\sum }_{k}\sqrt{{s}_{k}/2}({b}_{k}^{{{{\dagger}}} }-{b}_{k})]$$. For typical HR factors of PPCs, of the order of sk 0.01, the vibronic mixing is dominated by contributions from the single vibrational excitation subspace where it leads to eigenstates of H of the form

$$\left|{\psi }_{\pm }\right\rangle ={a}_{\pm ,0}\left|{E}_{\pm },0\right\rangle +\mathop{\sum }\limits_{k=1}^{55}{a}_{\mp ,{1}_{k}}\left|{E}_{\mp },{1}_{k}\right\rangle ,$$
(7)

with $$\left|0\right\rangle$$ and $$\left|{1}_{k}\right\rangle$$ representing vibrational states where all the intra-pigment modes are in their ground states or only one mode described by bk is singly excited. In second-order perturbation theory, these vibronic eigenstates $$\left|{\psi }_{\pm }\right\rangle$$ have energies

$${E}_{\pm }^{\prime}={E}_{\pm }\pm \alpha \frac{2{V}^{2}}{{{{\Delta }}}^{2}}\mathop{\sum }\limits_{k=1}^{55}\frac{{s}_{k}{\omega }_{k}^{2}}{{{\Delta }}\mp {\omega }_{k}},$$
(8)

and the purely excitonic splitting Δ = E+ − E is shifted to a vibronic splitting

$${{\Delta }}^{\prime} ={E}_{+}^{\prime}-{E}_{-}^{\prime}={{\Delta }}\left(1+\alpha \frac{4{V}^{2}}{{{{\Delta }}}^{2}}\mathop{\sum }\limits_{k=1}^{55}\frac{{s}_{k}{\omega }_{k}^{2}}{{{{\Delta }}}^{2}-{\omega }_{k}^{2}}\right),$$
(9)

where $$\alpha =\exp (-2{\cos }^{2}(\theta )\,\mathop{\sum }\nolimits_{k = 1}^{55}{s}_{k})$$. These energetic corrections are in complete analogy to the well-known light shifts in atomic physics. The sign of these energy shifts is determined by the difference in excitonic splitting and vibrational frequency, Δ − ωk. We note that the vibronic energy renormalization can also be described in the regular electronic-vibrational basis without the polaron transformation using second order perturbation theory (see Supplementary Note 2).

For an excitonic splitting that is smaller than the vibrational frequencies, Δ ωk, the energy-gap $${{\Delta }}^{\prime}$$ between vibronic eigenstates $$\left|{\psi }_{+}\right\rangle$$ and $$\left|{\psi }_{-}\right\rangle$$ is reduced compared to the bare excitonic splitting Δ (see Fig. 4a). This is in line with our numerically exact simulations of WSCP where the bare excitonic splitting Δ ≈ 2V is reduced to $${{\Delta }}^{\prime} \approx 2{V}_{00}\approx V$$. It is notable that for PPCs consisting of chlorophylls or bacteriochlorophylls, the HR factors of the intra-pigment modes are of the order of sk ≈ 0.01, independent of the vibrational frequencies ωk. In case the excitonic splitting is significantly smaller than the vibrational frequencies of the intra-pigment modes, the detuning between them is well approximated by Δk = ωk − Δ ≈ ωk, thus exhibiting the same scaling in ωk as the electronic-vibrational coupling, $${g}_{k}={\omega }_{k}\sqrt{{s}_{k}}$$. This implies that the coupling of higher-frequency modes increases with the detuning Δk so that they cannot simply be ignored on the basis of being off-resonant.

When the excitonic splitting is larger than the vibrational frequencies, Δ ωk, the situation is reversed (see Fig. 4b), resulting in an increased vibronic splitting $${{\Delta }}^{\prime}$$ compared to the bare excitonic splitting Δ. This case cannot be described by the splitting of 0-0 transitions, since the effective coupling $${V}_{00}=V\exp (-{\sum }_{k}{s}_{k})$$ is smaller in magnitude than a bare electronic coupling V for arbitrary HR factors defined by sk ≥ 0. This implies that the mixing of 0-0 and 0-1 transitions can result in two absorption peaks with an energy gap $${{\Delta }}^{\prime}$$ being larger than the bare excitonic splitting Δ.

Special pair in bacterial reaction center. The photosynthetic reaction center which drives exciton dissociation into free charges consists of the SP and four additional pigments63. The SP is a strongly coupled dimeric unit with an electronic coupling estimated to be V = 625 cm−1, a difference in mean site energies of 〈ε1 − ε2〉 = 315 cm−1 and consequently a bare excitonic splitting of Δ ≈ 1290 cm−1. These electronic parameters have been estimated based on a best fit to absorption, linear dichroism, and hole burning spectra of bacterial reaction centers using conventional line shape theory51. In what follows, we neglect the order of magnitude weaker electronic coupling of the SP to the four additional pigments and do not aim to reproduce experimentally measured absorption spectra of the whole bacterial reaction centers and re-estimate electronic parameters. Rather we concentrate on the effect of multi-mode vibronic mixing on the SP and its consequences regarding the nature and lifetimes of excitonic coherence and long-lived oscillatory signals in 2D electronic spectra.

While in WSCP the excitonic splitting is far detuned from high-frequency modes, the situation is markedly different for the SP. Here the environmental spectral density contains high-frequency intra-pigment modes both above and below the bare excitonic gap, as shown in black in Fig. 5a. The smaller frequency differences between vibrational modes and excitonic splitting and the varying sign of their detuning makes the effect of multimode mixing harder to predict analytically. Indeed, the perturbation procedure for obtaining Eq. (9) will be inaccurate for a larger number of modes. The vibronic splitting can be estimated beyond the perturbation theory by numerically diagonalising the Hamiltonian H = H0 + HI in Eqs. (5), (6), leading to $${{\Delta }}^{\prime} \approx 1744\,{{{{{{{{\rm{cm}}}}}}}}}^{-1}$$ (see Supplementary Note 7). This estimate is in line with numerically exact simulated results where the energy-gap between absorption peaks is approximately 1710 cm−1 (see 780 and 900 nm peaks in Fig. 5b, corresponding to $$\left|{\psi }_{+}\right\rangle$$ and $$\left|{\psi }_{-}\right\rangle$$, respectively) and the oscillatory dynamics of excitonic coherence is dominated by 1755 cm−1 frequency component (see Fig. 5c). We note that the difference between excitonic and vibronic splittings is significant, of the order of $${{\Delta }}^{\prime} -{{\Delta }}\approx 465\,{{{{{{{{\rm{cm}}}}}}}}}^{-1}$$, and this shift cannot be described by conventional line shape theory where multi-mode vibronic mixing is ignored and as a result the energy-gap between absorption peaks is reduced to the excitonic splitting (see the inset in Fig. 5b).

Long-lived multi-mode vibronic coherence. The considerable size of the multi-mode mixing effects on excitonic energy gaps suggest a possibly significant influence on coherent excitonic dynamics. The coarse-grained spectral density shown in red in Fig. 5a, which corresponds to a vibrational lifetime of γk = (50 fs)−1, yields short-lived oscillatory dynamics of excitonic coherence $${\rho }_{\pm }(t)=\left\langle {E}_{-}\right|{\hat{\rho }}_{e}(t)\left|{E}_{+}\right\rangle$$ with $${\hat{\rho }}_{e}(t)$$ denoting reduced electronic density matrix (see red line in Fig. 5c). Even if a few intra-pigment modes near-resonant with excitonic splitting are selected to be weakly damped, γk = (1 ps)−1, the vibronic mixing with the large number of remaining strongly-damped modes, γk = (50 fs)−1, suppresses the lifetime of excitonic coherences, making the resulting dynamics essentially identical to that where all the modes are strongly damped (see Supplementary Note 7 for detailed analysis of multi-mode vibronic mixing). In sharp contrast, when the picosecond lifetime of actual intra-pigment modes is considered, γk = (1 ps)−1, the excitonic coherence dynamics is dominated by long-lived oscillations with frequency $${{\Delta }}^{\prime} \approx 1755\,{{{{{{{{\rm{cm}}}}}}}}}^{-1}$$, associated with the vibronic coherence between $$\left|{\psi }_{+}\right\rangle$$ and $$\left|{\psi }_{-}\right\rangle$$ states (see black line in Fig. 5c).

In 2D electronic spectroscopy, the third-order nonlinear optical response of molecular systems is measured by using a sequence of femtosecond pulses with controlled time delays64,65. As is the case of pump probe experiments66, electronically excited state populations and coherences can be created by a pair of pump pulses, and the molecular dynamics in the electronic excited state manifold can be monitored by controlling the time delay T between pump and probe. The additional time delay between two pump pulses enables one to monitor the molecular dynamics as a function of excitation and detection wavelengths for each waiting time T. The optical transitions induced by the pump pulses can also create vibrational coherences in the electronic ground state manifold, making it challenging to extract the information about coherent electronic dynamics from multidimensional spectroscopic data46.

Our numerically exact simulations of the SP demonstrate that long-lived oscillatory signals in 2D electronic spectra can originate from purely vibrational coherences or from vibronic coherences induced by multi-mode mixing. The latter have been ignored in previous numerical studies which considered only a few intra-pigment modes quasi-resonant with excitonic splitting and neglected all the modes that are far detuned from excitonic transitions as they were deemed to have a negligible effect67. However, the correct assessment of the nature of oscillatory 2D signals requires the computation of 2D spectra under the influence of the full spectral density. In order to make such computation feasible, in Supplementary Note 8, we provide an approximate master equation for vibronic dynamics, which takes into account multi-mode mixing effects and quantitatively reproduces numerically exact absorption line shape of the SP. Figure 5d shows the resulting rephasing 2D spectra at waiting time T = 0 in the presence of inhomogeneous broadening. The 2D lineshape, shown as a function of excitation and detection wavelengths, is dominated by a diagonal peak excited and detected at 900 nm which coincides with the position of the main absorption peak (see Fig. 5b). To investigate the excited state coherence between vibronic eigenstates $$\left|{\psi }_{+}\right\rangle$$ and $$\left|{\psi }_{-}\right\rangle$$, which induce the absorption peaks at 780 and 900 nm, respectively, we focus on a cross-peak R12 marked in Fig. 5d. Figure 5e shows the transient of the cross-peak as a function of the waiting time T where the oscillatory 2D signals originating from electronic ground state manifold, shown in red, are comparable to those of excited state signals, shown in blue. The ground state signals consist of multiple frequency components below 1600 cm−1, corresponding to the vibrational frequencies ωk of underdamped intra-pigment modes, as shown in Fig. 5f. It is important to note that the excited state signals include a long-lived oscillatory component with frequency ~1800 cm−1, which is not present in the ground state signals and cannot originate from purely vibrational effects as they exceed the high-frequency cut-off of the environmental spectral density (see Fig. 5a). This component must therefore originate from long-lived vibronic coherence due to multi-mode mixing. The long-lived oscillations at $${{\Delta }}^{\prime} \approx 1800\,{{{{{{{{\rm{cm}}}}}}}}}^{-1}$$ frequency cannot be described by coarse-grained environment models where only a few intra-pigment modes near-resonant with the excitonic splitting Δ ≈ 1300 cm−1 are weakly damped (γk = (1 ps)−1), while all the other intra-pigment modes are strongly damped (γk = (50 fs)−1) or neglected (sk = 0) in 2D simulations (see Supplementary Note 8). Our results demonstrate that while some oscillatory components in 2D spectra can originate from purely vibrational motions, long-lived 2D oscillations can also be the result of a strong vibronic mixing of excitons with a large number of underdamped intra-pigment modes.

## Discussion

Employing numerically exact methods and an analytical theory, we have investigated exciton-vibrational dynamics under the complete vibrational spectrum that has been estimated in earlier experiments. We considered two paradigmatic regimes. The first regime, represented by an excitonic dimer in WSCP, is characterized by an excitonic splitting that is smaller than vibrational frequencies of intra-pigment modes. In this case, one main effect of vibronic coupling to the intra-pigment modes is a reduction of the dipole strength of 0-0 transitions of monomers and of their effective coupling strength V00 that determines the splitting between absorption peaks in the low-energy spectrum. A second important effect concerns the modulation of the vibrational sideband of optical transitions by a vibronic mixing between 0-0 and 0-1 transitions. Although the vibronic mixing is not strong enough to modulate the low-energy part of absorption spectra of WSCP, it can induce a notable dipole strength redistribution between 0-0 and 0-1 transitions, which cannot be described by approximate theories where the vibronic mixing is ignored.

In the second regime, represented by the SP of the photosynthetic reaction center of purple bacteria, the excitonic splitting is located in the middle of the high frequency part of the intra-pigment vibrational spectrum. In this case, the splitting between main absorption peaks can be even larger than the bare excitonic splitting, due to multi-mode vibronic mixing effects. This regime is found to be particularly suitable for the discovery of new long-lived quantum coherences in photosynthesis. We found that the coherence time of excitonic dynamics is not simply governed by the lifetime of quasi-resonant intra-pigment modes. Rather it is determined by the lifetimes of individual intra-pigment modes involved in a multi-mode vibronic mixing. This implies that approximate theoretical models based on coarse-graining of the high frequency part of the vibrational environments21 may underestimate the lifetime of excitonic coherences and could be inappropriate to analyze quantum coherences observed in nonlinear experiments on photosynthetic systems. In addition, our results demonstrate that even if the frequency $${{\Delta }}^{\prime}$$ of oscillatory 2D signals is not well matched to one of the vibrational frequencies ωk of intra-pigment modes, the long-lived 2D oscillations can be vibronic in origin, rather than being purely electronic, as is the case of the SP where $${\omega }_{k}\,\lesssim\, 1600\,{{{{{{{{\rm{cm}}}}}}}}}^{-1}\, < \,{{\Delta }}^{\prime} \approx 1800\,{{{{{{{{\rm{cm}}}}}}}}}^{-1}$$. This implies that the origin of long-lived oscillatory 2D signals cannot be identified based only on a comparison of the frequency spectrum of nonlinear signals with the vibrational frequency spectrum of underdamped modes. Hence, we contend that previously ignored multi-mode vibronic effects must be included in the interpretation of nonlinear spectroscopic signals before the current debate regarding the presence and nature of long-lived quantum coherences in pigment-protein complexes can be settled conclusively.

Our results suggest the possibility that the energy transfer dynamics between electronic states, such as excitons and charge-transfer states, could be governed by the multi-mode nature of the total vibrational environments, rather than a few vibrational modes quasi-resonant with electronic energy-gaps (see Supplementary Note 9). The generality of the methods employed here also suggests that our results have a broad scope and can be of relevance in a wide variety of scenarios involving strong hybridization of electronic and vibrational degrees of freedom, such as recent observations of nonadiabatic dynamics in cavity polaritonics68,69. We expect that renormalization effects considered here may open an entirely new toolbox for vibrational reservoir engineering with possible applications in information technologies and polaritonic chemistry.