## Introduction

Since the first observation of strong coupling between light and matter states1, when the radiation field is quantised inside an optical resonator, the study of cavity quantum electrodynamics (cQED) has led to a variety of remarkable achievements culminating in the 2012 Nobel Prize in Physics2, as well as the advent of circuit QED and quantum computers3,4,5,6. While the former has been awarded for the observation of strong light-atom coupling7 and the latter is based on strong coupling with inorganic quantum mechanical objects (artificial atoms, Josephson junctions, etc.) at low temperatures, strong exciton-photon coupling using organic materials is a more recent development8,9. It allows the manipulation of physical and chemical properties of matter—such as the interaction of a system with its environment10 or energy ordering between singlet and triplet states11,12—at room temperature due to the large transition dipole moment of organic molecules and their aggregates, which leads to larger coupling strengths with the electromagnetic field compared to individual atoms or inorganic semiconductors13. This manipulation of a system’s ground- and excited-state properties through the formation of hybrid light-matter states, so-called polaritons, has led to applications including polariton-assisted singlet fission14, energy transfer in donor-acceptor systems15,16, long-range polariton (and thus exciton) transport17, the enhancement of conductivity18, the formation of Bose-Einstein condensates19, and many more.

Extensive and recent reviews of molecular cQED photophysics and the impact on excited- as well as ground-state chemistry can be found in the literature20,21,22,23,24, and a more rigorous and quite informative quantum description is found in Oppermann et al.25 (and references therein). Briefly, to achieve strong coupling the resonance frequency of a microcavity is matched to the transition frequency of a two-level system (TLS) within it. In addition, the coupling strength g0 needs to exceed the damping rate of the cavity mode κ, as well as that of the matter polarization γ, which is achieved by controlling the reflectivity of the microcavity mirrors. Whenever these two conditions are satisfied, the TLS resonantly exchanges energy with the cavity mode (and vice versa) and hybrid light-matter states emerge. These polaritons are described theoretically by the Jaynes-Cummings Hamiltonian with the Eigenstates

$$\left|{{\rm{P}}}^{+}\right\rangle =\alpha \left|e,0\right\rangle +\beta \left|g,1\right\rangle \\ \left|{{\rm{P}}}^{-}\right\rangle =\beta \left|e,0\right\rangle -\alpha \left|g,1\right\rangle ,$$
(1)

where $$\left|e\right\rangle$$ and $$\left|g\right\rangle$$ are the excited and ground state of the TLS, $$\left|0\right\rangle$$ and $$\left|1\right\rangle$$ are the two lowest Fock states of the cavity, and α2 and β2 are the Hopfield coefficients, the ratios of light-matter character of the polaritonic states. Whenever N TLSs couple to the cavity, as is typical for molecular cQED systems, the system is described by the Tavis-Cummings (TC) Hamiltonian, resulting in a collective coupling strength $$g={g}_{0}\sqrt{N}$$ and the delocalisation of the system’s wave function over a large number (~105) of TLSs within the mode volume10,20. The system described by the TC model is composed of N + 1 collective states: the two polaritonic states P and P+, and N − 1 spectroscopically “dark” excitonic states. In molecular cQED the description of the matter states as TLSs is no longer sufficient, as it does not account for the complexity of molecules and molecular aggregates26. Subtleties such as disorder-induced photonic intensity borrowing27,28,29,30,31,32,33 and the presence of “bright” uncoupled excitonic states with allowed transitions from the ground state20,30 are important for the systems’ photo-induced dynamics, as arguably the excitonic states can act as intermediates during the energy relaxation26,29,30,31,32,34,35. Finally, due to the photon mode dispersion inside the cavity, the energies of P and P+, as well as the Hopfield coefficients, vary with the in-plane momentum k, which leads to optical properties that are dependent on the incidence angle of the interacting light field(s).

As a result of the light-matter character, the time scales relevant for the photo-induced non-equilibrium dynamics of strongly coupled systems are the dephasing times of both the electromagnetic field and the matter polarization inside the cavity13,36, as well as the excited-state lifetime of the excitons. The dephasing of the light field is mainly dictated by the photon lifetime inside the cavity, τc = 1/κ, which is on the order of femtoseconds to a few 10s of femtoseconds for metallic mirror cavities20. Decisive for the loss of matter polarization is the exciton dephasing time, which, at room temperature, is limited by fast stochastic fluctuations (pure dephasing) of the transition frequency. For TDBC (5,6-dichloro-2-[[5,6-dichloro-1-ethyl-3-(4-sulphobutyl)benzimidazol-2-ylidene]propenyl]-1-ethyl-3-(4-sulphobutyl)-benzimidazolium hydroxide, inner salt, sodium salt) J-aggregates in thin films, as used here, these fluctuations show a dynamic disorder of 26–30 meV at room temperature37,38,39, which corresponds to a typical dephasing time scale of  ~25 fs40. The excited-state lifetime of excitons in TDBC J-aggregates is  <10 ps in thin films41 and gives an upper limit for the non-equilibrium polariton dynamics, e.g. when the excitonic states, called J for their J-type coupling in the aggregate, serve as a reservoir during the energy relaxation, as suggested by degenerate, broadband transient transmission spectroscopy42. Results from transient absorption spectroscopy with selective excitation showed that—contrary to the energy reservoir model—for high coupling strengths, P is a quasi-bound state43, comparable with the formation of molecular excimers and exciplexes20,44,45.

Recently, polaritonic systems have been studied in even greater detail by two-dimensional Fourier transform (2DFT) spectroscopy. In the near-infrared and infrared, 2DFT spectroscopy allowed to observe multi-polariton coherences of quantum-wells coupled to semiconductor microcavities46,47 and to observe and manipulate optical nonlinearities of vibrational polaritons48. Visible 2DFT spectroscopy was used to probe the broadening and relaxation mechanisms in strongly coupled exciton-plasmon (plexitonic) systems40,49 and to obtain a detailed picture of the polariton-bath interaction in a molecule-microcavity system10. Specifically, the latter study exploited the sensitivity of lineshapes to bath-induced fluctuations of the system50. However, direct observation of the energy relaxation from P+ to P, a mechanism first considered by Agranovich et al. in hybrid organic-inorganic semiconductor microcavities8 and used by Lidzey et al. to describe steady-state photoluminescence distributions51, remains elusive.

With this in mind, we demonstrate here how broadband visible (500–750 nm; 13,300–20,000 cm−1) 2DFT spectroscopy can be used to follow the ultrafast dynamics in molecular cQED systems via cross-peak dynamics and find branching energy relaxation pathways that depend on different sample properties, such as cavity tuning, Rabi-splitting and k (incidence angles of the laser fields).

## Results

### Sample characteristics

A cross-sectional illustration of the exciton-microcavity samples under study is shown in Fig. 1a, where the empty cavity (blue shading) with Fock states $$\left|0\right\rangle$$ and $$\left|1\right\rangle$$ is coupled to N molecular excitons with ground and excited states $$\left|g\right\rangle$$ and $$\left|e\right\rangle$$ (red shading; absence of cavity), yielding the polaritonic system (purple shading). Figure 1b shows the different layers of the samples, consisting of a 100-nm silver back mirror on top of a glass substrate, a thin film layer of TDBC (see Methods for details and the inset in Fig. 1b for the molecular structure) J-aggregates in polyvinylalcohol (PVA), and a 25-nm, semi-transparent silver front mirror. The sample preparation is described in detail under Methods. All optical fields used to study the polaritonic system are incident on the sample at an angle 2φ and the absorption (A) is contained in the reflection (R) as A = 1 − R, due to the lack of transmission (Tr = 0) through the back mirror.

### 2D Fourier transform spectroscopy

We use broadband visible (500–750 nm; 13,300–20,000 cm−1) 2DFT spectroscopy to simultaneously resolve the excitation energy dependence and achieve the highest possible temporal resolution (<20 fs) of the transient response of TDBC strongly coupled to a cavity. In 2DFT spectroscopy50,52,53,54,55,56 the sample is excited by a pair of compressed, broadband visible laser pulses (with wavevectors k1 and k2) and Fourier transformation along the interpulse delay τ (coherence time) yields the excitation frequency dimension (ωτ). A third laser pulse (k3) is used to monitor the transient response of the excited system after it has evolved for a time T (evolution time) by generating a signal field (ksig) that is spectrally dispersed and heterodyne detected with a local oscillator (LO) pulse (kLO) to obtain the detection frequency dimension (ωt). This last step is the optical analogue of a Fourier transformation along the time t (detection time). The pulse sequence is illustrated in Fig. 1c and the technical details of the experimental setup are described in detail under Methods and by Al Haddad et al.57. Microscopically, a signal at a spectral position ωτ/ωt is described by a 3rd-order response function in which the system coherences during τ and t oscillate with frequencies ωτ and ωt58. Since 2D spectroscopy maximally resolves the signals from different 3rd-order response functions spectrally, the experimental data is ideal to extract energy relaxation kinetics56 and can be readily simulated by quantum mechanical models59,60.

### Spectral sample characterisation

Important for the discussion of the systems’ non-equilibrium dynamics is a thorough characterisation of their steady-state optical properties. The k-dependent absorption of sample 1 (2:1 mixture of 1 wt% TDBCaq:5 wt% PVAaq, spin coated at 1000 rpm, λ-mode cavity) is measured in reflection mode (Fig. 1b) and shown as a colourmap in Fig. 2a. The absorption (A) of the system is obtained directly from the specular reflection (R) as A = 1 − R − Tr = 1 − R, since the 100 nm thick back mirror ensures that the transmission (Tr) through the sample is zero. Losses due to scattering are neglected, which is a reasonable assumption considering the high quality of the mirror surfaces.

The lower (higher) energy band in Fig. 2a corresponds to the absorption into the P (P+) branch of the polariton. From their in-plane dispersion it is possible to obtain the vacuum Rabi splitting, the photon mode dispersion (red line in Fig. 2a, b), as well as the Hopfield coefficients (Fig. 2c–f), which describe the fractional photon-exciton character of the polariton branches, using a coupled-oscillator model (see Methods). The Rabi splitting of 1 is 0.28 eV and the anticrossing between the photon and exciton bands is found at k = 6 μm−1, corresponding to φ ≈ 33.5. At k = 0 μm−1 the energy of the cavity photon (red line) of 1 is ~0.06 eV below the k-independent exciton energy (black line) and P has a large photonic character, while P+ is predominantly excitonic in character. The cavity is said to be red-tuned, as the photon energy at k = 0 μm−1 is below the energy of the excitons. This tuning of the cavity allows modification of the photophysics of the cQED system by changing its energetic landscape via the cavity thickness13. Figure 2b shows the in-plane dispersion of sample 2 (1:1 mixture of 0.5 wt% TDBCaq:1 wt% PVAaq, 1000 rpm, λ/2-mode cavity), which has a Rabi-splitting of 0.36 eV, is red-tuned by 0.12 eV and shows an anticrossing of the P and P+ branches at k = 9.1 μm−1, corresponding to φ ≈ 50°. As discussed in the following paragraphs, these differences in cavity tuning and Rabi-splitting between the two samples lead to changes in the ultrafast transient 2D signal.

### Signals observed by 2D Fourier transform spectroscopy

Before discussing the φ-dependent differences between the 2D spectra of samples 1 and 2 and the non-equilibrium dynamics of sample 1, it is worth introducing the observed signal. The excitation energy ωτ of the 2D spectra in Figs. 35 is plotted as the x- and the detection energy ωt as the y-dimension, such that the transient signal at a certain ωτ is read out as a vertical cut through the 2D spectrum. Here, we show the magnitude of the complex 2D data (absolute 2D spectrum), which allows to follow the non-equilibrium dynamics of the excited system via the temporal evolution of lineshapes and intensities of the on- and off-diagonal peaks (vide infra). (For an absorptive 2D spectrum, phased by comparison to data from Schwartz et al.43, see Supplementary Fig. 1 and Supplementary Note 1.) The reason why the kinetics of the individual states can be followed simply by observing a change in intensity at specific ωτ and ωt is that the 2D spectrum of TDBC J-aggregates (Fig. 2 in Finkelstein-Shapiro et al.49) is a spectrally localised and well-defined 2D peak centred at the lowest energy optical transition frequency. For the TDBC-microcavity system the on-diagonal signals of the 2D spectra in Fig. 3a report on the P, J and P+ states, where the energy of J is independent of φ at 16,870 cm−1, while the peak position of the P (~15,700 cm−1 for sample 1 at 5°) and P+ (~17,850 cm−1 for sample 1 at 5°) bands varies with φ, due to the dispersion of the photon mode. The on-diagonal signal can be compared to the linear absorption spectrum of the system, which is plotted as black lines in Fig. 3a. Note that the on-diagonal peaks for P and P+ do not perfectly line up with the steady-state absorption spectrum, due to slight differences in angle φ and cavity thickness (the latter can vary across the sample due to inhomogeneities from the spin-coating process) between the 2D and absorption measurements. Optical excitation from the ground to either of the three excited states (P, J or P+) results in transient response of the sample that is encoded as a vertical cut at the respective ωτ (follow the arrows in Fig. 3b). The resulting 2D lineshapes report on the interaction of the strongly coupled system with its environment and cross-peaks between on-diagonal signals (labelled by the combination of excited and detected species P, J and P+, as depicted schematically in Fig. 3b) are indicative of the coupling between states via the common ground state, as well as energy relaxation within the system. The former manifests itself as cross-peaks above and below the diagonal due to the ground-state bleach (GSB), while the latter appears as an increase in intensity of the below-diagonal cross-peaks, as the excited-state absorption (ESA) and stimulated emission (SE) contributions change during the excited-state dynamics and energy transfer is energetically downhill for an energy splitting between states that is >kBTRT at room temperature (TRT). This ability to resolve individual signal (cross-) peaks, together with the fact that these are spectrally well isolated, allows to disentangle the kinetics of the system more rigorously than previously possible using transient absorption spectroscopy and highlights the strength of broadband visible 2DFT spectroscopy.

### Angular dependence of the transient 2D signal

After introducing the 2D signals, we now turn to the differences that can be observed between samples 1 and 2. The 2D spectrum of 1 at T = 100 fs and φ =  5° in Fig. 3a contains all the transient features described in the previous paragraph and illustrated in Fig. 3b.

But when sample 2 is measured at identical φ at T = 80 fs (Fig. 3c), the transient signal differs strongly: The on-diagonal peaks corresponding to P, J and P+ are still present, though P is shifted to lower energy due to the slightly larger red-tuning and stronger coupling of the system, but the intensity of all cross-peaks is significantly decreased. (Note that weak cross-peak signals can be observed at P/P+ and P+/P, whose peak sub-structure, mainly visible for the P+/P cross-peak of sample 2 at 5°, most likely results from the Fourier transformation of the discrete experimental data along τ → ωτ and should be considered an experimental artefact.) This is surprising, since a priori the system is also strongly coupled, i.e. P and P+ have a common ground state. However, the excitation and detection occurs further away from the anticrossing in the in-plane photon momentum space, where the coupling between light and matter states is reduced and P and P+ acquire a larger photon and exciton character, respectively (see Fig. 2c–f)13. It is thus likely that the cross-peaks due to coupling are simply too weak (below or in the range of the detection limit of the experiment) and cannot be observed easily, as discussed in more detail during the analysis of the temporal evolution of sample 1 below. It is further possible that the larger excitonic character of P+ results in an increased lifetime and thus a delayed appearance of P+/P, according to Litinskaya et al.61 and similar to Dunkelberger et al.62. When changing the photon/exciton character of P and P+ by increasing φ to 20°, while keeping the evolution time at T = 80 fs, a strong P+/P cross-peak appears below the diagonal in the 2D spectrum of sample 2 (Fig. 3d). This is indicative of an energy relaxation process that becomes possible in 2 at larger φ and in agreement with the fact that the polariton lifetime is assumed to depend on the cavity lifetime weighted by the photonic character61. Another explanation might be the presence of a relaxation bottleneck, as observed in J-aggregate exciton-microcavities63. Specifically, the energy gap of sample 2 at 5° exceeds twice the energy (first overtone) of the highest Raman-active mode of TDBC J-aggregates, while the energy gaps of sample 1 at 5° and sample 2 at 20° match overtones of Raman-active modes64. Potential effects of the k-components of the non-collinear laser pulses have been minimised experimentally by decreasing the crossing angle between the laser beam k-vectors in the 2D experiment to <1.6° (equivalent to 0.3 μm−1).

### Temporal evolution of the 2D spectrum

The evolution of the transient signal of sample 1 is measured in reflection geometry at φ = 5 and displayed as absolute value 2D spectra in Fig. 4. To ease the analysis, we will first focus on the temporal evolution of the lineshapes, before discussing the kinetics of on-diagonal signals and cross-peaks. Lineshapes in 2D spectroscopy report on the correlation between the excited and the detected transition frequency of the system, and on how it is affected by the environment10. The Lorentzian shape of the on-diagonal P signal is the result of fast, homogeneous spectral fluctuations of the polaritonic transition and is observed for the duration of the dataset (~1 ps). This indicates that the energy of P is only weakly coupled to the environment, similar to the results obtained by Takahashi and Watanabe10. The on-diagonal signal of P+ appears as an elliptical peak, characteristic of slow Gaussian spectral diffusion. Inhomogeneous disorder of the corresponding transition energy manifests itself as an elongation of the peak along the diagonal at early T and is explained by the fact that P+ is coupled to the inhomogeneously broadened exciton manifold J via the creation of a phonon27,36,65,66. Spectral diffusion leads to a loss of ωτ/ωt correlation, and is observed by the loss of diagonal elongation of the P+ peak within 150 fs. The lineshape of the strong below-diagonal cross-peak P+/P is reminiscent of P, but broadened in ωτ by the inhomogeneity of P+, as it reports on the energy relaxation from P+ to P (potentially via intermediate states).

To better assess the kinetics observed in the absolute 2D signal, time traces at combinations of excitation and detection energies ωτ and ωt corresponding to the on-diagonal peaks, as well as all cross-peaks, are displayed in Supplementary Fig. 2 and described in Supplementary Note 2. They show that the three on-diagonal signals (P, J and P+) appear instantaneously (within the temporal instrument response function of 15–20 fs) and decay on a commensurate, <100 fs time scale. Thereafter, we observe a slower, picosecond decay of the on-diagonal signals that is, unfortunately, not sampled sufficiently to extract reliable decay time constants. Cross-peaks arise due to the coupling between P, J and P+ and their more complex temporal behaviour provides detailed information about the system’s kinetics. When P is excited, the coupling with J and P+ results in a transient response of the sample at P/J and P/P+, respectively. The associated kinetic traces rise instantaneously and decay with a <100 fs time constant comparable to the three on-diagonal signals, suggesting that the same mechanism is responsible for their kinetics. Based on the predicted few 10s of femtoseconds dephasing times for both the cavity field and the excitonic coherences (discussed in the Introduction), this initial signal decay might be due to the loss of the system’s coherences, which, potentially, involves a partial recovery of the ground state via the loss of a photon from the cavity. Excitation of J results in an instantaneous cross-peak signal at J/P with a qualitatively longer decay, compared to the on-diagonal signals and P/J, hinting at an energy transfer from the excitonic states into the polaritonic state P. A corresponding cross-peak at J/P+ is absent, as evident from the time traces in Supplementary Fig. 2, as well as the 2D spectra in Fig. 4. However, its absence might simply be due to the fact that the intensity is below the detection limit of the experiment, as we do not expect a lack of coupling between J and P+. The latter assumption is based on the weak absorption cross-section of the dark states J, as well as the observation of a cross-peak at P/J, which indicates the coupling between P and J. Finally, excitation into P+ results in a weak cross-peak at P+/J and a strong one at P+/P, whose temporal kinetics are displayed in Fig. 5. The peak structure of the weak P+/J signal can be most clearly recognised at T ≈ 50 and 100 fs in Fig. 4, before it coalesces with the spectral shoulder of the strong P+/P peak. As discussed above, the transient signal at P+ (red trace in Fig. 5) shows a fast (<100 fs) decay that is also observed for all other on-diagonal signals, as well as the above-diagonal cross-peaks P/J and P/P+, and might thus be due to dephasing and the recovery of the common ground state, $$\left|g,0\right\rangle$$. The fast, <100 fs rise times of P+/J and P+/P are slightly delayed from the decay of the on-diagonal and above-diagonal signals, indicating the energy relaxation from P+ to J and P via the appearance of SE and ESA signals at P+/J and P+/P. To avoid the influence of spectral shifts of P+/P on the dynamics (see Methods), we follow its maximum intensity rather than the intensity at a certain frequency combination ωτ/ωt, as shown in Supplementary Fig. 3 and described in Supplementary Note 3.

### Discussion

An interesting question that arises in molecular cQED concerns the lifetimes of the polaritonic states and how the excitonic manifold is involved in the energy relaxation. While, at this point, we can only draw conclusions for the systems presented in this study, we allege that the methodology will provide answers to the above questions, in a systematic study with varying sample parameters. The most compelling argument for a direct P+ → P energy relaxation channel in sample 1 at 5°, is the concurrent rise of the SE/ESA signals at P+/J and P+/P within <100 fs. This suggests that relaxation from P+ into P occurs simultaneously with that from P+ into J, hinting at a branching ratio of ~50:50 based on the equal rate constants. The population of P directly from P+ can be further understood considering a rudimentary rate model (P+ →  J  → P) with and without an additional P+ → P relaxation channel (Supplementary Fig. 4 and Supplementary Note 4). This analysis assumes that the transient signals at P+, P+/J and P+/P can be associated with the population being in the respective states P+, J or P,which is reasonable considering the narrow linewidth of the transient J-aggregate signal40,49. The fact that excitation of the “bright” excitonic states results in a transient response at J/P, i.e. energy relaxes from J into the polaritonic state P, supports the hypothesis that also the energy relaxation channel P+ →  J  → P contributes to the transient population in P and thus the transient signal at P+/P. The similarity of the <100 fs energy relaxation time scale with the dephasing times of the cavity and excitonic coherences suggests that, as mentioned above, the excited-state dynamics are closely related to the dephasing processes within the system. In distinction to the downhill energy transfer described above, recent experimental10 and theoretical67 studies suggested that in similar systems population transfer happens from P to the dark, excitonic states, due to the overlapping density of states. Our results show that this effect does not happen for the present systems, since we do not observe a long-lived transient response of J after an initial population of P at P/J. This observation is rationalised by the relatively narrow energy distribution of the dark J-aggregate states and the large energy gap between P and J. The long-lived nature of P corresponds well to the observation that strong coupling can lead to the emergence of quasi-bound polaritonic states20,43,45. DelPo et al. report on strongly coupled charge-transfer transitions in 4CzIPN (1,2,3,5-tetrakis(carbazol-9-yl)-4,6-dicyanobenzene) molecules and show that the lifetime of the P+ state is significantly increased, compared to J-aggregate microcavities, due to the larger Rabi splitting68. While it is a priori not clear whether a comparison between the different systems is justified, the k-dependence observed in our study suggests that another parameter, namely the ratio of light-matter character, might need to be considered.

In conclusion, we have used 2DFT spectroscopy to observe energy relaxation channels in molecular cQED systems and show the influence of different sample parameters, e.g. varying the k-momentum or cavity tuning, on the non-equilibrium dynamics for a TDBC-microcavity system. Compared to transient absorption spectroscopy, 2DFT spectroscopy is able to resolve the transient response according to the initially excited transition, while simultaneously retaining the fastest possible temporal resolution, and allows to obtain a more detailed picture of the system kinetics. We thus find that upon excitation of the P+ transition, the energy relaxes to the P state within <100 fs, bypassing the excitonic state J initially. Our results are complementary to the recent work by Takahashi and Watanabe, who used visible 2DFT spectroscopy to investigate the interaction of molecular exciton-polaritons with the environment10. Therefore, 2DFT spectroscopy proves to be a useful methodology to study molecular cQED systems, especially with regards to systematically changing sample parameters, such as the Rabi-splitting, cavity tuning, as well as the in-plane photon momentum k.

## Methods

### Preparation of TDBC-microcavities

The reflection cavities were prepared on glass substrates, which were cleaned by sonication inside an alkaline solution (0.5% Hellmanex in distilled water), water and ethanol, for 1 h each. Prior to sample preparation, the substrates were oven-dried at >100 °C. A highly reflective, 100-nm thick silver mirror was deposited via vacuum sputtering (HEX, Korvus Technologies) on top of the glass substrates. An aqueous stock solution of TDBC (5,6-dichloro-2-[[5,6-dichloro-1-ethyl-3-(4-sulphobutyl)benzimidazol-2-ylidene]propenyl]-1-ethyl-3-(4-sulphobutyl)-benzimidazolium hydroxide, inner salt, sodium salt, Few Chemicals) was mixed with an aqueous solution of PVA (polyvinylalcohol, Sigma Aldrich, Mw ≈ 205,000 g mol−1) and the mixture was filtered using a 20-μm syringe filter prior to spin-coating (Laurell Technologies WS-650). The thickness of the TDBC+PVA film was varied by using different rotation speeds, varying from 500 to 3000 rpm, as well as different PVA concentrations, and the spinning time exceeded 180 s to allow the film to dry. A semi-transparent, 25-nm silver mirror was deposited on top of the TDBC+PVA film via vacuum sputtering and the cavities were characterised by angle-resolved reflection spectroscopy (Lambda 950, PerkinElmer). This procedure yielded a set of different thin film thicknesses, of which sample 1 (2:1 mixture of 1 wt% TDBCaq:5 wt% PVAaq, spin coated at 1000 rpm, λ-mode cavity) and sample 2 (1:1 mixture of 0.5 wt% TDBCaq:1 wt% PVAaq, 1000 rpm, λ/2-mode cavity) were chosen for 2DFT experiments. All samples were stored under vacuum to avoid oxidation of the silver mirrors.

### Static spectral characterisation

φ-dependent specular reflection spectra were measured on a standard UV-vis spectrometer (Lambda 950, PerkinElmer) with a variable angle accessory and analysed by a coupled harmonic oscillator model

$$\left(\begin{array}{ll}{E}_{{\rm{c}}}({k}_{\parallel })&\frac{\hslash {\Omega }_{{\rm{R}}}}{2}\\ \frac{\hslash {\Omega }_{{\rm{R}}}}{2}&{E}_{{\rm{x}}}\end{array}\right)\left(\begin{array}{l}\alpha \\ \beta \end{array}\right)=E\left(\begin{array}{l}\alpha \\ \beta \end{array}\right)$$
(2)

with the dispersion of the cavity photon energy Ec(k), the exciton energy Ex and the Rabi frequency ΩR, as well as the Hopfield coefficients α2 and β2. The in-plane momentum k is calculated from the incidence angle φ as

$${k}_{\parallel }=\frac{2\pi }{\lambda }\sin \varphi .$$
(3)

Diagonalization leads to an analytic expression for the P and P+ polariton branches with energies E and E+

$${E}^{\pm }({k}_{\parallel })=\frac{{E}_{{\rm{c}}}({k}_{\parallel })+{E}_{{\rm{x}}}}{2}\pm \frac{1}{2} \sqrt{{(\hslash {\Omega }_{{\rm{R}}})}^{2}+{\left({E}_{{\rm{c}}}({k}_{\parallel })-{E}_{{\rm{x}}}\right)}^{2}}$$
(4)

that is used to fit the k-dependent absorption data. Choosing Ex = 2.09 eV for TDBC J-aggregate excitons yields Ec(k), as well as the Hopfield coefficients α2 and β2 via the eigenvectors (see Hertzog et al.22).

### Two-dimensional Fourier transform spectroscopy

The 2DFT experiment has previously been described by Al Haddad et al.57. Briefly, the output (800 nm, 30 fs, 3 kHz) of an amplified Ti:Sa laser (Coherent Astrella) is frequency-broadened via self-phase modulation inside a hollow core fibre and the resulting white light (470–750 nm; 21,250–13,300 cm−1; after spectral filtering) pulse is temporally compressed using a set of dispersion compensating mirrors (Ultrafast Innovations, PC70). Fine-tuning of the pulse duration is achieved using a pair of fused silica (Suprasil) wedges (FemtoOptics, OA924) and the instrument-response function with a temporal duration of <20 fs is measured by transient-grating frequency-resolved optical gating inside a microscope slide at the sample position. After the wedges the laser beam is split into four replicas in a BOXCARS geometry, with the temporal pulse sequence depicted in Fig. 1c, by a combination of beam splitters (Thorlabs, EBS095, 1 mm thick) and mirrors, mounted on stable, home-built aluminium mounts. The design achieves a passive phase-stability of λ/60 and allows the pair-wise manipulation of laser pulse delays in a rotating frame57. A 50.8 mm diameter mirror with a focal length of 1 m focusses the four laser beams to a common spot, at which the cQED sample is positioned on an x-, y-, z-stage at an incidence angle φ. The angle between the laser beams is <1.6° to avoid spatial filtering69 and to ensure that the difference in k (the in-plane momentum of the photon field inside the cavities) is negligibly small (~0.3 μm−1). The 1 m focal length leads to a focal spot size of  ~300 μm in diameter, resulting in a laser fluence of <300 μJ cm−2 for typical laser pulse energies of <100 nJ. The reflected signal beam is spatially selected by an aperture, collimated and focused together with the co-propagating LO beam onto the entrance slit of an imaging spectrograph (Andor Shamrock 303i), equipped with a home-built line charge-coupled device (CCD) detector (Hamamatsu S11155-2048-02) that allows for shot-to-shot detection. To visualise the large dynamic range of the signals we use an arcsinh-scaling of the data, as $${I}_{{\rm{2D,rescaled}}}={\rm{arcsinh}}(F\cdot {I}_{{\rm{2D,normalised}}})$$. I2D,normalised has been normalised to its highest global value, i.e. highest value of the full dataset I2D(ωτTωt), and the factor F is chosen to yield the best visual result.