Interactions between Fermi polarons in monolayer WS2

Interactions between quasiparticles are of fundamental importance and ultimately determine the macroscopic properties of quantum matter. A famous example is the phenomenon of superconductivity, which arises from attractive electron-electron interactions that are mediated by phonons or even other more exotic fluctuations in the material. Here we introduce mobile exciton impurities into a two-dimensional electron gas and investigate the interactions between the resulting Fermi polaron quasiparticles. We employ multi-dimensional coherent spectroscopy on monolayer WS2, which provides an ideal platform for determining the nature of polaron-polaron interactions due to the underlying trion fine structure and the valley specific optical selection rules. At low electron doping densities, we find that the dominant interactions are between polaron states that are dressed by the same Fermi sea. In the absence of bound polaron pairs (bipolarons), we show using a minimal microscopic model that these interactions originate from a phase-space filling effect, where excitons compete for the same electrons. We furthermore reveal the existence of a bipolaron bound state with remarkably large binding energy, involving excitons in different valleys cooperatively bound to the same electron. Our work lays the foundation for probing and understanding strong electron correlation effects in two-dimensional layered structures such as moiré superlattices.

The sample is slightly overfilled by the laser spots, but because there is no response from the substrate this doesn't affect the measurements. The scalebar is 50 µm. b, Absorbance of the sample (red) at T=6 K, and MDCS pulse spectrum (black).
To quantify the absorption of the monolayer, reflected spectra of the local oscillator (LO) were collected from the WS 2 flake and the adjacent substrate at 6 K. The spectra were normalised, to account for slight variations in the laser spectrum, and the difference between the spectra was found and taken to approximate the absorption of the flake. While this measurement is simplistic, after accounting for the narrowing and blueshift of the excitonic resonances it agrees well with previous, more rigorous calculations performed using the transfer matrix method performed at room temperature [1]. A thin film model was also applied to calculate the reduction in the incident laser intensity at the location of the monolayer on the layered SiO 2 /Si substrate (300 nm thermal oxide) due to interference effects [2]. This found that due to the interference of the incoming and reflected waves from the substrate the intensity at the location of the WS 2 monolayer was reduced to around 41% of the incident intensity at the peak of the A exciton. This reduction was taken into account when calculating the exciton density in the WS 2 monolayer.

B. Estimating electron doping
For the MDCs measurements we have laser spot sizes on the order of 50 µm diameter, which means that to minimise scatter we need a clear monolayer area of similar size. This makes it difficult to obtain a precise measure of electron density, as having contacts on the sample becomes problematic. We took two different approaches to estimate the electron density.
First, we took gate voltage dependent measurements on another identically-prepared sample taken from the same bulk crystal. There are problems with such measurements on un-encapsulated monolayer WS 2 due to the adsorption and de-adsorption of environmental molecules which effectively change the Fermi level. Differences then appear depending on whether the sample is in vacuum or in atmosphere, and over time when an applied gate voltage increases the electron density, as discussed in detail below. With these challenges, it was only possible to place an upper limit on the difference in Fermi energy from atmospheric to vacuum conditions of 15 meV. If we assume that the atmospheric conditions effectively quench any intrinsic doping, then this also places an upper limit on the Fermi energy for the MDCS measurements, which were conducted in vacuum.
The second approach was to compare the ratio of the exciton and trion peaks in the photoluminescence and low-temperature absorption measurements with literature values, since this ratio is strongly dependent on electron density. In our absorption measurements the exciton peak amplitude is ∼ 5 times larger than the trion peak. There are limited measurements of absorption spectra as a function of doping density, but theory work looking at MoS 2 showed that for a Fermi energy of 4.9 meV an exciton absorption five times the trion absorption was reported [3]. This ratio is similar to what we see in our WS 2 monolayer under vacuum at 6K, and can provide a rough estimate. However, there has been no experimental confirmation, and the different conduction band ordering in WS 2 may change things significantly. For room temperature PL measurements, reported electron densities typically range from < 1 × 10 11 to 2 × 10 12 cm −2 but without a consistent quantitative trend for the relative peak amplitudes. Perhaps the most reliable report, where they have used two independent methods to determine the doping density, shows that even for electron densities less than 1 × 10 11 cm −2 the trion peaks remain stronger than the exciton peak in photoluminescence measurements [4]. In the room temperature PL measurements in Supplementary Fig. 2b, the PL from the trion is less than two times larger than the exciton, and consistent with a doping density ∼1 × 10 11 cm −2 . We used this value as a starting point for our modelling, but note that there remains significant uncertainty. In the modelling we varied this doping density and found that the best match with the experiment was obtained with a value of 1 × 10 11 cm −2 , and results consistent with the experiments were obtained for values up to 3 × 10 11 cm −2 .
C. Vacuum effect and gate voltage dependent measurements in monolayer WS 2 To investigate the effect of vacuum on the optical response of the WS 2 used in this study, we performed photoluminescence (PL) measurements under non-resonant excitation with a Nd:YAG continuous wave (cw) laser source (λ ex =532 nm) in both air and vacuum at ambient temperature. The respective spectra were fitted with three Voigt profiles for neutral excitons (X), charged excitons (X − ) and localized excitons (X L ) with the energies E X ≈ 2.02 eV, E X − ≈ 1.99 eV and E X L ≈ 1.93 eV, which is in agreement with previous reports [5,6]. In air ( Supplementary Fig. 2a), the PL stems mostly from neutral exciton emission with the exciton spectral weight of w X ≈ 0.91±0.05. However, the asymmetric line shape points to a non-negligible contribution of charged and localized excitons, which are represented by the fitted red and green shaded peaks in Supplementary Fig. 2, respectively.
Localized excitons can form in a WS 2 crystal at certain defects [7] and charged excitons form when the monolayer is doped [8]. In addition, defects can increase the level of doping and enhance the formation of charged and localized excitons [9]. Physisorption gating [10] by oxygen and water molecules can compensate this defect-induced doping, which makes the resulting effects on the PL less significant in air. However, when placing the monolayer into vacuum (p ≈ 1×10 −6 mbar), the physisorbed molecules are released from the sample surface, and the contributions of the charged and localized excitons to the PL spectrum increase dramatically ( Supplementary Fig. 2b) with the spectral weights of w X − ≈ 0.56±0.10 and w X L ≈ 0.18±0.08, respectively.
To measure the difference between the Fermi levels in air and vacuum, we emulated the vacuum effect (i.e. uncompensated defect-induced doping) by contacting a monolayer WS 2 sample on top of a highly doped Si/SiO 2 substrate (t SiO 2 ≈ 300 nm, ϵ SiO 2 ≈ 3.9 [11]) acting as a backgate, as illustrated schematically in Supplementary Fig. 2c, which provides the Supplementary Figure 2 Physisorption gating measurements. a,b, Photoluminescence (PL) spectrum of a WS 2 monolayer (a) in air and (b) in vacuum (orange solid lines). c, Schematic illustration of a backgated WS 2 monolayer. d, PL spectrum of a gated WS 2 monolayer in air at V G =30 V. The PL spectra in panels (a,b,d) were fitted with three Voigt profiles (black dashed lines) for the neutral (blue shaded), charged (red shaded) and localized excitons (green shaded). e, PL spectra in air at a gate voltage of 60 V measured 10 (red), 11 (black), and 12 (blue) minutes after the voltage was applied. ability to tune the Fermi level across the band gap. Supplementary Figure 2d shows the PL spectrum at V G = 30 V in air, with the spectral weight of the charged excitons, w X − ≈ 0.66±0.04, similar to their spectral weight in vacuum. The approximate difference between the Fermi levels in vacuum and air can be calculated as [12]: where the effective electron mass in the conduction band is m * = 0.35 · m e [13].
Despite the significant redistribution of the spectral weight towards a charged exciton seen in a gated sample ( Supplementary Fig. 2d), we find that the spectral weight of the neutral excitons increases over time once a fixed voltage is applied ( Supplementary Fig.   2e). A feasible explanation is that, at higher voltages, more water and air molecules are able to bind to the monolayer surface, which leads to increasing physisorption gating that counteracts the external gating. Hence, the Fermi energy difference between the samples in air and in vacuum extracted above gives an upper limit of the actual value.

SUPPLEMENTARY NOTE 2: FURTHER EXPERIMENTAL DETAILS
A. Experimental geometry and pulse sequences Supplementary Figure 3 Experimental laser and pulse geometry. a, Four-beam box geometry used in our MDCS experiments, with the wavevector of each pulse labelled. The four-wave mixing signal emitted in the direction given by -k 1 + k 2 + k 3 overlaps with the local oscillator on the fourth corner of the box. The negative contribution from k 1 arises from a complex conjugate in the equations, which also impacts the time evolution of the phase of the signal. The k 1 pulse is thus marked with (*) for reference. For polarization control k 1 and k 3 are σ + , while k 2 and k LO can either be σ + or σ − corresponding to co-circular and cross-circular polarization schemes respectively. b, Different pulse orderings and delays being scanned for the different types of MDCS measurements; one-quantum (1Q), zero-quantum (0Q), and two-quantum (2Q). The arrow above the pulses indicates the time period that is scanned.
In the main text and Supplementary Information we report results from three types of MDCS measurements, referred to as one-quantum (1Q), zero-quantum (0Q), and twoquantum (2Q). Experimentally the difference between each of these measurements is simply related to the pulse ordering and which delay is scanned, as shown in Supplementary Fig.   3b. This is also impacted by the geometry of the experiments and which four-wave mixing signal is detected. Our experimental geometry is depicted in Supplementary Fig. 3a, which shows the measured signal is emitted in the direction given by -k 1 + k 2 + k 3 . The negative contribution from k 1 arises from a complex conjugate in the equations, which also impacts the time evolution of the phase of the signal. For 1Q measurements the delay between the first two pulses is scanned. In this time period the system evolves in a coherent superposition between ground and excited states. In the pulse ordering shown, with the k 1 pulse first and referred to a rephasing pulse ordering, any decrease of the macroscopic coherence as a result of inhomogeneous broadening can be "rephased" in the third time period due to the different direction of the phase evolution. This rephasing leads to a photon echo, and emission of the signal at a time t 3 = t 1 . It is this rephasing and photon echo that causes the narrowing of the 2D peaks along the anti-diagonal direction and reveals the homogeneous linewidth, such as in Fig. 2. In the case where the k 2 pulse arrives first, the phase evolution in the first and third time periods is in the same direction, and there is no rephasing. This is referred to as the non-rephasing pulse ordering. For 0Q 2D spectra the pulse ordering doesn't change, but instead of scanning t 1 , the delay between the second and third pulses, t 2 is scanned. For the 2Q scans shown below, the pulse ordering is changed so that the k 1 pulse arrives last. In this case the first two pulses generate a coherence between the ground and doubly excited state, and its phase evolution is measured by scanning the t 2 delay.

B. Intensity Dependence
Supplementary Figure 4 Power dependent measurement of the exciton. Integrated exciton signal from co-linear polarized 1Q spectra, and pulse fluence per beam calculated from laser specifications. Both axes are on a log scale to show data points follow a square dependence (dashed line). Measurements for this work were taken at ∼0.9 µJ cm −2 . The fluence error bars are calculated from small differences in power and uncertainty in the spot size measurements for the three excitation pulses.
To ensure that the MDCS measurements were conducted at an intensity at which the third-order susceptibility dominates (and contributions from higher order effects can be ignored), we measured the signal amplitude as a function of pulse fluence. The electric field amplitude of the four-wave mixing signal (proportional to the third-order response of the sample) scales as the product of the electric field amplitude, E, in each excitation pulse, that is, as E 3 . The signal is detected interferometrically by combining with the local oscillator (LO). The amplitude measured is therefore proportional to the product of the signal and LO electric fields. Thus, in the third-order regime, the measured signal amplitude should scale as E 4 , or intensity squared (I 2 ) (which is proportional to fluence squared). A sample response which deviates from this dependence is therefore indicative of higher order contributions.
Supplementary Figure 4 confirms our data follows a square dependence (dashed line) for fluences up to at least 8 µJ cm −2 per beam, indicating an excitation regime where thirdorder effects dominate and higher order effects are minimised [14].
To obtain an estimate of the exciton density we assume one absorbed photon creates one exciton. The laser photon density can be calculated by dividing the fluence per pulse which is roughly an order of magnitude smaller than our estimated total electron density of 1 × 10 11 cm −2 .

SUPPLEMENTARY NOTE 3: FULL 2D SPECTRA
Supplementary Figure 5 Full 1Q MDCS spectra for different polarization schemes. a-d, The 2D spectra over the full signal range for polarization schemes; (a) co-circular, (b) cross-circular, (c) cross-circular with t 2 =100 fs: σ + σ − σ + σ − , (d) cross-circular with t 2 =100 fs: σ + σ + σ − σ − . The colour scales in (a) and (b) are normalised to their respective exciton peak amplitudes, while (c) and (d) are normalised to the exciton peak in (b). The dotted box indicates the axes range used in the main text. In the σ + σ − σ + σ − ordering the phase evolution of the S-T coherent superposition shifts the centre of these cross-peaks in towards the diagonal. The roughly equal amplitudes of the attractive polaron peaks evident in (c) and (d) confirm that both pathways contribute equally to the cross-peaks discussed in the main text.
In the main text, we focus our investigation on the substructure of the attractive polaron region, leaving the repulsive polaron region out of discussion. In Supplementary Fig. 5a,b we show the full spectra for the two polarization schemes in the main text. These plots are dominated by the repulsive polaron diagonal peak, which is two times larger than the attractive polaron peaks in the co-circular, and four times larger in the cross-circular.. Taking into account the laser spectrum, which has intensity at the attractive polaron energy more than double what it is at the repulsive polaron energy, the nonlinear response of the attractive polarons is ∼ 10 − 20 times weaker than for the attractive polaron. Cross-peaks between attractive and repulsive polarons can also be identified, as well as a peak due to biexcitons [15]. As discussed in the main text, the cross-circular polarization data incor-porates two pathways as a result of having t 2 = 0: σ + σ − σ + σ − and σ + σ + σ − σ − ; that is, the order of the second and third light matter interactions can be swapped. In Supplementary Fig. 5c and d, these pathways are separated by setting t 2 =100 fs. On first inspection these 2D spectra show that the two different pathways contribute approximately equally, as predicted by the modelling. They are not identical, however, because in the σ + σ − σ + σ − scheme the systems evolves in t 2 as a coherent superposition of S and T polarons, and when t 2 =100 fs the phase of these peaks has changed, which, when added to the broad background signal, causes the centre of the cross-peaks to move in towards the diagonal.

YSIS
The measurements ( Supplementary Fig. 2 in the main text) show that the co-circularly polarized 2D spectrum only shows diagonal attractive polaron peaks, while the crosscircularly polarized 2D spectrum only had cross-peaks. To understand the nature of these peaks, we list every possible (30 total) quantum mechanical (QM) pathway in our co-circular and cross-circular polarization schemes. Supplementary Figure 6 summarises the QM pathways with Feynman-Liouville diagrams [16], where the listed |⟩ and ⟨| slot into the green (or orange) boxes, and their associated shapes indicate positions on the 2D spectra.
In the absence of interactions, these pathways will all cancel and there would be no signal.
Specifically, the excited state absorption (ESA) pathways would cancel with the respective stimulated emission (SE) and/or ground state bleach (GSB) pathway. In the case of the co-circular polarization (diagonal peaks), there is only one ESA pathway, but both SE and GSB pathways. Because the ESA pathway requires the creation of a second indistinguishable polaron, the transition dipole moment of that transition is a factor of √ 2 larger, leading to a signal that is twice the amplitude, which can thus cancel with the SE and GSB pathways.
In general, the presence of interactions leads to imperfect cancellation of these pathways, and the appearance of peaks in the 2D spectra.
This statement that there is no signal in the absence of interactions, even for diagonal peaks, may seem counter-intuitive to those accustomed to thinking of these experiments in the context of two-level systems (or few-level systems), where simply the presence of a bright state generates a peak on the diagonal. In the two-level system, there is no possibility of having two indistinguishable excitons in the same state, no ESA pathway to cancel the GSB and/or SE, and thus a signal appears on the diagonal. The inability to excite a second exciton is equivalent to having an infinite repulsive interaction between identical excitons.
Hence, the presence of a signal for a two-level system is consistent with this framework of requiring interactions for peaks to appear.
It is evident from the pathways and respective peak locations shown in Supplementary these polarons. In contrast, the pathways that involve polarons dressed by the two different Fermi seas (with different spin) do not produce measurable peaks in the experiments. The implication then is that interactions between polarons involving different Fermi seas are much weaker than the interactions between polarons involving the same Fermi sea.
Comparing the cross-circular 0Q spectra (Fig. 3 in main text) to the pathway analysis in this Supplementary Fig. 6 we find that the experimental peaks in the σ + σ − σ + σ − and σ + σ + σ − σ − also align with the pathways and diagram of Supplementary Fig. 6c. This further confirms that the QM pathways presented here are indeed the ones that contribute to the measured signals.

SUPPLEMENTARY NOTE 5: POLARON MODEL
As discussed in Methods, we describe the system using the minimal Hamiltonian Here, E 0 X , E 0 S , and E 0 T are the exciton, singlet trion, and triplet trion energies in the absence of exciton-trion coupling, and the strength of the exciton-trion coupling at low doping is related to the Fermi energy and j trion binding energy [17]: As discussed in Methods, to a good approximation the exciton couples to a coherent superposition of trions in N sites within an area A: This effectively leads to interactions between trions since a given electron is only available for a single trion formation, i.e., To proceed, we make reference to the experiment, where the signal arises from processes involving either one or two excitations. We therefore now construct an orthonormal basis for all such states. For the single excitations, this is simply Here, the state |0⟩ represents the state of the medium prior to the first pulse, which we take to consist of Fermi seas in the K and K ′ valleys. For two excitations, things become more complicated since there are 21 such terms: six terms involving two identical excitations, and 15 terms involving two distinguishable excitations. Using the expansion of the trion states in Eq. (4) together with the phase-space filling in Eq. (5), we arrive at the following orthonormal basis: The normalisation of the two-exciton state in Eq. (7a) arises from the bosonic commutation relation, [X † σ ,X σ ′ ] = δ σσ ′ , and the terms in Eqs. (7b) to (7e) involving blocking effects. The remaining terms involve distinguishable particles that do not have any blocking.

Interactions
While the state space in Eqs. (6) and (7) contains a large number of terms, we emphasize that this is the minimal model containing excitons in both valleys and the corresponding singlet and triplet trions, along with their possible interactions. Indeed, the interactions are effectively encoded in the normalisation of the two-excitation terms in (7b) to (7e). To see this, consider for instance the following matrix element of the Hamiltonian The blocking effect in Eq. (5) results in the factor √ N e − 1 instead of the usual factor √ N e arising from bosonic statistics, similar to the case of two photons coupled to atoms in a cavity [18]. The difference can therefore be viewed as an interaction due to phase-space filling. Taking all such terms into account, and expanding to leading order in 1/N e , we obtain the following interaction terms in the Hamiltonian, This explicitly demonstrates that within the model (2) we only have interactions between trions that share the same Fermi sea.

Relating parameters of the model to experimental observables
In the presence of the K and K ′ Fermi seas, the frequencies of the exciton and trion peaks shift to those of the repulsive and attractive polarons. The shift of a given branch is O(E F ), and at the density of the experiment we have E F ≃ 0.3 meV. This is within the experimental uncertainty of any given peak position, and hence a shift of this order of magnitude is not measurable. We thus take the energies of the trions in the absence of the coupling, E 0 X , E 0 T and E 0 S , to be essentially the same as the energies observed in experiment, with a small shift to ensure that the peaks in the presence of coupling coincide with the two attractive and the one repulsive polaron peaks. This leads to the average (theory) values where the average is over disorder in the sample that leads to an inhomogeneous broadening, as discussed below in Sec. A 1. We emphasize that these parameters of the model are obtained directly from the experimental density and the frequencies of the observed peaks.

Exciton valley exchange
Finally, we remark on the possibility of including an exchange coupling between K and K ′ excitons due to inter-valley electron-hole exchange. This effectively leads to exchange processes that couple X ⇑ ↔ X ⇓ as well as S ⇑↓ ↔ T ⇓↓ and S ⇓↑ ↔ T ⇑↑ . This process is expected to scale linearly with momentum at low exciton momenta, and for the momentum associated with the pulses in our experiment, we estimate the rate to be ∼ 20 µeV [19]. This is much smaller than any other relevant energy scale, and therefore the exchange process can be ignored for the excitons.
The rate of exciton exchange inside a trion is more involved, since the excitons now move relative to the electron. In this case, the rate of such exchange has previously been measured in MoSe 2 to be ∼ 2.9 meV [20]. We have checked that including a trion exchange coupling of this magnitude in our MDCS spectra does not lead to any visible difference compared with the spectra in the absence of exchange; thus, in all results presented in the manuscript we have taken the exchange to zero in order to reduce the number of theoretical parameters.

A. MDCS Simulation details
We now discuss how the MDCS is simulated within the model (2). We will do this using two approaches. The first is based on correlation functions, and this has been used to calculate all results presented in the main text, as discussed in the Methods. The second approach is based on a master equation description of the system. As we demonstrate, these two approaches yield comparable results.

Approach based on correlation functions
As discussed in Methods, the third order response contributing to the rephasing experimental protocols illustrated in Supplementary Fig. 3 is due to ground state bleach, stimulated emission, and excited state absorption [21]. Their contributions to the material polarization are P (3) The trace is over the possible states of the system, with the density matrixρ 0 = |0⟩⟨0| describing the initial state of the system prior to any excitations.
The exact time evolution of the contributions in Eq. (11) using the Hamiltonian in Eq. (2) can be straightforwardly calculated since the expectation value is taken over a state that corresponds to the vacuum of excitons and trions. However, to obtain the spectra, in practice it is simplest to first Fourier transform to frequency space. For instance, in the case of single quantum spectroscopy, this Fourier transform is with respect to t 1 and t 3 at fixed t 2 . Including a homogeneous linewidth Γ, for the ground state bleach term we have The other terms are Fourier transformed similarly. Note that some care must be taken in the two-exciton terms, since t 3 can appear in multiple time-evolution operators.
Apart from the homogeneous broadening described by Γ, we also have inhomogeneous broadening. We model this as dominated by dielectric disorder due to spatial fluctuations of the substrate and the presence of impurities and adsorbates, in accordance with the results of Ref. [22]. This leads to fluctuations of the attractive and repulsive polaron energies, leading to an approximately Gaussian lineshape (see Sec. A 2) of width σ A and σ R , respectively.
Importantly, the fluctuations in the energies are correlated [22], and since σ A ≃ σ R we simply average the signal by varying the input parameters E 0 X , E 0 S and E 0 T around their average positions in Eq. (10) according to

Master equation description
An alternative approach is to model the MDCS response via a master equation description, as is more common in the literature. This approach has the advantage of including decoherence processes more generally, rather than simply via a phenomenological dephasing parameter as given above. However for the 28-level system considered, it also has considerable drawbacks. One is simply the additional computational time required to perform the simulations in the large state space of the density matrix of the system, as well as including the effects of inhomogeneous broadening. Another more subtle issue is that there is not enough information to define all the decoherence channels unambigously. The effects of various dephasing and relaxation channels all contribute to the MDCS spectra in different ways, but can not easily be distinguished without a large number of additional measurements. To address these issues, we perform key simulations of interest with both approaches, without inhomogeneous broadening. This allows us to confirm that our theoretical approach is valid without over-fitting to the experimental results.
To simulate the MDCS spectra using a master equation approach, we implement a similar approach to that detailed in Ref. [23,24]. The master equation is assumed to be of Lindblad type [25][26][27] where γ j andL j represent the decoherence rates and operators respectively.
We solve the evolution of the master equation in tetradic representation or 'superoperator' form [23,28]. This involves expressing the density matrix in vector form, making the evaluation of the time evolution simpler, at the expense of larger memory requirements.
The Lindblad equation therefore takes the form where where ⊗ indicates the Kronecker product and I is the identity matrix with the same dimensions as the Hamiltonian. The time evolution of the density matrix can then be written in where the propagator can be Fourier transformed analytically [23] G(ω) = 1 iω − L (18) to give the frequency response.
We can now define the MDCS response in terms of the following expressions where ρ 0 is the initial (steady state) density matrix of the system. These response functions R j can also be used to define the rephasing (R 1 , R 2 , R 3 ), non-rephasing (R 5 , R 6 , R 7 ) and double-quantum (R 4 , R 8 ) contributions to the MDCS spectra. In the comparison that follows, we only compute and compare to the rephasing spectrum ( 3 j=1 R j ). In the expression above we assume that the dipole operator for each transition consists of Pauli x-operators, The system Hamiltonian, as described above and in the main text can be expressed in terms of matrix elements as follows. Note that the interactions (i.e., blocking effects) are encoded in the basis states and thus we useĤ 0 . The single particle states are, where σ = {⇑, ⇓} which correspond to the K and K ′ valleys respectively.
The diagonal terms are, where both σ = {⇑, ⇓} and σ ′ = {⇑, ⇓} in the respective K-valleys but avoiding double counting of states when the state is indistinguishable on swapping valleys.
The cross terms can be slightly more complex. Specifically, we have In this basis the dipole operators are the creation and annihilation operators for the exciton states in the two valleys, ie.V + =X † σ . As the indistinguishable states are not duplicated in the state space, the corresponding creation/annihilation operators pick up an additional factor of √ 2. This is required to ensure the correct cancellation of the signal when there are no interactions between states.
To mimic the phenomenological dephasing implemented in the correlation function approach above, we include the following Lindblad operators, Each operator is then included with the same dephasing rate Γ discussed in the previous section. In principle a number of other effects could be included via additional Lindblad terms, including exciton recombination and correlated dephasing within and between valleys.
However these effects are considerably more difficult to extract reliably from the MDCS data.
We therefore limit the comparison to the pure dephasing discussed above.
In Supplementary Fig. 7 we show the simulated 1Q rephasing spectra with co-and crosscircular polarization, computed via the two approaches. For accurate comparison, these plots do not include the effects of inhomogeneous broadening. We see very good correspondence between the approaches. It is worth noting the minor differences in peak width and magnitudes which stem from the different approach to including dephasing in the calculations.
It is also worth noting that the amplitude of the exciton peaks, and the exciton-trion cross-peaks compared to the trion peaks are relatively larger in the simulations than in the experimental measurements. This can be mostly clearly seen in Fig. 2f of the main text, at the edges of the observed energy window. It is likely that these discrepancies also stem from the idealised decoherence model used, as loss processes and additional dephasing types can reduce this signal. However more detailed measurements and theoretical analysis are required to understand the details.
Supplementary Figure 7 Comparison of simulation models. a-d, Simulated 1Q rephasing 2D spectra with co-circular polarization using correlator code (a) and master equation code (b). Cross-polarized scheme using the correlator (c) and master equation code (d). To make both models easier to compare, no inhomogeneous broadening is included here. Both models run with fitted experimental values from Supplementary Fig. 13, and estimated electron/exciton densities.

SUPPLEMENTARY NOTE 6: BIPOLARON STATES IN WS 2
In this section we provide more evidence and detailed explanation behind the attribution of the bipolaron peaks in the 1Q 2D spectrum (Fig. 2d) discussed in the main text. To aid this discussion, Supplementary Fig. 8 shows the energies of and separations between all of the singly excited states and unbound doubly excited states. Also shown are the biexciton and bipolaron states.
Confirmation that these peaks come from doubly excited states The negative peaks in the real part of the 1Q 2D spectrum ( Supplementary Fig. 4a of the main text), give a strong indication that these peaks arise from excited state absorption pathways, and thus involve doubly excited states. However, there are other possible effects that can generate negative peaks, such as excitation induced shifts [29] or bandgap renor- malisation. One unambiguous means to identify doubly excited states is two-quantum (2Q) MDCS [15,30]. In this type of measurement the first two pulses create a coherent superposition between the ground state and any doubly excited states (a 2Q coherence). The third pulse can then drive the system into a (1Q) coherent superposition of singly and doubly excited states or ground and singly excited states. These types of pathways are depicted in Supplementary Fig. 9, along with the expected locations of the corresponding peaks in a 2Q 2D spectrum. The 2D spectrum correlates ℏω 2Q , the energy of the 2Q coherence, with the emission energy, ℏω 3 . Thus, peaks a 2Q 2D spectrum provide unambiguous confirmation of the presence and energy of doubly excited states.
Supplementary Figure 9 shows the experimental 2Q spectrum, possible QM pathways, and their respective peak locations. The fact that a broad peak occurs in the 2Q spectrum at ℏω 2Q = 4.105 ± 0.005 eV, ℏω 3 = 2.052 ± 0.050 eV in Supplementary Fig. 9n, shows that doubly excited states are present and contribute to the third-order signal. With reference to Supplementary Figure 9 Bipolaron 2Q pathway analysis. a-l, 2Q pathways for excitation of the bipolaron (⇑⇓↑ and ⇑⇓↓). In this case, the k 2 and k 3 pulses arrive simultaneously so the t 1 Feynman diagram rung has been removed for simplicity. m, illustration of peak locations in a cross-circular 2Q spectrum arising from pathways in (a-l). n, Experimental cross-circular 2Q spectrum.
the energy level diagram in Supplementary Fig. 8 In these 2Q measurements there is no rephasing, so inhomogeneous broadening blurs the peaks, making it impossible to resolve the different peaks. However, it is clear that there is a broad peak observed around these energies, which provides strong confirmation that the negative peaks in the 1Q 2D spectrum arise from excited state absorption pathways involving these doubly excited states. It is possible that the unbound S + S, S + T, and T + T doubly excited states identified in Supplementary Fig. 8 could contribute to this broad peak via the pathways in Supplementary Fig. 9e-l. However the ℏω 2Q energies expected are 10, 17, and 24 meV above the bipolaron energy, and all well above the centre of the peak in the experimental results. Hence even if they do contribute, they cannot explain the whole signal.
Supplementary Figure 10 Bipolaron 1Q pathway analysis. a-d, 1Q pathways for excitation of the bipolaron. e, Illustration of peak locations in a cross-circular 1Q spectrum arising from pathways in (a-d). f, Experimental cross-circular 1Q spectrum.

Pathways leading to the peaks arising from the bipolaron states
The pathways that contribute to the bipolaron peaks in the 1Q 2D spectra are shown in Supplementary Fig. 10. Pathways for both the σ + σ − σ + σ − and σ + σ + σ − σ − are shown, since the results showed that both contribute to these peaks. For each, there are two possible pathways with ℏω 1 = ℏω S and ℏω 1 = ℏω T . The emission energy for these pathways can be determined by taking the difference between the energy of the bipolaron ⇑⇓↓ (or equivalently the ⇑⇓↑) state and the final state of the system, as can be seen in Fig. 4c in the main text.
From the location of the peaks, we determine the energy of the bipolaron states to be 4.102 eV. This gives emission energies 17 meV below the S and T energies.
If we then consider the binding energy to be the minimum energy required to break apart this complex, then the lowest energy combination it could be split into would be X + S (as per the energy level diagram in Supplementary Fig. 8), all other pairs of the included quasiparticles (X + T or XX + e) are higher in energy. This gives a binding energy of 46 ± 3 meV, which is significantly higher than predicted when considering a charged biexciton as two-body exciton + trion bound state. This large binding energy is indicative of the cooperative binding between all three quasiparticles.
Another key to identifying these bipolarons as arising from cooperatively bound charged biexcitons comes from the observation of pathways in Supplementary Fig. 10a,b, that on the left hand side of the diagrams there is initial absorption to the S ⇓↑ (T ⇓↓ ) followed by excitation to the bipolaron ⇑⇓↑ (⇑⇓↓). The emission from the bipolaron then leaves behind a T ⇑↑ (S ⇑↓ ) polaron. That is, the ↑ (↓) electrons that initially dress the ⇓ exciton are left dressing the ⇑ exciton after going via the bipolaron. This process requires that in the bipolaron state both excitons are interacting strongly with the electron.
Alternative pathways that could lead to these peaks and why we rule them out In the process of identifying the pathways and nature of the bipolaron peaks, we considered alternate pathways and possible bound states, but ultimately ruled them all out. Below we detail pathways for alternate bound states and how they deviate from our experimental observations.
Supplementary Figure 11 considers the pathways for the case of a simple two-body binding of an exciton to a trion as the basis for these peaks. For each of these a different binding energy is required for XS (46 meV) and XT (53 meV), which is not so unreasonable. However, in Supplementary Fig. 11(a-d), the pathways that yield the observed peaks in the 1Q 2D spectrum, the system evolves over t 2 as a superposition between X and S (or X and T) states, which should show up as a peak at ω 2 = 36 meV (or 29 meV) in the 0Q 2D spectrum for the σ + σ − σ + σ − pulse ordering. No such peaks are observed, the only peaks in the 0Q spectrum emitting at 2.039 -2.046 eV occur at ±7 meV, consistent with the bipolaron pathways identified in Supplementary Fig. 10.
In principle the pathways in Supplementary Fig. 11e,f could generate the peaks in the 1Q 2D spectrum and are consistent with the peaks at ℏω 2 = 0 in the 0Q spectrum. However, in this case we would also expect peaks arising from the pathways in (g-h), which would appear at ℏω 1 equal to the exciton energy and ℏω 3 = 2.01 eV, and no such peak is observed.
Similarly, peaks from (c-d) should also contribute to this position, yet no peaks are observed.
We therefore rule out the possibility of a two-body bound exciton-trion state as contributing to these bipolaron peaks.
Another possibility is that these peaks arise from a doubly charged biexciton, consisting Supplementary Figure 11 Bound exciton-trion 1Q pathway analysis. a-h, Possible QM pathways for exciton-trion states. i, illustration of peak locations in a cross-circular 1Q spectrum arising from pathways in (a-h). j, Experimental cross-circular 1Q spectrum for reference.
of two trions bound together. Possible pathways for these states are shown in Supplementary T (24 meV) bound states, which is unlikely, given they both involve the same combination of electrons and holes. Furthermore, Monte Carlo calculations indicate that the 6-body bound state is not stable [31]. Thus, we rule out the possibility that these pathways, and the associated doubly charged biexciton state, contribute to the signal measured. or from the bright trion, leaving behind a dark exciton. In PL measurements there is no way to distinguish between these two possibilities, and different authors have made the case for both, which leads to vastly different binding energies being determined. In all cases it appears to be assumed, even if not explicitly stated, that the binding is between an exciton and a trion. However, the MDCS measurements presented here indicate that the charged biexciton is formed as a result of the strong interaction between each of the two excitons and electron. This picture of a cooperatively bound charged biexciton is also consistent with PL measurements, the only difference being that because one of the excitons is a dark exciton, the emission will always be from the bright exciton and leave behind a dark, inter-valley trion. That is, while in the MDCS measurements we see emission at two different ℏω 3 values, the PL will show only one spectral peak. We therefore conclude that the PL measurements that report this charged biexciton should consider the emission as leaving behind a trion, and determine the binding energy based on the spectral shift from the exciton emission.

SUPPLEMENTARY NOTE 7: FITS TO 1Q & 0Q 2D SPECTRA
Supplementary Figure 13 One-quantum linewidth fits. a, Experimental cross-circularly polarized 1Q 2D amplitude spectrum. Red (blue) arrows indicate slices at the attractive (repulsive) polaron peak for diagonal and anti-diagonal. b-c, Diagonal slices (black) of (a) along the σ line for both the attractive (b) and repulsive polaron (c), fit with Gaussian profiles (red or blue respectively). Peaks appearing to the sides of the attractive polaron (shaded green), are due to the edges of other peaks. The inhomogeneous linewidth (σ) is shown above for Gaussian FWHM in meV. d-e, Anti-diagonal slices (black) of (a) along the Γ arrows, for both attractive (d) and repulsive polaron (e). The linewidth is fit with a square-root-Lorentzian, with FWHM in meV displayed above the fits. In (d) there are two peaks from the two singlet-triplet cross-peaks. Uncertainties are derived from the 95% confidence bounds of the fits.
Further details of interactions between quasiparticles and their local environment can be obtained from the shape and width of the 2D peaks in MDCS plots. Here we provide these numbers as a step towards further analysis and understanding of these interactions.
We fit anti-diagonal and diagonal slices of the 2D spectral amplitudes with square-root-Lorentzian (Lorentzian for intensity) and Gaussian models respectively, assuming we are in the inhomogeneous limit [37]. In this limit, for diagonal peaks the width along the diagonal is the inhomogeneous linewidth, and along the anti-diagonal the width corresponds to the homogeneous linewidth. For cross-peaks with correlated inhomogeneous broadening the anti-diagonal width can be a measure of the strength of the correlation but is also related to the homogeneous linewidths of the two transitions. In Supplementary Fig. 13, the diagonal and anti-diagonal linewidths of the attractive and repulsive polaron are similar.
Supplementary Figure 14 shows 0Q 2D spectra of the two cross-circular schemes (σ + σ + σ − σ − Supplementary Figure 14 Zero-quantum linewidth fits. a-b, Experimental σ + σ + σ − σ − (a) and σ + σ − σ + σ − 0Q 2D spectra. Vertical arrows indicate the point at which slices were taken for (c-f). The peaks in (a) are from QM pathways that evolve via a population state, where the linewidth along ℏω 2 is related to the population lifetime. The peaks in (b) evolve via coherent superpositions between excitations in different valleys (e.g. T ⇑↑ S ⇓↑ , S ⇑↓ T ⇓↓ ). In this case the linewidth along ℏω 2 gives an upper limit for the decoherence rate of these inter-valley coherences. c-f, Vertical slices of the attractive (red) and repulsive (blue) polarons for both (a) and (b), fit with square-root-Lorentzian profile where Γ (FWHM) is shown in meV.
and σ + σ − σ + σ − ) with vertical slices through the peaks. For the σ + σ − σ + σ − polarization sequence, where the system evolves as a coherence between excitations in the K and K ′ valley (e.g. |T ⇑↑ ⟩ ⟨S ⇓↑ | or |S ⇑↓ ⟩ ⟨T ⇓↓ |), the linewidths are related to the decoherence time of these inter-valley coherences: T 2 = 2ℏ Γ , where Γ is the FWHM of the Lorentzian fit in meV. For the σ + σ + σ − σ − polarization scheme the system evolves via population states and the widths are related to the state lifetimes. In this case however, they are likely limited by the constraints of our experiment.