Microcavity phonoritons – a coherent optical-to-microwave interface

Optomechanical systems provide a pathway for bidirectional optical-to-microwave interconversion in (quantum) networks. These systems can be implemented using hybrid platforms, which efficiently couple optical photons and microwaves via intermediate agents, e.g., phonons. Semiconductor exciton-polariton microcavities operating in the strong light-matter coupling regime offer enhanced coupling of near-infrared photons to GHz phonons via excitons. Furthermore, a new coherent phonon-exciton-photon quasiparticle, termed the phonoriton, has been theoretically predicted to emerge in microcavities but has so far eluded observation. Here, we experimentally demonstrate phonoritons, which form when two exciton-polariton condensates confined in a μm-sized trap within a phonon-photon microcavity are strongly coupled to a confined phonon that is resonant with the energy separation between the condensates. We realize control of phonoritons by piezoelectrically generated phonons and resonant photons. Our findings are corroborated by quantitative models. Thus, we establish zero-dimensional phonoritons as a coherent microwave-to-optical interface.


Introduction
Coherent interactions between microwave (GHz) phonons and optical (hundreds of THz) photons enable the control of opto-electronic phenomena at the nm and ps scales, the interconversion of optical and microwave photons for communication between distant qubits [1][2][3], as well as optical information transfer in on-chip computational devices [4,5]. One strategy towards efficient interconversion uses optomechanical interactions [6], i.e., correlations between optical and mechanical degrees of freedom. In this setting, optomechanical systems relying on the coupling between high-frequency vibrations (phonons) and solid-state excitations have become relevant for advanced photonic applications, including the emerging fields of quantum communication [7] and control of various quantum states [8][9][10][11][12][13][14], e.g., qubits [15].
In general, coherent interactions between photons and phonons require a large coupling energy as well as low phonon (Γ_M) and photon (γ_phot) decoherence rates. High-efficiency coherent transduction between the particles presupposes a single-photon cooperativity C_0 exceeding unity, with g_0 denoting the single-photon optomechanical coupling rate. If, in addition, g_0 > {γ_phot, Γ_M}, the phonon-photon interaction enters the optomechanical strong-coupling (OSC) regime [6], where a novel optomechanical quasiparticle emerges: the phonoriton [16]. The above requirements become relaxed for photon populations N_phot > 1, for which C_0 is enhanced by a factor N_phot.
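The two conditions above can be checked numerically. The sketch below assumes the common textbook form of the cooperativity, C = 4 g² N / (γ Γ); this form and the function names are assumptions made for illustration, not the expression used by the authors. The numbers (g_0 ≈ 7 MHz, γ_MP ≈ 0.5 GHz, 1/Γ_M ≈ 300 ns) are values quoted elsewhere in the text, treated here simply as rates in GHz.

```python
def cooperativity(g, gamma_opt, gamma_mech, population=1.0):
    """Cooperativity in the assumed textbook form C = 4 g^2 N / (gamma_opt * gamma_mech)."""
    return 4.0 * g ** 2 * population / (gamma_opt * gamma_mech)


def is_strong_coupling(g, gamma_opt, gamma_mech):
    """Strong-coupling criterion from the text: g exceeds both decay rates."""
    return g > max(gamma_opt, gamma_mech)


# Illustrative values in GHz, taken from numbers quoted in the text.
g0 = 0.007                 # single-particle coupling, ~7 MHz
gamma_mp = 0.5             # polariton linewidth
gamma_m = 1.0 / 300.0      # phonon decay rate for a 300 ns lifetime

c0 = cooperativity(g0, gamma_mp, gamma_m)
c_condensate = cooperativity(g0, gamma_mp, gamma_m, population=1e6)
strong_single = is_strong_coupling(g0, gamma_mp, gamma_m)

print(f"C_0 = {c0:.3f}, C(N = 1e6) = {c_condensate:.3g}, single-particle OSC: {strong_single}")
```

Under these assumptions the single-particle cooperativity stays below unity, whereas a macroscopic population lifts it far above unity, which is the sense in which a large population relaxes the requirements.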
Reaching the OSC regime in the solid state faces several challenges imposed by the huge mismatch between the phonon (f_M) and photon (f_phot) frequencies (f_M ≪ f_phot), the typically large values of Γ_M and γ_phot, the dissimilar spatial dimensions (wavelengths) of the optical and phonon modes, and the low magnitudes of g_0. In this context, polaromechanical systems, i.e., optomechanical setups utilizing strongly coupled excitons and photons (simply, polaritons [17] or MPs) in monolithic microcavities (MCs), become an attractive option [18]. These systems benefit from the large deformation potential exciton-phonon coupling and the simultaneous confinement of photons and phonons [19,20], enabling coherent polaromechanics [21,22] with near-unity single-polariton cooperativity [23].
In this work, we first demonstrate the OSC between Bose-Einstein condensates (BECs) of polaritons and GHz phonons confined in µm-sized potential traps within a semiconductor MC. This OSC results in phonoritons, which are evidenced by optomechanical self-oscillations (SOs) or phonon lasing. The SOs can be accounted for by the deformation potential coupling between the BEC pseudo-spin states mediated by the phonons. We then show that phonoritons can also be stimulated and controlled by piezoelectrically generated GHz phonons as well as optically. Thus, we establish these traps as a scalable and bidirectional optical-to-microwave interface. The implications of these milestones for coherent control down to the quantum regime are discussed.

Results
The studies were carried out using an (Al,Ga)As MC with intracavity traps [24] for phonons and polaritons (cf. Fig. 1a, further details in Methods). These traps confine λ_C = λ × n_c = 810 nm photons (n_c is the average refractive index of the MC spacer) as well as acoustic phonons with wavelengths of 3λ and λ. The confined LA modes have frequencies f_LA^(3λ) ≈ 7 GHz and f_LA^(λ) ≈ 21 GHz, while f_TA^(ηλ) = 0.7 × f_LA^(ηλ) (η = 1, 3) for the TA ones (see SM-V-B). Here, 0.7 is the ratio between the TA and LA sound velocities along the MC z[001]-direction. The above implies that f_TA^(λ) ≈ 2 × f_LA^(3λ). Finally, a ring-shaped piezoelectric bulk acoustic wave resonator fabricated on the top surface electrically injects long-lived (1/Γ_M ≈ 300 ns) monochromatic LA phonons with frequency tunable around f_LA^(3λ) = 7 GHz [25]. Figure 1(c) displays a spatial photoluminescence (PL) map of the sample. The bright PL spot close to the center of the ring-shaped aperture of the resonator corresponds to the emission of the trap. Its PL at low optical excitation power (P_Exc) displays a discrete energy spectrum typical of a particle in a box. The transition to the BEC at high P_Exc is accompanied by an energy blueshift and a nonlinear increase of the PL intensity, as detailed in SM-II-B. Simultaneously, the linewidth reduces to a record-low value γ_MP ≈ 0.5 GHz ≪ f_LA, cf. Fig. 1(c), which enables the non-adiabatic interaction regime. The dependence of γ_MP on P_Exc is detailed in SM-III-B. In the following, we consider two 4 × 4 µm² traps with different polariton excitonic contents of 0.05 (labelled T_1) and 0.2 (labelled T_2).
Optomechanical self-oscillations in single traps. Signatures of the polariton-phonon interaction can be readily identified in spectral PL maps of the trap T_2 ground state (GS) recorded in the BEC regime for increasing P_Exc (cf. Fig. 1f). The energy axis is referenced to the main emission line (the zero-phonon line, ZPL). For P_Exc < 200 mW, the map shows, in addition to the ZPL, a second line displaced by […] LA, which evidences the splitting of the trap GS into two components. The GS degeneracy can be lifted in an asymmetric (i.e., non-square) trap by the non-vanishing effective in-plane momentum induced by the confinement via the so-called longitudinal-transverse pseudo-spin splitting [26], described in SM-V-A. In the BEC state, the splitting can be amplified by polariton-polariton interactions between unequally populated pseudo-spin states. […] LA. The latter are indicated by the blue arrows in an exemplary profile for P_Exc = 320 mW in Fig. 1d. These sidebands are attributed to phonon self-oscillations (SOs), i.e., the excitation of a coherent mechanical motion by a time-independent polariton drive. The stimulated phonons act back on the polaritons by locking their pseudo-spin splitting to f_TA^(λ). We can estimate the optomechanical coupling rate (g) leading to the sideband formation from the fact that the amplitude of the n-th sideband is proportional to J_n²(g/f_LA^(3λ)), where J_n is the Bessel function of the n-th order [22]. We point out that the ratio of the peak intensities of the sideband at E/(h f_LA^(3λ)) = 1 and the ZPL is J_1²/J_0² ≈ 0.3. This ratio implies that g ∼ f_LA^(3λ). Thus, g > {Γ_M, γ_MP}, which confirms the OSC character of the coupling and gives a lower estimate of C ≈ 10^4 according to Eq. 1. Therefore, the OSC evidences the formation of a phonon-exciton-photon quasiparticle: the phonoriton [16]. Interestingly, phonoritons involving λ and 3λ phonons can appear simultaneously, indicating that more than one phonon mode can enter the OSC regime.
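The step from the measured intensity ratio to g ∼ f_LA^(3λ) can be reproduced by inverting J_1²(x)/J_0²(x) = 0.3 for the modulation index x = g/f_LA^(3λ). The following is a minimal sketch using only the Python standard library; the power-series Bessel routine and the bisection solver are illustrative helpers, not the fitting procedure used by the authors.

```python
import math


def bessel_j(n, x, terms=30):
    """Bessel function of the first kind J_n(x) for integer n >= 0 (power series)."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k / (math.factorial(k) * math.factorial(k + n))
                  * (x / 2.0) ** (2 * k + n))
    return total


def sideband_ratio(x):
    """Intensity ratio of the first sideband to the zero-phonon line, J_1^2 / J_0^2."""
    return (bessel_j(1, x) / bessel_j(0, x)) ** 2


def solve_modulation_index(target_ratio, lo=0.0, hi=2.0, tol=1e-9):
    """Bisection on [lo, hi]; the ratio grows monotonically below the first zero of J_0."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sideband_ratio(mid) < target_ratio:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


x = solve_modulation_index(0.3)   # modulation index g / f_LA
g_ghz = x * 7.0                   # with f_LA^(3 lambda) = 7 GHz from the text

print(f"g / f_LA = {x:.3f}, g = {g_ghz:.2f} GHz")
```

The solution lies slightly below unity, so g is of the order of the phonon frequency and exceeds both decay rates quoted in the text.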
The pseudo-spin locking at f_TA^(λ) (rather than at 2 × f_LA^(3λ)) is further corroborated by the GS PL from polaritons with a reduced exciton content, as illustrated for the trap T_1 in Fig. 1c. The PL from the pseudo-spin state is weaker and sidebands are not observed. The GS splitting remains, nevertheless, locked at f_TA^(λ) over a wide range of excitation powers (cf. additional data in Fig. SM-6). SOs are ubiquitous in optomechanics [27,28]. In polariton systems, they have been reported for processes of optoelectronic [29] and optomechanical [30] nature. In contrast to the former, the SOs demonstrated here involve transitions between the GS pseudo-spin states rather than between confined levels with different orbitals and larger energy separation. Unlike in Ref. [22], the SOs in the present case are of first-order nature and, more importantly, emerge in a single trap rather than in an array.
The optomechanical coupling between the GS pseudo-spin states leading to SOs requires confined phonons with shear strain components, which are intrinsic to TA modes but absent in bulk LA modes propagating along [001] in GaAs. However, if the traps are not perfectly square, the lateral confinement imparts a small shear component to the confined LA modes, which is proportional to the trap asymmetry, as shown in SM-V-C. Furthermore, a first-order deformation potential interaction between phonons and polaritons of the split GS can provide both the interlevel coupling (g_0,↑↓), corresponding to the coupling between the pseudo-spins, which is required to trigger SOs, and the intralevel one (g_0,↑↑), leading to energy modulation and sideband formation. These coupling rates for the TA and LA confined modes are summarized in Table SM-V for a trap with a = 4 µm and asymmetry ∆a/a = 0.1. In essence, for the GS, the interlevel coupling is considerably higher for TA modes: g_0,↑↓,TA ≈ 1 MHz ≈ 35 × g_0,↑↓,LA.
Hence, TA-like f_TA^(λ)-phonoritons can form for polariton populations N_MP ≤ 1000, significantly lower than the BEC threshold of ∼ 10^5 − 10^6, as estimated in SM-II-C. Since g_0,↑↑,TA is negligible for the TA modes, TA-related SOs are normally not accompanied by sidebands. In contrast, LA-like SOs are usually accompanied by sidebands due to the large on-site coupling energy g_0,↑↑,LA ≈ 7 MHz, but require a large BEC population. These predictions are in qualitative agreement with the results in Figs. 1c and 1d.
Phonoriton formation also requires a pseudo-spin energy splitting ∆E matching the phonon energy.
The following picture emerges for the onset of the phonoriton-related SOs: ∆E depends on the trap geometry and can change with polariton density to match the phonon energy and trigger a particular phonoriton mode. The matched phonon mode may vary with P_Exc, leading to the behaviour illustrated in Fig. 1f. The energy locking between the pseudo-spin states is attributed to the phonon-mediated transfer of particles between them. The strong dependence of the transfer on ∆E tends to equilibrate the difference in populations, leading to the locking [31,32].
Lastly, SOs can also be induced by interactions between higher-energy (excited) BEC and phonon modes. Some of these phonons can trigger oscillations with just a few polaritons (cf. Table SM-V and the discussion in SM-V-D), thus opening the way to SOs in the single-particle regime. Furthermore, phonons also affect non-linear polariton interactions [33,34], including those involving the excitonic reservoir [29,35]. Combined with the optomechanical coupling proposed here, these mechanisms may additionally enhance the polaromechanical coupling.
Electrically stimulated sidebands. A unique feature of our platform is the ability to electrically inject GHz LA bulk acoustic waves (BAWs) into traps using bulk acoustic wave resonators (BAWRs). […] LA. This demonstrates the non-adiabatic control of the polariton BEC by the tunable phonon amplitude.
The evolution of the PL spectrum of the trap T_1 for increasing acoustic amplitudes A_BAW is illustrated by the color map of Fig. 2a and the cross-sections in Fig. 2b-e. A_BAW is expressed in terms of the square root of the nominal RF power (P_RF^0.5) applied to the BAWR. Spectra at low acoustic amplitudes P_RF^0.5 < 0.01 W^0.5 are dominated by the strong ZPL with a weaker pseudo-spin state locked at f_TA^(λ), indicated by the red arrow in Fig. 2b. At P_RF^0.5 ≈ 0.01 W^0.5, the first two symmetric sidebands appear on either side of the ZPL. At higher acoustic amplitudes, additional symmetric sidebands emerge, reaching up to the ±5 × f_LA^(3λ) sidebands. For intermediate P_RF^0.5 values, such as in Fig. 2d, the intensity of the ZPL becomes strongly suppressed. This suppression is a form of optomechanically induced transparency.
The solid blue lines in Figs. 2b-e are fits given by a sum of Lorentzians with linewidths (δE) weighted by squared Bessel functions J_n²(χ), where χ is the modulation amplitude (see SM-IV-B). The fits show that the acoustic modulation redistributes the oscillator strength (initially at the ZPL) among the sidebands while conserving the overall PL intensity. Figure 2g shows the dependence of the fitted sideband linewidths δE on the normalized A_BAW. Remarkably, δE(A_BAW) sharply decreases by a factor of two from δE(0.1 […] LA and then remains constant. The reduction coincides with the appearance of the first sidebands, i.e., when χ ≈ f_LA^(3λ), cf. Fig. 2f. A similar linewidth reduction is also observed for the first excited state of the trap (SM-IV-C).
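The intensity-conserving redistribution described above follows from the identity Σ_n J_n²(χ) = 1. The sketch below builds the Bessel-weighted sum of Lorentzians and checks that identity numerically. Here χ is treated as a dimensionless modulation index, and the function names, the linewidth, and the choice χ = 2.405 (close to the first zero of J_0, where the ZPL weight vanishes in this simple model) are illustrative assumptions, not the fit parameters of the paper.

```python
import math


def bessel_j(n, x, terms=40):
    """Bessel function of the first kind J_n(x) for integer n >= 0 (power series)."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k / (math.factorial(k) * math.factorial(k + n))
                  * (x / 2.0) ** (2 * k + n))
    return total


def sideband_weights(chi, n_max):
    """Relative weights J_n(chi)^2 for n = -n_max..n_max (J_-n^2 equals J_n^2)."""
    return {n: bessel_j(abs(n), chi) ** 2 for n in range(-n_max, n_max + 1)}


def lorentzian(e, e0, width):
    """Unit-area Lorentzian centred at e0 with full width at half maximum `width`."""
    half = 0.5 * width
    return (half / math.pi) / ((e - e0) ** 2 + half ** 2)


def spectrum(e, chi, f_phonon, width, n_max=10):
    """Sum of Lorentzians at multiples of f_phonon, weighted by J_n(chi)^2."""
    weights = sideband_weights(chi, n_max)
    return sum(w * lorentzian(e, n * f_phonon, width) for n, w in weights.items())


weights = sideband_weights(2.405, 10)
total_weight = sum(weights.values())

# Illustrative evaluation: 7 GHz sideband spacing, 0.7 GHz linewidth (both in GHz).
first_sideband = spectrum(7.0, 2.405, 7.0, 0.7)
zero_phonon_line = spectrum(0.0, 2.405, 7.0, 0.7)
print(f"sum of weights = {total_weight:.6f}, ZPL weight = {weights[0]:.2e}")
```

In this sketch the first sidebands dominate while the ZPL weight is close to zero, with the total weight unchanged.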
The OSC between particles with largely dissimilar lifetimes leads to quasiparticles with lifetimes approximately twice that of the shorter-lived component. Such behavior has been previously reported for quasiparticles resulting from the strong coupling of excitons and photons [36], photons and phonons [6], as well as phonons and superconducting qubits [37], but not between polaritons and phonons. Stimulated multimode OSC has been recently demonstrated for a system of two optical modes coupled to multiple mechanical modes and driven using an external laser [38]. In contrast, in this work, the polariton BEC states are intrinsic to the MC. Conceptually, the RF-generated phonon field drives coherent oscillations between the polariton states. In the weak-coupling limit, the states do not swap before decaying with the rate γ_MP = 1.4 GHz. In the OSC limit, one enters the stimulated phonoriton regime, where the condensate swaps between the pseudo-spin states at a rate γ_MP. In effect, phonoritons spend half of the time as phonons with Γ_M ≪ γ_MP, thus leading to a decay rate γ_MP/2. The linewidth narrowing thus directly proves phonoriton stimulation.
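The factor-of-two argument can be made explicit with a simple time-weighted average of the two decay rates. This is a minimal sketch of that reasoning only, not the full model of SM-V-E; the function name and the equal-time assumption (`fraction_a = 0.5`) are illustrative.

```python
def hybrid_decay_rate(gamma_a, gamma_b, fraction_a=0.5):
    """Decay rate of a quasiparticle that spends `fraction_a` of its time as component a."""
    return fraction_a * gamma_a + (1.0 - fraction_a) * gamma_b


gamma_mp = 1.4    # polariton decay rate in GHz (value quoted in the text)
gamma_m = 0.003   # phonon decay rate in GHz (3 MHz, value quoted in the text)

gamma_phonoriton = hybrid_decay_rate(gamma_mp, gamma_m)
narrowing = gamma_mp / gamma_phonoriton

print(f"hybrid decay rate = {gamma_phonoriton:.4f} GHz, narrowing factor = {narrowing:.3f}")
```

Because Γ_M is almost three orders of magnitude smaller than γ_MP, the averaged rate is very close to γ_MP/2.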
The above picture can be described using the Hamiltonian (derived in SM-V-E) for a phonon mode […] LA and two BEC modes ω_i with i = {l, u} for the lower-energy and the upper-energy mode of the spin-split GS, respectively: […], since the pseudo-spin splitting matches twice the energy of the injected phonons, and the coupling strength is tuned by the amplitude of the injected BAW. A detailed analysis of the interaction (cf. SM-V-E) yields a coupling strength […], where N_MP and n_b are the polariton and RF-generated phonon populations, respectively, and […] LA, where we assumed g_∆ ≈ g_0,↑↑, see SM-V-E. Now, the OSC condition becomes g_2 > γ_MP/4.

This model predicts a linewidth δE […] displayed by the solid line in Figure 2g. In the calculations, we used the experimentally determined polariton and phonon decay rates γ_MP = 1.4 GHz and Γ_M = 3 MHz, respectively, and an estimated BEC population N_MP = 5 × 10^6. The phonon population n_b was determined for each A_BAW as described in SM-II-E. The calculations reproduce the linewidth narrowing well. However, the fitted G_2 value is ∼30 times larger than the one deduced for the pure optomechanical rates in Table SM-V. The required coupling enhancement can be provided by some of the mechanisms listed at the end of the previous section.
Coherent optical control of phonoritons. Finally, we demonstrate optical control of phonoritons in a trap using the setup depicted in Fig. 3a, which is complementary to the mechanical control addressed in the previous section. For that purpose, a weak single-mode control laser with tunable energy ∆_L was scanned in energy steps of 2.3 GHz across the GS of the trap T_1. PL spectra were then recorded for each ∆_L, as displayed in Fig. 3b. The weak curved stripes separated by f […] LA. This relatively large red-shift is attributed to a renormalization of the phonoriton GS energy under the increased phonon population induced by the control laser [39].
The most important feature in Fig. 3b is the appearance of δ-like PL peaks whenever the control laser energy matches a sideband, i.e., when ∆_L = n_s × f_LA^(3λ) − ∆_rs, where n_s is an integer. The latter condition corresponds to the observed enhancement of the integrated emission of the sidebands, cf. Fig. 3(c).
Panels (e-h) of Fig. 3 show exemplary profiles recorded for n_s = {3, 1, 0, −5}. At these laser energies, the amplitude of the sidebands increases by up to an order of magnitude compared to the reference spectrum of Fig. 3d recorded without the control laser, while their linewidth reduces to the resolution limit of 0.5 GHz. The resonant changes of the sidebands around ∆_L are analogous to optomechanical heating and cooling, where the scattering of the control laser photons to the sidebands is accompanied by the emission or absorption of phonons. Under RF-generated phonons, however, both scattering processes occur simultaneously, preserving coherence and leading to blue- and red-shifted sidebands with amplitudes exceeding that of the laser-pumped one. We note that the enhancement is selective and affects only some of the sidebands. This behavior is attributed to the interference of two RF-induced sideband combs: one around the phonoriton ZPL and the second one around the control laser energy. The situation is conceptually close to the interference of optical sidebands induced by acoustic waves with different frequencies reported in Ref. [40].

Discussion and perspectives
In summary, we demonstrated a compact polaromechanical platform for the coherent conversion between the microwave and optical domains. We established that the polariton-phonon interactions are in the OSC regime, leading to phonoritons, and demonstrated the control of the phonoriton spectrum by electrically generated GHz phonons as well as by an external resonant laser beam. Microscopic models for the optomechanical interaction have also been provided.
The polaromechanical platform opens a new area of GHz phonoritonics. In addition to the optical generation of GHz phonons and the coherent microwave-optical interconversion, we envision that the polaromechanical platform will be attractive for other conventional and emerging applications. Examples include the amplification of optical signals, the generation of tunable (and symmetric) optical frequency combs for atomic clocks, high-precision spectroscopy, and optical synthesizers, as well as the preparation of quantum states [41].
Our results also hint at a far richer and previously unexplored physics, which paves the way to new interaction regimes between optical, electronic and mechanical degrees of freedom in the solid state. The large amplitudes of electrically generated GHz strain open the way to phonon nonlinearities, which can be applied for harmonic generation and mixing as well as for parametric processes and phonon squeezing [42].
The observed GS splitting into two pseudo-spin states suggests that the acoustic strain can be used as a source of synthetic magnetic fields for polariton-based topological structures. The results of this work challenge the existing understanding of the polaromechanical interactions, in particular regarding the role of non-linear interactions.
Lastly, a further challenge is to reach single-polariton cooperativities C_0 ≥ 1 at GHz frequencies by exploiting the large polariton-phonon coupling [18,43,44]. For 20 GHz phonons [22,25], the thermal phonon occupation is n_th ≈ 1 at 1 K, which enables single-phonon manipulation at relatively high temperatures. The present structures can already reach C_0 > 1/20 (cf. Table SM-V): the current understanding provides pathways to increase C_0 by optimizing the trap geometry and material properties.
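The quoted thermal occupation follows from the Bose-Einstein distribution, n_th = 1/(exp(hf/k_B T) − 1). The short sketch below evaluates it for the 20 GHz, 1 K case mentioned above and, for comparison, for 7 GHz phonons at the 10 K measurement temperature given in Methods; the function and variable names are illustrative.

```python
import math

PLANCK = 6.62607015e-34      # J s (exact SI value)
BOLTZMANN = 1.380649e-23     # J / K (exact SI value)


def thermal_occupation(frequency_hz, temperature_k):
    """Bose-Einstein occupation of a mode with the given frequency and temperature."""
    return 1.0 / math.expm1(PLANCK * frequency_hz / (BOLTZMANN * temperature_k))


n_20ghz_1k = thermal_occupation(20e9, 1.0)
n_7ghz_10k = thermal_occupation(7e9, 10.0)

print(f"n_th(20 GHz, 1 K) = {n_20ghz_1k:.2f}")
print(f"n_th(7 GHz, 10 K) = {n_7ghz_10k:.1f}")
```

The first value is of order unity, consistent with the estimate in the text, while 7 GHz phonons at 10 K carry a few tens of thermal quanta.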
The platform can thus provide coherent control at the single particle level, which can be applied for the generation of non-classical light at GHz rates [45] as well as quantum interfaces between remote polariton qubits [46].
The authors thank Dr. Stefan Fölsch for discussions and for a critical review of the manuscript, and R. Baumann, S. Rauwerdink, and A. Tahraoui for technical support. Data underlying the reported results are included in the main text and supplementary material.

Figs. S1 to S12
Tables I to V

Methods
Microcavity sample. In the (Al,Ga)As material system, the sound and light impedance ratios as well as the ratios between sound and light velocities are almost identical. As a consequence of this "double magic coincidence" [19], an (Al,Ga)As MC designed to confine near-infrared photons also efficiently confines GHz phonons.
The studied MCs consist of lower and upper distributed Bragg reflectors (DBRs) and the MC spacer region containing six 15 nm-thick GaAs QWs separated by 7.5 nm-thick Al_0.1Ga_0.9As barriers.
The position of the QWs is optimized in order to maximize the coupling to the photonic and phononic modes of the MC. The lower and upper DBRs consist of triple pairs of [(58.1 nm) Al_0.1Ga_0.9As / (63.1 nm) Al_0.5Ga_0.5As], [(58.1 nm) Al_0.1Ga_0.9As / (67.6 nm) Al_0.9Ga_0.1As] and [(63.1 nm) Al_0.5Ga_0.5As / (67.6 nm) Al_0.9Ga_0.1As]. The DBR design provides confinement for optical and acoustic modes with wavelengths λ_o ≈ 809/n_GaAs nm and λ_a = 3λ_o, respectively. Here, n_GaAs is the GaAs refractive index. The spacer is a 3λ_o/2 cavity for photons. The acoustic wavelength corresponds to bulk phonons of ∼ 7 GHz.
The optical and phonon responses of the MCs are detailed in SM-I-B. The coupling between polaritons and phonons is dominated by the deformation potential interaction.
Polariton and phonon confinement. Structured MCs are fabricated by interrupting the molecular beam epitaxy growth after the deposition of the cavity spacer embedding the QWs and structuring it by photolithographically defined shallow etching [47]. Zero-dimensional confinement regions (the traps) for photons and phonons are defined by nm-high and µm-wide regions created within the MC spacer [24] (cf. Fig. 1a). We present experimental results recorded on two square polariton traps (traps T_1 and T_2) with side a = 4 µm and excitonic contents X² = 0.05 (corresponding to a detuning δ_CX = −10 meV between the bare photon and exciton energies) and X² = 0.2 (δ_CX = −5 meV), respectively.
Bulk acoustic wave transducers. The phonon generation relies on the transduction of a super-high-frequency (SHF, 3-30 GHz) radio-frequency (RF) voltage to sound waves achieved in capacitor-like piezoelectric structures, namely bulk acoustic wave (BAW) resonators (BAWRs) [48]. A ring-shaped piezoelectric BAWR [25] was fabricated on the top surface of the MC to inject monochromatic LA BAWs with frequency f_M tunable around f_LA^(3λ) = 7 GHz into the trap. An important feature of SHF BAWs is the very weak and essentially frequency-independent acoustic attenuation at temperatures below ∼ 30 K. This leads to exceptionally long BAW propagation lengths, which reach up to a cm. Substrates with polished back surfaces thus become efficient acoustic cavities with enhanced acoustic amplitudes: here, the BAWs experience specular reflection at the surfaces and make several round trips through the MC spacer before they attenuate. This phonon backfeeding to the MC region boosts the effective quality factor (Q_a,eff) to values Q_a,eff > 10^4 in the 5-20 GHz range and, hence, to very large Q_a,eff × f_M products exceeding 10^14 Hz.
Optical characterization. The spatially and energy-resolved photoluminescence (PL) measurements were carried out at a temperature of 10 K in a liquid-He cryostat. The condensates were excited using a single-mode continuous-wave (cw) cavity-stabilized laser with the wavelength tuned in the 760-780 nm range. The Gaussian-like excitation spot had a diameter of ∼ 40 µm on the sample and was centered on the trap. The excitation beam had an incidence angle of ∼ 15°. For the standard measurements with a spectral resolution of about 0.1 meV, the magnified PL image of the sample was projected onto the entrance slit of a single-grating spectrometer and recorded using a nitrogen-cooled CCD camera.
High-resolution spectroscopy of condensates. Figure SM 3 sketches the high-resolution optical setup used to detect phonon sidebands in the emission of confined condensates. A part of the collected PL was diverted using a mirror and coupled into a single-mode (5 µm core diameter) fiber. A longpass filter blocked the scattered light from the pump laser. The fiber guided the PL to a piezo-tunable Fabry-Perot etalon (FP). The FP has a finesse of ∼ 240 and a free spectral range (FSR) of 68 GHz. The transmission wavelength of the FP was tuned by an external voltage source controlled from a PC. The PL signal filtered by the FP was guided by another single-mode fiber to the entrance of a single-grating spectrometer. The latter resolved the PL from the different FSRs of the FP, which was then detected by a nitrogen-cooled CCD. Custom-made software was used to control the voltage applied to the FP, which allowed scans with a resolution of 0.28 GHz. In order to avoid temperature-induced drifts, the FP was actively stabilized with an external heater. In this configuration, the spatial information is lost. In order to avoid collecting PL from other traps on the sample, we measured on a sample area that contained an isolated trap.
Resonant optical excitation. Some experiments were carried out with the simultaneous excitation of the trap by two lasers: the non-resonant pump and a control laser. The wavelength of the single-mode (linewidth γ_L ≤ 300 kHz) cw control laser was precisely tuned using a feedback signal provided by a high-resolution wavelength meter. The control laser was focused onto the same area as the trap into a spot of ∼ 40 µm diameter. As schematically shown in Fig. SM 3, the direct reflection of the control laser was blocked.
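The quoted scan resolution is consistent with the usual estimate for a Fabry-Perot etalon, namely the free spectral range divided by the finesse. A minimal sketch with the two values given above follows; the comparison with the 7 GHz sideband spacing is added only for illustration.

```python
def etalon_resolution(fsr, finesse):
    """Transmission linewidth (FWHM) of a Fabry-Perot etalon: FSR divided by finesse."""
    return fsr / finesse


fsr_ghz = 68.0    # free spectral range quoted in the text
finesse = 240.0   # finesse quoted in the text

resolution_ghz = etalon_resolution(fsr_ghz, finesse)
points_per_sideband = 7.0 / resolution_ghz   # resolution elements per 7 GHz spacing

print(f"resolution = {resolution_ghz:.3f} GHz, "
      f"elements per sideband spacing = {points_per_sideband:.1f}")
```

The result, about 0.28 GHz, matches the scan resolution stated in the text and is well below both the sideband spacing and the condensate linewidths reported in the main text.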
Optomechanical coupling in intracavity traps. For the determination of the optomechanical couplings, we first calculated the eigenmodes for photons as well as coupled TA and LA phonons confined in a trap with infinite potential barriers and dimensions a_x = a + ∆a/2 and a_y = a − ∆a/2 along the x||[1-10] and y||[110] directions, respectively (cf. SM-V-A and SM-V-B). For that purpose, the photon (phonon) wave functions were expressed in a basis of sinusoidal orbitals with p_η (m_η) lobes along η = {x, y}. The strain field determined from the phonon wave functions was then used to determine the deformation potential coupling to QW excitons using the Pikus and Bir (PB) Hamiltonian [49]. In the last step, we introduced the Rabi coupling between photonic and excitonic modes to determine the polariton eigenstates as well as the optomechanical coupling energies and cooperativities (cf. SM-V-C and SM-V-D). The quadratic interaction leading to stimulated phonoritons was determined by solving the optomechanical Hamiltonian for two states coupled by the phonons. Expressions for the effective quadratic coupling were then determined in the rotating-wave approximation (cf. SM-V-E).

The structured polariton microcavity (MC), which is displayed in Fig. 1 of the main text, was grown by molecular beam epitaxy (MBE) on a nominally intrinsic, double-side polished GaAs (001) wafer (Wafer Technology Ltd.). The layer structure was designed to simultaneously confine photons with an optical wavelength λ (defined as the ratio λ = λ_L/n_s between the free-space wavelength λ_L = 810 nm and the effective refractive index, n_s, of the MC spacer) and acoustic phonons with wavelengths λ and 3λ [1]. Phonons with these wavelengths will be referred to as λ-phonons and 3λ-phonons, respectively. For longitudinal acoustic (LA) phonons propagating along the z||⟨001⟩ direction of GaAs, these wavelengths correspond to phonon frequencies of approximately 21 and 7 GHz, respectively. The layer structure of the sample is summarized in Table I. The spacer region of the MC has an optical thickness 3λ/2 (corresponding to half the LA phonon wavelength) and includes six 15-nm-thick GaAs quantum wells (QWs) placed close to the antinodes of the optical field and the acoustic field.
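The quoted phonon frequencies follow from f = v/λ_phonon with λ = λ_L/n_s. The sketch below reproduces them under two assumed material parameters that are not given in the text: an LA sound velocity of about 4730 m/s along [001] in GaAs and an effective spacer index n_s ≈ 3.6. Both values, and all names in the code, are illustrative assumptions; only λ_L = 810 nm and the TA/LA velocity ratio of 0.7 are taken from the text.

```python
LAMBDA_L_M = 810e-9    # free-space wavelength from the text
N_SPACER = 3.6         # ASSUMED effective refractive index of the spacer
V_LA_M_S = 4730.0      # ASSUMED LA sound velocity along [001] in GaAs
TA_LA_RATIO = 0.7      # TA/LA sound velocity ratio from the text


def phonon_frequency_ghz(order, velocity_m_s):
    """Frequency of a phonon with wavelength order * lambda, lambda = lambda_L / n_s."""
    wavelength_m = order * LAMBDA_L_M / N_SPACER
    return velocity_m_s / wavelength_m / 1e9


f_la_lambda = phonon_frequency_ghz(1, V_LA_M_S)
f_la_3lambda = phonon_frequency_ghz(3, V_LA_M_S)
f_ta_lambda = phonon_frequency_ghz(1, TA_LA_RATIO * V_LA_M_S)

print(f"f_LA(lambda)  = {f_la_lambda:.1f} GHz")
print(f"f_LA(3lambda) = {f_la_3lambda:.1f} GHz")
print(f"f_TA(lambda)  = {f_ta_lambda:.1f} GHz, 2 x f_LA(3lambda) = {2 * f_la_3lambda:.1f} GHz")
```

With these assumed parameters the LA frequencies come out close to 21 and 7 GHz, and the λ TA mode lies within a few percent of twice the 3λ LA frequency.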
The spacer is sandwiched between a lower (LDBR) and an upper (UDBR) distributed Bragg reflector (DBR). Each DBR period consists of a stack of three pairs of Al_x1Ga […] I. This configuration yields a strong modulation of the optical and acoustic properties with periodicities of λ/2 and 3λ/2, which enables the simultaneous reflection of both photons with optical wavelength λ and phonons with wavelengths λ and 3λ. The intracavity traps are created by interrupting the MBE growth after the deposition of the upper part of the spacer layer (i.e., after the growth of the QWs) [2]. The sample was then extracted from the growth chamber and its surface was photolithographically patterned by shallow (17 nm) etching. It was subsequently reintroduced into the growth chamber for the overgrowth of the UDBR. Due to the conformal nature of the MBE growth, the photolithographically imprinted profile is maintained during the overgrowth, leading to the formation of a thicker spacer region at the mesa sites (denoted here as the non-etched regions) in between etched areas. The lateral trap confinement results from the lower acoustic and optical cavity resonance energies at these positions. The overgrowth was carried out at a lower deposition temperature (420 °C as compared to the temperature of 640 °C employed for the first growth cycle) to minimize the smoothing of the lateral interfaces due to surface diffusion of impinging atoms.

B. Optical and acoustic response
The reflection and transmission of optical and acoustic waves through the etched and non-etched regions of the sample were determined using a transfer matrix procedure to calculate the optical and acoustic fields following the excitation (i.e., optical or acoustic) at the sample surface.The results are summarized for the optical and acoustic modes in the left and right panels for Fig. 1.
Figure 1(a) displays the optical reflectivity spectrum R ph for photons impinging at normal incidence on the sample surface.The calculations of the optical response do not include the effects of the excitonic resonances: they yield, therefore, only the bare optical MC mode (C).The blue and red curves apply for the etched and non-etched regions of the sample.The high reflectivity band between 1.48 and 1.57 eV is the photon stop-band introduced by the MC DBRs.The sharp dip is the photon cavity mode C, which in the etched areas (blue curve) is blue-shifted by 16 meV with respect to the non-etched ones (red curve).The dependence of the optical field for the C mode is illustrated in Fig. 1(d) and in the close-up around the spacer region of Fig. 1(g).Here, the vertical axis displays the squared The previous picture for photons also applies to phonons.The central and left panels of Fig. 1 show the acoustic reflectivities of the longitudinal acoustic modes with wavelenghts 3λ and λ, respectively.The DBRs introduce acoustic stop-bands in the phonon reflectivity (R a ) spectrum [Fig.1(b-c)] extending from 6.6 to 7.3 GHz and between 20 and 21.5 GHz for the 3λ and λ modes, respectively.The dips within the stop-bands are associated with the excitation of modes confined within the MC spacer.The reduced spacer thickness in the nERs blue-shifts the phonon frequency by approximately 27 MHz for the 3λ-phonons and by ∼ 100 MHz for the λ-phonons.The profiles for the squared strain field [Figs.1(e-f)] are again confined within the spacer: the penetration length into the DBRs is much larger for the 3λ phonons than for the λ ones, where this penetration is comparable to the one for photons in Figs.1(d).The longer penetration is due to the lower number of effective DBR stacks with 3λ/2 periodicity: as a result, the acoustic quality factor for 3λ phonons of Q a ∼ 200 is much smaller than for photons and for λ-phonons.
The higher $Q_a$ for the λ phonons leads to much narrower resonances in Figs. 1(b-c). Although the overlap of the strain field with the QWs has not been optimized for λ phonons, their average squared strain field over the QWs exceeds that of the 3λ modes by approximately two orders of magnitude [cf. Figs. 1(e-f)]. The stronger coupling to excitons resulting from the higher amplitudes of the λ phonons has important consequences for self-oscillation effects. As will be discussed later, the phonon back-reflection at the sample surfaces considerably increases the quality factors for both types of phonons [1].

II. Photoluminescence of confined polaritons

A. Light-matter coupling
Variations of the MBE fluxes along the sample surface create a slight reduction of the thickness of the MBE layers as one moves from the center to the border of the substrate wafer. The thickness reduction blue-shifts the optical mode of the MC, while the excitonic resonances remain approximately constant. This relative variation between the optical and excitonic energies enables a precise determination of the Rabi (light-matter) coupling at different positions on the sample surface. The solid lines are fits to a photon-exciton coupled-oscillator model for the interaction between the three modes. The model assumes that the reduced spacer thickness blue-shifts the optical modes in the etched regions relative to the non-etched ones, while the excitonic energies remain constant. The fits yield three polariton modes: LP, MP, and UP. The parameters used for the fits are summarized in Table II. The dashed lines show the fitted bare $X_{hh}$ and $X_{lh}$ exciton energies, as well as the radial dependence of the photon energy (C) in the non-etched regions. LP is highly photonic at the center of the wafer ($r = 0$). Zero detuning between the C and $X_{hh}$ energies is reached at $r = 15$ mm. Note that the lateral confinement potential for the intracavity traps, which is equal to the difference between the UP energies in the etched and non-etched regions, decreases with $r$. $T_1$ and $T_2$ mark the positions of the intracavity traps investigated here.

Figure 3b presents the dependence of the trap spectrum on the excitation power of the non-resonant laser. The upper section shows the integrated intensity of the ground state (GS). At a power of ∼ 50 mW (dashed vertical line), the polaritons undergo a transition to a polariton Bose-Einstein condensate (BEC). In the color map, the dashed horizontal line (labelled $E_C$) denotes the energy of the bare cavity mode that was used to fit the spatial spectrum of the trap. The energy difference between $E_C$ and the condensate in the GS indicates that the system is still in the strong-coupling regime above the condensation threshold.
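The three-mode coupled-oscillator model used for the fits can be sketched as the diagonalization of a 3 × 3 matrix coupling the cavity mode C to the two excitons. The sketch below is only an illustration: the function name, the convention that the off-diagonal elements equal half the Rabi splitting, and all numerical values are assumptions (the fitted parameters of Table II are not reproduced here).

```python
import numpy as np

def polariton_branches(e_c, e_xhh, e_xlh, rabi_hh, rabi_lh):
    """Return the (LP, MP, UP) energies of a three-mode coupled-oscillator model.

    All energies are in eV. rabi_hh and rabi_lh are taken as full Rabi
    splittings, so the off-diagonal elements are half of them (assumed
    convention, not taken from the source).
    """
    h = np.array([
        [e_c,         rabi_hh / 2, rabi_lh / 2],
        [rabi_hh / 2, e_xhh,       0.0],
        [rabi_lh / 2, 0.0,         e_xlh],
    ])
    lp, mp, up = np.linalg.eigvalsh(h)  # eigenvalues in ascending order
    return lp, mp, up

# Hypothetical placeholder values, NOT the fitted values of Table II.
lp, mp, up = polariton_branches(e_c=1.530, e_xhh=1.530, e_xlh=1.535,
                                rabi_hh=0.006, rabi_lh=0.003)
print(f"LP = {lp:.4f} eV, MP = {mp:.4f} eV, UP = {up:.4f} eV")
```

Sweeping `e_c` over the radial position while keeping the exciton energies fixed would trace out the three anticrossing branches in the same way the fits do.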

C. Estimation of the polariton population
We estimate the optically excited polariton population in a trap as

$$N_{MP} = (1 - R)\, r_A\, \eta\, \frac{P_{Exc}}{\hbar\omega_{Exc}}\, \tau .$$

In this expression, $R$ is the MC reflectivity at the excitation wavelength, $r_A$ is the ratio between the trap and the excitation spot areas, $\eta$ is the fraction of excited electron-hole pairs that are stimulated into the BEC from the exciton reservoir, $P_{Exc}$ and $\hbar\omega_{Exc}$ are the excitation power and photon energy, and $\tau$ is the exciton reservoir lifetime, which is estimated to be on the order of $\tau = 1$ ns. Assuming $R = 0.5$ for the excitation wavelength of 760 nm, $r_A = 0.01$ for a 4 × 4 µm² trap and a Gaussian excitation spot with a radius of 20 µm, and $\eta = 0.1$, we obtain $N_{MP} = 2400 \times P_{Exc}$ for $P_{Exc}$ in mW. Hence, at the condensation threshold $P_{Exc} = 50$ mW (cf. Fig. 3), we deduce $N_{MP} \approx 10^5$ for the BEC population.
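The estimate above can be checked numerically with a few lines. The sketch below assumes that the spot area is simply $\pi r^2$ for the quoted 20 µm radius (a simplification of the Gaussian profile); the function and variable names are illustrative.

```python
import math

H_PLANCK = 6.62607015e-34  # Planck constant (J s)
C_LIGHT = 2.99792458e8     # speed of light (m/s)

def polariton_population(p_exc_w, wavelength_m, reflectivity,
                         area_ratio, eta, tau_s):
    """N = (1 - R) * r_A * eta * P / (photon energy) * tau."""
    photon_energy_j = H_PLANCK * C_LIGHT / wavelength_m
    photon_rate = p_exc_w / photon_energy_j  # incident photons per second
    return (1.0 - reflectivity) * area_ratio * eta * photon_rate * tau_s

# Area ratio for a 4 x 4 um^2 trap and a spot of 20 um radius.
# Assumption: spot area = pi * r^2 (the text rounds this ratio to 0.01).
r_a = (4.0 * 4.0) / (math.pi * 20.0 ** 2)

n_per_mw = polariton_population(1e-3, 760e-9, 0.5, r_a, 0.1, 1e-9)
n_threshold = 50 * n_per_mw  # population at the 50 mW threshold
print(f"r_A = {r_a:.4f}, N per mW = {n_per_mw:.0f}, N(50 mW) = {n_threshold:.2e}")
```

With the unrounded area ratio, the result per mW is close to the quoted prefactor of 2400, and the threshold population is of order $10^5$.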

D. Acoustic frequency comb in PL of confined polaritons
Figure 4 shows the dependence of the PL spectrum of a trap below the condensation threshold on the radio frequency ($F_{RF}$) applied to the transducer. The spectra were recorded for a fixed RF power. The displayed $F_{RF}$ range corresponds to the range of the 3λ acoustic mode of the MC. The data show a pronounced energy modulation of all confined levels, which appears as a comb of narrow diamond-like shapes. Phonons escaping from the spacer region can be fed back to this region via acoustic reflection at the polished sample surfaces. As a consequence, the acoustic response of the MC is given by a combination of the relatively low-quality ($Q_{MC} \approx 180$) MC mode and the higher-quality ($Q_{comb} > 5000$) Fabry-Perot modes of the cavity formed by the whole sample thickness [1]. The modulation amplitude increases for the higher confined levels, which have a higher excitonic content: this behavior is consistent with a modulation mechanism dominated by the deformation potential interaction. The envelope of the modulation amplitude gives the shape and the width of the acoustic mode of the MC (localized within the spacer).

E. Population of rf-generated phonons
First, we determine the rf-induced strain from the high-resolution PL spectrum. Specifically, Fig. 2(a) of the main text shows ±4 phonon sidebands for an rf amplitude of $P_{RF}^{0.5} = 0.14$ W$^{0.5}$. This corresponds to a modulation amplitude of $\Delta E = 4 \times \hbar\Omega_M \approx 114$ µeV for the phonon frequency of $\Omega_M = 2\pi \times 7$ GHz. The value of the $zz$-component of the strain corresponding to $\Delta E$ can be found using $\epsilon_{zz} \approx \Delta E/(X^2 \times a_h)$ (cf. Eq. 19, neglecting the small contribution associated with the deformation potential). For the lower polariton Hopfield coefficient $X^2 = 0.08$ and the GaAs hydrostatic deformation potential $a_h$ (cf. Table III), we obtain $\epsilon_{zz} = 1.6 \times 10^{-4}$.
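The two steps above (sideband count to energy modulation, then energy modulation to strain) amount to the following arithmetic. In this sketch the value of $a_h$ is an assumption: Table III is not reproduced here, and 9 eV is chosen only because it reproduces the quoted strain. The function names are illustrative.

```python
H_EV_S = 4.135667696e-15  # Planck constant (eV s)

def sideband_shift_ev(n_sidebands, f_hz):
    """Energy modulation amplitude spanned by n phonon sidebands: n * h * f."""
    return n_sidebands * H_EV_S * f_hz

def strain_from_shift(delta_e_ev, exciton_fraction, a_h_ev):
    """epsilon_zz ~ Delta E / (X^2 * |a_h|), hydrostatic term only."""
    return delta_e_ev / (exciton_fraction * abs(a_h_ev))

# Four sidebands of a 7 GHz phonon (the text quotes ~114 ueV).
delta_e = sideband_shift_ev(4, 7e9)

# ASSUMED magnitude of the deformation potential; use the Table III value instead.
A_H_ASSUMED_EV = 9.0
eps_zz = strain_from_shift(114e-6, 0.08, A_H_ASSUMED_EV)
print(f"Delta E = {delta_e * 1e6:.1f} ueV, eps_zz = {eps_zz:.2e}")
```

For exactly 7 GHz the modulation amplitude comes out slightly above the quoted 114 µeV, which is within the rounding of the phonon frequency.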
The phonon population in a 4 × 4 µm² trap can be determined from $\epsilon_{zz}$ and the strain of a single 7 GHz phonon, $u^{ZPM}_{zz} = 1 \times 10^{-8}$, which is obtained from the numerical simulation described in SM Sec. V B (cf. Fig. 12). Thus, we estimate a phonon population $N_{phon} = 16000$ for $P_{RF}^{0.5} = 0.14$ W$^{0.5}$.

III. High-resolution spectroscopy of polariton condensates
A. Optical setup

Figure 5 sketches the high-resolution optical setup used to detect phonon sidebands in the emission of polariton BECs. The sample is mounted in an optical helium-flow cryostat with rf connections for the excitation of the BAWRs. The pump and control laser beams are used to excite the sample. A part of the collected PL is diverted using a mirror and coupled into a single-mode fiber (5 µm core diameter). A long-pass filter blocks the scattered light from the pump laser. The fiber guides the PL to a piezo-tunable Fabry-Perot etalon (FP). The FP has a finesse of ∼ 240 and a free spectral range (FSR) of 68 GHz. The transmission wavelength of the FP is tuned by an external voltage ($V_{piezo}$) source controlled from a PC. The PL signal filtered by the FP is guided by another single-mode fiber to the entrance of a single-grating spectrometer. The spectrometer resolves the PL from the different FSRs of the FP, which is then detected by a nitrogen-cooled CCD. Custom-made software was used to control the voltage applied to the FP, which allowed scans with a resolution down to 0.28 GHz. In order to avoid temperature-induced drifts, the FP was actively stabilized with an external heater. In this configuration, the spatial information is lost. In order to avoid collecting PL from other traps on the sample, we carried out our measurements on a sample area containing an isolated trap.
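The quoted scan resolution follows directly from the etalon parameters, since the finesse is the ratio of the free spectral range to the transmission linewidth. A minimal check, with an illustrative function name:

```python
def etalon_resolution_ghz(fsr_ghz, finesse):
    """Transmission linewidth (FWHM) of a Fabry-Perot etalon: FSR / finesse."""
    return fsr_ghz / finesse

resolution = etalon_resolution_ghz(fsr_ghz=68.0, finesse=240.0)
print(f"FP resolution = {resolution:.2f} GHz")
```

The result agrees with the resolution of 0.28 GHz stated for the scans.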

B. Polariton BEC linewidth
Figure 6 displays a high-resolution PL spectrum of the trap ground state (GS) as a function of the pump laser power above the condensation threshold. The trap GS is split into two levels (denoted L and U), separated by $2 \times \hbar\Omega_M$, where $\hbar\Omega_M$ is the phonon energy with $\Omega_M/2\pi = 7$ GHz. While their splitting remains constant, the L and U modes redshift above ∼ 120 mW due to optically induced heating at high laser excitation. The upper section of Fig. 6 shows the decoherence rate ($\gamma_{GS}$) of the resonances determined from their spectral linewidths. The black triangles were determined from a low-resolution measurement (without the FP etalon). These measurements yield $\gamma_{GS} \sim 50$ GHz below the threshold ($P_{Th} \approx 50$ mW, as indicated by the orange arrow), which reduces to the resolution limit of ∼ 20 GHz above the threshold. The blue diamonds and red circles give the corresponding $\gamma_{GS}$ of the L and U modes obtained with the high-resolution technique based on the piezo-tunable FP etalon. In the range 90-200 mW, the L mode has a linewidth of $\gamma_L \sim 1$ GHz, which corresponds to a coherence time of $\tau_L \sim 1$ ns. Around 250 mW, the linewidth reaches its minimum of $\gamma_L \sim 0.5$ GHz (2 ns). The increase of the linewidth for excitation above 250 mW is attributed to heating.

C. Self-induced sideband modulation
The color-coded plot of Fig. 7(a) displays as-measured high-resolution spectra of the condensate emission of the 4 × 4 µm² trap as a function of the optical excitation power, $P_{exc}$. These high-resolution measurements were recorded using the FP setup described above. The energy scale is specified in terms of the voltage applied to the piezo controlling the etalon. The figure shows a series of sidebands around the main emission line, the zero-phonon line (ZPL, marked by a red arrow). The spectroscopic features repeat with a periodicity determined by the free spectral range (FSR) of the etalon. All lines experience a small blueshift (less than the FSR) with increasing $P_{exc}$, which is attributed to polariton-polariton interactions. Figure 7(b) displays the same data after artificially aligning the ZPL of all the curves.
The same measurement also captures the emission of the excited state (ES) of the trap, as illustrated in Fig. 7(c). For this particular detuning, the condensation threshold for the ES is ∼ 4 times higher than the threshold for the GS in Fig. 7(b). Both the GS and ES spectra show a series of weak sidebands around the ZPL. The origin of these lines is discussed in the main text.

A. Raw sidebands data and correction
Figure 8(a) shows as-measured high-resolution spectra of a trap GS as a function of the RF power ($P^{2}_{RF}$) at a fixed $F_{RF} \approx 7$ GHz. The imperfect transduction of RF power into acoustic power leads to local heating of the sample, which is the origin of the redshift at higher $P^{2}_{RF}$. To simplify the analysis and presentation, each spectrum was shifted horizontally by a small amount in order to match the energies of the zero-phonon line, cf. Fig. 8(b).

B. Bessel-function fitting
Figure 9(a) displays a PL spectral map of the first excited state of a trap recorded for different amplitudes of the bulk acoustic wave (BAW), $A_{BAW} = \sqrt{P_{rf}}$. Here, $P_{rf}$ is the nominal rf power applied to the BAWR. Individual PL spectra for different acoustic amplitudes are displayed by the symbols in the central panel. With increasing BAW amplitude, one observes the appearance of an increasing number of well-defined sidebands displaced by multiples of the BAW quantum, $hf_{BAW}$.
In the presence of a coherent harmonic driving of the BEC, the PL spectrum is expected to be proportional to $P[\omega]$ [3][4][5]:

$$P[\omega] \propto \sum_{n=-\infty}^{\infty} J_n^2(\chi)\, \frac{\Gamma/2}{(\omega - \omega_0 + n\omega_a)^2 + (\Gamma/2)^2}, \qquad (1)$$

that is, a sum of Lorentzians with linewidths $\Gamma$, weighted by squared Bessel functions $J_n^2(\chi)$. Here, $\chi$ is the amplitude of the harmonic modulation of $\omega_0$, stated in units of the driving frequency $\omega_a$ (i.e., $\chi = \Delta\omega_0/\omega_a$). The Lorentzians have maxima at frequencies $\omega = \omega_0 - n\omega_a$, where $n$ is an integer. Note that since $\sum_n J_n^2(\chi) = 1$ [cf. wolfram.com/Bessel-TypeFunctions/BesselJ/23/01/], the effect of the modulation in Eq. 1 is to distribute the oscillator strength among the sidebands without affecting the overall PL intensity.
The solid lines in Fig. 9(c,d,e) are fits of Eq. 1 to the experimental data. The fits yield the modulation amplitude $\chi$ and the linewidth $\Gamma$ displayed as a function of $A_{BAW}$ in Figs. 9(f) and 9(g), respectively. $\chi$ is proportional to $A_{BAW}$.
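The sideband model described above (Lorentzians weighted by squared Bessel functions) can be sketched in a few lines. The following is an illustrative implementation, not the fitting code used for Fig. 9: the Bessel functions are evaluated from their power series to keep the example free of external dependencies, the Lorentzians are written with half width $\Gamma/2$, and the function names are assumptions.

```python
import math

def bessel_j(n, x, terms=60):
    """Bessel function of the first kind J_n(x), integer n, via its power series.

    Adequate for the moderate arguments (|x| up to ~10) relevant here.
    """
    n_abs = abs(n)
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k / (math.factorial(k) * math.factorial(k + n_abs))
                  * (x / 2.0) ** (2 * k + n_abs))
    # J_{-n}(x) = (-1)^n J_n(x)
    return -total if (n < 0 and n_abs % 2 == 1) else total

def sideband_weights(chi, n_max):
    """Relative weight J_n(chi)^2 of the n-th sideband, n = -n_max..n_max."""
    return {n: bessel_j(n, chi) ** 2 for n in range(-n_max, n_max + 1)}

def pl_spectrum(omega, omega0, omega_a, chi, gamma, n_max=20):
    """Sum of Lorentzians centred at omega0 - n*omega_a, weighted by J_n(chi)^2."""
    half = gamma / 2.0
    return sum(w * half / ((omega - omega0 + n * omega_a) ** 2 + half ** 2)
               for n, w in sideband_weights(chi, n_max).items())

weights = sideband_weights(chi=2.0, n_max=20)
print(f"sum of sideband weights = {sum(weights.values()):.6f}")
```

The printed sum illustrates the sum rule: the modulation redistributes the oscillator strength among the sidebands while conserving the total intensity.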

C. RF-induced sidebands in the excited state
The experimental curves for the first excited state of the trap, shown in Figs. 9(c,d,e), are fitted well by a sum of Lorentzians with linewidths $\delta E$, weighted by squared Bessel functions $J_n^2(\chi)$, where $\chi$ is the modulation amplitude, as described above.
Figure 9(g) shows the dependence of the sideband linewidth on the normalized acoustic amplitude ($A_{BAW}$). Similarly to the GS [see Fig. 2(g) in the main text], the linewidth $\delta E(A_{BAW})$ extracted from the fits decreases sharply by a factor of two from $\delta E(0.1) = 0.22\hbar\Omega_M = 1.75$ GHz to $\delta E(0.2) = 0.1\hbar\Omega_M = 0.8$ GHz and remains almost constant afterwards.

V. Theoretical Background
This section analyzes theoretically the coupling between confined phonons and polaritons, which can lead to self-oscillations in an intracavity trap. We start by describing the confined polaritons and phonons in the trap and then proceed to calculate the interaction between them mediated by the deformation potential mechanism.
Both polaritons and phonons are tightly confined within the trap, which is assumed to have a thickness $m_z\lambda_{BAW}/2$, where $\lambda_{BAW}$ is the acoustic wavelength of the 3λ LA BAW. In order to describe the confined polariton and phonon fields, we use a reference frame with the $x$, $y$, and $z$ axes along the $[1\bar{1}0]$, $[110]$, and $[001]$ crystallographic directions, respectively. In the discussion to follow, reference will also be made to the conventional Cartesian frame of the crystal (cf. Table III).
In order to enhance the optomechanical coupling, the QWs (with a thickness of 15 nm, much smaller than $\lambda_{BAW}$) are placed close to an antinode of the light field, which is assumed to be at the position $z = 0$ [cf. Fig. 1(c)]. The MC was designed to ensure that the uniaxial strain along $z \parallel [001]$ reaches its maximum amplitude $u_{zz,0}$ close to the same position [cf. Fig. 1(f)]. We will consider here only intracavity traps with a nominally square shape with side length $a$ in the $x$-$y$ plane. The MBE growth dynamics, however, distorts the trap geometry [2]. As a consequence, the lateral sizes of the trap will be taken to be $a_x = a + \Delta a/2$ and $a_y = a - \Delta a/2$ along the $x$ and $y$ directions, respectively, as illustrated in Fig. 10. The small size difference $\Delta a \ll a$ takes into account the anisotropic nature of the MBE overgrowth process on a structured surface, which yields traps with different sizes even for the overgrowth on a perfectly square mesa [2]. The trap dimension along the $z$ direction will be taken equal to the thickness of the spacer, $m_z\lambda_{BAW}/2$, where $m_z$ is an odd integer and $n_s$ is the average refractive index of the spacer layer.

A. Confined polariton modes
The envelope function of the confined polariton field in a trap is given by Eq. 2, where $k_z = \pi/\lambda_{BAW}$, $k_i = \pi/\ell_i$ ($i = x, y$), and the indices $(p_x, p_y)$ are the numbers of transverse lobes (i.e., perpendicular to $z$) of the polariton mode along the $x$ and $y$ directions, respectively. Equation 2 applies to states with confinement energy shifts much smaller than the height of the trap potential barrier. The prefactor within the square root is a normalization factor ensuring that $|\langle\Psi_{(p_xp_y)}\rangle|^2 = 1$. As will be discussed in detail later, there are two polariton modes for each $(p_xp_y)$ pair, labeled by the pseudo-spin superscript index $s$.
For small asymmetries $\Delta a/a$, the energy of the $(p_xp_y)$ polariton state can be written in terms of $\Delta E_{01}$, which denotes the energy difference between the ground state (GS) and the first excited state (ES) of the square potential. In this expression, $k_\perp = 2\pi/a$ and $m_p$ is the polariton mass.

Confinement effects on excitonic states
The high-resolution spectroscopic measurements in Fig. 6 show a splitting of the polariton GS into two pseudo-spin levels (index $s$ in Eq. 2). This splitting calls for a more detailed description of the coupling between the excitonic and photonic levels confined in the lateral trap potential. In the case of bare excitons, while the electron states have a simple s-type electronic wavefunction, the valence band states are superpositions of p-like $\langle X\rangle$, $\langle Y\rangle$, and $\langle Z\rangle$ orbitals, which can be mixed by the lateral confinement potential. The $\langle X\rangle$ and $\langle Y\rangle$ orbitals are degenerate for a square trap. This degeneracy can be lifted by a difference $\Delta a$ between the trap dimensions along the $x$ and $y$ directions, leading to the splitting $\Delta_{XY}$ of Eq. 5, where $m_X$ is the effective excitonic mass. In the right-most term of this equation, the prefactor has been expressed in terms of the energy difference $\Delta E_{01}$ between the first excited and ground polariton states of the trap. Since $m_{pol} \ll m_X$, the lateral confinement shift of the exciton states can be neglected and $\Delta_{XY}$ assumed to vanish. Furthermore, we will neglect fine-structure splitting effects related to the exciton spin levels, based on the fact that the light-matter interaction considerably increases the energetic splitting between the dark and bright exciton levels.

Confinement effects on the photonic states
We now introduce the bare photonic states and their coupling to the excitons. The bare states will be described in a basis of right ($L_R$) and left ($L_L$) circularly polarized light propagating with a wave vector component $k_\perp \propto \sin\theta_L$ perpendicular to the $z$ axis. $\theta_L$ is the propagation angle between the light beam and the $z$ axis, while $L_{TM}$ and $L_{TE}$ are the corresponding light modes with polarization in the $x$-$z$ plane (TM) and perpendicular to it (TE), respectively. Due to the dependence of the light propagation through the DBRs on the polarization direction, the $L_{TM}$ and $L_{TE}$ eigenstates are, in general, non-degenerate for $\theta_L \neq 0$ [6]. The energy of the bare photon states in a trap can be described by the Hamiltonian of Eq. 6 in the basis $(L_{TM}, L_{TE})$ [7], where $m_{ph}$ is the bare photon mass. In the second line of Eq. 6, the confinement energy shift of the bare photonic level is expressed in terms of the polariton interlevel splitting (in a manner analogous to the excitonic states in Eq. 5), assuming that the bare photonic mass in the MC is half the polariton mass $m_p$ (i.e., $m_{ph} \approx m_p/2$).
The longitudinal-transverse splitting $\Delta_{LT}(k_{av}, 0)$ depends on the photon wave vector as well as on the layer structure of the MC. When averaged over the wave function of a polariton mode $(p_xp_y)$, the energy splitting between the $L_{TM}$ (with energy $\hbar\omega_{TM}$) and $L_{TE}$ ($\hbar\omega_{TE}$) states becomes equal to $\delta^{(p_xp_y)}_{LT}$, given by Eq. 7. The approximation on the right-hand side of Eq. 7 applies for $\Delta a/a \ll 1$. The pseudo-spin splitting $\Delta_{LT}$ in Eq. 7 corresponds to the splitting between the bare photonic states for light propagation with an in-plane wave vector $k_x$. Figure 11 displays the average energy $[\hbar\omega_{TM}(\vec k) + \hbar\omega_{TE}(\vec k)]/2$ as well as $\Delta_{LT}(\vec k) = \hbar\omega_{TM}(\vec k) - \hbar\omega_{TE}(\vec k)$ calculated for the $(p_xp_y) = (1, 1)$ polariton mode with a wave vector $\vec k = (k_x, 0)$ in the sample structure used in this work. The calculations were carried out using a transfer matrix approach to determine the energy of the light states in the $x$-$z$ plane with polarization along $y$ (TE polarization) and in the incidence plane (TM polarization) in an empty MC, i.e., without including excitonic contributions to the dielectric response. For small propagation angles, both the average energy and the pseudo-spin splitting increase quadratically with the wave vector. The dashed vertical line marks the wave vector component $k_x = \pi/a_x$ corresponding to the polariton GS of a 4 × 4 µm² trap. From the splitting $\Delta_{LT}(k) = 9.5$ µeV indicated by the dashed line, one obtains from Eq. 7 the energy splittings $\delta^{(1,1)}_{LT} = 0.48$ µeV and $\delta^{(2,1)}_{LT} = 3.6$ µeV for the GS and the first excited state (ES) of a trap with $\Delta a/a = 10\%$. In both cases, the splitting is much smaller than the 3λ LA phonon quantum of $\hbar\omega_m = 28$ µeV.

B. Confined phonon modes
Similarly to the case of polaritons, we will assume the displacement field $\vec u(x, y, z)$ of the phonons to be confined within a box with lateral sizes corresponding to the trap size and a dimension along $z$ equal to the thickness of the cavity spacer, $m_s\lambda_{BAW}/2$, as illustrated in Fig. 10.
The confined phonon modes at the QW plane can be approximated by those of a rectangular box with dimensions $a_x \sim a_y$ along the $x$ and $y$ directions, respectively. If the lateral surfaces of the box at $x = \pm a_x/2$ and $y = \pm a_y/2$ are free to move, the displacement field $\vec u(x, y, z)$ of the three eigenmodes (Eq. 8) has a wave vector $k_z = m_z\pi/\lambda_{BAW}$ along $z \parallel [001]$. The indices $(m_xm_y)$, with $m_i = 1, 2, \ldots$ and $i = x, y$, describe the number of transverse lobes (i.e., perpendicular to $z$) of the acoustic mode. $m_z$ is the index of the longitudinal mode, equal to 1 and 3 for the λ and 3λ phonon modes, respectively. The displacement fields of these modes can be stated as the columns of the matrix in Eq. 8, which depend on position through the phase $k_xm_xx + k_ym_yy + k_zz$, and where $r_T$ is the ratio between the uniaxial deformation perpendicular and parallel to the $z$ axis. For GaAs, $r_T = -0.31$. The amplitude of the acoustic field in Eq. 8 is stated in terms of the amplitude $u_{zz,0}$ of the uniaxial strain along $z$, which, as previously mentioned, has a maximum close to the QW position.

The $\vec u^{TA}_{(m_xm_y)}(x, y, z)$ eigenmode in Eq. 8 is a pure transverse acoustic (TA) mode polarized in the $x$-$y$ plane of the trap, with frequency $f_{TA} \sim \frac{1}{2\pi}\sqrt{c_{44}/\rho}\; k_z$. The two other modes have a mixed transverse and longitudinal acoustic (LA) character induced by the lateral confinement. The mixing is dictated by $r_T$ as well as by the dimensions of the trap. For large traps (i.e., $a \gg \lambda_{BAW}$), $\vec u^{TA'}_{(m_xm_y)}(x, y, z)$ can be well approximated by a TA wave with a small displacement component along $z$. Its frequency is slightly smaller than $f_{TA}$. $\vec u^{LA'}_{(m_xm_y)}(x, y, z)$ is essentially an LA mode with a small transverse component and frequency $f_{LA}$.

The frequencies of the 3λ phonon modes for different $(m_xm_y)$ combinations are listed in the first two columns of Table V. These frequencies were determined using the elastic properties of GaAs. Since the spacer of the MC also includes (Al,Ga)As layers, which have higher acoustic velocities, the listed values slightly underestimate the measured resonance frequencies. The corresponding frequencies for the λ phonons are three times as large. To a very good approximation (i.e., with deviations of less than ∼ 5%), the relationship between the mode frequencies is $f_{TA'} \approx f_{TA} \approx 0.7 f_{LA}$ for modes with $m_i \leq 2$ ($i = x, y$). It is interesting to note that the frequencies $f_{TA'}$ and $f_{TA}$ for the λ phonons are approximately twice as large as the frequency $f_{LA}$ for the 3λ phonons. The frequency ratios were obtained by solving the elastic equations for a 4 × 4 µm² trap.

The vibrational eigenmodes of the trap are superpositions of the modes of Eq. 8 satisfying the boundary conditions at the trap borders. Here, we will assume that the uniaxial strain components vanish at the lateral trap borders. In this case, the displacement field of the mode $\vec u^{LA'}_{(m_xm_y)}(x, y, z)$, which is also the one that can be efficiently excited by the BAWRs, is given by Eq. 12 (similar expressions can be easily derived for the other modes from Eq. 8).

The changes in the lattice induced by the phonon field are proportional to the displacement gradient $\nabla\vec u_{(m_xm_y)}(x, y)$. The latter can be decomposed into a symmetric contribution, $\varepsilon_{m_xm_y} = \frac{1}{2}\left[\nabla u_{(m_xm_y)} + (\nabla u_{(m_xm_y)})^T\right]$, and an antisymmetric contribution, $\vec\omega_{(m_xm_y)} = \frac{1}{2}\left[\nabla u_{(m_xm_y)} - (\nabla u_{(m_xm_y)})^T\right]$. The first corresponds to the strain field, while the second yields a pure rotation of the lattice (i.e., without lattice distortion). $\vec\omega_{(m_xm_y)}$ is much smaller than $\varepsilon_{m_xm_y}$ and will be neglected in the next sections.
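The decomposition of the displacement gradient into strain and rotation can be illustrated with a short sketch. The example gradient below is purely illustrative (arbitrary units): it combines a uniaxial deformation along $z$ with a lateral deformation scaled by $r_T = -0.31$ and adds an arbitrary off-diagonal term so that the rotation part is non-zero. It is not the field of Eq. 12.

```python
def decompose_gradient(grad):
    """Split a 3x3 displacement gradient into strain and rotation parts.

    strain   = (grad + grad^T) / 2   (symmetric)
    rotation = (grad - grad^T) / 2   (antisymmetric)
    """
    strain = [[0.5 * (grad[i][j] + grad[j][i]) for j in range(3)]
              for i in range(3)]
    rotation = [[0.5 * (grad[i][j] - grad[j][i]) for j in range(3)]
                for i in range(3)]
    return strain, rotation

R_T = -0.31  # lateral-to-axial deformation ratio quoted for GaAs

# Illustrative gradient: e_zz = 1, e_xx = e_yy = r_T, plus an arbitrary
# du_x/dz = 0.2 term (assumption, chosen only to produce a rotation).
grad_u = [
    [R_T, 0.0, 0.2],
    [0.0, R_T, 0.0],
    [0.0, 0.0, 1.0],
]
strain, rotation = decompose_gradient(grad_u)
print("strain e_xz   =", strain[0][2])
print("rotation w_xz =", rotation[0][2])
```

The off-diagonal term splits equally between a shear strain and a rotation, while the diagonal (uniaxial) terms contribute to the strain only.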
The amplitude of the strain field $u_{zz,0}$ in Eq. 12 can be determined by calculating the energy stored in the strain field. This amplitude can be stated in terms of the phonon energy $\hbar\omega_{mec}$, the phonon number $n_{mec}$, and the zero-point amplitude $u^{ZPM}_{zz,0}$ (Eq. 13). The second term in the denominator within the square root of Eq. 13 is the correction introduced by the reduced trap size. The zero-point strain and displacement amplitudes were calculated using Eq. 13 for a single phonon confined in a 4 × 4 µm² trap with different thicknesses $\lambda_{BAW}/2$ (and thus different frequencies $f_{BAW}$; cf. Fig. 12). For short phonon wavelengths (i.e., frequencies above ∼ 5 GHz), all modes indexed by $(m_xm_y)$ have approximately the same frequency dependence of the strain (displacement) field. For the 7 GHz phonons investigated in this work, the strain amplitudes $u^{ZPM}_{zz} \sim 10^{-8}$ are expected to induce energy shifts of the excitonic electron-heavy hole transition of approximately $a_h u^{ZPM}_{zz} \approx 0.10$ µeV $= 2\pi\hbar \times 25$ MHz, where $a_h$ is the hydrostatic exciton deformation potential (cf. Table III).
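The conversion between the single-phonon energy shift and the corresponding frequency, and its comparison with the phonon quantum, is simple arithmetic. A minimal sketch with illustrative function names:

```python
H_EV_S = 4.135667696e-15  # Planck constant (eV s)

def energy_to_frequency_hz(energy_ev):
    """Frequency f = E / h corresponding to an energy E."""
    return energy_ev / H_EV_S

def frequency_to_energy_ev(f_hz):
    """Energy E = h * f of a quantum of frequency f."""
    return H_EV_S * f_hz

shift_mhz = energy_to_frequency_hz(0.10e-6) / 1e6  # single-phonon shift
phonon_uev = frequency_to_energy_ev(7e9) * 1e6     # 7 GHz phonon quantum
ratio = 0.10 / phonon_uev                          # shift / phonon quantum
print(f"shift = {shift_mhz:.1f} MHz, phonon quantum = {phonon_uev:.2f} ueV, "
      f"ratio = {ratio:.1e}")
```

The single-phonon shift is thus a small fraction of the phonon quantum, which is why a large phonon population is needed to produce resolved sidebands.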

Coupling mechanisms
The phonon field acts on the polariton states in the three different ways illustrated schematically in Fig. 13. The solid and dashed lines for a polariton with orbital index $p_xp_y$ indicate the states with up (↑) and down (↓) pseudo-spin indices, respectively:
• The phonons can modulate the energy of the individual polariton states with the coupling strength $\delta E^{(p_xp_y)}_{hh,(m_xm_y)}$: if the energy modulation frequency exceeds the polariton linewidth, this modulation leads to the formation of emission sidebands.
• Alternatively, they can induce an inter-mode coupling between polaritons with different orbital indices. The coupling strengths for inter-mode spin-conserving and spin-flipping processes are denoted by $g^{p_xp_y \leftrightarrow p'_xp'_y}_{\uparrow\uparrow,(m_xm_y)}$ and $g^{p_xp_y \leftrightarrow p'_xp'_y}_{\uparrow\downarrow,(m_xm_y)}$, respectively.
Details of the coupling mechanisms will be addressed in the next sections.

Averaged strain field for polariton coupling
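We will restrict ourselves to phonon-polariton coupling relying on the deformation potential mechanism, which has been shown to dominate the coupling in (Al,Ga)As MCs [1]. In this framework, the effects of the acoustic field on the coupling of the polariton mode $\Psi^{(s)}_{(p_xp_y)}$ to the mode $\Psi^{(s')}_{(p'_xp'_y)}$ comprise the modulation of the level energies ($g^{(p_xp_y)}_{0,(m_xm_y)}$) as well as the coupling between the pseudo-spin states ($g_{\uparrow\downarrow(m_xm_y)}$). The inter-level coupling is mediated by spin-conserving and non-conserving factors $g$. The coupling elements are obtained by averaging the strain field $\varepsilon_{(m_xm_y)}$ and the rotation $\vec\omega_{(m_xm_y)}$ over the polariton modes in the $x$-$y$ plane (cf. Eq. 15).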
From this point on, we will implicitly assume that the optoelectronic coupling is mediated by the excitonic component and no longer distinguish between $\chi^{(s)*}_{(p_xp_y)}$ and $\Psi^{(s)*}_{(p_xp_y)}$. It can be shown that the right-hand side of Eq. 15, which corresponds to the antisymmetric part related to rotation, vanishes when averaged over the excitonic wavefunction for the quasi-LA mode of Eq. 12.
We are first interested in the effects of the strain from the $\vec u^{LA'}_{(m_xm_y)}$ confined phonons on the lowest-lying polariton levels, corresponding to the pseudo-spin split ground state (GS) with $(p_xp_y) = (1, 1)$ and the first excited states (ES) with $(p_xp_y) = (2, 1)$, $(1, 2)$, or $(2, 2)$. These modes are expected to have the longest coherence times and the smallest energy separations, which can be comparable to the phonon quantum. The relevant phonons are those with the lowest energy, which are well confined within the cavity [i.e., the $(m_xm_y) = (1, 1)$ GS mode and the first ES modes with $(m_xm_y) = (2, 1)$ or $(1, 2)$].
Expressions can be derived from Eqs. 12 and 15 for the strain field components $\varepsilon$ induced by a $(m_xm_y)$ phonon and averaged over the polariton modes $(p'_x, p'_y)$ and $(p_xp_y)$. As an example, Table IV summarizes the average strain fields associated with the GS phonon mode $(m_xm_y) = (1, 1)$. The strain tensor is listed in engineering notation in the conventional frame ($\varepsilon^c$).

Table IV: Average strain fields associated with the phonon mode $\vec u^{LA'}_{(m_xm_y)}$ with $(m_x = 1, m_y = 1)$. The numerical values were calculated for a 4 × 4 µm² trap with a shape anisotropy $\Delta a/a = 0.1$. Entries marked as "tr" in the matrix are equal to their transpose.
The following points related to $\varepsilon$ are worth mentioning:
• A phonon mode with an uneven (even) index $m_i$ induces a strain field with even (uneven) orbital parity along the $i$ axis. In a perfectly square trap, this phonon can thus only efficiently couple polariton modes with the same (different) parity with respect to the $i$ axis. The vanishing $\varepsilon$ terms in Table IV result from combinations of indices for which this condition is not satisfied.
• As a consequence, phonons with uneven indices can efficiently modulate the energy of the individual states by inducing a strong coupling $g^{(p_xp_y)}_{0,(m_xm_y)}$ (cf. Fig. 13). This modulation amplitude, which is normally dominated by the $e^c_{zz}$ component, has an important role in the formation of sidebands.
• Phonons with even indices, in contrast, lead to low $g^{(p_xp_y)}_{0,(m_xm_y)}$ couplings. These phonons can, however, induce couplings between levels with different orbital indices with a strength $g^{p_xp_y \leftrightarrow p'_xp'_y}_{\uparrow\uparrow(\uparrow\downarrow),m_xm_y}$ mainly determined by the $e^c_{zz}$ and $e^c_{xz}$ components for spin-conserving and spin-flipping coupling (see below). As an example, a $(m_xm_y) = (2, 2)$ phonon can couple both ES modes [such as $(p_xp_y) = (2, 1)$ and $(1, 2)$] as well as the GS and ES polaritons $(p_xp_y) = (1, 1)$ and $(2, 2)$ (see Table V). The latter creates a channel for polariton relaxation from the ES to the GS with phonon emission. The coupling is not resonant since the splitting between the polariton states is normally much larger than the phonon energy.
• The uniaxial components $e^c_{xx}$, $e^c_{yy}$, $e^c_{zz}$ do not reduce the symmetry of the traps below the tetragonal one and can, therefore, not couple states with different pseudo-spins. This coupling requires shear components. In a perfectly square trap (i.e., with $\Delta a = 0$), the only non-vanishing shear component is the $e^c_{xy}$ one induced by the TA mode (this result also applies to other phonons). The TA phonons are, therefore, the only ones able to generate a $g^{p_xp_y \leftrightarrow p'_xp'_y}_{\uparrow\downarrow,m_xm_y}$ coupling for the efficient interaction of states with different pseudo-spin. A trap distortion can introduce a small component $e^c_{xy} \propto \Delta a/a$ for the LA' mode.
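The parity selection rule stated in the points above can be illustrated with a one-dimensional overlap integral. The sketch below assumes simple hard-wall box modes for both the polariton envelopes and the strain profile along one lateral axis (cosines for uneven indices, sines for even ones, on $|x| \leq a/2$); these are stand-ins chosen for illustration and not the exact expressions of Eqs. 2 and 12.

```python
import math

def box_mode(index, x, a):
    """Hard-wall box mode on [-a/2, a/2]: even parity for uneven index."""
    if index % 2 == 1:
        return math.cos(index * math.pi * x / a)
    return math.sin(index * math.pi * x / a)

def overlap(p_prime, m, p, a=4.0, steps=2000):
    """(2/a) * integral of mode(p') * strain(m) * mode(p) over the trap width.

    Evaluated with the midpoint rule on a grid symmetric about x = 0.
    """
    dx = a / steps
    total = 0.0
    for i in range(steps):
        x = -a / 2 + (i + 0.5) * dx
        total += box_mode(p_prime, x, a) * box_mode(m, x, a) * box_mode(p, x, a)
    return 2.0 / a * total * dx

print("m=1, p'=1 <-> p=1:", round(overlap(1, 1, 1), 4))  # same level, finite
print("m=1, p'=1 <-> p=2:", round(overlap(1, 1, 2), 4))  # different parity, vanishes
print("m=2, p'=1 <-> p=2:", round(overlap(1, 2, 2), 4))  # even phonon couples them
```

A phonon with an uneven index gives a finite diagonal overlap (energy modulation) but cannot connect levels of different parity, whereas a phonon with an even index does the opposite, in line with the vanishing entries of Table IV.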

Strain effects on excitonic states
The electron states forming the excitons have a simple s-type electronic wavefunction, which is only sensitive to the hydrostatic component of the strain field. The valence band states, in contrast, are superpositions of p-like $\langle X\rangle$, $\langle Y\rangle$, and $\langle Z\rangle$ orbitals, which become mixed by the strain. The relevant excitonic levels are those associated with the lowest-lying electron-heavy hole (hh) and electron-light hole (lh) transitions in the QWs, formed by linear superpositions of the s and p orbitals. We describe these states in a basis of valence band states with angular and spin momenta $\langle m_j, s\rangle$ with $j = 3/2$ and $m = 1/2$ given by $(X_{hh\Uparrow}, X_{lh\Downarrow}, X_{lh\Uparrow}, X_{hh\Downarrow})$, which mix with the conduction band electrons to form the excitons.
The effects of the strain on the excitonic states are described by the well-known Pikus and Bir (PB) Hamiltonian [10], which in the basis $(X_{hh\Uparrow}, X_{lh\Downarrow}, X_{lh\Uparrow}, X_{hh\Downarrow})$ is given by Eq. 18. In these expressions, $\Delta_{hh-lh}$ is the energy splitting between the lh and hh states, and the strain components $e^c_{ij}$ are obtained from the averaged strain field $\varepsilon$. The deformation potentials are listed in Table III. On the right-most side of the equation, $N_{mec}$ is the number of phonons and $g^{(p_xp_y)}_{0,(m_xm_y)}$ is the energy modulation strength by a single phonon. To simplify the notation, we will omit the phonon and polariton indices of $\delta E^{(p_xp_y)}_{hh,(m_xm_y)}$ and $\delta E^{(p_xp_y)}_{lh,(m_xm_y)}$ whenever these can be inferred from the context. Note that the uniaxial strain components along the diagonal modulate the energy of the hh and lh states, while the shear components introduce a reduction in symmetry that couples these states.
By including the exciton-photon coupling with an energy $\Omega_R$ (and $\Omega_R/3$ for the lh states due to the reduced oscillator strength), the polariton Hamiltonian $H_{Pol}$ in the $(X_{hh\Uparrow}, X_{lh\Downarrow}, X_{lh\Uparrow}, X_{hh\Downarrow}, L_L, L_R)$ basis is given by Eq. 21. The upper-left 4 × 4 matrix block corresponds to the exciton Hamiltonian of Eq. 18. In the lower-right 2 × 2 block, the diagonal term $\delta_{CX}$ is the detuning between the bare photon and exciton states, while the off-diagonal terms describe the mixing of the photonic states induced by the TE-TM coupling discussed in the previous section. We are interested in the lowest-energy eigenstates of Eq. 21, as well as in their mixing by the strain field. In order to obtain an analytical expression for these modes, we proceed with a series of simplification steps. We first note that the lh states are blue-shifted relative to the hh states by an energy $\sim \Delta_{hh-lh}$ much larger than the phonon energy.
To address the polariton-phonon coupling, we can then eliminate the lh states from the basis by using perturbation theory to include their effects on the coupling between the hh and light states. In this way, the dimension of the basis reduces from 6 to 4. We further simplify the expressions by taking into account that the strain components $e_{xz}$ and $e_{yz}$ vanish for the considered polariton modes (cf. Table IV). The simplified interaction matrix in the reduced basis $(X_{hh\Uparrow}, X_{hh\Downarrow}, L_L, L_R)$ after these transformations is given by Eq. 22. Equation 22 can be analytically diagonalized in the absence of the strain field (i.e., for $\varepsilon_{ij} = 0$) and used to determine the unitary transformation matrix to the polariton basis. Furthermore, phonon-mediated interactions between the upper (UP) and lower (LP) polariton states can be neglected since the energy difference between these two states is much larger than the phonon energy. In this way, we arrive at the interaction Hamiltonian of Eq. 23 for the phonon-induced coupling between the LP pseudo-spin states $\Psi^{(p_xp_y)}_{LP\Uparrow}$ and $\Psi^{(p_xp_y)}_{LP\Downarrow}$. From the diagonal elements of Eq. 23, one obtains the approximation of Eq. 24 for the time-averaged splitting between the lowest-energy eigenvalues ($E_\Uparrow$ and $E_\Downarrow$), which is valid for $\delta_{CX} \ll \Omega_R$.

Intra-level coupling
The on-site energy $g^{(p_xp_y)}_{0,\uparrow\uparrow(m_xm_y)}$ as well as the coupling energy between the pseudo-spin states $g^{(p_xp_y)}_{0,\uparrow\downarrow(m_xm_y)}$ induced by a single phonon can be determined directly from the diagonal and non-diagonal elements of Eq. 23, respectively (Eqs. 25 and 26). In these expressions, the superscripts ZPM specify that the strain components are evaluated for a single phonon and we assumed that |δ states. We note that $e^{c,\mathrm{ZPM}}_{xy}$ and, thus, the pseudo-spin coupling $g^{(p_xp_y)}_{0,\uparrow\downarrow(m_xm_y)}$, vanish for the TA' and LA' modes in square traps (i.e., with $\Delta a/a = 0$), as well as for uneven phonons. For the TA mode, in contrast, $g^{(p_xp_y)}_{0,\uparrow\downarrow(m_xm_y)}$ is large and not sensitive to the trap asymmetry. Equations 25 and 26 apply to the first-order coupling between pseudo-spin states. In the presence of stimulated phonons (e.g., induced by a BAWR), one can also envisage second-order coupling effects between pseudo-spin levels split by twice the phonon energy $\Omega_M$, with an effective coupling strength $G^{(p_xp_y)}_{2,\uparrow\downarrow(m_xm_y)}$. Here, the polariton state emits (or absorbs) two phonons, the first in a spin-conserving and the second in a spin-flipping transition. Such a second-order process becomes relevant when $G^{(p_xp_y)}_{2,\uparrow\downarrow(m_xm_y)} \ll g^{(p_xp_y)}_{0,\uparrow\downarrow(m_xm_y)}$.

Inter-level coupling
The framework of the previous section can be readily extended to address the inter-level coupling mechanism between polariton levels with different orbital indices $(p_x p_y)$ and $(p'_x, p'_y)$ mediated by a $(m_x m_y)$ phonon. The interaction between these states is described by the Hamiltonian of Eq. 28. The 2 × 2 diagonal blocks of Eq. 28 are defined in Eq. ) are also obtained from Eq. 23 by neglecting the strain-independent terms. This equation defines two single-phonon coupling strengths, one for spin-conserving and one for spin-flipping polariton transitions between the $(p_x p_y)$ and $(p'_x, p'_y)$ polariton states:

D. Coupling strength and cooperativity
The upper and lower panels of Table V summarize the coupling parameters related to phonon-mediated intra- and inter-level coupling of polariton levels, respectively. The table only includes data for optomechanical transitions with non-vanishing coupling strengths involving the TA', TA, and LA' phonon modes and polaritons with orbital indices $m_i$ and $p_i$ with $i = 1, 2$. The latter were determined using the parameters of Table III for zero photon-exciton detuning, $\delta_{CX} = 0$. The phonon frequencies in the second column were obtained by the diagonalization of Eq. 8. The energy splitting between the states in the 5th column depends on the TE-TM splitting (cf. Eq. 24) and is stated as a function of the asymmetry $\Delta a/a$. For the pseudo-spin states, this splitting is typically only a few µeV. The 7th and 8th columns list the level modulation strength $g^{(p_xp_y)}_{0,(m_xm_y)}$, which determines the strain-induced modulation amplitude of the level leading to emission sidebands. In agreement with the remarks in Sec. V C 2, the latter is only non-vanishing for the symmetric $(m_x m_y) = (1, 1)$ phonons.
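To illustrate how a level modulation strength translates into emission sidebands, the following minimal sketch assumes the textbook result for a sinusoidally modulated emitter in the resolved-sideband limit: the n-th sideband carries a relative weight $J_n(\chi)^2$, where $\chi$ is the ratio of the modulation amplitude to the phonon energy. The model, the function names, and the value of `chi` are illustrative assumptions and are not taken from Table V.

```python
import math


def bessel_j(n: int, x: float, steps: int = 2000) -> float:
    """Integer-order Bessel function J_n(x) from Bessel's integral,
    J_n(x) = (1/pi) * int_0^pi cos(n*t - x*sin(t)) dt (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi


def sideband_weights(chi: float, n_max: int) -> dict:
    """Relative weight J_n(chi)^2 of sideband n for n = -n_max..n_max.

    ASSUMPTION: chi = (level modulation amplitude) / (phonon energy),
    i.e. the modulation index of a purely sinusoidal energy modulation.
    """
    return {n: bessel_j(n, chi) ** 2 for n in range(-n_max, n_max + 1)}


if __name__ == "__main__":
    chi = 1.0  # hypothetical modulation index, for illustration only
    for n, w in sideband_weights(chi, 3).items():
        print(f"sideband {n:+d}: relative weight {w:.4f}")
```

In this picture the sidebands are symmetric about the zero-phonon line and their weights sum to one, so an increasing modulation index transfers weight from the central line to the sidebands.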

Coupling and self-oscillations
Optomechanical self-oscillations require phonon-induced transitions between polariton states spaced by one or two phonon energies, for first- and second-order coupling processes, respectively. The transition energies $\Delta E$ between the involved states are listed in the 6th column of Table V. Only the $(m_x m_y) = (11)$ phonon can induce intra-level transitions between pseudo-spin states: $\Delta E$ for these transitions is normally small. Inter-level transitions can take place between the GS and ESs as well as between almost degenerate ESs. The former are always of second order due to the large energy difference ($\Delta E \gg \Omega_\mathrm{mec}$). The latter can be of first or second order, depending on the value of $\Delta a/a$. Note, however, that $\Delta E$ can also be significantly changed by a preferential population of one of the pseudo-spin levels via non-linear polariton-polariton interactions.
As a second requirement, self-oscillations presuppose a sufficiently strong optomechanical coupling to overcome phonon and polariton decoherence. The latter can be quantified by the cooperativity $C^{(p_xp_y)}_{\uparrow\downarrow(\uparrow\downarrow)(m_xm_y)}$, a dimensionless parameter characterizing the optomechanical interaction between polaritons and phonons with finite decoherence rates $\gamma_\mathrm{pol}$ and $\Gamma_\mathrm{mec}$, respectively. For first-order processes, it is related to the coupling energy $g^{(p_xp_y)}_{(m_xm_y)}$ according to:

TABLE V: Coupling for polariton modes $(p_x p_y)$ and $(p'_x, p'_y)$ induced by 3λ phonons with orbital indices $(m_x m_y)$, as calculated using the parameters listed in Table III. The phonon frequencies listed in the 2nd column were determined using the elastic properties of GaAs. Since the spacer of the MC includes (Al,Ga)As layers, these values slightly underestimate the measured resonance frequencies. The corresponding frequencies for the λ phonons are three times as large. The energy splitting $\Delta E^{(p_xp_y)}$ between the states is given in terms of the trap asymmetry $\Delta a/a$ (cf. Eq. 24). The decay rates for all phonon and polariton modes in the cooperativity calculations were taken to be equal to those of the ground state polariton and the 3λ mode (see text for details).

Here, $N_\mathrm{pol}$ is the polariton population and $C^{(p_xp_y)\leftrightarrow(p'_xp'_y)}_{0,\uparrow\downarrow(m_xm_y)}$ the single-polariton cooperativity. Cooperativities exceeding unity mark the onset of self-oscillations between states split by one (or two, for second-order processes) phonon energies (see, e.g., Ref. [11]). For second-order processes, the cooperativity depends on the populations of the initial and final polariton states [12]. If one assumes both to be equal to $N_\mathrm{pol}$, then one obtains the following expression for the second-order cooperativity: 4Ω mec Γ mec (2nd order). (33) Note that, in contrast to the first-order case, the second-order cooperativity does not depend on the polariton decoherence rate. Columns 9 and 10 (11 and 12) in the upper part of Table V list the coupling energies and associated inverse cooperativities for intra-level coupling of the first (second) order. In the calculations, we assumed that all involved effective coupling rate of $G_2 = (g_1 - g_2) g_h/\Omega_M$. In general, $g_1$ is equal to $g_2$; however, different polariton populations of the modes can lead to $g_1 - g_2 \neq 0$ [13]. The above expression for $G_2$, assuming $g_\Delta = g_1 - g_2 = g_{0,\uparrow\uparrow}$, is used at the end of the section "Electrically stimulated sidebands" of the main text.
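As a rough numerical illustration of the first-order threshold criterion, the sketch below scales a single-polariton cooperativity with the polariton population and returns the population at which the cooperativity reaches unity, which is simply the inverse single-polariton cooperativity. The expression used for the single-polariton cooperativity, $C_0 = 4g_0^2/(\gamma_\mathrm{pol}\Gamma_\mathrm{mec})$, is a commonly used optomechanics convention adopted here as an assumption; the prefactor, the function names, and all numerical values are hypothetical.

```python
def single_polariton_cooperativity(g0: float, gamma_pol: float,
                                   gamma_mec: float) -> float:
    """ASSUMED convention: C0 = 4 * g0**2 / (gamma_pol * gamma_mec).

    All three arguments must be given in the same (energy or rate) units.
    """
    return 4.0 * g0 ** 2 / (gamma_pol * gamma_mec)


def cooperativity(n_pol: float, c0: float) -> float:
    """First-order cooperativity for a population n_pol: C = n_pol * C0."""
    return n_pol * c0


def threshold_population(c0: float) -> float:
    """Population at which C = 1, i.e. the inverse cooperativity 1 / C0."""
    return 1.0 / c0


if __name__ == "__main__":
    # Hypothetical numbers in arbitrary but consistent units.
    c0 = single_polariton_cooperativity(g0=1e-3, gamma_pol=1.0, gamma_mec=1e-2)
    print("C0 =", c0)
    print("threshold population =", threshold_population(c0))
```

Under these assumptions, a smaller inverse cooperativity means that fewer polaritons are needed to reach the self-oscillation threshold.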

Quadratic Hamiltonian and RWA
On the basis of measurements and arguments in Ref. [14], we assume that there exists an optomechanical coupling between two polaritonic modes (for simplicity, in this section we refer to the ground state and the excited state with the subscripts g and e, respectively) involving a quadratic phonon operator. Such an interaction is included in the Hamiltonian. For simplicity, we assume a driving Hamiltonian that feeds each of the three modes independently and coherently via the rates $\beta_e$, $\beta_g$, and $\beta_m$ with frequencies $\omega_{e,d}$, $\omega_{g,d}$, and $\Omega_d$, respectively:

$$\hat H_\mathrm{driving} = \hbar\left(\beta_m \hat b\, e^{i\Omega_d t} + \beta_g \hat a_g\, e^{i\omega_{g,d} t} + \beta_e \hat a_e\, e^{i\omega_{e,d} t} + \mathrm{h.c.}\right)$$

However, we can also apply the results below to the case in which the polaritonic modes are nonresonantly driven. Moving to the interaction picture and applying the rotating-wave approximation (RWA) yields a Hamiltonian expressed in terms of the detuning variables $\Delta_m = \Omega_M - \Omega_d$ and $\Delta_i = \omega_i - \omega_{i,d}$, with $i = e, g$. Notice that within this RWA picture the choice $\Delta_m = \Delta_e = \Delta_g = 0$ is the condition that we are interested in, since it implies that $\omega_e - \omega_g = 2\Omega_M$, i.e., the detuning between the excited and the ground state is resonant with the energy of two phonons.

Strong coupling
The optomechanical coupling of Eq. (44) can be linearized by considering fluctuations around the strong coherent amplitudes of each mode (arising due to the presence of polaritonic and mechanical driving by a BAWR transducer). First, following a procedure similar to the one used to describe optomechanical strong coupling for the case of a linear optomechanical interaction in Ref. [11], we consider fluctuations in the excited state, which is less populated than the ground state. For this, we replace $\hat a_e \to \alpha_e + \delta a_e$ and $\hat a_g \to \alpha_g$. We take the large numbers $\alpha_{g/e} = \sqrt{n_{g/e}}$ to be real. For $\Gamma_m \ll \kappa_e$ and at resonance, $\Delta_e = \Delta_g = \Delta_m = 0$, the expression for the eigenfrequencies simplifies to

$$\omega_\pm = -i\frac{\kappa_e}{4} \pm \sqrt{g^2 - \left(\frac{\kappa_e}{4}\right)^2},$$

and one can see that strong coupling arises when the effective interaction, $g = 2\sqrt{n_b n_g}\,G_2$, exceeds $\kappa_e/4$. The imaginary part of $\omega_\pm$ being $\kappa_e/4$ implies that the peak linewidth is $\kappa_e/2$, i.e., a 50% reduction. This is due to the fact that these excitations are half mechanical and the phonon lifetime, being much larger than $\kappa_e^{-1}$, contributes virtually nothing to the linewidth. Importantly, the compounded mechanical and optical enhancements of the effective coupling constant relax the number of phonons and polaritons required for reaching the strong coupling regime.
We proceed further by noting that the optomechanical interaction of the RWA Hamiltonian can be written as $\hbar G_2\left(\hat\Psi^\dagger \hat b^2 + \hat\Psi (\hat b^\dagger)^2\right)$ with the operator $\hat\Psi = \hat a_g^\dagger \hat a_e$. It is easy to see that, in the absence of such an interaction, this polaritonic operator decays with $(\kappa_e + \kappa_g)/2$, with $\kappa_g$ the decay rate of the ground state. Since $[\hat\Psi, \hat\Psi^\dagger] = \hat a_g^\dagger \hat a_g - \hat a_e^\dagger \hat a_e + 1$, it is convenient to define $\tilde\Psi = \hat\Psi/\sqrt{n_g - n_e + 1}$, so that in the high occupation limit $[\tilde\Psi, \tilde\Psi^\dagger] = (\hat a_g^\dagger \hat a_g - \hat a_e^\dagger \hat a_e + 1)/(n_g - n_e + 1) = 1$, i.e., this polaritonic excitation fulfills the bosonic commutation relation. Introducing this operator and the expansion $\hat b = \sqrt{n_b} + \delta b$ in Eq. (44), we obtain a beam-splitter-like Hamiltonian. For $\Gamma_m \ll \kappa_e + \kappa_g$ and at resonance, $\Delta_e = \Delta_g = \Delta_m = 0$, the eigenenergies simplify to

$$\omega_\pm = -i\frac{\kappa_e + \kappa_g}{4} \pm \sqrt{\tilde g^2 - \left(\frac{\kappa_e + \kappa_g}{4}\right)^2} \qquad (51)$$

For $n_g \gg n_e$, $\tilde g = 2\sqrt{n_b n_g}\,G_2$ and one recovers the same optomechanical effective coupling as in Eq. 49. For $\tilde g > (\kappa_e + \kappa_g)/4$, these solutions have linewidth $(\kappa_e + \kappa_g)/4$, i.e., half the original linewidth. Since the excited state linewidth was shown to halve when $2\sqrt{n_b n_g}\,G_2 \gg \kappa_e/4$, the halving of the sum of the linewidths indicates that the effective linewidth of the ground state also halves. We note that the operator $\tilde\Psi$ can be interpreted as describing a quasi-particle that represents the beating between states g and e, and which is resonantly coupled with the mechanics through the creation of two phonons.
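The resonant eigenenergies of Eq. (51) can be evaluated numerically to see the two regimes. The sketch below is a minimal illustration, assuming the form $\omega_\pm = -i(\kappa_e+\kappa_g)/4 \pm \sqrt{g^2 - ((\kappa_e+\kappa_g)/4)^2}$; the function names and the parameter values are hypothetical and all quantities are in the same arbitrary frequency units.

```python
import cmath


def resonant_eigenenergies(g: float, kappa_e: float, kappa_g: float):
    """Return (omega_plus, omega_minus) of Eq. (51) at zero detuning."""
    kappa = kappa_e + kappa_g
    root = cmath.sqrt(g ** 2 - (kappa / 4.0) ** 2)
    center = -1j * kappa / 4.0
    return center + root, center - root


def is_strongly_coupled(g: float, kappa_e: float, kappa_g: float) -> bool:
    """Strong coupling criterion quoted in the text: g > (kappa_e + kappa_g)/4."""
    return g > (kappa_e + kappa_g) / 4.0


if __name__ == "__main__":
    for g in (0.1, 1.0):  # hypothetical weak and strong coupling values
        w_plus, w_minus = resonant_eigenenergies(g, kappa_e=1.0, kappa_g=1.0)
        print(f"g = {g}: omega+ = {w_plus:.4f}, omega- = {w_minus:.4f}, "
              f"strong = {is_strongly_coupled(g, 1.0, 1.0)}")
```

Above the threshold the two solutions share the same imaginary part and split in their real parts; below it the real parts coincide and only the decay rates differ.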
relevant confined phonons have either longitudinal (LA) or transverse (TA) preferential polarizations with frequencies f

Figure 1f reveals two remarkable optomechanical features for P Exc > 200 mW: the locking of the pseudo-spin state at f (λ) TA and the emergence of sidebands separated by multiples of f (3λ)

Figure 1e shows a spectrum of the trap T 2 under the modulation by f (3λ) LA = 7 GHz phonons generated by the BAWR driven with radio frequency (RF) voltage.One now observes well-defined and symmetric sidebands separated by f (3λ) by the small yellow arrows) are the phonon sidebands due to the modulation by RF-generated phonons.The weak diagonal feature indicated by the blue dashed arrow is the Rayleigh scattering of the control laser as it was scanned from positive to negative values of ∆ L .Interestingly, the sidebands redshift by as much as 0.2 × f (3λ) LA when the control laser is within their spectral range, i.e., for |∆ L | ≤ 5 × f (3λ)

FIG. 1. Coherent optomechanics with confined polaritons. a Sketch of a structured MC, which consists of a spacer embedding quantum wells (QWs) sandwiched between acousto-optic distributed Bragg reflectors (aoDBRs). The µm-wide and nm-high mesa within the spacer provides a lateral confinement potential (the trap depicted by the yellow curve) for polaritons and phonons. The latter are injected optically or using a ring-shaped piezoelectric bulk acoustic wave resonator (BAWR). The phonons non-adiabatically modulate the discrete polariton energy levels (horizontal dashed yellow line) to form sidebands (dashed green lines). b Spatially resolved (energy-integrated) photoluminescence (PL) map showing the bright emission of trap T2. The superimposed dashed lines are outlines of the BAWR electrodes and its active area. PL spectra of the ground state (GS) of traps c T1 and d T2 in the BEC regime recorded with the BAWR off. In (d), the self-induced sidebands are indicated by blue arrows. e PL spectrum for T2 with the BAWR driven at 7014.3 MHz displaying induced sidebands (red arrows). f Spectral PL map of the GS of T2 for increasing optical excitation and the BAWR off. Curve d is a cross-section along the dashed horizontal line. The energy scales in c-f are relative to the main PL line and normalized to the phonon energies $hf^{(3\lambda)}_\mathrm{LA}$ and $hf^{(\lambda)}_\mathrm{TA}$ in the lower and upper axis, respectively.

Fig. SM 1: Optical properties of the structured microcavity. (a-c) Calculated optical reflectivity ($R_\mathrm{ph}$) and acoustic reflectivity ($R_a$) for the longitudinal acoustic (LA) phonons with wavelengths 3λ and λ, respectively. The dashed blue and solid red curves are for the etched and non-etched areas of the sample. The dips within the stop-bands of high reflectivity between 1.48 and 1.57 eV in (a), between 6.6 and 7.3 GHz in (b), and between 20 and 21.5 GHz in (c) are, respectively, for the polariton and phonon modes confined within the MC spacer. (d-f) Depth dependence (z direction) of the squared electric [$E^2(z)$] and strain ($|u^2_{zz}(z)|$) fields at the reflectivity minima within the stop-bands in (a-c). (g-i) Close-up of the profiles in (d-f) around the MC spacer. The green curves in (d) and (g) display the depth dependence of the refractive index (n) of the layers. The green curves in (e-f) and (h-i) display the corresponding depth dependence of the density ρ (normalized to the GaAs density, $\rho_\mathrm{GaAs}$).

Fig. SM 2: Photoluminescence lines recorded on non-etched (nER, green) and etched (ER, magenta) regions of the 2-inch MC wafer at different radial positions (r) relative to the wafer center (r = 0). The solid lines show a fit to a photon-exciton coupled oscillator model for the coupling between the optical mode and the hh and lh excitons, yielding the three polariton modes LP (lower polariton), MP (middle polariton), and UP (upper polariton). The dashed lines show the bare photon, hh and lh exciton energies. Note that the former is blue-shifted in the ER regions. $T_1$ and $T_2$ mark the positions of the intracavity traps discussed in the main text.

Figure 2 displays the energy of the polariton photoluminescence (PL) recorded on non-etched (green) and etched (magenta) regions as a function of the radial position on the sample (with r = 0 being the wafer center). At each position, one detects PL from the lower (LP), middle (MP), and upper (UP) polariton branches resulting from the coupling between the optical MC mode (C) and the electron heavy-hole ($X_{hh}$) and electron light-hole ($X_{lh}$) excitons in the QWs. The solid lines are fits to a photon-exciton coupled oscillator model for the interaction between the three modes. The model assumes that the reduced spacer thickness blue-shifts the optical modes in the etched regions relative to the non-etched ones, while the excitonic energies remain constant. The fits yield three polariton modes LP, MP, and UP. The parameters used for the fits are summarized in Table II. The dashed lines show the fitted bare $X_{hh}$ and $X_{lh}$ exciton energies, as well as the radial dependence of the photon energy (C) in the non-etched regions. LP is highly photonic at the center of the wafer (r = 0). The zero detuning between the C and $X_{hh}$ energies is reached at r = 15 mm. Note that the lateral confinement potential for the intracavity traps, which is equal to the difference between the UP energies in the etched and non-etched region, reduces with r. $T_1$ and $T_2$ mark the positions of the intracavity traps investigated here.

Figure 3a shows an exemplary PL map recorded along the spatial [-1-10] axis of a 4 × 4 µm² trap $T_1$. The measurements were carried out at 10 K under non-resonant optical excitation below the condensation threshold. Several confined levels can be easily identified. The spatially integrated spectrum is shown in the middle section. Each level has a characteristic spatial intensity profile, which reflects the squared profiles of the polariton wavefunctions. The spectrum can be faithfully reproduced using numerical simulations [2]. The simulated 2D spatial profiles are shown in the right section.

Fig. SM 3: (a) Map of the PL of a 4 × 4 µm² trap resolved in energy and in space, measured at 10 K. Several confined polariton modes are visible below the barrier energy around 1530 meV. The dashed and solid white lines on top of the data are the simulated cross-sections of the polariton confinement potential and squared wavefunctions, respectively. The section on the right shows a spatially integrated PL spectrum of the trap. The three sections further to the right display calculated spatial profiles of the squared wavefunctions for the three lowest energy levels. (b) Lower part: map showing the trap PL spectrum as a function of the optical excitation power ($P_\mathrm{Exc}$). Upper part: total integrated PL as a function of the excitation power. The vertical dashed line designates the condensation threshold power ($P_\mathrm{Th}$).

Fig. SM 4: Acoustic frequency comb in PL of confined polaritons.Dependence of the PL spectrum (below the condensation threshold) of a 4 × 4 µm 2 trap on the RF frequency (F RF ) applied to a BAWR transducer.The F RF range corresponds to the range of the MC acoustic mode.

Fig. SM 5: A sketch of the experimental setup for high-resolution optical spectroscopy of phonon sidebands in the emission of confined polariton condensates.The sample is mounted in a liquid He cryostat with rf connection to drive the BAWRs.The orange and red arrows show the optical paths of the pump and control laser beams, respectively, used to excite the sample.The PL signal (represented by the yellow area) is fiber-coupled to a Fabry-Perot etalon (FP) tunable by a piezo-controller (V piezo ).The output of the FP is directed via a second fiber to a spectrometer with a CCD detector.

Fig. SM 6: (Lower panel) High-resolution spectrum of the ground state (GS) of the trap as a function of the optical excitation power ($P_\mathrm{Exc}$). The lower and upper split levels are designated L and U, respectively. (Upper panel) GS linewidth ($\gamma_\mathrm{GS}$) measured with a low resolution of ∼25 GHz (black triangles) and with a high resolution of ∼0.3 GHz for the L and U states (red circles and blue diamonds, respectively).

Fig. SM 7: Self-induced sidebands.(a) Raw maps of the ground state (GS) PL recorded for increasing optical excitation power P exc focused on a spot of 10 µm positioned over a 4 × 4 µm 2 trap.The energy (horizontal axis) is specified in terms of the voltage applied to the piezo controlling the etalon with a voltage to frequency conversion factor of 6V per free spectral range of 68 GHz.The zero-phonon line (ZPL) is marked by a red arrow.(b) Same data after aligning all of the ZPL.The scale is stated in units of the LA 3λ phonon frequency ℏΩ M .(c) Shifted data for the excited state of the trap.The yellow arrows display the energies of the TA and LA λ-phonons.
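The conversion between the etalon piezo voltage and the frequency axis quoted in the caption of Fig. SM 7 (6 V per free spectral range of 68 GHz) amounts to simple arithmetic, sketched below. The sketch assumes a strictly linear piezo response and uses hypothetical function names; the 7 GHz default for the LA 3λ phonon frequency is the nominal value quoted for the RF-driven sidebands.

```python
FSR_GHZ = 68.0        # free spectral range of the etalon (from the caption)
VOLTS_PER_FSR = 6.0   # piezo voltage needed to scan one FSR (from the caption)


def piezo_voltage_to_ghz(voltage: float, zpl_voltage: float = 0.0) -> float:
    """Frequency offset (GHz) from the zero-phonon line at zpl_voltage.

    ASSUMPTION: the piezo scan is linear in the applied voltage.
    """
    return (voltage - zpl_voltage) * FSR_GHZ / VOLTS_PER_FSR


def ghz_to_phonon_units(freq_ghz: float, phonon_ghz: float = 7.0) -> float:
    """Express a frequency offset in units of the phonon frequency."""
    return freq_ghz / phonon_ghz


if __name__ == "__main__":
    offset = piezo_voltage_to_ghz(1.0)
    print(f"1 V  ->  {offset:.2f} GHz  ->  "
          f"{ghz_to_phonon_units(offset):.2f} phonon energies")
```

With these numbers, one volt corresponds to about 11.3 GHz, so a 7 GHz sideband spacing maps to roughly 0.6 V on the raw axis.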

Fig. SM 8: Dependence on the RF power (P 2 RF ) applied to the BAWR of the high-resolution PL spectrum of the polariton ground state (a) as-measured and (b) after energy correction to match the zero phonon line for the different P 2 RF .The yellow arrows display the energies of the TA and LA λ-phonons.

Fig. SM 9: (a) Dependence of the first excited state emission spectrum of a trap on acoustic amplitude ∼ √ P RF .(b-e) PL spectra corresponding to the √ P RF values indicated by yellow arrows (and letters) in panel a.The red dots are the data points while the blue solid lines are fits to Eq. 1. Dependence of the fitted modulation amplitude (f) and the linewidth (g) on the normalized acoustic amplitude.The vertical dashed green line in (f) and (g) indicates the acoustic amplitude for which the first sidebands appear.

Fig. SM 10: Hybrid trap confining phonons and polaritons.The trap is assumed to have a rectangular shape with lateral dimensions a x = a + ∆a/2 and a y = a − ∆a/2, respectively, with ∆a ≪ a.

Fig. SM 11: Calculated average energy ((ℏω TE + ℏω TM )/2, left vertical scale) as well as the splitting δ TE−TM = ℏ(ω TE − ω TM ) between transverse electric (TE) and transverse magnetic (TM) modes of the optical microcavity of Table I as a function of the angle of incidence θ L (relative to the z axis) in the x − z plane.The dashed vertical line marks θ L corresponding to an in-plane wave vector ⃗ k = (k x , k y ) = (π/a, 0) for the (p x , p y ) = (1, 1) polariton mode yielding a splitting of 9.5 µeV, where a = 4 µm is the trap size.
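The angle of incidence that corresponds to the in-plane wave vector $k_x = \pi/a$ can be estimated from the conservation of the in-plane momentum, $k_x = (E/\hbar c)\sin\theta_L$. The sketch below is a minimal illustration assuming that $\theta_L$ is measured in air and that the photon energy is about 1.53 eV (a hypothetical round value near the polariton energies discussed here); the function name is likewise hypothetical.

```python
import math

HBAR_C_EV_NM = 197.3269804  # hbar * c in eV nm


def incidence_angle_deg(trap_size_um: float, photon_energy_ev: float) -> float:
    """External angle (degrees) whose in-plane wave vector equals pi / a.

    ASSUMPTIONS: the angle is measured in air (refractive index 1) and
    photon_energy_ev is an illustrative value, not a fitted parameter.
    """
    k_parallel = math.pi / (trap_size_um * 1e3)   # pi / a in nm^-1
    k_vacuum = photon_energy_ev / HBAR_C_EV_NM    # E / (hbar c) in nm^-1
    return math.degrees(math.asin(k_parallel / k_vacuum))


if __name__ == "__main__":
    theta = incidence_angle_deg(trap_size_um=4.0, photon_energy_ev=1.53)
    print(f"theta_L = {theta:.2f} degrees")
```

For a = 4 µm this gives an angle of a few degrees, which is the regime in which the TE-TM splitting is in the µeV range.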

Fig. SM 12: Calculated amplitude of (a) the strain field component $u^\mathrm{ZPM}_{zz}$ and of (b) the z-oriented displacement $u^\mathrm{ZPM}_z$ (s) * (pxpy) are mediated by its excitonic component with (orthonormalized) wave function

Fig. SM 13: Coupling between the polariton levels (p x p y ) and (p ′ x , p ′ y ) mediated by a (m x m y ) phonon.Intra-level coupling induces the energy modulation of the pseudo-spin levels (quantified by the coupling strength g (pxpy) 0,(mxmy) and g (p ′x p ′ y ) , which we will assume to have the same spatial dependence as the polariton wave function Ψ (s) * (pxpy) in Eq. 2.
(mxmy) = e c xx , e c yy , e c zz , e c yz , e c xz , e c xy in the cartesian reference frame (superscript c ).We use the representation in the cartesian frame to obtain a more familiar representation of the Pikus and Bir Hamiltonian in Sec.V C 3.
= (e c xx , e c xx , e c zz , e c yz , e c xz , e c xy ) T in Eq. 15 and displayed for the (m x m y ) = (1, 1) phonon mode in Table V. a h , b, and d are, respectively, the hydrostatic, uniaxial, and shear deformation potentials for the GaAs QWs listed in Table R .They justify our previous assertion that the uniaxial strain components defining the diagonal element δE hh modulate the energy of the polariton states while the shear component couple their pseudospin counterparts Ψ

These strong coherent amplitudes of the polaritonic fields are set by the driving fields. Similarly, we write the mechanical mode operator as $\hat b = \alpha_b + \delta b$, where $\delta b$ is the mechanical fluctuation around the strong coherent amplitude of the mechanical field, and take $\alpha_b = \sqrt{n_b}$ real. Indeed, for this detuning condition the system naturally tends to the large-$n_b$ regime even in the absence of phonon driving. This can be thought of as a result of polariton-induced parametric driving of the phonon field, resembling single-mode squeezing that is limited by the nonlinearity. The amplitude of the mechanical RF driving, $\beta_m$, provides a knob to further increase $n_b$. Neglecting higher-order terms and keeping the dominant interaction term for the case of strong mechanical driving, assuming $\sqrt{n_b} > \sqrt{n_e}$, we arrive at the following beam-splitter Hamiltonian:

$$\hat H_{m,e} = \hbar\Delta_m \hat b^\dagger \hat b + \hbar\Delta_e\, \delta a_e^\dagger \delta a_e + \hbar\, 2G_2\sqrt{n_b n_g}\left(\delta a_e^\dagger \delta b + \delta b^\dagger \delta a_e\right), \qquad (45)$$

which describes the coherent energy exchange between the fluctuation of the mechanical field and the fluctuation of the polaritonic excited state. Introducing the linewidth of the excited state $\kappa_e$ and the phonon linewidth $\Gamma_m$, the equations of motion are

$$\frac{d}{dt}\begin{pmatrix}\langle\delta a_e\rangle \\ \langle\delta b\rangle\end{pmatrix} = -i\begin{pmatrix}\Delta_e - i\frac{\kappa_e}{2} & 2\sqrt{n_b n_g}\,G_2 \\ 2\sqrt{n_b n_g}\,G_2 & \Delta_m - i\frac{\Gamma_m}{2}\end{pmatrix}\cdot\begin{pmatrix}\langle\delta a_e\rangle \\ \langle\delta b\rangle\end{pmatrix}. \qquad (46)$$

Defining $g = 2\sqrt{n_b n_g}\,G_2$, $\delta = \Delta_e - \Delta_m$, and $\gamma = \Delta_e + \Delta_m$, the eigenfrequencies become

$$\omega_\pm = \frac{\gamma}{2} - i\frac{\kappa_e + \Gamma_m}{4} \pm \sqrt{g^2 + \frac{1}{4}\left(\delta + \frac{i(\Gamma_m - \kappa_e)}{2}\right)^2}.$$
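The eigenfrequencies of the linearized two-mode problem can be checked numerically. The sketch below is a minimal illustration, assuming a dynamical matrix with diagonal elements $\Delta_e - i\kappa_e/2$ and $\Delta_m - i\Gamma_m/2$ and off-diagonal elements $g = 2\sqrt{n_b n_g}\,G_2$; it evaluates the closed-form roots and verifies that they satisfy the characteristic equation of this 2 × 2 matrix. Function names and parameter values are hypothetical.

```python
import cmath


def eigenfrequencies(delta_e, delta_m, kappa_e, gamma_m, g):
    """Closed-form eigenvalues of [[delta_e - i*kappa_e/2, g],
                                   [g, delta_m - i*gamma_m/2]]."""
    center = (delta_e + delta_m) / 2.0 - 1j * (kappa_e + gamma_m) / 4.0
    half_diff = ((delta_e - delta_m) + 1j * (gamma_m - kappa_e) / 2.0) / 2.0
    root = cmath.sqrt(g ** 2 + half_diff ** 2)
    return center + root, center - root


def characteristic_polynomial(omega, delta_e, delta_m, kappa_e, gamma_m, g):
    """det(M - omega * 1) for the same matrix; zero at an eigenvalue."""
    m11 = delta_e - 1j * kappa_e / 2.0
    m22 = delta_m - 1j * gamma_m / 2.0
    return (m11 - omega) * (m22 - omega) - g ** 2


if __name__ == "__main__":
    params = dict(delta_e=0.3, delta_m=-0.1, kappa_e=1.0, gamma_m=0.01, g=0.7)
    for omega in eigenfrequencies(**params):
        residual = abs(characteristic_polynomial(omega, **params))
        print(f"omega = {omega:.4f}, residual = {residual:.1e}")
```

At resonance and for a negligible phonon linewidth, the same function reduces to roots with imaginary part $-\kappa_e/4$, in line with the linewidth halving discussed for the strong coupling regime.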

TABLE I: Layer structure of the microcavity samples. The DBRs consist of stacks of Al$_{x_1}$Ga$_{1-x_1}$As and Al$_{x_2}$Ga$_{1-x_2}$As layers with thicknesses $d_1$ and $d_2$ and Al compositions $x_1$ and $x_2$, respectively. $d_T$ and $n_\mathrm{rep}$ are, respectively, the total thickness and the number of periods.

TABLE II: Parameters determined from fits of the spatial dispersion of the polaritons in extended etched (ER) and non-etched (nER) cavity regions.

TABLE III: Parameters used in the calculations.