Retrodiction beyond the Heisenberg uncertainty relation

In quantum mechanics, the Heisenberg uncertainty relation presents an ultimate limit to the precision by which one can predict the outcome of position and momentum measurements on a particle. Heisenberg explicitly stated this relation for the prediction of “hypothetical future measurements”, and it does not describe the situation where knowledge is available about the system both earlier and later than the time of the measurement. Here, we study what happens under such circumstances with an atomic ensemble containing $10^{11}$ rubidium atoms, initiated nearly in the ground state in the presence of a magnetic field. The collective spin observables of the atoms are then well described by canonical position and momentum observables, $\hat{x}_{\text{A}}$ and $\hat{p}_{\text{A}}$, that satisfy $[\hat{x}_{\text{A}},\hat{p}_{\text{A}}]=i\hbar$.
Quantum non-demolition measurements of $\hat{p}_{\text{A}}$ before and of $\hat{x}_{\text{A}}$ after time t allow precise estimates of both observables at time t. By means of the past quantum state formalism, we demonstrate that outcomes of measurements of both the $\hat{x}_{\text{A}}$ and $\hat{p}_{\text{A}}$ observables can be inferred with errors below the standard quantum limit. The capability of assigning precise values to multiple observables and to observe their variation during physical processes may have implications in quantum state estimation and sensing.

In the limit of large laser detuning from the optical transition, one can disregard absorption in the interaction of light with the atomic D2 transition from the $5S_{1/2}$, $F = 2$ ground state to the $5P_{3/2}$ excited state and employ the effective interaction Hamiltonian of Ref. [1], where $\lambda = 780$ nm and $\Gamma = 2\pi \times 6.067$ MHz are the wavelength of the probe light and the full width at half maximum (FWHM) linewidth of the excited state, respectively, $n_{\text{A}}$ is the atomic density, $A$ is the cross section of the laser beam and $\Phi$ is the photon flux. The first, scalar term can be treated as a DC Stark shift, which shifts all atomic energy levels equally and is proportional to the photon flux or light intensity. The second, vector term provides the desired Faraday rotation, where the atomic spin $\mathbf{J}$ is rotated around the $z$-axis by an amount proportional to $\hat{S}_z$. Likewise, the Stokes vector $\mathbf{S}$ is rotated around the $z$-axis by an amount proportional to $\hat{J}_z$. The last, tensor term gives rise to a complicated dynamical Stark shift, which is useful in some quantum protocols and vanishes for large detunings with respect to the hyperfine splitting of the excited state, as the coefficient $a_2$ goes to zero. For the D2 transition starting from the hyperfine ground state $F = 2$, the detuning-dependent vector and tensor polarizabilities $a_1$ and $a_2$ are given in Refs. [1,2], where $\Delta_{13} = 2\pi \times 423.60$ MHz and $\Delta_{23} = 2\pi \times 266.65$ MHz are the hyperfine splittings in the $^{87}$Rb excited state $5P_{3/2}$, while $\Delta$ is the laser detuning with respect to $5P_{3/2}$, $F = 3$. Given $\Delta = -2\pi \times 2.09$ GHz (the negative sign represents blue detuning) in our experiment, $a_2/a_1 = 0.0094 \ll 1$. Thus, the tensor interaction can be ignored in our experiment, and the dispersive interaction of light with the atomic D2 transition can be approximated by the quantum non-demolition (QND)-type Faraday interaction Hamiltonian, with $\alpha = -\frac{\Gamma}{8A\Delta}\frac{\lambda^2}{2\pi}a_1$.
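To describe the interaction between light and atoms in the experiment, the quantum state of light is characterized by the Stokes operators $\hat{S}_x$, $\hat{S}_y$ and $\hat{S}_z$, which are given by the differences of the number operators of the photons polarized in different orthogonal bases. For light propagating in the $z$-direction, we have
$$\hat{S}_x = \tfrac{1}{2}\left(\hat{a}_x^\dagger\hat{a}_x - \hat{a}_y^\dagger\hat{a}_y\right),\quad \hat{S}_y = \tfrac{1}{2}\left(\hat{a}_{+45^\circ}^\dagger\hat{a}_{+45^\circ} - \hat{a}_{-45^\circ}^\dagger\hat{a}_{-45^\circ}\right),\quad \hat{S}_z = \tfrac{1}{2}\left(\hat{a}_{\sigma_+}^\dagger\hat{a}_{\sigma_+} - \hat{a}_{\sigma_-}^\dagger\hat{a}_{\sigma_-}\right),$$
with the creation and annihilation operators $\hat{a}^\dagger_{x,y,\pm45^\circ,\sigma_\pm}$, $\hat{a}_{x,y,\pm45^\circ,\sigma_\pm}$ in $x$, $y$, $\pm45^\circ$ and $\sigma_\pm$ polarization. The Stokes vector satisfies the angular momentum commutator relation $[\hat{S}_y,\hat{S}_z] = i\hat{S}_x$, which implies the Heisenberg uncertainty relation $\mathrm{Var}(\hat{S}_y)\cdot\mathrm{Var}(\hat{S}_z) \geq \langle\hat{S}_x\rangle^2/4$. In the experiment, $y$-linearly polarized coherent light pulses interact with the atomic ensemble, which means that $S_x$ can be treated as a large classical value proportional to the photon flux $\Phi = P/(\hbar\omega)$, where $P$ represents the optical power and $\hbar\omega$ is the energy of a single photon. The quantum variables $\hat{S}_y$ and $\hat{S}_z$ are the physical variables of interest. The coupling constant in the main text is defined as $\kappa^2 = \frac{1}{2}\alpha^2 J_x \Phi\tau$. Here $\tau$ is the probe pulse duration.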
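As a rough numerical illustration of the relations $\Phi = P/(\hbar\omega)$ and $\kappa^2 = \frac{1}{2}\alpha^2 J_x\Phi\tau$, the following minimal sketch evaluates the photon flux for 780 nm light. The probe power of 1 mW is a hypothetical value chosen only for illustration, and the function names are not taken from the source.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J*s
C = 299_792_458.0       # speed of light in m/s


def photon_flux(power_w, wavelength_m):
    """Photon flux Phi = P / (hbar * omega), in photons per second."""
    omega = 2.0 * math.pi * C / wavelength_m  # angular optical frequency
    return power_w / (HBAR * omega)


def kappa_squared(alpha, j_x, flux, tau):
    """Coupling constant kappa^2 = (1/2) * alpha^2 * J_x * Phi * tau."""
    return 0.5 * alpha**2 * j_x * flux * tau


# Hypothetical probe power of 1 mW at the 780 nm probe wavelength.
flux = photon_flux(1e-3, 780e-9)
print(f"photon flux: {flux:.3e} photons/s")
```

Since $\kappa^2$ is linear in both $\Phi$ and $\tau$, doubling the pulse duration at fixed power doubles $\kappa^2$ in this idealized picture.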

Supplementary Note 2. Stroboscopic back-action-evading measurement
Given a large enough laser detuning, the relation $[\hat{H}_{\text{int}}, \hat{J}_z] = 0$ ensures the QND nature of the measurement with respect to the interaction Hamiltonian. Under the influence of a magnetic field along the $x$-axis, however, the Zeeman Hamiltonian $\Omega_{\text{L}}\hat{J}_x$ couples $\hat{J}_z$ and $\hat{J}_y$. To restore the QND property of the probing scheme in the presence of the Zeeman-induced precession of the spin at the Larmor frequency $\Omega_{\text{L}}$, we probe the ensemble with δ-pulses at the times $t = 0, \pi/\Omega_{\text{L}}, \ldots, n\pi/\Omega_{\text{L}}$, where the precession does not mix $\hat{J}_y$ and $\hat{J}_z$. In such an ideal setup, only the shot noise in the light affects the precision of the measurement of $\hat{J}_z$, while the conjugate quadrature $\hat{J}_y$ becomes anti-squeezed due to the measurement back action.
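The choice of probing times can be checked with a minimal sketch. It assumes the precession convention $\hat{J}_z(t) = \hat{J}_z\cos\Omega_{\text{L}}t + \hat{J}_y\sin\Omega_{\text{L}}t$ (the sign of the $\hat{J}_y$ term depends on the field direction and is an assumption here), and the Larmor frequency used is a hypothetical value.

```python
import math


def jz_mixing_coefficients(omega_l, t):
    """Return (c_z, c_y) such that J_z(t) = c_z * J_z(0) + c_y * J_y(0).

    Assumes precession about the x-axis at the Larmor frequency omega_l;
    the sign of c_y is a convention and does not affect the conclusion.
    """
    return math.cos(omega_l * t), math.sin(omega_l * t)


omega_l = 2.0 * math.pi * 500e3  # hypothetical Larmor frequency in rad/s

for n in range(4):
    t_n = n * math.pi / omega_l  # stroboscopic probing time
    c_z, c_y = jz_mixing_coefficients(omega_l, t_n)
    # At these times c_y vanishes (up to rounding) and c_z = (-1)^n.
    print(n, round(c_z, 12), round(c_y, 12))
```

At the probing times the measured observable is $\pm\hat{J}_z$ with no admixture of $\hat{J}_y$, which is why δ-pulses at $t = n\pi/\Omega_{\text{L}}$ preserve the QND character.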
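We introduce canonical operators $\hat{x}_{\text{L}}$, $\hat{p}_{\text{L}}$ for light and $\hat{x}_{\text{A}}$, $\hat{p}_{\text{A}}$ for the collective spin of the atoms. The input-output relations of the matter-light interaction are then given by
$$\hat{x}_{\text{L}}^{\text{out}} = \hat{x}_{\text{L}}^{\text{in}} + \kappa\hat{p}_{\text{A}}^{\text{in}},\quad \hat{p}_{\text{L}}^{\text{out}} = \hat{p}_{\text{L}}^{\text{in}},\quad \hat{x}_{\text{A}}^{\text{out}} = \hat{x}_{\text{A}}^{\text{in}} + \kappa\hat{p}_{\text{L}}^{\text{in}},\quad \hat{p}_{\text{A}}^{\text{out}} = \hat{p}_{\text{A}}^{\text{in}}.$$
According to the formula for the variance of a Gaussian distributed variable $\hat{p}_{\text{A}}$ conditioned on the measurement of another correlated Gaussian variable $\hat{x}_{\text{L}}$, we have
$$\mathrm{Var}(\hat{p}_{\text{A}}|\hat{x}_{\text{L}}) = \mathrm{Var}(\hat{p}_{\text{A}}) - \frac{\mathrm{Cov}[\hat{p}_{\text{A}},\hat{x}_{\text{L}}]^2}{\mathrm{Var}(\hat{x}_{\text{L}})},$$
where $\mathrm{Cov}[\hat{p}_{\text{A}},\hat{x}_{\text{L}}]$ represents the covariance of $\hat{x}_{\text{L}}$ and $\hat{p}_{\text{A}}$, and following Supplementary Equation (10) we can calculate the conditional variance of $\hat{p}_{\text{A}}$ as
$$\mathrm{Var}(\hat{p}_{\text{A}}|\hat{x}_{\text{L}}) = \frac{1}{2}\,\frac{1}{1+\kappa^2},$$
where we assume coherent initial states for the light and atoms. While this result fully accounts for the conditional spin squeezing by the measurement back action, we shall provide a theoretical description of the same result by means of the POVM formalism in the next section. That formalism is needed to account for the correlations between subsequent measurements on the same system.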
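The conditional-variance formula for correlated Gaussian variables can be evaluated directly. The sketch below assumes the light quadrature after the interaction is $\hat{x}_{\text{L}}^{\text{out}} = \hat{x}_{\text{L}}^{\text{in}} + \kappa\hat{p}_{\text{A}}$ with independent coherent inputs of variance 1/2; this is an assumption chosen to be consistent with the reduction factor $1+\kappa^2$ quoted in the text, and the function names are illustrative.

```python
def conditional_variance(var_p, var_x, cov_px):
    """Var(p | x) = Var(p) - Cov(p, x)^2 / Var(x) for jointly Gaussian p, x."""
    return var_p - cov_px**2 / var_x


def qnd_conditional_variance(kappa):
    """Conditional variance of p_A after measuring x_L^out = x_L^in + kappa * p_A.

    Assumes independent coherent inputs with Var(x_L^in) = Var(p_A) = 1/2.
    """
    var_p = 0.5
    var_x_out = 0.5 + kappa**2 * var_p  # shot noise plus atomic signal
    cov_px = kappa * var_p              # correlation created by the interaction
    return conditional_variance(var_p, var_x_out, cov_px)


for kappa in (0.0, 0.5, 1.0, 2.0):
    # Compare with the closed form 1 / (2 * (1 + kappa^2)).
    print(kappa, qnd_conditional_variance(kappa), 0.5 / (1.0 + kappa**2))
```

The two printed columns agree, showing that the general Gaussian formula reproduces the squeezing factor $1/(1+\kappa^2)$ under the stated assumptions.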
We note that the QND character of the stroboscopic probing scheme only applies for infinitely short pulses (duty factor D = 0). In our experiments, we use finite duration probe pulses which implies a (small) spin rotation during the probing. This reduces the effective coupling to the desired spin component and couples the squeezed and anti-squeezed components and thus reduces the effect of squeezing. We shall return to an effective analysis of these effects at the end of Supplementary Note 3.

Supplementary Note 3. Past quantum Gaussian state in Faraday interaction
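In terms of the canonical variables, the QND interaction that measures $\hat{p}_{\text{A}}$ is similar to a continuous variable control-Z gate [3] and can be written as $\hat{C}_z = e^{-i\kappa\hat{p}_{\text{A}}\otimes\hat{p}_{\text{L}}}$. The initial states of atoms and light, $|\Psi^{\text{in}}_{\text{A}}\rangle$ and $|\Psi^{\text{in}}_{\text{L}}\rangle$, can be expanded in the eigenstates of $\hat{p}_{\text{A}}$ and $\hat{x}_{\text{L}}$,
$$\hat{p}_{\text{A}}|a\rangle_{p_{\text{A}}} = a|a\rangle_{p_{\text{A}}}, \quad (14)$$
and correspondingly $\hat{x}_{\text{L}}|l\rangle_{x_{\text{L}}} = l|l\rangle_{x_{\text{L}}}$, with the wave functions $\psi_{p_{\text{A}}}(a)$ and $\psi_{x_{\text{L}}}(l)$. The coherent states of collective spin and light are described by Gaussian wave functions $\psi_{p_{\text{A}}}(a) \propto e^{-a^2/2}$ and $\psi_{x_{\text{L}}}(l) \propto e^{-l^2/2}$ with a variance of $\frac{1}{2}$. Applying the QND interaction to the initial state of atoms and light displaces the light quadrature by $\kappa a$, and we get
$$\hat{C}_z|\Psi^{\text{in}}_{\text{A}}\rangle|\Psi^{\text{in}}_{\text{L}}\rangle = \iint \psi_{p_{\text{A}}}(a)\,\psi_{x_{\text{L}}}(l)\,|a\rangle_{p_{\text{A}}}|l+\kappa a\rangle_{x_{\text{L}}}\,da\,dl.$$
After the interaction we measure the quadrature of light in the $\hat{x}_{\text{L}}$ basis with the outcome $m$. The resulting conditional state of the atomic variables is updated as
$$|\Psi_{\text{A}}\rangle \propto \int \exp\left[-\frac{a^2}{2} - \frac{(m-\kappa a)^2}{2}\right]|a\rangle_{p_{\text{A}}}\,da.$$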
The mean value is displaced from 0 to $\frac{\kappa m}{1+\kappa^2}$, and as we saw in Supplementary Equation (12), the atomic variance of $\hat{p}_{\text{A}}$ is reduced by a factor of $1+\kappa^2$.
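The displaced mean and reduced variance can be cross-checked numerically. The sketch below assumes that the conditional probability density of $\hat{p}_{\text{A}}$ is the product of the coherent-state density $e^{-a^2}$ (variance 1/2) and the measurement likelihood $e^{-(m-\kappa a)^2}$, which is the squared modulus of a Gaussian conditional wave function; the grid parameters are arbitrary choices.

```python
import math


def conditional_moments(kappa, m, half_width=12.0, n_points=24001):
    """Mean and variance of p_A given light outcome m, by summation on a grid.

    Assumed density: exp(-a^2) * exp(-(m - kappa*a)^2), i.e. a coherent
    initial state (variance 1/2) updated by a Gaussian QND measurement.
    """
    step = 2.0 * half_width / (n_points - 1)
    norm = first = second = 0.0
    for i in range(n_points):
        a = -half_width + i * step
        weight = math.exp(-a * a - (m - kappa * a) ** 2)
        norm += weight
        first += weight * a
        second += weight * a * a
    mean = first / norm
    return mean, second / norm - mean**2


kappa, m = 1.2, 0.7  # illustrative coupling strength and outcome
mean, var = conditional_moments(kappa, m)
print(mean, kappa * m / (1.0 + kappa**2))  # numerical vs. closed-form mean
print(var, 0.5 / (1.0 + kappa**2))         # numerical vs. closed-form variance
```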

Measurement operators
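In the positive-operator valued measure (POVM) formalism, the initial atomic state is represented by the density matrix $\rho_0$, which can describe both pure and mixed states. The QND interaction acts on the whole system-meter state ("system" = atoms, "meter" = light), with the total density matrix $\rho_0 \otimes |\Psi^{\text{in}}_{\text{L}}\rangle\langle\Psi^{\text{in}}_{\text{L}}|$. By inserting the identity operator of the light, $\hat{I}_m = \sum_{l\in M}|l\rangle\langle l|_{x_{\text{L}}}$, we obtain the atomic state in terms of the system operators $\hat{\Omega}_m = (\hat{I}_{\text{A}} \otimes \langle m|_{x_{\text{L}}})\,\hat{C}_z\,(\hat{I}_{\text{A}} \otimes |\Psi^{\text{in}}_{\text{L}}\rangle)$. To calculate the effect of $\hat{\Omega}_m$ on the atomic system, we insert $\int |a\rangle\langle a|_{p_{\text{A}}}\,da$ for the second occurrence of $\hat{I}_{\text{A}}$,
$$\hat{\Omega}_m \propto \int \exp\left[-\frac{(m-\kappa a)^2}{2}\right]|a\rangle\langle a|_{p_{\text{A}}}\,da.$$
Similarly, the operator for measuring the rotated quadrature observable $\hat{x}_{\text{A}}(\theta) = \hat{p}_{\text{A}}\cos\theta + \hat{x}_{\text{A}}\sin\theta$ is
$$\hat{\Omega}_m \propto \int \exp\left[-\frac{(m-\kappa a)^2}{2}\right]|a,\theta\rangle\langle a,\theta|\,da,$$
where $|a,\theta\rangle$ is the eigenstate of $\hat{x}_{\text{A}}(\theta)$ with eigenvalue $a$. In the limit of large $\kappa$, $\hat{\Omega}_m$ becomes a (strong) projective measurement $\hat{\Omega}_a = |a,\theta\rangle\langle a,\theta|$, with $a = m/\kappa$.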

Retrodiction beyond the Heisenberg uncertainty relation
We consider two types of measurements: a hypothetical projective measurement $\hat{\Omega}_a$, with outcome $a$ for the observable $\hat{x}_{\text{A}}(\theta)$, and the actually performed optical measurement $\hat{\Omega}_{m_2}$, with outcome $m_2$ for the observable $\hat{x}_{\text{L}}$.
The outcomes of these measurements can be predicted by the density matrix conditioned on the first measurement outcome $m_1$, and they can be retrodicted by incorporating also the outcomes of later optical measurements of $\hat{x}_{\text{A}}$ and $\hat{p}_{\text{A}}$ with outcomes $m_3$ and $m_4$.
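Conditioned on the first measurement outcome $m_1$, i.e. by using the prediction only, if the 2nd pulse is a hypothetical projective measurement $\hat{\Omega}_a$, we have
$$\Pr(a|m_1) \propto \mathrm{Tr}\left(\hat{\Omega}_a\rho\,\hat{\Omega}_a^\dagger\right),$$
where $\rho = \hat{\Omega}_{m_1}\rho_0\hat{\Omega}_{m_1}^\dagger$ is the density matrix after the 1st pulse measurement. If the 2nd measurement is a POVM $\hat{\Omega}_{m_2}$, we have
$$\Pr(m_2|m_1) \propto \mathrm{Tr}\left(\hat{\Omega}_{m_2}\rho\,\hat{\Omega}_{m_2}^\dagger\right).$$
The joint probability for all four outcomes $\{m_1, a, m_3, m_4\}$ with an atomic projective measurement or $\{m_1, m_2, m_3, m_4\}$ with only optical measurements can be described as
$$\Pr(m_1, m, m_3, m_4) \propto \mathrm{Tr}\left(\hat{\Omega}_{m_4}\hat{\Omega}_{m_3}\hat{\Omega}_{m}\hat{\Omega}_{m_1}\rho_0\,\hat{\Omega}_{m_1}^\dagger\hat{\Omega}_{m}^\dagger\hat{\Omega}_{m_3}^\dagger\hat{\Omega}_{m_4}^\dagger\right).$$
For a projective measurement $\hat{\Omega}_m = \hat{\Omega}_a$, the atomic state is projected onto an atomic spin eigenstate $|a,\theta\rangle$, and the measurement result is the corresponding eigenvalue $a$. For an optical measurement $\hat{\Omega}_m = \hat{\Omega}_{m_2}$, the measurement result is $m_2$, containing information about the light and atomic spin.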
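As a result of both the prediction and retrodiction, i.e. conditioned on the measurement outcomes $m_1$, $m_3$ and $m_4$, the probability for the outcome $a$ is thus given as
$$\Pr(a|m_1,m_3,m_4) = \frac{\mathrm{Tr}\left(\hat{\Omega}_a\rho\,\hat{\Omega}_a^\dagger E\right)}{\int \mathrm{Tr}\left(\hat{\Omega}_{a'}\rho\,\hat{\Omega}_{a'}^\dagger E\right)da'} \propto \langle a,\theta|\rho|a,\theta\rangle\langle a,\theta|E|a,\theta\rangle,$$
where the effect matrix $E$ conditioned on the 3rd and 4th measurement equals $E = \hat{\Omega}_{m_3}^\dagger\hat{\Omega}_{m_4}^\dagger\hat{\Omega}_{m_4}\hat{\Omega}_{m_3}$, which can be evaluated using the expression for the POVM operators. In order to obtain the elements $\langle a,\theta|\rho|a',\theta\rangle$ and $\langle a',\theta|E|a,\theta\rangle$ of $\rho$ and $E$, we introduce the corresponding Wigner functions, cf. the definition
$$W_\rho(q,p) = \frac{1}{\pi}\int \left\langle q+r,\tfrac{\pi}{2}\right|\rho\left|q-r,\tfrac{\pi}{2}\right\rangle e^{2ipr}\,dr,$$
and similarly for $W_E(q,p)$. The Wigner functions of $\rho$ and $E$ are Gaussian, while, e.g., $\langle p+r,0|\rho|p-r,0\rangle$ is the inverse Fourier transformation of $W_\rho(q,p)$. We introduce the rotated quadratures $v = p\cos\theta + q\sin\theta$ and $u = q\cos\theta - p\sin\theta$, where $p$, $q$ are eigenvalues of $\hat{p}_{\text{A}}$, $\hat{x}_{\text{A}}$, respectively.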
Specifically, the diagonal elements are $\langle v,\theta|\rho|v,\theta\rangle = \int W_\rho(q,p)\,du$ and, correspondingly, $\langle v,\theta|E|v,\theta\rangle = \int W_E(q,p)\,du$. $\Pr(a|m_1)$, $\Pr(m_2|m_1)$, $\Pr(a|m_1,m_3,m_4)$ and $\Pr(m_2|m_1,m_3,m_4)$ are all Gaussian distributions determined by their respective mean values and variances. By defining the variances of $\langle a,\theta|\rho|a,\theta\rangle$ and $\langle a,\theta|E|a,\theta\rangle$ as $\sigma^2_\rho$ and $\sigma^2_E$, the associated variance of $\Pr(a|m_1)$ is $\sigma^2_\rho$. The variance of $\Pr(m_2|m_1)$ is found to obey the relation
$$\mathrm{Var}(m_2|m_1) = \frac{1}{2} + \kappa_2^2\sigma^2_\rho,$$
where the first term $\frac{1}{2}$ comes from optical shot noise. The variance of $\Pr(a|m_1,m_3,m_4)$, being that of a product of two Gaussians, is
$$\sigma^2_{\rho E} = \left(\frac{1}{\sigma^2_\rho} + \frac{1}{\sigma^2_E}\right)^{-1}.$$
We show the polar plot of $\sigma^2_{\rho E}$ for different coupling strengths $\kappa_1$ and $\kappa_3$ in Supplementary Figure 1. Interestingly, while our initial argument was that prior and posterior measurements of two conjugate observables would in principle yield a better estimate of both values than indicated by the Heisenberg uncertainty relation, no particular consequences were expected for quadratures along other directions. Rather than a squeezing ellipse, the retrodicted distribution shown in Supplementary Figure 1 suggests the form of a "squeezing butterfly" [4].
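To make the angular dependence of the retrodicted variance concrete, the following sketch combines a predicted and a retrodicted Gaussian by adding inverse variances. The explicit expressions for the two input variances are assumptions worked out here for ideal δ-pulse QND probing of a coherent state (variance 1/2, back-action noise $\kappa^2/2$ on the conjugate quadrature); they are not quoted from the source, and all function names are illustrative.

```python
import math


def sigma2_rho(theta, k1):
    """Assumed predicted variance of x_A(theta) = p_A cos(theta) + x_A sin(theta).

    After a p_A measurement of strength k1: Var(p) = 1/(2(1+k1^2)) and
    Var(x) = (1+k1^2)/2 due to back action (ideal model, an assumption).
    """
    return (math.cos(theta) ** 2 / (2.0 * (1.0 + k1**2))
            + math.sin(theta) ** 2 * (1.0 + k1**2) / 2.0)


def sigma2_effect(theta, k3, k4):
    """Assumed retrodicted variance from an x_A probe (k3 > 0), then a p_A probe (k4 > 0)."""
    var_x = 1.0 / (2.0 * k3**2)
    var_p = 1.0 / (2.0 * k4**2) + k3**2 / 2.0  # includes back action of the x_A probe
    return math.cos(theta) ** 2 * var_p + math.sin(theta) ** 2 * var_x


def sigma2_rho_effect(theta, k1, k3, k4):
    """Variance of the product of the two Gaussians: inverse variances add."""
    return 1.0 / (1.0 / sigma2_rho(theta, k1) + 1.0 / sigma2_effect(theta, k3, k4))


k1 = k3 = k4 = 1.0  # illustrative coupling strengths
for degrees in (0, 30, 45, 60, 90):
    theta = math.radians(degrees)
    print(degrees, sigma2_rho_effect(theta, k1, k3, k4))
```

In this idealized model the product of the variances at $\theta = 0$ and $\theta = \pi/2$ falls below the coherent-state value of 1/4, while intermediate angles are less strongly reduced, giving the butterfly-like shape rather than an ellipse.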
We can also obtain an analytical expression for the variance of the optical measurement outcomes around the retrodicted expectation value, $\mathrm{Var}(m_2|m_1,m_3,m_4)$. This expression is more complex since it involves the off-diagonal elements of $\rho$ and $E$ in Supplementary Equation (28). The distribution patterns of $\sigma^2_{\rho E}$ around $\theta = \frac{\pi}{2}$ are shown in Supplementary Figure 3 for different values of $\kappa_4$. To observe the variance reduction induced by retrodiction in a projective measurement, the atomic quadrature angle has to be controlled with high accuracy. For the retrodiction of the optical measurement, however, the probing of $\hat{p}_{\text{A}}$ with strength $\kappa_4$ does not have a similarly important effect. This is shown in Supplementary Figure 4, for $\kappa_2^2 = 0.81$, where the variance shows a reduction in a broader range around $\theta = \frac{\pi}{2}$ even if $\kappa_4 = 0$. Supplementary Figure 4 shows that when we increase the value of $\kappa_4$ from 0, an improvement is observed. However, the improvement saturates around $\kappa_4 \approx 1.5$. This agrees qualitatively with the experimental results.

Supplementary Figure caption (partially recovered): rescaled optical variance for different values of $\kappa_2$. For $\theta = 0, \frac{\pi}{2}$, the rescaling makes the result independent of $\kappa_2$. In the limit of $\kappa_2 \to \infty$, $\hat{\Omega}_{m_2}$ approaches the projective measurement $\hat{\Omega}_a$, and the rescaled optical variance equals the theoretical value for the atomic variance, as shown by the fat dashed lines.

Supplementary Note 5. Effects of off-resonant atomic excitation and decay
So far in our analysis we have ignored effects of dissipation, in particular the spontaneous emission by the off-resonant excitation of the atomic excited states. These processes have three consequences: (i) they break the permutation symmetry of the ensemble which is equivalent to a reduction of the effective mean spin projection and hence a reduced collective coupling to the light, (ii) they break the correlation between the given atom and the rest of the ensemble, which is ultimately the cause of the squeezing of the collective spin, and (iii) the atoms decay at random into one of the spin states, contributing a random fluctuating term in the collective spin [5,6].
These processes occur with a probability $\eta_\tau = h\kappa^2/d$, where $h$ is a numerical factor of order unity that depends on the probe polarization and the light-atom detuning, and $d = n_{\text{A}}\sigma_0 L$ is the optical depth on resonance given by the atomic density $n_{\text{A}}$, the absorption cross section on resonance $\sigma_0$, and the length of the cell $L$. For a finite duty factor in the stroboscopic detection, the coupling strength acquires an effective value $\tilde{\kappa}$, discussed in the next section.
During the first pulse there is hence a competition between the squeezing due to probing and the increasing spin fluctuations due to decay, explaining the existence of an optimal probing time $\tau_1$. While decay during the later measurement of duration $\tau_3$ does not alter the previous state, it causes a reduction of the strength $\kappa_4$ of the last measurement, which impacts the retrodiction according to Supplementary Equation (44) and may be the cause of the slow increase of the uncertainty product as a function of $\tau_3$ in Fig. 3(b) of the main text.

Supplementary Note 6. Effect of finite probe pulse duration
So far, we described the ideal stroboscopic detection with δ-function pulses of an effective QND spin observable, interrogated at twice the spin oscillator frequency. In the experiments, however, we apply finite duration pulses, described by a stroboscopic function $u(t)$ with a duty factor $D$, where $k \in \mathbb{N}$ indexes the pulses and $T = 2\pi/\Omega_{\text{L}}$ is the Larmor period. The Stokes component $\hat{S}^{\text{out}}_y$ of the linearly polarized probe beam is detected and its Fourier component at $\cos(\Omega_{\text{L}}t)$ is measured by a lock-in amplifier. This effectively probes the atomic variable $\hat{X}$. After a pulse train of duration $\tau = NT$ with mean photon flux $\bar{\Phi}$, the integrated optical signal is characterized by a Stokes vector whose measurement variance $\mathrm{Var}(\hat{S}^{\text{out}}_{y,\tau})$ is given in Refs. [1,7,8]. Here we have introduced the effective coupling constant $\tilde{\kappa}^2 = \frac{1}{4}\alpha^2 J_x\bar{\Phi}\tau\,[1 + \mathrm{Sinc}(\pi D)]$, which becomes equal to $\kappa^2$ in the limit of vanishing duty factor $D$.
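The dependence of the effective coupling on the duty factor can be tabulated with a short sketch. It assumes the unnormalized convention $\mathrm{Sinc}(x) = \sin(x)/x$, which is the convention for which $\tilde{\kappa}^2 \to \kappa^2 = \frac{1}{2}\alpha^2 J_x\Phi\tau$ as $D \to 0$; the function names are illustrative.

```python
import math


def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the limiting value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def effective_coupling_ratio(duty_factor):
    """Ratio kappa_tilde^2 / kappa^2 = [1 + Sinc(pi * D)] / 2.

    Follows from kappa_tilde^2 = (1/4) alpha^2 J_x Phi tau [1 + Sinc(pi D)]
    and kappa^2 = (1/2) alpha^2 J_x Phi tau at equal mean flux and duration.
    """
    return 0.5 * (1.0 + sinc(math.pi * duty_factor))


for duty in (0.0, 0.1, 0.25, 0.5, 1.0):
    print(duty, effective_coupling_ratio(duty))
```

The ratio decreases monotonically from 1 for δ-pulses to 1/2 for continuous probing ($D = 1$), at fixed mean photon flux and total duration.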
Disregarding the spin rotation during each pulse, we can evaluate the integrated effect of the stroboscopically applied probing. Let us now turn to estimating the effect of the spin rotation during each short probing pulse, where we expect a reduced effective coupling strength due to the shorter average projection during angular rotation.
Assume that during each optical pulse the spin rotates between the directions $-\phi$ and $\phi$, where 0 denotes the direction corresponding to the observable $\hat{p}_{\text{A}}$. The duty factor thus reads $D = 2\phi/\pi$. Let us divide each single pulse into $2N+1$ interactions of infinitesimal duration (and hence of QND type) in different directions, $\phi_n = \frac{n}{N}\phi$, where $n = -N, -N+1, \cdots, N-1, N$. The coupling strength is evenly distributed over the interactions as $K = \frac{\kappa}{2N+1}$. The evolution operator, coupling to slightly rotated atomic observables in each subinterval, thus becomes
$$\hat{U}_n = e^{-iK(\hat{p}_{\text{A}}\cos\phi_n + \hat{x}_{\text{A}}\sin\phi_n)\otimes\hat{p}_{\text{L}}}.$$
Factorizing this exponential into separate $\hat{p}_{\text{A}}$ and $\hat{x}_{\text{A}}$ couplings produces an additional factor with an exponent proportional to $\frac{iK^2}{2}\sin\phi_n\cos\phi_n$; the factor arising from $\hat{U}_{-n}$ cancels a similar term in $\hat{U}_n$. In the limit of $N \to +\infty$, the total evolution operator $\hat{U}$ can be expressed in terms of $t_1 = \frac{\sin\phi}{\phi}$ and $t_2 = \frac{-2\phi+\sin 2\phi}{8\phi^2}$, and we note that when $\phi \to 0$, $t_1 \to 1$ and $t_2 \to 0$. $\hat{U}$ then simplifies to the $\hat{C}_z$-QND interaction described above, while a finite $\phi$ yields an effectively reduced coupling strength (by $\sin\phi/\phi$) and a quadratic (squeezing) effect on the optical probe, which will also alter the measurement sensitivity.
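The reduction factor $t_1 = \sin\phi/\phi$ can be illustrated by averaging the projection $\cos\phi_n$ over the $2N+1$ sub-interactions with equal weights $K/\kappa = 1/(2N+1)$. The sketch below is a numerical check under that reading of the discretization, with illustrative function names; it also evaluates $t_2$ as defined in the text for a given duty factor $D = 2\phi/\pi$.

```python
import math


def t1(phi):
    """Coupling reduction factor t1 = sin(phi) / phi."""
    return math.sin(phi) / phi


def t2(phi):
    """Quadratic-term coefficient t2 = (-2*phi + sin(2*phi)) / (8 * phi^2)."""
    return (-2.0 * phi + math.sin(2.0 * phi)) / (8.0 * phi**2)


def discrete_t1(phi, n_steps):
    """Average of cos(phi_n) over phi_n = n*phi/N, n = -N..N, with equal weights."""
    total = 2 * n_steps + 1
    return sum(math.cos(n * phi / n_steps)
               for n in range(-n_steps, n_steps + 1)) / total


duty_factor = 0.1                  # illustrative value
phi = math.pi * duty_factor / 2.0  # from D = 2*phi/pi
print(t1(phi), discrete_t1(phi, 1000))  # continuum limit vs. discrete average
print(t2(phi))                          # small and negative for small phi
```

For small duty factors $t_1$ stays close to 1 and $t_2$ close to 0, so the finite-pulse corrections are modest.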
Taking into account the modification of the detection scheme due to the finite pulse duration, we obtain new measurement operators $\hat{\Omega}_m$. By defining $G(m,\kappa,a) = \exp\left[\frac{(t_1\kappa a - m)^2}{4it_2\kappa^2 - 2}\right]$, the measurement operator with outcome $m$ in the basis of the eigenstates $|a,\theta\rangle$ of $\hat{x}_{\text{A}}(\theta) = \hat{p}_{\text{A}}\cos\theta + \hat{x}_{\text{A}}\sin\theta$ can be written as
$$\hat{\Omega}_m \propto \int G(m,\kappa,a)\,|a,\theta\rangle\langle a,\theta|\,da. \quad (55)$$
This generalizes the above expression, and will now be applied to the four successive pulses, where the first measures $\hat{p}_{\text{A}}$, the second measures $\hat{x}_{\text{A}}(\theta)$ to verify the variance of the past quantum state, the third measures $\hat{x}_{\text{A}}$ and the fourth measures $\hat{p}_{\text{A}}$.
The distribution in the $\theta$ direction can be derived from $W_\rho$ according to Supplementary Equation (36), which yields the conditional variance. This variance is shown in the polar plot Fig. 2A in the main text, clearly revealing the (anti)squeezing of the ($\hat{x}_{\text{A}}$) $\hat{p}_{\text{A}}$-quadrature. The effect matrix including the third and the fourth measurement is $E = \hat{\Omega}_{m_3}^\dagger\hat{\Omega}_{m_4}^\dagger\hat{\Omega}_{m_4}\hat{\Omega}_{m_3}$. Similarly, the associated Wigner function of this effect matrix is
$$W_E(q,p) = \frac{1}{\pi}\int\left\langle q+r,\tfrac{\pi}{2}\right|E\left|q-r,\tfrac{\pi}{2}\right\rangle e^{2ipr}\,dr \propto \int G^*(m_3,\kappa_3,q+r)\,G(m_3,\kappa_3,q-r)\left[\int |G(m_4,\kappa_4,a_4)|^2 e^{2ia_4 r}\,da_4\right]e^{2ipr}\,dr. \quad (60)$$
The conditional variance from the effect matrix follows according to Supplementary Equation (37). The past quantum state probability distribution of the atomic quadrature $\hat{x}_{\text{A}}(\theta)$, conditioned on the first, third and fourth pulses, is given by $P(a|m_1,m_3,m_4) \propto \langle a,\theta|\rho|a,\theta\rangle\langle a,\theta|E|a,\theta\rangle$, and its variance follows from the variances of the two Gaussian factors. The corresponding polar distribution is shown in Supplementary Figure 5 for different values of the duty factor.

Supplementary Note 7. Wineland criterion
To quantify how far the variance is below the SQL, we use the squeezing parameter $\xi_R^2$ given by the Wineland criterion [9,10], which is essentially the ratio of the minimal angular resolving power of the spin squeezed state to that of the CSS, and takes into account the shortening of the spin vector due to dissipation during squeezing:
$$\xi_R^2 = \left(\frac{\Delta\phi}{(\Delta\phi)_{\text{CSS}}}\right)^2 = \frac{\Delta(\hat{J}_{y,z})^2}{\Delta(\hat{J}_{y,z})^2_{\text{PN}}}\left(\frac{|\langle J_x\rangle|}{|\langle J_x(\tau_{\text{tot}})\rangle|}\right)^2,$$
where $\Delta\phi$ is the minimal angular resolution for the spin squeezed state and $(\Delta\phi)_{\text{CSS}}$ is the angular resolution for the CSS. Here $\Delta\phi = \Delta\hat{J}_{y,z}/|\langle J_x(\tau_{\text{tot}})\rangle|$ and $(\Delta\phi)_{\text{CSS}} = \Delta(\hat{J}_{y,z})_{\text{PN}}/|\langle J_x\rangle|$, where $|\langle J_x\rangle|$ is the average spin for a perfect CSS along the $x$-direction, and $|\langle J_x(\tau_{\text{tot}})\rangle|$ is the average spin of the prepared CSS after a total probing time of $\tau_{\text{tot}} = \tau_1 + \Delta\tau + \tau_2$ as shown in Fig. 1 of the main text, with $\tau_1$ the prediction pulse duration, $\tau_2$ the verification pulse duration, and $\Delta\tau$ the gap time in between. The ratio $|\langle J_x\rangle|/|\langle J_x(\tau_{\text{tot}})\rangle|$ is obtained from measurements of the population decay time $T_1$ with the probe on. $\Delta(\hat{J}_y)^2_{\text{PN}} = \Delta(\hat{J}_z)^2_{\text{PN}}$ is the projection noise for the perfect CSS deduced from the measured spin noise of the thermal state. $\Delta(\hat{J}_y)^2$ and $\Delta(\hat{J}_z)^2$ are the measured retrodicted variances of $\hat{J}_y$ and $\hat{J}_z$, respectively.
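The squeezing parameter follows from the two angular resolutions defined above as the squared ratio $(\Delta\phi/(\Delta\phi)_{\text{CSS}})^2$. The sketch below evaluates it for hypothetical input values (the variance ratio and the spin-length ratio are made up for illustration, not measured data) and also expresses the result in decibels, a common but here assumed way of quoting it.

```python
import math


def wineland_parameter(var_measured, var_projection_noise, jx_ratio):
    """xi_R^2 = (var / var_PN) * (|<J_x>| / |<J_x(tau_tot)>|)^2.

    jx_ratio is |<J_x>| / |<J_x(tau_tot)>| (>= 1), so dissipation
    penalizes the inferred squeezing.
    """
    return (var_measured / var_projection_noise) * jx_ratio**2


def to_decibel(ratio):
    """Express a variance ratio in dB; negative values indicate noise below the SQL."""
    return 10.0 * math.log10(ratio)


# Hypothetical values: variance at half the projection noise and a 5%
# shortening of the mean spin during the total probing time.
xi2 = wineland_parameter(0.5, 1.0, 1.0 / 0.95)
print(xi2, to_decibel(xi2))
```

A value $\xi_R^2 < 1$ (negative in dB) indicates an angular resolution better than that of the perfect CSS, even after accounting for the reduced spin length.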