Measurement-induced, spatially-extended entanglement in a hot, strongly-interacting atomic system

Quantum technologies use entanglement to outperform classical technologies, and often employ strong cooling and isolation to protect entangled entities from decoherence by random interactions. Here we show that the opposite strategy, promoting random interactions, can help generate and preserve entanglement. We use optical quantum non-demolition measurement to produce entanglement in a hot alkali vapor, in a regime dominated by random spin-exchange collisions. We use Bayesian statistics and spin-squeezing inequalities to show that at least 1.52(4) × 10^13 of the 5.32(12) × 10^13 participating atoms enter into singlet-type entangled states, which persist for tens of spin-thermalization times and span thousands of times the nearest-neighbor distance. The results show that high temperatures and strong random interactions need not destroy many-body quantum coherence, that collective measurement can produce very complex entangled states, and that the hot, strongly-interacting media now in use for extreme atomic sensing are well suited for sensing beyond the standard quantum limit.

Below I list my comments (not in the order of importance)
1) Sometimes the logical flow of the story breaks:
a) In the introductory section, they mention the notation "[1,1,1]" for a magnetic field direction, which is not explained right away. It becomes clear only after looking at Fig. 1 on the next page. At least, the authors should add a reference to this figure.
b) In the Results (material system) section, the authors mention that at Larmor frequencies < 5 kHz the desired SERF regime is achieved. Only half a page later do they explain why this is so.
c) Connected to b), I believe the authors should give a brief explanation of the physics behind the SERF regime in the introductory paragraph. This will help non-experts better understand this work without looking up references.
2) A few more references are needed to support the statements made by the authors:
a) at the beginning, "the same processes also decouple ... which increase the spin coherence time"
b) in the Discussion, "... observed macroscopic singlet states shares several traits with a spin liquid state,.."
3) Regarding the generated entangled state:
a) I did not find that the authors specified which degree of freedom is entangled (electron or nuclear spin).
b) Is it possible to write down the form of the state or to describe it somehow?
c) What is the fidelity of the entangled state? In other words, is it useful for sensing and other applications mentioned in the introduction?

The authors present measurements of spin noise in a hot rubidium vapor. The discretized measurements of spin noise are fed into a Kalman filter which, given the spin noise dynamics, estimates the mean value of the noise and its fluctuations. The authors use spin squeezing inequalities to claim that the measured fluctuations around the mean of the polarimeter signal, which reflects the collective spin along the laser beam axis, are smaller than the standard quantum limit (and the thermal state limit). The authors claim that for this to happen, a macroscopically large number of atoms must have been entangled (due to the measurement performed by the light field), in particular, into a macroscopic singlet state. To support their case, the authors present further systematic checks, like squeezing dependence on light power and magnetic gradient.

General comments
As a first comment, the manuscript is very clearly written and the work therein seems to be a thoroughly explained experiment. The figures are very nice and informative. Apart from the somewhat technical discussion on the workings of the Kalman filter, the manuscript is accessible by the general reader. This work could be a significant step forward in the field of quantum sensing, and a testimony to the fact that great strides can still be made in atomic physics and quantum metrology with relatively simple experimental setups augmented with fresh experimental approaches and an insightful theoretical analysis unraveling the details of the underlying physics.

Technical comments
For the sake of being fully convinced, further scrutinizing the reported results, and clarifying some subtle points, I would like to better understand the following: (1a) In Figs. 1b,c the authors compare the SQL and TSS with the error band around the running mean of the spin noise signal, as extracted by the KF. However, this error band represents the noise of spin noise, whereas I would think that what limits a metrological measurement is not the noise on top of the noise, but the noise itself. In other words, the whole oscillating signal in Fig. 1b, being stochastic and hence unpredictable for timescales > T2, is what I would think sets a limit to the measurement precision when integrating for times > T2, the noise on top of the noise (the spread of the shot-to-shot measurements) being a second-order effect. By visually comparing the rms amplitude of the spin noise signal itself with ¼ of the SQL and TSS 4σ bars, it seems that this amplitude is considerably larger.
(1b) Hence the fact the KF (given enough points and the underlying dynamics) estimates with a given precision a part of the noise signal where some randomly generated coherence dominates the dynamics does not imply that it can predict an intrinsically unpredictable noise signal for times > T2. That is, the KF might allow to precisely estimate the noise amplitude in a "coherent" snapshot lasting for about T2, but this noise amplitude itself will be random in many such snapshots, the distribution being set by the random bursts of the spin noise signal itself (i.e. its amplitude) and not the shot-to-shot fluctuations. So it is not clear if the presented sub-SQL measurement is a "metrological sub-SQL".
(1c) To elaborate a bit more, what I understand as spin noise is the spontaneously generated collective spin (the randomly created oscillations lasting for about T2, then randomly regenerating themselves etc). As recently shown (PR Research 1, 033017, 2019), spin exchange collisions (as well as other kinds of binary collisions) continuously generate spin noise due to the quantum randomness of the post-collision states. Now, the so produced non-zero collective spin fluctuations (as in Fig. 1) scale with atom number as do the collective quantum uncertainties of spin observables in specific states (SQL or TSS), and could also be numerically similar. But attributing the former to the latter is to my understanding far from obvious if not plainly incorrect, as it is also far from obvious that such fluctuations (generated by binary collisions) are in any way related to the actual measurement of the collective spin by the light field. I'm not saying that the authors make such attributions, I'm just "thinking aloud" and would just like to "disentangle" three concurrent issues that I find confusing: spin noise generated by collisions, shot-to-shot variations in measured spin noise, and SQL/TSS bars.
(1d) Related to this is a statement in page 3, left column, where the authors state that it is the measurement (I add "with the light field") that reduces the pink error band in the first few microseconds of the measurement. However, one could claim that it is just the numerical inability of a classical estimator to make an estimate with only a few points available around t=0. Indeed, in the distribution of the actual measurement points just after t=0 there is no apparent change in their spread, so there is no evidence of shrinking of the actual measurement uncertainty after the onset of measurement (i.e. due to acquisition of information).
(1e) So from the authors' perspective I understand that it is the measurement of the collective spin by the light field that projects the atoms to a non-classical state, and the pink error band reflects the atomic noise of this alleged non-classical state, much like we plot coherent states of light with a thickened sine wave, the thickness reflecting the quantum fluctuations of the electric field. This pink error band being smaller than the SQL, there is entanglement, the authors claim.
(1f) Based on the points (1a)-(1d), however, I could paint a different physical picture. Random atomic collisions generate the spin noise signal shown in Fig. 1, the atomic state being some separable state determined by the (still not well understood) physics of spin exchange collisions at the quantum noise level. This state could have even zero or some other nonzero but small variance along z, while the observed pink error band could be due to some classical noise source.
The subtlety is that since the transverse components and their variances are not measured in order to assess metrological spin squeezing, the authors measure along 1D and rely on the 3D KF estimates to find the total spin variance. But, for the sake of arguing, the atomic collision dynamics (which do generate quantum fluctuations of the collective spin) coupled with the collective measurement induced by the light field interaction could lead to rather complicated dynamics not captured by Eq. 3 and the KF. Essentially, how do we really know that these fluctuations (the pink error band) are of atomic origin? Usually in studies of spin noise, the scaling of the total noise power under the spectra in Fig. 2a is plotted against atom number, and a linear scaling with atom number shows that these are indeed spin noise spectra. But now we are talking about consecutive shot-to-shot fluctuations of the measured spin noise. How do they scale with atom number? I understand it might be hard to stay in the SERF regime and significantly vary atom number, but do the authors have some other way to elaborate on this?
(2) The authors claim that a substantial fraction (30%) of the atoms are entangled in a macroscopic singlet state. The singlet being magnetically silent, I would expect that outside the SERF regime (large Larmor frequencies where purported variances approach SQL) these atoms would cease to be magnetically silent, hence the rms amplitude of the spin noise signal itself, and not just the fluctuations of the amplitude around its running mean, should grow larger.
Equivalently (since this might be hard to observe when the linewidths increase), I would expect the plotted spectra in Fig. 2a to contain different integrated noise powers, if I assume that at the low frequencies of the SERF regime a large fraction of atoms do not contribute to the average spin noise signal but only to its shot-to-shot variance. With the quality of the spectra and the fits, a 30% effect in integrated noise power should be readily observable. Is there such an effect? If not, why?
Essentially, all my questions are intertwined and boil down to the aforementioned subtle concurrence of (i) spin noise itself, (ii) shot-to-shot fluctuations thereof and (iii) theoretical uncertainty bars. I'm looking forward to the authors helping me to clarify the above.
Wording and referencing
(1) It would probably be prudent to abstain from statements on how many orders of magnitude this measurement surpasses other measurements in entanglement metrics, since not all measurements can be compared in a straightforward way, nor is such a global comparison the main objective of this work, the physics details of previous and the current measurement being rather subtle.

Minor Comments
(1) At the end of the abstract the same sentence is repeated twice.

University of Crete
Reviewer #3: Remarks to the Author: This is a very interesting paper. The number of particles that are reported to be entangled constitutes a new record. The regime of operation of the experiment is also remarkable. It is fascinating to think about the fact that the particles participating in the entanglement are constantly changing because of the spin-exchange collisions, but that the entanglement itself is preserved. I do think that the paper deserves to be published in Nature Communications, but I would like the authors to address my questions below. I am listing them roughly in order of importance, starting with the most important.
The authors give a bound for the number of entangled particles. I am wondering whether it would be possible to also infer something about the type of entanglement, i.e. two-particle singlets versus more complex multi-party entanglement. I am in particular thinking about the methods used in this paper: https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.86.4431 Could they be adapted to the present experiment? Or is it clear that no multi-party entanglement will be created?
The authors also estimate the spatial range of entanglement based on applying a magnetic field gradient. They find a range that is much greater than a wavelength, but smaller than the size of the cell. Can this result be understood quantitatively?
I gained some understanding of the SERF regime from reading this paper, but I do feel that the explanation could still have been more comprehensive and self-contained. Some related detail questions include the size of A_{hf} (it would have been nice to see that somewhere on page 2 or 3), and the meaning of the 'nuclear slowing-down factor' q.
Finally a minor point, there is a lot of repetition towards the end of the abstract.

Reply to comments of Reviewer 1
We thank Reviewer 1 for a careful and detailed reading of the manuscript, and for identifying several aspects that required improvement. Below we give a point-by-point response to the Reviewer's comments. Resulting changes are indicated in blue in the revised manuscript.
b) In the Results (material system) section, the authors mention that at Larmor frequencies < 5 kHz the desired SERF regime is achieved. Only half a page later do they explain why this is so.
c) Connected to b), I believe the authors should give a brief explanation of the physics behind the SERF regime in the introductory paragraph. This will help non-experts better understand this work without looking up references.
Response: We thank the Reviewer for these suggestions. We have added the corresponding reference and explanation according to the Reviewer's suggestions.
Response: We thank the Reviewer for these comments. Indeed, we did not specify which degree of freedom is entangled. Spin squeezing theory allows us to say that the atoms are entangled, but (to date at least) does not indicate in what degree of freedom. We can hypothesize that the entanglement follows the chain of interactions: The optical probe interacts with the electron spin and orbital angular momentum, which presumably entangles the electron spins of different atoms, including atoms at a distance, because the probe light interacts with all of the atoms. The electron spin is coherently coupled to the nuclear spin by the hyperfine interaction, so we can expect entanglement also of the nuclear spins of distant atoms. Finally, collisions between atoms exchange their electron spins. We can thus hypothesize entangled states involving the electrons and nuclei of clusters of atoms in one place with clusters of atoms in another. Fortunately, the squeezing is useful for metrological purposes even if we don't know the exact nature of the entangled state. One example that has been studied in detail is the magnetic gradiometer [1].
Probably the most commonly used method is to compute the conditional variance of fits to the data [2]. We have previously used this method for spin squeezing using a sequence of probe pulses [3,4]. This conditional variance method has several intrinsic drawbacks when working with diffusive continuous-time data: no simple parametrized fit function accurately captures the diffusion process over long time scales; using non-simple fit functions makes the fits less accurate (the so-called "bias-variance tradeoff"); and any simple fit function (e.g. a sinusoid with amplitude and phase that is polynomial in time) has a nonlinear dependence on its parameters (e.g. F_z is a sinusoidal, not linear, function of the phase). The conditional variance approach is also simply cumbersome, in that it requires a large amount of data to be fit for every point in the time series. In contrast, the Kalman filter based on Eq. (3) is efficient, uniquely defined, linear, and optimal in a least-squares sense. We appreciate that the Kalman filter approach is relatively new and unfamiliar in atomic sensing, but it has been shown to be remarkably accurate in describing the statistical properties of spin noise. See for example Jimenez-Martinez et al. [5], an experiment we performed precisely to check the accuracy of the Kalman filter in this context.
Comment 5. The authors mention the optimal optical power of 2 mW; it would be interesting to see an explanation of why this is the case.
Response: We thank the Reviewer for this good question. Many parameters could affect the entanglement generation, and the optical power is one of them. Higher optical power gives a stronger interaction with the atoms and therefore, in principle, a better signal-to-noise ratio. At the same time, it increases the power broadening of the spin noise resonance, which is to say it accelerates spin relaxation and diffusion. In optimizing the experiment we acquired data with different probe powers, keeping other parameters fixed, and found 2 mW to be optimal for spin squeezing.
Response: We thank the Reviewer for this question. The Reviewer is correct: our vapor cell is enclosed in a 4-layer magnetic shield which prevents outside fields from reaching the sensor. This is the usual configuration for testing magnetic sensors, because it blocks environmental noise. The configuration is also used for precision sensing when the source is small enough to be placed inside the shields along with the sensor. For measuring the field from larger sources, e.g. in human brain magnetic field measurements, the source and sensor are placed together in a magnetically shielded room, which is simply a larger magnetic shield. We note that shielding is especially important for SERF magnetometers, which are extremely sensitive (sub-fT/√Hz), but only have this sensitivity when the total field strength is small. A different class of magnetometers is used for unshielded measurements, e.g. measurements of the earth field.

Reply to Reviewer 2
We thank Reviewer 2 for the very in-depth questions and comments. Below we give a bit of context to frame the discussion and then reply to the Reviewer's queries. Resulting changes are marked in blue in the revised manuscript.

Context
As the Reviewer is probably quite aware, SERF-regime vapors have complex physics, and there is to date no theory that can accurately describe non-classical states in these vapors, nor accurately describe the quantum effects of non-destructive measurement, e.g. Faraday rotation probing. One might expect that SERF-regime vapors would be very good for quantum-enhanced sensing, because they combine high optical depth (which in simpler systems makes for good QND measurements and good measurement-induced squeezing) with long coherence times (which makes for good sensitivity). Or one might expect that SERF-regime vapors are very bad for quantum-enhanced sensing, because the SERF physics would scramble any entangled/squeezed states through the fast and random spin-exchange. We saw an opportunity to test this latter hypothesis, by trying to make a singlet state. Our approach is at heart the same as we used in Behbood et al. PRL 2014 [1], where we used cold atoms and pulsed measurements, with a (1,1,1) B-field and a 1/3 period wait time between measurements, to get statistics of all three components of F. For the SERF experiment we use continuous measurements and thus the Kalman filter is the most appropriate analysis tool.
A natural question is "why use a non-polarized state rather than a polarized state?" Of course, we are ultimately interested in polarized states, because these are relevant to sensing. But if the question is whether SERF physics can support entanglement/squeezing, making an unpolarized/singlet state is arguably a more stringent test, because strongly polarized states can be protected against spin-exchange relaxation by other mechanisms. The singlet is also much less sensitive to magnetic and technical noise. Finally, for an unpolarized ensemble the statistical model is linear, allowing us to use an ordinary Kalman filter rather than an extended Kalman filter. If we had included optical pumping in the experiment, and tried to create a strongly-polarized state, the statistical model would have to be nonlinear to account for the saturation of polarization due to optical pumping.
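The role of the (1,1,1) field direction and the 1/3-period wait time mentioned above can be checked with a few lines of linear algebra. The following is a minimal sketch, not taken from the manuscript: the helper name `rotation_about_axis` is hypothetical, and it only illustrates the geometric fact that a rotation by one third of a turn about the (1,1,1) axis cyclically permutes the x, y and z axes, so that successive readouts of a single component sample all three in turn (in which order depends on the sense of precession).

```python
import numpy as np

def rotation_about_axis(axis, angle):
    """Rotation matrix for a rotation by `angle` (radians) about `axis` (Rodrigues' formula)."""
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)                      # unit vector along the rotation axis
    K = np.array([[0.0, -k[2], k[1]],              # cross-product matrix of k
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

# One third of a precession period about the (1,1,1) direction.
R_third = rotation_about_axis([1, 1, 1], 2 * np.pi / 3)

# The x axis is carried onto y, y onto z, and z onto x.
for name, vec in zip("xyz", np.eye(3)):
    print(name, "->", np.round(R_third @ vec, 12))
```

Three such rotations compose to the identity, which is why a single-axis probe combined with precession about (1,1,1) can gather statistics on all three components of F.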

Response to points raised
Comment 1a. (1a) In Figs. 1b,c the authors compare the SQL and TSS with the error band around the running mean of the spin noise signal, as extracted by the KF. However, this error band represents the noise of spin noise, whereas I would think that what limits a metrological measurement is not the noise on top of the noise, but the noise itself. In other words, the whole oscillating signal in Fig. 1b, being stochastic and hence unpredictable for timescales > T2, is what I would think sets a limit to the measurement precision when integrating for times > T2, the noise on top of the noise (the spread of the shot-to-shot measurements) being a second-order effect. By visually inspecting the rms amplitude of the spin noise signal itself with ¼ of the SQL and TSS 4σ bars, it seems that this amplitude is quite larger.
Response: As the Reviewer writes, one may reasonably expect that measurement of slow signals, i.e. of frequency components lower than 1/T2, will be limited by spin diffusion, not by the instantaneous uncertainty of the state. Allowing that this is the case, there is nonetheless the possibility to improve the measurement of faster frequency components. This question is studied in Shah
Comment 1b. (1b) Hence the fact the KF (given enough points and the underlying dynamics) estimates with a given precision a part of the noise signal where some randomly generated coherence dominates the dynamics does not imply that it can predict an intrinsically unpredictable noise signal for times > T2. That is, the KF might allow to precisely estimate the noise amplitude in a "coherent" snapshot lasting for about T2, but this noise amplitude itself will be random in many such snapshots, the distribution being set by the random bursts of the spin noise signal itself (i.e. its amplitude) and not the shot-to-shot fluctuations. So it is not clear if the presented sub-SQL measurement is a "metrological sub-SQL".
Response: As described in our response to (1a) we agree with the highlighted statements. The question of the metrological value will of course depend on what one aims to measure. If the spin noise itself is of interest, then a low-noise, nondestructive readout will more clearly reveal the spin noise, while also not perturbing it. This has some metrological value, see for example these publications by Lucivero et al. on the topic [4,5]. Nonetheless, externally imposed changes in the spin state will probably be more often of interest than the spin noise itself. For example, in a FID magnetometer, the rate of precession indicates the instantaneous field. In this scenario the changes of interest may well be in the sub-T2 time scale, and the precision of the instantaneous estimates would be very relevant.
Comment 1c. (1c) To elaborate a bit more, what I understand as spin noise is the spontaneously generated collective spin (the randomly created oscillations lasting for about T2, then randomly regenerating themselves etc). As recently shown (PR Research 1, 033017, 2019), spin exchange collisions (as well as other kinds of binary collisions) continuously generate spin noise due to the quantum randomness of the post-collision states. Now, the so produced non-zero collective spin fluctuations (as in Fig. 1) scale with atom number as do the collective quantum uncertainties of spin observables in specific states (SQL or TSS), and could also be numerically similar. But attributing the former to the latter is to my understanding far from obvious if not plainly incorrect, as it is also far from obvious that such fluctuations (generated by binary collisions) are in any way related to the actual measurement of the collective spin by the light field. I'm not saying that the authors make such attributions, I'm just "thinking aloud" and would just like to "disentangle" three concurrent issues that I find confusing: spin noise generated by collisions, shot-to-shot variations in measured spin noise, and SQL/TSS bars.
Response: Regarding the green-highlighted text: This is probably well known by the Reviewer, but it bears repeating: SE collisions, in combination with the HF interaction, cause a collection of atoms to relax toward a spin thermal state. In this process, the net spin is unchanged, because both SE and HF processes conserve total angular momentum. Because the spin-thermal state is the highest entropy state with a given net spin angular momentum, this adds noise to any state that is not already a spin-thermal state.
Regarding the yellow-highlighted text: For the spin noise shown in Fig. 1, the rate of relaxation to the spin thermal state is much faster than either the spin precession or the spin diffusion. Because of this, the observed collective spin oscillation and fluctuations cannot be due to the SE process, but rather to processes that modify the total angular momentum, which include binary spin destruction (SD) collisions, diffusion of atoms into and out of the probed region, and scattering of probe light.
Regarding the blue-highlighted text: The measurement interaction causes three relevant "measurement back action" effects on the state of the atoms. The first two we will call "dynamical effects" because they change the observable F via the Hamiltonian. These are 1) spin rotation caused by the ellipticity of the probe light, which produces an optical Zeeman shift. Because the probe is linearly polarized, it only has a nonzero ellipticity through quantum fluctuations, so the generated rotation is random with zero mean. 2) scattering of the probe light, which also makes a random contribution to the spin. The last effect 3) we will call an "information effect" because it is caused by projecting the quantum state in the act of measurement, not directly by the dynamics. The probe gives information about the state, reducing our uncertainty about F_z. This projects the state into a state more closely resembling an eigenstate of the F_z operator.
Note that for an unpolarized state like the one used here, effect 1) is negligible in practice. It rotates the state by a small angle, causing F_x → F_x cos θ + F_y sin θ and similarly for F_y, where θ is the spin rotation angle produced by the optical Zeeman shift. Because F_x ~ F_y ~ √N, the sin θ term can be neglected, provided |θ| ≪ 1. In contrast, for a state polarized along the y direction, such that F_x ~ √N, F_y ~ N, this condition would be |θ| ≪ 1/√N. The measurement-induced spin rotation is always a significant effect on the state when the measurement produces an uncertainty comparable to the SQL.
Comment 1d. (1d) Related to this is a statement in page 3, left column, where the authors state that it is the measurement (I add "with the light field") that reduces the pink error band in the first few microseconds of the measurement. However, one could claim that it is just the numerical inability of a classical estimator to make an estimate with only a few points available around t=0. Indeed, in the distribution of the actual measurement points just after t=0 there is no apparent change in their spread, so there is no evidence of shrinking of the actual measurement uncertainty after the onset of measurement (i.e. due to acquisition of information).
Response: The Reviewer seems to be looking for a physical effect on the spins due to the measurement. As described in response to (1c), the physical effect (a random spin rotation) has a small effect on the spin components of an unpolarized state like we use here.
Comment 1e. (1e) So from the authors' perspective I understand that it is the measurement of the collective spin by the light field that projects the atoms to a non-classical state, and the pink error band reflects the atomic noise of this alleged non-classical state, much like we plot coherent states of light with a thickened sine wave, the thickness reflecting the quantum fluctuations of the electric field. This pink error band being smaller than the SQL, there is entanglement, the authors claim.
Response: The (width of the) pink band represents our uncertainty about the spin observable F_z. Note that the Kalman filter provides also uncertainties for the other components, in the form of a covariance matrix that describes both the variances and the correlations of F_x, F_y and F_z. Because it reflects our knowledge of these variables, the covariance matrix provides upper bounds on the uncertainty of the state.
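To make concrete how a Kalman filter can report a full covariance matrix while only one component is read out, here is a minimal sketch under simplifying assumptions. It is not the model of Eq. (3): it uses a hypothetical two-component state (F_z, F_y) that precesses at angular frequency `omega`, relaxes at rate `gamma`, and is driven by diffusion noise chosen so that the unmeasured variance stays at `thermal_var`. The function name and all parameter values are illustrative only. The covariance recursion does not depend on the measured data, so no simulated signal is needed.

```python
import numpy as np

def kalman_covariance(n_steps, dt, omega, gamma, thermal_var, r):
    """Covariance history of a Kalman filter for a toy precessing, relaxing spin.

    State is (F_z, F_y); only F_z is observed, with readout-noise variance r per sample.
    """
    c, s = np.cos(omega * dt), np.sin(omega * dt)
    decay = np.exp(-gamma * dt)
    A = decay * np.array([[c, s], [-s, c]])        # one step of precession with relaxation
    Q = thermal_var * (1.0 - decay**2) * np.eye(2)  # diffusion that keeps the thermal variance stationary
    H = np.array([[1.0, 0.0]])                      # the probe reads out F_z only
    P = thermal_var * np.eye(2)                     # start from the thermal (unmeasured) covariance
    history = [P.copy()]
    for _ in range(n_steps):
        P = A @ P @ A.T + Q                         # predict: dynamics plus added noise
        S = H @ P @ H.T + r                         # innovation variance (1x1)
        K = P @ H.T / S                             # Kalman gain (2x1)
        P = (np.eye(2) - K @ H) @ P                 # update: information gained from the readout
        P = 0.5 * (P + P.T)                         # keep the matrix numerically symmetric
        history.append(P.copy())
    return np.array(history)

# Illustrative numbers only: variances in units of the thermal variance.
hist = kalman_covariance(n_steps=2000, dt=0.01, omega=2 * np.pi * 5,
                         gamma=1.0, thermal_var=1.0, r=0.05)
print("final covariance:\n", hist[-1])
```

In this toy model the precession carries information gained about F_z into the unmeasured component, so both diagonal entries of the covariance end up below the thermal value and the off-diagonal entries record the correlations.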
Comment 1f. (1f) Based on the points (1a)-(1d), however, I could paint a different physical picture. Random atomic collisions generate the spin noise signal shown in Fig. 1, the atomic state being some separable state determined by the (still not well understood) physics of spin exchange collisions at the quantum noise level. This state could have even zero or some other nonzero but small variance along z, while the observed pink error band could be due to some classical noise source. The subtlety is that since the transverse components and their variances are not measured in order to assess metrological spin squeezing, the authors measure along 1D and rely on the 3D KF estimates to find the total spin variance. But, for the sake of arguing, the atomic collision dynamics (which do generate quantum fluctuations of the collective spin) coupled with the collective measurement induced by the light field interaction could lead to rather complicated dynamics not captured by Eq. 3 and the KF. Essentially, how do we really know that these fluctuations (the pink error band) are of atomic origin? Usually in studies of spin noise, the scaling of the total noise power under the spectra in Fig. 2a is plotted against atom number, and a linear scaling with atom number shows that these are indeed spin noise spectra. But now we are talking about consecutive shot-to-shot fluctuations of the measured spin noise. How do they scale with atom number? I understand it might be hard to stay in the SERF regime and significantly vary atom number, but do the authors have some other way to elaborate on this?
Response: Regarding the yellow text: Random spin processes (SD collisions, diffusion) do indeed cause the spin noise signal of Fig. 1. As described in the response to (1c), the (width of the) pink band indicates our uncertainty about F_z as a function of time as the measurement proceeds. The quantum uncertainty of F_z cannot be larger than our uncertainty about F_z, so the width of the pink band is an upper bound. It is probably worth pointing out that this uncertainty comes to an equilibrium value (see for example Fig. 6) due to a competition of diffusion (which pushes the uncertainties toward their thermal state values) and measurement, which pushes them toward zero.
Regarding the blue text: We make various checks of the validity of Eq. (3) and the KF. In fact, these are equivalent because the KF is derived from Eq. (3). See the "Validation" section of the Methods. Based on these, we believe that the KF results accurately describe the dynamics of F.
Regarding the green text: Scaling with atom number and photon flux is often used to separate different noise contributions (atomic quantum noise, atomic technical noise, photon shot noise, etc.). One main motivation for doing this is to identify the SQL. This strategy works well in scenarios involving non-interacting particles, because each noise contribution has a simple polynomial scaling. When the particles begin to interact, this is no longer the case. This can be observed for example in spin noise spectra: outside the SERF regime the area of the spin noise peak (i.e., the integrated noise power) is proportional to atom number, and independent of Larmor frequency and relaxation rate. In the transition from the non-SERF to SERF regimes it is not proportional to atom number. For example, in Fig. 2a the spin noise peaks are not of the same area, even though the density is the same. For this reason, we did not consider it appropriate to use scaling to determine SQL.
Instead, we made a direct calibration of the number of atoms participating. This used the spin noise linewidth versus density, and the known value of the SE collision rates, as a calibration for the density (see Methods: Density Calibration), a direct measurement of the beam dimensions to determine the effective volume of the sample, and the computed rotation efficiency of the vapor (see Methods: Observed Spin Signal). The SQL of the total noise var(F_x) + var(F_y) + var(F_z) for N atoms in a thermal state is computed in Methods: Entanglement Witness.
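The equilibrium between diffusion and measurement described in the response to (1f) can be illustrated with a one-variable toy model. This is a sketch under assumptions that are not taken from the manuscript: a single spin component with variance P, relaxation rate γ, thermal variance P_th, and a continuous measurement with effective readout noise density ρ, described by the standard scalar Kalman-Bucy variance equation rather than by the three-component model of Eq. (3).

```latex
\begin{align}
\frac{dP}{dt} &= \underbrace{-2\gamma P + 2\gamma P_{\mathrm{th}}}_{\text{relaxation and diffusion}}
               \;-\; \underbrace{\frac{P^{2}}{\rho}}_{\text{measurement}}, \\
0 &= \frac{P_{\infty}^{2}}{\rho} + 2\gamma P_{\infty} - 2\gamma P_{\mathrm{th}}
\quad\Longrightarrow\quad
P_{\infty} = \gamma\rho\left(\sqrt{1 + \frac{2P_{\mathrm{th}}}{\gamma\rho}} - 1\right).
\end{align}
```

In this toy model a very noisy readout (ρ → ∞) leaves the variance at its thermal value, P_∞ → P_th, while a very precise readout (ρ → 0) gives P_∞ ≈ √(2γρP_th) → 0, which is the qualitative behavior described above.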
Comment 2. (2) The authors claim that a substantial fraction (30%) of the atoms are entangled in a macroscopic singlet state. The singlet being magnetically silent, I would expect that outside the SERF regime (large Larmor frequencies where purported variances approach SQL) these atoms would cease to be magnetically silent, hence the rms amplitude of the spin noise signal itself, and not just the fluctuations of the amplitude around its running mean, should grow larger. Equivalently (since this might be hard to observe when the linewidths increase), I would expect the plotted spectra in Fig. 2a to contain different integrated noise powers, if I assume that at the low frequencies of the SERF regime a large fraction of atoms do not contribute to the average spin noise signal but only to its shot-to-shot variance. With the quality of the spectra and the fits, a 30% effect in integrated noise power should be readily observable. Is there such an effect? If not, why?
Response: As already mentioned in response to (1f), there is a difference in the integrated noise powers (the "area" we called it) between the SERF and non-SERF regimes. This does not have anything to do with the QND measurement or the generation of singlet, however. A simple proof of this is that if we turn down the probe power the spin squeezing goes away, but the spin noise spectra look the same (that is, the atomic contribution to the rotation angle noise is the same; the shot noise contribution to the angular noise is larger at lower probe power). How is it possible that we are putting 30% of the spins into singlets and it does not reduce the spin noise? One way to understand this is to note that by measurement we are only shrinking the uncertainty of F, not its average. As seen in Fig. 1, the average continues to diffuse in the same way as it would without the measurement (we ignore power broadening, which in practice is small and in principle can be made arbitrarily small through large OD). Nonetheless, due to the measurement we know the value of F with uncertainty below the SQL, and so the uncertainty of the state must also have been reduced below this level. And spin squeezing theory tells us that the only way to reduce the quantum uncertainty below the SQL is to make singlets. If this still appears contradictory, consider this scenario: we have N = 10^12 atoms experiencing spin diffusion, which causes the average spin polarization ⟨F⟩ to wander about zero with excursions of typical magnitude δ⟨F_z⟩ ~ √N = 10^6. This condition is compatible with different states, with different uncertainties: 1) a small minority (~one part in