Quantum measurement arrow of time and fluctuation relations for measuring spin of ultracold atoms

The origin of macroscopic irreversibility from microscopically time-reversible dynamical laws—often called the arrow-of-time problem—is of fundamental interest in both science and philosophy. Experimentally probing such questions in quantum theory requires systems with near-perfect isolation from the environment and long coherence times. Ultracold atoms are uniquely suited to this task. We experimentally demonstrate a striking parallel between the statistical irreversibility of wavefunction collapse and the arrow of time problem in the weak measurement of the quantum spin of an atomic cloud. Our experiments include statistically rare events where the arrow of time is inferred backward; nevertheless we provide evidence for absolute irreversibility and a strictly positive average arrow of time for the measurement process, captured by a fluctuation theorem. We further demonstrate absolute irreversibility for measurements performed on a quantum many-body entangled wavefunction—a unique opportunity afforded by our platform—with implications for studying quantum many-body dynamics and quantum thermodynamics.

The first step of our experimental protocol is to initialize our two-state quantum system into a quantum state $|\psi_0\rangle$. We use a coherent two-photon Raman interaction to couple two ground states $|\psi_\uparrow\rangle$ and $|\psi_\downarrow\rangle$ of a three-level $\Lambda$ system via an excited state $|e\rangle$, using optical fields with complex Rabi frequencies $\Omega_A$ and $\Omega_B$ (see Fig. 1 of the main text) [1].
Here we denote the initial atomic state for state preparation as $|\psi_i\rangle$. In the limit of large detuning $\Delta$ of the optical frequencies from the excited state, the dynamics is captured by the time evolution of the two-level system $\{|\psi_\uparrow\rangle, |\psi_\downarrow\rangle\}$ under the atom-field interaction. When the Raman interaction is effected with square diabatic pulses, the right-hand side of the coupled equations of motion contains only terms that are constant in time, so the system can be integrated directly; for an interaction time $t$, the evolution of the state vector acquires the form
$$|\psi(t)\rangle = \hat{U}|\psi_i\rangle, \qquad \hat{U} = \exp\left(i\,\frac{\Omega t}{2}\,\mathbf{n}\cdot\boldsymbol{\sigma}\right),$$
where $\hat{U}$ is the unitary time evolution operator [2,3] and $\boldsymbol{\sigma}$ is the vector of Pauli matrices. The generalized Rabi frequency $\Omega = (|\Omega_A|^2 + |\Omega_B|^2)/4\Delta$ and the vector $\mathbf{n} = (\sin 2\alpha \sin\phi,\ \sin 2\alpha \cos\phi,\ \cos 2\alpha)^T$ in the unitary evolution are determined by experimental parameters: $\tan\alpha = |\Omega_A|/|\Omega_B|$ and $\phi = \phi_A - \phi_B$ describe the relative strength and phase of the Raman beams in the experiment [2][3][4].
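As a hedged illustration of the unitary above, the following minimal sketch builds $\hat{U}$ numerically and applies it to $|\psi_\uparrow\rangle$. The basis ordering ($|\psi_\uparrow\rangle$ first), the function name `raman_unitary`, and the chosen pulse areas are assumptions made for illustration; they are not taken from the experiment.

```python
import numpy as np

# Pauli matrices in the assumed basis ordering (|psi_up>, |psi_down>).
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def raman_unitary(theta, alpha, phi):
    """Return U = exp(i * theta/2 * n.sigma), with theta = Omega * t (hypothetical helper)."""
    n = np.array([np.sin(2 * alpha) * np.sin(phi),
                  np.sin(2 * alpha) * np.cos(phi),
                  np.cos(2 * alpha)])
    n_sigma = n[0] * SX + n[1] * SY + n[2] * SZ
    # n is a unit vector, so (n.sigma)^2 = 1 and the exponential has a closed form.
    return np.cos(theta / 2) * np.eye(2) + 1j * np.sin(theta / 2) * n_sigma

psi_i = np.array([1, 0], dtype=complex)              # start in |psi_up>
U = raman_unitary(theta=np.pi, alpha=np.pi / 4, phi=0.0)
populations = np.abs(U @ psi_i) ** 2                 # spin populations after the pulse
U_half = raman_unitary(theta=np.pi / 2, alpha=np.pi / 4, phi=0.0)
populations_half = np.abs(U_half @ psi_i) ** 2
```

With equal beam strengths ($\alpha = \pi/4$), a pulse area $\Omega t = \pi$ transfers the full population, while $\Omega t = \pi/2$ gives an equal superposition.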
In the experiment the two-state quantum system is constituted by the $|F=2, m_F=2\rangle$ and $|F=2, m_F=0\rangle$ atomic Zeeman sublevels in the $|F=2\rangle$ ground-state hyperfine manifold of $^{87}$Rb. We create a Bose-Einstein condensate (BEC) in the $|\psi_i\rangle = |\psi_\uparrow\rangle \equiv |F=2, m_F=2\rangle$ atomic state in a magnetic trap [1][2][3][4]. The trap is turned off, and the cloud expands in free fall for 9 ms to a regime where interatomic interactions are negligible and the dynamics can be described using the single-atom Hamiltonian of the atom-field interaction described above. A bias magnetic field of ≈ 10 G defines a quantization axis for the system. The Raman pulses that prepare the initial quantum state in our experiment are two phase-coherent, diabatic, 5 µs square pulses derived from a single Raman master laser and detuned 440 MHz below the excited state $|e\rangle \equiv |F=1, m_F=1\rangle$ on the D1 line of Rb. They copropagate along this quantization axis and have polarizations that allow us to couple the $|\psi_\uparrow\rangle \equiv |F=2, m_F=2\rangle$ and $|\psi_\downarrow\rangle \equiv |F=2, m_F=0\rangle$ states via the excited state $|e\rangle$. The individual frequencies of the Raman pulses are controlled with acousto-optic modulators, so that the two atomic states, with a ground-state splitting of ≈ 7.2 MHz, are individually addressed by the optical fields.
For the results relating to the arrow of time and fluctuation relations for a spatially uniform preparation of the cloud, we use Gaussian Raman beams with uniform phase and orthogonal circular polarizations ($\sigma^+$, $\sigma^-$). The Raman beams are overlapped before being fiber-coupled so that they share the same Gaussian spatial mode and are essentially uniform in intensity over the spatial extent of the cloud (≈ 50 µm at the time of the Raman interaction). This process thus controllably creates an ensemble of ≈ $6.5 \times 10^6$ identically prepared atoms in a coherent superposition of the atomic ground states, which is the state $|\psi_0\rangle$ in our investigation of the quantum measurement process.
Supplementary Figure 1 (caption, partial): ... and the second verifying the coherence through matter-wave interference. Persisting spatial and spin coherence at the time of imaging are apparent in the sinusoidally varying intensity in each spin state in the resulting absorption image. Note that since the absorption imaging process is destructive, the three data shots shown are from separate experimental runs where the Raman process is performed each time on different clouds initially identically prepared in atomic state $|F=2, m_F=2\rangle$.
* nbig@pas.rochester.edu
Images of the cloud are taken with standard absorption imaging techniques [5]. This yields the column density of the cloud integrated along the quantization axis. The Stern-Gerlach interaction produces dynamics in the transverse plane, so that the density profile of the cloud in the transverse direction is a probability distribution for spin measurement with the Stern-Gerlach process. Our image processing combines images taken at three successive times: an absorption image of the atoms, a beam image without atoms, and a dark shot of background light, which we process to compute the optical density of the cloud. As a result of noise in this subtraction, some pixels take negative values. The imaging system we use has unit magnification, and the camera has a pixel size of 16 µm × 16 µm.
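The three-image processing described above can be sketched as follows. This is a minimal illustration that assumes the common textbook definition of optical density, $\mathrm{OD} = \ln[(I_\text{beam} - I_\text{dark})/(I_\text{atoms} - I_\text{dark})]$; the exact processing used in the experiment is not specified in this note, and the function name and pixel values are hypothetical.

```python
import numpy as np

def optical_density(atoms, beam, dark):
    """Optical density from three frames (assumed textbook formula, hypothetical helper)."""
    signal = atoms.astype(float) - dark      # light transmitted through the cloud
    reference = beam.astype(float) - dark    # probe light without atoms
    od = np.full(signal.shape, np.nan)       # NaN marks pixels without a usable ratio
    valid = (signal > 0) & (reference > 0)
    od[valid] = np.log(reference[valid] / signal[valid])
    return od

# Two illustrative pixels: one with real absorption, one dominated by noise.
atoms = np.array([[55, 102]])
beam = np.array([[100, 100]])
dark = np.array([[10, 10]])
od = optical_density(atoms, beam, dark)
positive_only = np.where(od > 0, od, 0.0)    # negative pixels excluded from later analysis
```

The second pixel records slightly more light with atoms than without, so its optical density is negative; this is the kind of noise-induced negative value mentioned above.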
In the subsequent results treating the fluctuation theorem in the context of the quantum many-body wavefunction of the BEC, we create an entangled superposition state of the cloud where a center-of-mass orbital angular momentum (OAM) quantum number is correlated with spin state. Here, one of the Raman beams is first sent through a spiral phase plate that creates a vortex Laguerre-Gauss (LG) mode with OAM quantum number $\ell = 1$ and circular polarization $\sigma^+$. This beam is then overlapped with the second Raman beam, which is in a Gaussian (G) spatial mode with orthogonal circular polarization $\sigma^-$, before interacting with the cloud, initially in state $|F=2, m_F=2\rangle$. The atoms that are transferred from $|F=2, m_F=2\rangle$ to $|F=2, m_F=0\rangle$ acquire an OAM equal to the difference of the OAM between the two Raman beams. This creates an entangled state in which the spatial mode of the atoms is correlated with their spin state, where we define $|\psi_\uparrow\rangle \equiv |F=2, m_F=0\rangle$ and $|\psi_\downarrow\rangle \equiv |F=2, m_F=2\rangle$ for notational consistency with the analysis in Supplementary Note 3. Note that this assignment is opposite to the natural one used in the Gaussian analysis; however, the results discussed are unchanged by this, since we are only concerned with relative attributes of the cloud as the spin states separate. The LG and G mode functions are further discussed in Supplementary Note 3.
We now address the important issue of relaxation and decoherence timescales in our system. The two-photon resonance linewidth of our Raman process is typically a few kHz. We use 5 µs pulses in our two-photon Raman transfer process to create superpositions of spin states, so that the timescales of state preparation are well within the timescales of state decay. Quantum systems are inevitably coupled to their environment, and sources of homogeneous and inhomogeneous relaxation, including spin-spin interactions and stray magnetic fields, are a concern.
Here, we benefit from the ultracold temperatures of our dilute atomic cloud and the excellent isolation from the environment, so that these sources of decoherence are well controlled in our experiment. We demonstrate that the spatial and spin coherence of the cloud is retained at the time of absorption imaging, and is not destroyed during the free fall of the cloud from the time of state preparation to the time of imaging. To do this, we consider again an atomic cloud prepared in a coherent superposition of spin states $|\psi_\uparrow\rangle$ and $|\psi_\downarrow\rangle$, with coupled orbital angular momenta ($\ell = 1$, $\ell = 0$) (Supplementary Fig. 1(a)). The state with angular momentum $\ell = 1$ contains an azimuthal phase that varies by $2\pi$ around the central axis of the mode, while the spin state with $\ell = 0$ has a uniform phase. We then apply a second 5 µs pulse pair with effectively uniform Gaussian profiles and angular momentum quantum numbers ($\ell = 0$, $\ell = 0$) that transfers ≈ 50% of the atomic population between the states (Supplementary Fig. 1(b) shows the atomic state prepared by just this second pulse pair, with no preceding LG-G transfer). Supplementary Fig. 1(c) shows an example of the absorption image produced by an atomic cloud after such a sequence, when first an LG-G pulse pair and subsequently a G-G pulse pair are applied, showing matter-wave interference between the coherent superposition states [1,3,4]. The azimuthal phase winding from the $\ell = 1$ component produces a sinusoidal intensity variation in the absorption image of each spin state, and indicates the robustness of the atomic coherences written into the macroscopic BEC wavefunction. Further, the absorption image indicates that this spatial and spin coherence is not destroyed on expansion of the cloud from the time of the Raman state preparation up to the time of absorption imaging.
This coherence is still visible when the relative time delay of the second (interfering) pulse pair from the first (state preparation) pulse pair is varied; in the data shown in Supplementary Fig. 1(c) the time between the two pulse pairs was 10 µs, the measurement time τ = 1100 µs, and the image was taken after the time-of-flight of 13 ms. We can thus treat our atomic spin state as coherent over the timescales of the experiment.

B. Time-of-flight Stern-Gerlach absorption imaging
Our weak measurement of spin state relies on a readout provided by a time-of-flight Stern-Gerlach process that correlates atomic spin state with position. In examining the evolution of our initial atomic state during this process, we note that the homogeneous acceleration of the atomic wave-packet due to free fall under gravity produces a center-of-mass dynamics that may be separated out from the dynamics that affects the relative attributes of the atomic spin states (see for example the analysis in [6]); this center-of-mass dynamics is excluded from our subsequent analysis. Further, we model the Stern-Gerlach interaction as a momentum kick, since the Stern-Gerlach magnetic field is pulsed on for a time $\tau$ much shorter than the subsequent free evolution time $t_f$. The relative kinetic energy of the components of the cloud at this stage of our experiment is small enough that, to a first approximation, we ignore evolution from the free Hamiltonian during the Stern-Gerlach interaction. We thus model our measurement dynamics in two stages: a stage where the dynamics is dominated by the interaction with the inhomogeneous Stern-Gerlach magnetic field, with interaction Hamiltonian $\hat{H}_{SG}$, followed by a stage where the dynamics is governed by the free Hamiltonian $\hat{H}_\text{free-fall}$.
The Stern-Gerlach interaction is a magnetic dipole interaction between the magnetic field and the atomic spin, with a strength set by $g_F \mu_B$, where $g_F$ is the Landé factor and $\mu_B$ the Bohr magneton. Here, we have a magnetic field that maintains the quantization axis in the $z$ direction but has a gradient $B'_0 \approx 300$ G cm$^{-1}$ in the $x$ direction. This inhomogeneous field is pulsed on for a short time $\tau$ so that we may neglect evolution from the free Hamiltonian; the resulting unitary evolution operator $\hat{U}$ imparts a spin-dependent momentum kick of magnitude $\delta p_x = g_F \mu_B B'_0 \tau$ along $x$. Our measurement process begins with our cloud in a coherent superposition of spin states $|s\rangle$, which are here expressed as eigenstates of $\hat{\sigma}_z$ with eigenvalues $s = \pm 1$. At this stage the quantum state is a product state of spatial and spin components, $|\psi_0\rangle \equiv |\alpha\rangle|s\rangle$. In the Gaussian approximation we can write our initial position-space wavefunction as a Gaussian wavepacket centered on $x = 0$ with width $\sigma$:
$$\langle x|\alpha\rangle = (2\pi\sigma^2)^{-1/4}\exp\left(-\frac{x^2}{4\sigma^2}\right).$$
In units with $\hbar = 1$, the wavefunction for atoms in spin state $|s\rangle$ after the Stern-Gerlach momentum kick from the magnetic field is $\langle x|\alpha\rangle\, e^{i s\,\delta p_x x}$, and the subsequent evolution in free fall leaves us with the probability distribution in position space
$$P_s(x, t_f) = \frac{1}{\sqrt{2\pi}\,\sigma|\beta|}\exp\left[-\frac{(x - s\,\delta p_x t_f/M)^2}{2\sigma^2|\beta|^2}\right],$$
where $\beta = 1 + i t_f/2M\sigma^2$. The two different spin states evolve as Gaussian wavepackets centered at $\pm\delta p_x t_f/M$ with a scaled width $\sigma|\beta|$, justifying our use of Gaussian Kraus operators in our analysis.
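The claim that each kicked spin component ends up as a Gaussian centered at $\pm\delta p_x t_f/M$ with width $\sigma|\beta|$ can be checked numerically. The sketch below is an illustration under stated assumptions: units with $\hbar = 1$, dimensionless values for $\delta p_x$, $\sigma$, $t_f$ and $M$ chosen for convenience (not the experimental values), and a split-step Fourier propagation standing in for the free evolution. The helper names are hypothetical.

```python
import numpy as np

def propagate_free(psi0, x, t, mass=1.0):
    """Apply exp(-i p^2 t / 2M) in momentum space (hbar = 1)."""
    dx = x[1] - x[0]
    k = 2 * np.pi * np.fft.fftfreq(x.size, d=dx)
    return np.fft.ifft(np.fft.fft(psi0) * np.exp(-1j * k**2 * t / (2 * mass)))

def kicked_packet_stats(s, dp=2.0, sigma=1.0, t_f=3.0, mass=1.0):
    """Mean and rms width of the position density for spin s = +1 or -1."""
    x = np.linspace(-40.0, 40.0, 4096, endpoint=False)
    dx = x[1] - x[0]
    psi0 = (2 * np.pi * sigma**2) ** (-0.25) * np.exp(-x**2 / (4 * sigma**2))
    psi_kicked = psi0 * np.exp(1j * s * dp * x)        # spin-dependent momentum kick
    density = np.abs(propagate_free(psi_kicked, x, t_f, mass)) ** 2
    norm = density.sum() * dx
    mean = (x * density).sum() * dx / norm
    width = np.sqrt(((x - mean) ** 2 * density).sum() * dx / norm)
    return mean, width

# Closed-form prediction from the text: center s*dp*t_f/M, width sigma*|beta|.
beta_abs = abs(1 + 1j * 3.0 / (2 * 1.0 * 1.0**2))
mean_up, width_up = kicked_packet_stats(+1)
mean_down, width_down = kicked_packet_stats(-1)
```

For these illustrative parameters the closed-form center is $\pm 6$ and the width is $\sigma|\beta| = \sqrt{3.25}$, which the numerical propagation reproduces.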

SUPPLEMENTARY NOTE 2: AN EXAMPLE OF MEASUREMENT REVERSAL
As an example of measurement reversal in a succession of weak measurements of spin state with a Stern-Gerlach apparatus, consider the sequence depicted in Supplementary Fig. 2. An atomic cloud is prepared in a coherent superposition of spin states $|\psi_a\rangle = \sqrt{1-a}\,|\psi_\downarrow\rangle + \sqrt{a}\,|\psi_\uparrow\rangle$. For a time-of-flight Stern-Gerlach process, we can expect the spin state to be correlated with position, so that the probability distribution for our readout is the sum of two Gaussians, one for each spin state; our distributions may be shifted and rescaled (here with $\sigma^2 = 1/2$). Suppose a weak measurement of spin state for the state $|\psi_a\rangle$ gives a readout $r = 0.5$. We can now use the quantum Bayesian approach to infer that the spin state is then $|\psi_b\rangle$. Suppose we could select out this portion of the cloud with a second slit, and perform a second Stern-Gerlach weak measurement on it. If our second measurement on this spin state now yields the readout $-r$, that is, $-0.5$ (which can happen with a finite probability, as determined by the probability distribution in panel (b)), we must infer that the spin state is $|\psi_c\rangle$ with the associated probability distribution; but this is just the original spin state $|\psi_a\rangle$. This is an example of successful measurement reversal.
Supplementary Figure 3: Separated cloud for different $z$, the initial state of the qubit. The panels represent $z = -1$ to $z = 1$ in increments of $dz = 0.5$. We assume $\sigma_x = 0.5$, $\sigma_y = 0.5$, $w = 1.5$, $x_{0G} = -1$, $x_{0LG} = 1$, $y_{0G} = 0$, and $y_{0LG} = 0$.
Supplementary Figure 4: Verifying the fluctuation theorem for different initial spin states, denoted by their spin expectation value $z = \langle\psi_i|\hat{\sigma}_z|\psi_i\rangle$, where $\hat{\sigma}_z$ is the Pauli spin matrix. We assume $\sigma_x = 0.5$, $\sigma_y = 0.5$, $w = 1.5$, $x_{0G} = -1$, $x_{0LG} = 1$, $y_{0G} = 0$, and $y_{0LG} = 0$. The dots indicate $\mu$ computed independently. The asymmetry in this curve follows from the asymmetry between the LG and G modes.
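The measurement reversal of Supplementary Note 2 can be illustrated with a short Bayesian update. This is a sketch under assumptions: the two Gaussian readout distributions are taken to be centered at $r = +1$ for $|\psi_\uparrow\rangle$ and $r = -1$ for $|\psi_\downarrow\rangle$ with $\sigma^2 = 1/2$, and only the population $a$ is tracked (for diagonal, real Kraus operators the relative phase of the superposition is unchanged). The function names and the value of $a$ are hypothetical.

```python
import numpy as np

def likelihood(r, s, sigma2=0.5):
    """Gaussian readout density for spin eigenvalue s = +1 or -1 (assumed centers)."""
    return np.exp(-(r - s) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

def bayes_update(a, r, sigma2=0.5):
    """Updated population of |psi_up> after observing readout r."""
    up = a * likelihood(r, +1, sigma2)
    down = (1 - a) * likelihood(r, -1, sigma2)
    return up / (up + down)

a = 0.3                       # initial population of |psi_up> in |psi_a>
b = bayes_update(a, 0.5)      # state |psi_b> after the first readout r = 0.5
c = bayes_update(b, -0.5)     # state |psi_c> after the second readout -r
```

Because the likelihood ratio for readout $r$ is $e^{2r/\sigma^2}$, the factor acquired at $r = 0.5$ is exactly cancelled at $-0.5$, so `c` returns to `a`.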

SUPPLEMENTARY NOTE 3: A QUANTUM MANY-BODY FLUCTUATION THEOREM FOR QUANTUM SPIN MEASUREMENTS USING A BOSE-EINSTEIN CONDENSATE
We verify a quantum fluctuation theorem which describes the quantum many-body dynamics of a Bose-Einstein condensate (BEC) subject to weak quantum spin measurements. A quantum coherent cloud of atoms is treated as a logical qubit in a designated spin state, where the quantum spin information is encoded spatially across the cloud. The logical qubit subsequently undergoes a quantum spin measurement via an absorption imaging technique that projects the collective atomic wavefunction into the position basis, yielding many-body statistics for the quantum measurement process, with associated spin states inferred from the quantum Bayesian update rule. For this analysis, we define an atomic cloud qubit in the initial state $|\psi_i\rangle = \alpha|\Uparrow\rangle + \beta|\Downarrow\rangle$, where the spin information is encoded spatially across the many-body quantum system: $|\Uparrow\rangle$ denotes the spin state $|\psi_\uparrow\rangle$ in the spatial mode LG$(x, y)$, and $|\Downarrow\rangle$ denotes the spin state $|\psi_\downarrow\rangle$ in the spatial mode G$(x, y)$. The state $|\psi_i\rangle$ represents the quantum coherent state of the atomic cloud just prior to the absorption imaging, which is the act of quantum measurement in our experiment. Here LG and G represent the Laguerre-Gaussian (with associated azimuthal and radial quantum numbers $\ell = 1$, $p = 0$) and Gaussian ($\ell = 0$, $p = 0$) spatial modes at the imaging plane, where $(x_{0LG}, y_{0LG})$ and $(x_{0G}, y_{0G})$ are coordinates representing the centers of the LG and G modes, $\sigma_x$, $\sigma_y$ are the Gaussian standard deviations, and $w$ is the waist of the Laguerre-Gaussian mode. We absorb all the associated phases of the Laguerre-Gaussian mode into the real function $\phi(x, y)$. We have kept the centers of the LG and G mode functions arbitrary in order to account for the possibility that a magnetic field gradient may separate them relative to each other, conditioned on the spin state $|\psi_\uparrow\rangle$ or $|\psi_\downarrow\rangle$. See Supplementary Fig. 3.
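Now a two-dimensional imaging of the cloud produces the following local quantum spin state update,
$$|\psi_i\rangle \rightarrow \frac{\hat{M}_F(x, y)|\psi_i\rangle}{\sqrt{p_F(x, y)}}, \tag{12}$$
which can be thought of as the update of the logical qubit state, where the spin state $|\psi_i\rangle = \alpha|\psi_\uparrow\rangle + \beta|\psi_\downarrow\rangle$ encodes the same quantum information as the initial logical qubit state $\alpha|\Uparrow\rangle + \beta|\Downarrow\rangle$. This allows us to characterize the irreversibility of the quantum measurement process in the experiment by considering the information dynamics resulting from the quantum measurement (cloud absorption imaging), in terms of the mapping between qubit states expressed in Supplementary Equation (12). In the basis $\{|\psi_\uparrow\rangle, |\psi_\downarrow\rangle\}$, the measurement operator $\hat{M}_F$ is defined as
$$\hat{M}_F(x, y) = \begin{pmatrix} \mathrm{LG}(x, y) & 0 \\ 0 & \mathrm{G}(x, y) \end{pmatrix}.$$
The forward probability density corresponding to the measurement process is
$$p_F(x, y) = \langle\psi_i|\hat{M}_F^\dagger\hat{M}_F|\psi_i\rangle = |\alpha|^2|\mathrm{LG}(x, y)|^2 + |\beta|^2|\mathrm{G}(x, y)|^2.$$
This probability distribution $p_F(x, y)$ is directly measured in the experiment. Similarly, the measurement is reversed by the following measurement operator [7,8],
$$\hat{M}_B(x, y) = \theta\,\hat{M}_F^\dagger\,\theta^{-1} = \begin{pmatrix} \mathrm{G}(x, y) & 0 \\ 0 & \mathrm{LG}(x, y) \end{pmatrix},$$
where $\theta$ is the time-reversal operator for a qubit. We obtain the associated backward probability density,
$$p_B(x, y) = \frac{|\mathrm{LG}(x, y)|^2\,|\mathrm{G}(x, y)|^2}{p_F(x, y)},$$
only defined at points where the forward probability density $p_F(x, y)$ is nonvanishing. Given the forward and backward probabilities, we define the statistical arrow of time for the quantum measurement process as [7][8][9]
$$Q(x, y) = \ln\frac{p_F(x, y)}{p_B(x, y)} = \ln\frac{p_F(x, y)^2}{|\mathrm{LG}(x, y)|^2\,|\mathrm{G}(x, y)|^2}.$$
Note that the relative phase between the LG and G modes does not appear in $Q(x, y)$. The arrow of time $Q(x, y)$ as a function of the position of the atoms in the final cloud absorption image can therefore be determined from the experimental data $p_F(x, y)$ and the fit parameters for the LG and G functions obtained from the experimental data. We can now use this to state the fluctuation theorem for quantum measurements in the quantum many-body context discussed above as
$$\langle e^{-Q}\rangle = \int dx\,dy\; p_F(x, y)\,e^{-Q(x, y)} = 1 - \mu,$$
where
$$\mu = \int dx\,dy\; \frac{|\langle\bar{\psi}_i|\hat{M}_F^\dagger\hat{M}_F|\psi_i\rangle|^2}{p_F(x, y)}, \tag{18}$$
and $|\bar{\psi}_i\rangle$ is the qubit state orthogonal to $|\psi_i\rangle$ [7]. See Supplementary Fig. 4, where we verify the fluctuation theorem using numerical simulations.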
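A numerical check of the fluctuation theorem in the LG-G setting can be sketched as follows. The parameter values are those quoted in the caption of Supplementary Fig. 3; everything else is an assumption made for illustration: the normalized mode densities (a Gaussian density with standard deviations $\sigma_x$, $\sigma_y$, and the standard $\ell = 1$, $p = 0$ Laguerre-Gaussian density with waist $w$), the grid, and the function names. The backward density is taken as $p_B = |\mathrm{LG}|^2|\mathrm{G}|^2/p_F$ and the integrand of $\mu$ as $|\alpha|^2|\beta|^2(|\mathrm{G}|^2 - |\mathrm{LG}|^2)^2/p_F$, which follows for a diagonal effect matrix.

```python
import numpy as np

def mode_densities(x, y, sx=0.5, sy=0.5, w=1.5,
                   x0g=-1.0, y0g=0.0, x0lg=1.0, y0lg=0.0):
    """Assumed normalized densities |LG|^2 and |G|^2 on the imaging plane."""
    g2 = np.exp(-(x - x0g) ** 2 / (2 * sx**2)
                - (y - y0g) ** 2 / (2 * sy**2)) / (2 * np.pi * sx * sy)
    rho2 = (x - x0lg) ** 2 + (y - y0lg) ** 2
    lg2 = 4 * rho2 / (np.pi * w**4) * np.exp(-2 * rho2 / w**2)
    return lg2, g2

def fluctuation_theorem(z, n=601, half_width=6.0):
    """Return (<exp(-Q)>, mu) for initial spin expectation value z."""
    grid = np.linspace(-half_width, half_width, n)
    dA = (grid[1] - grid[0]) ** 2
    x, y = np.meshgrid(grid, grid)
    lg2, g2 = mode_densities(x, y)
    a, b = (1 + z) / 2, (1 - z) / 2          # |alpha|^2 and |beta|^2
    p_f = a * lg2 + b * g2                   # forward probability density
    ok = p_f > 0                             # backward density defined only here
    p_b = np.zeros_like(p_f)
    lam = np.zeros_like(p_f)
    p_b[ok] = lg2[ok] * g2[ok] / p_f[ok]
    lam[ok] = a * b * (g2[ok] - lg2[ok]) ** 2 / p_f[ok]
    return p_b.sum() * dA, lam.sum() * dA

results = {z: fluctuation_theorem(z) for z in (-0.5, 0.0, 0.5)}
```

For each $z$, the first returned value should agree with one minus the second, and should lie strictly between zero and one.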
In the experiment, this cloud is prepared using a coherent two-photon Raman technique that associates the spin and orbital degrees of freedom of the atomic system by using beams containing orbital angular momentum (see state preparation, Supplementary Note A). The modes of these Raman beams are Gaussian ($\ell = 0$) and Laguerre-Gaussian ($\ell = 1$) modes. The cloud begins in state $|\psi_\downarrow\rangle$ in a Gaussian spatial mode, and atoms that are transferred to state $|\psi_\uparrow\rangle$ pick up a vortex phase from the Raman beams, associating the LG and G mode functions with the spin states. Note that these are not true LG and G spatial modes of the cloud in the experiment, since the true spatial mode of the cloud depends on the complex interaction with the optical fields [3]; however, to a first approximation the spatial modes follow the characteristic shapes of these functions. To obtain a better fit to the experimental data, we empirically assign a mode function to the Gaussian portion of the cloud that is a bimodal Gaussian distribution with different variances $(\sigma_x, \sigma_y)$ and $(\sigma_{x0}, \sigma_{y0})$, co-centered on $(x_{0G}, y_{0G})$. In place of the Gaussian density $|\mathrm{G}(x, y)|^2$, we then use a modified density built from these two Gaussians, where the parameter $d$ weights the bimodal distribution while ensuring overall normalization when included with the Laguerre-Gaussian portion of the cloud. The form of the effect matrix, $\hat{M}_F^\dagger(x, y)\hat{M}_F(x, y)$, used in the fluctuation theorem results uses this modified density. The association of the orbital angular momentum quantum number with spin state is unaffected by this change, as are the essential results of the fluctuation theorem. The Laguerre-Gaussian density function used is the standard form for $\ell = 1$, $p = 0$. In comparison, note that the effect matrix in the Gaussian-Gaussian Stern-Gerlach experiment for a uniformly spin-initialized atomic cloud is diagonal in the spin basis, with one Gaussian function of $r$ for each spin state, where $r$ is the scaled position readout variable as defined in the main text.
Here we provide additional data used in the discussion of the quantum many-body fluctuation theorem. The cloud is prepared in an entangled superposition state that is the initial state of our analysis. The data in Supplementary Figs. 25-36 show absorption images of the atomic cloud after a weak Stern-Gerlach measurement. Here again, since the absorption images are processed to obtain the cloud column densities, small negative values occur due to noise. These pixels are not used in the calculations of experimentally determined quantities. We fit the mode functions of Supplementary Equations (19) and (20) to the positive-valued absorption image to obtain fit parameters for the cloud to be used in our model. The irreversibility is computed from data and from theory using the fit parameters. Supplementary Fig. 41 shows the values for $\langle e^{-Q}\rangle$ and the irreversibility parameter $\mu$, where the values plotted are $\langle e^{-Q}\rangle$ (black dots) and $1 - \mu$ (brown dots) as computed using the experimental data. Also shown are the theoretically calculated values for $\langle e^{-Q}\rangle$ (blue diamonds).

SUPPLEMENTARY NOTE 5: ERROR ANALYSIS OF THE FLUCTUATION THEOREM
The fluctuation theorem is numerically verified from single-shot realizations of the experiment, and therefore each data point in Figures 2(b,d) and Figure 3 of the main text corresponds to one realization of the experiment. The cloud absorption imaging technique provides the entire statistics in a single shot, from which we extract the experimental forward probability density (directly proportional to the measured intensity) as well as the Gaussian and Laguerre-Gaussian fit parameters, which is sufficient to compute the arrow of time $Q$ from the experiment (see Equation (2) of the main text). As the number of counts recorded by the CCD camera is very large, the errors in estimating the LHS and RHS of the fluctuation theorem from the experimental data using this single-shot imaging scheme do not originate from having too few realizations of the experiment, but rather arise from imperfections in the theoretical modeling, as well as the imaging process itself. For instance, we fit the measured data for the experiment to a weighted sum of two identical Gaussians in the first example, but note that the spatial mode of the BEC wavefunction is not a true Gaussian, and only tends to a Gaussian form after expansion. Additional systematic errors arising from the absorption imaging process include saturation effects and noise due to background variations during the imaging sequence. Errors from saturation in the data are visible at the peak of some of the measured distributions; in these cases the Gaussian approximation provides a reliable extrapolation for the peak intensity value, as can be observed from the accuracy of the fit to data points near the peak. These errors do not vanish if the experiment is repeated a large number of times, as they arise from imperfections in our model and imaging process, and would tend to zero only when the theoretical fit to the data is perfect.
We identify three sources of error in our analysis which are treated as independent so the relative errors are added in computing the error bars provided in Figures 2(b,d) and Figure 3 of the main text.
The first source of error we consider is the systematic error arising from renormalization of the measured positive intensities. To account for this, we compute the relative error we make in simulating an ideal experiment, where we renormalize the simulated distribution after removing the data points that show negative intensity due to background noise in the experimental data. We compute $\langle e^{-Q}\rangle$ for both simulated data sets and compute the relative error with respect to the simulated experiment in which the data points not accounted for in the experiment are correspondingly omitted. This gives an estimate for the relative error RE(norm) from renormalization of measured positive intensities in an ideal experiment, and we obtain the contribution to the error bar in $\langle e^{-Q}\rangle_\text{exp}$ for the experimental data by multiplying RE(norm) $\times\ \langle e^{-Q}\rangle_\text{exp}$. With the renormalized experimental data, we next estimate the error arising from the imperfect fit to our theory model in the following manner. We can rewrite the LHS of the fluctuation theorem as
$$\langle e^{-Q}\rangle = \sum_i p_B(r_i)\,dr_i \approx dr\sum_i y_i, \qquad y_i \equiv p_B(r_i), \tag{22}$$
where $p_B(r_i)\,dr_i$ is the backward probability around the discrete (binned) readout $r_i$ having bin width $dr_i \approx dr$, a constant determined by the pixel size and the scaling we choose. We partially account for the error due to the negative intensity points which are excluded in our analysis as the error in estimating $dr$. We estimate the relative error in $dr$ as $\mathrm{RE}(dr) = dr^{-1}\left(|\max(r) - \min(r)|/(\text{number of bins}) - dr\right)$. A similar approach is taken for the LG-G data, where we replace $dr$ with $dA$, the two-dimensional area element, and the relative error in this area element is similarly estimated from the area missing due to excluding the negative intensity points. To this, we add the relative error from the sum of backward probabilities shown in Supplementary Equation (22) to estimate the contribution to the error in our estimate for $\langle e^{-Q}\rangle$ from irregularities in the image taken.
This is computed in the following manner. Each of the data points $y_i$ can be computed from the experimental data, as well as from the theory fit we use, giving the theory prediction $y_i^0$ for each $i$. See Supplementary Fig. 37(c), where the points $y_i$ and $y_i^0$ are shown for one example. A difficulty is that because the experimental data we use is from a single shot, the error bar $\sigma_i$ for each of the data points $y_i$ is not provided. Nevertheless, we can estimate this error bar from the $\chi^2$ statistic,
$$\chi^2 = \sum_{i=1}^{N}\frac{(y_i - y_i^0)^2}{\sigma_i^2}.$$
In order to do so, we make the following assumptions: (1) each of the $\sigma_i$ is identical and equal to $\sigma_0$, and (2) $\chi^2$ evaluated per degree of freedom is equal to one. This is equivalent to assuming that if we were to repeat the experiment many times, each of the data points $y_i$ would be distributed as a Gaussian with approximately the same width. Such an approximation for $\sigma_0$ when $\sigma_i$ is not available is typically used to estimate the errors in the fit parameters (see for instance Appendix D of [10]). We therefore have
$$\sigma_0^2 = \frac{1}{M}\sum_{i=1}^{N}(y_i - y_i^0)^2.$$
We denote the number of degrees of freedom by $M = N - P$, where $P$ is the number of constraints which relate the data points $y_i$. The number of constraints $P$ can be approximated by the number of fit parameters we use in the problem, which determine the relations between the data points $y_i$. Now $\sigma_0$ is the error associated with each $y_i$, but we are interested in providing error bars $\pm\sigma_\text{LHS}$ for the sum, $\langle e^{-Q}\rangle_\text{exp} \approx dr\sum_i y_i$. Owing to the assumption that each $\sigma_i \approx \sigma_0$, the relative error for $\langle e^{-Q}\rangle_\text{exp}$ is estimated as the sum
$$\frac{\sigma_\text{LHS}}{\langle e^{-Q}\rangle_\text{exp}} = \mathrm{RE(norm)} + \mathrm{RE}\Big(\sum_i y_i\Big) + \mathrm{RE}(dr).$$
For the relative error $\mathrm{RE}(\sum_i y_i)$, we use a conservative estimate based on the prior analysis, where $\chi^2$ per degree of freedom is equated to one: $\mathrm{RE}(\sum_i y_i) = \sqrt{N}\,\sigma_0/\sum_i y_i$. Note that the contribution to the error from $\mathrm{RE}(\sum_i y_i)$ would vanish if the data were perfectly described by the theoretical model, i.e., when $y_i = y_i^0$.
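The $\chi^2$-based estimate described above can be sketched in a few lines. The arrays `y` and `y_fit` below are made-up numbers used only to exercise the formulas $\sigma_0^2 = \sum_i (y_i - y_i^0)^2/(N - P)$ and $\mathrm{RE}(\sum_i y_i) = \sqrt{N}\sigma_0/\sum_i y_i$; the function names are hypothetical.

```python
import numpy as np

def sigma0_from_fit(y, y_fit, n_params):
    """Common error bar sigma_0, assuming chi^2 per degree of freedom equals one."""
    dof = y.size - n_params                  # M = N - P
    return np.sqrt(np.sum((y - y_fit) ** 2) / dof)

def relative_error_of_sum(y, y_fit, n_params):
    """Conservative relative error sqrt(N) * sigma_0 / sum(y) for the sum of the y_i."""
    return np.sqrt(y.size) * sigma0_from_fit(y, y_fit, n_params) / np.sum(y)

# Illustrative (not experimental) data and fit values.
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_fit = y + np.array([0.1, -0.1, 0.1, -0.1, 0.0])
s0 = sigma0_from_fit(y, y_fit, n_params=1)
re_sum = relative_error_of_sum(y, y_fit, n_params=1)
re_perfect = relative_error_of_sum(y, y.copy(), n_params=1)
```

As stated in the text, the contribution vanishes when the fit reproduces the data exactly (`re_perfect` is zero).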

SUPPLEMENTARY NOTE 6: COMPARISON FOR THE PROBABILITY DISTRIBUTION p(Q) FROM REPEATED QUBIT MEASUREMENTS ON A SINGLE QUBIT
In the methods section of the main text, we briefly discussed how the probability distribution $p(Q)$ is computed from the experimental data for spin measurements with a uniformly initialized cloud. Here we detail the process. First, we compute the arrow of time $Q(r_i)$ for each of the data points $r_i$ by using Equation (2) of the main text, and the corresponding probabilities $p(r_i)\,dr$. Computing the arrow of time $Q(r_i)$ from Equation (2) of the main text requires that we square the measured probability density $p(r_i)$, divide by the determinant of the effect matrix $\hat{M}^\dagger(r)\hat{M}(r)$, and then take the logarithm of this ratio. Note that the effect matrix does not depend on the initial spin of the system, and only depends on the measurement strength, which in turn depends on the separation between the identical Gaussian clouds, as well as the standard deviation of the Gaussian. Therefore the denominator can be thought of as a form factor for the measurement process. This simple dependence of $Q(r_i)$ on the measured quantities makes the LHS of the fluctuation theorem, $\langle e^{-Q}\rangle$, straightforward to compute from the experimental data itself.
In addition, we are interested in knowing how $Q$ is distributed in probability, as this distribution contains useful information regarding the average arrow of time $\langle Q\rangle$, and also shows instances where the arrow of time is estimated backward (negative $Q$). Such information is not immediately accessible from the probability distribution of the readout $p(r)$. In order to construct the probability distribution $p(Q)\,dQ$ from $p(r)\,dr$, we compute $Q(r)$ and $p(Q(r))$ for each $r$. In this step, it is possible that different $r$ values have similar $Q$, with different probabilities $p(r_1)\,dr$ and $p(r_2)\,dr$. Binning the probabilities of $Q$ after ordering the list by increasing $Q$ ensures that $p(Q)\,dQ$ represents the probability summed over such points $r_1$ and $r_2$ for a given $Q$ within the range $dQ$. The probability distribution $p(Q)$ shown in the main text as well as the supplement corresponds to binning the nearest four values of $p(Q)$, with the corresponding average $Q$ along the $x$ axis. The probability distribution $p(Q)$ can also be computed analytically for the case when $z = 0$, given in Equation (3) of the main text. To demonstrate the consistency of our approach for other cases as well, in Supplementary Fig. 37 we have generated the probability distribution shown in Figure 2(c) of the main text assuming the equivalent measurement process on a single qubit, repeated many times to obtain the statistics.
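The construction of $p(Q)$ from $p(r)$ described in this note can be sketched as follows. This is an illustration on a synthetic readout distribution, not on experimental data. It assumes two Gaussian readout densities centered at $r = \pm 1$ with $\sigma^2 = 1/2$, a diagonal effect matrix whose determinant is their product, and $Q(r) = \ln[p(r)^2/\det(\hat{M}^\dagger\hat{M})]$ as described above; the grid and function names are hypothetical.

```python
import numpy as np

def gaussian(r, center, sigma2):
    return np.exp(-(r - center) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

def q_distribution(z, sigma2=0.5, n_points=400, r_max=8.0, group=4):
    """Return (average Q per bin, probability per bin) after sorting by Q."""
    r = np.linspace(-r_max, r_max, n_points)
    dr = r[1] - r[0]
    g_up, g_down = gaussian(r, 1.0, sigma2), gaussian(r, -1.0, sigma2)
    p_r = 0.5 * (1 + z) * g_up + 0.5 * (1 - z) * g_down   # readout density p(r)
    q = np.log(p_r**2 / (g_up * g_down))                  # arrow of time Q(r)
    order = np.argsort(q)                                 # order the list by increasing Q
    q_sorted, prob_sorted = q[order], (p_r * dr)[order]
    n_keep = (n_points // group) * group
    # Group the nearest `group` values: sum their probabilities, average their Q.
    q_binned = q_sorted[:n_keep].reshape(-1, group).mean(axis=1)
    p_binned = prob_sorted[:n_keep].reshape(-1, group).sum(axis=1)
    return q_binned, p_binned

q_bins, p_bins = q_distribution(z=0.6)
mean_q = np.sum(p_bins * q_bins)                # approximate average arrow of time
exp_minus_q = np.sum(p_bins * np.exp(-q_bins))  # approximate <exp(-Q)>
```

For this synthetic example with $z = 0.6$, some bins have negative $Q$ (rare backward inferences), while the average `mean_q` is positive and `exp_minus_q` lies between zero and one.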

SUPPLEMENTARY NOTE 7: VERIFYING THE FLUCTUATION THEOREM FOR THE EXPERIMENTAL DATA
The results in the main text pertaining to the fluctuation theorem represented in Figures 2(b,d) and Figure 3 provide evidence for the presence of absolute irreversibility in the quantum measurement process, where we compute the LHS of the fluctuation theorem, $\langle e^{-Q}\rangle$, and show that it is a number $1 - \mu$ between zero and one. A difficulty with the observed data is that $\mu$, appearing in the RHS of the fluctuation theorem, cannot be estimated in a simple manner directly from the experimental data. If we look at the quantities appearing in the definition of $\mu$ given in Supplementary Equation (18), the denominator of the integrand is the forward probability density, which is directly measured, but the numerator depends not only on the model we assume for the measurement process, but also on the initial quantum state $|\psi\rangle$, as well as its orthogonal quantum state $|\bar{\psi}\rangle$. Because of this additional dependence, estimating this quantity directly from the experimental data we have is not straightforward. A different experimental configuration of our setup which may allow directly measuring $\mu$ independently of the LHS is also not immediately evident.
Regardless, we take a hybrid approach where we estimate the numerator based on the fit parameters we obtain and evaluate the denominator directly from the measured data, and evaluate the integral expression in Supplementary Equation (18) as a sum. A similar approach is taken for the Gaussian-Gaussian Stern-Gerlach experiment as well. In both cases, our estimate for $\mu$ deviates from the model predictions to some extent, while the same data provides good agreement for the LHS of the fluctuation theorem, providing excellent evidence for the existence of absolute irreversibility in the measurement process. We associate the better agreement of our methods for estimating the LHS of the fluctuation theorem with the nature of the quantity being computed. The measure $\langle e^{-Q}\rangle$ is similar to estimating the moment generating function $\langle e^{kQ}\rangle$ for the probability distribution at the value $k = -1$. Moment generating functions by definition contain information from all the higher-order moments of the distribution as well; this therefore provides a more accurate prediction of the LHS. The RHS computed from Supplementary Equation (18) does not appear to have such an immediate correspondence, partially explaining why our methods give good predictions for the LHS of the fluctuation theorem. This is demonstrated in Supplementary Fig. 38, Supplementary Fig. 39, and Supplementary Fig. 41, where the error bars for $\mu$ are computed by following the same approach we detailed for computing the error bars in $\langle e^{-Q}\rangle_\text{exp}$. Supplementary Fig. 40 additionally shows the corresponding data set for one example, which indicates that for the same data set, the estimates summed over to obtain $\langle e^{-Q}\rangle_\text{exp}$ agree rather well with the corresponding fit prediction, while the estimates summed over to obtain $\mu$ incur more errors.
Supplementary figure caption: While the probability distribution is available in closed form from [9] for the case when $z = 0$, it is not straightforward to obtain this distribution for arbitrary $z$. Therefore, to demonstrate the validity of our numerical approach, we compare the distribution we obtain from numerical methods (red curve) to the probability distribution constructed by assuming that a single qubit is measured many times for the same initial conditions (grey distribution), as typically done in superconducting qubit platforms [9,11]. We find good agreement as shown above.
Supplementary figure caption: We define $\langle e^{-Q}\rangle = \int dr\, p_B(r)$ and $\mu = \int dr\, \lambda(r)$, where both $\lambda$ and $p_B$ are computed by including additional assumptions about the measurement process, such as that the measurement is described by the Gaussian measurement operators in the Stern-Gerlach experiment discussed in Fig. 2 of the main text. While we get a good estimate for the LHS, $\langle e^{-Q}\rangle$, we note that the RHS computed in the same manner is not as accurate. This is because estimating $\langle e^{-Q}\rangle$ for a probability distribution makes use of information about all the higher-order moments of the distribution (as it is like the moment generating function $\langle e^{kQ}\rangle$ of the probability distribution $p(Q)$), while our method to estimate the RHS independently does not have such an immediate correspondence. This can be seen from the comparison to theory shown in (a) and (b) for one data point corresponding to Supplementary Fig. 14