Transverse spin forces and non-equilibrium particle dynamics in a circularly polarized vacuum optical trap

We provide a vivid demonstration of the mechanical effect of transverse spin momentum in an optical beam in free space. This component of the Poynting momentum was previously thought to be virtual, and unmeasurable. Here, its effect is revealed in the inertial motion of a probe particle in a circularly polarized Gaussian trap, in vacuum. Transverse spin forces combine with thermal fluctuations to induce a striking range of non-equilibrium phenomena. With increasing beam power we observe (i) growing departures from energy equipartition, (ii) the formation of coherent, thermally excited orbits and, ultimately, (iii) the ejection of the particle from the trap. As well as corroborating existing measurements of spin momentum, our results reveal its dynamic effect. We show how the under-damped motion of probe particles in structured light fields can expose the nature and morphology of optical momentum flows, and provide a testbed for elementary non-equilibrium statistical mechanics.

In the following section we briefly review the various ways in which optical momentum can be decomposed into distinct components, and how those components couple to matter. We provide a qualitative justification for our particular choice. Next, we show how the elected spin and orbital components of the momentum are distributed within counter-propagating, circularly polarized beams.

A. Momentum decompositions and coupling to matter
In this article, we claim that the observation of a transverse spin force is evidence of transverse spin momentum. This association is not trivial: optical momentum can be decomposed in various ways, and the ways in which these components couple to matter are open to interpretation. Furthermore, momentum is a point-wise property of the field, while forces are integral quantities. We note that the connection between optical force and momentum is a fundamental issue, affecting the interpretation of all such experiments. The relationship between optical torque and spin angular momentum is analogous: circularly polarized light was first observed to rotate objects by Beth in 1936 [1], yet the interpretation of this experiment continues to be debated [2].
Under the conditions of our experiment (e.g. quasi-paraxial light beams incident on spherical, dielectric particles in vacuum), the possibilities are restricted. Below, we provide a rationale for the interpretation of our experimental results.
The Poynting momentum can be decomposed into distinct components in various ways. In continuous media, careful consideration of this issue underpins discussions of the Abraham-Minkowski dilemma, e.g. [3,4]. In vacuum, decompositions are less varied and most seek to separate out a spin component. This can be achieved with emphasis placed on the electric field (as we do in this article), the magnetic field or on their average [5,6]. The choice of gauge provides a further degree of freedom [7]. We interpret our results in terms of Eq. (1a), for the following reasons:
2. The force on a small, finite particle can be decomposed into parts that can be directly identified with the underlying components of momentum (see Eqns. (9,10)). This provides an immediate association with the decomposition Eq. (1a), and also with the Coulomb gauge, which gives the force due to the orbital component of the momentum.
3. For Mie particles this formal association between spin force and spin momentum is lost. Nevertheless, a transverse force on a sphere requires a source of transverse momentum.
4. For dielectrics, the decomposition Eq. (1a) has been shown to make sense in a variety of experimental systems, including for spheres and slabs in the Mie regime [7][8][9].
5. $\mathbf{p}^S_E$ and $\mathbf{p}^O_E$ have been shown to couple to dielectrics in the Mie regime in qualitatively different ways. For instance, consideration of the optical forces on finite flat plates indicates that $\mathbf{p}^S_E$ couples to matter via an edge effect [8].
Decomposing the Poynting momentum via the electric field [5,10], we have, in SI units: where $c$ is the speed of light, and $\epsilon_0$ the permittivity, in vacuum. The suffixes $E$ and $H$ are suppressed from here on. The first part, $\mathbf{p}^O$, is independent of polarization and referred to as orbital momentum. It is equivalent to the canonical momentum derived from Noether's theorem, expressed in the Coulomb gauge. The second term, $\mathbf{p}^S$, is connected with inhomogeneous circular polarization and referred to as spin momentum. The ways in which these two contributions couple to matter are controversial. This is especially so for the spin term, whose appearance in Eq. (2) is a consequence of the symmetrization of the canonical stress-energy tensor [7,11], and which was previously thought virtual.
Optical momentum in counter-propagating beams consists of the direct sum of the momenta associated with each independent beam, and an additional term associated with interference. Owing to the obvious symmetries in the geometry, the axial momentum of one beam effectively cancels that of the other. The remaining momentum consists of azimuthal components of spin and orbital momentum that circulate around the beam symmetry axis. Below we evaluate these components of the Poynting vector for single and counter-propagating beams, showing that the azimuthal momentum is dominated by the spin term, such that $p^S_\varphi/p^O_\varphi \sim 2kz_R$, where $z_R$ is the Rayleigh range, related to the beam waist, $W_0$, by $z_R = kW_0^2/2$. For the beams used in our experiment, $kz_R \approx 150$.
Equation (3) gives the electric field in a circularly polarized beam, up to the first non-paraxial term. With, In the Coulomb gauge the total, spin and canonical momentum are: For a single beam, the azimuthal components of the spin and canonical momenta are: The second two factors on the right of Eq. (5b) are negligible (smaller by a factor of $\approx kz_R$). Combining Eq. (5) with Eq. (3) gives, Thus, as the beams are defocused, and the Rayleigh range increased, the ratio of the spin to canonical momentum in a single circularly polarized Gaussian beam increases in proportion to $kz_R$; this ratio is independent of position within the beams.
Counter-propagating beams contain a series of interference fringes which modulate the intensity and momentum fields of isolated beams. For a counter-propagating system with an electric field described by the sum of Eq. (3) with its reflection through the $z = 0$ plane, the modulus of the electric field is: so that bright fringes appear at $z \approx 0, \lambda/2, \lambda, 3\lambda/2, \ldots$ Eq. (7) is a good approximation for near-paraxial beams. However, it fails qualitatively in one important respect. Counter-propagating plane waves contain dark fringes for which the intensity vanishes on extended planes. Even weak focusing destroys this condition. In the dark fringes of near-paraxial counter-propagating waves, the intensity approaches zero, but reaches this only at discrete points on the beam axis. Including the appropriate interference terms, the azimuthal spin and canonical momentum are, to lowest order: Thus, in near-paraxial, counter-propagating, circularly polarized Gaussian beams, the mean ratio of transverse spin to transverse canonical momentum is $2kz_R$, with the canonical component becoming small in the bright interference fringes.
Although these equations provide a good approximation to the momenta, they suggest an intriguing paradox. The canonical momentum falls to zero in each bright interference fringe, but reaches a maximum in each dark fringe. This apparent irregularity is resolved by observing that the wave-front curvature prevents the intensity falling to zero in a dark fringe except at discrete points on the axis, where the azimuthal momentum also vanishes. Nevertheless, the canonical momentum density in the dark fringes can become surprisingly large, even though its absolute value remains small. Additionally, it should be noted that, whilst the azimuthal canonical momentum is susceptible to strong interference effects, the azimuthal spin of the counter-propagating beams is approximately the sum of that for the two separate beams with interference having a marginal effect.
Throughout the beam, the azimuthal spin momentum is larger than the canonical momentum by a factor of at least $kz_R$ ($\approx 150$ for our beams), the mean ratio being $\approx 300$. Numerical evaluation of the Richards-Wolf representation of the counter-propagating beams shows that the ratio of the momenta in a bright fringe is $\approx (kz_R)^2$, Fig. (1).
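The scalings quoted above can be checked with a few lines of arithmetic. The sketch below uses only the relation $z_R = kW_0^2/2$ and the quoted value $kz_R \approx 150$; the function names are illustrative and not part of the original analysis.

```python
import math

def waist_in_wavelengths(k_zR: float) -> float:
    """Return W0/lambda from the dimensionless product k*z_R, using z_R = k*W0**2/2."""
    k_W0 = math.sqrt(2.0 * k_zR)       # k*W0, since k*z_R = (k*W0)**2 / 2
    return k_W0 / (2.0 * math.pi)      # W0/lambda, because k = 2*pi/lambda

def mean_spin_to_canonical_ratio(k_zR: float) -> float:
    """Mean azimuthal spin-to-canonical momentum ratio, 2*k*z_R (lowest order)."""
    return 2.0 * k_zR

k_zR = 150.0                                   # value quoted for the experimental beams
print(waist_in_wavelengths(k_zR))              # beam waist in units of the wavelength
print(mean_spin_to_canonical_ratio(k_zR))      # mean ratio quoted in the text
```

For $kz_R = 150$ this corresponds to a waist of a little under three wavelengths and a mean ratio of 300, in line with the figures given above.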

Supplementary Note 2. OPTICAL FORCES
A. Small particle approximation
The spin and orbit (or, canonical) components of the Poynting momentum couple to matter in fundamentally different ways. For instance, the optical pressure applied to a flat plate by $\mathbf{p}^S$ does not scale with the exposed area of the plate, suggesting that coupling of $\mathbf{p}^S$ is an edge effect [8]. Similarly, the force on a small particle manifests as a finite size effect [7]. To lowest order, the optical scattering force on a polarizable point particle is directly proportional to $\mathbf{p}^O$ [7]: Optical spin forces appear at the next highest order [7,12]: $\alpha_{e/m}$ is the electric/magnetic polarizability which, for dielectric particles, is approximately: Here, $\alpha^0_{e/m}$ are the Clausius-Mossotti polarizabilities for dielectric particles with permeability equal to one. $\alpha_{e/m}$ are the polarizabilities, corrected for finite size. These latter polarizabilities contain an imaginary part, even for non-absorbing particles, that produces the scattering force captured by Eq. (9).
Comparing Eq. (9) and Eq. (10), the ratio of the force due to $\mathbf{p}^S$ to that due to $\mathbf{p}^O$ is: where it is assumed $p^S \gg p^O$, as discussed above. The term on the far left corresponds to the case of a silica particle ($\epsilon_p = 2.1$) in vacuum. Thus, as the particle size increases, spin forces begin to dominate. In accordance with Eq. (10), these forces act in the opposite direction to the Poynting vector. Figure (2) shows this transition. For a very small particle [...]
Figure 2. Azimuthal forces at $r = W_0/2$, exact and approximate, for particle radii including $a = 0.02$ µm and $a = 0.08$ µm.
[...] weakly with $z$, and that the ratio $f^{opt}_\varphi/f^{opt}_r \approx 0.17$ is approximately constant throughout the trap.

Supplementary Note 3. LANGEVIN EQUATION OF MOTION
In general, the motion of the spherical bead is given by a Langevin equation (see, for example, [13]), Eq. (13).
$\mathbf{F}^{opt}$ is the absolute optical force as a function of position, $\mathbf{r}$. For a particular beam, all of the optical forces are directly proportional to the total optical power. Elsewhere, lower case letters will be used to indicate the force per unit power, e.g. $\mathbf{F}^{opt} = P\mathbf{f}^{opt}$. $\mathbf{F}^L(t)$ is an uncorrelated fluctuation, Eq. (14),
where $\xi = 6\pi\mu a$ is the usual Stokes coefficient for a sphere of radius $a$ and $\mathbf{I}$ is a unit matrix.
Under vacuum conditions the viscosity, µ is a function of the ambient pressure [14], Eq. (15).
$\mu_0 = 1.9 \times 10^{-5}$ Pa s is the viscosity of air at atmospheric pressure, $K_n = l/a$ the Knudsen number (with $l$ the mean free path) and $c_K$ a factor: The pressure dependence enters via the mean free path as follows: where $p_0$ is atmospheric pressure, and $p_a$ is the actual pressure in the chamber.
In what follows, we prefer to use measured values of the drag coefficients, see section Supplementary Note 5.
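For orientation, the continuum (atmospheric pressure) value of the Stokes coefficient and the corresponding strength of the thermal force can be evaluated directly. The sketch below uses $\xi = 6\pi\mu a$, the quoted $\mu_0$, and the sphere radius $a = 0.77$ µm quoted elsewhere in this supplement. The temperature of 300 K is an assumption made purely for illustration, and no reduced-pressure correction is applied.

```python
import math

K_B = 1.380649e-23   # J/K, Boltzmann constant
MU_0 = 1.9e-5        # Pa s, viscosity of air at atmospheric pressure (quoted above)

def stokes_drag(viscosity: float, radius: float) -> float:
    """Stokes coefficient xi = 6*pi*mu*a, in kg/s."""
    return 6.0 * math.pi * viscosity * radius

def thermal_force_strength(xi: float, temperature: float) -> float:
    """Prefactor 2*k_B*T*xi of the delta-correlated Langevin force."""
    return 2.0 * K_B * temperature * xi

a = 0.77e-6                          # m, sphere radius used in the experiment
xi_atm = stokes_drag(MU_0, a)        # continuum value, no Knudsen correction
print(xi_atm)
print(thermal_force_strength(xi_atm, 300.0))   # 300 K is an assumed temperature
```

Under vacuum the drag is far smaller than this continuum value, which is one reason measured drag coefficients are preferred.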
Figure 3. (a) Variation of the azimuthal force with the axial coordinate, at a radial distance $r = W_0/2$; (b) radial and azimuthal forces at $z = \lambda/4$; (c) radial and azimuthal forces at $z = 0$.

A. Linear Regime, Characteristic Frequencies
For small displacements, the optical force field can be linearized about the mechanical equilibrium position, on the beam axis. In Cartesian coordinates, where $\mathbf{K}$ is a non-symmetric two-dimensional tensor given by the sum of an isotropic diagonal part, $K_r\mathbf{I}$, associated with the conservative gradient force, and a skew-symmetric part, with entry $K_\varphi$, associated with the non-conservative spin force. $K_\varphi$ can be positive or negative, depending on the handedness of the circular polarization. The power-normalised stiffness tensor is written $\mathbf{k}$, with spin and gradient coefficients $k_\varphi$, $k_r$, so that $\mathbf{F}^{opt} = P\mathbf{f}^{opt} = -\mathbf{K}\mathbf{r} = -P\mathbf{k}\mathbf{r}$. The linearized equation of motion is then: Or, in the frequency domain: where $\mathbf{R}$ is the Fourier transform of the Cartesian coordinates of position, $\mathbf{r}$.
The motion of the trapped particle can be described in terms of characteristic frequencies, $\omega = \omega_c$, obtained by setting the determinant of the left hand side matrix, Eq. (20a), to zero, e.g.
where $\omega_0 = \sqrt{K_r/m}$ is the conventional resonant frequency of a harmonic oscillator with spring constant $K_r$. There are four distinct values of $\omega_c$, corresponding to the different permutations of the $\pm$ signs. To simplify the notation we write, and, Then, the full set of characteristic frequencies are: In general, the real parts of the $\omega_c$ relate to oscillatory motion, and the imaginary parts either to growth or decay, depending on the sign. The exact expression for $\omega_c$, Eq. (21b), reveals a fundamental difference between motion in a linearly polarized trap, where $K_\varphi = 0$, and a circularly polarized trap, where $K_\varphi \neq 0$. We discuss these cases separately below.
Linear polarization, $K_\varphi = 0$: In this case, the $\omega_c$ are degenerate with $\omega_{c+} = -\omega^*_{c-}$ and $\omega_{c-} = -\omega^*_{c+}$, leaving two unique values, $\omega_{c+}$ and $\omega_{c-}$. In the limit of low optical power (or high viscosity), $\xi^2 \gg 4mK_r$, so that $\Pi$ and, therefore, $\omega_c$ are purely imaginary and the motion consists of relaxation only. This is the over-damped regime. As the optical power is increased, $K_r = Pk_r$ increases and the first term becomes positive when $4mK_r > \xi^2$, so that $\omega_c$ acquires a real component, corresponding to oscillatory behaviour. In this under-damped regime, the motion corresponds to a damped oscillation at a frequency $\omega \approx \sqrt{K_r/m}$ that increases with $\sqrt{P}$, with an invariant decay constant, $\xi/2m$.
Circular polarization, $K_\varphi \neq 0$: In this case, there are four distinct $\omega_c$. Since $\Pi \propto \sqrt{P}$, the $\omega_c$ are similar to those for the $K_\varphi = 0$ case when $P$ is small, e.g. for small $P$, $\omega_c \approx \pm\sqrt{K_r/m} + i\xi/2m$. As the optical power is increased, both the real and imaginary parts deviate increasingly from these values. Most significantly, the imaginary parts of two of the $\omega_c$ approach zero with increasing optical power. In particular, for $K_\varphi > 0$, $\Im(\omega_{c-})$ approaches zero, and reaches it when the optical power, $P$, attains a threshold value $P = P_{thr}$,
where $\Omega_{thr}$ is the value of $\Re(\omega_{c-})$, the oscillatory response of the system, at $P = P_{thr}$.
Equivalently, $\Im(\omega_{c+})$ approaches zero and $\Re(\omega_{c+}) \to \Omega_{thr}$ at $P_{thr}$ when $K_\varphi < 0$. Since $\Im(\omega_c)$ quantifies the time constant for the damped oscillations, this process signifies a transition to sustained, driven vibration in which the viscous drag of the ambient gas is compensated for by the driving force, in this case provided by optical spin momentum.
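The behaviour of the characteristic frequencies can be explored numerically. The sketch below is a minimal illustration, not the authors' code. It assumes the linearized model $m\ddot{\mathbf{r}} + \xi\dot{\mathbf{r}} + \mathbf{K}\mathbf{r} = \mathbf{F}^L$ with $\mathbf{K}$ having diagonal entries $K_r$ and off-diagonal entries $\pm K_\varphi$, and a time dependence $e^{i\omega t}$, so that a positive imaginary part means decay (consistent with $\omega_c \approx \pm\sqrt{K_r/m} + i\xi/2m$ at low power). Under these assumptions the zero-determinant condition is $(K_r - m\omega^2 + i\xi\omega)^2 + K_\varphi^2 = 0$, and requiring a purely real root gives $P_{thr} = \xi^2 k_r/(m k_\varphi^2)$ and $\Omega_{thr} = \xi k_r/(m|k_\varphi|)$. These closed forms are derived here from the stated assumptions and should be checked against Eq. (25).

```python
import cmath

def characteristic_frequencies(m, xi, K_r, K_phi):
    """Four roots of (K_r - m*w**2 + 1j*xi*w)**2 + K_phi**2 = 0."""
    roots = []
    for s_phi in (+1, -1):                 # the two factors of the determinant
        disc = cmath.sqrt(-xi**2 + 4.0 * m * (K_r + s_phi * 1j * K_phi))
        for s in (+1, -1):                 # the two roots of each quadratic
            roots.append((1j * xi + s * disc) / (2.0 * m))
    return roots

def threshold_power(m, xi, k_r, k_phi):
    """Power at which one pair of roots becomes purely real (assumed model)."""
    return xi**2 * k_r / (m * k_phi**2)

def threshold_frequency(m, xi, k_r, k_phi):
    """Oscillation frequency at threshold, sqrt(P_thr*k_r/m)."""
    return xi * k_r / (m * abs(k_phi))

# Dimensionless illustrative parameters (hypothetical values).
m, xi, k_r, k_phi = 1.0, 0.2, 1.0, 0.1
P_thr = threshold_power(m, xi, k_r, k_phi)
for P in (0.25 * P_thr, 0.9 * P_thr, P_thr):
    w = characteristic_frequencies(m, xi, P * k_r, P * k_phi)
    print(P, min(z.imag for z in w))       # smallest decay rate tends to zero
```

In this sketch the smallest imaginary part falls continuously to zero as $P \to P_{thr}$, while the other pair of roots becomes more strongly damped.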
The thermal motion of the particle, particularly the distribution of the particle energy in frequency space, can be understood through the generalised power spectrum, described below.

B. Power Spectrum
In terms of the optical force field, the beam axis is a stable equilibrium position, where the forces vanish and are locally restoring. In this configuration, the particle is exposed to thermal fluctuations that take the form of white noise. The amplitude of the resulting motion depends on the susceptibility of the trap, which is frequency dependent. Thermal forces operating at frequencies at, or close to, the characteristic frequencies described above will have a greater effect on the motion. When the optical power is such that a characteristic frequency is purely real or has a small imaginary part, high amplitude motion and instability will result. These factors are quantified through the power spectrum.
From Eq. (20), we have Combining with Eq. (14) gives: where the superscript $H$ indicates the Hermitian transpose and, The power spectral density is: This can be expanded in the characteristic frequencies, $\omega_c$. The factor $\mathbf{M}^H$ is associated with another set of characteristic frequencies, given simply by the negative values, $-\omega_c$, leading to, where $\omega_{c,i}$ label the characteristic frequencies, $\omega_c$, and $\nu_\pm$ are the characteristic frequencies when $K_\varphi = 0$, i.e., For brevity we write, with $\Lambda = -\xi^2 + 4mK_r$. As with the previous discussion of the characteristic frequencies, the power spectrum depends qualitatively on whether or not $K_\varphi = 0$. We discuss these two cases separately, below.
Linear polarization, $K_\varphi = 0$: In this case, the cross terms in $\langle\mathbf{R}(\omega)\mathbf{R}^*(\omega)\rangle$ disappear, and the characteristic frequencies are degenerate, leaving only two independent values, $\omega_{c+}$ and $\omega_{c-}$, equivalent to $\nu_\pm$ when $K_\varphi = 0$. The power spectrum consists only of the diagonal terms $\langle X(\omega)X^*(\omega)\rangle = \langle Y(\omega)Y^*(\omega)\rangle$, which take the familiar form of a Lorentzian: This function consists of a single peak, which attains its maximum when $\omega^2 = \Re(\nu_+^2) = \Re(\nu_-^2)$. In the under-damped regime, this is $\omega \approx \sqrt{K_r/m}$, and the peak height is, As usual, the peak width at half the maximum is given by the imaginary part of $\omega_c$, i.e. $\xi/2m$.
Circular polarization, $K_\varphi \neq 0$: In this case, the cross terms in the power spectrum do not vanish, and all four characteristic frequencies are distinct. As described above, we expect the imaginary parts of two of the characteristic frequencies to approach zero as the optical power is increased towards $P_{thr}$: for $K_\varphi > 0$, $\Im(\omega_{c-}) \to 0$ (and $\Im(-\omega^*_{c-}) \to 0$) as $P \to P_{thr}$. When $K_\varphi < 0$, $\Im(\omega_{c+}) \to 0$. The power dependence of the power spectrum can be appreciated through the approximate form of the characteristic frequencies, Eq. (21c), which, in terms of the optical power, $P$, can be written: where $P_{thr}$ is the threshold power, Eq. (25a). For low optical power, the $\omega_c$ are close to the characteristic frequencies when $K_\varphi = 0$ (i.e. $\omega_{c\pm} \approx \nu_\pm$) and the dependence of the power spectrum on the optical power is similar to that in the case of linear polarization, i.e. the peak height decreases according to $\approx 1/P$, and the width remains approximately constant.
As $P$ increases towards $P_{thr}$, this behaviour changes qualitatively. Writing $\delta = (P - P_{thr})$, the characteristic frequencies are approximately, and the peak heights of the diagonal terms in the power spectrum are approximately, As the optical power, $P$, approaches the threshold power, $P_{thr}$, the peak in the power spectrum grows in height according to $1/(P - P_{thr})^2$ and decreases in width according to $\Im(\omega_c) \propto (P - P_{thr})$.
In summary, for linearly polarized beams, the peak height in the power spectrum decreases with optical power, $\propto 1/P$, and the width remains constant. For circularly polarized beams, the same behaviour is obtained when $P \ll P_{thr}$ but, as $P$ approaches $P_{thr}$, the peak height grows as $1/(P - P_{thr})^2$ while the width decreases with $(P - P_{thr})$.
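The two regimes summarised above can be reproduced with a short numerical sketch. It assumes the same linearized model as the text, with the response matrix having diagonal entries $D = K_r - m\omega^2 + i\xi\omega$ and off-diagonal entries $\pm K_\varphi$, and a white thermal force of strength $2k_BT\xi$. The resulting diagonal spectrum, $S_{xx} = 2k_BT\xi\,(|D|^2 + K_\varphi^2)/|D^2 + K_\varphi^2|^2$, is derived under these assumptions and is not copied from the lost display equations. All parameter values are dimensionless and hypothetical.

```python
import math

def psd_xx(omega, m, xi, K_r, K_phi, kT):
    """Diagonal power spectral density <X X*> of the assumed linearized model."""
    D = K_r - m * omega**2 + 1j * xi * omega
    return 2.0 * kT * xi * (abs(D)**2 + K_phi**2) / abs(D * D + K_phi**2)**2

def peak_height(P, m, xi, k_r, k_phi, kT):
    """Spectrum evaluated at the resonance omega_0 = sqrt(P*k_r/m)."""
    omega_0 = math.sqrt(P * k_r / m)
    return psd_xx(omega_0, m, xi, P * k_r, P * k_phi, kT)

m, xi, k_r, kT = 1.0, 0.2, 1.0, 1.0
k_phi = 0.1
P_thr = xi**2 * k_r / (m * k_phi**2)        # threshold of the assumed model

for P in (0.5 * P_thr, 0.9 * P_thr, 0.99 * P_thr):
    linear = peak_height(P, m, xi, k_r, 0.0, kT)      # K_phi = 0: falls as 1/P
    circular = peak_height(P, m, xi, k_r, k_phi, kT)  # grows near threshold
    print(P / P_thr, linear, circular)
```

With $K_\varphi = 0$ the height at resonance reduces to $2k_BTm/(\xi K_r)$, which falls as $1/P$; with $K_\varphi \neq 0$ the same quantity grows without bound as $P \to P_{thr}$.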

C. Variance And Correlations
According to the Wiener-Khinchine theorem, time correlations and particle variance follow from the Fourier transform of the power spectrum [13].
For $\tau = 0$ these quantities correspond to the variance of the particle in the trap, and for $\tau > 0$ we have the correlation between the positions of the particle at present and future times. Eq. (38) can be evaluated with the residue theorem, using a semicircular contour, closed in the lower half plane for $\tau > 0$, where the sum is taken over the poles lying in the lower half plane. Comparing the case of linear and circularly polarized beams leads to the following conclusions.
Linear polarization, $K_\varphi = 0$: Summing the residues gives, for the diagonal terms, When $\tau = 0$, $\langle x^2\rangle = k_BT/K_r$, which follows directly from equipartition, by equating the elastic energy of the trap, $\tfrac{1}{2}K_r\langle x^2\rangle$, with the thermal energy for a single degree of freedom, $\tfrac{1}{2}k_BT$.
[...] This causes the sum of the residues for the cross term to vanish, ensuring that the $x$ and $y$ coordinates are instantaneously uncorrelated, as they should be, i.e. $\langle xy\rangle = 0$. This also fixes the relative phases of the oscillatory parts of $\langle x(t+\tau)x(t)\rangle$ and $\langle x(t+\tau)y(t)\rangle$, as described below. Second, either $A(P)$ or $a(P)$ is singular for $P \to P_{thr}$, depending on the sign of $K_\varphi$. For example, when $K_\varphi > 0$, $\Im(\omega_{c-}) \to 0$ as $P \to P_{thr}$. Under this condition, the residues for $\omega_{c-}$ and its complex conjugate contain a factor $(\omega_{c-}^2 - (\omega^*_{c-})^2)$. From the approximate form of $\omega_{c-}$, Eq. (36), this evaluates to $(\omega_{c-}^2 - (\omega^*_{c-})^2) \approx -\frac{i\xi\omega_0\delta}{2mP_{thr}}$, where $\delta = (P - P_{thr})$, so that $A(P) \propto 1/\delta$ as $P$ approaches $P_{thr}$. With these considerations in mind, the time covariance of the coordinates is: for the diagonal terms, and for the cross terms, where the approximate form of $\omega_c$, valid in the under-damped regime, has been used (Eq. (21)). As the optical power is increased towards $P_{thr}$, singular terms ($A(P)$ for $K_\varphi > 0$) dominate and the corresponding decay constants increase rapidly, leaving: where the approximation, Eq. (36), for $\delta = (P - P_{thr}) < 0$ has been used. Thus, as the optical power approaches $P_{thr}$, the covariance of the coordinates consists of a damped oscillation with amplitude $\propto 1/(P_{thr} - P)$ at the resonant frequency, $\omega_0$, decaying with a long time constant $\propto 1/(P_{thr} - P)$. In other words, the particle motion undergoes a transition from biased stochastic motion towards deterministic, sustained oscillation.
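The departure from equipartition can be illustrated by integrating the power spectrum numerically, as the Wiener-Khinchine theorem prescribes for $\tau = 0$. The sketch below assumes the linearized model with response entries $D = K_r - m\omega^2 + i\xi\omega$ and $\pm K_\varphi$, a thermal force of strength $2k_BT\xi$, and the normalisation $\langle x^2\rangle = (1/2\pi)\int S_{xx}\,d\omega$. The spectrum, the trapezoidal integration and the dimensionless parameter values are all illustrative choices rather than the authors' procedure.

```python
import math

def psd_xx(omega, m, xi, K_r, K_phi, kT):
    """Diagonal power spectral density of the assumed linearized model."""
    D = K_r - m * omega**2 + 1j * xi * omega
    return 2.0 * kT * xi * (abs(D)**2 + K_phi**2) / abs(D * D + K_phi**2)**2

def variance_x(m, xi, K_r, K_phi, kT, w_max=20.0, n=40001):
    """<x^2> = (1/2pi) * integral of S_xx over [-w_max, w_max], trapezoidal rule."""
    dw = 2.0 * w_max / (n - 1)
    total = 0.0
    for i in range(n):
        w = -w_max + i * dw
        weight = 0.5 if i in (0, n - 1) else 1.0   # end points count half
        total += weight * psd_xx(w, m, xi, K_r, K_phi, kT)
    return total * dw / (2.0 * math.pi)

m, xi, K_r, kT = 1.0, 0.2, 1.0, 1.0          # hypothetical dimensionless values
print(variance_x(m, xi, K_r, 0.0, kT))       # linear polarization: close to kT/K_r
print(variance_x(m, xi, K_r, 0.1, kT))       # circular polarization: larger
```

For $K_\varphi = 0$ the integral recovers $k_BT/K_r$. For $K_\varphi \neq 0$ the variance exceeds the equipartition value, and in this sketch it is close to $(k_BT/K_r)/(1 - mK_\varphi^2/\xi^2K_r)$, which diverges as the threshold is approached.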
The phase difference between the oscillations, $\langle x(t+\tau)y(t)\rangle$ and $\langle x(t+\tau)x(t)\rangle$, indicates sustained circular motion. When $P = P_{thr}$, the frequency of this motion is given by $\Omega_{thr}$, Eq. (25b). For yet higher optical powers, the particle begins to orbit the beam axis deterministically, as described below. There are several points to note. First, threshold power and orbit frequencies become large as the sphere radius becomes small. This reflects the scaling properties of spin forces [7], which vanish in the case of a point dipole. Second, the threshold power varies most strongly with sphere radius. For the size of the sphere used in the experiment ($a = 0.77$ µm), $P_{thr}$ varies between $\approx 0.1$ W and 1 W as the beam waist varies from $\approx 2$ µm to $\approx 3.5$ µm. Third, beyond a beam waist radius of $W_0 \approx 2$ µm the orbit frequency, $\Omega_{thr}$, is almost completely independent of $W_0$, and depends only weakly on $a$, for $a > 0.2$ µm, expressing the fact that the ratio between spin and gradient forces is approximately constant for sufficiently large particles in sufficiently paraxial beams. As can be seen, the orbit frequency varies between 10 kHz and 20 kHz for a variation in radius of $\pm 0.1$ µm about the experimental value, $a = 0.77$ µm, indicating the accuracy of the experiment in this regard. In passing, we note that it should be possible to generate very high orbit frequencies under easily obtained laboratory conditions. For a narrow beam waist ($\approx 0.5$ µm), the threshold power, $P_{thr}$, is $\approx 1$ W for a sphere radius of $a \approx 0.18$ µm. Under these conditions the orbit frequency is $\Omega_{thr} \approx 10^6$ Hz. The radius of such an orbit would be $\approx 0.25$ µm.

Supplementary Note 4. ORBITAL MOTION
In the following we examine the orbital motion of particles in circularly polarized traps.
Initially, we derive conditions for stable equilibria, ignoring thermal fluctuations. Next we quantify the stochastic fluctuations associated with the orbits.

A. Equilibrium Conditions
In the absence of thermal fluctuations, the equations of motion in polar coordinates are: where $f_{r,\varphi}$ are the radial (i.e. gradient) and azimuthal spin forces, per unit power, $P$.
Eq. (44a) is the radial component, Eq. (44b) the tangential component. These equations have a simple equilibrium solution, $r = 0$, which is also mechanically stable. However, above the threshold power, $P_{thr}$, introduced above, this solution is not thermally stable. Equilibrium orbits are formed when the azimuthal spin force, $Pf_\varphi$, balances the azimuthal drag, $\xi r\dot{\varphi}$, and the gradient force, $Pf_r$, balances the centripetal force, $mr\dot{\varphi}^2$. Setting $r = r_o$, $\dot{\varphi} = \Omega_o$ in Eqs. (44) gives the equilibrium conditions, $P_o$ is the power required to sustain an orbit of radius $r_o$ when the drag is $\xi$, and $\Omega_o$ is the associated frequency. Equilibrium conditions can be varied either through the power, $P$, or through the pressure in the cell, which determines the viscosity and therefore the drag, $\xi$.
Since $f_r/f_\varphi$ is approximately independent of radius, the orbit frequency is determined by this ratio, and by the drag, but does not depend explicitly on the optical power.
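The force balances described in words above can be solved directly for the orbit frequency and the sustaining power. The sketch below is a minimal illustration: it takes the azimuthal balance $P_o f_\varphi = \xi r_o \Omega_o$ and the radial balance $P_o f_r = m r_o \Omega_o^2$, with $f_r$ understood as the magnitude of the inward gradient force per unit power. The function name, sign convention and numerical values are assumptions made for illustration.

```python
def equilibrium_orbit(f_r, f_phi, r_o, xi, m):
    """Solve P*f_phi = xi*r*Omega and P*f_r = m*r*Omega**2 for (P_o, Omega_o).

    f_r   : magnitude of the inward radial force per unit power at r_o
    f_phi : azimuthal (spin) force per unit power at r_o
    """
    omega_o = xi * f_r / (m * f_phi)       # ratio of the two balance conditions
    P_o = xi * r_o * omega_o / f_phi       # azimuthal balance fixes the power
    return P_o, omega_o

# Hypothetical dimensionless values; f_phi/f_r = 0.17 is the ratio quoted in the text.
xi, m, r_o = 0.2, 1.0, 0.5
f_r = 1.0
f_phi = 0.17 * f_r
P_o, omega_o = equilibrium_orbit(f_r, f_phi, r_o, xi, m)
print(P_o, omega_o)
```

In this form the orbit frequency depends only on $\xi/m$ and on the ratio $f_r/f_\varphi$, so changing the power moves the orbit radius rather than the frequency, to the extent that the ratio is independent of radius.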

B. Linear Stability
The linear stability of the equilibrium orbits can be understood by first rewriting the equations of motion, Eq. (44), as coupled, first order equations, For small $r_1$, $v_1$, $\Omega_1$: Derivatives of $f_v$ are: Or, using Eqs. (45): Derivatives of $f_\Omega$ are: Or, using Eqs. (45): The system is asymptotically stable if and only if the real parts of all the eigenvalues of the matrix corresponding to Eqs. (48) are negative, i.e. iff $\Re(\lambda_i) < 0\ \forall i$. Setting the determinant of Eq. (48) to zero gives the following polynomial, satisfied by the eigenvalues, With: Rescaling $\lambda$ in Eq. (56): The roots of $P_l(l) = 0$ determine the linear stability of the orbit. $P_l(l)$ depends only on the coefficients $X(r_o)$ and $Y(r_o)$, which depend only on the optical forces and are independent of the mass and the drag. Consequently, the orbital stability can be described solely in terms of orbit radius.
A complete stability analysis can be achieved by considering the discriminant of the cubic, Eq. (56). This leads to rather complicated formulae. A simpler approach is provided below, resulting in formulae that are accurate for the system considered here.
1. The polynomial, $P_l(l) = 0$, has at least one purely real root which is negative, since
4. A necessary condition for there to exist three distinct, real roots is that $P_l(l)$ contains two turning points, i.e. $P'_l(l)$ has two real roots, $l = -\tfrac{2}{3} \pm 2$ [...]
The conditions in (3) and (4) can be estimated by assuming that the gradient and spin forces are proportional to the gradient of the beam intensity, e.g.
where $W_0$ is the beam waist radius and $\gamma \gg 1$ scales the radial force relative to the azimuthal force. This ignores the finite size of the particle, but compares favourably with numerical evaluations of the forces. Under this approximation, so that orbits are asymptotically unstable for $r_o \gtrsim \sqrt{2/3}\,W_0$. Condition (4) is, so that orbits are completely unstable for $r_o \gtrsim W_0$.
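The estimate above can be reproduced under explicit assumptions. The sketch below takes $f_\varphi = g(r)$ and $f_r = -\gamma g(r)$ with $g(r) = r\exp(-2r^2/W_0^2)$, linearizes the polar equations of motion about the equilibrium orbit, and rescales the eigenvalue by $\xi/m$. Under these assumptions the cubic is $l^3 + 2l^2 + Xl + Y$ with $X = 1 + 4\gamma^2(1 - s)$, $Y = 4\gamma^2 s$ and $s = r_o^2/W_0^2$. These coefficient expressions are derived here and may differ in form from the authors' definitions of $X$ and $Y$. Stability is tested with the Routh-Hurwitz criterion rather than by computing the roots.

```python
import math

def cubic_coefficients(r_o, W0, gamma):
    """Coefficients X, Y of l**3 + 2*l**2 + X*l + Y for the assumed Gaussian force model."""
    s = (r_o / W0) ** 2
    X = 1.0 + 4.0 * gamma**2 * (1.0 - s)
    Y = 4.0 * gamma**2 * s
    return X, Y

def is_asymptotically_stable(r_o, W0, gamma):
    """Routh-Hurwitz test for l**3 + 2*l**2 + X*l + Y: all roots have negative real part."""
    X, Y = cubic_coefficients(r_o, W0, gamma)
    return Y > 0.0 and 2.0 * X > Y

def critical_radius(W0, gamma):
    """Largest stable orbit radius in this model; tends to sqrt(2/3)*W0 for gamma >> 1."""
    return W0 * math.sqrt(2.0 / 3.0 + 1.0 / (6.0 * gamma**2))

W0, gamma = 1.0, 6.0                  # gamma ~ 1/0.17, from the quoted force ratio
print(critical_radius(W0, gamma))
for r_o in (0.3, 0.6, 0.8, 0.9):
    print(r_o, is_asymptotically_stable(r_o, W0, gamma))
```

In this model $Y$ vanishes as $r_o \to 0$, where the force field is effectively linear, and the orbit loses asymptotic stability slightly beyond $\sqrt{2/3}\,W_0$.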

C. Orbit Fluctuations
When the optical power exceeds the threshold, P thr , mechanically stable orbits can form, as described above. For these orbits to be thermally stable and coherent, radial fluctuations must be small in comparison to the orbit radius, r o . These fluctuations can be estimated in a manner similar to sub-threshold fluctuations analysed in Section Supplementary Note 3.
We consider perturbations to the equilibrium orbits in the presence of thermal fluctuations, and derive an expression for $\langle(r - r_o)^2\rangle$. Close to $P_{thr}$, we find that $\langle(r - r_o)^2\rangle \propto 1/(P - P_{thr})$: immediately above threshold, fluctuations in the orbit radius are very large, and the orbit is incoherent despite being in mechanical equilibrium. This is related to the degree of linearity in the force field. A higher power, $P_{orb}$, is required to propel the particle into non-linear regions where the orbits become coherent.
The Langevin equations of motion in polar coordinates are: where, as before, $f^L_{r,\varphi}$ are uncorrelated stochastic forces with zero mean and variance $2k_BT\xi\delta(t - t')$. Perturbing the equilibrium orbit with $r \approx r_o + r_1(t)$ and $\Omega \approx \Omega_o + \Omega_1(t)$, locally linearising the force field so that, for example, $f_r(r_o + r_1) \approx f_r(r_o) + r_1 f'_r(r_o)$, and removing the lowest order (i.e. equilibrium) solution leaves, to first order: The Fourier transform is: where $\tilde{r}_1$, $\tilde{\Omega}_1$ are the Fourier transforms of $r_1$, $\Omega_1$. This allows us to compute the power spectrum of the perturbed quantities, $r_1$ and $\Omega_1$, in a manner directly analogous to the treatment in Section Supplementary Note 3 B. The result is: With, where the frequency has been rescaled according to $\omega = \frac{\xi}{m}W$. $P(W)$ is the polynomial, $P(W) = -iW^3 - 2W^2 + iXW + Y$, with coefficients: The form of $P(W)$ is familiar from the analysis in Section (Supplementary Note 3 A), for reasons associated with its construction.
As previously, time dependent variances may be computed through the inverse Fourier transform of the power spectrum. The instantaneous variance in the orbit radius is: A basic property of the orbit fluctuations follows directly from the form of $P(W)$. For a linear force field with $f_r = k_r r$ and $f_\varphi = k_\varphi r$, the coefficient $Y$ vanishes identically. When this happens, $W = 0$ is a characteristic frequency, and the variances $\langle r_1^2\rangle$ and $\langle\Omega_1^2\rangle$ diverge. For this reason, the force field must depart sufficiently from linearity in order for the orbit to be radially confined. Figure (5) shows a representative calculation of $\langle r_1^2\rangle$ as a function of optical power. We note that this approximation suffers from a number of inaccuracies (e.g. we have assumed that the force field is linear within the range of the fluctuations of the particle), and the results should not be treated as quantitatively meaningful. However, the qualitative message is clear. Orbits with small radius will not be coherent. In particular, the variance in the orbit radius is likely to take the particle into regions where the transverse force is not only greatly reduced from the equilibrium value, but reversed completely. This process decreases the particle momentum, resulting in the repeated formation and collapse of orbits. Only for sufficiently large orbit radius will the trajectory be stable and coherent.
As a consequence, there is an intermediate regime above the threshold power, $P_{thr}$, in which the particle may acquire sufficient angular momentum to overcome gradient forces, but may also suffer loss of momentum due to fluctuations. Therefore, a higher power, $P_{orb} > P_{thr}$, is required to observe orbit formation. The hysteresis effect that we observe is associated with this.

Supplementary Note 5. DATA FITTING
The theoretical form of the power spectral density (PSD, Eq. 29) for a particle in regime I can be rewritten using the following parameters: where $m$ is the mass of the particle. This fitting procedure was applied independently for each polarization setting and optical power, holding $m$ and $\xi$ fixed, and allowing all other parameters to vary. We find that the experimental measurements closely match theoretical values, with only a single scaling constant for the optical power. Crucially, the ratios for the radial and azimuthal forces (tabulated below), and their variation with radius (shown graphically in Fig. (1)), are almost identical. The need for rescaling the power arises from the variability of the experimental parameters. For instance, small uncertainties in the particle radius, density and refractive index strongly affect the absolute values of the force, while the ratios of the radial and azimuthal forces are relatively insensitive to these variations.