Quantitative analysis of non-equilibrium systems from short-time experimental data

We provide a minimal strategy for the quantitative analysis of a large class of non-equilibrium systems in a {statistically} steady state using the short-time Thermodynamic Uncertainty Relation (TUR). From short-time trajectory data obtained from experiments, we demonstrate how we can simultaneously infer quantitatively, both the thermodynamic force field acting on the system, as well as the (potentially exact) rate of entropy production. We benchmark this scheme first for an experimental study of a colloidal particle system where exact analytical results are known, before applying it to the case of a colloidal particle in a hydrodynamical flow field, where neither analytical nor numerical results are available. In this latter case, we build an effective model of the system based on our results. In both cases, we also demonstrate that our results match with those obtained from another recently introduced scheme [Phys. Rev. X 10, 021009].

Non-equilibrium thermodynamics at microscopic length scales is dominated by a fascinating range of phenomena [2], where thermal fluctuations play a crucial role. These phenomena can now be observed in great detail experimentally, due to the availability and scope of current microscopic manipulation techniques. The interpretation and quantitative analysis of the experimentally available data is however lagging behind these advances, mostly due to the fact that the vast majority of these systems are too complicated to model without making several approximations, despite having far fewer degrees of freedom than their macroscopic counterparts. Even when it is possible to build such simplified models, these are still usually too complicated to solve except sometimes by numerical analysis of specific systems, which however lack general insights. There could also be other factors making the system hard to solve, such as the presence of a background flow, for which the spatial dependence of the flow velocity needs to be known by means of solving the corresponding Navier-Stokes equation; usually a difficult task, especially for unsteady flows. In the face of all these challenges, a relevant question is whether it is at all possible to gain any precise quantitative information about a complex non-equilibrium system directly from experimental data, bypassing the first step of either having a known model to compare with or building in simplifying assumptions about the system.
Not surprisingly, this question has aroused a lot of recent interest. Broadly speaking, measurements from experiments can be used to obtain general information about the system, such as identifying that detailed balance is broken and hence the system is out-of-equilibrium [3][4][5] (not always obvious for microscopic systems such as at the cellular level), or to obtain more specific properties of the system such as the rate of dissipation of energy (equivalently the rate of entropy production) [6][7][8][9][10][11][12][13], the average phase-space velocity field [1,3,14] related to the so-called thermodynamic force field [15,16] or the microscopic forces driving the system [1,12,17].
The motivation for such studies is that if quantitative information about the system can be directly obtained from experimentally observed quantities, then this understanding can be used for building more realistic and experimentally validated models of the system of interest [3,18,19].
A very informative quantity about a non-equilibrium system is the rate of entropy production. This quantity not only signals -when it is non-zero -that the system is out of equilibrium, but also provides a quantitative measure of how far from equilibrium a system is and the irreversibility of the dynamics [20][21][22]. In the context of microscopic machines [23], a quantification of the amount of energy dissipated directly provides information about engine efficiencies [24][25][26] and prescriptions for obtaining optimal operating conditions [27]. The value of the entropy production rate can also be used to obtain information-theoretic quantities of interest [28], or even information about hidden degrees of freedom [29]. The entropy production rate is also a very robust quantity to measure from the experimental point of view, since it is not so strongly affected by conversion-factor errors in measuring particle positions, as we remark later.
The entropy production rate can be obtained directly from experimental data, at least for systems where it is understood that the underlying dynamics is Markovian, by several means. These include utilizing the Harada-Sasa equality [6] which involves a spectral analysis of trajectory data [30,31], determining the average steady state current and steady-state probability distribution from the data [7], determining the time-irreversibility of the dynamics [22,[32][33][34][35][36] and relatedly determining estimators for the ratio of forward and backward processes directly from the data [10,37,38]. Recent approaches [1,12] also advocate inferring first the microscopic force field from which the entropy production rate can be deduced.
An alternative strategy to direct estimation, is to set lower bounds on the entropy production rate [39][40][41][42][43] by measuring experimentally accessible quantities. One class of these bounds, for example those based on the thermodynamic uncertainty relation (TUR) [43][44][45][46][47], have been further developed into variational inference schemes which translate the task of identifying entropy production to an optimization problem over the space of a single projected fluctuating current in the system [11,[48][49][50]. Recently, a similar variational scheme using neural networks was also proposed [51]. As compared to some of the other trajectory-based entropy estimation methods, these inference schemes do not involve the estimation of probability distributions over the phase-space. Rather they usually only involve means and variances of measured currents. Hence they are known to work better in higher dimensional systems [11]. In addition, it is proven that such an optimization problem gives the exact value of the entropy production rate in a steady state as well as the exact value of the thermodynamic force field in the phase space of the degrees of freedom we can measure, if short-time currents are used [48][49][50][51]. However, these methods have not yet been tested against experimental data to the best of our knowledge.
Here we test the Short-time TUR scheme against the challenges posed by experimental setups involving colloidal particles in time-varying potentials with (possible) background flows. In order to benchmark the scheme, we first test the scheme in a setup where the entropy production rate of the system can be analytically predicted for any set of parameters. For this set up, we test our predictions against both analytical results as well as another recently proposed numerical scheme, namely stochastic force inference (SFI) [1]. After this benchmarking exercise, we apply our scheme to a modified system for which the underlying model is both unknown and hard to estimate. Though there is no theoretical value to compare with in this case, the short-time TUR's predictions even here, are again in perfect agreement with that predicted by the SFI technique [1]. These results provide a motivation for modelling this system in terms of coupled Langevin equations with two free parameters. We demonstrate that such a model does indeed capture the experimental observations, hence demonstrating the usefulness of these schemes in modeling complex scenarios.

MODEL
The results we demonstrate here apply to systems with continuous state-space but a finite-number of degrees of freedom, described by overdamped Langevin equations of the typeẊ Here µ = 1, . . . , d is the number of degrees of freedom of the system and we use · to refer to the Ito convention. F µ (X) is a function of X, but not an explicit function where · denotes averaging over the statistics of the noise. The corresponding Fokker-Planck equation for the probability distribution function P is given by: where the repeated indices are summed over. In the steady state ∂ t P = 0. The total rate of entropy production σ can be obtained as [7,34], is called the thermodynamic force field [11]. Overdamped Langevin equations are excellent descriptions for colloidal particle systems. Even for systems where the Langevin equation is not known, the fact that such a description exists in principle is all that is needed in order to apply Eq. 3a and obtain σ by determining the current and steady-state probability density directly from the timeseries data [7,11]. Another approach is to first infer the terms in the Langevin equation, F µ and D [1,12] and use Eq. 3a to obtain σ. These methods can be applied directly on data obtained from tracking the system or even by using tracking-free methods in image space [12].

RESULTS
In this paper, we demonstrate an alternative method for the simultaneous determination of both the entropy production rate as well as the thermodynamic force field F µ from experimental data, using the recently introduced short-time thermodynamic inference relation [48][49][50]. Our method is built on an exact result obtained in [48][49][50]: where k B is the Boltzmann constant and J is a weighted scalar current constructed from the non-equilibrium stationary state as shown below. The notation · stands for an ensemble average. The current that maximizes the term within the square brackets is J ∝ ∆S tot . Here ∆t is the short time interval over which the mean and variance of the current is evaluated [48]. In this work, it also coincides with the sampling rate of the trajectory. As for the ordinary TUR [52], our result too holds for any X that is even under time reversal. The equality in (4) holds only when X includes all degrees of freedom of the system. If not, then the RHS of (4) gives a lower bound. The proof presented for Eq. (4) in [48] was based on exact results for non-trivial models. It was shown that Eq. (4) is a consequence of fluctuations of ∆S tot becoming Gaussian, in the ∆t → 0 limit. Later in [49] and [50], Eq. (4) was rigorously proved for overdamped diffusive processes.
Let us now discretize X in time with time interval ∆t: X 0 µ · · · X j µ · · · X N µ . We use latin indices as superscripts for the discrete time labels and Einstein summation convention is applied to the greek indices. For a given function d(X) we can now define a time-discretised scalar function constructed from the steady state current, Any such current, when substituted in the expression inside the square brackets of Eq. 4 can be shown to give a lower bound σ L which is ≤ σ. In addition, for a special value of d = d * , J ∝ ∆S tot and σ L = σ. The algorithm we use, which obtains this d * and σ through a maximisation procedure is as follows: 1. We first obtain a time-series of experimental data: X k .
2. To be able to perform the maximisation we use a set of basis functions ψ m (X), m = 1, . . . , M , in the space spanned by X such that where w m ∈ R d and are the parameters to be optimised. We use two sets of basis functions: Gaussian and linear and generate all our results in both these bases, for comparison.
3. Maximize Eq. (4) to obtain σ. This maximization is done using a numerical optimizer: We start with an initial guess for w m , calculate the timeseries J k , construct the function within the square brackets in (4) and then maximise over w m to obtain σ and also the set of values w * m such that d * = M m=1 w * m ψ m (X) maximises Eq. (4). The maximising current J * is constructed from d * using Eq. (5) and in addition can be shown to be proportional to ∆S tot [48].
Furthermore, the thermodynamic force is proportional to d * that maximises (4) [48][49][50], i.e., Hence, by solving an optimization problem, where the RHS of Eq. (4) is maximized in the space of all currents we can obtain σ as the optimal value as well as its conjugate thermodynamic force field, F = c d * where the proportionality constant can be fixed by using Var(J * ) = 2 J * at ∆t → 0 [48] as, c = 2 J * Var(J * ) . We note that, for any set of basis functions ψ m (X), m = 1, . . . , M which give an adequate representation of d(X), an analytic solution to the maximization problem is known [50]. This solution gives a deterministic estimate of σ as, whereψ k = ψ k and Ξ k,l = ψ k ψ l −ψ kψl . Further, the optimal coefficients can be directly computed without any optimisation as Repeated indices are summed over as before. Numerically, this involves inversion of the matrix Ξ. On the one hand, if d, N and M are not very large, this deterministic scheme is faster compared to a numerical optimization algorithm, and does not get stuck in local maxima. On the other hand, numerical optimization schemes can in principle simultaneously handle the optimisation of parameters of the basis functions. This is discussed in some detail in [49]. In addition, numerical optimization schemes have also been extended to systems driven in a time dependent manner [53], where it is as yet unclear how the deterministic scheme will perform. In this work, we implement the numerical optimization scheme using a particle-swarm optimizer. We provide a brief introduction to the algorithm in the Methods section and also study its convergence properties in Figs. 9 − 13. We note that Refs [49,50] have already demonstrated the feasibility of the scheme described here with numerical data. Here we test this scheme instead on controlled experimental setups.

Colloidal particle in a stochastically shaken trap
To test the inference scheme we first apply it to an experimental problem for which the rate of entropy production is known from theory [54-57] -a colloidal particle in a stochastically shaken optical trap. This model was first experimentally studied in [58]. We study it again in order to understand the limitations posed by experimental setups for our inference scheme as well as test and benchmark our scheme for a system where the results are known.
We trap a polystyrene particle in an optical trap; further details of how the experiment is performed may be found in the methods section. We modulate the position of the center of the trap λ(t) along a fixed direction x on the trapping plane perpendicular to the beam propagation (+z). The modulation is a Gaussian Ornstein-Uhlenbeck noise with zero mean and covariance λ(0)λ(s) = Aτ 0 exp(− | s | /τ 0 ), i.e., where η is Gaussian, has zero-mean and is white-in-time.
The correlation time τ 0 is held fixed for all our experiments. Note that Aτ 0 can be interpreted as an effective temperature [59]. The dynamics of the colloidal particle is well described by an overdamped Langevin equation, where K is the spring constant of the harmonic trap, γ is the drag coefficient, ξ is the thermal noise, D = k B T /γ is the diffusion coefficient of the particle and T the temperature of the medium. The noise ξ is also Gaussian, zero-mean and white-in-time and mutually independent from the noise η in Eq. (10). Equations (11) and (10) together define the model we call the Stochastic Sliding parabola. Starting from arbitrary initial conditions for x and λ, the system reaches a non-equilibrium steady state, with the probability distribution function and current given respectively by [54] P where the dimensionless parameters θ and δ are defined as, The rate of entropy production and the thermodynamic force field for this model are, In Fig. 1 we compare the above exact results to the outcome of the inference algorithm applied to numerically generated data for this model. Different sets of time-series data were generated by varying the noise amplitude ratio θ by varying A, keeping the other parameters fixed. In Fig, 1b, we see that the inference algorithm predicts a value σ L which is lower than the true value σ in the beginning, but gets very close to the true value, after a relatively modest number of steps. As we run the algorithm longer, σ L saturates to something very close to the actual value. The inference algorithm also simultaneously gives an optimal force field d * (x) which is very similar to the thermodynamic Force field F µ (x) expected from theory (see Fig. 8 in the Methods section). From Eq. (14), it is clear that σ increases linearly with θ or equivalently the parameter A. Fig. 1c illustrates that the inference algorithm captures this behaviour accurately. Since we are limited by the minimal resolution of the time series in probing the ∆t → 0 limit of Eq. (4), the inferred value of entropy production is in general different from the exact value by an O[∆t] term. For this model we can also compute this correction analytically as (using expressions previously obtained in [57]), where σ ∆t is the result one gets from Eq. (4) for a fixed value of ∆t. Notice that the O[∆t] correction increases with the value of θ. The inferred values of σ indeed lie between these two limits. The inferred entropy production rate (σL) plotted against the number of steps in the optimization process for θ = 0.33 with ∆t = 0.0001. c) Inferred entropy production as a function of the parameter θ. The blue line corresponds to the theoretical value of σ, Eq. (14). The squares correspond to the inferred values of σ using the inference scheme (Eq. (4)) in the Gaussian basis, and triangles correspond to inference using the Linear basis. The green dashed line corresponds to the best estimate of σ that can be obtained using inference with ∆t = 0.0001, using Eq. (15). The error estimates are set by computing the standard deviation over the values obtained for an ensemble of 8 trajectories ( see Fig. 7 in the Methods section).
Next, we tested the algorithm on experimentally generated data for the same model. In the experiments, we varied A ranging from 0.1 to 0.35 in units of 0.6 × 10 −6 2 m 2 s −1 (corresponds to θ varying from 0.22 to 0.77), while the other experimental parameters such as the trap stiffness, as well as the bath temperature, were assumed to be constant for the entire length of the experiment. In reality however, the laser used to trap the particle is prone to power fluctuations, and there can also be minor changes in the bath temperature due to heating caused by the long exposure to the laser. For large values of θ, we also expect non-harmonic effects to be significant, due to the particle exploring the peripheral regions of the trap [60]. We comment in the following paragraph on the implications of these fluctuations for our results. An immediate consequence is however that the theoretically predicted values Eq. (12a) can only be used as a reference. We benchmark our results instead by comparing them with values obtained by the application of the Stochastic Force Inference technique (SFI) scheme recently proposed in [1], which gives an independent estimate of both σ as well as the force fields.
Experiments for individual parameter sets were carried out for a duration of 100s, with a sampling rate of 10 KHz for the particle position. Only about 2/3rd of the available experimental data was used and the remaining 1/3rd was discarded because of the presence of uncontrolled experimental errors in them. For the analysed data, each of the 100s long data sets were further divided into 12.5s long patches, upon which the inference algorithm was then tested. In Fig. 2, we demonstrate the results of the analysis of the experimental data. The dark-blue dashed line corresponds to the theoretically predicted value, Eq. (14), of the entropy production rate for the model given by Equations (10) and (11) with the given parameters (the values are given in the Supplemental information). The blue line is the entropy production for a slightly modified model explained in the Supplemental information, obtained by analysing the data obtained from SFI and calculating the drift and diffusion terms from it. The region between the red dashed lines corresponds to the error bar set by the variation in model parameters (namely the drift and diffusion coefficients) in different experiments as quantified by the SFI analysis (see the Supplemental Information). The data points are the results of our inference algorithm as well as the SFI scheme. As is evident, our inference scheme predicts exactly the same or very similar values for σ as the SFI algorithm, for all values of A.
The prediction of the inference scheme and SFI matches also for the thermodynamic force (see Fig. 3). Namely, the optimal current d * (x) which we get as an outcome of our inference algorithm, also matchesF(x), which is F(x) estimated from the trajectory data by means of the Stochastic Force Inference technique [1]. We conclude that our inference algorithm infers the correct entropy production value, as well as the correct thermodynamic force field, for the experimental data, since we get the same results when using a completely indepen-  14). The squares and triangles corresponds to σ estimated from the experimental data using our TUR-based inference scheme with a Gaussian basis and a Linear basis, and using ∆t = 0.1ms. The error bars correspond to averages over eight independent realizations of duration 12.5s. The circles correspond to σ estimated using the Stochastic force inference scheme (SFI) [1] for the whole 100s data set, and the errorbars for these correspond to a self-consistent estimate of the inference error that the SFI provides [1]. The blue line corresponds to σ predicted by a model obtained from SFI (see Supplemental Info), and the red dashed lines correspond to error bars for this SFI-based model. dent and different technique. After this benchmarking exercise, we study now a more complicated situation.

A colloidal particle trapped near a microbubble
We now show how our scheme performs in estimating the rate of entropy generation for the case where the mechanical force on the colloidal particle is not known. For this purpose, we study a particle trapped in the vicinity of a microscopic bubble of size 20-22 µm. We have already used this experimental setup to study one or more microbubbles with colloidal particles moving in the liquid in a different context [61,62]. The microbubbles are nucleated on a liquid-glass interface. The surface is pre-coated by linear patterns of a MB-based soft oxometalate (SOM) material. We focus a laser beam on any region along this pattern, the SOM material gets intensely heated and a microbubble forms. The top of the bubble is colder than its bottom where it is anchored to the interface. As the surface tension is a function of temperature, the variation of the surface tension along the surface of the bubble sets up a Marangoni stress, driving a flow along the surface of the bubble. Marangoni flow around freely floating bubbles under a temperature gradient have been studied both experimentally [63] and analytically [64]. The additional complexity here is the presence of the bottom surface on which the flow must satisfy no-slip boundary conditions. The flow around the bubble in this setup is not yet known in detail although an approximate description, valid if we are not too close to the bubble, has been developed [62], which we show in Fig. 4. This flow drags the trapped colloidal particle and changes its steady-state probability distribution (See Figs. 5a and b). Since the flow streamlines are directed towards the bubble, we expect that these will confine the trapped particle more than the case without the bubble. This is indeed the case as we show later. We expect that the underlying description of the particle is still an overdamped Langevin equation, including a flow velocity field u(x). However, the quantification of this flow field is rather difficult, even numerically, as argued above. As a result, we have a system where the details of the microscopic description and forces are unknown. Our inference scheme, on the other hand, is easily applicable even in this context.
At the level of the non-equilibrium trajectories of the system, we see that there is a qualitative difference from the case without the bubble. First, we see that the particle is more confined in the trap along the x-direction, when there is a bubble in the vicinity (see Fig. 5b) as mentioned earlier. This confinement is caused both by the flow towards the bubble (as shown in Fig. 4), which gets balanced at the confined position by the opposing force of the confining potential, as well as the reduced fluctuations close to the bubble due to proximity effects [65]. Further statistical analyses also reveal weaker non- . Consistent with these observations, on applying the inference algorithm, we observe that the value of σ is substantially reduced in the presence of the bubble. The corresponding entropy production rates estimated are σ = 244.68 k B s −1 for the no-bubble case and σ = 7.66 k B s −1 for the case with the bubble. We also find that the thermodynamic force, estimated using the inference scheme, is significantly reduced along the x direction, and the force field is less tilted along that direction as compared to the case without the bubble, as shown in Fig. 5c.
To further analyze the effect of the bubble, we performed another experiment, where we trapped the particles at different distances from the bubble. As we go a distance d ∼ 1.5r (r is the radius of the microbubble) from the surface of the bubble, we see that the inferred value of σ gets closer to the value the system would have had in the absence of the bubble. This is demonstrated in Fig. 6.
An important point to understand here, in the light of these findings is the significance of the inferred value of σ. In the case without the bubble, it is exactly the total heat dissipated to the environment as a consequence of maintaining the system in a non-equilibrium steady state (by shaking the trap). In the case with the bubble however this is not the case. We present a possible mathematical description of this situation as an overdamped Langevin equation with space-dependent diffusion and damping terms in an unknown flow field u(x). Since the trap constrains the particle motion on scales which are at least two orders of magnitude smaller than the distance to the bubble, u(x) is further assumed to be a constant u d at a distance d from the surface of the bubble. σ calcu-

b)
FIG. 5. The colloidal system in the presence of the bubble. a) The microbubble -colloidal particle system. b) System trajectories without (red) and with (green) the bubble in the neighbourhood of the colloidal particle. We see that the colloidal particle is strongly confined in the presence of the bubble. c) The thermodynamic force field computed as the optimal field d * (x) without the bubble (red) and in the presence of the bubble (green). The corresponding entropy production rates estimated are σ = 244.68 kBs −1 for the no-bubble case and σ = 7.66 kBs −1 for the case with the bubble. lated from this model, reproduces the values we find from the experimental data, independent of u d , and purely as a consequence of the space-dependent diffusion and damping term, and the two fitting parameters a and b. This is demonstrated in Fig. 6 . As we discuss in the supplemental material however, there is another component of the entropy production, related to the work that the flow does against the confining potential [37,66]. This component, which does indeed depend on the value of u d , is not estimated by our inference scheme, due to the fact that u d is a field (corresponding to the velocities of the molecules of the thermal bath) which is odd under time reversal, for which the TUR does not hold [46,52,[67][68][69]. Hence we expect that the values of σ we find close to the bubble are underestimates of the true value. We elaborate on this point in the supplemental information.
Mathematical model: The colloidal system in the presence of the bubble and consequently the flow u d , can be simulated using the following equations: where, Here the parameters a and b can be tuned to match the experimental data. Particularly, 1/b stands for a characteristic length scale below which the flows created by the bubble are significant. When the distance of the trapped particle from the bubble is much greater than 1/b, we expect that the expressions will match the case without the bubble. Using a trial and error approach we obtained the fit parameters as a = 282.743 and b = 1/3µm −1 .
We remark that, as we did for the case without the bubble, the SFI technique could be used to model this case as well, since it explicitly gives the drift and diffusion terms. These are however particularly susceptible to erroneous estimates of a conversion factor which is needed to obtain the particle trajectory data in units of nm. We expand on this issue in the methods section as well as in the supplemental material. This error can be thought of as assigning wrong units to the affected phasespace coordinates. Since σ is a sum over all phase space coordinates, its evaluation is not affected by such an error unlike other quantities such as forces, diffusion terms and the thermodynamic force. Another way to understand this is to note that σ quantifies the irreversibility of the dynamics, which again is clearly not affected by a choice of units. Hence our model for the setup with the bubble only tries to reproduce the value of σ as a function of distance.
In conclusion, we have experimentally tested a simple and effective method, based on the Thermodynamic Uncertainty Relation [48][49][50] for inferring both the rate of entropy production σ and the corresponding thermodynamic force fields, in microscopic systems in nonequilibrium steady states. We have confirmed that an entirely independent method, SFI [1], gives the same answers in all the situations we have studied, hence adding weight to the physical significance of our findings. We have also carried out an extensive investigation of the convergence properties of our code as several parameters or hyper-parameters are varied, as well as a comparison with the SFI algorithm (Figs 9 − 13) in the Methods section.
Our short-time inference scheme does not need any model in order to be applicable. However, we can use our findings to come up with plausible models, which give the same σ values for a range of parameters, even in cases where modeling the system from first principles is complicated. In this regard it would also be interesting to perform a systematic study of different algorithmic schemes available to model a complex non-equilibrium systems, with focus on the advantages and disadvantages when applied to experimental data.
Experimental systems that would be particularly interesting to study are molecular motors or other cellular processes. Recently, Ref. [70] tried to quantify the activity of a cell by measuring the power spectral density of the fluctuations of the position of a phagocytosed micron-sized bead inside a cell. As it is possible to also trap such beads inside a cell with optical tweezers [70], this too could be a very interesting system to study. Finally, in other recent work [53], it has been demonstrated that inference schemes of this kind can also be made to work for non-stationary non-equilibrium states, further diversifying the scope of this class of techniques.

CODE AVAILABILITY
We use the open-sourced PYSWARM package in Python [71] for the optimization task. The algorithm used to produce the results in this paper is available at: https://doi.org/10.6084/m9.figshare.14174369. We perform the Stochastic Force Inference (SFI) analysis using the algorithm provided with [1].

DATA AVAILABILITY
The data used to produce the results in this paper is available at: https://doi.org/10.6084/m9.figshare.14176664

COMPETING INTERESTS
The authors declare that they have no competing interests.

A single colloidal particle in a stochastically shaken trap
The experimental setup consists of a sample chamber placed on a motorized xyz-scanning microscope stage, which contains an aqueous dispersion of spherical polystyrene particles (Sigma-Aldrich) of radius r = 1.5 µm. The sample chamber consists of two standard glass cover-slips (of refractive index ∼ 1.52) on top of one another. The thickness of the chamber is kept ∼ 100 µm by applying double-sided sticky tape in between the cover-slips. The aqueous immersion is made out of double distilled water at room temperature, which acts as a thermal bath. A single polystyrene particle is confined by an optical trap, which is created by tightly focusing a Gaussian laser beam of wavelength 1064 nm by means of a high-numerical-aperture oil-immersion objective (100x, NA = 1.3) in a standard inverted microscope (Olympus IX71). The trap is kept fixed at a height, h = 12 µm from the lower surface of the chamber in order to avoid spatial variation in the viscous drag due to the presence of the wall. The corner frequency of the trap (f c ) is set to be 135Hz. For the first set of experiments, the center of the trap is modulated (λ(t)) using an acousto-optic deflector, along a fixed direction x in the trapping plane, perpendicular to the beam propagation (+z). Thus, the modulation may be represented as a Gaussian Ornstein-Uhlenbeck noise with zero mean and covariance λ(s)λ(t) = Aτ 0 exp(|t − s|/τ 0 ). The correlation time τ 0 is held fixed for all our experiments. We determine the barycenter (x, y) displacement of the trapped particle by recording its back-scattered intensity from a detection laser (wavelength 785 nm, copropagated with the trapping beam) in the back-focal plane interferometry configuration. The measurement is carried out using a balanced-detection system comprising of high-speed photo-diodes [72], with sampling rate of 10 kHz and final spatial resolution of 10 nm. In all cases, the trap parameters including the conversion factors for the trajectory data were calibrated by fitting the probability distribution of the particle position in thermal equilibrium to the Boltzmann distribution P (x) = (2πDτ ) −1/2 exp(−x 2 /2Dτ ), where τ is the relaxation time in the trap, given by τ = 1/(2πf c ). We assume that the diffusion constant D has the room temperature value, D = 1.645 × 10 −13 m 2 s −1 . It is important to note that we assume that the trap parameters as well as the conversion factor for the trajectory data are unaffected when the Ornstein-Uhlenbeck modulation is turned on. However, in practice, the trap parameters can indeed be altered by small amounts over long durations of measurement ( -100s ), primarily due to the power fluctuations of the trapping laser. Further, the probe particle also moves in the y and z directions in the trap, which we have not measured here. These factors led to issues which prevented us from producing an exact replica of the theoretical model in the experiment. However, we have taken into account these limitations in our analysis as detailed in the supplemental information, Section I.
In the second set of experiments, i.e. for those with the microbubble, we employ a cover slip that is pre-coated by a polyoxometalate material [73,74] absorbing at 1064 nm as one of the surfaces of the sample chamber (typically bottom surface), and proceed to focus a second 1064 nm laser on the absorbing region. A microbubble is thus nucleated -the size of which is controlled by the power of the 1064 nm laser [73]. Typically we employ bubbles of size between 20-22 µm. Note that the sample chamber also contains the aqueous immersion of polystyrene particles. We trap a polystyrene probe particle at different distances from the bubble surface, and modulate the trap centre in a manner similar to the experiments without the bubble. The particle is trapped at a axial height corresponding to the bubble radius. The other experimental procedures remain identical to the first set of experiments. An important point here though, is the determination of the distance of the particle from the bubble surface. This we accomplish by using the pixels-to-distance calibration provided in the image acquisition software for the camera attached to the microscope, which we verify by measuring the diameters of the polystyrene particles in the dispersion (the standard deviation of which is around 3% as specified by the manufacturer), and achieve very good consistency. Note that we obtain a 2-d cross-section of the bubble as is demonstrated in Fig. 5, and are thus able to determine the surface-surface separation between the bubble and the particle with accuracy of around 5%. During the experiment, we also ensure that the bubble diameter remains constant by adjusting the power of the nucleating laser -indeed the bubble diameter is seen to remain almost constant for the 100 s that we need to collect data for one run of the experiment.
As opposed to the previous setting, the particle is now trapped in a region where it experiences: 1) a temperature gradient created by the laser beam used to generate the micro-bubble, 2) the microscopic flow generated due to the bubble which affects particle trajectory, and 3) Faxen-like corrections to the viscous drag coefficient of the encompassing fluid due to the proximity to a wall [75] -which is the bubble surface in this case. Now, since the trap parameters -particularly the conversion factor for the trajectory data -are determined for the equilibrium setting, it is clear that there could be significant deviations from those in the non-equilibrium configuration produced due to the presence of the bubble in close proximity of the trapped probe particle. It is also clear that the previous procedure for obtaining the trap parameters will not work in this case. This severely hampers any exact modeling and as a result we concentrate on getting only the value of σ and its variation as a function of the distance to the bubble, both of which are robust against the above errors.

Numerical algorithm
Our aim is to maximise a cost function C which is a function of a set of parameters w. We use a particle swarm optimization algorithm [76] to achieve this. A domain is chosen and N p particles are initialised in that domain. The k-th particle follows Newtonian dynamics given by: Here ω k and V k are the position and velocity vector of the k-th particle and A k is a stochastic function that depends on the position of all the particles. Different variants of this algorithm use different A. The simplest -the one that we use -is called the Original PSO. Let us first define the following: • The k-th particle carries an additional vector P k which is equal to ω k for which the value of the function C as observed by the k-th particle was maximum in its history.
• At any point of time let G denote the position of the particle in the whole swarm for which the function has the maximum value.
The function A is given by Here the Greek indices run over the dimension of space. W 1 and W 2 are two weights. The two terms in Eq. (19) push the particle in two different directions: one towards the point in history where the particle found the function to be a maxima and the other towards the point where the swarm finds the maximum value of the function at this point of time. These are multiplied by two random vectors U 1 and U 2 of dimension same as the dimension of space. Each of the components are independent, uniformly distributed (between zero and unity), random numbers. We keep track of the highest value of the function seen by the swarm and also the location of that point. There are two major advantages to this over standard gradient ascent algorithms: one, it does not require evaluation of the gradient of the function and two, it can be parallellized straightforwardly. All the numerical results reported in this paper are obtained using this algorithm. We implement this optimization scheme using open-sourced PYSWARM package in Python [71], with a default choice for the hyper-parameters.

Implementation of the algorithm
Here we describe how we applied this algorithm to numerical/ experimental data. We generate numerical data using first order Euler integration of Eq. (10) and Eq. (11) with a time step of ∆t = 0.0001. In either case we generate many copies of trajectories of length 12.5s, and construct the cost function in Eq. (4) using Eq. (5) and Eq. (6). We have tried out two different choices of basis functions to construct d(X). The first one is a Gaussian basis in which we represent d(X) as, Making use of the spatial symmetry of the problem, we assume d(X) to be an anti-symmetric function, with d(−X) = −d(X), and that reduces the dimensionality of the problem by a factor of 2. Here M is the number of Gaussian functions, and b i are the variance of the Gaussian in the x and λ direction. The centers of the Gaussian (x m , λ m ) are placed equally spaced in a rectangular region enclosing the data. Both M and b i are hyper parameters. We used M = 16 and b 2 x/λ = {x/λ} max /30. Secondly, we have also tried a linear basis (motivated by the prior knowledge of the linearity of the system) where we take In Figs. 9, 10, 12, and 13, we study the dependence of the output of the algorithm, for both basis functions, on ∆t, length of the time series data, N p as well as M .
Since we have used a finite amount of data to construct the cost function, it will be prone to statistical errors. Therefore we independently maximise the cost function for different 12.5s long data sets, and take their mean value as the optimized estimate of σ. We show the value of sigma inferred (σ L ) as a function of the number of steps in the optimization algorithm for different 12.5s long data sets in Fig. 7. With the numerical data, we also find that the optimal field d * (see Eq. (6)) is proportional to the thermodynamic force field F (Eq. (7)). We demonstrate this in Fig. 8.  Figure  1b of the main text. Right: The optimal d * ∝ F obtained from the algorithm for the same parameter choice.

Comparison of the TUR and the SFI inference schemes
In this section, we compare the TUR and SFI based inference algorithms with respect to the sampling rate ∆t, (for a fixed total length of the trajectory τ = 1s), as well as the length of the single trajectory used for inference τ (with a fixed ∆t = 0.0001s). All the results are obtained for numerically generated data.
In Fig. 9, we demonstrate, how much the inferred value can differ from the true entropy production rate, if the sampling time-step of the trajectory is increased, for the TUR inference scheme in the Linear and Gaussian basis and the SFI scheme in the Linear basis. Results show that, if we use larger ∆t values, the inferred value can become significantly lower, as much as 2/3rd of the original value. We also find that when θ = 0.66, SFI performs slightly better as compared to the TUR scheme.  In Fig. 10, we plot the estimation results of our scheme as well as the SFI scheme for θ = 0.22 and θ = 0.66. The analysis shows that at least 10 4 data points (∼ 1s data with ∆t = 0.0001) are required to get a reliable estimate of the entropy production rate, using the shorttime inference scheme. We note that this is well within the capacity of current experiments [77]. We also notice that, when the amount of data is less (10 3 data points ∼ 0.1s with ∆t = 0.0001s), the SFI based inference scheme performs better. Finally, we demonstrate that our TUR-based shorttime inference scheme for the thermodynamic force field, from a single trajectory of a given length, is equivalent to the SFI based approach for computing the thermodynamic force field. We quantify this by using the Mean-Squared-Error (MSE) function defined as M SE = (Ĉ µν −C µν ) , as a function of the length of the steady state trajectory used for inference. The red squares correspond to the short-time inference scheme and the blue triangles correspond to the SFI scheme. Both schemes perform similarly in all cases.
In conclusion, we find that when the trajectory length is ∼ 1s with sampling rate ∆t = 0.0001, both the SFI and TUR-based schemes are equally good and give robust estimates of the entropy production rate, as well as the thermodynamic force field. If the trajectory length is ∼ 0.1s with sampling rate ∆t = 0.0001, SFI is seen to perform marginally better in estimating the entropy production rate. Similarly, for a given length of the trajectory, with a fixed, but lower sampling rate ∆t, we find that SFI performs better for a larger θ value. It would be very interesting to investigate these results further for systems with non-linear forces and space-dependent diffusion terms.

Hyper-parameter tuning for the Particle-Swarm optimizer
The specific algorithm we have used in this study is built upon the Particle swarm optimizer. We note that this choice is not very crucial, and Refs. [49] and [50] contain equivalent algorithms that can be used to perform the optimization task. In the algorithm we present here, the optimization is done using both a Linear and Gaussian basis, and as seen in Fig. 9 and Fig. 10, the choice of the basis function is not so crucial for the evaluation of σ for this problem with linear forces. We further looked at the effect of one of the crucial hyper-parameters in the algorithm, which is the Number of particles (N p ). As shown in Fig. 12, we find that the effect is negligibly small for the Linear basis, and the inferred value of σ remains more or less the same, independent of N p . For the Gaussian basis, we find that the estimation improves with the number of particles. We have used N p = 10, for most of our simulations using the Gaussian and the Linear basis. Yet another hyper-parameter relevant for inference, when we use the Gaussian basis, is the number of grid points (N g ) that we choose in one quadrant along both x and λ directions. In this case, the number of grid points determine the number of parameters in the optimization problem ( = numbers of Gaussian functions in the basis), and while more of them can slow down the algorithm, too few can lead to the thermodynamic force not being well resolved. In this work, we choose N x g = N λ g in one quadrant, and therefore the total number of grid points are M = 4×N x g ×N λ g . The plot below shows the dependence of the inference on the number of grid points. We plot the ratio of inferred σ vs. the number of grid points in the x (same in the λ) direction, for θ = 0.22, θ = 0.66. We note that the inference works slightly better when N x g = N λ g = 2. We have therefore used N x g = N λ g = 2 in this work. This material contains supplemental information for the results presented in the manuscript "Quantitative analysis of non-equilibrium systems from short-time experimental data".

I. The Stochastic Force Inference technique (SFI)
In this work, we have used the Stochastic force inference technique, introduced in [1] for bench-marking our results for both experimental setups. In the SFI technique, both the drift and diffusive terms of the stochastic equation are represented using an appropriate finite set of basis functions, and then the coefficients are inferred from the trajectory using projective techniques. Using these estimates of the drift and diffusion terms, one can also obtain thermodynamic quantities such as the entropy production in the stationary state. In this work, we used the SFI technique (in the linear basis) to associate a dynamical equation to the experimental trajectory data used in Fig. 2, and to obtain an independent estimate of the entropy production rate which then can be compared with the results from the short-time inference scheme. In these cases, the conversion factor for the trajectory data is fixed such that the x component of the diffusion matrix has the correct room temperature value. For the case θ = 0.22, we obtain, for one 100s trajectory and ∆t = 0.0001s, the drift and diffusion matrix to be, The diffusion matrix agrees well with the theoretical model we wanted to realize in experiment, by construction. The x component is fixed such that it agrees with the room temperature value (which gives us the conversion factor for the x -data ) of the Diffusion constant. The λ component of the Diffusion is set externally by the Ornstein-Uhlenbeck process used to control the mean position of the optical trap. The same applies to the λ component of the force matrix F. On the other hand, the x component of the Force matrix, is determined by the effective force the trapped particle experiences along the x direction. For both the θ values, the theoretical F matrix is supposed to be, for f c = 135 Hz and τ 0 = 0.0025s. As we see from Eq. (22) and Eq. (23), the Forces constructed from the trajectory data is different from the theoretical model. We think that the differences may be mainly due to the long experimental stretches we used in this study, during which the particle might have diffused along y or z directions in the trap, where it experiences a weaker confinement along the x direction. In this work however, we took into account this error, by directly comparing the results from inference with the results obtained from the SFI technique. We further computed an effective theoretical expression for the entropy production rate by taking the average F matrix over the six points in Fig. 2 Since it is a Linear model, we can still solve it and obtain an analytical expression for σ, as well as the thermodynamic Force Field using standard techniques [11]. We get, The thermodynamic force field is given by, Taking the F matrix elements from Eq. (25) and using D 11 = 1.645 × 10 −13 and using θ = D22 D11 , we obtain, σ(θ) = −4.13934 + 513.929 θ + 0.00833487 θ .
We use these expressions as the SFI theory model in Fig. 2 to obtain the theoretical estimate of σ for the experimental data, and as the theoretical Force field form used for comparison in Fig. 3. The red dashed lines in Fig. 2 account for the error bars in the terms in the F matrix in Eq. (25). Notice that, the expression for σ diverges in the θ → 0 limit, as opposed to the case in Eq. (14). This is due to having a small non-zero F 21 term in the Force matrix. If we treat this term to be negligible and set it to 0, the divergence goes away.
Here we would also like to point out one advantage of describing a non-equilibrium system in terms of the entropy production rate σ. The advantage is that σ does not change even if we multiply the individual components of a trajectory using a scaling factor. On the other hand, information such as the force acting on the system or the Diffusion constant, will be modified and scaled if the coordinates are transformed. In our case, we use this to our advantage when studying the case with the bubble (Figs. 5 and 6), where the exact conversion factors for the trajectory data are hard to obtain.
II. The heat dissipated in the medium for the case with the bubble In this work, we have obtained an estimate for the average total entropy production of a colloidal particle maintained in a steady state by being confined in a shaken trap (the stochastic sliding parabola model), under two different experimental conditions, namely without and with a microscopic bubble in the vicinity of the trap. The average total entropy production for a system in steady state is also the same as the heat dissipated by the system into the surrounding bath (at constant temperature T ). This heat dissipated includes the heat associated with keeping the system in a steady state (by shaking the trap) and, if there is a flow, the heat associated with the work done by the flow on the particle. As we argue below, the latter component cannot be obtained by the short-time inference scheme, and is related to a fundamental limitation of the applicability of the TUR [46] related to how the flow term is dealt with.
We begin with a possible generic form of the Langevin equation in the presence of the bubble, where u, τ and D are taken to be slowly varying functions of x, and essentially treated as constants (u d , τ d and D d ) at a distance d from the bubble, where the particle is trapped. First, we notice that under the transformations x → x = x − τ d u d , the above equations map to the Stochastic sliding parabola model, with the parameters τ = τ d and D = D d . This observation also demonstrates that for the above system, the mean position of the particle is no longer at the center of the trap, but is instead x = u d τ d . Now we look at the entropy production in this system, using the standard definitions in Stochastic thermodynamics.
Since the system is in a stationary state, the actual rate of entropy production can be obtained in terms of the heat (q) dissipated to the medium at a temperature T as, However there is an ambiguity on how to obtain the correct value of σ, arising from two choices of transformations for the flow term under time-reversal [66]. The first approach is to let the flow term reverse it's sign under time-reversal, as physically meaningful for a velocity variable. This gives an estimate of medium entropy production [66] as, The observed trajectories of the colloidal particle, on the other hand, only show the effect of the flow u d as a constant external force acting on the system, which only amounts to shifting the mean position of the colloidal particle in the direction of the flow. This leads to a second (naive) approach to the entropy production in this system as, The physical distinction between the two definitions is as follows: when there is a background flow in the medium, this flow has to constantly do work against the confining potential to maintain the particle in it's "new" average position.This is an additional contribution to entropy production, that is only accounted for in the definition in Eq. (32). In other words, the particle trajectories do not carry information about this and hence the short-time inference scheme, which is based on TUR and the information carried by particle trajectories, only predicts the quantity σ in Eq. (33). σ and σ are related by, When the flow velocity u d = 0, they are the same.
Currents in the non-equilibrium stationary state Systems in a non-equilibrium stationary state are characterized by a non-vanishing current in the phase space [3]. For the colloidal system we consider, these currents can be estimated from the trajectory data as, Using Eq. (35) we estimate currents in the case when the bubble is present in the vicinity of the optical trap. We find that the phase space currents are reduced in magnitude. We demonstrate this with surface plots of the two components of the currents in Fig. 14 for the case discussed in Figure 5 in the main text.