## Main

To illustrate the basic idea of our feedback protocol, let us consider a microscopic particle on a spiral-staircase-like potential (Fig. 1). We set the height of each step comparable to the thermal energy kBT, where kB is the Boltzmann constant and T is the temperature. Subjected to thermal fluctuations, the particle jumps between steps stochastically. Although the particle sometimes jumps to an upper step, downward jumps along the gradient are more frequent than upward jumps. Thus, on average, the particle falls down the stairs unless it is externally pushed up (Fig. 1a). Now, let us consider the following feedback control: we measure the particle’s position at regular intervals, and if an upward jump is observed, we place a block behind the particle to prevent subsequent downward jumps (Fig. 1b). If this procedure is repeated, the particle is expected to climb up the stairs. Note that, in the ideal case, the energy required to place the block can be negligible; this implies that the particle can obtain free energy without any direct energy injection. In such a case, what drives the particle to climb up the stairs? This apparent contradiction to the second law of thermodynamics, epitomized by Maxwell’s demon, inspired many physicists to generalize the principles of thermodynamics1,5,6. It is now understood that the particle is driven by the ‘information’ gained by the measurement of the particle’s location5,8.

Figure 1: Schematic illustration of the experiment.

In microscopic systems, thermodynamic quantities such as work, heat and internal energy do not remain constant but fluctuate9,10. In fact, stochastic violations of the second law have been observed11,12; nonetheless, the second law still holds, on average, if the initial state is in thermal equilibrium: 〈ΔF−W〉≤0, where ΔF is the free-energy difference between states, W is the work done on the system and 〈·〉 denotes the ensemble average. However, feedback control enables us to selectively manipulate only those fluctuations for which ΔF−W>0, such as upward jumps, by using information about the system13,14,15. Here, ‘feedback’ means that control protocols depend on measurement outcomes of the controlled system; in other words, ‘feedback control’ means ‘closed-loop control’16. Our gedanken experiment shows that, by employing feedback control, information can be used as a resource for free energy. In fact, Szilárd developed a model that converts one bit of information about the system to kBTln2 of free energy or work1. In other words, the second law is generalized17 as follows:

$$\langle \Delta F - W \rangle \le k_{\mathrm B} T\, I \tag{1}$$

Here I is the mutual information content obtained by measurements6,18 (see Methods). So far, the idea of a simple thermal rectification by feedback control has found applications such as the reduction of thermal noise15 and the rectification of an atomic current at low temperature13. On the other hand, the Szilárd-type Maxwell demon enables us to evaluate both the input (used information content) and the output (obtained energy) of the feedback control and relate them operationally. Therefore, it has provided an ideal testing ground for information-to-energy conversion and played a crucial role in the foundations of thermodynamics. However, its experimental realization has been elusive.
In this experiment, we develop a new method to evaluate the information contents and thermodynamic quantities of feedback systems and demonstrate the Szilárd-type information-to-energy conversion for the first time using a colloidal particle on a spiral-staircase-like potential.

A dimeric particle comprising polystyrene beads (diameter=287 nm) was attached to the top glass surface of a chamber filled with a buffer solution (Fig. 2a). The particle was pinned at a single point by a linker molecule; it exhibited rotational Brownian motion (Supplementary Fig. S2). By using quadrant electrodes imprinted on the bottom glass plate, we imposed 1 MHz electric fields to simultaneously create periodic potentials and constant torque on the particle along the angle of rotation. By using this new method, a tilted periodic potential with an ideal sinusoidal shape for the particle can be achieved, which is a realization of the spiral-staircase-like potential mentioned above (Fig. 2b, see also Supplementary Information). A feedback control was carried out under a microscope by constructing a real-time feedback system including video capture, image analysis, potential modulation and data storage. We repeated the following feedback cycle with a period of τ=44 ms and a minimum feedback delay of 1.1 ms, as illustrated in Fig. 2c. At t=0, the particle’s angular position is measured. If the particle is observed at the angular region indicated as ‘S’, the potential is changed to that with an opposite phase at t=ɛ; otherwise, no action is taken. At t=τ, the next cycle begins with the measurement of the angular position. Region S was chosen for its energy advantage; in region S, the potential energy before switching is always higher than that after switching. In the case of small ɛ, the particle is expected to be at rest around region S just before the switching at t=ɛ and then jump to the rightward well of the switched potential after the switching. On the other hand, for large ɛ, the particle falls down in the well away from region S before the switching. In this case, with a large probability, the particle jumps down to the leftward well of the switched potential after the switching. 
In this manner, the feedback delay ɛ regulates the efficiency of the feedback control. Note that, as τ=44 ms is sufficiently larger than the relaxation time in each well (10 ms) and smaller than the typical time to jump to neighbour wells (1 s), each feedback cycle is supposed to be a transition between equilibrium states.
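The decision rule of one feedback cycle can be sketched in a few lines. This is a minimal illustration, not the control software used in the experiment: the potential amplitude `U0`, the tilt `TILT`, the window width of region S and all function names are hypothetical, and the potential is assumed to be a tilted sinusoid with a 180° period whose phase inversion shifts the wells by 90°. Region S is assumed to be a narrow window on the uphill side just before a potential peak, so that the potential energy before switching is higher than that after switching.

```python
import math

# All numerical values are hypothetical; the text gives neither the amplitude,
# the tilt, nor the boundaries of region S. Energies are in units of kB*T.
U0 = 1.5      # half-amplitude of the periodic part (assumed)
TILT = 0.6    # slope of the linear tilt, per radian (assumed)
S_WIDTH = math.pi / 8  # angular width of region S (assumed)


def potential(x, phase):
    """Tilted sinusoidal potential with a 180-degree period; phase is +1 or -1."""
    return phase * U0 * math.cos(2.0 * x) + TILT * x


def switching_work(x, phase):
    """Work done on the particle if the phase is inverted while it sits at x."""
    return potential(x, -phase) - potential(x, phase)


def in_region_s(x, phase):
    """True if x lies in the assumed window just below a peak on the uphill side."""
    first_peak = 0.0 if phase == 1 else math.pi / 2
    offset = (x - first_peak) % math.pi  # position within one period, in [0, pi)
    return offset > math.pi - S_WIDTH


def feedback_step(x_measured, phase):
    """Phase to apply at t = epsilon, given the angle measured at t = 0."""
    return -phase if in_region_s(x_measured, phase) else phase


x = math.pi - 0.1                      # particle found just below a peak
print(in_region_s(x, 1), switching_work(x, 1), feedback_step(x, 1))
```

With these assumptions the switching work is negative everywhere inside region S, so a switch triggered there lets the particle do work on the field, whereas a particle found near the bottom of a well leaves the potential unchanged.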

In Fig. 3a, typical trajectories with the feedback control are shown. The trajectories are stepwise with a step size of 90°, which reflects the potential profile (Fig. 3b). We find that for small ɛ the particle rotates unidirectionally while climbing up the potential, whereas for large ɛ the particle goes down along the gradient. The rotation rate decreases monotonically with ɛ, as expected (Fig. 3c).

Figure 3: Trajectories, mean velocities and excess free energy under feedback control.

We then focused on the energetics during a cycle. In Fig. 3d, we show the difference between the obtained free energy ΔF and the work done on the particle by the switching, W, averaged over a cycle (see Methods). We find that 〈ΔF−W〉>0 for small ɛ; this implies that, by absorbing heat, the particle gains free energy exceeding the work done on it, beyond the conventional limit set by the second law of thermodynamics. For small ɛ, the switching mostly occurs when the particle is in region S. In such cases, the particle absorbs heat from an isothermal environment to reach region S before the measurement at t=0, then does work on the electric field at the switching, and finally jumps to the rightward well after the switching (Supplementary Fig. S7). Although such an event is not prohibited even if we randomly switch potentials without feedback control, it is typically an accidental and rare event in accordance with the second law of thermodynamics or the fluctuation theorem19,20,21. However, the feedback control can increase the likelihood of occurrence of such an event. This is the crux of the control by Maxwell’s demon. The resource of the excess free energy is the information obtained by the measurement. If the estimation error of the particle’s angular position is negligible, the amount of information is characterized by the Shannon information content I (ref. 22). In this study, I=−plnp−(1−p)ln(1−p), where p is the probability that the particle is observed in region S (see Methods). As noted in (1), I can be converted to free energy of up to kBT I (ref. 1). In our system, for the shortest feedback delay (ɛ=1.1 ms), p, I and 〈ΔF−W〉 were 0.059, 0.22 and 0.062 kBT, respectively (Supplementary Fig. S9). This gives the efficiency of the information-to-energy conversion as 〈ΔF−W〉/kBT I=28%. An efficiency of 100% can be achieved by quasistatic information heat engines such as the Szilárd engine1.
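The quoted information content and efficiency can be checked directly from the reported p and excess free energy. The short script below is only an arithmetic check; the function and variable names are ours.

```python
import math


def binary_entropy(p):
    """Shannon information content (in nats) of a two-outcome measurement."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


p_in_s = 0.059        # probability of observing the particle in region S
excess_free = 0.062   # reported <dF - W>, in units of kB*T

info = binary_entropy(p_in_s)      # upper bound on <dF - W> / (kB*T)
efficiency = excess_free / info    # fraction of the bound actually obtained

print(f"I = {info:.2f} nats, efficiency = {efficiency:.0%}")
```

Using the unrounded value of I gives an efficiency slightly below the ratio of the rounded numbers, but both round to 28%.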

Although the second law concerns only the average, or the first-order cumulant, of the stochastic quantity ΔF−W, Jarzynski pointed out that the second law naturally emerges as the first-order cumulant expansion of the following equality that involves ΔF−W to all orders23,24: 〈exp[(ΔF−W)/kBT]〉=1. Recently, the Jarzynski equality, which assumes a prescribed control scheme, was generalized to systems with feedback control as follows7 (see Methods for a heuristic derivation):

$$\left\langle e^{(\Delta F - W)/k_{\mathrm B}T} \right\rangle = \gamma \tag{2}$$

where γ is an experimentally measurable quantity, defined as the sum, over all possible protocols, of the probabilities that the time-reversed trajectories are observed under the time-reversed protocols (see Methods). From its definition, 0≤γ≤2 in our system. Whereas I concerns the information obtained by the measurements, γ quantifies how efficiently we use the obtained information for the feedback control. If we control the system perfectly and deterministically, a time-reversed trajectory is always realized under the time-reversed protocol; γ then takes its maximum value. We repeated time-reversed cycles with and without switchings to obtain γ with a period of 220 ms, which is sufficiently long to ensure that the initial state of the cycle is relaxed to equilibrium.
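How an equality for the exponential average yields an inequality for the mean can be made explicit with Jensen's inequality. The lines below are the standard textbook argument rather than a derivation taken from this paper, and the shorthand σ is introduced here only for brevity.

$$
\sigma \equiv \frac{\Delta F - W}{k_{\mathrm B}T}, \qquad
\left\langle e^{\sigma} \right\rangle = \gamma
\;\;\Longrightarrow\;\;
e^{\langle \sigma \rangle} \le \left\langle e^{\sigma} \right\rangle = \gamma
\;\;\Longrightarrow\;\;
\langle \Delta F - W \rangle \le k_{\mathrm B}T \ln \gamma .
$$

Setting γ=1 gives back the conventional second law, 〈ΔF−W〉≤0. With the bound γ≤2 stated for this system, the average excess free energy per cycle could not exceed kBT ln2 under this argument.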

Figure 4 shows that the conventional Jarzynski equality is violated in the presence of the feedback control. For large ɛ where 〈ΔF−W〉≤0, the second law holds on average, but the Jarzynski equality is violated. On the other hand, the generalized Jarzynski equality (2) holds over a broad range of ɛ (Fig. 4a), showing that equality (2) expresses the effect of feedback control to all orders. γ seems to converge to unity in the limit of infinite ɛ; here, the angular position at the switching becomes independent of that at the measurement, and the conventional Jarzynski equality is recovered. For a closer examination, we plotted the discrepancy between γ and 〈exp[(ΔF−W)/kBT]〉 and the convergence of the latter in Fig. 4b and c, respectively. The small discrepancy, less than 3%, for small ɛ (Fig. 4b) presumably results from the definition of the states. The equality assumes that each cycle starts from an equilibrium state. However, as the probability that the particle escapes from the well is not zero, an equilibrium state cannot be realized in a precise sense. Although the typical escape time (1 s) is much larger than the period of the cycle (44 ms), it is possible that such a small discrepancy arises. It is known that a large number of cycles is necessary for the Jarzynski equality to converge owing to the exponential average24. We repeated more than 100,000 cycles for each ɛ and confirmed the convergence of the left-hand side of (2) (Fig. 4c). The validity of (2) verifies a new fundamental principle of an ‘information-heat engine’, which converts information to free energy, in terms of all orders.

Figure 4: Verification of the generalized Jarzynski equality.
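The slow convergence of an exponential average can be seen in a toy model. The sketch below does not use experimental data: it assumes, purely for illustration, that σ=(ΔF−W)/kBT is Gaussian with standard deviation `s` and mean −s²/2, for which 〈exp σ〉 equals 1 exactly (the case without feedback). The seed and sample sizes are arbitrary.

```python
import math
import random


def exponential_average(samples):
    """Estimate <exp(sigma)> from a list of sigma values."""
    return sum(math.exp(v) for v in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the run is reproducible
s = 1.0                 # assumed standard deviation of sigma (hypothetical)

# Toy Gaussian model: a mean of -s^2/2 makes <exp(sigma)> exactly 1.
sigma = [rng.gauss(-0.5 * s * s, s) for _ in range(200_000)]
mean_sigma = sum(sigma) / len(sigma)

for n in (100, 10_000, 200_000):
    print(n, exponential_average(sigma[:n]))
print("mean sigma:", mean_sigma)
```

The mean of σ is negative, as the second law requires, while the exponential average approaches 1 only as rare positive values of σ are sampled often enough. This is why a large number of cycles is needed.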

As the energy converted from information is compensated for by the demon’s energy cost to manipulate information2,3,4, the second law of thermodynamics is not violated when the total system including both the particle and demon is considered. In our system, the demon consists of macroscopic devices such as computers; the microscopic device gains energy at the expense of the energy consumption of a macroscopic device. In other words, by using information as the energy-transferring ‘medium’, this information-to-energy conversion can be used to transport energy to nanomachines25,26 even if it is not possible to drive them directly (Supplementary Fig. S1). The next step will be to extract work from the obtained free energy explicitly by coupling the system with a microscopic transducer. This can cause a further loss of the conversion efficiency. However, in this study, compared with the obtained free energy of kBT, a huge amount of energy was consumed for the information processing at the macroscopic level. The future challenge is to realize a nanoscale information-processing device such as an artificial molecular motor27, in which both the demon and the controlled system are microscopic.

## Methods

### Experimental set-up.

A dimeric particle composed of polystyrene particles (287 nm diameter, Seradyn) was non-specifically attached to the top glass surface by means of a streptavidin linker coated on the particle’s surface. To impose a tilted periodic potential on the particle, an elliptically rotating electric field was induced by applying 1 MHz sinusoidal voltages to the quadrant electrodes patterned on the bottom glass surface. The direction of the long axis of the elliptically rotating electric field corresponds to the local minima of the potential. By changing the direction of this axis, we inverted the phase of the potential. The particle was observed on an upright microscope equipped with a high-speed camera at a period of 1.1 ms with an exposure time of 0.3 ms. Potentials were measured from transition probabilities. More than 100,000 feedback cycles were carried out for each feedback delay (ɛ). See Supplementary Information for details.

### Free energy and work.

Each potential well separated by peaks was defined as a state (Supplementary Fig. S6). The free energy of state k was calculated as

$$F_k = -k_{\mathrm B}T \ln \int_{k} \exp\!\left[-\frac{U(x)}{k_{\mathrm B}T}\right] \mathrm{d}x ,$$

where U(x) is the potential energy at angular position x, and the integration is carried out in the angular region corresponding to state k. As the shapes of all the wells are almost the same, the free-energy difference between states is nearly equal to the difference of the potential energies of their local minima. The work done on the particle, W, was calculated as the potential-energy change associated with the switching: the potential energy after the switching minus that before the switching. In cycles without switching, W=0.
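The statement that wells of equal shape differ in free energy by the difference of their minima can be checked numerically. The sketch below assumes the standard definition of a state's free energy as −kBT times the logarithm of the Boltzmann-weighted integral over the state's angular region. The harmonic well shape, its stiffness and the integration windows are hypothetical stand-ins for the measured potential, and energies are in units of kBT.

```python
import math


def state_free_energy(potential, x_lo, x_hi, n=2000):
    """F_k / kBT = -ln of the integral of exp(-U/kBT) over [x_lo, x_hi] (midpoint rule)."""
    dx = (x_hi - x_lo) / n
    z = sum(math.exp(-potential(x_lo + (i + 0.5) * dx)) for i in range(n)) * dx
    return -math.log(z)


def well(x, x_min, u_min, stiffness=8.0):
    """Hypothetical harmonic well with its minimum u_min located at x_min."""
    return u_min + 0.5 * stiffness * (x - x_min) ** 2


# Two wells of identical shape whose minima differ by 0.9 kBT (made-up value).
f_a = state_free_energy(lambda x: well(x, 0.0, 0.0), -0.7, 0.7)
f_b = state_free_energy(lambda x: well(x, 1.5, 0.9), 0.8, 2.2)
delta_f = f_b - f_a

print(delta_f)
```

Because the Boltzmann factor of the second well is that of the first multiplied by a constant, the integral over the well shape cancels in the difference and only the offset of the minima remains.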

### Information content.

For an event k with a probability of occurrence p(k), the Shannon information content associated with this event is defined as −lnp(k). This definition leads to well-defined properties that the information content should satisfy22. The average Shannon information content becomes

$$I = -\sum_k p(k) \ln p(k) .$$

Measurements are usually accompanied by errors, which reduce the amount of information that can be used. Whereas I denotes the amount of information embedded in the system, the mutual information content, I′, denotes the amount of information that is obtained by the measurement7,18:

$$I' = \sum_{k,m} p(m|k)\, p(k) \ln \frac{p(m|k)}{p(m)} ,$$

where p(m|k) is the conditional probability that the outcome of the measurement is the mth event when the kth event actually occurs, and p(m)=Σk p(m|k)p(k) is the probability of outcome m. If the measurement is free from error, p(m|k)=δk,m (δk,m=1 if k=m, and otherwise 0). In such a case, I′=I. In the present experiment, we distinguished two events: the particle is observed in region S or not, with negligible measurement errors. Then, the (average) Shannon information content per cycle becomes the so-called binary entropy function: I=−plnp−(1−p)ln(1−p), where p is the probability that the particle is observed in region S.
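The effect of measurement errors on the usable information can be illustrated with the standard definition of mutual information between the true event k and the outcome m. In the sketch below the prior uses the value p=0.059 reported in the main text, whereas the 10% error rate of the noisy channel is a made-up number chosen only to show the reduction. The function names are ours.

```python
import math


def shannon(p):
    """Average Shannon information content (nats) of a distribution p(k)."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def mutual_information(p_k, p_m_given_k):
    """Mutual information (nats); p_m_given_k[k][m] is p(m|k)."""
    n_m = len(p_m_given_k[0])
    # Marginal probability of each outcome: p(m) = sum_k p(m|k) p(k)
    p_m = [sum(p_k[k] * p_m_given_k[k][m] for k in range(len(p_k)))
           for m in range(n_m)]
    total = 0.0
    for k, pk in enumerate(p_k):
        for m in range(n_m):
            joint = pk * p_m_given_k[k][m]
            if joint > 0.0:
                total += joint * math.log(p_m_given_k[k][m] / p_m[m])
    return total


p_k = [0.059, 0.941]                 # in region S / outside region S
perfect = [[1.0, 0.0], [0.0, 1.0]]   # error-free measurement
noisy = [[0.9, 0.1], [0.1, 0.9]]     # hypothetical 10% misreading rate
blind = [[0.5, 0.5], [0.5, 0.5]]     # outcome independent of the true event

print(shannon(p_k), mutual_information(p_k, perfect))
print(mutual_information(p_k, noisy), mutual_information(p_k, blind))
```

An error-free measurement returns the full Shannon information content, any error lowers it, and an outcome that is independent of the true event carries none.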

### Generalized Jarzynski equality and feedback efficacy.

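Let us consider the situation in which we make error-free measurements and divide the phase space of the particle into several regions. Then, in each region, a more detailed expression of the Jarzynski equality holds28:

$$\left\langle e^{(\Delta F - W)/k_{\mathrm B}T} \right\rangle_A = \frac{\tilde P(A)}{P(A)} ,$$

where $\langle\cdot\rangle_A$ is the ensemble average over trajectories under the condition that the particle is observed in region A (A=S or outside S in our set-up) with probability $P(A)$, and $\tilde P(A)$ is the probability that the particle is observed in A under the time-reversed control protocol. Without feedback control, this detailed equality reproduces the Jarzynski equality as

$$\left\langle e^{(\Delta F - W)/k_{\mathrm B}T} \right\rangle = \sum_A P(A) \left\langle e^{(\Delta F - W)/k_{\mathrm B}T} \right\rangle_A = \sum_A \tilde P(A) = 1 .$$

In contrast, with feedback control, $\tilde P_A(A)$, the probability of observing the particle in A under the time reverse of the protocol applied when the outcome is A, is no longer a single probability distribution in terms of A, because the control protocols depend on A. Therefore, $\sum_A \tilde P_A(A)$ is not necessarily equal to unity. In such cases, the Jarzynski equality needs to be generalized to (2), where $\gamma = \sum_A \tilde P_A(A)$.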

We measured the feedback efficacy, γ, as follows (see Supplementary Fig. S8). In the forward feedback cycle, we measured the particle’s angular position at t=0 and (1) switched or (2) did not switch the potential at t=ɛ depending on the angular position. In the corresponding time-reversed trajectories, the particle is observed at t=τ (1) in region S after a switching at t=τ−ɛ or (2) outside S without switching. Let the occurrence probabilities of these time-reversed trajectories under the time-reversed protocols be psw and pns, respectively. Then γ=psw+pns. From its definition, if there are m states to be distinguished (m=2 in our experiment: whether the particle is in region S or not), 0≤γ≤m. We repeated time-reversed cycles with/without switchings to obtain γ with a period of 220 ms, which is sufficiently long to ensure that the initial state of the cycle is relaxed to equilibrium.
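In practice, the two probabilities would be estimated as frequencies over repeated time-reversed cycles. The sketch below shows one way to do this bookkeeping; the counts are invented for illustration and are not experimental results, and the binomial error estimate assumes independent cycles, which the text does not state.

```python
import math


def feedback_efficacy(sw_hits, sw_trials, ns_hits, ns_trials):
    """Return (gamma, standard error) from counts of time-reversed cycles.

    sw_hits: cycles with switching that ended with the particle in region S
    ns_hits: cycles without switching that ended with the particle outside S
    """
    p_sw = sw_hits / sw_trials
    p_ns = ns_hits / ns_trials
    gamma = p_sw + p_ns
    # Binomial standard errors, combined assuming independent cycles
    err = math.sqrt(p_sw * (1.0 - p_sw) / sw_trials
                    + p_ns * (1.0 - p_ns) / ns_trials)
    return gamma, err


# Hypothetical counts, chosen only to exercise the function
gamma, gamma_err = feedback_efficacy(1200, 10_000, 9300, 10_000)
print(gamma, gamma_err)
```

Each term is a probability, so the estimate always lies between 0 and 2 for the two-outcome measurement described here.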