Coherence and measurement in quantum thermodynamics

Thermodynamics is a highly successful macroscopic theory widely used across the natural sciences and for the construction of everyday devices, from car engines to solar cells. With thermodynamics predating quantum theory, research now aims to uncover the thermodynamic laws that govern finite-size systems which may in addition host quantum effects. Recent theoretical breakthroughs include the characterisation of the efficiency of quantum thermal engines, the extension of classical non-equilibrium fluctuation theorems to the quantum regime, and a new thermodynamic resource theory that has led to the discovery of a set of second laws for finite-size systems. These results have substantially advanced our understanding of nanoscale thermodynamics; however, putting a finger on what is genuinely quantum in quantum thermodynamics has remained a challenge. Here we identify information processing tasks, the so-called projections, that can only be formulated within the framework of quantum mechanics. We show that the physical realisation of such projections can come with a non-trivial thermodynamic work only for quantum states with coherences. This contrasts with information erasure, first investigated by Landauer, for which a thermodynamic work cost applies for classical and quantum erasure alike. Repercussions on quantum work fluctuation relations and thermodynamic single-shot approaches are also discussed.

The initial state of the spin is
$$\rho = a\,|0\rangle\langle 0| + (1-a)\,|1\rangle\langle 1| = \tfrac{1}{2}\left(\mathbb{1} + (2a-1)\,\hat{0}\cdot\vec{\sigma}\right),$$
where $\vec{\sigma}$ is the vector of the three Pauli matrices, $\sigma_1$, $\sigma_2$ and $\sigma_3$, and $\hat{0} = \mathrm{Tr}[\,|0\rangle\langle 0|\,\vec{\sigma}\,]$ is the unit vector in the Bloch sphere pointing from the origin to the state $|0\rangle$, see Fig. SI.1a. We assume without loss of generality that $a \ge \tfrac{1}{2}$. If this were not the case, the labels $|0\rangle\langle 0|$ and $|1\rangle\langle 1|$ should be interchanged. The spin's initial Hamiltonian is given by $H = -E\,(\Pi^H_0 - \Pi^H_1)$, where $\Pi^H_k = |e_k\rangle\langle e_k|$ with $k = 0, 1$ are the rank-1 projectors onto the two energy eigenstates and $E > 0$. This Hamiltonian arises when the spin is exposed to an external magnetic field $\vec{B}^{(0)}$. The energy separation of the aligned ground state, $|e_0\rangle$, and the anti-aligned excited state, $|e_1\rangle$, is $2E = 2\,|\vec{\mu}|\,|\vec{B}^{(0)}|$, where $\vec{\mu}$ is the magnetic moment of the spin. A general initial state $\rho$ is not diagonal in the basis $\{|e_0\rangle, |e_1\rangle\}$; in other words, the eigenstates of $\rho$ are superpositions with respect to the energy eigenbasis, $|0\rangle = \alpha^*|e_0\rangle + \beta^*|e_1\rangle$ and $|1\rangle = \beta\,|e_0\rangle - \alpha\,|e_1\rangle$ with $|\alpha|^2 + |\beta|^2 = 1$. The spin's Bloch vector, $\vec{s}_\rho = (2a-1)\,\hat{0}$, is then not parallel to the B-field, $\vec{B}^{(0)}$.
Emmy wants to obtain the state where the coherences with respect to the energy basis $\{|e_0\rangle, |e_1\rangle\}$ have been removed,
$$\eta^H = p\,|e_0\rangle\langle e_0| + (1-p)\,|e_1\rangle\langle e_1| = \tfrac{1}{2}\left(\mathbb{1} + (2p-1)\,\hat{e}_0\cdot\vec{\sigma}\right),$$
where $p = \mathrm{Tr}[\,|e_0\rangle\langle e_0|\,\rho\,]$ and $\hat{e}_0 = \mathrm{Tr}[\,|e_0\rangle\langle e_0|\,\vec{\sigma}\,]$ is the unit vector in the Bloch sphere pointing from the origin to the state $|e_0\rangle$. Since geometrically the mapping $\rho \to \eta^H$ is a projection of $\vec{s}_\rho$ onto the vertical axis in the Bloch sphere, the final Bloch vector, $\vec{s}_\eta$, is shorter than the initial Bloch vector, $\vec{s}_\rho$. This shortening is associated with an entropy increase [1]. When describing the process in the following we assume that $p \ge \tfrac{1}{2}$ in accordance with the illustration in Fig. SI.1a. At the end of this section we come back to the case $p < \tfrac{1}{2}$. Emmy proceeds with three steps made up of quantum thermodynamic primitives with known work and heat contributions [2],
$$(\rho, H) \xrightarrow{\,1\,} (\rho_1, H^{(1)}) \xrightarrow{\,2\,} (\eta^H, H^{(2)}) \xrightarrow{\,3\,} (\eta^H, H). \qquad \text{(SI.6)}$$
In the first step, $(\rho, H) \xrightarrow{\,1\,} (\rho_1, H^{(1)})$, Emmy isolates the spin from the bath and rotates the B-field such that the variation of the field induces a unitary transformation of the spin into the energy eigenbasis, with unitary $V = |e_0\rangle\langle 0| + |e_1\rangle\langle 1|$. The state after this step is
$$\rho_1 = V \rho\, V^\dagger = a\,|e_0\rangle\langle e_0| + (1-a)\,|e_1\rangle\langle e_1|. \qquad \text{(SI.7)}$$
Figure SI.1: Illustration of the optimal three step process in the Bloch sphere (a) and in configuration space (b). For readability the superscript of $\eta$ has been dropped. a, The Bloch vector of a spin-1/2 state, $\vec{s}_\rho$, is shown as the black arrow in the sphere, and it is exposed to an external B-field, $\vec{B}^{(0)}$, indicated on the left. The first step rotates $\vec{s}_\rho$ on the green-dashed circle to the green arrow $\vec{s}_{\rho_1}$, while the B-field changes to $\vec{B}^{(1)}$. The second step shortens $\vec{s}_{\rho_1}$ to $\vec{s}_\eta$ while the B-field decreases to $\vec{B}^{(2)}$. In the last step the B-field returns to its initial value, $\vec{B}^{(0)}$, while the state remains $\eta^H$. b, Thermodynamic steps can be illustrated in the configuration space of pairs of states and Hamiltonians [2]. Unitary evolutions are shown as blue arrows while thermalization processes are indicated by red horizontal arrows. Thermal states are denoted by red circles and non-equilibrium configurations by blue squares. The three step process is also optimal for any finite-dimensional quantum system (see Methods Summary). It starts with a unitary transforming the initial non-equilibrium configuration $(\rho, H)$ to the thermal configuration $(\rho_1, H^{(1)})$. A quasi-static process then brings $(\rho_1, H^{(1)})$ to $(\eta^H, H^{(2)})$, illustrated as infinitesimally small steps consisting of unitary evolution followed by thermalization. Finally, a unitary quench from the thermal configuration $(\eta^H, H^{(2)})$ to the non-equilibrium configuration $(\eta^H, H)$ concludes the process.
The B-field after this step, $\vec{B}^{(1)}$, is chosen such that the new Hamiltonian is $H^{(1)} = -E^{(1)}\,(\Pi^H_0 - \Pi^H_1)$ with $E^{(1)} = \frac{k_B T}{2} \ln\frac{a}{1-a}$, where $k_B$ is the Boltzmann constant and $T$ is the temperature of the heat bath that Emmy will use in the next step. This choice of the B-field makes the state $\rho_1$ a thermal state with respect to $H^{(1)}$ at temperature $T$, i.e. $\rho_1 = e^{-\beta H^{(1)}}/Z^{(1)}$ with $Z^{(1)} = \mathrm{Tr}[e^{-\beta H^{(1)}}]$ and inverse temperature $\beta = \frac{1}{k_B T}$ (not to be confused with the amplitude $\beta$ used above). Since the system is isolated in the first step, no heat exchange is possible and the entire average energy change of the system is extracted as work, $W^{(1)} = -\mathrm{Tr}[\rho_1 H^{(1)} - \rho H]$. Physical constraints may make this process difficult to realise. For instance, pure initial states would require a B-field, $\vec{B}^{(1)}$, of infinite magnitude because thermal states at any finite temperature are only pure if the energy gap is infinite. In this case there is a trade-off between the maximal magnitude the B-field can reach and the precision with which the process is carried out. In the following we assume that the maximal B-field is large enough to make the error in the precision negligibly small.
In the second step, $(\rho_1, H^{(1)}) \xrightarrow{\,2\,} (\eta^H, H^{(2)})$, Emmy brings the spin in contact with the bath at temperature $T$, which does not affect the spin's state as it is already thermal. She then quasi-statically decreases the magnitude of the B-field, while keeping the system in contact with the bath at all times, such that the final Hamiltonian is $H^{(2)} = -E^{(2)}\,(\Pi^H_0 - \Pi^H_1)$ with $E^{(2)} = \frac{k_B T}{2} \ln\frac{p}{1-p}$, where $p$ is the probability of measuring $-E$ in the initial state, $\rho$. The quasi-static evolution means that the system is thermalised at all times, arriving in the final state $e^{-\beta H^{(2)}}/Z^{(2)} = p\,|e_0\rangle\langle e_0| + (1-p)\,|e_1\rangle\langle e_1|$, which is thermal with respect to $H^{(2)}$, where $Z^{(2)} = \mathrm{Tr}[e^{-\beta H^{(2)}}]$. This state is exactly $\eta^H$, the desired final state after the projection. The quasi-static process considered here has a known average work given by the free energy difference [2-4], $W^{(2)} = k_B T \ln\frac{Z^{(2)}}{Z^{(1)}}$.
Finally, in the third step, $(\eta^H, H^{(2)}) \xrightarrow{\,3\,} (\eta^H, H)$, Emmy isolates the spin from the bath and changes the energy levels of the Hamiltonian such that it becomes the initial Hamiltonian $H$ again. This step is done quickly so that the state of the spin does not change. Because the system is isolated the energy change in this step is entirely due to work, $W^{(3)} = -\mathrm{Tr}[\eta^H (H - H^{(2)})]$. In total, this thermodynamic process has brought the spin from the configuration $(\rho, H)$ to the configuration $(\eta^H, H)$ while not changing the energy of the spin, $\mathrm{Tr}[(\rho - \eta^H)\,H] = 0$. The overall average work drawn from the spin is
$$W = W^{(1)} + W^{(2)} + W^{(3)} = k_B T\,\left(S(\eta^H) - S(\rho)\right),$$
showing the optimality of the three step process for the spin example, cf. Eq. (1) in the main text.
The above example assumed $p \ge \tfrac{1}{2}$. Suppose now that the probability to find the final state $\eta^H$ in the ground state $|e_0\rangle$ with respect to the Hamiltonian $H$ was smaller than to find it in the excited state $|e_1\rangle$, i.e. $p < \tfrac{1}{2}$. Proceeding through the three steps described one finds that the mathematics is exactly the same. In particular, after Step 2 $\eta^H$ is a thermal state with respect to $H^{(2)}$ at inverse temperature $\beta$.
The only difference lies in the interpretation: for the Hamiltonian $H^{(2)}$ the ground state is now $|e_1\rangle$ because $E^{(2)} = \frac{k_B T}{2} \ln\frac{p}{1-p} < 0$. This is feasible by making the B-field $\vec{B}^{(2)}$ negative, thus swapping the ground and the excited state. Consequently the analysis above and the resulting expression of the total extracted work remain the same.
The work extracted in the individual steps of the thermodynamic projection process can be either positive or negative, depending on the initial state $\rho$, the Hamiltonian $H$ and the temperature $T$ of the heat bath. Their sum, $W$, is strictly positive whenever the initial state was not diagonal in the energy eigenbasis, a consequence of the entropy increase [1] from $\rho$ to $\eta^H$. On the other hand, for classical states (all diagonal in the energy basis) the optimal work for such a projection is always zero. The Methods Summary in the main text extends the optimality proof of the above three step process, illustrated in Fig. SI.1b, to the general finite-dimensional case.
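As a numerical sanity check of the three steps above, the following minimal Python sketch adds up the three work contributions for a qubit and compares the sum with $k_B T\,(S(\eta^H) - S(\rho))$. The function names, the default values $E = 1$ and $k_B T = 1$, and the parametrisation by the Bloch angle `theta` between $\vec{s}_\rho$ and the energy axis are assumptions made for illustration only. The sketch requires $\tfrac{1}{2} < a < 1$ and $\tfrac{1}{2} < p < 1$ so that $E^{(1)}$ and $E^{(2)}$ are finite.

```python
import math

def binary_entropy(x):
    """Entropy in nats of a qubit state with eigenvalues x and 1 - x."""
    return -sum(q * math.log(q) for q in (x, 1.0 - x) if q > 0.0)

def three_step_work(a, theta, E=1.0, kT=1.0):
    """Return p and the work extracted in steps 1, 2 and 3.

    a     : larger eigenvalue of rho (1/2 < a < 1)
    theta : angle between the Bloch vector of rho and the energy axis
    """
    p = 0.5 * (1.0 + (2.0 * a - 1.0) * math.cos(theta))  # Tr[|e0><e0| rho]
    E1 = 0.5 * kT * math.log(a / (1.0 - a))  # makes rho_1 thermal w.r.t. H^(1)
    E2 = 0.5 * kT * math.log(p / (1.0 - p))  # makes eta^H thermal w.r.t. H^(2)
    # Step 1 (isolated, unitary): W1 = -(Tr[rho_1 H^(1)] - Tr[rho H])
    W1 = E1 * (2.0 * a - 1.0) - E * (2.0 * p - 1.0)
    # Step 2 (quasi-static, isothermal): W2 = kT ln(Z2 / Z1), Z_i = 2 cosh(E_i / kT)
    W2 = kT * math.log(math.cosh(E2 / kT) / math.cosh(E1 / kT))
    # Step 3 (isolated quench): W3 = -Tr[eta^H (H - H^(2))]
    W3 = (E - E2) * (2.0 * p - 1.0)
    return p, (W1, W2, W3)

p, works = three_step_work(a=0.9, theta=math.pi / 3)
total = sum(works)
entropy_gain = binary_entropy(p) - binary_entropy(0.9)  # S(eta^H) - S(rho), kT = 1
print(works, total, entropy_gain)
```

For `theta = 0` the state is already diagonal in the energy basis and the three contributions cancel, in line with the statement that classical states yield zero optimal work.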
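A note on optimal work extraction at constant average energy. Assume we are given an initial state $\rho$ and a non-degenerate Hamiltonian $H$ for a quantum system. The goal is to find the maximal work that can be obtained in a thermodynamic process that involves a heat bath at temperature $T$ under the restriction that the average energy of the system after the process is the same as it was before the process, $U := \mathrm{Tr}[\rho H]$. Using Eq. (2) in the main text together with the condition that the internal energy does not change, this amounts to finding the maximum over the set of states $\sigma$ with $\mathrm{Tr}[\sigma H] = U$,
$$W^{\max} = \max_{\sigma:\ \mathrm{Tr}[\sigma H] = U} k_B T\,\left(S(\sigma) - S(\rho)\right). \qquad \text{(SI.10)}$$
It is well known that at a fixed expectation value of an observable $H$ the Gibbs states $\sigma_\lambda = e^{-\lambda H}/\mathrm{Tr}[e^{-\lambda H}]$ are the states of maximal entropy [13,14]. Here the parameter $\lambda$ has to be chosen such that the energy of the Gibbs state matches $U$. Therefore there is only one $\sigma_{\lambda^*}$, with $\lambda^*$ such that $\mathrm{Tr}[\sigma_{\lambda^*} H] = U$, that gives the maximum here. The maximum entropy is then $S(\sigma_{\lambda^*}) = \lambda^* U + \ln \mathrm{Tr}[e^{-\lambda^* H}]$ and the maximum average work that can be extracted from $\rho$ at fixed average energy $U$ is then $W^{\max} = k_B T\,\left(\lambda^* U + \ln \mathrm{Tr}[e^{-\lambda^* H}] - S(\rho)\right)$.
For the special case that the system is a qubit (two-dimensional) the optimum Gibbs state for work extraction, $\sigma_{\lambda^*}$, is identical to the projected state $\eta^H = \sum_{k=0,1} \Pi^H_k\, \rho\, \Pi^H_k$, and the maximal work that can be drawn from a system starting in state $\rho$, while keeping its average energy fixed, is $W_{\mathrm{opt}}$ in Eq. (1) in the main text. To see this, note that for a qubit every Gibbs state is diagonal in the energy eigenbasis, $\sigma_\lambda = q\,|e_0\rangle\langle e_0| + (1-q)\,|e_1\rangle\langle e_1|$, and the single parameter $q$ is fixed by the energy constraint, so that $\sigma_{\lambda^*}$ has just the right energy $\mathrm{Tr}[\sigma_{\lambda^*} H] = U$. On the other hand the projected state has the same expansion, $\eta^H = p\,|e_0\rangle\langle e_0| + (1-p)\,|e_1\rangle\langle e_1|$, with $\mathrm{Tr}[\eta^H H] = \mathrm{Tr}[\rho H] = U$, and hence $q = p$. We note that this coincidence does not hold for higher dimensional systems, where the energy-projected state $\eta^H$ will in general have a non-monotonic, non-canonical distribution in its energy eigenbasis, while $\sigma_{\lambda^*}$ must be Gibbs-distributed.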
Considering the illustration in Fig. SI.1a, the qubit states $\sigma$ fulfilling the condition $\mathrm{Tr}[\sigma H] = U$ are located on the plane which contains $\rho$ and is perpendicular to the $|e_0\rangle$-$|e_1\rangle$-axis. On the other hand, in the Bloch picture a state has higher entropy the closer it is to the centre of the sphere. Hence, the optimal final state when extracting work from $\rho$ while conserving the average energy of the system is the state $\rho$ projected onto the $|e_0\rangle$-$|e_1\rangle$-axis, i.e. $\eta^H$.
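The difference between the qubit and the higher dimensional case can be made concrete numerically. The minimal Python sketch below finds $\lambda^*$ by bisection and compares the Gibbs state with a projected state of the same energy. The qutrit spectrum `energies`, the populations `projected` and all function names are hypothetical choices for illustration, not values taken from the text.

```python
import math

def entropy(probs):
    """Shannon / von Neumann entropy in nats of a diagonal state."""
    return -sum(q * math.log(q) for q in probs if q > 0.0)

def gibbs(energies, lam):
    """Populations of the Gibbs state exp(-lam H) / Tr[exp(-lam H)]."""
    weights = [math.exp(-lam * e) for e in energies]
    Z = sum(weights)
    return [w / Z for w in weights]

def mean_energy(probs, energies):
    return sum(q * e for q, e in zip(probs, energies))

def gibbs_at_energy(energies, U, lo=-50.0, hi=50.0, tol=1e-13):
    """Bisect for lambda* with Tr[sigma_lambda H] = U.

    Uses the fact that the Gibbs energy decreases monotonically with lambda.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_energy(gibbs(energies, mid), energies) > U:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return lam, gibbs(energies, lam)

# Hypothetical qutrit: eta^H with non-monotonic populations in the energy basis.
energies = [0.0, 1.0, 2.0]
projected = [0.5, 0.1, 0.4]
U = mean_energy(projected, energies)
lam_star, sigma = gibbs_at_energy(energies, U)
print(lam_star, sigma, entropy(sigma) - entropy(projected))

# Qubit: the Gibbs state at the same energy coincides with the projected state.
qubit_energies = [-1.0, 1.0]
_, sigma_qubit = gibbs_at_energy(qubit_energies, mean_energy([0.7, 0.3], qubit_energies))
print(sigma_qubit)
```

In the qutrit example the Gibbs state has strictly larger entropy than the projected state at the same energy, so more work could be drawn at fixed average energy than the projection yields, whereas for the qubit the two states agree.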

B Work storage system
In the previous section it was stated that work can be drawn from a quantum system when undergoing a thermodynamic projection process. But where has the work gone to?
There are two approaches of accounting for work that are mirror images of each other. One approach [2-10] focusses on the work that the system exchanges, as described above. Here it is often not explicitly mentioned where the work goes, but the only place it can go is the externally controlled energy sources, see Fig. 1 in the main text. Another way of accounting is to explicitly introduce a work system to store the work drawn [11,12]. One way of doing so in an average scenario is to introduce [11] a 'suspended weight on a string', described by a quantum system $W$, that can be raised or lowered to store work or draw work from it. Specifically, the Hamiltonian of the work storage system is defined as $H_W = m\,g\,x$, representing the energy of a weight of mass $m$ in the gravitational field with acceleration $g$ at height $x$. In addition, an explicit thermal bath $B$ is introduced [13,14], consisting of a separate quantum system in a thermal (or Gibbs) state $\tau_B$. Both the explicit work storage system and the heat bath are illustrated in Fig. 1 in the main text.
In the latter approach the total system starts in a product state of system $S$ (e.g. the spin), bath $B$, and weight $W$, $\rho_{SBW} = \rho_S \otimes \tau_B \otimes \omega_W$, which together undergo average energy conserving unitary evolution with $V$: $\rho_{SBW} \mapsto V \rho_{SBW} V^\dagger$. The assumption is that the total Hamiltonian is the sum of local terms, $H_{SBW} = H_S + H_B + H_W$. The implicit and the explicit treatment of work are equivalent in the sense that the results obtained in one language can be translated into the other and vice versa. In particular, the implicit description used in this text [2] has an equivalent explicit formulation [11].
In the next section we will discuss single-shot extractable work in a projection process. One possibility to define work in this context is to choose the explicit work storage system as a 'work qubit' with a specific energy gap which has to be in a pure energy eigenstate before and after the protocol [12].
This way it is guaranteed that full knowledge about its state is present at all times and the work is stored in an ordered form. In this scenario the allowed unitary operations V on the whole system SBW have to conserve the energy exactly, not only on average, which amounts to [V, H SBW ] = 0.

C Single-shot analysis
Instead of performing a thermodynamic process on an ensemble of $N$ identical and independent copies one can consider a single run of the process. Two major recent frameworks [4,12] have been developed to describe the optimal work that can be drawn from a system in a single run. The proposal by Åberg [4] involves changes of the Hamiltonian and identifies work with the deterministic energy change of the system when undergoing a unitary process. The proposal by Horodecki-Oppenheim [12] is formulated in terms of thermal operations [15], where work is associated with raising a two-level system, called the 'work qubit', with energy gap $W$ deterministically from the ground to the excited state.
However, when attempting to apply these two frameworks to find the single-shot work for the energy projections $\rho \to \eta^H$ captured by Eq. (1) in the main text, one encounters an obstacle: both frameworks only apply to processes between initial and final states that are classical, i.e. states that are diagonal in the energy basis. Åberg discusses coherences in a separate framework [16], which does not, however, cover single-shot work extraction and focusses only on average quantities, similar to those in other references [2,11]. Horodecki-Oppenheim suggest that quantum states with coherences with respect to the energy eigenbasis are first decohered before applying the single-shot protocol. As discussed, apart from decohering there are other thermodynamic projection processes that map the initial state with coherences, $\rho$, to the final state $\eta^H = \sum_k \Pi^H_k\, \rho\, \Pi^H_k$ without coherences, where $\Pi^H_k$ are the projectors onto the energy eigenstates of the Hamiltonian, $H$. Eq. (1) shows that the average work extracted in an optimal thermodynamic projection process is strictly positive while the decoherence process has zero work. Therefore one may expect a positive optimal work for projections also in the single-shot setting, with decohering a suboptimal choice, see Fig. SI.2.
Since our focus here is the $N \to \infty$ limit we will not aim to construct the single-shot case. Instead, to establish a notion of consistency between the average analysis and previous single-shot work results we consider the sequence
$$(\rho, H) \xrightarrow{\,a\,} (\rho_1, H) \xrightarrow{\,b\,} (\tau^H, H) \xrightarrow{\,c\,} (\eta^H, H).$$
Step a of this sequence rotates the initial non-diagonal state $\rho$ to the diagonal state $\rho_1$. As discussed, it cannot be treated with the single-shot framework [4,12], but it is possible to associate an average extracted work with this unitary process, $W^{(a)} = \mathrm{Tr}[(\rho - \rho_1)\,H]$. A single-shot analysis according to Horodecki-Oppenheim [12] can then be performed for the diagonal steps b and c. This is possible because the steps go via the thermal state $\tau^H$.
Figure SI.2: ... $(\eta^H, H)$. a, Decohering the state in the energy basis extracts no work. b, To perform a consistency check between the average and single-shot results it is possible to split the process into a basis rotation to $(\rho_1, H)$ with unknown single-shot work, but known average work, and two thermal operations that pass through the thermal state $(\tau^H, H)$ and are treatable in the single-shot framework [12]. c, General quantum thermodynamic processes could allow coherences and need not pass through intermediate fixed states.
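Step b brings $\rho_1$ to $\tau^H$ and allows the extraction of the single-shot work [12]
$$W^{\varepsilon}_{(b)} = k_B T \ln 2\; D^{\varepsilon}_{\min}(\rho_1 \| \tau^H),$$
where $D^{\varepsilon}_{\min}$ is the smooth min-relative entropy [17] and $\varepsilon \ge 0$ is the allowed failure probability of the process. Similarly, in Step c the final state $\eta^H$ is formed from the thermal state by applying a protocol that costs work. This work cost is [12]
$$k_B T \ln 2\; D^{\varepsilon}_{\max}(\eta^H \| \tau^H),$$
where $D^{\varepsilon}_{\max}$ is the smooth max-relative entropy [17]. In total, the single-shot work extracted in Steps b and c of the process is
$$W^{\varepsilon}_{(b+c)} = k_B T \ln 2\,\left[ D^{\varepsilon}_{\min}(\rho_1 \| \tau^H) - D^{\varepsilon}_{\max}(\eta^H \| \tau^H) \right],$$
with failure probability at most $2\varepsilon - \varepsilon^2 \approx 2\varepsilon$ when $\varepsilon$ is small.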
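To show consistency we now consider the average expected work extracted per copy if the single-shot protocol is carried out on $N \to \infty$ i.i.d. copies of the system. In such a calculation the work computed is an average value, which is why $W^{(a)}$, the average work contribution of the basis rotation in Step a, can be taken into account too. One obtains a total average work per copy of
$$\begin{aligned}
W &= W^{(a)} + \lim_{\varepsilon \to 0}\,\lim_{N \to \infty} \frac{k_B T \ln 2}{N}\left[ D^{\varepsilon}_{\min}\!\left(\rho_1^{\otimes N} \,\|\, (\tau^H)^{\otimes N}\right) - D^{\varepsilon}_{\max}\!\left((\eta^H)^{\otimes N} \,\|\, (\tau^H)^{\otimes N}\right) \right] \\
&= \mathrm{Tr}[(\rho - \rho_1)\,H] + k_B T \ln 2\,\left[ D(\rho_1 \| \tau^H) - D(\eta^H \| \tau^H) \right] \\
&= k_B T\,\left(S(\eta^H) - S(\rho)\right),
\end{aligned}$$
where we have used the quantum asymptotic equipartition theorem for relative entropies [18,19] in the second line. $D(\cdot\|\cdot)$ is the standard quantum relative entropy defined by $D(\eta^H \| \tau^H) = \mathrm{Tr}[\,\eta^H (\log \eta^H - \log \tau^H)\,]$ and likewise for $\rho_1$, where $\log$ is the logarithm to base 2. The quantities $D^{\varepsilon}_{\min}$ and $D^{\varepsilon}_{\max}$ as well as their regularized version, the standard quantum relative entropy $D$, can be seen as different measures characterizing the distance between two states. When applied here, they measure the 'distance' between the thermal state $\tau^H$ and another diagonal state in such a way that the operational meaning of this distance is given by the work one has to invest or is able to extract when transforming one into the other.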
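The asymptotic consistency can be checked numerically for a qubit. The minimal Python sketch below evaluates $W^{(a)} + k_B T \ln 2\,[D(\rho_1\|\tau^H) - D(\eta^H\|\tau^H)]$ for diagonal states and compares it with $k_B T\,(S(\eta^H) - S(\rho))$. It assumes the qubit Hamiltonian $H = -E\,(\Pi^H_0 - \Pi^H_1)$ of Section A, with $a$ and $p$ the ground state populations of $\rho_1$ and $\eta^H$; the function names and default parameters are illustrative assumptions.

```python
import math

def rel_entropy_bits(x, t):
    """D(diag(x, 1-x) || diag(t, 1-t)) with the logarithm to base 2."""
    pairs = ((x, t), (1.0 - x, 1.0 - t))
    return sum(q * math.log2(q / r) for q, r in pairs if q > 0.0)

def entropy_nats(x):
    """Entropy in nats of a qubit state with eigenvalues x and 1 - x."""
    return -sum(q * math.log(q) for q in (x, 1.0 - x) if q > 0.0)

def asymptotic_work(a, p, E=1.0, kT=1.0):
    """Average work per copy of the sequence a, b, c in the N -> infinity limit."""
    t = 1.0 / (1.0 + math.exp(-2.0 * E / kT))  # population of |e0> in tau^H
    W_a = 2.0 * E * (a - p)                    # Tr[(rho - rho_1) H]
    W_bc = kT * math.log(2.0) * (rel_entropy_bits(a, t) - rel_entropy_bits(p, t))
    return W_a + W_bc

work = asymptotic_work(a=0.9, p=0.7)
print(work, entropy_nats(0.7) - entropy_nats(0.9))
```

The energy dependent terms of the two relative entropies cancel against $W^{(a)}$, which is why only the entropy difference survives.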
The derivation shows that in the asymptotic limit the optimal average work is recovered from the single-shot components. It is important to realise, however, that from Eq. (SI.17) one cannot conclude that the above single-shot process forming $\eta^H$ from $\rho_1$ is optimal. Going via the thermal state is just one option, which is particularly convenient in this case as the processes of maximal work extraction and work of formation from the thermal state have been treated in the single-shot scenario [12].
After we made public our results on the average work associated with removing coherences in thermodynamic projection processes, a paper appeared [21] that derives the work that can be extracted when removing coherences in a single-shot setting. In this paper the previously mentioned framework describing the catalytic role of coherence in thermodynamics by Åberg [16] is used together with insights from reference frames in quantum information theory. These results are in agreement with our findings and strengthen our conclusion that coherences are a fundamental feature distinguishing quantum from classical thermodynamics.

D Quantum work fluctuation relation
[...] where $p^\tau_{m,n}$ are the transition probabilities for energy jumps starting in $|e$ [...] simplifying the exponentiated average work to [...]. The completeness of the projectors, [...]. Similarly, the average work extracted from the system is the average energy difference between $\rho^0$ and $\rho^\tau$ [...], where $p^\tau_m := \sum_n p^\tau_{m,n} = \mathrm{Tr}[\rho^\tau\, \Pi^{(\tau)}_m]$ [...].
While the experimentally observed average energy difference is not affected by the measurement step, i.e. $U(\eta^\tau) - U(\rho^0) = U(\rho^\tau) - U(\rho^0)$, the entropy difference does change, i.e. $S(\eta^\tau) - S(\rho^0) \neq S(\rho^\tau) - S(\rho^0) = 0$. This means that the system may absorb heat, $\langle Q^{\mathrm{abs}} \rangle$, during the measurement step, indicated in Fig. SI.3b. Its actual value depends on how the measurement is conducted, with the optimal heat being positive, $\langle Q^{\mathrm{abs}}_{\mathrm{opt}} \rangle = k_B T\,(S(\eta^\tau) - S(\rho^\tau)) \ge 0$. Since $\Delta U = \langle Q^{\mathrm{abs}} \rangle - \langle W \rangle$ (T1 in main text), this implies that in an experimental implementation of the Jarzynski relation the work done by the system on average can be more than previously thought, with the optimal value being $\langle W_{\mathrm{opt}} \rangle = \langle W_{\mathrm{unitary}} \rangle + k_B T\,(S(\eta^\tau) - S(\rho^\tau))$. In the special case that the average heat $\langle Q^{\mathrm{abs}} \rangle$ is zero it is possible (although not necessary) that Eq. (SI.18), and thus the standard Jarzynski expression $\langle e^{\beta W} \rangle = e^{-\beta \Delta F}$, are correct. In particular this applies to classical measurements. We conclude that the suitability of identifying $W = -\Delta E$, and hence the validity of the quantum Jarzynski work relation, depends on the details of the physical process that implements the measurement.
Quantum work fluctuation relations that require only one measurement [23,24], instead of the two discussed above, offer a feasible route to measuring work fluctuations experimentally. Instead of measuring separately the initial and final fluctuating energies, $E^{(0)}_n$ and $E^{(\tau)}_m$, to establish their joint probabilities, this method acquires knowledge of the joint probabilities only by measuring energy differences $\Delta E$ directly. But here, too, one final measurement is needed, in general on a non-diagonal state.

E Lower bound on entropy change
The entropy change during a projection with projectors $\{\Pi^P_k = |\phi_k\rangle\langle\phi_k|\}_k$ can be lower bounded. In the following, $\|B\|_2 = \sqrt{\mathrm{Tr}[B^\dagger B]}$ denotes the Hilbert-Schmidt norm of a linear operator $B$ acting on a $d$-dimensional Hilbert space describing the quantum system of interest. The lower bound reads [25] [...] (SI.25). Here, $S$ is the von Neumann entropy, $\rho$ the initial state and $\eta^P = \sum_k \Pi^P_k\, \rho\, \Pi^P_k$ the final state after the projection process. Furthermore, $\Delta A^P$ is the second smallest eigenvalue of the matrix $\mathbb{1} - M^T M$, where $M$ is the doubly stochastic matrix given by the entries $M_{kl} = |\langle \phi_k | l \rangle|^2$ and $\{|l\rangle\}_l$ is the eigenbasis of the initial state $\rho$.
Considering the two main terms on the right hand side of Eq. (SI.25) separately, $\|\rho - \mathbb{1}/d\|_2^2$ and $\Delta A^P$, it becomes apparent that they characterise different properties of the initial state. The first term measures the distance of $\rho$ to the fully mixed state, $\mathbb{1}/d$, and quantifies the purity of $\rho$. It is maximal for all pure initial states and zero if and only if $\rho = \mathbb{1}/d$. In the special case of a spin-1/2 system it can be directly related to the length of the Bloch vector describing $\rho$ in the Bloch representation, a link that will be established below. The second term, $\Delta A^P$, is related to the overlap of the eigenbasis of $\rho$, $\{|l\rangle\}_l$, and the projective basis, $\{|\phi_k\rangle\}_k$. It is zero if they are the same and maximal if they are mutually unbiased [26,27]. This can be seen as follows: if the two bases are the same, then the matrix $M$ is a permutation matrix and consequently $M^T M$ is the identity. In this case, $\mathbb{1} - M^T M$ is the zero matrix and thus $\Delta A^P = 0$. If $\{|\phi_k\rangle\}_k$ and $\{|l\rangle\}_l$ are mutually unbiased, i.e. if they fulfil $|\langle \phi_k | l \rangle|^2 = 1/d$ for all $k, l$, the matrix $M$, and thus also $M^T M$, is a rank-1 projector onto the space spanned by the vector $(1, \dots, 1)^T$. Hence, $\mathbb{1} - M^T M$ has eigenvalues $\{0, 1, \dots, 1\}$. One finds that the second smallest eigenvalue is $\Delta A^P = 1$, which is also the maximal eigenvalue the matrix $\mathbb{1} - M^T M$ can have [28].
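In the special case of the spin-1/2 system shown in Fig. SI.1a, the bound reads $\Delta S^H(\rho) \ge \tfrac{1}{4}\,|\vec{s}_\rho|^2 \sin^2\theta$, where $\vec{s}_\rho$ is the Bloch vector of the initial state and $\theta$ is the angle in the Bloch sphere between the eigenbasis of $\rho$, $\{|0\rangle, |1\rangle\}$, and the projective energy basis, $\{|e_0\rangle, |e_1\rangle\}$. Let $\rho = a\,|0\rangle\langle 0| + (1-a)\,|1\rangle\langle 1|$ be the initial state of the qubit. Furthermore, let $\eta^H = p\,|e_0\rangle\langle e_0| + (1-p)\,|e_1\rangle\langle e_1|$ be the final state after the energy projection, where $p = \mathrm{Tr}[\,|e_0\rangle\langle e_0|\,\rho\,]$ is the probability to obtain $|e_0\rangle$. As argued in Section SI A, w.l.o.g. we can assume that $a \ge \tfrac{1}{2}$, $p \ge \tfrac{1}{2}$. In the Bloch representation one can write $\rho = \tfrac{1}{2}(\mathbb{1} + \vec{s}\cdot\vec{\sigma})$ and $\eta^H = \tfrac{1}{2}(\mathbb{1} + \vec{t}\cdot\vec{\sigma})$. Here we use a different notation for the Bloch vectors of $\rho$, $\vec{s} := \vec{s}_\rho$, and $\eta^H$, $\vec{t} := \vec{s}_\eta$, for readability. The Pauli matrices are self-adjoint and fulfil $\mathrm{Tr}[\sigma_i \sigma_j] = 2\delta_{ij}$. Hence we find
$$\left\|\rho - \tfrac{\mathbb{1}}{2}\right\|_2^2 = \tfrac{1}{4}\,\mathrm{Tr}[(\vec{s}\cdot\vec{\sigma})^2] = \tfrac{1}{2}\,|\vec{s}|^2,$$
where $|\cdot|$ is the Euclidean norm in $\mathbb{R}^3$. This proves the form of the first factor in the bound. For the factor $\Delta A^H$ notice that by assumption $a \ge \tfrac{1}{2}$, $p \ge \tfrac{1}{2}$ and thus we can write $|e_0\rangle\langle e_0| = \tfrac{1}{2}\left(\mathbb{1} + \tfrac{\vec{t}}{|\vec{t}|}\cdot\vec{\sigma}\right)$ and $|0\rangle\langle 0| = \tfrac{1}{2}\left(\mathbb{1} + \tfrac{\vec{s}}{|\vec{s}|}\cdot\vec{\sigma}\right)$. Therefore $M_{00} = M_{11} = \tfrac{1}{2}(1 + \cos\theta)$ and $M_{01} = M_{10} = \tfrac{1}{2}(1 - \cos\theta)$ with $\cos\theta = \tfrac{\vec{s}\cdot\vec{t}}{|\vec{s}|\,|\vec{t}|}$, so that the eigenvalues of $\mathbb{1} - M^T M$ are $0$ and $1 - \cos^2\theta$, i.e. $\Delta A^H = \sin^2\theta$.
To further illustrate the bound consider the special case when the initial state $\rho$ is pure and its eigenbasis is mutually unbiased with respect to the energy eigenbasis, $\{|e_0\rangle, |e_1\rangle\}$. In this case the final state after the projection, $\eta^H$, is maximally mixed and we find
$$\Delta S^H(\rho) = S(\eta^H) - S(\rho) = \ln 2 - 0 = \ln 2 \approx 0.69. \qquad \text{(SI.30)}$$
Here, the lower bound is equal to $\tfrac{1}{4} = 0.25$ because $|\vec{s}_\rho| = 1$ for a pure state $\rho$ and $\sin^2\theta = 1$ for mutually unbiased bases. Thus in this example the bound is not particularly tight.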
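The qubit form of the bound is easy to explore numerically. The minimal Python sketch below computes the entropy change of the energy projection together with the bound $\tfrac{1}{4}|\vec{s}_\rho|^2 \sin^2\theta$; the function names and the parametrisation by the eigenvalue $a$ and the Bloch angle `theta` are assumptions made for illustration.

```python
import math

def binary_entropy(x):
    """Entropy in nats of a qubit state with eigenvalues x and 1 - x."""
    return -sum(q * math.log(q) for q in (x, 1.0 - x) if q > 0.0)

def entropy_change_and_bound(a, theta):
    """Return (Delta S^H, lower bound) for a qubit energy projection.

    a     : larger eigenvalue of rho, so |s_rho| = 2a - 1
    theta : Bloch-sphere angle between the eigenbasis of rho and the energy basis
    """
    s = 2.0 * a - 1.0
    p = 0.5 * (1.0 + s * math.cos(theta))  # Tr[|e0><e0| rho]
    delta_S = binary_entropy(p) - binary_entropy(a)
    bound = 0.25 * s ** 2 * math.sin(theta) ** 2
    return delta_S, bound

# Pure state with an eigenbasis mutually unbiased to the energy basis.
delta_S, bound = entropy_change_and_bound(a=1.0, theta=math.pi / 2)
print(delta_S, bound)
```

Scanning `a` and `theta` over a grid gives a quick impression of how the gap between the entropy change and the bound varies with purity and basis overlap.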

F Access to correlated auxiliary systems
Similarly to erasure with a correlated memory [29], one can consider projections on a system $S$ that is correlated with an ancilla $A$ the experimenter has access to. Assuming a total Hamiltonian $H^{SA} = H^S \otimes \mathbb{1}^A + \mathbb{1}^S \otimes H^A$, we denote the global initial state by $\rho^{SA}$ and its marginals on $S$ and $A$ by $\rho^S = \mathrm{Tr}_A[\rho^{SA}]$ and $\rho^A = \mathrm{Tr}_S[\rho^{SA}]$, respectively.
A note on notation. For clarity we employ a slightly different notation here. The roles of initial state $\rho$ and final state $\eta$ are the same as in the main text and the previous sections of the Supplement. However, now the superscripts of the final state $\eta$ no longer denote the projection basis but the system for which $\eta$ describes the state. For instance, $\eta^S$ denotes the reduced state after the projection on system $S$ alone. The same holds for the superscripts of the initial states, $\rho^{SA}$, $\rho^S$ and $\rho^A$, and of the Hamiltonians $H^{SA}$, $H^S$ and $H^A$. Only the superscript $P$ of the mutually orthogonal rank-1 projectors $\{\Pi^P_k\}_k$ acting on system $S$ is kept to indicate which basis is being projected in.
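For an initial global state $\rho^{SA}$ of system and ancilla a local projection map on $S$ results in a new global state
$$\eta^{SA} = \sum_k (\Pi^P_k \otimes \mathbb{1}^A)\, \rho^{SA}\, (\Pi^P_k \otimes \mathbb{1}^A). \qquad \text{(SI.31)}$$
Due to the properties of the projectors the marginal state on $A$ is unchanged, $\eta^A = \mathrm{Tr}_S[\eta^{SA}] = \rho^A$. The reduced state of the system becomes $\eta^S \equiv \mathrm{Tr}_A[\eta^{SA}] = \sum_k p_k\, \Pi^P_k$ with $p_k = \mathrm{Tr}[\Pi^P_k\, \rho^S]$, and the conditional states on $A$ after the process are denoted $\eta^A_k = \mathrm{Tr}_S[(\Pi^P_k \otimes \mathbb{1}^A)\, \rho^{SA}]/p_k$. The global entropy change associated with the local projection is
$$\Delta S^P_{SA} = S(\eta^{SA}) - S(\rho^{SA}) = S(\{p_k\}) + \sum_k p_k\, S(\eta^A_k) - S(\rho^{SA}) = \Delta S^P + \delta^P(A:S).$$
In the second equality it was used that $\eta^{SA}$ is a classical-quantum state, and $S(\{p_k\}) = -\sum_k p_k \ln p_k$ stands for the classical Shannon entropy [30], which is equal to the von Neumann entropy of $\eta^S$ because the final state on $S$ is a classical mixture of states from the projective basis. Here we defined a measure of correlations between the ancilla and the system, $\delta^P(A:S) = S(\rho^S) - S(\rho^{SA}) + \sum_k p_k\, S(\eta^A_k)$, related to the quantum discord. It depends on the projectors $\{\Pi^P_k\}_k$ and is always non-negative [31,32]. Thus the entropy change of $SA$ can be bigger than the local entropy change, $\Delta S^P = S(\eta^S) - S(\rho^S)$, on the system alone.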
As is shown in the main text, Eq. (2), the optimal extractable work in a thermodynamic projection process on system $S$ alone is $W_{\mathrm{opt}} = k_B T\, \Delta S^P - \Delta U^P$, where $\Delta S^P$ is the entropy change of the system and $\Delta U^P$ its change in internal energy. This result stays intact when generalizing to projections in the presence of ancillary systems if one takes the total changes of these quantities on $SA$ instead of the change on $S$ only. In the global process the total internal energy change is equal to the energy change of the system only, as the local state of the ancilla is unchanged and the total Hamiltonian is the sum of local Hamiltonians. Thus, using side information, the overall optimal extractable work amounts to
$$W^{SA}_{\mathrm{opt}} = k_B T\, \Delta S^P_{SA} - \Delta U^P = k_B T\, \Delta S^P + k_B T\, \delta^P(A:S) - \Delta U^P = W_{\mathrm{opt}} + k_B T\, \delta^P(A:S), \qquad \text{(SI.34)}$$
where $\Delta S^P_{SA} = S(\eta^{SA}) - S(\rho^{SA})$ is the total entropy change on $SA$ and $W_{\mathrm{opt}}$ is the work of an optimal thermodynamic projection process without access to correlated systems, Eq. (2) in the main text. Discord was first discussed in a thermodynamic context by Zurek [33], where he related it to the advantage a quantum Maxwell demon could have over a classical one. In general the quantum discord, $\delta(A:S)$, is defined as the minimum of $\delta^P(A:S)$ over all sets of projectors $\{\Pi^P_k\}_k$, whereas in our case this set is fixed (see e.g. Modi et al. [34] for a review). Therefore it is found that even for states with no quantum discord, usually referred to as classically correlated states, a difference in work associated with thermodynamic projection processes can be observed. This contrasts with the erasure process [29], where an advantage could only be gained for highly entangled states.
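The statement about classically correlated states can be illustrated with a small numerical example. The minimal Python sketch below evaluates $\delta^P(A:S)$ for the hypothetical state $\rho^{SA} = \sum_l a_l\, |l\rangle\langle l|_S \otimes |l\rangle\langle l|_A$ of two qubits, for which all conditional states on $A$ are diagonal so that only Shannon entropies are needed. The choice of state, the function names and the parametrisation of the projective basis by a Bloch angle `theta` are assumptions made for illustration.

```python
import math

def shannon(probs):
    """Shannon entropy in nats."""
    return -sum(q * math.log(q) for q in probs if q > 0.0)

def delta_P_classical(a, theta):
    """delta^P(A:S) for rho^SA = sum_l a_l |l><l|_S (x) |l><l|_A.

    a     : eigenvalue of rho^S belonging to |0>, spectrum (a, 1 - a)
    theta : Bloch-sphere angle between {|l>} and the projective basis {|phi_k>}
    """
    spectrum = [a, 1.0 - a]
    c = math.cos(theta / 2.0) ** 2
    s = math.sin(theta / 2.0) ** 2
    M = [[c, s], [s, c]]  # M[k][l] = |<phi_k|l>|^2
    p = [sum(M[k][l] * spectrum[l] for l in range(2)) for k in range(2)]
    # S(rho^S) and S(rho^SA) both equal the Shannon entropy of the spectrum.
    delta = shannon(spectrum) - shannon(spectrum)
    for k in range(2):
        if p[k] > 0.0:
            # eta^A_k = sum_l M_kl a_l |l><l| / p_k is diagonal in {|l>_A}
            conditional = [M[k][l] * spectrum[l] / p[k] for l in range(2)]
            delta += p[k] * shannon(conditional)
    return delta

for angle in (0.0, math.pi / 3, math.pi / 2):
    print(angle, delta_P_classical(0.9, angle))
```

In this example $\delta^P(A:S)$ vanishes when the projective basis coincides with the eigenbasis of $\rho^S$ and grows as the two bases become mutually unbiased, so the additional work $k_B T\, \delta^P(A:S)$ is available without any entanglement between $S$ and $A$.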
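One may ask what global states on $SA$ maximize $W^{SA}_{\mathrm{opt}}$, the optimal work extractable with access to $A$, for a given state $\rho^S$ on $S$. As expected, it can be shown that purifications of $\rho^S$ yield the best improvement in terms of extracted work. Given $\rho^S = \sum_l a_l\, |l\rangle\langle l|$, any purification is, up to isometries on the purifying system [1], equivalent to $|\Psi\rangle = \sum_l \sqrt{a_l}\, |l\rangle_S |l\rangle_A$ for some orthonormal basis $\{|l\rangle_A\}_l$ of $A$. For such a state the conditional states on $A$ after the projection, $\eta^A_k$, are pure for all $k$, which implies that they have zero entropy. Since the global state is pure, $S(\rho^{SA}) = 0$, this implies $\delta^P(A:S) = S(\rho^S)$. The optimal total extracted work from a purified state on $SA$ in a thermodynamic projection process is therefore $W^{SA}_{\mathrm{opt}} = k_B T\, S(\eta^S) - \Delta U^P$, which can be shown to be the maximum for fixed $\rho^S$ and projectors $\{\Pi^P_k\}_k$. One way to see this is the following Supplementary Lemma [35]: $\sum_k p_k\, S(\eta^A_k) \le S(\rho^{SA})$.
Proof. We model the process on $S$ as an isometry $\Phi_{S \to S\tilde{S}} = \sum_k |\psi_k\rangle_{\tilde{S}} \otimes \Pi^S_k$, where $\tilde{S}$ is a copy of $S$ and $\{|\psi_k\rangle_{\tilde{S}}\}_k$ is an orthonormal basis of $\tilde{S}$. The state after applying the isometry is denoted $\eta^{S\tilde{S}A} = \Phi\, \rho^{SA}\, \Phi^\dagger$ and we note that $\mathrm{Tr}_{\tilde{S}}[\eta^{S\tilde{S}A}] = \eta^{SA}$. Furthermore, isometries do not change the (von Neumann) entropy, so that by the Araki-Lieb inequality and $S(\eta^{\tilde{S}}) = S(\eta^S)$,
$$S(\rho^{SA}) = S(\eta^{S\tilde{S}A}) \ge S(\eta^{SA}) - S(\eta^{\tilde{S}}) = S(\eta^{SA}) - S(\eta^S) = \sum_k p_k\, S(\eta^A_k),$$
where in the last equality we made use of the fact that $\eta^{SA}$ is a classical-quantum state.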
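Going back to Eq. (SI.33) and applying the Supplementary Lemma we see that in general
$$W^{SA}_{\mathrm{opt}} = k_B T\, \Delta S^P_{SA} - \Delta U^P = k_B T \left[ S(\eta^S) + \sum_k p_k\, S(\eta^A_k) - S(\rho^{SA}) \right] - \Delta U^P \le k_B T\, S(\eta^S) - \Delta U^P,$$
where $\Delta S^P_{SA} = S(\eta^{SA}) - S(\rho^{SA})$ is the total entropy change on $SA$. This proves that purifications on $SA$ yield the maximally possible extracted work.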