## Introduction

What are the fundamental limits of nature on manipulation of quantum clocks? Suppose we have multiple clocks, all synchronized with the same reference clock, which are affected by noise. Then, by averaging the time read from these clocks we can obtain a more accurate estimate of the current time according to the reference clock. In other words, we can distill a less noisy clock from several noisy clocks. What are the limits of this distillation process for quantum clocks? Can we distill quantum clocks in pure states from those in mixed states, at a nonzero rate?

Interestingly, this question is related to another fundamental question about the manipulation of coherence in quantum thermodynamics. It is now well-understood that coherence between different energy eigenstates is a resource, independent of other thermodynamic resources such as work, and can be used to implement operations which are otherwise impossible1,2,3,4. A fundamental open question in this context is whether the laws of quantum mechanics and thermodynamics allow the existence of a coherence distillation machine, i.e., a machine that consumes work to obtain pure coherent states from mixed ones at a nonzero rate (See Fig. 1). The connection between these two questions arises from the fact that the minimum requirement for a system to be a clock is to be in a state which contains coherence (i.e., off-diagonal terms) with respect to the energy-eigenbasis; otherwise, the system will be time-independent, and hence useless as a clock.

In this article, we investigate coherence distillation in the context of quantum thermodynamics, both in the single-shot and asymptotic regimes. In particular, we settle the above questions, which have been open heretofore5,6, and show that the answer to both of them is negative. In other words, the coherence distillation machine, depicted in Fig. 1, is impossible. This is surprising, especially when compared to the previously known results on resource distillation in the entanglement theory and other quantum resource theories (See e.g., 7,8,9,10,11), and reveals important aspects of coherence in quantum thermodynamics. In particular, we will see that, in some precise sense, the coherence content of a single two-level system can be infinitely large. Furthermore, we find that, even though distillation with a non-zero rate is impossible, it is still possible to distill a sublinear number of pure coherent states with a vanishing error. We also consider coherence distillation in the single-shot regime and derive a simple formula for the maximum achievable fidelity.

## Results

### Distillation of quantum clocks

A quantum clock is characterized by its state and Hamiltonian, which usually generates a periodic time evolution12,13,14,15,16,17,18. By definition, the state of a clock should be time-dependent. Therefore, when we say a clock with Hamiltonian $$H$$ is in state $$\rho$$, we actually mean its state is $$\rho$$ at a particular time, say $$t=0$$, with respect to a reference clock. Then, at an arbitrary time $$t$$ the state of the clock is $${e}^{-iHt}\rho {e}^{iHt}$$ (Throughout this paper we assume $$\hslash =1$$). Here, we focus on systems with bounded Hamiltonians and periodic dynamics, whose period is equal to a fixed (but arbitrary) parameter $$\tau$$, such that $$\tau =\min \{t\ > \ 0:{e}^{-iHt}\rho {e}^{iHt}=\rho \}$$; otherwise, the state and Hamiltonian are completely arbitrary. In the following, when we talk about $$n$$ copies of a system with state $$\rho$$ and Hamiltonian $$H$$, we mean $$n$$ non-interacting systems, with the total Hamiltonian $${\sum }_{i=1}^{n}{H}^{(i)}$$, where $${H}^{(i)}={I}^{\otimes (i-1)}\otimes H\otimes {I}^{\otimes (n-i)}$$, and with the joint state $${\rho }^{\otimes n}$$.
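As a minimal numerical sketch of this setup, the snippet below builds the total Hamiltonian of $$n$$ non-interacting copies, with $$H$$ in slot $$i$$ and identities in the remaining $$n-1$$ slots, and checks that the joint state is periodic with period $$\tau$$. The qubit Hamiltonian $$\pi {\sigma }_{z}/\tau$$, the value of $$\tau$$, the mixed state `rho`, and the helper names `total_hamiltonian` and `evolve` are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from functools import reduce

def total_hamiltonian(H, n):
    """Sum over i of H acting on copy i and the identity on the other n-1 copies."""
    d = H.shape[0]
    identity = np.eye(d)
    total = np.zeros((d ** n, d ** n), dtype=complex)
    for i in range(1, n + 1):
        # (i-1) identities, then H, then (n-i) identities: n tensor factors in total.
        factors = [identity] * (i - 1) + [H] + [identity] * (n - i)
        total += reduce(np.kron, factors)
    return total

def evolve(state, hamiltonian, t):
    """Return e^{-iHt} state e^{iHt}, assuming the Hamiltonian is diagonal."""
    U = np.diag(np.exp(-1j * np.diag(hamiltonian) * t))
    return U @ state @ U.conj().T

tau = 2.0                                   # illustrative period (assumption)
H = (np.pi / tau) * np.diag([1.0, -1.0])    # pi * sigma_z / tau
rho = np.array([[0.7, 0.2], [0.2, 0.3]])    # illustrative mixed coherent state

n = 3
H_tot = total_hamiltonian(H, n)
rho_n = reduce(np.kron, [rho] * n)

print(np.allclose(evolve(rho_n, H_tot, tau), rho_n))      # state returns after time tau
print(np.allclose(evolve(rho_n, H_tot, tau / 2), rho_n))  # but not after tau / 2
```

The Kronecker-product construction is only practical for a handful of copies, since the dimension grows as $$d^{n}$$.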

Suppose Alice is given a quantum clock with Hamiltonian $${H}_{{\rm{in}}}$$ and state $${\rho }_{{\rm{in}}}$$, synchronized with a standard reference clock owned by Bob. Assume she does not have any additional information about Bob’s clock. In other words, she knows at time $$t$$ relative to Bob’s clock, her quantum clock is in state $${e}^{-i{H}_{{\rm{in}}}t}{\rho }_{{\rm{in}}}{e}^{i{H}_{{\rm{in}}}t}$$; however, the parameter $$t$$ itself is unknown to her.

Now suppose Alice wants to transform this clock to a different clock, with possibly different Hamiltonian $${H}_{{\rm{out}}}$$, which is still synchronized with Bob’s clock, such that at any time $$t$$ relative to his clock the new quantum clock is in state $${e}^{-i{H}_{{\rm{out}}}t}{\rho }_{{\rm{out}}}{e}^{i{H}_{{\rm{out}}}t}$$. For instance, the input clock with Hamiltonian $${H}_{{\rm{in}}}$$ can be multiple copies of a noisy two-level clock in a mixed state, whereas the output clock is a single two-level system, which is more accurate than any single copy at the input, i.e., conveys more information about the parameter $$t$$ (This is an example of single-copy distillation of clocks, which will be discussed later). This means that Alice wants to implement the state conversion

$${e}^{-i{H}_{{\rm{in}}}t}{\rho }_{{\rm{in}}}{e}^{i{H}_{{\rm{in}}}t}\to {e}^{-i{H}_{{\rm{out}}}t}{\rho }_{{\rm{out}}}{e}^{i{H}_{{\rm{out}}}t},\ \forall t\in [0,\tau )\ .$$
(1)

However, since the parameter $$t$$ is unknown to her, this conversion should be implemented by a fixed process, independent of $$t$$; i.e., there should exist a physical process, described by a completely positive trace-preserving19,20 map $${\mathcal{E}}$$, such that $${\mathcal{E}}({e}^{-i{H}_{\rm{in}}t}{\rho }_{\rm{in}}{e}^{i{H}_{\rm{in}}t})={e}^{-i{H}_{\rm{out}}t}{\rho }_{\rm{out}}{e}^{i{H}_{\rm{out}}t}$$, for all times $$t\in [0,\tau )$$. It turns out that this is possible if, and only if, the single state conversion $${\rho }_{{\rm{in}}}\to {\rho }_{{\rm{out}}}$$ is possible under a Time-translation Invariant (TI) process, i.e., a process satisfying the covariance condition

$${e}^{-i{H}_{\rm{out}}s}{{\mathcal{E}}}_{\rm{TI}}(\sigma ){e}^{i{H}_{\rm{out}}s}={\mathcal{E}}_{\rm{TI}}\left({e}^{-i{H}_{\rm{in}}s}\sigma {e}^{i{H}_{\rm{in}}s}\right),$$
(2)

for all times $$s$$, and input $$\sigma$$21,22. Therefore, rather than studying the state conversions for the family of states in Eq. (1), one can equivalently study state conversion for the single input-output pair $${\rho }_{{\rm{in}}}$$ and $${\rho }_{{\rm{out}}}$$ under the restricted set of TI operations.

The covariance condition in Eq. (2) means that TI processes are those which can be defined, and hence implemented, independent of any reference clock. Furthermore, they can be implemented without interfering with the intrinsic time evolution generated by the system Hamiltonian. An example of this type of process is an energy-conserving unitary transformation, i.e., one which commutes with the Hamiltonian (assuming the input and output systems have identical Hamiltonians). There are also TI operations which are not energy-conserving, such as preparing the system in an incoherent state, i.e., any state $$\rho$$ commuting with the system Hamiltonian (Note that in the case of composite systems, the joint state is incoherent if it commutes with the total Hamiltonian).
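The covariance condition in Eq. (2) can be probed numerically for simple qubit channels. The sketch below assumes a diagonal Hamiltonian with energies $$\pm 1$$, channels given by Kraus operators, and a single coherent test state `sigma`; all names are hypothetical. Passing the check on one test state is only a necessary condition for being TI, whereas a single failure is enough to rule it out.

```python
import numpy as np

def evolve(energies, t):
    """Unitary e^{-iHt} for a Hamiltonian that is diagonal with the given energies."""
    return np.diag(np.exp(-1j * np.asarray(energies) * t))

def apply_channel(kraus, sigma):
    """Apply the channel with the given Kraus operators to sigma."""
    return sum(K @ sigma @ K.conj().T for K in kraus)

def covariance_holds_on(kraus, e_in, e_out, sigma, times):
    """Check Eq. (2) on one test state sigma for a list of times s."""
    for s in times:
        U_in, U_out = evolve(e_in, s), evolve(e_out, s)
        lhs = U_out @ apply_channel(kraus, sigma) @ U_out.conj().T
        rhs = apply_channel(kraus, U_in @ sigma @ U_in.conj().T)
        if not np.allclose(lhs, rhs):
            return False
    return True

energies = [1.0, -1.0]                                      # H = sigma_z (assumption)
sigma = np.array([[0.6, 0.3 - 0.1j], [0.3 + 0.1j, 0.4]])    # coherent test state
times = np.linspace(0.0, 3.0, 7)

phase_gate = [np.diag([1.0, np.exp(0.7j)])]                 # commutes with H
dephasing = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]      # kills off-diagonal terms
reset = [np.array([[1.0, 0.0], [0.0, 0.0]]),                # prepares the incoherent
         np.array([[0.0, 1.0], [0.0, 0.0]])]                # state |0><0|
hadamard = [np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)]

for name, kraus in [("phase gate", phase_gate), ("dephasing", dephasing),
                    ("reset", reset), ("hadamard", hadamard)]:
    print(name, covariance_holds_on(kraus, energies, energies, sigma, times))
```

The first three channels satisfy the condition on the test state, while the Hadamard gate, which maps energy eigenstates to coherent superpositions, does not.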

In summary, we conclude that for distillation or manipulation of quantum clocks, we can restrict our attention to the set of TI operations. In the language of quantum resource theories7,23,24,25,26, these are the free operations for the resource theory of quantum clocks, which is a special case of the resource theory of asymmetry.

It is worth emphasizing that the notion of resource distillation, which can be abstractly defined in any resource theory, has a clear operational interpretation in this framework: it is the process in which one combines noisy clocks, affected by independent noise processes, to obtain fewer, but more accurate, clocks in pure states. More precisely, the information content of each output clock about the unknown parameter $$t$$, i.e., the current time relative to the standard clock, is greater than the information content of each input clock. Hence, using a distillation protocol, one can increase the efficiency of storage and transmission of quantum clocks. Intuitively, one expects that to maximize the information content about the parameter $$t$$, the state of a quantum clock should be pure. This intuition is confirmed by the fact that pure states maximize any convex measure of information (about the time parameter $$t$$) such as Holevo quantity19,20,27 or quantum Fisher information27,28,29,30. Similarly, from the point of view of parameter estimation, to minimize the error in the estimation of the time parameter $$t\in [0,\tau )$$, as quantified by any cost function which is a linear functional of the state, such as mean squared error27,31, the system should be prepared in a pure state.

Interestingly, as we see next, the set of TI operations also naturally arises in the study of coherence in quantum thermodynamics. It is worth mentioning that, in this paper we focus on a notion of coherence which is relevant in the context of quantum clocks and quantum thermodynamics, known as unspeakable coherence6,32. This notion of coherence is a special case of a more general property, called asymmetry32,33,34,35. There are other resource theoretic approaches to coherence, capturing a different notion of coherence, known as speakable coherence6,32 (In these resource theories the eigenvalues of the system Hamiltonian do not play any role).

### Coherence distillation machines

A coherence distillation machine, as depicted in Fig. 1, receives systems in a mixed coherent state, and transforms them to pure coherent states, at a non-zero rate. Recall that a quantum state contains coherence, or is coherent, if its density operator does not commute with its Hamiltonian. In the following, we consider two different frameworks for describing coherence distillation machines and, interestingly, find that they are equivalent and both lead to the notion of TI operations.

Our first approach is to consider the most general processes which can be interpreted as “coherence distillation machines”. What are the constraints on such operations? Clearly, a distillation machine should not generate coherence itself, i.e., should transform incoherent states to incoherent states; otherwise, the coherence at the output cannot be interpreted as distilled coherence. This should hold even if the input is entangled with another closed system with an arbitrary Hamiltonian; if their initial joint state commutes with their total Hamiltonian, then their final state should also commute, and hence be incoherent (See Fig. 2).

We prove that a quantum operation satisfies this property, or is completely incoherence-preserving, iff it is a TI operation (See Supplementary Note 1). This means that, by proving the impossibility of coherence distillation using TI operations, we also establish its impossibility under completely incoherence-preserving operations, which describe the most general processes relevant to coherence distillation.

A different approach to formalizing coherence distillation is to use the framework of the resource theory of quantum thermodynamics (athermality) and the notion of thermal operations24,26,36,37,38,39,40. Thermal operations are those which can be implemented by coupling the system to a thermal bath by energy-conserving unitaries. It turns out that under these operations coherence and work are two independent resources1,2. Therefore, to focus on coherence, one can supplement a thermal operation with an unlimited amount of work at the input (using a battery or work reservoir), which can be modeled as an auxiliary system in an energy eigenstate. What is the set of all operations which can be implemented in this way? Interestingly, it turns out that the answer is again TI operations. In particular, any TI operation $${\mathcal{E}}_{\rm{TI}}$$ on a system $$S$$ with Hamiltonian $${H}_{S}$$ can be implemented by coupling the system to an auxiliary system (battery) with Hamiltonian $${H}_{{\rm{bat}}}$$, such that

$${{\mathcal{E}}}_{{\rm{TI}}}(\sigma )={{\rm{Tr}}}_{{\rm{bat}}}U(\sigma \otimes | E\rangle \langle E{| }_{{\rm{bat}}}){U}^{\dagger },$$
(3)

where (i) the initial state $${\left|E\right\rangle }_{{\rm{bat}}}$$ of the auxiliary system is an eigenstate of its Hamiltonian $${H}_{{\rm{bat}}}$$, and (ii) the unitary $$U$$ that couples it to the system $$S$$ conserves the total energy $${H}_{{\rm{tot}}}={H}_{S}\otimes {I}_{{\rm{bat}}}+{I}_{S}\otimes {H}_{{\rm{bat}}}$$, i.e., $$[U,{H}_{{\rm{tot}}}]=0$$ (See Supplementary Note 1, ref. 41, and theorem 25 of ref. 22).
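To make Eq. (3) concrete, here is a small sketch under illustrative assumptions: a qubit system and a qubit battery, both with energies $$\{0,1\}$$, the battery starting in its excited eigenstate, and a unitary that rotates only within the degenerate subspace spanned by $$|01\rangle$$ and $$|10\rangle$$. The rotation angle `theta`, the test state, and the function names are hypothetical. The code checks that $$U$$ conserves the total energy and that the induced channel satisfies the covariance condition on the test state.

```python
import numpy as np

I2 = np.eye(2)
H_S = np.diag([0.0, 1.0])                        # system Hamiltonian (assumption)
H_bat = np.diag([0.0, 1.0])                      # battery Hamiltonian (assumption)
H_tot = np.kron(H_S, I2) + np.kron(I2, H_bat)    # basis order |s b>: 00, 01, 10, 11

theta = 0.4                                      # illustrative rotation angle
U = np.eye(4, dtype=complex)
# Rotate only within span{|01>, |10>}, where both states have total energy 1.
U[1:3, 1:3] = [[np.cos(theta), -np.sin(theta)],
               [np.sin(theta), np.cos(theta)]]

battery = np.array([0.0, 1.0])                   # energy eigenstate |E> = |1>
battery_state = np.outer(battery, battery.conj())

def channel(sigma):
    """Eq. (3): couple to the battery with U, then trace the battery out."""
    joint = U @ np.kron(sigma, battery_state) @ U.conj().T
    # Indices after reshape: (s, b, s', b'); trace over the battery pair (b, b').
    return np.trace(joint.reshape(2, 2, 2, 2), axis1=1, axis2=3)

def evolve(sigma, s):
    """Free evolution e^{-i H_S s} sigma e^{i H_S s} of the system."""
    V = np.diag(np.exp(-1j * np.diag(H_S) * s))
    return V @ sigma @ V.conj().T

sigma = np.array([[0.6, 0.3 - 0.1j], [0.3 + 0.1j, 0.4]])
print(np.allclose(U @ H_tot, H_tot @ U))         # energy conservation
print(all(np.allclose(evolve(channel(sigma), s), channel(evolve(sigma, s)))
          for s in np.linspace(0.0, 6.0, 13)))   # covariance on the test state
```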

We conclude that formalizing the notion of coherence distillation machines in the framework of the resource theory of quantum thermodynamics (athermality), again leads us to the notion of TI operations.

To summarize, we saw three different properties, each of which can characterize exactly the same set of operations, namely TI operations: (a) invariance under time-translations, (b) being completely incoherence-preserving, and (c) being implementable with thermal operations supplemented with an arbitrary amount of work. Next, we study distillation of coherence using these processes.

### Main theorem: Typical states have no distillable coherence

An ideal coherence distillation machine is a TI operation (or, equivalently, a completely incoherence-preserving operation) which consumes copies of a system in a mixed state $$\rho$$ as the resource, to generate copies of a system in a pure coherent state $${\phi }_{{\rm{coh}}}$$, at rate $$R\ > \ 0$$, i.e., $${\rho }^{\otimes n}{\to }^{{\rm{TI}}}{\phi }_{{\rm{coh}}}^{\otimes \lceil Rn\rceil }$$. Note that, in general, the Hamiltonians and the Hilbert spaces of the input and output systems can be different. Also, note that $${\phi }_{{\rm{coh}}}$$ can be any pure state of the output system, except the energy eigenstates (For instance, one can choose a two-level system with Hamiltonian $$\pi {\sigma }_{z}/\tau$$, and state $$\left|{\phi }_{{\rm{coh}}}\right\rangle =(\left|0\right\rangle +\left|1\right\rangle )/\sqrt{2}$$, where $$\tau$$ is the period).

In practice, exact transformations are often impossible and physically intractable. Therefore, we can allow a small error $$\epsilon$$ in infidelity20, provided that it vanishes in the limit of infinite copies, i.e., $${\rho }^{\otimes n}{\to }^{{\rm{TI}}}\mathop{\approx }\limits^{\epsilon }{\phi }_{{\rm{coh}}}^{\otimes \lceil Rn\rceil }\,{\rm{as}}\; n\to \infty ,\epsilon \to 0$$ (Recall that infidelity is one minus fidelity, i.e., $$1-\langle \psi | \sigma | \psi \rangle$$ for a state $$\sigma$$ and a pure state $$\psi$$. Infidelity is closely related to the trace distance20). Then, by Helstrom’s theorem20,28, in the limit $$n\to \infty$$, the actual output state is indistinguishable from the desired state $${\phi }_{{\rm{coh}}}^{\otimes \lceil Rn\rceil }$$.

Consider an arbitrary system with bounded Hamiltonian $$H$$ and state $$\rho$$. The distillable coherence $${C}_{{\rm{d}}}^{{\rm{TI}}}(\rho )$$, relative to any standard pure coherent state $${\phi }_{{\rm{coh}}}$$, is the maximum rate at which copies of $${\phi }_{{\rm{coh}}}$$ can be obtained from copies of this system using TI operations (or, equivalently, using completely incoherence-preserving operations),

$${C}_{{\rm{d}}}^{{\rm{TI}}}(\rho )\equiv \sup \ R:{\rho }^{\otimes n}{\to }^{{\rm{TI}}}\mathop{\approx }\limits^{\epsilon }{\phi }_{{\rm{coh}}}^{\otimes \lceil Rn\rceil }\,{\rm{as}}\ n\to \infty ,\epsilon \to 0,$$
(4)

where the error $$\epsilon$$ is vanishing in infidelity (one minus fidelity). Note that this definition resembles the definition of the distillable entanglement9,10,11,42,43, or, more generally, distillable resource in any resource theory (See e.g., 7,8). We prove the following fundamental no-go theorem on coherence distillation:

Theorem. If the projector to the support of state $$\rho$$ commutes with the system Hamiltonian $$H$$, then the rate of distillation of any system in a pure coherent state $${\phi }_{{\rm{coh}}}$$ is zero, i.e., $${C}_{{\rm{d}}}^{{\rm{TI}}}(\rho )=0$$. Thus, for a typical state $$\rho$$, which has full-rank density operator, this rate is zero.

Surprisingly, we find that the hypothetical coherence distillation machine depicted in Fig. 1 is impossible, i.e., starting from asymptotically many copies of a generic mixed state, using a thermal machine we cannot distill pure coherence at a nonzero rate, even if we spend an unlimited amount of work. In fact, it turns out that coherence distillation remains impossible even if, in addition to copies of state $$\rho$$, one is allowed to consume a finite helper system in a pure state, provided that its Hamiltonian is bounded and its Hilbert space is finite-dimensional (See Supplementary Note 5). It is interesting to compare this result with the results of refs. 5 and 8, which prove that the rate of distillation of speakable coherence is generally non-zero.

Finally, it is worth mentioning that although for a typical mixed state the distillable coherence is zero, there are also mixed states with non-zero distillable coherence. The problem of classifying all such states, and determining the optimal rate of conversion remains open. In Supplementary Note 6 we present examples of such states, and find an achievable distillation rate, which is closely related to a Petz-Rényi relative entropy. These examples rely on the previously known results on state conversions between pure states33,44,45,46, which show that the optimal rate of conversion from a system with the pure state $${\psi }_{1}$$ and Hamiltonian $${H}_{1}$$ to another system with the pure state $${\psi }_{2}$$ and Hamiltonian $${H}_{2}$$, provided that they have the same period, is $$R={V}_{{H}_{1}}({\psi }_{1})/{V}_{{H}_{2}}({\psi }_{2})$$, where $${V}_{H}(\psi )\equiv \langle \psi | {H}^{2}| \psi \rangle -{\langle \psi | H| \psi \rangle }^{2}$$ is the energy variance for state $$\psi$$.
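The pure-state rate formula $$R={V}_{{H}_{1}}({\psi }_{1})/{V}_{{H}_{2}}({\psi }_{2})$$ is easy to evaluate. The sketch below assumes both systems are qubits with Hamiltonian $$\pi {\sigma }_{z}/\tau$$ (so they share the period $$\tau$$), takes $${\psi }_{1}=(\left|0\right\rangle +\left|1\right\rangle )/\sqrt{2}$$, and takes a weakly coherent target $${\psi }_{2}=\cos \theta \left|0\right\rangle +\sin \theta \left|1\right\rangle$$ with $$\theta =\pi /12$$. These choices and the helper name `energy_variance` are illustrative, not from the paper.

```python
import numpy as np

def energy_variance(H, psi):
    """V_H(psi) = <psi|H^2|psi> - <psi|H|psi>^2 for a (normalized) pure state."""
    psi = psi / np.linalg.norm(psi)
    mean = np.vdot(psi, H @ psi).real
    mean_sq = np.vdot(psi, H @ H @ psi).real
    return mean_sq - mean ** 2

tau = 1.0                                    # illustrative period (assumption)
H = (np.pi / tau) * np.diag([1.0, -1.0])     # pi * sigma_z / tau for both systems

psi_1 = np.array([1.0, 1.0]) / np.sqrt(2)    # equal superposition
theta = np.pi / 12
psi_2 = np.array([np.cos(theta), np.sin(theta)])

rate = energy_variance(H, psi_1) / energy_variance(H, psi_2)
print(rate)
```

For this choice the variances are $${(\pi /\tau )}^{2}$$ and $${(\pi /\tau )}^{2}{\sin }^{2}(2\theta )={(\pi /\tau )}^{2}/4$$, so the rate is 4 up to rounding: each equal-superposition qubit is asymptotically worth four copies of the weakly coherent one.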

Next, we explain how the above no-go theorem follows from an interesting relation between two quantifiers of coherence, namely quantum Fisher information and a new quantifier, called the purity of coherence.

### Purity of coherence

In recent years, many quantifiers of coherence and asymmetry have been studied (See, for instance, refs. 22,34,47,48,49,50,51,52). These previously known examples, however, all fail to capture a simple, yet fundamental feature of coherence: Given any finite number of copies of a generic mixed state, it is impossible to generate a single copy of a pure coherent state (with a non-zero probability), using only TI operations. Here, we introduce a new quantifier of coherence which captures the missing part of the picture and predicts the unreachability of pure coherent states.

For a system with state $$\rho$$, let the Purity of Coherence with respect to the eigenbasis of an observable $$H$$ be

$${P}_{H}(\rho )\equiv {\rm{Tr}}(H{\rho }^{2}H{\rho }^{-1})-{\rm{Tr}}(\rho {H}^{2})$$
(5)
$$=\sum _{j,k}\frac{{p}_{k}^{2}-{p}_{j}^{2}}{{p}_{j}}| \langle {\psi }_{k}| H| {\psi }_{j}\rangle {| }^{2}\ ,$$
(6)

if $${\rm{supp}}(H\rho H)\subseteq {\rm{supp}}(\rho )$$, and $${P}_{H}(\rho )=\infty$$ otherwise, where $$\rho ={\sum }_{j}{p}_{j}\left|{\psi }_{j}\right\rangle \left\langle {\psi }_{j}\right|$$ is the spectral decomposition of $$\rho$$.

As we discuss below, this function is an example of a generalized family of Fisher information introduced by Petz53,54. Also, in Supplementary Note 2 we show that this function can be thought of as the second derivative of Petz-Rényi relative entropy (for $$\alpha =2$$)55,56. Using this fact we show that the purity of coherence is (i) non-negative, and zero iff the state is incoherent; (ii) non-increasing under any TI operation $${{\mathcal{E}}}_{{\rm{TI}}}$$, i.e., $${P}_{{H}_{{\rm{out}}}}({{\mathcal{E}}}_{{\rm{TI}}}(\rho ))\le {P}_{{H}_{{\rm{in}}}}(\rho )$$, and in particular invariant under energy-conserving unitaries; (iii) additive for uncorrelated composite systems which are not interacting with each other, i.e., $${P}_{{H}_{{\rm{tot}}}}({\rho }_{1}\otimes {\rho }_{2})={P}_{{H}_{1}}({\rho }_{1})+{P}_{{H}_{2}}({\rho }_{2})$$, where $${H}_{{\rm{tot}}}={H}_{1}\otimes {I}_{2}+{I}_{1}\otimes {H}_{2}$$; and (iv) a convex function of $$\rho$$.
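The definition in Eqs. (5) and (6) and two of the listed properties can be checked numerically. The following sketch assumes full-rank states (so that $${\rho }^{-1}$$ exists) and uses illustrative Hamiltonians and states; the function names are hypothetical. It compares the trace formula of Eq. (5) with the spectral sum of Eq. (6), and tests property (i) on an incoherent state and property (iii) on a product state.

```python
import numpy as np

def purity_of_coherence(rho, H):
    """Eq. (5): Tr(H rho^2 H rho^{-1}) - Tr(rho H^2); assumes rho is full rank."""
    rho_inv = np.linalg.inv(rho)
    value = np.trace(H @ rho @ rho @ H @ rho_inv) - np.trace(rho @ H @ H)
    return float(np.real(value))

def purity_of_coherence_spectral(rho, H):
    """Eq. (6): sum over j, k of (p_k^2 - p_j^2) / p_j * |<psi_k|H|psi_j>|^2."""
    p, psi = np.linalg.eigh(rho)
    h2 = np.abs(psi.conj().T @ H @ psi) ** 2    # h2[k, j] = |<psi_k|H|psi_j>|^2
    total = 0.0
    for j in range(len(p)):
        for k in range(len(p)):
            total += (p[k] ** 2 - p[j] ** 2) / p[j] * h2[k, j]
    return float(total)

# Illustrative systems (assumptions): a qubit and a three-level system.
H1 = np.diag([1.0, -1.0])
H2 = np.diag([0.0, 1.0, 2.0])
rho1 = np.array([[0.7, 0.2], [0.2, 0.3]])
v = np.ones(3) / np.sqrt(3)
rho2 = 0.6 * np.eye(3) / 3 + 0.4 * np.outer(v, v)

P1 = purity_of_coherence(rho1, H1)
P2 = purity_of_coherence(rho2, H2)
H_tot = np.kron(H1, np.eye(3)) + np.kron(np.eye(2), H2)
P_joint = purity_of_coherence(np.kron(rho1, rho2), H_tot)

print(np.isclose(P1, purity_of_coherence_spectral(rho1, H1)))    # Eq. (5) vs Eq. (6)
print(np.isclose(purity_of_coherence(np.diag([0.7, 0.3]), H1), 0.0))  # property (i)
print(np.isclose(P_joint, P1 + P2))                              # property (iii)
```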

The above definition implies that for pure states the purity of coherence is $$\infty$$, unless the state is an energy eigenstate, in which case it is zero. This unboundedness of the purity of coherence, captures the unreachability of pure coherent states from generic mixed states: Suppose there exists a TI operation which receives $$n$$ copies of a system with state $${\rho }_{1}$$ and Hamiltonian $${H}_{1}$$, and with probability of success $$p$$, transforms them to a single copy of a system with state $${\rho }_{2}$$ and Hamiltonian $${H}_{2}$$. Using properties (i-iv), in Supplementary Note 2 we show

$$n\ge p\times \frac{{P}_{{H}_{2}}({\rho }_{2})}{{P}_{{H}_{1}}({\rho }_{1})}\ .$$
(7)

Thus, to generate a single copy of a pure coherent state $${\rho }_{2}$$, we need $$n=\infty$$ or $${P}_{{H}_{1}}({\rho }_{1})=\infty$$. These properties of purity of coherence make it a powerful tool to study coherence distillation, both in the asymptotic and single-shot regimes.
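The divergence behind Eq. (7) can be illustrated with a family of target states approaching a pure coherent state. The sketch below assumes a qubit with $$H={\sigma }_{z}$$, an illustrative input state `rho_in`, and targets $$(1-\lambda )\left|+\right\rangle \left\langle +\right|+\lambda I/2$$; the helper `min_copies` is a hypothetical name for the right-hand side of Eq. (7). As $$\lambda \to 0$$ the required number of input copies grows without bound.

```python
import numpy as np

def purity_of_coherence(rho, H):
    """Eq. (5): Tr(H rho^2 H rho^{-1}) - Tr(rho H^2); assumes rho is full rank."""
    rho_inv = np.linalg.inv(rho)
    value = np.trace(H @ rho @ rho @ H @ rho_inv) - np.trace(rho @ H @ H)
    return float(np.real(value))

H = np.diag([1.0, -1.0])                       # sigma_z (assumption)
plus = np.full((2, 2), 0.5)                    # |+><+|
rho_in = np.array([[0.7, 0.2], [0.2, 0.3]])    # illustrative noisy input clock
P_in = purity_of_coherence(rho_in, H)

def min_copies(lam, p_success=1.0):
    """Right-hand side of Eq. (7) for the target (1 - lam)|+><+| + lam * I/2."""
    rho_target = (1 - lam) * plus + lam * np.eye(2) / 2
    return p_success * purity_of_coherence(rho_target, H) / P_in

for lam in [1e-1, 1e-2, 1e-3]:
    print(lam, min_copies(lam))
```

Lowering the success probability `p_success` relaxes the bound only by a constant factor, so it cannot compensate for the divergence.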

### Relation with Quantum Fisher Information

It turns out that the purity of coherence has an interesting relation with Quantum Fisher Information (QFI), and this relation plays a crucial role in the proof of our no-go theorem. Recall that for the family of states $${\{{e}^{-iHt}\rho {e}^{iHt}\}}_{t}$$, QFI associated to the time parameter $$t$$ is

$${F}_{H}(\rho )=2\sum _{j,k}\frac{{({p}_{j}-{p}_{k})}^{2}}{{p}_{j}+{p}_{k}}| \langle {\psi }_{j}| H| {\psi }_{k}\rangle {| }^{2}\ ,$$
(8)

where $$\rho ={\sum }_{j}{p}_{j}\left|{\psi }_{j}\right\rangle \left\langle {\psi }_{j}\right|$$ is the spectral decomposition of $$\rho$$. QFI is the central quantity of quantum metrology and estimation theory27,28,29,30,54, and has found extensive applications in different areas of physics (See e.g., 57,58,59,60,61,62). QFI satisfies properties (i-iv) listed above for the purity of coherence. In particular, it is additive and monotone under TI operations.
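A direct implementation of Eq. (8) is sketched below, with pairs satisfying $${p}_{j}+{p}_{k}=0$$ skipped since they carry no weight. The qubit Hamiltonian, the test states, and the tolerance `tol` are illustrative assumptions. The checks confirm that a pure state gives four times the energy variance, that an incoherent state gives zero, and that QFI is additive on a product state.

```python
import numpy as np

def quantum_fisher_information(rho, H, tol=1e-12):
    """Eq. (8); pairs with p_j + p_k below tol are skipped."""
    p, psi = np.linalg.eigh(rho)
    h2 = np.abs(psi.conj().T @ H @ psi) ** 2    # h2[j, k] = |<psi_j|H|psi_k>|^2
    total = 0.0
    for j in range(len(p)):
        for k in range(len(p)):
            if p[j] + p[k] > tol:
                total += 2 * (p[j] - p[k]) ** 2 / (p[j] + p[k]) * h2[j, k]
    return float(total)

H = np.diag([1.0, -1.0])                        # sigma_z (assumption)
plus = np.array([1.0, 1.0]) / np.sqrt(2)
pure = np.outer(plus, plus)
variance = plus @ H @ H @ plus - (plus @ H @ plus) ** 2

rho1 = np.array([[0.7, 0.2], [0.2, 0.3]])
rho2 = np.array([[0.5, 0.1j], [-0.1j, 0.5]])
H_tot = np.kron(H, np.eye(2)) + np.kron(np.eye(2), H)

F1 = quantum_fisher_information(rho1, H)
F2 = quantum_fisher_information(rho2, H)
F_joint = quantum_fisher_information(np.kron(rho1, rho2), H_tot)

print(np.isclose(quantum_fisher_information(pure, H), 4 * variance))  # pure state
print(np.isclose(quantum_fisher_information(np.diag([0.7, 0.3]), H), 0.0))
print(np.isclose(F_joint, F1 + F2))                                   # additivity
```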

A closer look at the properties of the purity of coherence and QFI reveals an interesting relation between them: First, comparing Eq. (6) and Eq. (8), one can easily show that the purity of coherence is always larger than or equal to QFI, i.e., $${P}_{H}(\rho )\ \ge \ {F}_{H}(\rho )$$, and the equality holds iff $$\rho$$ is incoherent. Furthermore, for two-level systems, we find the nice formula

$${P}_{H}(\rho )=\frac{{F}_{H}(\rho )}{2[1-{\rm{Tr}}({\rho }^{2})]},$$
(9)

i.e., the purity of coherence is determined by a combination of QFI and the purity, $${\rm{Tr}}({\rho }^{2})$$. This means that, for states close to the maximally mixed state, $${P}_{H}(\rho )/{F}_{H}(\rho )\approx 1$$, whereas for states close to a generic pure state, $${P}_{H}(\rho )$$ can be arbitrarily larger than $${F}_{H}(\rho )$$. We show that these properties hold beyond two-level systems: In general, if $$\rho$$ is $$\epsilon$$-close to the maximally mixed state in infidelity, then $$\frac{{P}_{H}(\rho )}{{F}_{H}(\rho )}=1+{\mathcal{O}}(\sqrt{\epsilon })$$. In the opposite limit, where $$\rho$$ is close to a pure state, we find $${P}_{H}(\rho )\ \ge \ \frac{1}{4}{F}_{H}({\psi }_{\max })\times [\frac{{p}_{\max }^{2}}{1-{p}_{\max }}-1],$$ where $${p}_{\max }$$ is the largest eigenvalue of $$\rho$$, and $${\psi }_{\max }$$ is the corresponding eigenvector (See Supplementary Note 3). Again, as $$\rho$$ converges to a pure state, the purity $${\rm{Tr}}({\rho }^{2})$$ and $${p}_{\max }$$ converge to one. In this case, $${P}_{H}(\rho )$$ diverges, unless the pure state is an energy eigenstate.
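The two-level relation in Eq. (9) and the inequality $${P}_{H}(\rho )\ge {F}_{H}(\rho )$$ can be tested on random qubit states. The sketch below assumes full-rank states generated from random Bloch vectors of length between 0.1 and 0.9, with $$H={\sigma }_{z}$$; the random seed and the helper names are arbitrary choices made for illustration.

```python
import numpy as np

def fisher_pair(rho, H):
    """Return (QFI of Eq. (8), purity of coherence of Eq. (6)) for full-rank rho."""
    p, psi = np.linalg.eigh(rho)
    h2 = np.abs(psi.conj().T @ H @ psi) ** 2    # symmetric in its two indices
    pj, pk = p[:, None], p[None, :]
    qfi = 2 * np.sum((pj - pk) ** 2 / (pj + pk) * h2)
    purity_coh = np.sum((pk ** 2 - pj ** 2) / pj * h2)
    return float(qfi), float(purity_coh)

paulis = [np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]

def random_qubit_state(rng):
    """Full-rank qubit state with a random Bloch vector of length in [0.1, 0.9]."""
    r = rng.normal(size=3)
    r *= rng.uniform(0.1, 0.9) / np.linalg.norm(r)
    return 0.5 * (np.eye(2) + sum(c * s for c, s in zip(r, paulis)))

H = paulis[2]                                   # sigma_z (assumption)
rng = np.random.default_rng(0)
results = []
for _ in range(5):
    rho = random_qubit_state(rng)
    qfi, purity_coh = fisher_pair(rho, H)
    state_purity = np.trace(rho @ rho).real
    results.append((purity_coh, qfi, qfi / (2 * (1 - state_purity))))

for purity_coh, qfi, eq9 in results:
    print(np.isclose(purity_coh, eq9), purity_coh >= qfi - 1e-12)
```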

We conclude that, roughly speaking, the purity of coherence $${P}_{H}(\rho )$$ is lower bounded by the ratio of QFI (for a pure state close to $$\rho$$) to one minus the purity of state; hence, higher $${P}_{H}(\rho )$$ means more pure coherence, which justifies its name.

It is interesting to note that the relation between the purity of coherence and QFI is analogous to the relation between the total and free energies in thermodynamics; the latter distinguishes ordered (low-entropy) energy from disordered (high-entropy) energy. Similarly, the purity of coherence can distinguish pure coherence from mixed coherence. It turns out that for some operations, such as coherence distillation, the same amount of coherence, as quantified by QFI, is a more useful resource in states with higher purity.

### RLD and SLD Fisher information

It is worth mentioning that both of these quantifiers of coherence, i.e., the purity of coherence $${P}_{H}$$ and QFI $${F}_{H}$$, are special cases of a generalized family of Fisher Information. Classically, Fisher information is the unique (up to a normalization) stochastically monotone Riemannian metric on the space of probability distributions63. In the quantum case, on the other hand, there is a family of monotone metrics on the space of density operators, which is fully characterized by Petz53,54 (See also ref. 63). Interestingly, functions $${P}_{H}$$ and $${F}_{H}$$ are extremal points in this family: they are, respectively, the maximal and minimal monotone metrics calculated for the one-parameter family of states $${\{{e}^{-iHt}\rho {e}^{iHt}\}}_{t}$$. In the quantum estimation literature, these functions are often called, respectively, Right Logarithmic Derivative (RLD) and Symmetric Logarithmic Derivative (SLD) Fisher Information. Following the physics literature convention, here we have referred to SLD Fisher information as Quantum Fisher Information (QFI).

Remarkably, these two extremal functions have also distinguished roles in the resource theory of (unspeakable) coherence and quantum clocks: it has been recently shown that QFI (SLD Fisher Information) determines the coherence cost, i.e., the minimum rate of consumption of standard pure coherent states that is needed to generate the desired mixed state, using TI operations46. Also, it is well-known that QFI determines the lowest achievable mean square error for estimating the time parameter. On the other hand, it turns out that the purity of coherence (RLD Fisher Information) is relevant in the context of coherence distillation (See Fig. 3), and provides a powerful tool for proving our no-go theorem on coherence distillation.

### Proof of the main theorem

To prove the impossibility of coherence distillation machines, we use the properties of the purity of coherence, namely its monotonicity and additivity, and its relation with QFI. Note that the impossibility of distillation cannot be shown using QFI alone, because it increases linearly in $$n$$, for both the input and the desired output states. As we explain in the following, the main challenge in proving this theorem is the fact that QFI and the purity of coherence are not asymptotically continuous64.

In Supplementary Note 4 we prove the following result, which is of independent interest: Consider $$m$$ non-interacting systems, each with Hamiltonian $$H$$, and with the total Hamiltonian $${H}_{{\rm{tot}}}={\sum }_{i=1}^{m}{H}^{(i)}$$, in the joint state $${\sigma }_{m}$$. Suppose the fidelity of $${\sigma }_{m}$$ and state $${\left|\phi \right\rangle }^{\otimes m}$$ is $${\langle \phi | }^{\otimes m}{\sigma }_{m}{| \phi \rangle }^{\otimes m}=1-\epsilon$$. Then, for sufficiently large $$m$$, e.g., $$m\ \ge \ 70\frac{| \langle \phi | {H}^{3}| \phi \rangle {| }^{2}}{{V}_{H}^{3}(\phi )}$$ and sufficiently small $$\epsilon$$, e.g., $$\epsilon \le 1{0}^{-3}$$, QFI and the purity of coherence of state $${\sigma }_{m}$$ relative to the total Hamiltonian $${H}_{{\rm{tot}}}$$, are lower bounded by

$${F}_{{H}_{{\rm{tot}}}}({\sigma }_{m})\ge 4c\times m\times {F}_{H}(\phi ),$$
(10)
$${P}_{{H}_{{\rm{tot}}}}({\sigma }_{m})\ge c\times m\times {F}_{H}(\phi )\times \frac{1}{\epsilon },$$
(11)

where $$c$$ is a positive constant, e.g., $$c=1{0}^{-2}$$ (Recall that for a pure state $$\phi$$, QFI is $${F}_{H}(\phi )=4{V}_{H}(\phi )$$). Note that similar to the case of a single qubit in Eq. (9), the lower bound on the purity of coherence in Eq. (11) grows linearly with $${\epsilon }^{-1}$$.

At first glance, these bounds might seem intuitive from our previous discussions: For instance, Eq. (10) means that to be able to have a large fidelity with state $${\phi }^{\otimes m}$$, QFI of state $${\sigma }_{m}$$ should also grow (at least) linearly with $$m$$, which might be expected from the additivity of QFI. However, a more careful analysis is needed: the Hamiltonian $${H}_{{\rm{tot}}}$$ has eigenvalues of order $$m\times \parallel H\parallel$$, which means relative to this Hamiltonian, two states with infidelity $$\epsilon$$ can have QFI’s which differ by order $$\epsilon \times {m}^{2}\parallel H{\parallel }^{2}$$. Thus, while one state can have a large QFI, e.g., linear in $$m$$, the other might have a negligible QFI. This makes the proof of the above bounds non-trivial.

Now suppose there exists a TI operation $${{\mathcal{E}}}_{n}$$ which converts $${\rho }^{\otimes n}$$ to state $${\sigma }_{m(n)}$$ whose fidelity with the desired state $${\phi }_{{\rm{coh}}}^{\otimes m(n)}$$ is $$1-{\epsilon }_{n}$$. To simplify the notation, we assume the Hamiltonian of each copy at the input is the same as the Hamiltonian of each copy at the output, which is denoted by $$H$$ (This assumption is not needed for the proof). Then, using the additivity of the purity of coherence, the total purity of coherence of the input is $$n\ \times \ {P}_{H}(\rho )$$. Since this quantity is monotone under TI operations, the purity of coherence of the output is $${P}_{{H}_{{\rm{tot}}}}({\sigma }_{m(n)})\le n\ \times \ {P}_{H}(\rho )$$. Combined with Eq. (11), this leads to

$$\frac{m(n)}{n} \le \frac{1}{c}\times \frac{{P}_{H}(\rho )}{{F}_{H}({\phi }_{{\rm{coh}}})}\times {\epsilon }_{n}.$$
(12)
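As a sketch of the intermediate step, the chain of inequalities behind Eq. (12) can be written out explicitly. Reading from left to right, it uses Eq. (11) applied to the output $${\sigma }_{m(n)}={{\mathcal{E}}}_{n}({\rho }^{\otimes n})$$ (which assumes $$m(n)$$ is sufficiently large and $${\epsilon }_{n}$$ sufficiently small), then monotonicity under the TI operation $${{\mathcal{E}}}_{n}$$, and finally additivity over the $$n$$ input copies, with each purity of coherence taken relative to the total Hamiltonian of the corresponding system:

$$c\times m(n)\times \frac{{F}_{H}({\phi }_{{\rm{coh}}})}{{\epsilon }_{n}}\le {P}_{{H}_{{\rm{tot}}}}({\sigma }_{m(n)})\le {P}_{{H}_{{\rm{tot}}}}({\rho }^{\otimes n})=n\times {P}_{H}(\rho ).$$

Dividing both ends by $$c\times n\times {F}_{H}({\phi }_{{\rm{coh}}})/{\epsilon }_{n}$$, which is positive whenever $${\phi }_{{\rm{coh}}}$$ is coherent, gives Eq. (12).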

This interesting inequality implies that to make error $${\epsilon }_{n}$$ small, the yield $$m(n)/n$$ should also be small, unless $${F}_{H}({\phi }_{{\rm{coh}}})=0$$, i.e., $${\phi }_{{\rm{coh}}}$$ is incoherent, or $${P}_{H}(\rho )=\infty$$. Thus, if $${P}_{H}(\rho )$$ is bounded and $${\phi }_{{\rm{coh}}}$$ is coherent, then to have vanishing error $${\epsilon }_{n}\to 0$$, we also need to have vanishing yield, $$\mathop{\mathrm{lim}}\limits_{n\to \infty }m(n)/n=0$$, which means the distillable coherence is zero. We show that for a bounded Hamiltonian $$H$$, $${P}_{H}(\rho )\ <\ \infty$$ iff $${\Pi }_{\rho }$$, the projector to the support of $$\rho$$, commutes with $$H$$. We conclude that if $$[{\Pi }_{\rho },H]=0$$, then the distillable coherence is zero, which proves the theorem.

### Sub-linear Coherence Distillation: Trade-off between the maximum achievable yield and fidelity

Even though for states with finite purity of coherence the distillable coherence is zero, interestingly, it turns out that any state which contains coherence can still be used to distill a sub-linear number of pure coherent states. In the above scenario, let $${m}_{{\rm{opt}}}(n)$$ be the maximum number of copies of $${\phi }_{{\rm{coh}}}$$ which can be distilled with error less than $${\epsilon }_{n}$$, and $${r}_{{\rm{opt}}}(n)={m}_{{\rm{opt}}}(n)/n$$ be the maximum achievable yield. Assuming the input and output systems have the same period, the ratio of $${r}_{{\rm{opt}}}(n)$$ to error $${\epsilon }_{n}$$ satisfies

$$4[1-o(1)]\times \frac{{F}_{H}(\rho )}{{F}_{H}({\phi }_{{\rm{coh}}})}\le \frac{{r}_{{\rm{opt}}}(n)}{{\epsilon }_{n}}\le \frac{1}{c}\times \frac{{P}_{H}(\rho )}{{F}_{H}({\phi }_{{\rm{coh}}})},$$
(13)

where the upper bound on $${r}_{{\rm{opt}}}(n)/{\epsilon }_{n}$$ follows from Eq. (12), and holds assuming the number of distilled copies is sufficiently large, e.g., $${m}_{{\rm{opt}}}(n)\ \ge \ 70\frac{| \langle {\phi }_{{\rm{coh}}}| {H}^{3}| {\phi }_{{\rm{coh}}}\rangle {| }^{2}}{{V}_{H}^{3}({\phi }_{{\rm{coh}}})}$$, and error $${\epsilon }_{n}$$ is sufficiently small, e.g., $${\epsilon }_{n}\le 1{0}^{-3}$$. These assumptions are not required for the lower bound.

This means that there is a trade-off between fidelity and yield. For instance, for sufficiently large $$n$$, one can achieve the yield $$r(n)=4\frac{{F}_{H}(\rho )}{{F}_{H}({\phi }_{{\rm{coh}}})}{n}^{-\alpha }$$, for an arbitrary exponent $$\alpha \ > \ 0$$, with infidelity $${\epsilon }_{n}={n}^{-(\alpha -\delta )}$$, where $$\delta \ > \ 0$$ can be arbitrarily small. Choosing a smaller $$\alpha \ > \ 0$$ means a higher yield but also a larger error. This should be compared with the recent results on distillation of speakable coherence65,66,67,68 (in particular, in the case of strictly incoherent operations5,69, there are bound states, which cannot be converted to a single copy of a pure coherent state with a vanishing error, even if one is given arbitrarily many copies of the state66,67,68). This trade-off and the linear relation between the yield and error, which highlights the significance of the yield-to-error ratio as a fundamental quantity, are unique features of this resource theory; they have practical implications in the context of quantum clocks and are worth further study.

In the Methods section, we also discuss an interesting corollary of this result, namely a novel operational explanation of the violation of the monotonicity of Petz-Rényi relative entropy under data processing, for the parameter range $$\alpha \ > \ 2$$55,56.

To establish the lower bound on $${r}_{{\rm{opt}}}(n)/{\epsilon }_{n}$$ in Eq. (13), we consider a TI process defined based on a parameter estimation task: Suppose one is given $$n$$ copies of state $${e}^{-iHt}\rho {e}^{iHt}$$, where $$t\in [0,\tau )$$ is unknown (Recall that $$\tau$$ is the period of both the input and the desired output systems). Measuring these systems, one can obtain an estimate $${t}_{{\rm{est}}}\in [0,\tau )$$ of $$t$$, with probability density $$p({t}_{{\rm{est}}}| t)$$. We can assume the estimator is invariant under time-translations, such that $$p({t}_{{\rm{est}}}| t)=p({t}_{{\rm{est}}}-s| t-s):\forall s\in [0,\tau )$$, where the subtraction is mod $$\tau$$; if this is not the case, one can always make the estimator invariant by adding a random time translation to the input state, and then canceling it at the output of the estimator (See Supplementary Note 7). Suppose after obtaining the estimate $${t}_{{\rm{est}}}$$ one prepares $$m(n)$$ copies of state $${e}^{-iH{t}_{{\rm{est}}}}\left|{\phi }_{{\rm{coh}}}\right\rangle$$. Then, the entire measure-and-prepare process will be described by a TI operation. Furthermore, as we show in Supplementary Note 7, applying this TI operation on the input $${\rho }^{\otimes n}$$, the fidelity of the resulting state with the desired state $${\left|{\phi }_{{\rm{coh}}}\right\rangle }^{\otimes m(n)}$$ is

$$\begin{array}{l}\int _{0}^{\tau }d{t}_{{\rm{est}}}p({t}_{{\rm{est}}}| t=\, 0)| \langle {\phi }_{{\rm{coh}}}| {e}^{iH{t}_{{\rm{est}}}}| {\phi }_{{\rm{coh}}}\rangle {| }^{2m(n)}\\ \ge 1-m(n){F}_{H}({\phi }_{{\rm{coh}}})\times \langle \delta {t}^{2}\rangle /4,\end{array}$$
(14)

where $${F}_{H}({\phi }_{{\rm{coh}}})$$ is four times the energy variance of $${\phi }_{{\rm{coh}}}$$, and $$\langle \delta {t}^{2}\rangle ={\int }_{0}^{\tau }d{t}_{{\rm{est}}}p({t}_{{\rm{est}}}| t){(t-{t}_{{\rm{est}}})}^{2}$$ is the Mean Squared Error (MSE) of the estimator (Note that because of time-translation symmetry, MSE is independent of $$t$$). Therefore, the ratio of the yield $$r(n)=m(n)/n$$ to infidelity $${\epsilon }_{n}$$, satisfies

$$\frac{r(n)}{{\epsilon }_{n}}\ge \frac{4}{{F}_{H}({\phi }_{{\rm{coh}}})\times n\langle \delta {t}^{2}\rangle }.$$
(15)
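The inequality in Eq. (14) can be sanity-checked numerically in the simplest case. The following minimal sketch assumes the two-level clock with $$H=\pi {\sigma }_{z}/\tau$$ and $$\left|{\phi }_{{\rm{coh}}}\right\rangle =(\left|0\right\rangle +\left|1\right\rangle )/\sqrt{2}$$ (an illustrative choice, not part of the general argument), for which $$\langle {\phi }_{{\rm{coh}}}| {e}^{iHt}| {\phi }_{{\rm{coh}}}\rangle =\cos (\pi t/\tau )$$ and $${F}_{H}({\phi }_{{\rm{coh}}})=4{(\pi /\tau )}^{2}$$. It checks the pointwise bound $$| \langle {\phi }_{{\rm{coh}}}| {e}^{iHt}| {\phi }_{{\rm{coh}}}\rangle {| }^{2m}\ge 1-m{F}_{H}({\phi }_{{\rm{coh}}}){t}^{2}/4$$, which yields the second line of Eq. (14) after averaging over $$p({t}_{{\rm{est}}}| t=0)$$. The function names are hypothetical.

```python
import math

def overlap_power(t, m, tau=1.0):
    """|<phi|exp(iHt)|phi>|^(2m) for H = pi*sigma_z/tau and |phi> = (|0>+|1>)/sqrt(2)."""
    return math.cos(math.pi * t / tau) ** (2 * m)

def quadratic_bound(t, m, tau=1.0):
    """Right-hand side 1 - m * F_H(phi) * t^2 / 4 for the same qubit (assumed example)."""
    fisher_phi = 4.0 * (math.pi / tau) ** 2  # four times the energy variance of |phi>
    return 1.0 - m * fisher_phi * t ** 2 / 4.0

def bound_holds(m, tau=1.0, grid=2001):
    """Check the pointwise inequality on a uniform grid of t in [0, tau)."""
    ts = [tau * k / grid for k in range(grid)]
    return all(
        overlap_power(t, m, tau) >= quadratic_bound(t, m, tau) - 1e-12
        for t in ts
    )

if __name__ == "__main__":
    for m in (1, 5, 50):
        print(m, bound_holds(m))
```

In this qubit case the bound follows from $${\cos }^{2}x\ge 1-{x}^{2}$$ together with Bernoulli's inequality; the general proof is the one given in Supplementary Note 7.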

For any reasonable estimator the MSE $$\langle \delta {t}^{2}\rangle$$ scales as $$1/n$$. Therefore, as $$n$$ goes to infinity, the above lower bound remains positive. In particular, as shown in30,70, there exists an estimator working based on the classical Maximum Likelihood (ML) estimator, which achieves MSE $$\langle \delta {t}^{2}\rangle =1/(n{F}_{H}(\rho ))+o(1/n)$$, i.e., saturates the Quantum Cramér-Rao bound27,30,71. Therefore, using Eq. (15), we find that the ratio $$r(n)/{\epsilon }_{n}$$ for this estimator, satisfies the lower bound in Eq. (13).
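For completeness, the substitution behind this last step can be written out; this is only a sketch of the algebra, assuming $${F}_{H}(\rho )\ > \ 0$$ and the ML-estimator MSE quoted above. Multiplying the MSE by $$n$$ gives $$n\langle \delta {t}^{2}\rangle =1/{F}_{H}(\rho )+o(1)$$, so Eq. (15) becomes

$$\frac{r(n)}{{\epsilon }_{n}}\ge \frac{4}{{F}_{H}({\phi }_{{\rm{coh}}})\times [1/{F}_{H}(\rho )+o(1)]}=\frac{4}{1+o(1)}\times \frac{{F}_{H}(\rho )}{{F}_{H}({\phi }_{{\rm{coh}}})}=4[1-o(1)]\times \frac{{F}_{H}(\rho )}{{F}_{H}({\phi }_{{\rm{coh}}})}.$$

Since the optimal yield $${r}_{{\rm{opt}}}(n)$$ is at least the yield of this particular protocol at the same error, the same expression lower bounds $${r}_{{\rm{opt}}}(n)/{\epsilon }_{n}$$.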

It is worth noting that in the high-noise regime, where each input copy $$\rho$$ is close to the maximally mixed state, we have $${P}_{H}(\rho )/{F}_{H}(\rho )\approx 1$$, and therefore the lower and upper bounds in Eq. (13) coincide, up to a constant factor $$1/c$$. Therefore, in this regime we can achieve close to optimal distillation using a measure-and-prepare strategy. Furthermore, because asymptotically the optimal MSE can be achieved using local adaptive measurements on individual copies30,70, this distillation process does not require any entangling interactions between the input copies. On the other hand, as we discuss in the Methods section, such measure-and-prepare TI operations are, in general, sub-optimal for distillation in the low-noise regime.

### Single-shot Coherence Distillation: Exact formula

Next, we consider the problem of coherence distillation in the single-shot regime: suppose we are given $$n$$ copies of a system in a mixed state $$\rho$$ as the resource, and we want to obtain a single copy of a system in a pure coherent state $$\psi$$, using only TI operations. What is the maximum achievable fidelity $${\max }_{{{\mathcal{E}}}_{{\rm{TI}}}}\langle \psi | {{\mathcal{E}}}_{{\rm{TI}}}({\rho }^{\otimes n})| \psi \rangle$$, where the maximization is over all TI operations?

Using the approach of72, we find a simple general formula for the maximum achievable fidelity:

$${\max }_{{{\mathcal{E}}}_{{\rm{TI}}}}\langle \psi | {{\mathcal{E}}}_{{\rm{TI}}}({\rho }^{\otimes n})| \psi \rangle ={2}^{-{H}_{\min }{({\rm{out}}| {\rm{in}})}_{\Omega }},$$
(16)

where $${H}_{\min }{({\rm{out}}| {\rm{in}})}_{\Omega }$$ is the conditional min-entropy56,73 of the bipartite state $${\Omega }_{{\rm{in}},{\rm{out}}}$$, obtained by dephasing the state $${({\rho }^{\otimes n})}_{{\rm{in}}}\otimes \left|\psi \right\rangle {\left\langle \psi \right|}_{{\rm{out}}}$$ in the eigenbasis of the Hamiltonian $${H}_{{\rm{in}}}\otimes {I}_{{\rm{out}}}-{I}_{{\rm{in}}}\otimes {H}_{{\rm{out}}}$$. Here, $${I}_{{\rm{in}}}$$ and $${I}_{{\rm{out}}}$$ are the identity operators, and $${H}_{{\rm{in}}}={\sum }_{i=1}^{n}{H}^{(i)}$$ and $${H}_{{\rm{out}}}$$ are the input and output Hamiltonians, respectively. See Supplementary Note 9 for the proof and further discussion of this formula.

Although important, Eq. (16) does not clearly show the asymptotic behavior of the maximum achievable fidelity. On the other hand, our results on the purity of coherence and sub-linear coherence distillation yield simple general upper and lower bounds on the maximum achievable fidelity. Note that in Eq. (15), the number of distilled copies $$m(n)$$ is arbitrary and can be independent of $$n$$. In fact, as we explain in Supplementary Note 7, for any (fixed) finite $$m(n)=m$$, Eq. (15) is tight in the regime $$n\to \infty$$, and for ML estimator, $$n\times {\epsilon }_{n}$$ converges to $$m{F}_{H}({\phi }_{{\rm{coh}}})/4{F}_{H}(\rho )$$, where $${\epsilon }_{n}$$ is the infidelity of the output with $$m$$ copies of $${\phi }_{{\rm{coh}}}$$.

### Example: Single-shot distillation of a two-level system

The smallest quantum clock is a system with two different energy levels. Without loss of generality we assume the Hamiltonian of this system is $$H=\pi {\sigma }_{z}/\tau$$. Suppose we want to prepare this clock in state $$\left|{\phi }_{{\rm{coh}}}\right\rangle =(\left|0\right\rangle +\left|1\right\rangle )/\sqrt{2}$$, but we have access to a noisy version of this state, i.e., $$\rho =\lambda \left|{\phi }_{{\rm{coh}}}\right\rangle \left\langle {\phi }_{{\rm{coh}}}\right|+(1-\lambda )I/2$$, with $$0\ <\ \lambda \ <\ 1$$. The goal is to use $$n\gg 1$$ copies of $$\rho$$ to obtain a state with higher fidelity with $$\left|{\phi }_{{\rm{coh}}}\right\rangle$$. What is the lowest achievable infidelity? Using the properties of the purity of coherence and, in particular, Eq. (7), in Supplementary Note 10 we show that the infidelity is lower bounded by

$$1-{\max }_{{{\mathcal{E}}}_{{\rm{TI}}}}\langle {\phi }_{{\rm{coh}}}| {{\mathcal{E}}}_{{\rm{TI}}}({\rho }^{\otimes n})| {\phi }_{{\rm{coh}}}\rangle \ge \frac{1}{n}\frac{1-{\lambda }^{2}}{4{\lambda }^{2}}+{\mathcal{O}}(\frac{1}{{n}^{2}}).$$
(17)

Therefore, in the limit of large $$n$$, infidelity times $$n$$ is lower bounded by $$(1-{\lambda }^{2})/4{\lambda }^{2}$$. In Fig. 3 we compare this lower bound with the infidelity achieved by two different TI processes: (i) An operation related to the quantum Schur transformation, studied previously in74, which has full SU(2) symmetry, and hence is also TI. As we discuss in Supplementary Note 10, the results of74 imply that using this process we can achieve the infidelity $$(1-\lambda )/2n{\lambda }^{2}$$. (ii) The measure-and-prepare process based on the ML estimator, discussed in the previous section, which achieves the infidelity $${n}^{-1}\times {F}_{H}({\phi }_{{\rm{coh}}})/4{F}_{H}(\rho )=1/(4n{\lambda }^{2})$$.

Remarkably, we find that the bound imposed by the purity of coherence in Eq. (17) is tight in both high-noise ($$\lambda \to 0$$) and low-noise ($$\lambda \to 1$$) regimes. This suggests that this bound is achievable for all values of $$\lambda$$, and, at least in this example, the purity of coherence determines the ultimate limit of coherence distillation in the single-shot regime.
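The comparison described above can be reproduced at the level of the leading-order coefficients quoted in the text. The following is a minimal sketch, not the computation behind Fig. 3: it only evaluates the three expressions for $$n\times$$ infidelity, namely the lower bound $$(1-{\lambda }^{2})/4{\lambda }^{2}$$ of Eq. (17), the Schur-transformation value $$(1-\lambda )/2{\lambda }^{2}$$, and the measure-and-prepare value $$1/4{\lambda }^{2}$$. The function names are hypothetical.

```python
def lower_bound(lam):
    """Leading coefficient of n * infidelity from Eq. (17)."""
    return (1.0 - lam ** 2) / (4.0 * lam ** 2)

def schur(lam):
    """Leading coefficient quoted for the Schur-transformation process."""
    return (1.0 - lam) / (2.0 * lam ** 2)

def measure_and_prepare(lam):
    """Leading coefficient quoted for the ML measure-and-prepare process."""
    return 1.0 / (4.0 * lam ** 2)

if __name__ == "__main__":
    # Ratios to the lower bound; a ratio close to 1 means the bound is nearly saturated.
    for lam in (0.01, 0.5, 0.99):
        b = lower_bound(lam)
        print(lam, schur(lam) / b, measure_and_prepare(lam) / b)
```

The two ratios simplify to $$2/(1+\lambda )$$ and $$1/(1-{\lambda }^{2})$$, respectively, which approach 1 as $$\lambda \to 1$$ and $$\lambda \to 0$$; this is the sense in which the bound is tight in the two regimes.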

## Discussion

In recent years there has been significant progress in understanding the concept of coherence in the context of quantum thermodynamics (See e.g.,1,2,3,4,6,40,75,76). Nevertheless, some aspects of coherence are still not well understood. Here, we highlighted an important feature of quantum coherence which manifests itself, for instance, in the unreachability of pure coherent states from mixed states in both the single-shot and asymptotic regimes, and in the fact that (in some precise sense) the coherence content of a single qubit can be arbitrarily large. To quantify this feature of coherence, we introduced a new quantifier of coherence, called the purity of coherence, and showed that the monotonicity of this quantity under TI operations gives a tight bound on coherence distillation in the single-shot regime. The tightness of this bound supports the idea that the purity of coherence adequately quantifies the unreachability of pure coherent states from mixed states.

In this paper, we focused on the implications of our results in the context of quantum clocks and thermodynamics. Another important area of applications is quantum metrology32,47,77,78,79, which will be discussed in future works.

## Methods

### Limited power of TI measure-and-prepare processes for distillation

In the above example, it is interesting to note that in the high-noise regime, the optimal distillation can be achieved using a measure-and-prepare TI process. On the other hand, in the opposite limit, where the input state $$\rho$$ is almost pure, measure-and-prepare TI processes are not optimal for coherence distillation. In fact, as can be seen in Fig. 3, even if the input is $$n$$ copies of a pure coherent state $${\phi }_{{\rm{coh}}}$$, the output of a measure-and-prepare distillation process cannot be a pure coherent state for any finite $$n$$.

To understand this fact better, in the following we derive a strong constraint on the power of measure-and-prepare TI processes for manipulation of coherence. This constraint is a corollary of the following result: For any state $$\rho$$ and any Measure-and-Prepare TI process $${{\mathcal{E}}}_{{\rm{MP}}-{\rm{TI}}}$$, it holds that

$${P}_{{H}_{{\rm{out}}}}({{\mathcal{E}}}_{{\rm{MP}}-{\rm{TI}}}(\rho ))\le {F}_{{H}_{{\rm{in}}}}(\rho )\le {P}_{{H}_{{\rm{in}}}}(\rho ),$$
(18)

i.e., the purity of coherence of the output is upper bounded by the input QFI, where $${H}_{{\rm{in}}}$$ and $${H}_{{\rm{out}}}$$ are, respectively, the input and output Hamiltonians (See below for further discussion). This means that for input $${\rho }^{\otimes n}$$, the purity of coherence of the output of a measure-and-prepare TI process is upper bounded by $$n\ \times \ {F}_{{H}_{{\rm{in}}}}(\rho )$$. On the other hand, for a general TI process the purity of coherence of the output can be as large as $$n\ \times \ {P}_{{H}_{{\rm{in}}}}(\rho )$$, which is much larger than $$n\ \times \ {F}_{{H}_{{\rm{in}}}}(\rho )$$, if $$\rho$$ is close to a coherent pure state (For instance, in the above example, Schur transformation reaches this bound in the low noise regime).

Combining this result with the lower bound on the purity of coherence in Eq. (11), we find that if one applies a measure-and-prepare TI process to $$n$$ copies of $$\rho$$ to obtain $$m(n)$$ copies of a pure coherent state $${\phi }_{{\rm{coh}}}$$ with error $${\epsilon }_{n}$$, then for sufficiently large $$m(n)$$ and small error $${\epsilon }_{n}$$, the yield $$r(n)=m(n)/n$$ and error $${\epsilon }_{n}$$ satisfy $$r(n)/{\epsilon }_{n}\le 1/c\ \times \ {F}_{H}(\rho )/{F}_{H}({\phi }_{{\rm{coh}}})$$. Therefore, if QFI of state $$\rho$$ is finite, which is always the case for systems with bounded Hamiltonians, then using measure-and-prepare TI processes it is not possible to achieve a finite yield $$r(n)\ > \ 0$$ with a vanishing error $${\epsilon }_{n}\to 0$$, even if $$\rho$$ is a pure coherent state, i.e., has an unbounded purity of coherence.

In Supplementary Note 8 we present the proof of inequality $${P}_{{H}_{{\rm{out}}}}({{\mathcal{E}}}_{{\rm{MP}}-{\rm{TI}}}(\rho ))\le {F}_{{H}_{{\rm{in}}}}(\rho )$$ in Eq. (18). We also note that this inequality follows from the previous result of80. The main idea is the following: By definition any measure-and-prepare process can be realized by a measurement on the input followed by a state preparation at the output, which solely depends on the classical outcome of the measurement. For input states $${\{{e}^{-i{H}_{{\rm{in}}}t}\rho {e}^{i{H}_{{\rm{in}}}t}\}}_{t}$$, consider the distribution of outcomes of this measurement, as a function of parameter $$t$$. Then, the (classical) Fisher information corresponding to parameter $$t$$ is upper bounded by QFI of the input state, i.e., $${F}_{{H}_{{\rm{in}}}}(\rho )$$. As we show in Supplementary Note 8, this classical Fisher information, itself, is an upper bound on $${P}_{{H}_{{\rm{out}}}}({{\mathcal{E}}}_{{\rm{MP}}-{\rm{TI}}}(\rho ))$$, the purity of coherence of the output (This also has been shown previously in80). Roughly speaking, this is true because at the classical level, the distinction between Fisher information and the purity of coherence vanishes (This is related to Čencov’s theorem63 which asserts that, up to a normalization, Fisher information is the unique monotone metric on the space of classical probability distributions).
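The gap between the two quantities in Eq. (18) can be made concrete with a small numerical sketch. It assumes the standard spectral formula for the QFI, $${F}_{H}(\rho )=2{\sum }_{i,j}\frac{{({p}_{i}-{p}_{j})}^{2}}{{p}_{i}+{p}_{j}}| \langle i| H| j\rangle {| }^{2}$$ (normalized so that it equals four times the energy variance on pure states), and the $$\alpha =2$$ expression $${P}_{H}(\rho )={\rm{Tr}}({\rho }^{2}H{\rho }^{-1}H)-{\rm{Tr}}(\rho {H}^{2})$$ for the purity of coherence. The inputs are the eigenvalues of $$\rho$$ and the matrix elements of $$H$$ in the eigenbasis of $$\rho$$; the function names are hypothetical.

```python
import math

def qfi(p, h):
    """QFI from eigenvalues p of rho and h[i][j] = <i|H|j> in rho's eigenbasis."""
    total = 0.0
    for i, pi in enumerate(p):
        for j, pj in enumerate(p):
            if pi + pj > 0:
                total += 2.0 * (pi - pj) ** 2 / (pi + pj) * abs(h[i][j]) ** 2
    return total

def purity_of_coherence(p, h):
    """Tr(rho^2 H rho^-1 H) - Tr(rho H^2); infinite if H couples support and kernel."""
    d = len(p)
    for i in range(d):
        for j in range(d):
            # The support projector fails to commute with H in this case.
            if p[i] > 0 and p[j] == 0 and abs(h[i][j]) > 0:
                return math.inf
    first = sum(
        p[i] ** 2 / p[j] * abs(h[i][j]) ** 2
        for i in range(d) for j in range(d)
        if p[i] > 0 and p[j] > 0
    )
    second = sum(p[i] * sum(abs(h[i][k]) ** 2 for k in range(d)) for i in range(d))
    return first - second

if __name__ == "__main__":
    w = math.pi  # pi / tau with tau = 1 (assumed units)
    h = [[0.0, w], [w, 0.0]]  # sigma_z in the eigenbasis of the noisy qubit state
    for lam in (0.1, 0.5, 0.9, 1.0):
        p = [(1 + lam) / 2, (1 - lam) / 2]
        print(lam, qfi(p, h), purity_of_coherence(p, h))
```

For the noisy qubit of the example above, this gives $${F}_{H}(\rho )=4{\lambda }^{2}{(\pi /\tau )}^{2}$$ and $${P}_{H}(\rho )=\frac{4{\lambda }^{2}}{1-{\lambda }^{2}}{(\pi /\tau )}^{2}$$, so the ratio $${P}_{H}/{F}_{H}=1/(1-{\lambda }^{2})$$ is close to 1 at high noise and diverges as the state becomes pure.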

### Violation of monotonicity of Petz-Rényi relative entropy in the light of coherence distillation

Our results on coherence distillation, and in particular Eq. (13) and Eq. (17), provide a novel operational understanding of the violation of monotonicity of Petz-Rényi relative entropy under data-processing, for $$\alpha \ > \ 2$$. Recall that for $$\alpha \ > \ 1$$, Petz-Rényi relative entropy is defined as $$D_{\alpha }(\rho \parallel \sigma )=\frac{1}{\alpha -1}{\mathrm{log}}{\rm{Tr}}({\rho}^{\alpha }{\sigma }^{1-\alpha })$$, if $${\rm{supp}}(\rho )\subseteq {\rm{supp}}(\sigma )$$ and $${D}_{\alpha }(\rho \parallel \sigma )=\infty$$, otherwise55,56. For $$\alpha \in (1,2]$$, and any completely positive trace-preserving map $$\mathcal{E}$$, $$D_{\alpha }(\mathcal{E}(\rho )\parallel \mathcal{E}(\sigma ))\le {D}_{\alpha }(\rho \parallel \sigma )$$, whereas this bound is violated for $$\alpha \ > \ 2$$55,56. As we mentioned before, the purity of coherence can be derived from the second derivative of the Petz-Rényi relative entropy for $$\alpha =2$$, and its monotonicity under TI operations follows from the monotonicity of this relative entropy (See Supplementary Note 2). Considering the second derivative of Petz-Rényi relative entropy for other values of $$\alpha \in (1,\infty )$$, we can generalize the purity of coherence, and obtain the family of functions defined by the formula $$P_{H,\alpha}(\rho )\equiv {\rm{Tr}}({\rho}^{\alpha}H{\rho}^{1-\alpha}H)-{\rm{Tr}}(\rho {H}^{2})$$, if the projector to the support of $$\rho$$ commutes with $$H$$, and $$P_{H,\alpha }(\rho )=\infty$$ otherwise. Similar to the purity of coherence, all these functions are (i) additive, (ii) non-zero iff the state is coherent, and (iii) bounded if the projector to the support of $$\rho$$ commutes with $$H$$. Furthermore, for any state $$\rho$$ whose infidelity with a pure coherent state is $$\epsilon$$, $${P}_{H,\alpha }(\rho )$$ scales (at least) as $${\epsilon }^{1-\alpha }$$. It follows that, if instead of the purity of coherence we use other monotone functions in this family, we obtain other lower bounds on the achievable infidelity. In particular, such a bound would imply that if the purity of coherence of a mixed state $$\rho$$ is finite, then to distill a single copy of a pure coherent state $${\phi }_{{\rm{coh}}}$$ with error $$\epsilon$$, the required number of copies of $$\rho$$ is, at least, of order $${\epsilon }^{1-\alpha }$$, i.e., $$n\in \Omega ({\epsilon }^{1-\alpha })$$. For $$\alpha \ > \ 2$$ this bound is asymptotically stronger than the bound imposed by the purity of coherence, which is linear in $${\epsilon }^{-1}$$.
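The $${\epsilon }^{1-\alpha }$$ scaling can be illustrated in the two-level example. The sketch below assumes $$H=\pi {\sigma }_{z}/\tau$$ and a state with eigenvalues $$(1-\epsilon ,\epsilon )$$ in the basis $$\{\left|{\phi }_{{\rm{coh}}}\right\rangle ,\left|{\phi }_{{\rm{coh}}}^{\perp }\right\rangle \}$$, so that $$\epsilon$$ is the infidelity with $${\phi }_{{\rm{coh}}}$$; in this case the formula above reduces to $${P}_{H,\alpha }={(\pi /\tau )}^{2}[{(1-\epsilon )}^{\alpha }{\epsilon }^{1-\alpha }+{\epsilon }^{\alpha }{(1-\epsilon )}^{1-\alpha }-1]$$, because $${\sigma }_{z}$$ is purely off-diagonal in this basis. The function name is hypothetical.

```python
import math

def generalized_purity_qubit(eps, alpha, tau=1.0):
    """P_{H,alpha} for a qubit state with eigenvalues (1 - eps, eps), H = pi*sigma_z/tau.

    eps is the infidelity with |phi_coh>; requires 0 < eps < 1 and alpha > 1.
    """
    w2 = (math.pi / tau) ** 2
    p_plus, p_minus = 1.0 - eps, eps
    return w2 * (
        p_plus ** alpha * p_minus ** (1.0 - alpha)
        + p_minus ** alpha * p_plus ** (1.0 - alpha)
        - 1.0
    )

if __name__ == "__main__":
    # P_{H,alpha} * eps^(alpha - 1) should approach (pi/tau)^2 as eps -> 0.
    for alpha in (2.0, 3.0, 4.0):
        for eps in (1e-2, 1e-4, 1e-6):
            scaled = generalized_purity_qubit(eps, alpha) * eps ** (alpha - 1.0)
            print(alpha, eps, scaled / math.pi ** 2)
```

For $$\alpha =2$$ and $$\epsilon =(1-\lambda )/2$$ the function reproduces the purity of coherence $$\frac{4{\lambda }^{2}}{1-{\lambda }^{2}}{(\pi /\tau )}^{2}$$ of the noisy qubit state. If $${P}_{H,\alpha }$$ were monotone under TI operations, additivity would give $$n\ {P}_{H,\alpha }(\rho )\ge {P}_{H,\alpha }$$ of the output, which is the origin of the hypothetical requirement $$n\in \Omega ({\epsilon }^{1-\alpha })$$ discussed above.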

However, as we have seen in the proof of Eq. (13) and also in Fig. 3, there exists a TI process based on the ML estimator which achieves errors of order $$\epsilon$$ by consuming only order $${\epsilon }^{-1}$$ copies of $$\rho$$. Therefore, if the Petz-Rényi relative entropy were monotone for $$\alpha \ > \ 2$$, we would have a lower bound on the number of required copies that is violated by this coherence distillation process. This provides an operational explanation of why the Petz-Rényi relative entropy cannot be monotone under data-processing for $$\alpha \ > \ 2$$: $$\alpha =2$$ is the largest value for which the monotonicity of Petz-Rényi relative entropy is not violated by coherence distillation processes.

### Proofs

All the results in the paper are rigorously proven in the Supplementary Notes 1-10.