The flourishing interplay between quantum computation and machine learning has inspired a wealth of algorithmic invention in recent years1,2,3. Among the most promising proposals are quantum classification algorithms that aspire to leverage the exponentially large Hilbert space uniquely accessible to quantum algorithms to either drastically speed up computational bottlenecks in classical protocols4,5,6,7, or to construct quantum-enhanced kernels that are practically prohibitive to compute classically8,9,10. Although these quantum classifiers are recognized as having the potential to offer quantum speedup or superior predictive accuracy, they are shown to be just as vulnerable to input perturbations as their classical counterparts11,12,13,14. These perturbations can occur either due to imperfect implementation that is prevalent in the noisy, intermediate-scale quantum (NISQ) era15, or, more menacingly, due to adversarial attacks where a malicious party aims to fool a classifier by carefully crafting practically undetectable noise patterns that trick a model into misclassifying a given input.

To address these shortcomings in the reliability and security of quantum machine learning, several protocols in the setting of adversarial quantum learning, i.e., learning under the worst-case noise scenario, have been developed11,12,16,17,18. More recently, data encoding schemes have been linked to robustness properties of classifiers with respect to different noise models in ref. 19. The connection between provable robustness and quantum differential privacy is investigated in ref. 17, where naturally occurring noise in quantum systems is leveraged to increase robustness against adversaries. A further step toward robustness guarantees is made in ref. 18, where a bound is derived from elementary properties of the trace distance. These advances, though having accumulated considerable momentum toward a coherent strategy for protecting quantum machine learning algorithms against adversarial input perturbations, have not yet provided an adequate framework for deriving a tight robustness condition for any given quantum classifier. In other words, the known robustness conditions are sufficient but not, in general, necessary.

Thus, a major open problem remains that is significant on both the conceptual and practical levels. Conceptually, adversarial robustness, being an intrinsic property of the classification algorithms under consideration, is only accurately quantified by a tight bound, the absence of which renders the direct robustness comparison between different quantum classifiers implausible. Practically, an optimal robustness certification protocol, in the sense of being capable of faithfully reporting the noise tolerance and resilience of a quantum algorithm, can only arise from a robustness condition that is both sufficient and necessary. Here we set out to confront both aspects of this open problem by generalizing the state-of-the-art classical wisdom on certifiable adversarial robustness into the quantum realm.

The pressing demand for robustness against adversarial attacks is arguably even more self-evident in the classical setting in the present era of widespread industrial adoption of machine learning13,14,20. Many heuristic defense strategies have been proposed but have subsequently been shown to fail against suitably powerful adversaries21,22. In response, provable defense mechanisms that provide robustness guarantees have been developed. One line of work, interval bound propagation, uses interval arithmetic23,24 to certify neural networks. Another approach makes use of randomizing inputs and adopts techniques from differential privacy25 and, of particular interest to us, statistical hypothesis testing26,27, which has a natural counterpart in the quantum domain. Since the pioneering works by Helstrom28 and Holevo29, the task of quantum hypothesis testing (QHT) has been well studied and regarded as one of the foundational tasks in quantum information, with profound linkages to topics ranging from quantum communication30,31 and estimation theory32 to quantum illumination33,34.

In this work, we lay bare a fundamental connection between QHT and the robustness of quantum classifiers against unknown noise sources. The methods of QHT enable us to derive a robustness condition that, in contrast to other methods, is both sufficient and necessary and puts constraints on the amount of noise that a classifier can tolerate. Due to tightness, these constraints allow for an accurate description of noise tolerance. A bound that is not tight, on the other hand, would underestimate the true degree of such noise tolerance. Based on these theoretical findings, we provide (1) an optimal robustness certification protocol to assess the degree of tolerance against input perturbations (independent of whether these occur due to natural or adversarial noise), (2) a protocol to verify whether classifying a perturbed (noisy) input has had the same outcome as classifying the clean (noiseless) input, without requiring access to the latter, and (3) tight robustness conditions on parameters for amplitude and phase damping noise. In addition, we will also consider randomizing quantum inputs, which can be seen as a quantum generalization of randomized smoothing, a technique that has recently been applied to certify the robustness of classical machine learning models26. The conceptual foundation of our approach is rooted in the inherently probabilistic nature of quantum classifiers. Intuitively, while QHT is concerned with the question of how to optimally discriminate between two given states, certifying adversarial robustness aims at giving a guarantee for which two states cannot be discriminated. These two seemingly contrasting notions go hand in hand and, as we will see, give rise to optimal robustness conditions fully expressible in the language of QHT.
Furthermore, while we focus on robustness in a worst-case scenario, our results naturally cover narrower classes of known noise sources and can potentially be put in context with other areas such as error mitigation and error tolerance in the NISQ era. Finally, while we treat robustness in the context of quantum machine learning, our results in principle do not require the decision function to be learned from data. Rather, our results naturally cover a larger class of quantum algorithms whose outcomes are determined by the most likely measurement outcome. Our robustness conditions on quantum states are then simply conditions under which the given measurement outcome remains the most likely outcome.

The remainder of this paper is organized as follows. We first introduce the notation and terminology and review results from QHT essential for our purpose. We then proceed to formally define quantum classifiers and the assumptions on the threat model. In “Results”, we present our main findings on provable robustness from QHT. In addition, these results are demonstrated and visualized with a simple toy example, for which we also consider the randomized input setting and specifically analyze randomization with a depolarization channel. In “Discussion”, we conclude with a higher-level view on our findings and lay out several related open problems with an outlook for future research. Finally, in “Methods”, we give proofs for the central results: the robustness condition in terms of type-II error probabilities of QHT, the tightness of this result and, finally, the method used to derive robustness conditions in terms of fidelity.




Let \({\mathcal{H}}\) be a Hilbert space of finite dimension \(d:=\dim ({\mathcal{H}})\,<\,\infty\) corresponding to the quantum system of interest. The space of linear operators acting on \({\mathcal{H}}\) is denoted by \({\mathcal{L}}({\mathcal{H}})\) and the identity operator on \({\mathcal{H}}\) is written as \({\mathbb{1}}\). If not clear from the context, the dimensionality is explicitly indicated through the notation \({\mathbb{1}}_d\). The set of density operators (i.e., positive semidefinite trace-one Hermitian matrices) acting on \({\mathcal{H}}\) is denoted by \({\mathcal{S}}({\mathcal{H}})\) and elements of \({\mathcal{S}}({\mathcal{H}})\) are written in lowercase Greek letters. The Dirac notation will be adopted whereby Hilbert space vectors are written as \(\left|\psi \right\rangle\) and their dual as \(\left\langle \psi \right|\). We will use the terminology density operator and quantum state interchangeably. For two Hermitian operators \(A,\ B\in {\mathcal{L}}({\mathcal{H}})\) we write A > B (A ≥ B) if A − B is positive (semi-)definite and A < B (A ≤ B) if A − B is negative (semi-)definite. For a Hermitian operator \(A\in {\mathcal{L}}({\mathcal{H}})\) with spectral decomposition \(A={\sum }_{i}{\lambda }_{i}{P}_{i}\), we write \(\{A\,>\,0\}:={\sum }_{i:{\lambda }_{i} \,{>}\,0}{P}_{i}\) (and analogously \(\{A\,<\,0\}:={\sum }_{i:{\lambda }_{i}\,{<}\,0}{P}_{i}\)) for the projection onto the eigenspace of A associated with positive (negative) eigenvalues. The Hermitian transpose of an operator A is written as \({A}^{\dagger }\) and the complex conjugate of a complex number \(z\in {\mathbb{C}}\) as \(\bar{z}\). For two density operators ρ and σ, the trace distance is defined as \(T(\rho ,\ \sigma ):=\frac{1}{2}\| \rho -\sigma \|_{1}\) where \(\| \cdot \|_{1}\) is the Schatten 1-norm defined on \({\mathcal{L}}({\mathcal{H}})\) and given by \(\| A\|_{1}:={\rm{Tr}}\left[\left|A\right|\right]\) with \(\left|A\right|=\sqrt{{A}^{\dagger }A}\).
The Uhlmann fidelity between density operators ρ and σ is denoted by F and defined as \(F(\rho ,\ \sigma ):={\rm{Tr}}{\left[\sqrt{\sqrt{\rho }\sigma \sqrt{\rho }}\right]}^{2}\) that for pure states reduces to the squared overlap \(F(\left|\psi \right\rangle ,\ \left|\phi \right\rangle )={\left|\langle \psi | \phi \rangle \right|}^{2}\). Finally, the Bures metric is denoted by dB and is closely related to the Uhlmann fidelity via \({d}_{{\rm{B}}}(\rho ,\ \sigma )={[2(1-\sqrt{F(\rho ,\sigma )})]}^{\frac{1}{2}}\).
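The three distance measures defined above can be evaluated numerically for small systems. The following is a minimal NumPy sketch, not part of the original work; the helper names (`psd_sqrt`, `ket_to_density`, and so on) are illustrative assumptions, and the matrix square root is taken via an eigendecomposition, which is adequate for small Hermitian positive semidefinite matrices.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a Hermitian positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh(a)
    vals = np.clip(vals, 0.0, None)  # drop tiny negative round-off eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def trace_distance(rho, sigma):
    """T = (1/2)||rho - sigma||_1; for Hermitian input the 1-norm is sum |eigenvalues|."""
    return 0.5 * float(np.sum(np.abs(np.linalg.eigvalsh(rho - sigma))))

def fidelity(rho, sigma):
    """Uhlmann fidelity F = (Tr sqrt(sqrt(rho) sigma sqrt(rho)))^2."""
    s = psd_sqrt(rho)
    inner = s @ sigma @ s
    inner = 0.5 * (inner + inner.conj().T)  # enforce Hermiticity numerically
    vals = np.clip(np.linalg.eigvalsh(inner), 0.0, None)
    return float(np.sum(np.sqrt(vals)) ** 2)

def bures_distance(rho, sigma):
    """d_B = [2 (1 - sqrt(F))]^(1/2)."""
    f = min(fidelity(rho, sigma), 1.0)
    return float(np.sqrt(2.0 * (1.0 - np.sqrt(f))))

def ket_to_density(psi):
    """Density operator |psi><psi| of a (normalized) state vector."""
    psi = np.asarray(psi, dtype=complex)
    psi = psi / np.linalg.norm(psi)
    return np.outer(psi, psi.conj())

# Example: |0> versus |+>, whose squared overlap is 1/2.
sigma = ket_to_density([1, 0])
rho = ket_to_density([1, 1])
print(fidelity(rho, sigma), trace_distance(rho, sigma), bures_distance(rho, sigma))
```

For these two pure states the fidelity equals the squared overlap, and the trace distance satisfies \(T^2 = 1 - F\), which gives a quick sanity check of the implementation.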

Quantum hypothesis testing

Typically, QHT is formulated in terms of state discrimination where several quantum states have to be discriminated through a measurement28. In binary QHT, the aim is to decide whether a given unknown quantum system is in one of two states corresponding to the null and alternative hypothesis. Any such test is represented by an operator 0 ≤ M ≤ \({\mathbb{1}}_d\), which corresponds to rejecting the null in favor of the alternative. The two central quantities of interest are the probabilities of making a type-I or type-II error. The former corresponds to rejecting the null when it is true, while the latter occurs if the null is accepted when the alternative is true. Specifically, for density operators \(\sigma \in {\mathcal{S}}({\mathcal{H}})\) and \(\rho \in {\mathcal{S}}({\mathcal{H}})\) describing the null and alternative hypothesis, the type-I error probability is defined as α(M) and the type-II error probability as β(M), so that

$$\alpha (M;\ \sigma ) :={\rm{Tr}}\left[\sigma M\right]\,\qquad\qquad\qquad{{\text{(type}}{\hbox{-}}{\text{I}}\, {\text{error)}}}$$
$$\beta (M;\ \rho ):={\rm{Tr}}\left[\rho ({\mathbb{1}}-M)\right]\,\qquad\qquad\qquad{{\text{(type}}{\hbox{-}}{\text{II}}\, {\text{error)}}}$$

In the Bayesian setting, the hypotheses σ and ρ occur with prior probabilities π0 and π1, and one is concerned with finding a test that minimizes the total error probability. A Bayes optimal test M is one that minimizes the prior-weighted error probability π0α(M) + π1β(M).

In this paper, we consider asymmetric hypothesis testing (Neyman–Pearson approach)32, where the two types of errors are associated with a different cost. Given a maximal allowed probability for the type-I error, the goal is to minimize the probability of the type-II error. Specifically, one aims to solve the semidefinite program (SDP)

$$\begin{array}{ll}{\beta }_{{\alpha }_{0}}^{* }(\sigma ,\ \rho ):={\rm{minimize}}\ \ \ &\beta (M;\ \rho )\\ \qquad\qquad\qquad{\rm{s.t.}}\ \ \ &\alpha (M;\ \sigma )\le {\alpha }_{0},\\ &0\le M\le {{\mathbb{1}}}_{d}\end{array}$$

Optimal tests can be expressed in terms of projections onto the eigenspaces of the operator ρ − tσ where t is a non-negative number. More specifically, for t ≥ 0 let \({P}_{t,+}:=\{\rho -t\sigma \,>\,0\}\), \({P}_{t,-}:=\{\rho -t\sigma \,<\,0\}\) and \({P}_{t,0}:={\mathbb{1}}-{P}_{t,+}-{P}_{t,-}\) be the projections onto the eigenspaces of ρ − tσ associated with positive, negative, and zero eigenvalues. The quantum analog to the Neyman–Pearson Lemma35 shows optimality of operators of the form

$${M}_{t}:={P}_{t,+}+{X}_{t},\ \ \ 0\le {X}_{t}\le {P}_{t,0}.$$

The choice of the scalar t ≥ 0 and the operator Xt is such that the preassigned type-I error probability α0 is attained. An explicit construction for these operators is based on the inequalities

$$\alpha ({P}_{\tau ({\alpha }_{0}),+})\le {\alpha }_{0}\le \alpha ({P}_{\tau ({\alpha }_{0}),+}+{P}_{\tau ({\alpha }_{0}),0})$$

where \({\alpha }_{0}\in (0,\ 1)\) and τ(α0) is the smallest non-negative number such that \(\alpha ({P}_{\tau ({\alpha }_{0}),+})\le {\alpha }_{0}\), i.e., \(\tau ({\alpha }_{0}):=\inf \{t\ge 0:\ \alpha ({P}_{t,+})\le {\alpha }_{0}\}\). These inequalities can be seen from the observation that the function \(t\,\mapsto \,\alpha ({P}_{t,+})\) is non-increasing and right-continuous while \(t\,\mapsto \,\alpha ({P}_{t,+}+{P}_{t,0})\) is non-increasing and left-continuous. A detailed proof for this is given in Supplementary Notes 1 and 2. We will henceforth refer to operators of the form (4) as Helstrom operators32.
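To make the construction concrete, the projector \({P}_{t,+}\) and the two error probabilities can be computed directly from an eigendecomposition of ρ − tσ. The sketch below is illustrative only: the function names and the numerical tolerance `tol` used to separate positive from zero eigenvalues are assumptions, and the example uses two pure qubit states, for which the zero-eigenspace term \({X}_{t}\) plays no role at the sampled values of t.

```python
import numpy as np

def positive_projector(rho, sigma, t, tol=1e-12):
    """Projector P_{t,+} onto eigenvectors of rho - t*sigma with eigenvalue > tol."""
    vals, vecs = np.linalg.eigh(rho - t * sigma)
    v = vecs[:, vals > tol]
    return v @ v.conj().T

def type_one_error(m, sigma):
    """alpha(M; sigma) = Tr[sigma M]."""
    return float(np.real(np.trace(sigma @ m)))

def type_two_error(m, rho):
    """beta(M; rho) = Tr[rho (1 - M)]."""
    return float(np.real(np.trace(rho @ (np.eye(rho.shape[0]) - m))))

# Null hypothesis sigma = |0><0|, alternative rho at angle theta from |0>.
theta = np.pi / 6
a = np.array([1.0, 0.0])
b = np.array([np.cos(theta), np.sin(theta)])
sigma = np.outer(a, a)
rho = np.outer(b, b)

ts = [0.0, 0.5, 1.0, 2.0, 4.0]
alphas = [type_one_error(positive_projector(rho, sigma, t), sigma) for t in ts]
betas = [type_two_error(positive_projector(rho, sigma, t), rho) for t in ts]
for t, al, be in zip(ts, alphas, betas):
    print(f"t={t:.1f}  alpha={al:.4f}  beta={be:.4f}")
```

Sweeping t traces out the trade-off between the two errors: the type-I error decreases with t, as stated above, while the type-II error increases.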

Quantum classifiers

We define a K-class quantum classifier of states of the quantum system \({\mathcal{H}}\), described by density operators, as a map \({\mathcal{A}}:{\mathcal{S}}({\mathcal{H}})\to {\mathcal{C}}\) that maps states \(\sigma \in {\mathcal{S}}({\mathcal{H}})\) to class labels \(k\in {\mathcal{C}}=\{1,\ \ldots ,\ K\}\). Any such classifier is described by a completely positive and trace-preserving (CPTP) map \({\mathcal{E}}\) and a positive-operator valued measure (POVM) \({\{{{{\Pi }}}_{k}\}}_{k}\). Formally, a quantum state σ is passed through the quantum channel \({\mathcal{E}}\) and then the measurement \({\{{{{\Pi }}}_{k}\}}_{k}\) is performed. Finally, the probability of measuring outcome k is identified with the class probability yk(σ), i.e.,

$$\sigma\, \mapsto \,{{\bf{y}}}_{k}(\sigma ):={\rm{Tr}}\left[{{{\Pi }}}_{k}{\mathcal{E}}(\sigma )\right].$$

We treat the POVM element Πk as a projector \({{{\Pi }}}_{k}=\left|k\right\rangle \left\langle k\right|\otimes {{\mathbb{1}}}_{d/K}\) that determines whether the output is classified into class k. This can be done without loss of generality by Naimark’s dilation since \({\mathcal{E}}\) is kept arbitrary and potentially involves ancillary qubits and a general POVM element can be expressed as a projector on the larger Hilbert space. The final prediction is given by the most likely class

$$\begin{array}{r}{\mathcal{A}}(\sigma )\equiv \mathop{{\mathrm{arg}}\, {\mathrm{max}}}\limits_{k}{{\bf{y}}}_{k}(\sigma ).\end{array}$$

Throughout this paper, we refer to \({\mathcal{A}}\) as the classifier and to y as the score function. In the context of quantum machine learning, the input state σ can be an encoding of classical data by means of, e.g., amplitude encoding or otherwise19,36, or inherently quantum input data, while \({\mathcal{E}}\) can be realized, e.g., by a trained parametrized quantum circuit potentially involving ancillary registers37. However, it is worth noting that the above-defined notion of quantum classifier more generally describes the procedure of a broader class of quantum algorithms whose output is obtained by repeated sampling of measurement outcomes.
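As an illustration of these definitions, the following sketch builds a toy two-class classifier on two qubits. It is a minimal example under stated assumptions: the channel \({\mathcal{E}}\) is taken to be a single unitary (a rotation of the first qubit, chosen arbitrarily), class labels are 0-based rather than 1-based, and all function names are hypothetical.

```python
import numpy as np

def class_projectors(num_classes, dim):
    """Pi_k = |k><k| (x) 1_{d/K}, with 0-based class labels k."""
    assert dim % num_classes == 0
    rest = np.eye(dim // num_classes)
    projectors = []
    for k in range(num_classes):
        ket = np.zeros((num_classes, 1))
        ket[k, 0] = 1.0
        projectors.append(np.kron(ket @ ket.T, rest))
    return projectors

def scores(sigma, unitary, projectors):
    """Score function y_k(sigma) = Tr[Pi_k E(sigma)] for a unitary channel E."""
    out = unitary @ sigma @ unitary.conj().T
    return np.array([float(np.real(np.trace(p @ out))) for p in projectors])

def classify(sigma, unitary, projectors):
    """Prediction A(sigma) = argmax_k y_k(sigma)."""
    return int(np.argmax(scores(sigma, unitary, projectors)))

# Toy model: rotate the first (class) qubit by angle pi/3 about the y axis.
angle = np.pi / 3
ry = np.array([[np.cos(angle / 2), -np.sin(angle / 2)],
               [np.sin(angle / 2), np.cos(angle / 2)]])
unitary = np.kron(ry, np.eye(2))

psi = np.zeros(4)
psi[0] = 1.0  # input state |00>
sigma = np.outer(psi, psi)

projectors = class_projectors(num_classes=2, dim=4)
y = scores(sigma, unitary, projectors)
print(y, classify(sigma, unitary, projectors))
```

In an experiment the scores would be estimated from repeated measurements; here they are computed exactly from the density operator.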

Quantum adversarial robustness

Adversarial examples are attacks on classification models where an adversary aims to induce a misclassification using typically imperceptible modifications of a benign input example. Specifically, given a classifier \({\mathcal{A}}\) and a benign input state σ, an adversary can craft a small perturbation σ → ρ that results in a misclassification, i.e., \({\mathcal{A}}(\rho )\,\ne \,{\mathcal{A}}(\sigma )\). An illustration for this threat scenario is given in Fig. 1. In this paper, we seek a worst-case robustness guarantee against any possible attack: as long as ρ does not differ from σ by more than a certain amount, then it is guaranteed that \({\mathcal{A}}(\sigma )={\mathcal{A}}(\rho )\) independently of how the adversarial state ρ has been crafted. Formally, suppose the quantum classifier \({\mathcal{A}}\) takes as input a benign quantum state \(\sigma \in {\mathcal{S}}({\mathcal{H}})\) and produces a measurement outcome denoted by the class \(k\in {\mathcal{C}}\) with probability \({{\bf{y}}}_{k}(\sigma )={\rm{Tr}}\left[{{{\Pi }}}_{k}{\mathcal{E}}(\sigma )\right]\). Recall that the prediction of \({\mathcal{A}}\) is taken to be the most likely class \({k}_{{\rm{A}}}=\arg {\max }_{k}{{\bf{y}}}_{k}(\sigma )\). An adversary aims to alter the output probability distribution so as to change the most likely class by applying an arbitrary quantum operation \({{\mathcal{E}}}_{{\rm{A}}}:{\mathcal{S}}({\mathcal{H}})\to {\mathcal{S}}({\mathcal{H}})\) to σ resulting in the adversarial state \(\rho ={{\mathcal{E}}}_{{\rm{A}}}(\sigma )\). Finally, we say that the classifier y is provably robust around σ with respect to the robustness condition \({\mathcal{R}}\), if for any ρ that satisfies \({\mathcal{R}}\), it is guaranteed that \({\mathcal{A}}(\rho )={\mathcal{A}}(\sigma )\).

Fig. 1: Adversarial attack.

a A quantum classifier correctly classifies the (toxic) mushroom as “poisonous”. b An adversary perturbs the image to fool the classifier into believing that the mushroom is “edible”.

In the following, we will derive a robustness condition for quantum classifiers with the QHT formalism, which provides a provable guarantee for the outcome of a computation being unaffected by the worst-case input noise or perturbation under a given set of constraints. In the regime where the most likely class is measured with probability lower bounded by pA > 1/2 and the runner-up class is less likely than pB = 1 − pA, we prove tightness of the robustness bound, hence demonstrating that the QHT condition is at least partially optimal. The QHT robustness condition, in its full generality, has an SDP formulation in terms of the optimal type-II error probabilities. We then simplify this condition and derive closed-form solutions in terms of Uhlmann fidelity, Bures metric, and trace distance between benign and adversarial inputs. The closed-form solutions in terms of fidelity and Bures metric are shown to be sufficient and necessary for general states in the same regime where the SDP formulation is proven to be tight. In the case of trace distance, this can be claimed for pure states, while the bound for mixed states turns out to be weaker. These results stemming from QHT considerations are then contrasted and compared with an alternative approach that directly applies Hölder duality to trace distances to obtain a sufficient robustness condition. The different robustness bounds and robustness conditions are summarized in Table 1.

Table 1 Summary of results.

Robustness condition from quantum hypothesis testing

Recall that QHT is concerned with the question of finding measurements that optimally discriminate between two states. A measurement is said to be optimal if it minimizes the probabilities of identifying the quantum system to be in the state σ, corresponding to the null hypothesis, when in fact it is in the alternative state ρ, and vice versa. When considering provable robustness, on the other hand, one aims to find a neighborhood around a benign state σ where the class that is most likely to be measured is constant or, expressed differently, where the classifier cannot discriminate between states. It becomes thus clear that QHT and classification robustness aim to achieve a similar goal, although viewed from different angles. Indeed, as it turns out, QHT determines the robust region around σ to be the set of states (i.e., alternative hypotheses) for which the optimal type-II error probability β* is larger than 1/2.

To establish this connection more formally, we identify the benign state with the null hypothesis σ and the adversarial state with the alternative ρ. We note that, in the Heisenberg picture, we can identify the score function y of a classifier \({\mathcal{A}}\) with a POVM \({\{{{{\Pi }}}_{k}\}}_{k}\). For \({k}_{{\rm{A}}}={\mathcal{A}}(\sigma )\), the operator \({\mathbb{1}}-{{{\Pi }}}_{{k}_{{\rm{A}}}}\) (and thus the classifier \({\mathcal{A}}\)) can be viewed as a hypothesis test discriminating between σ and ρ. Notice that, for \({p}_{{\rm{A}}}\in [0,\ 1]\) with \({{\bf{y}}}_{{k}_{{\rm{A}}}}(\sigma )={\rm{Tr}}\left[{{{\Pi }}}_{{k}_{{\rm{A}}}}\sigma \right]\ge {p}_{{\rm{A}}}\), the operator \({{\mathbb{1}}}_{d}-{{{\Pi }}}_{{k}_{{\rm{A}}}}\) has type-I error \(\alpha ({{\mathbb{1}}}_{d}-{{{\Pi }}}_{{k}_{{\rm{A}}}};\ \sigma )=1-{{\bf{y}}}_{{k}_{{\rm{A}}}}(\sigma )\le 1-{p}_{{\rm{A}}}\). It is therefore feasible for the SDP \({\beta }_{1-{p}_{{\rm{A}}}}^{* }(\sigma ,\ \rho )\) in (3) and hence

$${{\bf{y}}}_{{k}_{{\rm{A}}}}(\rho )=\beta ({{\mathbb{1}}}_{d}-{{{\Pi }}}_{{k}_{{\rm{A}}}};\ \rho )\ge {\beta }_{1-{p}_{{\rm{A}}}}^{* }(\sigma ,\ \rho ).$$

Thus, it is guaranteed that \({k}_{{\rm{A}}}={\mathcal{A}}(\rho )\) for any ρ with \({\beta }_{1-{p}_{{\rm{A}}}}^{* }(\sigma ,\ \rho )\,>\,1/2\). The following theorem makes this reasoning concise and extends to the setting where the probability of measuring the second most likely class is upper-bounded by pB.

Theorem 1 (QHT robustness bound) Let \(\sigma ,\ \rho \in {\mathcal{S}}({\mathcal{H}})\) be benign and adversarial quantum states and let \({\mathcal{A}}\) be a quantum classifier with score function y. Suppose that for \({k}_{{\rm{A}}}\in {\mathcal{C}}\) and \({p}_{{\rm{A}}},\ {p}_{{\rm{B}}}\in [0,\ 1]\), the score function y satisfies

$${{\bf{y}}}_{{k}_{{\rm{A}}}}(\sigma )\ge {p}_{{\rm{A}}}\,>\,{p}_{{\rm{B}}}\ge \mathop{\max }\limits_{k\ne {k}_{{\rm{A}}}}{{\bf{y}}}_{k}(\sigma ).$$

Then, it is guaranteed that \({\mathcal{A}}(\rho )={\mathcal{A}}(\sigma )\) for any ρ with

$${\beta }_{1-{p}_{{\rm{A}}}}^{* }(\sigma ,\ \rho )+{\beta }_{{p}_{{\rm{B}}}}^{* }(\sigma ,\ \rho )\,>\,1$$

To get some more intuition of Theorem 1, we first note that for pB = 1 − pA, the robustness condition (10) simplifies to

$${\beta }_{1-{p}_{{\rm{A}}}}^{* }(\sigma ,\ \rho )\,>\,1/2$$

With this, the relation between QHT and robustness becomes more evident: if the optimal hypothesis test performs poorly when discriminating the two states, then a classifier will predict both states to belong to the same class. In other words, viewing a classifier as a hypothesis test between the benign input σ and the adversarial ρ, the optimality of the Helstrom operators implies that the classifier is at best as good a discriminator as the optimal test and will therefore also not distinguish the states, or, phrased differently, it is robust. This result formalizes the intuitive connection between QHT and robustness of quantum classifiers. While the former is concerned with finding operators that are optimal for discriminating two states, the latter is concerned with finding conditions on states for which a classifier does not discriminate.
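For pure states, condition (10) can be evaluated without solving the SDP numerically. The sketch below assumes the closed-form solution of the Neyman–Pearson problem for two pure states with fidelity F, namely \({\beta }_{{\alpha }_{0}}^{* }=1-{(\sqrt{{\alpha }_{0}F}+\sqrt{(1-{\alpha }_{0})(1-F)})}^{2}\) for α0 < F and zero otherwise. This expression is not written out in the text above; it follows from restricting the problem to the two-dimensional span of the two states and is stated here as an assumption. The function names are illustrative.

```python
import numpy as np

def beta_star_pure(alpha0, fid):
    """Optimal type-II error for two pure states with fidelity `fid`
    at type-I level alpha0 (assumed closed form, see lead-in)."""
    if alpha0 >= fid:
        return 0.0
    overlap = np.sqrt(alpha0 * fid) + np.sqrt((1.0 - alpha0) * (1.0 - fid))
    return float(1.0 - overlap ** 2)

def qht_robust(p_a, p_b, fid):
    """Condition (10): beta*_{1 - p_A} + beta*_{p_B} > 1."""
    return beta_star_pure(1.0 - p_a, fid) + beta_star_pure(p_b, fid) > 1.0

# Benign input classified with p_A = 0.8 and runner-up bound p_B = 0.2.
for fid in (0.95, 0.85):
    print(fid, qht_robust(0.8, 0.2, fid))
```

With pA = 0.8 and pB = 0.2, an adversarial pure state at fidelity 0.95 passes the check while one at fidelity 0.85 does not.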


The robustness condition (10) from QHT is provably optimal in the regime of pA + pB = 1, which covers binary classifications in full generality and multiclass classification where the most likely class is measured with probability larger than \({p}_{{\rm{A}}}\,>\,\frac{1}{2}\). The robustness condition is tight in the sense that, whenever condition (10) is violated, then there exists a classifier \({{\mathcal{A}}}^{\star }\) that is consistent with the class probabilities (9) on the benign input but that will classify the adversarial input differently from the benign input. The following theorem demonstrates this notion of tightness by explicitly constructing the worst-case classifier \({{\mathcal{A}}}^{\star }\).

Theorem 2 (Tightness) Suppose that pA + pB = 1. Then, if the adversarial state ρ violates condition (10), there exists a quantum classifier \({{\mathcal{A}}}^{\star }\) that is consistent with the class probabilities (9) and for which \({{\mathcal{A}}}^{\star }(\rho )\,\ne \,{{\mathcal{A}}}^{\star }(\sigma )\).

The main idea of the proof relies on the explicit construction, with Helstrom operators, of a “worst-case” classifier that classifies ρ differently from σ while still being consistent with the class probabilities (9). We refer the reader to “Methods” for a detailed proof. Whether or not the QHT robustness condition is tight for pA + pB < 1 is an interesting open question for future research. It turns out that a worst-case classifier that is consistent with pA and pB for benign input but leads to a different classification on adversarial input upon violating condition (10), if it exists, is more challenging to construct for these cases. If such a tightness result were proven for all class probability regimes, there would be a complete characterization of the robustness of quantum classifiers.

Closed form robustness conditions

Although Theorem 1 provides a general condition for robustness with provable tightness, it is formulated as an SDP in terms of type-II error probabilities of QHT. To get a more intuitive and operationally convenient perspective, we wish to derive a condition for robustness in terms of a meaningful notion of difference between quantum states. Specifically, based on Theorem 1, here we derive robustness conditions expressed in terms of Uhlmann’s fidelity F, Bures distance dB, and in terms of the trace distance T. To that end, we first concentrate on pure state inputs and will then leverage these bounds to mixed states. Finally, we show that expressing robustness in terms of fidelity or Bures distance results in a tight bound for both pure and mixed states, while for trace distance the same can only be claimed in the case of pure states.

Pure states

We first assume that both the benign and the adversarial states are pure. This assumption allows us to write the optimal type-II error probabilities \({\beta }_{\alpha }^{* }(\sigma ,\ \rho )\) as a function of α and the fidelity between ρ and σ. This leads to a robustness bound on the fidelity and subsequently to a bound on the trace distance and on the Bures distance. Finally, since these conditions are equivalent to the QHT robustness condition (10), Theorem 2 implies tightness of these bounds.

Lemma 1 Let \(\left|{\psi }_{\sigma }\right\rangle ,\ \left|{\psi }_{\rho }\right\rangle \in {\mathcal{H}}\) and let \({\mathcal{A}}\) be a quantum classifier. Suppose that for \({k}_{{\rm{A}}}\in {\mathcal{C}}\) and \({p}_{{\rm{A}}},\ {p}_{{\rm{B}}}\in [0,\ 1]\), we have \({k}_{{\rm{A}}}={\mathcal{A}}({\psi }_{\sigma })\) and suppose that the score function y satisfies (9). Then, it is guaranteed that \({\mathcal{A}}({\psi }_{\rho })={\mathcal{A}}({\psi }_{\sigma })\) for any ψρ with

$${\left|\langle {\psi }_{\sigma }| {\psi }_{\rho }\rangle \right|}^{2}\,>\,\frac{1}{2}\left(1+\sqrt{g({p}_{{\rm{A}}},\ {p}_{{\rm{B}}})}\right),$$

where the function g is given by

$$\begin{array}{ll}g({p}_{{\rm{A}}},\ {p}_{{\rm{B}}})=1-{p}_{{\rm{B}}}-{p}_{{\rm{A}}}(1-2{p}_{{\rm{B}}}) +\,2\sqrt{{p}_{{\rm{A}}}{p}_{{\rm{B}}}(1-{p}_{{\rm{A}}})(1-{p}_{{\rm{B}}})}.\end{array}$$

This condition is equivalent to (10) and is hence both sufficient and necessary whenever pA + pB = 1.

This result thus provides a closed form robustness bound that is equivalent to the SDP formulation in condition (10) and is hence sufficient and necessary in the regime pA + pB = 1. We remark that, under this assumption, the robustness bound (12) has the compact form

$${\left|\langle {\psi }_{\sigma }| {\psi }_{\rho }\rangle \right|}^{2}\,>\,\frac{1}{2}+\sqrt{{p}_{{\rm{A}}}(1-{p}_{{\rm{A}}})}.$$
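As a check, the compact form follows by substituting pB = 1 − pA into the definition of g in (13):

$$\begin{array}{rl}g({p}_{{\rm{A}}},\ 1-{p}_{{\rm{A}}})&=1-(1-{p}_{{\rm{A}}})-{p}_{{\rm{A}}}(2{p}_{{\rm{A}}}-1)+2\sqrt{{p}_{{\rm{A}}}^{2}{(1-{p}_{{\rm{A}}})}^{2}}\\ &={p}_{{\rm{A}}}-2{p}_{{\rm{A}}}^{2}+{p}_{{\rm{A}}}+2{p}_{{\rm{A}}}(1-{p}_{{\rm{A}}})\\ &=4{p}_{{\rm{A}}}(1-{p}_{{\rm{A}}}),\end{array}$$

so that \(\sqrt{g}=2\sqrt{{p}_{{\rm{A}}}(1-{p}_{{\rm{A}}})}\) and \(\frac{1}{2}(1+\sqrt{g})=\frac{1}{2}+\sqrt{{p}_{{\rm{A}}}(1-{p}_{{\rm{A}}})}\). For example, pA = 0.8 gives g = 0.64 and a fidelity threshold of 0.9.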

Due to the relation between the Bures metric and the Uhlmann fidelity, it is straightforward to obtain a robustness condition in terms of the Bures metric. Namely, the condition

$${d}_{{\rm{B}}}(\left|{\psi }_{\rho }\right\rangle ,\ \left|{\psi }_{\sigma }\right\rangle )\,<\,{\left[2-\sqrt{2(1+\sqrt{g({p}_{{\rm{A}}},{p}_{{\rm{B}}})})}\right]}^{\frac{1}{2}}$$

is equivalent to (10). Furthermore, since the states are pure, we can directly link (12) to a bound in terms of the trace distance via the relation \(T{(\left|{\psi }_{\rho }\right\rangle ,\left|{\psi }_{\sigma }\right\rangle )}^{2}=1-{\left|\langle {\psi }_{\sigma }| {\psi }_{\rho }\rangle \right|}^{2}\), so that

$$T(\left|{\psi }_{\rho }\right\rangle ,\ \left|{\psi }_{\sigma }\right\rangle )\,<\,{\left[\frac{1}{2}\left(1-\sqrt{g({p}_{{\rm{A}}},{p}_{{\rm{B}}})}\right)\right]}^{\frac{1}{2}}$$

is equivalent to (10). Due to the equivalence of these bounds to (10), Theorem 2 applies and it follows that both bounds are sufficient and necessary in the regime where pA + pB = 1. In the following, we will extend these results to mixed states and show that both the fidelity and Bures metric bounds are tight.
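The three pure-state thresholds in (12), (15), and (16) are simple functions of pA and pB and can be tabulated directly. The following sketch uses only the formulas stated above; the function names are illustrative, and the clamping of g to [0, 1] is an added numerical safeguard against round-off rather than part of the original definitions.

```python
import math

def g(p_a, p_b):
    """The function g(p_A, p_B) from (13), clamped to [0, 1] against round-off."""
    value = (1.0 - p_b - p_a * (1.0 - 2.0 * p_b)
             + 2.0 * math.sqrt(p_a * p_b * (1.0 - p_a) * (1.0 - p_b)))
    return min(1.0, max(0.0, value))

def fidelity_threshold(p_a, p_b):
    """Right-hand side of (12): robust if the fidelity exceeds this value."""
    return 0.5 * (1.0 + math.sqrt(g(p_a, p_b)))

def bures_radius(p_a, p_b):
    """Right-hand side of (15): robust if the Bures distance is below this value."""
    return math.sqrt(2.0 - math.sqrt(2.0 * (1.0 + math.sqrt(g(p_a, p_b)))))

def trace_radius(p_a, p_b):
    """Right-hand side of (16): robust if the trace distance is below this value."""
    return math.sqrt(0.5 * (1.0 - math.sqrt(g(p_a, p_b))))

for p_a, p_b in [(0.8, 0.2), (0.8, 0.1), (0.99, 0.01)]:
    print(p_a, p_b,
          round(fidelity_threshold(p_a, p_b), 4),
          round(bures_radius(p_a, p_b), 4),
          round(trace_radius(p_a, p_b), 4))
```

As expected, a more confident prediction on the benign input (larger pA, smaller pB) lowers the fidelity threshold and enlarges both certified radii.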

Mixed states

Reasoning about the robustness of a classifier when the input states are mixed, rather than only pure, is practically relevant for a number of reasons. First, in a realistic scenario, the assumption that an adversary can only produce pure states is too restrictive and gives an incomplete picture. Second, if we wish to reason about the resilience of a classifier against a given noise model (e.g., amplitude damping), then the robustness condition needs to be valid for mixed states, as these noise models typically produce mixed states. Finally, in the case where we wish to certify whether a classification on a noisy input has had the same outcome as on the noiseless input, a robustness condition for mixed states is also required. For these reasons, and having established closed-form robustness bounds that are both sufficient and necessary for pure states, here we aim to extend these results to the mixed state setting. The following theorem extends the fidelity bound (12) to mixed states. As for pure states, it is then straightforward to obtain a bound in terms of the Bures metric.

Theorem 3 Let \(\sigma ,\ \rho \in {\mathcal{S}}({\mathcal{H}})\) and let \({\mathcal{A}}\) be a quantum classifier. Suppose that for \({k}_{{\rm{A}}}\in {\mathcal{C}}\) and \({p}_{{\rm{A}}},\ {p}_{{\rm{B}}}\in [0,\ 1]\), we have \({k}_{{\rm{A}}}={\mathcal{A}}(\sigma )\) and suppose that the score function y satisfies (9). Then, it is guaranteed that \({\mathcal{A}}(\rho )={\mathcal{A}}(\sigma )\) for any ρ with

$$F(\rho ,\ \sigma )\,>\,\frac{1}{2}\left(1+\sqrt{g({p}_{{\rm{A}}},\ {p}_{{\rm{B}}})}\right)=:{r}_{{\rm{F}}}$$

where g is defined as in (13). This condition is both sufficient and necessary if pA + pB = 1.

Proof To show sufficiency of (17), we notice that y can be rewritten as

$${{\bf{y}}}_{k}(\sigma )={\rm{Tr}}\left[{{{\Pi }}}_{k}{\mathcal{E}}(\sigma )\right]$$
$$={\rm{Tr}}\left[{{{\Pi }}}_{k}({\mathcal{E}}\circ {{\rm{Tr}}}_{{\rm{E}}})(\left|{\psi }_{\sigma }\right\rangle \left\langle {\psi }_{\sigma }\right|)\right]$$

where \(\left|{\psi }_{\sigma }\right\rangle\) is a purification of σ with purifying system E and \({{\rm{Tr}}}_{{\rm{E}}}\) denotes the partial trace over E. We can thus view y as a score function on the larger Hilbert space that admits the same class probabilities for σ and any purification of σ (and equally for ρ). It follows from Uhlmann’s Theorem that there exist purifications \(\left|{\psi }_{\sigma }\right\rangle\) and \(\left|{\psi }_{\rho }\right\rangle\) such that \(F(\rho ,\ \sigma )={\left|\langle {\psi }_{\sigma }| {\psi }_{\rho }\rangle \right|}^{2}\). Robustness at ρ then follows from (17) by (18) and Lemma 1. To see that the bound is necessary when pA + pB = 1, suppose that there exists some \({\widetilde{r}}_{{\rm{F}}}<{r}_{{\rm{F}}}\) such that \(F(\sigma ,\ \rho )\,>\,{\widetilde{r}}_{{\rm{F}}}\) implies that \({\mathcal{A}}(\rho )={\mathcal{A}}(\sigma )\). Since pure states are a subset of mixed states, this bound must also hold for pure states. In particular, suppose \(\left|{\psi }_{\rho }\right\rangle\) is such that \({\widetilde{r}}_{{\rm{F}}}<{\left|\langle {\psi }_{\rho }| {\psi }_{\sigma }\rangle \right|}^{2}\le {r}_{{\rm{F}}}\), so that robustness at \(\left|{\psi }_{\rho }\right\rangle\) would be guaranteed for every classifier satisfying (9). However, this is a contradiction, since \({\left|\langle {\psi }_{\rho }| {\psi }_{\sigma }\rangle \right|}^{2}\,>\,{r}_{{\rm{F}}}\) is both sufficient and necessary in the given regime, i.e., by Theorem 2, there exists a classifier \({{\mathcal{A}}}^{\star }\) whose score function satisfies (9) and for which \({{\mathcal{A}}}^{\star }({\psi }_{\sigma })\,\ne \,{{\mathcal{A}}}^{\star }({\psi }_{\rho })\). It follows that \({\widetilde{r}}_{{\rm{F}}}\ge {r}_{{\rm{F}}}\) and hence the claim of the theorem. □
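The fidelity condition (17) can be used to check robustness against a concrete noise model. The sketch below is a hypothetical example, not a protocol from the text: it assumes a pure benign state, models the perturbation as a depolarizing mixture \(\rho =(1-p)\sigma +p\,{\mathbb{1}}/d\), and uses the fact that the fidelity with a pure state reduces to \(\left\langle \psi \right|\rho \left|\psi \right\rangle\). The function names and the values pA = 0.8, pB = 0.2 are illustrative assumptions.

```python
import math
import numpy as np

def g(p_a, p_b):
    """The function g(p_A, p_B) from (13)."""
    return (1.0 - p_b - p_a * (1.0 - 2.0 * p_b)
            + 2.0 * math.sqrt(p_a * p_b * (1.0 - p_a) * (1.0 - p_b)))

def fidelity_threshold(p_a, p_b):
    """r_F from (17)."""
    return 0.5 * (1.0 + math.sqrt(g(p_a, p_b)))

def depolarize(sigma, p):
    """Mix sigma with the maximally mixed state with weight p (assumed noise model)."""
    d = sigma.shape[0]
    return (1.0 - p) * sigma + p * np.eye(d) / d

def fidelity_with_pure(psi, rho):
    """F(rho, |psi><psi|) = <psi| rho |psi> when one argument is pure."""
    return float(np.real(np.conj(psi) @ rho @ psi))

def certified(psi, rho, p_a, p_b):
    """True if (17) guarantees the same prediction on rho as on |psi><psi|."""
    return fidelity_with_pure(psi, rho) > fidelity_threshold(p_a, p_b)

psi = np.array([1.0, 0.0], dtype=complex)
sigma = np.outer(psi, psi.conj())
for p in (0.1, 0.3):
    rho = depolarize(sigma, p)
    print(p, fidelity_with_pure(psi, rho), certified(psi, rho, 0.8, 0.2))
```

For a qubit the fidelity is 1 − p/2, so with a threshold of 0.9 the prediction is certified for p = 0.1 but not for p = 0.3.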

Due to the close relation between Uhlmann fidelity and the Bures metric, we arrive at a robustness condition for mixed states in terms of dB, namely

$${d}_{{\rm{B}}}(\rho ,\ \sigma )<{\left[2-\sqrt{2(1+\sqrt{g({p}_{{\rm{A}}},{p}_{{\rm{B}}})})}\right]}^{\frac{1}{2}}$$

that inherits the tightness properties of the fidelity bound (17). In contrast to the pure state case, here it is less straightforward to obtain a robustness bound in terms of trace distance. However, we can still build on Lemma 1 and the trace distance bound for pure states (16) to obtain a sufficient robustness condition. Namely, when assuming that the benign state is pure, but the adversarial state is allowed to be mixed, we have the following result.

Corollary 1 (Pure benign and mixed adversarial states) Let \(\sigma ,\ \rho \in {\mathcal{S}}({\mathcal{H}})\) and suppose that \(\sigma =\left|{\psi }_{\sigma }\right\rangle \left\langle {\psi }_{\sigma }\right|\) is pure. Let \({\mathcal{A}}\) be a quantum classifier and suppose that for \({k}_{{\rm{A}}}\in {\mathcal{C}}\) and pA, pB ∈ [0, 1], we have \({k}_{{\rm{A}}}={\mathcal{A}}(\sigma )\) and suppose that the score function y satisfies (9). Then, it is guaranteed that \({\mathcal{A}}(\rho )={\mathcal{A}}(\sigma )\) for any ρ with

$$T(\rho ,\ \sigma )\,<\,\delta ({p}_{{\rm{A}}},\ {p}_{{\rm{B}}})\left(1-\sqrt{1-\delta {({p}_{{\rm{A}}},{p}_{{\rm{B}}})}^{2}}\right)$$

where \(\delta ({p}_{{\rm{A}}},\ {p}_{{\rm{B}}})={[\frac{1}{2}\left(1-g({p}_{{\rm{A}}},{p}_{{\rm{B}}})\right)]}^{\frac{1}{2}}.\)

We refer the reader to Supplementary Note 4 for a detailed proof of this result. Intuitively, condition (21) is derived by noting that any convex mixture of robust pure states must also be robust; thus, membership of the set of mixed states enclosed by the convex hull of robust pure states (certified by Eq. (16)) is a natural sufficient condition for robustness. As such, the corresponding robustness radius in condition (21) is obtained by lower-bounding, with triangle inequalities, the radius of the maximal sphere centered at σ within the convex hull. However, the generalization from Lemma 1 and Eq. (16) to Corollary 1, mediated by the above geometrical argument, results in a sacrifice of tightness. How or to what extent such loosening of the explicit bound in the cases of mixed states may be avoided or ameliorated remains an open question. In the following, we compare the trace distance bounds from QHT with a robustness condition derived from an entirely different technique.

We note that a sufficient condition can be obtained from a somewhat straightforward application of Hölder duality for trace norms:

Lemma 2 (Hölder duality bound) Let \(\sigma ,\ \rho \in {\mathcal{S}}({\mathcal{H}})\) be arbitrary quantum states and let \({\mathcal{A}}\) be a quantum classifier. Suppose that for \({k}_{{\rm{A}}}\in {\mathcal{C}}\) and pA, pB ∈ [0, 1], we have \({k}_{{\rm{A}}}={\mathcal{A}}(\sigma )\) and the score function y satisfies (9). Then, it is guaranteed that \({\mathcal{A}}(\rho )={\mathcal{A}}(\sigma )\) for any ρ with

$$\frac{1}{2}\left|| \rho -\sigma \right|{| }_{1}\,<\,\frac{{p}_{{\rm{A}}}-{p}_{{\rm{B}}}}{2}.$$

Proof Let \(\delta :=\frac{1}{2}| | \rho -\sigma | {| }_{1}={\sup }_{0\le P\le {\mathbb{I}}}{\rm{Tr}}\left[P(\rho -\sigma )\right]\), which follows from Hölder duality. We have that \({{\bf{y}}}_{{k}_{{\rm{A}}}}(\sigma )-{{\bf{y}}}_{{k}_{{\rm{A}}}}(\rho )\le \delta\) and that \({{\bf{y}}}_{{k}_{{\rm{A}}}}(\sigma )\ge {p}_{{\rm{A}}}\), hence \({{\bf{y}}}_{{k}_{{\rm{A}}}}(\rho )\ge {p}_{{\rm{A}}}-\delta\). We also have, for \(k^{\prime}\) such that \({{\bf{y}}}_{{\rm{k}}^{\prime} }(\rho )={\max }_{k\ne {k}_{{\rm{A}}}}{{\bf{y}}}_{k}(\rho )\), that \({{\bf{y}}}_{{\rm{k}}^{\prime} }(\rho )-{{\bf{y}}}_{{\rm{k}}^{\prime} }(\sigma )\le \delta\), and that \({{\bf{y}}}_{{\rm{k}}^{\prime} }(\sigma )\le {p}_{{\rm{B}}}\), hence \({\max }_{k\ne {k}_{{\rm{A}}}}{{\bf{y}}}_{k}(\rho )\le {p}_{{\rm{B}}}+\delta\). Thus, \(\frac{1}{2}\left|| \rho -\sigma \right|{| }_{1}<\frac{{p}_{{\rm{A}}}-{p}_{{\rm{B}}}}{2}\ \iff \ {p}_{{\rm{A}}}-\delta >{p}_{{\rm{B}}}+\delta \ \Rightarrow \ {{\bf{y}}}_{{k}_{{\rm{A}}}}(\rho )>{\max }_{k\ne {k}_{{\rm{A}}}}{{\bf{y}}}_{k}(\rho )\). □
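As a numerical illustration of Lemma 2, the following minimal Python sketch evaluates the trace distance between two single-qubit density matrices from the eigenvalues of their difference and compares it with the radius (pA − pB)/2. The helper names (`pure_qubit`, `trace_distance`, `hoelder_certified`) and the example parameters are illustrative assumptions and are not taken from the original work.

```python
import cmath
import math


def pure_qubit(theta, phi):
    """Density matrix of cos(theta/2)|0> + exp(i*phi) sin(theta/2)|1> as a nested list."""
    a = math.cos(theta / 2)
    b = cmath.exp(1j * phi) * math.sin(theta / 2)
    return [[a * a, a * b.conjugate()], [a * b, abs(b) ** 2]]


def trace_distance(rho, sigma):
    """(1/2)||rho - sigma||_1 for 2x2 Hermitian matrices, from the eigenvalues of rho - sigma."""
    a = complex(rho[0][0] - sigma[0][0]).real
    d = complex(rho[1][1] - sigma[1][1]).real
    b = complex(rho[0][1] - sigma[0][1])
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    # The eigenvalues of the difference are mean + radius and mean - radius.
    return 0.5 * (abs(mean + radius) + abs(mean - radius))


def hoelder_certified(rho, sigma, p_a, p_b):
    """Sufficient condition of Lemma 2: T(rho, sigma) < (p_a - p_b) / 2."""
    return trace_distance(rho, sigma) < (p_a - p_b) / 2


sigma = pure_qubit(0.0, 0.0)                     # benign state |0>
rho_far = pure_qubit(math.pi / 3, math.pi / 6)   # polar angle pi/3 (arbitrary choice)
rho_near = pure_qubit(0.7, math.pi / 6)          # a closer state (arbitrary choice)

print(trace_distance(rho_far, sigma), hoelder_certified(rho_far, sigma, 0.9, 0.1))
print(trace_distance(rho_near, sigma), hoelder_certified(rho_near, sigma, 0.9, 0.1))
```

For pure states the computed distance reduces to sin(θ/2), so with pA = 0.9 and pB = 0.1 only states with sin(θ/2) < 0.4 pass this sufficient test.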

We acknowledge that the above robustness bound from Hölder duality was independently discovered in Lemma 1 of ref. 18. For intuitive insights, it is worth remarking that condition (22) stems from comparing the maximum probability of distinguishing σ and ρ with the optimal measurement (Hölder measurement) with the gap between the first two class probabilities on σ. Since no classifier can distinguish σ and ρ better than the Hölder measurement by definition, (22) is clearly a sufficient condition. However, the Hölder measurement on σ does not necessarily result in class probabilities consistent with Eq. (9). Without additional constraints on desired class probabilities on the benign input, the robustness condition (22) from Hölder duality is stronger than necessary. In contrast, the QHT bound from Theorem 1, albeit implicitly written in the language of hypothesis testing, naturally incorporates such desired constraints. Hence, as expected, this gives rise to a tighter robustness condition.

In summary, the closed form solutions in terms of fidelity and Bures metric completely inherit the tightness of Theorem 1, while for trace distance, tightness is inherited for pure states, but partially lost in Corollary 1 for mixed adversarial states. The numerical comparison between the trace distance bounds from QHT and the Hölder duality bound is shown in a contour plot in Fig. 2.

Fig. 2: Comparison between robustness bounds in terms of trace distance.

a Difference rQ − rH between the pure state bound derived from QHT rQ, given in Eq. (16) and the Hölder duality bound rH from Lemma 2. b Difference \({r}_{{\rm{H}}}-{\widetilde{r}}_{{\rm{Q}}}\) between the Hölder duality bound rH and the bound \({\widetilde{r}}_{{\rm{Q}}}\) derived from the convex hull approximation to the QHT robustness condition from Theorem 1 for mixed adversarial states. It can be seen that the pure state bound rQ is always larger than rH which in turn is always larger than the convex hull approximation bound \({\widetilde{r}}_{{\rm{Q}}}\).
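The ordering between the pure state QHT radius and the Hölder radius stated in the caption can be checked numerically in the regime pB = 1 − pA. The sketch below is a hedged illustration: it uses the fidelity bound in the form 1/2 + √(pA(1 − pA)), the Bures bound rewritten as [2 − 2√rF]^{1/2}, and assumes the standard pure state relation T = √(1 − F) to obtain the trace distance radius. All function names are hypothetical.

```python
import math


def fidelity_radius(p_a):
    """Minimum robust fidelity r_F = 1/2 + sqrt(p_a (1 - p_a)) for p_b = 1 - p_a."""
    return 0.5 + math.sqrt(p_a * (1.0 - p_a))


def bures_radius(p_a):
    """Bures radius [2 - 2 sqrt(r_F)]^(1/2), using 1 + sqrt(g) = 2 r_F."""
    return math.sqrt(max(0.0, 2.0 - 2.0 * math.sqrt(fidelity_radius(p_a))))


def pure_state_trace_radius(p_a):
    """Assumed pure state radius r_Q = sqrt(1 - r_F), from T = sqrt(1 - F) for pure states."""
    return math.sqrt(max(0.0, 1.0 - fidelity_radius(p_a)))


def hoelder_radius(p_a):
    """Radius of Lemma 2 with p_b = 1 - p_a: r_H = (2 p_a - 1) / 2."""
    return (2.0 * p_a - 1.0) / 2.0


for p_a in (0.6, 0.75, 0.9, 0.99):
    print(p_a, fidelity_radius(p_a), bures_radius(p_a),
          pure_state_trace_radius(p_a), hoelder_radius(p_a))
```

Writing x = pA − 1/2, the squared pure state radius is 1/2 − √(1/4 − x²), which is never smaller than x², so rQ ≥ rH on the whole interval.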

Toy example with single-qubit pure states

We now present a simple example to highlight the connection between QHT and classification robustness. We consider a single-qubit system that is prepared either in the state σ or ρ described by

$$\left|\sigma \right\rangle =\left|0\right\rangle ,$$
$$\left|\rho \right\rangle =\cos ({\theta }_{0}/2)\left|0\right\rangle +\sin ({\theta }_{0}/2){e}^{i{\phi }_{0}}\left|1\right\rangle$$

with θ0 ∈ [0, π) and ϕ0 ∈ [0, 2π). The state σ corresponds to the null hypothesis in the QHT setting and to the benign state in the classification setting. Similarly, ρ corresponds to the alternative hypothesis and adversarial state. The operators that are central to both QHT and robustness are the Helstrom operators (4) that are derived from the projection operators onto the eigenspaces associated with the non-negative eigenvalues of the operator ρ − tσ. For this example, the eigenvalues are functions of t ≥ 0 and given by

$${\eta }_{1}=\frac{1}{2}(1-t)+R\,>\,0,$$
$${\eta }_{2}=\frac{1}{2}(1-t)-R\le 0$$
$$R=\frac{1}{2}\sqrt{{(1-t)}^{2}+4t(1-{\left|\gamma \right|}^{2})}$$

where γ is the overlap between σ and ρ and given by \(\gamma =\cos ({\theta }_{0}/2)\). For t > 0, the Helstrom operators are then given by the projection onto the eigenspace associated with the eigenvalue η1 > 0. The projection operator is given by \({M}_{t}=\left|{\eta }_{1}\right\rangle \left\langle {\eta }_{1}\right|\) with

$$\left|{\eta }_{1}\right\rangle =(1-{\eta }_{1}){A}_{1}\left|0\right\rangle -\gamma {A}_{1}\left|\rho \right\rangle$$
$${\left|{A}_{1}\right|}^{-2}=2R\left|{\eta }_{1}-{\sin }^{2}({\theta }_{0}/2)\right|$$

where A1 is a normalization constant ensuring that 〈η1|η1〉 = 1. Given a preassigned probability α0 for the maximal allowed type-I error probability, we determine t such that α(Mt) = α0.

Hypothesis testing view

In QHT, we are given a specific alternative hypothesis ρ and error probability α0 and are interested in finding the minimal type-II error probability. In this example, we pick θ0 = π/3, ϕ0 = π/6 for the alternative state and set the type-I error probability to α0 = 1 − pA = 0.1. These states are graphically represented on the Bloch sphere in Fig. 3. We note that, for this choice of states, we obtain an expression for the eigenvector \(\left|{\eta }_{1}\right\rangle\) given by

$$\left|{\eta }_{1}\right\rangle =\frac{9-\sqrt{3}}{\sqrt{30}}\left|0\right\rangle -3\sqrt{\frac{2}{5}}\left|\rho \right\rangle .$$

that yields the type-II error probability

$${\beta }_{1-{p}_{{\rm{A}}}}^{* }(\sigma ,\ \rho )=\ \beta ({M}_{t})=1-{\left|\langle {\eta }_{1}| \rho \rangle \right|}^{2}\approx 0.44\,<\,1/2.$$

We thus see that the optimal hypothesis test can discriminate σ and ρ with error probabilities less than 1/2 since on the Bloch sphere they are located far enough apart. However, since β(Mt) < 1/2, Theorem 1 implies that ρ is not guaranteed to receive the same classification as σ from a classifier that makes a prediction on σ with confidence at least 0.9. In other words, the two states are far enough apart to be easily discriminated by the optimal hypothesis test but too far apart to be guaranteed to be robust.
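The numbers in this example can be reproduced directly from the eigenvalue and eigenvector expressions above. The following sketch is a minimal illustration under stated assumptions: it evaluates α(Mt) and β(Mt) from η1, R and A1, and finds t by bisection, assuming that α(Mt) decreases monotonically in t and that α0 < cos²(θ0/2). The function names and the bisection interval are hypothetical choices.

```python
import math


def helstrom_errors(t, theta0):
    """Type-I and type-II errors of M_t = |eta_1><eta_1| for sigma = |0> and rho at polar angle theta0."""
    gamma = math.cos(theta0 / 2)                         # overlap between sigma and rho
    big_r = 0.5 * math.sqrt((1 - t) ** 2 + 4 * t * (1 - gamma ** 2))
    eta1 = 0.5 * (1 - t) + big_r                         # positive eigenvalue of rho - t*sigma
    norm_sq = 1.0 / (2 * big_r * abs(eta1 - math.sin(theta0 / 2) ** 2))   # |A_1|^2
    # |eta_1> = A_1 [(1 - eta1)|0> - gamma |rho>], which gives the two overlaps below.
    alpha = norm_sq * ((1 - eta1) - gamma ** 2) ** 2     # |<0|eta_1>|^2
    beta = 1.0 - norm_sq * (gamma * eta1) ** 2           # 1 - |<rho|eta_1>|^2
    return alpha, beta


def solve_threshold(alpha0, theta0, lo=1e-9, hi=1e3, iters=200):
    """Bisection for t such that alpha(M_t) = alpha0 (assumes alpha decreases with t)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if helstrom_errors(mid, theta0)[0] > alpha0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


theta0, alpha0 = math.pi / 3, 0.1
t_star = solve_threshold(alpha0, theta0)
alpha, beta = helstrom_errors(t_star, theta0)
print(t_star, alpha, beta)
```

The phase ϕ0 does not enter, since both error probabilities depend on the states only through the overlap γ.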

Fig. 3: Example classifier for single-qubit quantum states.

The decision boundary is represented by the gray disk passing through the origin of the Bloch sphere. The robust region around σ is indicated by the dark spherical cap. States belonging to different classes are marked with + and – and are colored red if not classified correctly. The colorbar indicates different values for the optimal type-II error probability \({\beta }_{1-{p}_{{\rm{A}}}}^{* }(\sigma ,\ \rho )\). We see that, for the given classifier, the state ρ is not contained in the robust region around σ since the optimal type-II error probability is less than 1/2 as indicated by the colorbar. The state ρ is thus not guaranteed to be classified correctly by every classifier with the same class probabilities. In the asymmetric hypothesis testing view, an optimal discriminator that admits 0.1 type-I error probability for testing σ against ρ has type-II error probability 0.44.

Classification robustness view

In this scenario, in contrast to the QHT view, we are not given a specific adversarial state ρ, but rather aim to find a condition on a generic ρ such that the classifier is robust for all configurations of ρ that satisfy this condition. Theorem 1 provides a necessary and sufficient condition for robustness, expressed in terms of β*, which, for pB = 1 − pA and pA > 1/2, reads

$${\beta }_{1-{p}_{{\rm{A}}}}^{* }(\sigma ,\ \rho )\,>\,1/2.$$

Recall that pA > 1/2 is a lower bound on the probability of the most likely class, and in this case we set pB = 1 − pA to be the upper bound on the probability of the second most likely class. For example, as the QHT view shows, for α0 = 1 − pA = 0.1 we have that \({\beta }_{1-{p}_{{\rm{A}}}}^{* }(\sigma ,\ \rho )\approx 0.44\,<\,1/2\) for a state ρ with θ0 = π/3. We thus see that it is not guaranteed that every quantum classifier, which predicts σ to be of class kA with probability at least 0.9, classifies ρ to be of the same class. Now, we would like to find the maximum θ0, for which every classifier with confidence greater than pA is guaranteed to classify ρ and σ equally. Using the fidelity bound (17), we find the robustness condition on θ0

$$\begin{array}{l}{\left|\langle \rho | \sigma \rangle \right|}^{2}={\cos }^{2}({\theta }_{0}/2)\,>\,\frac{1}{2}+\sqrt{{p}_{{\rm{A}}}(1-{p}_{{\rm{A}}})}\\ \ \quad\qquad\iff \ {\theta }_{0}\,<\,2\cdot \arccos \sqrt{\frac{1}{2}+\sqrt{{p}_{{\rm{A}}}(1-{p}_{{\rm{A}}})}}.\end{array}$$

In particular, if pA = 0.9, we find that angles \({\theta }_{0}\,<\,2\cdot \arccos (\sqrt{0.8})\approx 0.93\,<\,\pi /3\) are certified. Figure 3 illustrates this scenario: the dark region around σ contains all states ρ for which it is guaranteed that \({\mathcal{A}}(\rho )={\mathcal{A}}(\sigma )\) for any classifier \({\mathcal{A}}\) with confidence at least 0.9.
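The certified angle is straightforward to evaluate for other confidence levels. The sketch below is a small illustration of the condition on θ0 derived above; the function name is a hypothetical choice, and the clamp to 1.0 only guards against floating-point rounding.

```python
import math


def max_certified_angle(p_a):
    """Supremum of the polar angles theta_0 certified by the fidelity bound when p_b = 1 - p_a."""
    r_f = min(1.0, 0.5 + math.sqrt(p_a * (1.0 - p_a)))
    return 2.0 * math.acos(math.sqrt(r_f))


for p_a in (0.5, 0.6, 0.9, 0.99, 1.0):
    print(p_a, max_certified_angle(p_a))
```

For pA = 0.9 this gives an angle of about 0.93, below the value θ0 = π/3 used for ρ.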

Classifier example

We consider a binary quantum classifier \({\mathcal{A}}\) that discriminates single-qubit states on the upper half of the Bloch sphere (class+) from states on the lower half (class–). Specifically, we consider the dichotomic POVM {Πθ,ϕ, \({\mathbb{1}}_2\) − Πθ,ϕ} defined by the projection operator \({{{\Pi }}}_{\theta ,\phi }=\left|{\psi }_{\theta ,\phi }\right\rangle \left\langle {\psi }_{\theta ,\phi }\right|\) where

$$\left|{\psi }_{\theta ,\phi }\right\rangle :=\cos (\theta /2)\left|0\right\rangle +\sin (\theta /2){e}^{i\phi }\left|1\right\rangle$$

with \(\theta =2\cdot \arccos (\sqrt{0.9})\approx 0.644\) and ϕ = π/2. Furthermore, for the rest of this section, we assume that pA + pB = 1 so that pB is determined by pA via pB = 1 − pA. An illustration of this classification problem is given in Fig. 3, where the decision boundary of \({\mathcal{A}}\) is represented by the gray disk crossing the origin of the Bloch sphere. The states marked with a black + correspond to + states that have been classified correctly, states marked with a black − sign correspond to data points correctly classified as − and red states are misclassified by \({\mathcal{A}}\). It can be seen that since the state ρ has been shown to violate the robustness condition (i.e., \({\beta }_{1-{p}_{{\rm{A}}}}^{* }(\sigma ,\ \rho )\approx 0.44\,<\,1/2\)), it is not guaranteed that ρ and σ are classified equally. In particular, for the example classifier \({\mathcal{A}}\) we have \({\mathcal{A}}(\rho )\,\ne \,{\mathcal{A}}(\sigma )\).

In summary, as \({p}_{{\rm{A}}}\to \frac{1}{2}\), the robust radius approaches 0. In the QHT view, this can be interpreted in the sense that if the type-I error probability α0 approaches 1/2, then all alternative states can be discriminated from σ with type-II error probability less than 1/2. As pA → 1, the robust radius approaches π/2. In this regime, the QHT view says that if the type-I error probability α0 approaches 0, then the optimal type-II error probability is smaller than 1/2 only for states in the lower half of the Bloch sphere.

Robustness certification

The theoretical results in “Closed form robustness conditions” provide conditions under which it is guaranteed that the output of a classification remains unaffected if the adversarial (noisy) state and the benign state are close enough, measured in terms of the fidelity, Bures metric, or trace distance. Here, we show how this result can be put to work and make concrete examples of scenarios where reasoning about the robustness is relevant. Specifically, we first present a protocol to assess how resilient a quantum classifier is against input perturbations. Second, in a scenario where one is provided with a potentially noisy or adversarial input, we wish to obtain a statement as to whether the classification of the noisy input is guaranteed to be the same as the classification of a clean input without requiring access to the latter. Third, we analyze the robustness of quantum classifiers against known noise models, namely phase and amplitude damping.

Assessing resilience against adversaries

In security critical applications, such as the classification of medical data or home surveillance systems, it is critical to assess the degree of resilience that machine learning systems exhibit against actions of malicious third parties. In other words, the goal is to estimate the expected classification accuracy, under perturbations of an input state within 1 − ε fidelity. In the classical machine learning literature, this quantity is called the certified test set accuracy at radius r, where distance is typically measured in terms of p-norms, and is defined as the fraction of samples in a test set that has been classified correctly and with a robust radius of at least r (i.e., an adversary cannot change the prediction with a perturbation of magnitude less than r). We can adapt this notion to the quantum domain and, given a test set consisting of pairs of labeled samples \({\mathcal{T}}={\{({\sigma }_{{\rm{i}}},\ {y}_{{\rm{i}}})\}}_{{\rm{i}} = 1}^{\left|{\mathcal{T}}\right|}\), the certified test set accuracy at fidelity 1 − ε is given by

$$\frac{1}{\left|{\mathcal{T}}\right|}\mathop{\sum }\limits_{(\sigma ,y)\in {\mathcal{T}}}{\mathbb{1}}\{{\mathcal{A}}(\sigma )=y\ \wedge \ {r}_{{\rm{F}}}(\sigma )\le 1-\varepsilon \}$$

where rF(σ) is the minimum robust fidelity (17) for sample σ and \({\mathbb{1}}\) denotes the indicator function. To evaluate this quantity, we need to obtain the prediction and to calculate the minimum robust fidelity for each sample \(\sigma \in {\mathcal{T}}\) as a function of the class probabilities yk(σ). In practice, in the finite sampling regime, we have to estimate these quantities by sampling the quantum circuit N times. To that end, we use Hoeffding’s inequality so that the bounds hold with probability at least 1 − α. Specifically, we run the following steps to certify the robustness for a given sample σ:

  1. Apply the quantum circuit N times to σ and perform the \(\left|{\mathcal{C}}\right|\)-outcome measurement \({\{{{{\Pi }}}_{k}\}}_{k = 1}^{\left|{\mathcal{C}}\right|}\) each time. Store the outcomes in variables nk for every \(k\in {\mathcal{C}}\).

  2. Determine the most frequent measurement outcome kA and set \({\hat{p}}_{{\rm{A}}}={n}_{{k}_{{\rm{A}}}}/N-\sqrt{-{\mathrm{ln}}\,(\alpha )/2N}\).

  3. If \({\hat{p}}_{{\rm{A}}}\,>\,1/2\), set \({\hat{p}}_{{\rm{B}}}=1-{\hat{p}}_{{\rm{A}}}\) and calculate the minimum robust fidelity rF according to (17) and return (kA, rF); otherwise abstain from certification.

Executing these steps for a given sample σ returns the true minimum robust fidelity with probability 1 − α, which follows from Hoeffding’s inequality

$$\Pr \left[\frac{{n}_{{\rm{k}}}}{N}-{\langle {{{\Lambda }}}_{k}\rangle }_{\sigma }\ge \delta \right]\le \exp \{-2N{\delta }^{2}\}$$

with \({{{\Lambda }}}_{k}={{\mathcal{E}}}^{\dagger }({{{\Pi }}}_{k})\) and setting \(\delta =\sqrt{-{\mathrm{ln}}\,(\alpha )/2N}\). In Supplementary Note 6, this algorithm is shown in detail in Protocol 1.
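The certification steps and the certified test set accuracy can be sketched in a few lines of Python. This is a minimal illustration and not the authors' Protocol 1: the list of measured class labels stands in for N executions of the quantum circuit, the fidelity bound is used in its pB = 1 − pA form 1/2 + √(pA(1 − pA)), and the function names and the measurement records are hypothetical.

```python
import math
from collections import Counter


def certify(outcomes, alpha):
    """Steps 1-3: return (k_A, r_F), or None to abstain, from a list of measured class labels."""
    n = len(outcomes)
    k_a, n_a = Counter(outcomes).most_common(1)[0]
    p_a_hat = n_a / n - math.sqrt(-math.log(alpha) / (2 * n))   # Hoeffding lower bound
    if p_a_hat <= 0.5:
        return None
    r_f = 0.5 + math.sqrt(p_a_hat * (1.0 - p_a_hat))            # fidelity bound with p_B = 1 - p_A
    return k_a, r_f


def certified_accuracy(test_set, alpha, eps):
    """Fraction of (outcomes, label) pairs predicted correctly with r_F <= 1 - eps."""
    hits = 0
    for outcomes, label in test_set:
        result = certify(outcomes, alpha)
        if result is not None and result[0] == label and result[1] <= 1.0 - eps:
            hits += 1
    return hits / len(test_set)


# Hypothetical measurement records for three test samples (N = 1000 shots each).
test_set = [
    (["+"] * 950 + ["-"] * 50, "+"),    # confident and correct
    (["+"] * 520 + ["-"] * 480, "+"),   # too close to 1/2: abstain
    (["-"] * 900 + ["+"] * 100, "+"),   # confident but wrong
]
print(certify(test_set[0][0], alpha=0.001))
print(certified_accuracy(test_set, alpha=0.001, eps=0.1))
```

Only the first sample counts toward the certified accuracy at fidelity 0.9: the second leads to abstention and the third is misclassified.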

Certification for noisy inputs

In practice, inputs to quantum classifiers are typically noisy. This noise can occur either due to imperfect implementation of the state preparation device, or due to an adversary that interferes with state or gate preparation. Under the assumption that we know that the state has been prepared with fidelity at least 1 − ε to the noiseless state, we would like to know whether this noise has altered our prediction, without having access to the noiseless state. Specifically, given the classification result, which is based on the noisy input, we would like to have the guarantee that the classifier would have predicted the same class, had it been given the noiseless input state. This would allow the conclusion that the result obtained from the noisy state has not been altered by the presence of noise. To obtain this guarantee, we leverage Theorem 3 in the following protocol. Let ρ be a noisy input with F(ρ, σ) > 1 − ε where σ is the noiseless state and let \({\mathcal{A}}\) be a quantum classifier with quantum channel \({\mathcal{E}}\) and POVM \({\{{{{\Pi }}}_{k}\}}_{k}\). Similar to the previous protocol, we again need to take into account that in practice we can sample the quantum circuit only a finite number of times. Thus, we again use Hoeffding’s inequality to obtain estimates for the class probability pA that holds with probability at least 1 − α. The protocol then consists of the following steps:

  1. Apply the quantum circuit N times to the (noisy) state ρ and perform the \(\left|{\mathcal{C}}\right|\)-outcome measurement \({\{{{{\Pi }}}_{k}\}}_{k = 1}^{\left|{\mathcal{C}}\right|}\) each time. Store the outcomes in variables nk for every \(k\in {\mathcal{C}}\).

  2. Determine the most frequent measurement outcome kA and set \({\hat{p}}_{{\rm{A}}}={n}_{{k}_{{\rm{A}}}}/N-\sqrt{-{\mathrm{ln}}\,(\alpha )/2N}\).

  3. If \({\hat{p}}_{{\rm{A}}}\,>\,1/2\), set \({\hat{p}}_{{\rm{B}}}=1-{\hat{p}}_{{\rm{A}}}\) and calculate the minimum robust fidelity rF according to (17) using \({\hat{p}}_{{\rm{A}}}\); otherwise, abstain from certification.

  4. If 1 − ε > rF, it is guaranteed that \({\mathcal{A}}(\rho )={\mathcal{A}}(\sigma )\).

Running these steps, along with a classification, allows us to certify that the classification has not been affected by the noise, i.e., that the same classification outcome would have been obtained on the noiseless input state.
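For pB = 1 − pA and ε < 1/2, the test in step 4 can be inverted: 1 − ε > 1/2 + √(pA(1 − pA)) holds exactly when (pA − 1/2)² > ε(1 − ε), that is, when the lower bound on pA exceeds 1/2 + √(ε(1 − ε)). The sketch below uses this to estimate how many shots are required for a given observed frequency. It is an illustration under these assumptions, and the function names are hypothetical.

```python
import math


def min_confidence(eps):
    """Value that the lower bound on p_A must exceed for 1 - eps > r_F (p_B = 1 - p_A, eps < 1/2)."""
    return 0.5 + math.sqrt(eps * (1.0 - eps))


def is_certified(p_a_hat, eps):
    """Step 4: compare the assumed fidelity 1 - eps with r_F evaluated at p_a_hat."""
    if p_a_hat <= 0.5:
        return False                                  # the protocol abstains
    r_f = 0.5 + math.sqrt(p_a_hat * (1.0 - p_a_hat))
    return 1.0 - eps > r_f


def shots_needed(freq, eps, alpha):
    """Smallest N with freq - sqrt(ln(1/alpha) / 2N) > min_confidence(eps), or None if impossible."""
    gap = freq - min_confidence(eps)
    if gap <= 0:
        return None
    return math.floor(-math.log(alpha) / (2.0 * gap ** 2)) + 1


print(min_confidence(0.1))              # threshold on the lower bound of p_A
print(shots_needed(0.95, 0.1, 0.001))   # shots for an observed frequency of 0.95
```

The shot count assumes that the observed frequency of the most likely class stays fixed as N grows, which is an idealization.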

Robustness for known noise models

Now, we analyze the robustness of a quantum classifier against known noise models that are parametrized by a noise parameter γ. Specifically, we investigate robustness against phase damping and amplitude damping. Using Theorem 3, we calculate the fidelity between the clean input σ and the noisy input \({{\mathcal{N}}}_{\gamma }(\sigma )\) and rearrange the robustness condition (17) such that it yields a bound on the maximal noise that the classifier tolerates.

Phase damping describes the loss of quantum information without losing energy. For example, it describes how electronic states in an atom are perturbed upon interacting with distant electrical charges. The quantum channel corresponding to this noise model can be expressed in terms of Kraus operators that are given by

$${K}_{0}=\left(\begin{array}{ll}1&0\\ 0&\sqrt{1-\gamma }\end{array}\right),\ \ \ {K}_{1}=\left(\begin{array}{ll}0&0\\ 0&\sqrt{\gamma }\end{array}\right)$$

where γ is the noise parameter. From this description alone, we can see that a system that is in the \(\left|0\right\rangle\) or \(\left|1\right\rangle\) state is always robust against all noise parameters in this model as it acts trivially on \(\left|0\right\rangle\) and \(\left|1\right\rangle\). Any such behavior should hence be reflected in the tight robustness condition we derive from QHT. Indeed, for a pure state \(\left|\psi \right\rangle =\alpha \left|0\right\rangle +\beta \left|1\right\rangle\), Theorem 3 leads to the robustness condition γ ≤ 1 if α = 0 or β = 0 and, for any α, β ≠ 0,

$$\gamma \,<\,1 - \left(\max \left\{0,\ 1 + \frac{{{r}_{{\rm{F}}}-1}}{2{\left|\alpha \right|}^{2}{\left|\beta \right|}^{2}}\right\}\right)^{2}$$

where \({r}_{{\rm{F}}}=\frac{1}{2}(1+\sqrt{g({p}_{{\rm{A}}},\ {p}_{{\rm{B}}})})\) is the fidelity bound from Theorem 3 and pA, pB are the corresponding class probability bounds. This bound is illustrated in Fig. 4 as a function of \({\left|\alpha \right|}^{2}\) and pA. The expected behavior toward the boundaries can be seen in the plot, namely that when \({\left|\alpha \right|}^{2}\to \{0,\ 1\}\), then the classifier is robust under all noise parameters γ ≤ 1.
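The phase damping bound can be cross-checked against the Kraus operators given above. The sketch below is a hedged illustration: it computes the fidelity of a pure input with its damped version as Σi |⟨ψ|Ki|ψ⟩|², uses rF = 1/2 + √(pA(1 − pA)) for pB = 1 − pA, and confirms that the fidelity equals rF at the closed-form noise threshold. Function names and the example state are hypothetical.

```python
import math


def fidelity_after_channel(psi, kraus_ops):
    """<psi|N(psi)|psi> = sum_i |<psi|K_i|psi>|^2 for a pure qubit state psi = (alpha, beta)."""
    total = 0.0
    for k in kraus_ops:
        k_psi = (k[0][0] * psi[0] + k[0][1] * psi[1],
                 k[1][0] * psi[0] + k[1][1] * psi[1])
        overlap = psi[0].conjugate() * k_psi[0] + psi[1].conjugate() * k_psi[1]
        total += abs(overlap) ** 2
    return total


def phase_damping(gamma):
    """Kraus operators K_0, K_1 of the phase damping channel."""
    return [[[1.0, 0.0], [0.0, math.sqrt(1.0 - gamma)]],
            [[0.0, 0.0], [0.0, math.sqrt(gamma)]]]


def max_phase_damping(abs_alpha_sq, p_a):
    """Closed-form threshold on gamma for 0 < |alpha|^2 < 1 and p_B = 1 - p_A."""
    r_f = 0.5 + math.sqrt(p_a * (1.0 - p_a))
    abs_beta_sq = 1.0 - abs_alpha_sq
    inner = max(0.0, 1.0 + (r_f - 1.0) / (2.0 * abs_alpha_sq * abs_beta_sq))
    return 1.0 - inner ** 2


psi = (complex(math.sqrt(0.5)), 1j * math.sqrt(0.5))   # equal superposition, arbitrary phase
gamma_max = max_phase_damping(0.5, 0.9)
print(gamma_max, fidelity_after_channel(psi, phase_damping(gamma_max)))
```

For this state and pA = 0.9, the threshold is γ = 0.64, at which the fidelity drops to rF = 0.8.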

Fig. 4: Robustness against known noise models.

Both plots show the maximal noise parameter γ for which the classifier \({\mathcal{A}}\) is still guaranteed to be robust, for (a) phase damping and (b) amplitude damping, when classifying a pure state input \(\left|\psi \right\rangle =\alpha \left|0\right\rangle +\beta \left|1\right\rangle\). In a, we can see that for states \(\left|0\right\rangle\) and \(\left|1\right\rangle\), the classifier is robust against any γ ≤ 1, while for (b) the same holds if the input state is \(\left|1\right\rangle\).

Amplitude damping models effects due to the loss of energy from a quantum system (energy dissipation). For example, it can be used to model the dynamics of an atom that spontaneously emits a photon. The quantum channel corresponding to this noise model can be written in terms of Kraus operators

$${K}_{0}=\left(\begin{array}{ll}1&0\\ 0&\sqrt{1-\gamma }\end{array}\right),\ \ \ {K}_{1}=\left(\begin{array}{ll}0&\sqrt{\gamma }\\ 0&0\end{array}\right),$$

where γ is the noise parameter and can be interpreted as the probability of losing a photon. It is clear from the Kraus decomposition that the \(\left|0\right\rangle\) state remains unaffected. This again needs to be reflected by a tight robustness condition. For a pure state \(\left|\psi \right\rangle =\alpha \left|0\right\rangle +\beta \left|1\right\rangle\), Theorem 3 leads to the robustness condition γ ≤ 1 if \(\left|\alpha \right|=1\) and, for any α, β ≠ 0,

$$\begin{array}{ll}\gamma \,<\,1-\left[\frac{{|\alpha |}^{2}}{{|\alpha |}^{2}-{|\beta |}^{2}}\cdot \left(1\,-\,\sqrt{1-\frac{{|\alpha |}^{2}-{|\beta |}^{2}}{{|\alpha |}^{2}{|\beta |}^{2}}\cdot \frac{\max \{0,{{r}_{{\rm{F}}}}-{|\alpha |}^{2}\}}{{|\alpha |}^{2}}}\ \right)\right]^{2}\end{array}$$

where again \({r}_{{\rm{F}}}=\frac{1}{2}(1+\sqrt{g({p}_{{\rm{A}}},\ {p}_{{\rm{B}}})})\) is the fidelity bound from Theorem 3. This bound is illustrated in Fig. 4 as a function of \({\left|\alpha \right|}^{2}\) and pA. It can be seen again that the bound shows the expected behavior, namely that when \({\left|\alpha \right|}^{2}\to 1\), then the classifier is robust under all noise parameters γ ≤ 1.
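The amplitude damping bound admits the same kind of numerical check. The sketch below is a minimal illustration that assumes pB = 1 − pA and |α|² ≠ 1/2 (where the closed form has a removable singularity). It evaluates the threshold on γ and verifies with the Kraus operators that the fidelity at the threshold equals rF. All names are hypothetical.

```python
import math


def fidelity_after_channel(psi, kraus_ops):
    """<psi|N(psi)|psi> = sum_i |<psi|K_i|psi>|^2 for a pure qubit state psi = (alpha, beta)."""
    total = 0.0
    for k in kraus_ops:
        k_psi = (k[0][0] * psi[0] + k[0][1] * psi[1],
                 k[1][0] * psi[0] + k[1][1] * psi[1])
        overlap = psi[0].conjugate() * k_psi[0] + psi[1].conjugate() * k_psi[1]
        total += abs(overlap) ** 2
    return total


def amplitude_damping(gamma):
    """Kraus operators K_0, K_1 of the amplitude damping channel."""
    return [[[1.0, 0.0], [0.0, math.sqrt(1.0 - gamma)]],
            [[0.0, math.sqrt(gamma)], [0.0, 0.0]]]


def max_amplitude_damping(a, p_a):
    """Closed-form threshold on gamma, with a = |alpha|^2 in (0, 1), a != 1/2, p_B = 1 - p_A."""
    b = 1.0 - a
    r_f = 0.5 + math.sqrt(p_a * (1.0 - p_a))
    inner = 1.0 - (a - b) / (a * b) * max(0.0, r_f - a) / a
    s = a / (a - b) * (1.0 - math.sqrt(inner))
    return 1.0 - s ** 2


for a in (0.3, 0.7, 0.85):
    psi = (complex(math.sqrt(a)), complex(math.sqrt(1.0 - a)))
    gamma_max = max_amplitude_damping(a, 0.9)
    print(a, gamma_max, fidelity_after_channel(psi, amplitude_damping(gamma_max)))
```

For |α|² = 0.85 the overlap with \(\left|0\right\rangle\) already exceeds rF = 0.8, so the threshold evaluates to 1 and every γ < 1 is tolerated.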

We remark that, in contrast to the previous protocol, here we assume access to the noiseless state σ and we compute the robustness condition on the noise parameter based on the classification of this noiseless state. This can be used in a scenario where a quantum classifier is developed and tested on one device, but deployed on a different device with different noise sources.

Randomized inputs with depolarization smoothing

In the previous section, we looked at robustness of quantum classifiers against certain types of noise, either with respect to a known noise model, or with respect to unknown, potentially adversarial, noise. Here we take a different viewpoint, and investigate how robustness against unknown noise sources can be enhanced by harnessing depolarization noise. This is led by the intuition that noise can be exploited to increase robustness and privacy. We first provide background on randomized smoothing, a technique for provable robustness from classical machine learning. We then proceed to present provable robustness in terms of trace distance that is equivalent to the robustness condition (10) from Theorem 1 but with depolarized inputs. The bound is then compared numerically with the Hölder duality bound from Lemma 2 and with a result obtained recently from quantum differential privacy17.

Randomized smoothing

Randomized smoothing is a technique that has recently been proposed to certify the robustness and obtain tight provable robustness guarantees in the classical setting26. The key idea is to randomize inputs to classifiers by perturbing them with additive Gaussian noise. This results in smoother decision boundaries that in turn lead to improved robustness to adversarial attacks. In this section, we extend this concept to the quantum setting by interpreting quantum noise channels as “smoothing” channels. The idea of harnessing actively induced input noise in quantum classifiers to increase robustness has recently been proposed in ref. 17 where a robustness bound with techniques from quantum differential privacy has been derived. In the following, we take a similar path and consider a depolarization noise channel and analytically derive a larger robustness radius for pure single-qubit input states.

Quantum channel smoothing: depolarization

Consider depolarization noise that maps a state σ onto a linear combination of itself and the maximally mixed state

$$\sigma \,\mapsto\, {{\mathcal{E}}}_{p}^{{\rm{dep}}}(\sigma ):=(1-p)\sigma +\frac{p}{d}{{\mathbb{1}}}_{d}$$

where p ∈ (0, 1) is the depolarization parameter and d is the dimensionality of the underlying Hilbert space. In single-qubit scenarios, this can geometrically be interpreted as a uniform contraction of the Bloch sphere parametrized by p, pushing quantum states toward the completely mixed state. Analogously to classical randomized smoothing, we apply a depolarization channel to inputs before passing them through the classifier in order to artificially randomize the states and increase robustness against adversarial attacks. We then obtain a robustness guarantee by instantiating Theorem 1 in the following way. Let σ be a benign input state and suppose that the classifier \({\mathcal{A}}\) with score function y satisfies

$${{\bf{y}}}_{{k}_{{\rm{A}}}}({{\mathcal{E}}}_{p}^{{\rm{dep}}}(\sigma ))\ge {p}_{{\rm{A}}}\,>\,{p}_{{\rm{B}}}\ge \mathop{\max }\limits_{k\ne {k}_{{\rm{A}}}}{{\bf{y}}}_{k}({{\mathcal{E}}}_{p}^{{\rm{dep}}}(\sigma )).$$

Then \({\mathcal{A}}\) is robust at \({{\mathcal{E}}}_{p}^{{\rm{dep}}}(\rho )\) for any adversarial input state ρ that satisfies the robustness condition (10), where β* is the optimal type-II error probability for testing \({{\mathcal{E}}}_{p}^{{\rm{dep}}}(\sigma )\) against \({{\mathcal{E}}}_{p}^{{\rm{dep}}}(\rho )\). In particular, if σ and ρ are single-qubit pure states and in the case where we have pA + pB = 1, the robustness condition can be equivalently expressed in terms of the trace distance as T(ρ, σ) < rQ(p) with

$${r}_{{\rm{Q}}}(p)=\left\{\begin{array}{ll}\sqrt{\frac{1}{2}-\frac{\sqrt{g(p,\ {p}_{{\rm{A}}})}}{1-p}},&\,{p}_{{\rm{A}}}\,<\,\frac{1+3{(1-p)}^{2}}{2+2{(1-p)}^{2}}\\ \sqrt{\frac{p\cdot (2-p)\cdot {(1-2{p}_{{\rm{A}}})}^{2}}{8{(1-p)}^{2}\cdot (1-{p}_{{\rm{A}}})}},&\,{p}_{{\rm{A}}}\ge \frac{1+3\cdot {(1-p)}^{2}}{2+2\cdot {(1-p)}^{2}}\end{array}\right.$$


$$g(p,\ {p}_{{\rm{A}}})=\frac{1}{2}\left(2{p}_{{\rm{A}}}\left(1-{p}_{{\rm{A}}}\right)-p\left(1-\frac{p}{2}\right)\right).$$

A detailed derivation of this bound is given in Supplementary Note 5.

The Hölder bound from Lemma 2 can also be adapted to the noisy setting. Specifically, since for two states σ and ρ, the trace distance obeys \(T({{\mathcal{E}}}_{p}^{{\rm{dep}}}(\rho ),\ {{\mathcal{E}}}_{p}^{{\rm{dep}}}(\sigma ))=(1-p)\cdot T(\rho ,\ \sigma )\), Lemma 2 implies robustness whenever T(ρ, σ) < rH(p), where

$${r}_{{\rm{H}}}(p)=\frac{{p}_{{\rm{A}}}-{p}_{{\rm{B}}}}{2(1-p)}.$$

It has been shown in ref. 17 that naturally occurring noise in a quantum circuit can be harnessed to increase the robustness of quantum classification algorithms. Specifically, using techniques from quantum differential privacy, a robustness bound expressible in terms of the class probabilities pA and the depolarization parameter p has been derived. Written in our notation and for single-qubit binary classification, the bound can be written as


and robustness is guaranteed for any adversarial state ρ with T(ρ, σ) < rDP(p). The three bounds are compared graphically in Fig. 5 for different values of the noise parameter p, showing that the QHT bound gives rise to a tighter robustness condition for all values of p.
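The relation between the QHT and Hölder radii under depolarization can be checked numerically. The sketch below is an illustration under stated assumptions: it implements rQ(p) as given above for pB = 1 − pA, and takes the Hölder radius to be the Lemma 2 radius divided by the contraction factor 1 − p, i.e., (2pA − 1)/(2(1 − p)). Only these two radii are compared, and the function names are hypothetical.

```python
import math


def g_dep(p, p_a):
    """The function g(p, p_A) entering the first branch of r_Q(p)."""
    return 0.5 * (2.0 * p_a * (1.0 - p_a) - p * (1.0 - p / 2.0))


def r_qht(p, p_a):
    """QHT radius r_Q(p) for single-qubit pure states, 0 <= p < 1 and 1/2 < p_A < 1."""
    q = (1.0 - p) ** 2
    if p_a < (1.0 + 3.0 * q) / (2.0 + 2.0 * q):
        root = math.sqrt(max(0.0, g_dep(p, p_a)))
        return math.sqrt(max(0.0, 0.5 - root / (1.0 - p)))
    return math.sqrt(p * (2.0 - p) * (1.0 - 2.0 * p_a) ** 2 / (8.0 * q * (1.0 - p_a)))


def r_hoelder(p, p_a):
    """Lemma 2 radius with p_B = 1 - p_A, rescaled by the contraction factor 1 - p."""
    return (2.0 * p_a - 1.0) / (2.0 * (1.0 - p))


for p in (0.0, 0.2, 0.5):
    for p_a in (0.6, 0.9):
        print(p, p_a, r_qht(p, p_a), r_hoelder(p, p_a))
```

The two branches of rQ(p) agree at the switching point, where both reduce to [(1 − p)²/(1 + (1 − p)²)]^{1/2}. Radii larger than 1 simply indicate that every input state is certified.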

Fig. 5: Robustness bounds with depolarized input states.

Comparison of robustness bounds for single-qubit pure states derived from quantum hypothesis testing rQ(p), Hölder duality rH(p), and quantum differential privacy rDP(p)17 with different levels of depolarization noise p.

It is worth remarking that although the QHT robustness bounds can be enhanced by active input randomization, as shown here for the case of applying a depolarization channel, they already present a valid, non-trivial condition for noiseless (unsmoothed) quantum inputs (Theorems 1, 3, Corollary 1, and Lemma 2). This contrasts with the deterministic classical scenario, where the addition of classical noise sources to the input state is necessary to generate a probability distribution corresponding to the input data, from which an adversarial robustness bound can be derived26. This distinction between the quantum and classical settings is rooted in the probabilistic nature of measurements on quantum states, which of course applies to both pure and mixed state inputs.


We have seen how a fundamental connection between adversarial robustness of quantum classifiers and QHT can be leveraged to provide a powerful framework for deriving optimal conditions for robustness certification. The robustness condition is provably tight when expressed in the SDP formulation in terms of optimal error probabilities for binary classifications or, more generally, for multiclass classifications where the probability of the most likely class is greater than 1/2. The corresponding closed form expressions arising from the SDP formulation are proved to be tight for general states when expressed in terms of fidelity and Bures distance, whereas in terms of trace distance, tightness holds only for pure states. These bounds give rise to (1) a practical robustness protocol for assessing the resilience of a quantum classifier against adversarial and unknown noise sources; (2) a protocol to verify whether a classification given a noisy input has had the same outcome as a classification given the noiseless input state, without requiring access to the latter; and (3) conditions on noise parameters for amplitude and phase damping channels, under which the outcome of a classification is guaranteed to remain unaffected. Furthermore, we have shown how using a randomized input with a depolarization channel enhances the QHT bound, consistent with previous results, in a manner akin to randomized smoothing in robustness certification of classical machine learning.

A key difference between the quantum and classical formalisms is that quantum states themselves have a naturally probabilistic interpretation, even though the classical data that could be embedded in quantum states do not need to be probabilistic. We now know that both classical and quantum optimal robustness bounds for classification protocols depend on bounds provided by hypothesis testing. However, hypothesis testing involves the comparison of probability distributions, which is only possible in the classical case with the addition of stochastic noise sources if the classical data are initially non-stochastic. This means that the optimal robustness bounds in the classical case only exist for noisy classifiers that also require training under the additional noise26. This is in contrast to the quantum scenario. Our quantum adversarial robustness bound can be proved independently of randomized input, even though it can be enhanced by it, for example through a depolarization channel. Thus, in the quantum regime, unlike in the classical deterministic scenario, we are not forced to consider training under actively induced noise.

Our optimal provable robustness bound and the connection to QHT also provide a first step toward more rigorously identifying the limitations of quantum classifiers in their power to distinguish between quantum states. Our formalism hints at an intimate relationship between these fundamental limitations in the accuracy of distinguishing between different classes of states and robustness. This could shed light on the robustness and accuracy trade-offs observed in classification protocols38 and is an important direction of future research. It is also of independent interest to explore possible connections between tasks that use QHT, such as quantum illumination33 and state discrimination39, and accuracy and robustness in quantum classification.


Proof of Theorem 1

The proof of this theorem is based on showing that the measurement operators of the classifier give rise to operators that are feasible for the SDP (3). Specifically, note that in the Heisenberg picture we can write the score function y of the classifier \({\mathcal{A}}\) as

$${{\bf{y}}}_{k}(\sigma )={\rm{Tr}}\left[{{\mathcal{E}}}^{\dagger }\left({{{\Pi }}}_{k}\right)\sigma \right]={\rm{Tr}}\left[{{{\Lambda }}}_{k}\sigma \right]$$

where \({{{\Lambda }}}_{k}:={{\mathcal{E}}}^{\dagger }({{{\Pi }}}_{k})\). Since \({\mathcal{E}}\) is a CPTP map, its dual is completely positive and unital and thus \(0\le {{{\Lambda }}}_{k}\le {\mathbb{1}}\) and

$$\mathop{\sum}\limits_{k}{{{\Lambda }}}_{k}=\mathop{\sum}\limits_{k}{{\mathcal{E}}}^{\dagger }({{{\Pi }}}_{k})={{\mathcal{E}}}^{\dagger }({\mathbb{1}})={\mathbb{1}}.$$
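The two properties used above, that the dual map sends the measurement to a valid POVM and that both pictures give the same scores, can be checked numerically. The following is a minimal sketch, assuming a single-qubit amplitude damping channel with damping parameter 0.3 in place of \({\mathcal{E}}\), computational-basis projectors for \({{{\Pi }}}_{k}\), and the input state |+⟩; all function and variable names are hypothetical, and the Kraus operators are real so that the adjoint reduces to a transpose.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def amplitude_damping_kraus(gamma):
    """Standard Kraus operators of the amplitude damping channel (assumed example)."""
    k0 = [[1.0, 0.0], [0.0, math.sqrt(1.0 - gamma)]]
    k1 = [[0.0, math.sqrt(gamma)], [0.0, 0.0]]
    return [k0, k1]

def apply_channel(kraus, state):
    """Schroedinger picture: E(state) = sum_i K_i state K_i^T (real K_i)."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for k in kraus:
        out = add(out, matmul(matmul(k, state), transpose(k)))
    return out

def apply_dual(kraus, op):
    """Heisenberg picture: E^dagger(op) = sum_i K_i^T op K_i (real K_i)."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for k in kraus:
        out = add(out, matmul(matmul(transpose(k), op), k))
    return out

kraus = amplitude_damping_kraus(0.3)
projectors = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
sigma = [[0.5, 0.5], [0.5, 0.5]]  # the pure state |+><+|

# Lambda_k = E^dagger(Pi_k); these should sum to the identity.
lambdas = [apply_dual(kraus, p) for p in projectors]
lambda_sum = add(lambdas[0], lambdas[1])

# Scores computed in both pictures should agree.
scores_heisenberg = [trace(matmul(lam, sigma)) for lam in lambdas]
scores_schroedinger = [trace(matmul(p, apply_channel(kraus, sigma)))
                       for p in projectors]
```

In this example both Λ operators are diagonal with entries between 0 and 1, so the condition 0 ≤ Λk ≤ 1 can be read off directly.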

Note that the operator \({\mathbb{1}}-{{{\Lambda }}}_{{k}_{{\rm{A}}}}\) is feasible for the SDP \({\beta }_{1-{p}_{{\rm{A}}}}^{* }(\sigma ,\ \rho )\) since by assumption

$$\alpha ({\mathbb{1}}-{{{\Lambda }}}_{{k}_{{\rm{A}}}};\ \sigma )=1-{{\bf{y}}}_{{k}_{{\rm{A}}}}(\sigma )\le 1-{p}_{{\rm{A}}}.$$

It follows that

$${{\bf{y}}}_{{k}_{{\rm{A}}}}(\rho )=\beta ({\mathbb{1}}-{{{\Lambda }}}_{{k}_{{\rm{A}}}};\ \rho )\ge {\beta }_{1-{p}_{{\rm{A}}}}^{* }(\sigma ,\ \rho ).$$

Similarly, let k ≠ kA be arbitrary. Then, the operator Λk is feasible for the SDP \({\beta }_{{p}_{{\rm{B}}}}^{* }(\sigma ,\ \rho )\) since

$$\alpha ({{{\Lambda }}}_{k};\ \sigma )={{\bf{y}}}_{k}(\sigma )\le {p}_{{\rm{B}}}$$

and hence

$$1-{{\bf{y}}}_{k}(\rho )=\beta ({{{\Lambda }}}_{k};\ \rho )\ge {\beta }^{* }_{{p}_{{\rm{B}}}}(\sigma ,\ \rho )$$

Since k ≠ kA is arbitrary, it follows that if ρ satisfies

$${\beta }_{1-{p}_{{\rm{A}}}}^{* }(\sigma ,\ \rho )+{\beta }^{* }_{{p}_{{\rm{B}}}}(\sigma ,\ \rho )\,>\,1$$

then it is guaranteed that

$${{\bf{y}}}_{{k}_{{\rm{A}}}}(\rho )\,>\,\mathop{\max }\limits_{k\ne {k}_{{\rm{A}}}}{{\bf{y}}}_{k}(\rho )$$

and thus \({\mathcal{A}}(\rho )={\mathcal{A}}(\sigma )\). □
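To make the certification condition concrete, the sketch below evaluates it in a deliberately simplified special case: σ and ρ are assumed to commute and are given by their eigenvalues in a common eigenbasis. Under that assumption the optimal operator can be taken diagonal and the SDP reduces to a classical Neyman-Pearson problem, which a greedy likelihood-ratio rule solves. This is an illustration only, not a general SDP solver, and the names `beta_star_diagonal` and `is_certified` are hypothetical.

```python
import math

def beta_star_diagonal(sigma, rho, alpha0):
    """Optimal type-II error for commuting states (assumption: sigma and rho
    are lists of eigenvalues in a shared eigenbasis).

    Minimises sum_i rho_i * (1 - m_i) subject to sum_i sigma_i * m_i <= alpha0
    and 0 <= m_i <= 1, by spending the type-I budget on the entries with the
    largest ratio rho_i / sigma_i first.
    """
    def ratio(i):
        return rho[i] / sigma[i] if sigma[i] > 0 else math.inf

    order = sorted(range(len(sigma)), key=ratio, reverse=True)
    budget = alpha0
    captured = 0.0  # value of sum_i rho_i * m_i accumulated so far
    for i in order:
        if sigma[i] == 0:
            captured += rho[i]  # costs no type-I error, always accept
            continue
        if budget <= 0:
            break
        m = min(1.0, budget / sigma[i])
        captured += m * rho[i]
        budget -= m * sigma[i]
    return 1.0 - captured

def is_certified(sigma, rho, p_a, p_b):
    """Robustness condition: beta*_{1-pA} + beta*_{pB} > 1."""
    total = (beta_star_diagonal(sigma, rho, 1.0 - p_a)
             + beta_star_diagonal(sigma, rho, p_b))
    return total > 1.0

# Class probabilities pA = 0.9 and pB = 0.1 on the unperturbed state.
sigma = [0.9, 0.1]
close_state = [0.8, 0.2]  # mild perturbation
far_state = [0.3, 0.7]    # strong perturbation

certified_close = is_certified(sigma, close_state, 0.9, 0.1)
certified_far = is_certified(sigma, far_state, 0.9, 0.1)
```

Here the mildly perturbed state satisfies the condition while the strongly perturbed one does not, matching the fact that a classifier measuring the first eigenvector would assign the latter a score of only 0.3 for the original class.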

Proof of Theorem 2

Note that, since pB = 1 − pA by assumption, the robustness condition (10) reads

$${\beta }_{1-{p}_{{\rm{A}}}}^{* }(\sigma ,\ \rho )\,>\,1/2.$$

Let \({M}_{{\rm{A}}}^{\star }\) be an optimizer of the corresponding SDP such that \(\alpha ({M}_{{\rm{A}}}^{\star };\ \sigma )=1-{p}_{{\rm{A}}}\) and

$$\beta ({M}_{{\rm{A}}}^{\star };\ \rho )={\beta }_{1-{p}_{{\rm{A}}}}^{* }(\sigma ,\ \rho ).$$

Consider the classifier \({{\mathcal{A}}}^{\star }\) with score function \({{\bf{y}}}^{\star }\) defined by the POVM \(\{{\mathbb{1}}-{M}_{{\rm{A}}}^{\star },\ {M}_{{\rm{A}}}^{\star },\ 0\}\) where the number of 0 operators is such that \({{\bf{y}}}^{\star }\) has the desired number of classes. The score function \({{\bf{y}}}^{\star }\) is consistent with the class probabilities (9) since

$${{\bf{y}}}_{{k}_{{\rm{A}}}}^{\star }(\sigma )=\alpha ({\mathbb{1}}-{M}_{{\rm{A}}}^{\star };\ \sigma )={p}_{{\rm{A}}}$$
$${{\bf{y}}}_{{k}_{{\rm{B}}}}^{\star }(\sigma )=\alpha ({M}_{{\rm{A}}}^{\star };\ \sigma )=1-{p}_{{\rm{A}}}={p}_{{\rm{B}}}.$$

Furthermore, if ρ violates (55), then we have

$${{\bf{y}}}_{{k}_{{\rm{A}}}}^{\star }(\rho )=\beta ({M}_{{\rm{A}}}^{\star };\ \rho )\le 1/2$$

and thus, in particular \({{\mathcal{A}}}^{\star }(\rho )\,\ne \,{k}_{{\rm{A}}}={{\mathcal{A}}}^{\star }(\sigma )\). □

Fidelity robustness condition

Recall that the robustness condition in Theorem 1 is expressed in terms of the SDP from the Neyman–Pearson approach to QHT. Thus, in order to use Theorem 1 to obtain robustness bounds in terms of a meaningful distance between quantum states, we need to connect the optimal type-II error with this distance. Here, we look specifically at the fidelity between pure quantum states and sketch the proof for Lemma 1. We refer the reader to Supplementary Note 3 for details.


Proof of Lemma 1 (sketch). The key challenge in proving this result is connecting the robustness condition (10), written in terms of type-II error probabilities, to the fidelity F which, for pure states, is given by the squared overlap \({\left|\langle {\psi }_{\sigma }| {\psi }_{\rho }\rangle \right|}^{2}\). It is well known that optimizers of the SDP (3) are given by Helstrom operators, Mt, which can be expressed in terms of the projections onto the positive and null eigenspaces of the operator ρ − tσ. The first step is thus to solve the eigenvalue problem

$$(\rho -t\sigma )\left|\eta \right\rangle =\eta \left|\eta \right\rangle$$

which, for pure states, can be expressed in terms of the squared overlap \({\left|\langle {\psi }_{\sigma }| {\psi }_{\rho }\rangle \right|}^{2}\). Given these solutions, one then derives an expression for the Helstrom operators \({M}_{{\rm{A}}}^{\star }\) and \({M}_{{\rm{B}}}^{\star }\) with type-I error probabilities 1 − pA and pB, respectively. This leads to the robustness condition \(\beta ({M}_{{\rm{A}}}^{\star };\ \rho )+\beta ({M}_{{\rm{B}}}^{\star };\ \rho )\,>\,1\) being an inequality that can be rewritten as a condition on the fidelity that takes the desired form (12). □
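
As a worked illustration of the eigenvalue step (a sketch assuming the two pure states are linearly independent, i.e., F < 1; the full argument is in Supplementary Note 3), note that the operator ρ − tσ vanishes outside the two-dimensional span of \(\left|{\psi }_{\sigma }\right\rangle\) and \(\left|{\psi }_{\rho }\right\rangle\). Restricted to this span it has trace 1 − t and, using \({\rm{Tr}}[{(\rho -t\sigma )}^{2}]=1+{t}^{2}-2tF\), determinant −t(1 − F), so its two possibly nonzero eigenvalues are

$${\eta }_{\pm }=\frac{1-t}{2}\pm \sqrt{\frac{{(1-t)}^{2}}{4}+t(1-F)}.$$

For t > 0 the product of the two eigenvalues is negative, hence η+ > 0 > η−, and the projection onto the positive eigenspace has rank one. As a consistency check, setting F = 1 (identical states) gives the eigenvalues 1 − t and 0, as expected for the operator (1 − t)ρ.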

In a similar manner, one can derive the trace distance bound for depolarized input states presented in the “Results” section of this paper. The full proof for the robustness bound in Eq. (43) is given in Supplementary Note 5.