Introduction

As hardware is developed to implement quantum circuits on increasing numbers of qubits, it will be valuable to have economical benchmarks of fully quantum behavior. From the outset of quantum computing it has been clear that the advantage of a quantum computer lies somewhere in its ability to readily perform tasks that are physically challenging or impossible for a classical system. Therefore, ideal hardware benchmarks should certify the ability of the hardware to generate such nonclassical behavior. Indeed, a wide variety of benchmarking techniques have been developed recently,1,2 including gate-fidelity benchmarks using randomized gate sequences that avoid the state-preparation and measurement errors, and state-preparation benchmarks that certify particular states while avoiding the exponential scaling of state tomography.

Despite these recent achievements, quantifying the specific nonclassical resources that lead to quantum computational advantage has remained an elusive goal.3 Several earlier proposals for suitable measures like entanglement,4,5,6,7 Bell-nonlocality,8,9,10,11,12,13 or quantum discord and its variations,14,15,16 proved to be insufficient on their own due to the discovery of algorithmic counter-examples.17,18,19,20,21 Recent advances suggest a strong connection between quantum advantage and contextuality,22,23,24,25,26 which is a general structural feature of quantum mechanics that subsumes nonlocality. The most pragmatic metric of nonclassical behavior in quantum devices, however, has been the violation of two-qubit Bell inequalities, or similar entanglement witnesses that can apply to few-qubit subsets of a multi-qubit device.27

In this article, we provide a set of practical hardware benchmarks that naturally generalize two-qubit Bell inequality tests to N qubits, based on the Greenberger–Horne–Zeilinger (GHZ) theorem. As with Bell inequalities, our nonclassicality benchmarks use the experimental violation of a classical bound to quantify the nonclassical behavior of the circuit. Beyond quantifying nonclassicality via a bound-violation, these benchmarks also provide tight lower bounds on the fidelities with which particular stabilizer subspaces have been prepared, and thus witness genuine N-qubit entanglement for all states that lie within the targeted subspaces. These benchmarks are optimized for testing controllable qubit arrays with nearest-neighbor coupling. As such, we provide efficient circuits for preparing cluster states that maximally violate these benchmarks with controlled-Z entangling gates, using a constant gate depth of 4 (up to hardware-specific decompositions of the controlled-Z gate28,29,30,31,32,33). Though our benchmarks efficiently verify genuine N-qubit entanglement using cluster states, many of the benchmarks may be applied to other stabilizer states and we expect similar benchmarks to exist for all stabilizer states.

The benchmarks we present here generalize earlier work that was experimentally tested with N = 3, 4 photons,34 where they were compared to previously proposed state-dependent methods for efficiently verifying the fidelity of particular entangled N-qubit preparations.35,36 These prior methods have already been used to verify multi-qubit entanglement in state-of-the-art experiments with 12 qubits37 and 18 qubits,38 since the exponential scaling required for traditional state tomography is increasingly prohibitive. Notably, for large N our GHZ-based benchmarks produce a tighter preparation-fidelity bound than these existing methods and similarly produce entanglement witnesses with better scaling.

Results

Nonclassicality benchmarks

Our benchmarks consist of measurable correlators that are compared to derived upper bounds; violation of these bounds characterizes nonclassicality. Each such benchmark corresponds to a specific prepare-and-measure circuit on N qubits with M ≤ N + 1 different measurement settings. The M observables form a structure called an ID (also called an identity product39), which is a set of mutually commuting N-qubit Pauli operators whose overall product is the N-qubit identity, up to a sign. We express an ID as an M × N table of single-qubit Pauli operators and the identity {Z, X, Y, I}, labeled Oij with i = 1, …, M and j = 1, …, N. We also define the shortened label \(O_i = \otimes _{j = 1}^NO_{ij}\) to indicate the N-qubit observable obtained as the product of the ith row of an ID. We omit tensor product symbols for compactness.

To obtain the Bell inequality for each ID,34 we choose a particular eigenspace Π represented by a projector of rank \(2^{N-M+1}\), which is specified by the set of N-qubit Pauli observables {Oi} that form the M rows of the ID (see Figs 1 and 2), and a specific choice of their respective eigenvalues {λi}. We then define the correlator observable for this chosen eigenspace,

$$\alpha = \mathop {\sum}\limits_{i = 1}^M {\lambda _i} {\mkern 1mu} O_i,$$
(1)

such that its expectation value in a state ρ has an upper bound of βQM = M, saturated by the chosen eigenspace ρ = Π

$$\langle \alpha \rangle = \mathop {\sum}\limits_{i = 1}^M {\lambda _i} {\mkern 1mu} {\mathrm{Tr}}(\rho {\mkern 1mu} O_i) \le \beta _{{\mathrm{QM}}} = M.$$
(2)

For example, we could prepare the joint eigenstate of the ID of Fig. 1a, with negative eigenvalue λ1 = −1 for the three-qubit Pauli observable O1 = YXY, and positive eigenvalues λ2 = λ3 = λ4 = +1 for the remaining observables O2 = YYZ, O3 = ZXZ, and O4 = ZYY. Then, 〈α〉 = Tr(Πα) = 4, since each term in the sum becomes +1.
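The defining properties of an ID can be checked mechanically for this three-qubit example. The following is a minimal sketch, assuming a simple string representation of Pauli operators; the helper names `mul1`, `mul`, and `commute` are illustrative and are not taken from the original work. It confirms that the four observables mutually commute, that their product is the identity with sign −1, and that the correlator reaches M = 4 when every term contributes λi² = 1.

```python
from itertools import combinations

# Cyclic single-qubit products: XY = iZ, YZ = iX, ZX = iY
CYCLIC = {("X", "Y"): "Z", ("Y", "Z"): "X", ("Z", "X"): "Y"}

def mul1(a, b):
    """Product of two single-qubit Paulis, returned as (phase, Pauli)."""
    if a == "I":
        return 1, b
    if b == "I":
        return 1, a
    if a == b:
        return 1, "I"
    if (a, b) in CYCLIC:
        return 1j, CYCLIC[(a, b)]
    return -1j, CYCLIC[(b, a)]

def mul(p, q):
    """Product of two N-qubit Pauli strings, each given as (phase, string)."""
    phase, out = p[0] * q[0], []
    for a, b in zip(p[1], q[1]):
        ph, c = mul1(a, b)
        phase *= ph
        out.append(c)
    return phase, "".join(out)

def commute(s, t):
    """Pauli strings commute iff they anticommute on an even number of qubits."""
    anti = sum(1 for a, b in zip(s, t) if "I" not in (a, b) and a != b)
    return anti % 2 == 0

rows = ["YXY", "YYZ", "ZXZ", "ZYY"]  # observables quoted in the text
lambdas = [-1, +1, +1, +1]           # eigenvalues quoted in the text

all_commute = all(commute(s, t) for s, t in combinations(rows, 2))
product = (1, "III")
for r in rows:
    product = mul(product, (1, r))
# In the target eigenspace each term lambda_i * <O_i> equals lambda_i**2 = 1
alpha_max = sum(l * l for l in lambdas)

print(all_commute, product, alpha_max)
```

The printed product has phase −1 and string `III`, which matches the requirement that the eigenvalues multiply to −1.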

Fig. 1

Minimal benchmark IDs for N = 3, …, 9 qubits. Each table in a–g has M rows of N observables Oij, with i = 1, …, M and j = 1, …, N. The product of each row defines \(O_{i} = {\otimes}_{j=1}^{N}O_{ij}\). Eigenvalues λi of Oi are also shown in each table, chosen to correspond to the state prepared by the circuit of Fig. 3 for the corresponding N, which lies in the specific eigenspace stabilized by the ID. Combining the rows of each ID with the appropriate eigenvalue defines a correlator observable \(\alpha = \sum \nolimits_{i} \lambda_{i} O_{i}\), from which we obtain the experimental benchmark score \({\cal{B}} = (\langle\alpha \rangle _{\exp} - M + 2)/2\) that witnesses nonlocal N-partite entanglement when \(0 \,{<}\, {\cal{B}} \leq 1\), as well as the lower bound \(F \ge F_{\mathrm{ID}} = ({\cal{B}} + 1)/2\) on the fidelity F for the state preparation to lie within the indicated eigenspace of the ID.

Fig. 2

Maximal benchmark IDs for (a) all even N ≥ 10 and (b) all odd N ≥ 11. These IDs can be extended in increments of two qubits and two observables by adding tiles as shown, and filling all other spaces with ‘I’s. The N = 10 and N = 11 IDs are the cases of a and b, respectively, with zero tiles added. We can see from the asymmetric shape of the tiles that the added qubits must become entangled with the existing ones because the two-qubit observables in the added columns do not mutually commute. See the Supplementary Notes and Supplementary Fig. 6 for a proof that these IDs belong to the stabilizer group of the linear cluster state for all N.

In the spirit of Bell,9,10 if one tries to explain the observed correlation by choosing a complete set of local hidden variables vZj, vXj, vYj ∈ {+1, −1} that predict the outcomes of the single-qubit Pauli measurements, then at least one of the terms in the correlator sum becomes −1, resulting in a smaller upper bound,

$$\langle \alpha \rangle \le \beta _{{\mathrm{LHVT}}} = M - 2.$$
(3)

Experimental violation of this bound thus indicates nonclassicality in the form of a violation of local realism. Though the locality loophole is always open for neighboring qubits on a chip, this violation is still a useful witness for nonclassical states prepared by the chip, much like for Bell inequalities or Bell–Leggett–Garg inequalities.40 The derivation of this bound is reviewed in the “Methods” section.
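For a small ID the local-hidden-variable bound can be confirmed by exhaustive search. The sketch below is illustrative rather than the derivation used in the article: it assumes the three-qubit ID and eigenvalues quoted earlier in this section, enumerates all 2⁹ assignments of predetermined outcomes for X, Y, and Z on each qubit, and records the largest correlator value any assignment can reach.

```python
from itertools import product

rows = ["YXY", "YYZ", "ZXZ", "ZYY"]  # three-qubit ID quoted in the text
lambdas = [-1, +1, +1, +1]
N, M = 3, len(rows)

best = -M
for values in product([+1, -1], repeat=3 * N):
    # v[j][P] is the predetermined outcome of Pauli P on qubit j
    v = [dict(zip("XYZ", values[3 * j:3 * j + 3])) for j in range(N)]
    alpha = 0
    for lam, row in zip(lambdas, rows):
        outcome = 1
        for j, p in enumerate(row):
            if p != "I":
                outcome *= v[j][p]
        alpha += lam * outcome
    best = max(best, alpha)

print(best)  # 2, which equals M - 2
```

No assignment reaches the quantum value M = 4, because at least one term of the sum is always forced to −1.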

As an independent result, maximizing the expectation value of the correlator over all biseparable quantum states in the N-qubit Hilbert space produces the upper bound,

$$\langle \alpha \rangle \le \beta _{{\mathrm{bisep}}} = M - 2,$$
(4)

which happens to coincide with the bound for local hidden variable theories. Experimental violation of the bound thus also witnesses genuine N-partite entanglement. In the “Methods” section, we provide the proof that the joint eigenspaces of the IDs in this article are maximally entangled, as well as the derivation of this bound.

In light of the convenient fact that βbisep = βLHVT, we define the nonclassicality benchmark score for a given physical N-qubit device as the experimentally determined value,

$${\cal{B}} = \frac{{\langle \alpha \rangle _{{\mathrm{exp}}} - M + 2}}{2},$$
(5)

such that \({\cal{B}} \le 0\) fails to witness either entanglement or the violation of local realism, while \(0 \,{<}\, {\cal{B}} \le 1\) witnesses nonlocal N-partite-entangled states. The nonclassicality benchmark score thus serves as a metric of uniquely quantum behavior, with \({\cal{B}} = 1\) indicating maximum nonclassicality that saturates the correlator bound. Each N-qubit ID provides a benchmark corresponding to a distinct nonclassical eigenspace of an N-qubit physical device, and thus the hierarchy of IDs presented in Fig. 1 provides a corresponding hierarchy of benchmarks.

Lower bounding the fidelity

The correlator also serves to bound the fidelity from below,34

$$F \ge F_{{\mathrm{ID}}} = \frac{{\langle \alpha \rangle _{{\mathrm{exp}}} - M + 4}}{4} = \frac{{{\cal{B}} + 1}}{2},$$
(6)

where F = Tr(ρexpΠ) ∈ [0, 1] is the fidelity that the experimentally prepared state ρexp lies within the eigenspace Π stabilized by the chosen ID. We provide a general derivation of this bound in the “Methods” section. Importantly, in the limit 〈α〉exp → βQM = M, we have FID → 1, and thus as the fidelity of the preparation is improved, this lower bound obviates the need for full tomography of these preparations.
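Equations (5) and (6) translate directly into two one-line functions. The following is a minimal sketch; the function names and the sample correlator value of 3.2 are illustrative assumptions, not measured data.

```python
def benchmark_score(alpha_exp, M):
    """Nonclassicality benchmark score of Eq. (5)."""
    return (alpha_exp - M + 2) / 2

def fidelity_lower_bound(alpha_exp, M):
    """Fidelity lower bound F_ID of Eq. (6)."""
    return (alpha_exp - M + 4) / 4

# Hypothetical measured correlator for an ID with M = 4 settings
M = 4
alpha_exp = 3.2
B = benchmark_score(alpha_exp, M)
F_ID = fidelity_lower_bound(alpha_exp, M)
print(B, F_ID)  # approximately 0.6 and 0.8

# The classical bound alpha = M - 2 gives B = 0, the quantum bound alpha = M gives B = 1
print(benchmark_score(M - 2, M), benchmark_score(M, M))
```

A score of about 0.6 would witness genuine N-partite entanglement, and the corresponding bound places the fidelity at 0.8 or higher.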

Taken together, the inequalities of Eqs. (3), (4) and (6) provide a practical and efficient characterization of the prepared N-qubit state, as well as a robust benchmark of its nonclassical behavior, using only M ≤ N + 1 measurement settings. We present minimal benchmark IDs in Fig. 1 for N = 3, …, 9, and detail minimal IDs up to N = 33 qubits in Supplementary Figs 1 through 5. These minimal IDs saturate the conjectured bound N ≤ (M − 2)(M − 1)/2. We also present a family of maximal benchmark IDs in Fig. 2 for all N ≥ 10 that saturate the bound M − 1 ≤ N.
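The two bounds on the number of settings can be tabulated with a short helper. This sketch assumes only the conjectured relation N ≤ (M − 2)(M − 1)/2 and the relation M − 1 ≤ N stated above; the function names are illustrative, and the code does not establish that an ID exists for every value it returns.

```python
def max_qubits(M):
    """Largest N allowed by the conjectured bound N <= (M - 2)(M - 1)/2."""
    return (M - 2) * (M - 1) // 2

def min_settings(N):
    """Smallest M compatible with the conjectured bound for N qubits."""
    M = 3
    while max_qubits(M) < N:
        M += 1
    return M

def max_settings(N):
    """Largest M allowed by M - 1 <= N."""
    return N + 1

for N in list(range(3, 10)) + [33]:
    print(N, min_settings(N), max_settings(N))
```

Under the conjecture, the number of settings for a minimal ID grows only as the square root of N, whereas a maximal ID uses N + 1 settings.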

Benchmark circuits and simulation

The IDs in this article have been specially chosen so that the prepare-and-measure circuit for each measurement setting requires a gate depth of 4 on any array of N physical qubits with only nearest-neighbor controlled-Z couplings, making them a scalable and uniform set of benchmarks for implementations of this type. Figure 3 shows the circuits for N = 4, 5, from which the generalization to all N should be straightforward. In general, each circuit prepares an N-qubit linear cluster state, which is contained within the maximally entangled subspace of the corresponding ID.
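As an illustration of the ideal, noiseless case, the sketch below builds the three-qubit linear cluster state and evaluates the correlator for the three-qubit ID quoted earlier. It assumes the standard cluster-state construction (a Hadamard on every qubit of the ground state followed by controlled-Z gates on neighboring pairs), for which the amplitudes have a closed form; the helper names are illustrative and the exact gate ordering of Fig. 3 is not reproduced.

```python
from math import sqrt

def cluster_state(n):
    """Amplitudes of the n-qubit linear cluster state (qubit 1 = most significant bit)."""
    amps = []
    for b in range(2 ** n):
        bits = [(b >> (n - 1 - j)) & 1 for j in range(n)]
        # Each controlled-Z contributes a sign flip when both neighbors are 1
        sign = (-1) ** sum(bits[j] * bits[j + 1] for j in range(n - 1))
        amps.append(sign / sqrt(2 ** n))
    return amps

def apply_pauli(string, psi):
    """Apply an n-qubit Pauli string to a state vector."""
    n = len(string)
    out = [0j] * len(psi)
    for b, amp in enumerate(psi):
        phase, target = 1, b
        for j, p in enumerate(string):
            mask = 1 << (n - 1 - j)
            bit = 1 if b & mask else 0
            if p in ("X", "Y"):
                target ^= mask               # X and Y flip the bit
            if p == "Y":
                phase *= -1j if bit else 1j  # Y|0> = i|1>, Y|1> = -i|0>
            elif p == "Z":
                phase *= -1 if bit else 1
        out[target] += phase * amp
    return out

def expectation(string, psi):
    phi = apply_pauli(string, psi)
    return sum(x.conjugate() * y for x, y in zip(psi, phi)).real

rows = ["YXY", "YYZ", "ZXZ", "ZYY"]  # three-qubit ID quoted in the text
lambdas = [-1, +1, +1, +1]
M = len(rows)

psi = cluster_state(3)
alpha = sum(lam * expectation(row, psi) for lam, row in zip(lambdas, rows))
score = (alpha - M + 2) / 2
print(round(alpha, 9), round(score, 9))
```

The ideal state gives 〈α〉 = 4 and a benchmark score of 1, consistent with the eigenvalues listed for this ID.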

Fig. 3

Illustrative circuit diagrams for preparing the states for IDs in Fig. 1, with a corresponding to N = 4 in Fig. 1b and b corresponding to N = 5 in Fig. 1c. These two examples generalize to N qubits, and produce linear cluster states. The local measurement settings for each observable Oij in the ID are implemented by the unitary operations Uij, assuming detectors that naturally measure the Z basis. This circuit allows the M different settings of an ID to be implemented with different Uij for different observables and qubits. For example, in the five-qubit ID of Fig. 1c the first setting is ZYYZI, meaning that for the first and fourth qubits U11 = U14 = I, for the second and third qubits \(U_{12} = U_{13} = e^{i\pi X/4}\), and the fifth qubit is ignored.

In order to evaluate the usefulness of these benchmarks in real-world physical implementations, we simulated the performance of these circuits for each of the IDs in Fig. 1. We simulated each circuit over a range of T1 energy relaxation times, T2 dephasing times, and angular jitter for the controlled-Z gate rotations, using the ranges given in Figs 4 and 5. We also considered the effect of initialization and readout error for each qubit. The ranges of values were chosen to match the reported values of the 9-qubit Google chip,31,32 with the experimental values roughly in the center of each simulated range. We ran one version of the simulation using a nominal initialization error for each qubit of Pe = 2%, and another version where we used the observed initialization errors for each of the nine qubits on the Google chip. Final readout error has been neglected as correctable for ensemble statistics. Selected plots from the simulations are shown in Fig. 4, while scatter plots of the lower fidelity bound, FID, are shown in Fig. 5 for the full ranges of simulated values. Note that in order to minimize the effect of the two worst qubits on the chip (boldface values in Figs 4 and 5), we always used the last N qubits on the chip to form our N-qubit IDs in the simulation. See the “Methods” section for additional details about how the numerical simulations were performed.

Fig. 4

Nonclassicality benchmark scores (\({\cal{B}}\)), for selected simulations. The nonideality parameter ranges were T1 ∈ [5, 50] μs energy relaxation times, T2 ∈ [1, 19] μs dephasing times, and w ∈ [0.05, 0.5] rad of angular jitter widths for ZZ90 decompositions of controlled-Z (CZ) gates. We used a single-qubit gate time of Δt = 25 ns, and a two-qubit controlled-Z gate time of Δt = 45 ns, which are conservative estimates for the reported gate times. In plots e, f, the curves for qubit numbers N = 3, …, 9 are ordered starting from the top curve. In plots b, c, the N = 5 lines are below the N = 6 lines, and the N = 7 lines are below the N = 8 lines, due to the poor performance of chip qubits 5 and 7 (boldface values). a–c Simulated data using Google’s 9-qubit-chip values31,32: {T1} = {18.6, 28.1, 22.0, 19.1, 41.1, 21.3, 39.2, 24.7, 26.3} μs and {Pe} = {1.8, 1.1, 1.7, 1.3, 4.8, 0.7, 6.7, 0.4, 1.5}%. d–f Simulated data for Pe = 2% initialization error, with parameter ranges centered on mean chip values. a, d \({\cal{B}}\) vs. N. Ideal curves have T2 = T1 = ∞ and w = 0. Median curves approximate the chip, with shading indicating the range of simulated values. b, e \({\cal{B}}\) vs. T2, fixing median chip values of w and T1. c \({\cal{B}}\) vs. w, fixing the median chip value of T2. f \({\cal{B}}\) vs. T1, fixing the median chip values of T2 and w.

Fig. 5

Scatterplots of the fidelity lower bound FID vs. true fidelity F for all simulated data. The lower bound is tight, thus as F → 1 so too does FID. All plots contain data for the nonideality ranges: T2 ∈ [1, 19] μs dephasing times, and w ∈ [0.05, 0.5] rad angular jitter widths for CZ gates. a Chip values {T1} = {18.6, 28.1, 22.0, 19.1, 41.1, 21.3, 39.2, 24.7, 26.3} μs energy relaxation times, and {Pe} = {1.8, 1.1, 1.7, 1.3, 4.8, 0.7, 6.7, 0.4, 1.5}% initialization error. b Pe = 2% initialization error, with range T1 ∈ [5, 50] μs. c Same ranges as b, but with Pe = 0 to show the asymptotic approach FID → F as F → 1.

Judging by our simulated data shown in Figs 4 and 5, we expect the nine-qubit Google chip to be able to violate the classicality bounds for all nine qubits. We can see clearly that the qubit initialization error is the dominant source of error as we try to move to larger N. This shows that our benchmarking scheme is immediately relevant, since it appears that similar hardware fidelity would only violate the bound for one or two more qubits—but certainly not all 72 on the Bristlecone chip41—once suitable IDs have been found beyond the nine presented here.

Discussion

The IDs and implementation circuits presented in this article are good benchmark tests for any physical implementation of qubits in a nearest-neighbor-connected array. They work naturally on a chip with more connectivity than this as well. While our simulations targeted a particular recent chip implementation for concreteness, this does not constrain the general usefulness of this protocol for other multi-qubit systems.

Although some other families of IDs with the same properties as those in Figs 1 and 2 are known,39,42 the minimal IDs, with the largest possible value of N for a given M, are not known in general (see the Supplementary Discussion and Supplementary Figs 1 through 5 for the best known cases). Because of their geometric nature, enumerating all of the representative IDs for given values of N and M is a highly nontrivial problem, related to solving the graph isomorphism problem on N × M colored vertices, and it is thus limited by computational resources. Furthermore, not every ID can be constructed using only nearest-neighbor couplings in linear circuits as in Fig. 3. The increased connectivity of more modern chips, like the Bristlecone chip from Google, should allow the use of more general IDs, although the circuit depth will likely increase by one or two gates.

Each of the IDs presented here also gives rise to a complete proof of the Kochen–Specker (KS) theorem for contextuality,22,43,44 which can be implemented for any initial state with a few alternative circuits for the different measurement contexts. In general, IDs are the natural building blocks of proofs of the KS theorem in the N-qubit Pauli group. This is a slightly more complicated setup, which could inspire different contextuality-based benchmarks in future work.

Finally, maximally entangled IDs with M < N + 1 give rise to maximally entangled eigenspaces, each of dimension \(2^{N-M+1}\), which generalize the codespaces of error-correcting codes,45,46 and L = N − M + 1 is the number of logical qubits (where N is the number of physical qubits). All N-qubit-stabilizer-based error-correcting codes (including the toric code47) belong to the family of IDs, and while all IDs of this type are error-detecting codes, they cannot all be used to diagnose the syndrome of an error in order to correct it. Many of the well-known error-correcting codes generate an ID which proves the GHZ theorem, and all can be used as entanglement witnesses in the manner of this article.48 Nevertheless, these more general maximally entangled subspaces may be of significant interest for other applications in quantum information processing, which warrants further investigation. One straightforward application for these subspaces is to perform benchmarks that measure the physical qubits as described in this paper, while simultaneously benchmarking the performance of the logical qubits in some additional way. The two tests may be performed simultaneously because any general logical L-qubit state can be prepared for each benchmark, although the circuit is likely to be longer and more complex than Fig. 3, and the performance will be commensurately worse. It is remarkable to note that if the conjectured bound N ≤ (M − 2)(M − 1)/2 can be saturated, then the number of logical qubits is bounded by L ≤ (M − 2)(M − 1)/2 − M + 1, and thus the ratio L/N → 1 in the limit M → ∞.
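The limiting ratio can be made concrete with a few numbers. The sketch below assumes that the conjectured bound is saturated, so that N = (M − 2)(M − 1)/2, and uses L = N − M + 1 as defined above; the function names are illustrative.

```python
def max_qubits(M):
    """N when the conjectured bound N <= (M - 2)(M - 1)/2 is saturated."""
    return (M - 2) * (M - 1) // 2

def logical_qubits(N, M):
    """Number of logical qubits L = N - M + 1."""
    return N - M + 1

ratios = {}
for M in (4, 10, 100, 1000):
    N = max_qubits(M)
    L = logical_qubits(N, M)
    ratios[M] = L / N
    print(M, N, L, round(L / N, 4))
```

For M = 4 the eigenspace is a single state (L = 0), while for M = 10 the ratio is already 27/36 = 0.75, and it approaches 1 as M grows.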

Methods

Proving the GHZ theorem

All of the IDs in Fig. 1 have sign −1, and for each qubit j, the number of entries Oij = Z in the ID is even, as is the number of entries with Oij = X and with Oij = Y. These properties indicate that these IDs give rise to proofs of the GHZ theorem,11 which is a logical version of Bell’s nonlocality theorem,9,10 without any inequalities. To see this, suppose that a joint eigenstate (i.e., any state in a joint eigenspace) of these observables is prepared. This eigenstate has M eigenvalues λi corresponding to the M observables, and \(\mathop {\prod}\nolimits_{i = 1}^M {\lambda _i} = - 1\), since the product of these M observables is \(-I^{\otimes N}\). Suppose that the N qubits are now mutually space-like separated, and each is subjected to random local Pauli measurements. We label their outcomes λij when all N local measurement settings happen to correspond to observable i of the ID. The entanglement correlations that are obeyed by this state are \(\mathop {\prod}\nolimits_{j = 1}^N {\lambda _{ij}} = \lambda _i\). Putting these relations together we have \(\mathop {\prod}\nolimits_{i = 1}^M {\mathop {\prod}\nolimits_{j = 1}^N {\lambda _{ij}} } = - 1\). Now, in order for a local hidden variable theory (LHVT) to explain these entanglement correlations, each qubit j must carry local hidden variables vZj, vXj, vYj ∈ {+1, −1} which predict the outcomes λij, and are pre-arranged to satisfy the entanglement constraints. However, for such hidden variables we would have \(\mathop {\prod}\nolimits_{i = 1}^M {\mathop {\prod}\nolimits_{j = 1}^N {\lambda _{ij}} } = \mathop {\prod}\nolimits_{j = 1}^N {v_{Zj}^{n_j}} v_{Xj}^{m_j}v_{Yj}^{l_j} = + 1\), where nj, mj, and lj count the entries Z, X, and Y for qubit j. Since nj, mj, and lj are all even for the IDs of this article, it is impossible to choose local hidden variables which can satisfy the entanglement correlations of this state.
This logical proof without inequalities can be converted into a Bell inequality for use as a benchmark of N-qubit nonlocality, as shown in the main text, by noting that for any complete assignment of local hidden variables vZj, vXj, vYj ∈ {+1, −1} to the ID, at least one of the observables has the wrong eigenvalue.
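The parity condition behind this argument is easy to check for a given table. The following minimal sketch counts, for each qubit of the three-qubit ID quoted in the Results, how often Z, X, and Y appear (the counts called nj, mj, and lj above); the variable names are illustrative.

```python
from collections import Counter

rows = ["YXY", "YYZ", "ZXZ", "ZYY"]  # three-qubit ID quoted in the Results
lambdas = [-1, +1, +1, +1]
N = len(rows[0])

# Column j collects the single-qubit Paulis measured on qubit j
column_counts = [Counter(row[j] for row in rows) for j in range(N)]
parities_even = all(
    counts[p] % 2 == 0 for counts in column_counts for p in "ZXY"
)

lambda_product = 1
for lam in lambdas:
    lambda_product *= lam

for j, counts in enumerate(column_counts, start=1):
    print(j, {p: counts[p] for p in "ZXY"})
print(parities_even, lambda_product)
```

Every count is even, so any hidden-variable assignment predicts a product of +1 over all outcomes, whereas the quantum eigenvalues multiply to −1.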

In general, proving the GHZ theorem does not prove that nonlocal correlations exist between more than just a single pair of qubits among the N,49,50,51,52 nor does it generally witness genuine N-qubit entanglement. In contrast, the benchmark IDs we present in this article prove the GHZ theorem and are constructed to be N-partite entanglement witnesses,53,54 such that their corresponding Bell inequalities can only be violated by genuinely N-qubit-entangled states. To go further than the results we present here and prove nonlocal correlations exist between every pair of qubits among the N, one must violate the corresponding Svetlichny inequalities49,55 instead, but with the cost that the number of required measurement settings grows exponentially with N.49

Bounding the fidelity

An N-qubit ID with M observables {Oi} has a complete set of eigenspaces {Πk} satisfying \(\mathop {\sum}\nolimits_k {{\Pi}_k} = I\), each of which can be identified by a unique set of distinct eigenvalues {λik} of {Oi}. Only M − 1 of the observables in an ID are independent, and if M − 1 < N the eigenspaces Πk are degenerate, and each contains \(2^{N-M+1}\) mutually orthogonal vectors |κjk〉 which share the eigenvalues λik, with j = 1, …, \(2^{N-M+1}\), such that {|κjk〉} is a complete orthonormal eigenbasis of the ID. Each of the \(2^{M-1}\) eigenspaces Πk corresponds to a unique correlator \(\alpha _k = \mathop {\sum}\nolimits_{i = 1}^M {\lambda _{ik}} O_i\). Each experimentally obtained quantity 〈αk〉 enables us to put a lower bound on the fidelity that an experimentally prepared pure state |ψ〉 lies within the eigenspace Πk.34

With no loss of generality, we will henceforth use correlator α1 and the target eigenspace Π1. We begin by expanding |ψ〉 in this eigenbasis as

$$|\psi \rangle = \mathop {\sum}\limits_{j = 1}^{2^{N - M + 1}} \left[ a_j|\kappa _{j1}\rangle + \mathop {\sum}\limits_{k = 2}^{2^{M - 1}} {b_{jk}} |\kappa _{jk}\rangle \right],$$
(7)

such that \(\mathop {\sum}\limits_j {\left(|a_j|^2 + \mathop {\sum}\limits_{k = 2}^{2^{M - 1}} | b_{jk}|^2\right)} = 1\).

Since the expansion is in an eigenbasis of α1, we find

$$\begin{array}{c}\langle \alpha _1\rangle _{{\mathrm{exp}}} = \langle \psi |\alpha _1|\psi \rangle \\ = \mathop {\sum}\limits_{j = 1}^{2^{N - M + 1}} \left[ |a_j|^2\langle \kappa _{j1}|\alpha _1|\kappa _{j1}\rangle + \mathop {\sum}\limits_{k = 2}^{2^{M - 1}} | b_{jk}|^2\langle \kappa _{jk}|\alpha _1|\kappa _{jk}\rangle \right].\end{array}$$
(8)

Note that \(\langle \kappa _{j1}|\alpha _1|\kappa _{j1}\rangle = \mathop {\sum}\nolimits_{k = 1}^M {\lambda _k^2} = M\), since all eigenvalues of |κj1〉 match those in the correlator α1 by construction, and thus square to 1. However, any other |κjk〉 does not lie within Π1, so is characterized by eigenvalues distinct from those characterizing Π1. Moreover, since the product of all eigenvalues for the observables of a given ID is fixed for any eigenstate, only even numbers of eigenvalues can differ from those characterizing Π1, which necessarily causes at least two terms of 〈κjk|α1|κjk〉 to become −1, resulting in an upper bound of 〈κjk|α1|κjk〉 ≤ M − 4 for those eigenstates. Using these two observations we obtain,

$$\langle \alpha _1\rangle _{{\mathrm{exp}}} \le \mathop {\sum}\limits_{j = 1}^{2^{N - M + 1}} \left[ |a_j|^2M + \mathop {\sum}\limits_{k = 2}^{2^{M - 1}} | b_{jk}|^2(M - 4)\right] = 4F + M - 4,$$
(9)

where \(F = \mathop {\sum}\nolimits_j {|a_j|^2}\), and we have used \(\mathop {\sum}\nolimits_j {(|a_j|^2 + \mathop {\sum}\nolimits_{k = 2}^{2^{M - 1}} | b_{jk}|^2)} = 1\). We can rewrite this relation as

$$F \equiv \langle \psi |{\Pi}_1|\psi \rangle \ge \frac{{\langle \alpha _1\rangle _{{\mathrm{exp}}} - M + 4}}{4} \equiv F_{{\mathrm{ID}}}.$$
(10)

Noting that the left hand side of this equation is the fidelity F for the preparation |ψ〉 to lie within the eigenspace Π1, the right hand side FID gives a lower bound F ≥ FID for the fidelity. For IDs with M = N + 1, the target subspace Π1 contains only one eigenvector, so the fidelity F is also a state preparation fidelity for the particular target eigenstate |κ1〉. For IDs with M < N + 1, the target subspace Π1 is degenerate, so the fidelity F is the fidelity for |ψ〉 to lie within that subspace.
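The pure-state bound can be spot-checked numerically. The sketch below is an illustration rather than part of the derivation: it assumes the three-qubit ID quoted in the Results (for which M = N + 1, so the target eigenspace is a single state), builds the projector Π1 as a product of (I + λiOi)/2 over M − 1 independent rows, and compares F with FID for randomly drawn pure states. The helper names and the random seed are arbitrary choices.

```python
import random
from math import sqrt

ROWS = ["YXY", "YYZ", "ZXZ", "ZYY"]  # three-qubit ID quoted in the Results
LAMBDAS = [-1, +1, +1, +1]           # eigenvalues defining the target eigenspace
N, M = 3, 4

def apply_pauli(string, psi):
    """Apply an N-qubit Pauli string to a state vector (qubit 1 = most significant bit)."""
    out = [0j] * len(psi)
    for b, amp in enumerate(psi):
        phase, target = 1, b
        for j, p in enumerate(string):
            mask = 1 << (N - 1 - j)
            bit = 1 if b & mask else 0
            if p in ("X", "Y"):
                target ^= mask
            if p == "Y":
                phase *= -1j if bit else 1j
            elif p == "Z":
                phase *= -1 if bit else 1
        out[target] += phase * amp
    return out

def inner(a, b):
    return sum(x.conjugate() * y for x, y in zip(a, b))

def correlator(psi):
    """<psi|alpha_1|psi> for the chosen eigenvalues."""
    return sum(lam * inner(psi, apply_pauli(row, psi)).real
               for lam, row in zip(LAMBDAS, ROWS))

def subspace_fidelity(psi):
    """F = <psi|Pi_1|psi>, with Pi_1 built from M - 1 independent rows."""
    proj = psi
    for lam, row in zip(LAMBDAS[:M - 1], ROWS[:M - 1]):
        o_proj = apply_pauli(row, proj)
        proj = [(x + lam * y) / 2 for x, y in zip(proj, o_proj)]
    return inner(psi, proj).real

def random_state(rng):
    vec = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2 ** N)]
    norm = sqrt(sum(abs(x) ** 2 for x in vec))
    return [x / norm for x in vec]

rng = random.Random(0)
gaps = []
for _ in range(200):
    psi = random_state(rng)
    F = subspace_fidelity(psi)
    F_ID = (correlator(psi) - M + 4) / 4
    gaps.append(F - F_ID)

print(min(gaps) >= -1e-12)  # F >= F_ID for every sampled state
```

For generic random states the gap F − FID is strictly positive, and it closes as the state approaches the target eigenspace.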

Next we generalize the above derivation to the case of mixed states. For a general convex combination of m pure states,

$$\rho = \mathop {\sum}\limits_{l = 1}^m {c_l} |\psi _l\rangle \langle \psi _l|,$$
(11)

where \({\sum} {c_l} = 1\), we can expand each |ψl〉 using appropriate eigenbases of the ID as in Eq. (7) and follow the same arguments to obtain

$$\langle \alpha _1\rangle _{{\mathrm{exp}}} \le \mathop {\sum}\limits_{l = 1}^m {c_l} (4F_l + M - 4),$$
(12)

where Fl ≡ 〈ψl|Π1|ψl〉. We can rewrite this as

$$F \equiv {\mathrm{Tr}}(\rho {\mkern 1mu} {\Pi}_1) = \mathop {\sum}\limits_{l = 1}^m {c_l} F_l \ge \frac{{\langle \alpha _1\rangle _{{\mathrm{exp}}} - M + 4}}{4} \equiv F_{{\mathrm{ID}}}.$$
(13)

As in the pure state case, the left-hand side is the fidelity F for the mixed state ρ to lie within the target subspace Π1, while the same expression for the right-hand side FID places a lower bound on this fidelity.

Witnessing genuine N-partite entanglement

An N-qubit ID provides an entanglement witness if it is maximally entangled.39,56 Entanglement is usually discussed in reference to the separability of states. However, there is a way to reason about the entanglement of a set of observables directly without reference to states. We define a maximally entangled set of N-qubit observables as one with the property that there exists no bipartition of the N qubits into subsets of R and N − R, such that all of the observables in each subset \(\mathop { \otimes }\nolimits_{k \in [1,R]} O_{ik}\) mutually commute. It follows from this definition that the joint eigenstates of this set are maximally entangled N-qubit stabilizer states.

To see this, consider that every stabilizer state (space) of N qubits has a stabilizer group of \(b = 2^g\) mutually commuting Pauli observables {Si} and corresponding eigenvalues {λi}, and its density operator can be written as

$$\rho = \frac{1}{d}\mathop {\sum}\limits_{i = 1}^b {\lambda _i} S_i,$$
(14)

where g is the number of independent generators in the set, and \(d = 2^N\) is the dimension of the Hilbert space. Note that if g < N, then ρ projects onto a subspace of rank \(r = 2^{N-g} > 1\), and that g = M − 1 for a minimal ID, which is just a specific subset of one or more complete stabilizer groups. If a stabilizer state is the tensor product of two smaller stabilizer states on subsystems A and B, it follows that its density operator can be written as

$$\rho _{AB} = \left(\frac{1}{{d^A}}\mathop {\sum}\limits_{i = 1}^{b^A} {\lambda _i^A} S_i^A\right) \otimes \left(\frac{1}{{d^B}}\mathop {\sum}\limits_{j = 1}^{b^B} {\lambda _j^B} S_j^B\right) = \frac{1}{{d^{AB}}}\mathop {\sum}\limits_{k = 1}^{b^{AB}} {\lambda _k^{AB}} S_k^{AB}.$$
(15)

For the bipartition of the system into A and B, all of the stabilizer operators \(S_i^A = \mathop { \otimes }\nolimits_{k \in A} O_{ik}\) mutually commute by definition. It follows that one can find such a mutually commuting bipartition for any separable state, and therefore if no such bipartition exists, then the set of observables is maximally entangled. All of the IDs presented in this article are maximally entangled in this way, which results in a witness inequality with the same bound as the Bell inequality.

All states within a maximally entangled eigenspace of an ID are maximally entangled, meaning that for all of them, the maximum squared Schmidt coefficient across all bipartitions is 1/2. For such an eigenstate |ψ〉, a standard entanglement witness is \({\cal{W}} = 1/2 - |\psi \rangle \langle \psi |\), and an experimental measurement of \(\langle {\cal{W}}\rangle \,{<}\, 0\) is a witness of genuine N-partite entanglement.54 Noting that a superposition state a|ψ〉 + b|ψ⊥〉 can only violate this bound for F = |a|2 > 1/2, we obtain FID ≤ F ≤ 1/2 for all biseparable states. Plugging this into FID = (〈α〉exp − M + 4)/4 yields 〈α〉bisep ≤ M − 2, which is Eq. (4).

Numerical simulation details

In the simulation, the state is first degraded by initialization error. That is, ideally the N qubits are prepared in an initial ground state \(\otimes _{i = 1}^N|0\rangle\). However, each qubit has an error probability \(P_{\mathrm{e}}^{(i)}\) of being initially excited, which produces a mixed initial single-qubit state \((1 - P_{\mathrm{e}}^{(i)})|0\rangle \langle 0| + P_{\mathrm{e}}^{(i)}|1\rangle \langle 1| = (1 - 2P_{\mathrm{e}}^{(i)})|0\rangle \langle 0| + P_{\mathrm{e}}^{(i)}I\), and thus a degraded initial state \(\rho = \otimes _{i = 1}^N[(1 - 2P_{\mathrm{e}}^{(i)})|0\rangle \langle 0| + P_{\mathrm{e}}^{(i)}I]\) with ground state fidelity \(\mathop {\prod}\nolimits_{i = 1}^N {(1 - P_{\mathrm{e}}^{(i)})}\). The final readout error for an ensemble average can be corrected if the readout misidentification probabilities \(P_{\mathrm{e}}^{(i)}\) are known, and thus we have neglected the role of the readout error.

Each gate in Fig. 3 is then applied to the initial state ρ. For the Hadamard gate, it is sufficient to use a Y90 rotation, exp(−iYπ/4). We decompose the controlled-Z gate into an implementable ZZ90 entangling gate and single-qubit corrections: exp(−iπ/4)[exp(iZπ/4) ⊗ exp(iZπ/4)]exp(−iZZπ/4). We degraded each gate by T1 energy relaxation and T2 dephasing processes for the corresponding gate times Δt. For the energy relaxation time T1, the first-order corrections for each individual qubit are accumulated and then applied to ρ. For each qubit \(\Delta \rho _i = (a_i\rho a_i^\dagger - \frac{1}{2}\{ \rho ,a_i^\dagger a_i\} )\Delta t/T_1^i\), where ai is the lowering operator of the ith qubit tensored with identity for the other qubits, and \(\rho \to \rho + \mathop {\sum}\nolimits_{i = 1}^N \Delta \rho _i\). This linear-order Lindblad-form update is sufficient, since \(\Delta t/T_1^i \ll 1\). For the dephasing time T2, we directly construct the matrix

$$D = \left( {\begin{array}{*{20}{c}} 1 & {{\mathrm {e}}^{ - \Delta t/T_2}} \\ {{\mathrm {e}}^{ - \Delta t/T_2}} & 1 \end{array}} \right)^{ \otimes N},$$
(16)

for efficiency and apply gate dephasing using element-wise multiplication (MATLAB syntax .*), as ρ → ρ .* D.
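Two of the gate-level ingredients above can be illustrated in a few lines. The sketch below is not the original MATLAB code: it assumes a single T2 shared by all qubits as in Eq. (16), uses illustrative values Δt = 45 ns and T2 = 10 μs, and assumes a global phase factor exp(−iπ/4) in the controlled-Z decomposition. It builds the dephasing mask D by repeated Kronecker products, checks that each element equals exp(−Δt/T2) raised to the number of qubits on which the row and column bit strings differ, and checks that the ZZ90-based decomposition reproduces the diagonal of a controlled-Z gate.

```python
import cmath
import math

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def dephasing_mask(n, dt, T2):
    """N-fold Kronecker power of [[1, d], [d, 1]] with d = exp(-dt/T2)."""
    d = math.exp(-dt / T2)
    D = [[1.0]]
    for _ in range(n):
        D = kron(D, [[1.0, d], [d, 1.0]])
    return D

def dephase(rho, D):
    """Element-wise product rho .* D."""
    return [[r * m for r, m in zip(rho_row, d_row)] for rho_row, d_row in zip(rho, D)]

def cz_diagonal():
    """Diagonal of exp(-i pi/4) [exp(iZ pi/4) (x) exp(iZ pi/4)] exp(-iZZ pi/4)."""
    diag = []
    for a in (0, 1):
        for b in (0, 1):
            za, zb = 1 - 2 * a, 1 - 2 * b  # Z eigenvalues of the two qubits
            diag.append(cmath.exp(-1j * math.pi / 4)
                        * cmath.exp(1j * math.pi * za / 4)
                        * cmath.exp(1j * math.pi * zb / 4)
                        * cmath.exp(-1j * math.pi * za * zb / 4))
    return diag

n, dt, T2 = 3, 45e-9, 10e-6  # illustrative values
D = dephasing_mask(n, dt, T2)
d = math.exp(-dt / T2)
hamming_ok = all(abs(D[r][c] - d ** bin(r ^ c).count("1")) < 1e-12
                 for r in range(2 ** n) for c in range(2 ** n))
cz_ok = all(abs(x - y) < 1e-12 for x, y in zip(cz_diagonal(), [1, 1, 1, -1]))

# Dephasing a uniform-coherence matrix leaves the diagonal untouched
rho = [[1 / 2 ** n] * 2 ** n for _ in range(2 ** n)]
rho_new = dephase(rho, D)
print(hamming_ok, cz_ok, rho_new[0][0] == rho[0][0])
```

Because D has ones on its diagonal, the element-wise product preserves all populations and only attenuates coherences.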

For simulating gate infidelity, we assume that the single-qubit gate fidelities are high enough for their errors to be neglected, and so simulate only a range of fidelities for the two-qubit controlled-Z gates. As a crude model for infidelity of a controlled-Z gate, we add a random angular jitter δφ only to the ZZ rotation step, exp[−iZZ(π/2 + δφ)/2], and average over the effect of this jitter using a raised cosine distribution with a width w, dP(δφ) = d(δφ)[1 + cos(πδφ/w)]/(2w), where δφ ∈ [−w, w] has compact angular support. This yields the averaged state update,

$$\begin{array}{c}\rho \to {\int} {{\mathrm {e}}^{ - i\zeta _i(\pi /2 + \delta \varphi )/2}} {\mkern 1mu} \rho {\mkern 1mu} {\mathrm {e}}^{i\zeta _i(\pi /2 + \delta \varphi )/2}{\mkern 1mu} {\mathrm {d}}P(\delta \varphi )\\ = \frac{1}{2}\left[\rho + \zeta _i\rho \zeta _i - i(\zeta _i\rho - \rho \zeta _i)\left(\frac{{\sin w}}{w} - \frac{{\sin w}}{{2(w + \pi )}} - \frac{{\sin w}}{{2(w - \pi )}}\right)\right],\end{array}$$
(17)

where ζi is the tensor product of Pauli Z for the two qubits the controlled-Z is acting on, and identity for all of the other qubits. The limit as w → 0 restores the unperturbed gate. This crude error model includes only one possible physical mechanism of infidelity for the controlled-Z gate, but gives an indication of the gate sensitivity to imprecise angular control. Since the initialization error dominates the infidelity, the effect of the angular jitter is small.
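The w-dependent factor in Eq. (17) is the average of cos(δφ) over the raised cosine distribution, which can be confirmed by direct numerical integration. The following is a minimal sketch; the function names and the midpoint-rule integrator are illustrative choices.

```python
import math

def jitter_factor(w):
    """Closed-form factor multiplying the commutator term in Eq. (17)."""
    s = math.sin(w)
    return s / w - s / (2 * (w + math.pi)) - s / (2 * (w - math.pi))

def jitter_factor_numeric(w, steps=20000):
    """Midpoint-rule average of cos(dphi) over the raised cosine density on [-w, w]."""
    h = 2 * w / steps
    total = 0.0
    for k in range(steps):
        dphi = -w + (k + 0.5) * h
        density = (1 + math.cos(math.pi * dphi / w)) / (2 * w)
        total += math.cos(dphi) * density * h
    return total

for w in (0.05, 0.275, 0.5):  # spans the simulated range of jitter widths
    print(w, jitter_factor(w), jitter_factor_numeric(w))

print(jitter_factor(1e-4))  # approaches 1 as w -> 0
```

Even at the largest simulated width, w = 0.5 rad, the factor stays within about 2% of unity, which is consistent with the small effect of angular jitter noted above.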