Introduction

Quantum technology research can be broadly categorized into two strands: on the one hand, the development of large-scale fault-tolerant quantum computers, and on the other hand, the development of quantum networks that link quantum computers together and allow for quantum communication-based tasks (such as clock synchronization1, anonymous communication2, and cryptography3,4), culminating in a Quantum Internet5 connecting quantum processing nodes. These nodes, for which physical platforms such as NV-centers in diamond6, ion traps7, and neutral atoms8 are currently being developed, possess quantum computing capacity, leading to the possibility of distributed or networked quantum computing9.

One of the major step changes in the development of quantum computers in the last decade has been the development of practical methods to characterize the quality of quantum operations, allowing experimentalists to quickly diagnose and improve a critical building block of a fault-tolerant quantum computing architecture (see e.g., ref. 10 and references therein). In this work, we consider the corresponding problem of the characterization of quantum communication links, a key feature of quantum networks that has no real counterpart in quantum computation. Several methods exist to assess the quality of a quantum network link, which we briefly review. For entanglement-based networks, i.e., networks where the quantum network link is established through entangled states between nodes, any characterization of the quality of entanglement translates in principle to a quality measure of the network link. Many methods exist to assess the quality of entanglement (see e.g., refs. 11,12 for work on Bell inequalities and self-testing, and ref. 13 for quantum state tomography), which can be mapped to quality-assessment methods for quantum network links. Similarly, for direct transmission-based network links (abstractly modeled by a quantum channel), we may, from the transmission of qubit states in two distinct bases (typically the X and Z bases), make an inference about how well any state, or indeed entanglement, may be transmitted (see e.g., refs. 14,15,16). More generally, a procedure is known to estimate the capacity of a quantum channel17. Finally, ref. 18 gives a method to certify whether a quantum network of nodes connected by quantum links has attained a specific stage of development.

In this work, we aim to add to this toolbox by proposing network benchmarking: a procedure to assess the quality of transmission between quantum processing nodes in a quantum network in the so-called quantum memory network stage and above5, by yielding an estimate of the average fidelity of the effective quantum channel modeling a quantum network link. Network benchmarking is adapted from the randomized benchmarking protocol19,20, a gold-standard methodology for the characterization of quantum operations in quantum computers. It is lightweight, easy to implement, and inherits many of the robustness properties enjoyed by the original randomized benchmarking protocol. We also give a more general multi-node protocol that can be used to characterize the fidelity of a path of multiple nodes connected by quantum communication links, and can thus be seen as the quantum analogue of the classical “ping” operation.

Results

Overview

We introduce network benchmarking, a method that robustly and efficiently yields an estimate of the quality of a quantum network link. We propose two versions of this protocol: a two-node protocol that estimates the quality of a link between two quantum network nodes, and a multi-node protocol that estimates the quality of a path over several nodes in a network. We provide a theoretical analysis of these protocols, arguing that they estimate the average fidelity of the quantum channel modeling a quantum link. For network links implemented by noisy quantum teleportation, we prove that this network fidelity can be related to the average fidelity, a standard metric of the quality of quantum processes.

We supplement this theoretical work with numerical simulations using the quantum network simulator NetSquid21. By testing several realistic scenarios, we argue that network benchmarking performs well under realistic conditions (e.g., noise, timing, and circuit decompositions), efficiently yielding accurate estimates of the network fidelity of a network link.

Finally, we analyze the statistical requirements of network benchmarking, with a particular focus on the number of times the quantum communication link must be used to get a good estimate of the average fidelity.

In the section Network model, we elaborate upon a model of a quantum network and recall aspects of the quantum channel formalism for noisy quantum operations. In the section Network benchmarking, we introduce the network benchmarking protocol, in its two-node and multi-node versions, and in the section Network fidelity, we connect the data it generates to the average fidelity. In the section Simulation results, we present results from numerical simulations of the network benchmarking protocol using the NetSquid simulation package for quantum networks. In the section Statistics of network benchmarking, we discuss the statistics of network benchmarking.

Network model

We consider an abstract model of a quantum network, consisting of nodes and connections between those nodes. We will label the nodes with capital letters (A, B, …) and denote the connection between two nodes by a directional arrow (A → B). These nodes and connections are abstractions of the physical components of the network. Within the framework of ref. 5 we will assume that our nodes have the following functionalities:

  1. The ability to store quantum states in memory (stage 4 in ref. 5). We model this by associating a memory register \({{{{\mathcal{H}}}}}_{A}\) to each node. We will assume this register can be initialized in some fixed initial state ρA and read out by measurement in a POVM \({\{{E}_{A}^{(i)}\}}_{i\in I}\) with I being some index set labeling the possible measurement outcomes. An example of an initial state is the all-zero \(\left|0\ldots 0\right\rangle\) state and an example of a POVM measurement is the standard computational basis measurement.

  2. The ability to perform quantum operations on stored quantum states (stage 5 in ref. 5). We will model this by allowing the application of quantum gates U from a gateset \({\mathbb{G}}\). Ideally this gateset is universal, meaning that any unitary operation can be implemented on the quantum processing node by sequences of unitaries from \({\mathbb{G}}\), but we will only need a weaker property to perform network benchmarking, as we shall see in section Network benchmarking.

  3. The ability to transmit quantum states from node A to node B (stage 3 in ref. 5). This can be implemented in various ways in the underlying hardware, such as through teleportation using entanglement, but we will consider it as an abstract functionality here. It is this ability that is predominantly tested by network benchmarking.

Noise and average fidelity

The appearance of noise in quantum devices is typically modeled by quantum channels. These are superoperators (linear maps that send matrices to matrices) that preserve physicality, i.e., they map quantum states to quantum states. For an extended introduction to quantum channels see ref. 22. We will denote quantum channels by Λ and denote the action of a quantum channel on a state ρ as Λ(ρ). We will use superscripts to indicate the function of a quantum channel and subscripts to denote the node with which it is associated. For instance, we will model the noise associated with state preparation in node A by \({{{\Lambda }}}_{A}^{SP}\) and the noise associated with measurement by \({{{\Lambda }}}_{A}^{M}\). We will also associate with each quantum operation U implemented on a node A (which we think of as a superoperator U(ρ) = UρU†, abusing notation somewhat) a quantum channel \({{{\Lambda }}}_{A}^{U}\) modeling the noise associated with the operation U. This means that if node A is instructed to apply U to a state ρ (ideally yielding U(ρ) = UρU†) it actually outputs \(({{{\Lambda }}}_{A}^{U}U)(\rho )={{{\Lambda }}}_{A}^{U}(U\rho {U}^{{\dagger} })\). Note that we have made no mention of the physical mechanism by which \({{{\Lambda }}}_{A}^{U}\) arises; it is abstracted away. Finally, we associate to the quantum transmission link A → B the quantum channel ΛAB, modeling the noise incurred by a state in the process of transmission from A to B. Note that in general, we do not assume that ΛAB = ΛBA, although we will see examples where this is the case. We have included an illustration of the network model and associated noise maps in Fig. 1a.

Fig. 1: The quantum network model.

a An example network with three nodes A, B, and C presented together with all associated quantum channels modeling state preparation noise (ΛSP), measurement noise (ΛM), operation noise (ΛU), and noise in the network links connecting the nodes (e.g., ΛAB for the link AB). b Graphical description of (a single run of m bounces of) the two-node network benchmarking protocol, with time running rightwards. Boxes indicate actions taken at nodes A and B, and the colors (color online) associated with each box indicate what noise process (see a) affects these actions. See algorithm 1 for a detailed description of the two-node network benchmarking protocol.

Upon modeling (the act of transmission through) a network link AB with a quantum channel, we can ask how we can quantify the quality of this network link. This question is equivalent to asking how well the quantum channel ΛAB approximates the identity channel. One of the most common ways of quantifying this approximation is the average fidelity F(Λ) of a quantum channel Λ, which is defined as

$$F({{\Lambda }})=\int d\psi {{{\rm{Tr}}}}\left[{{\Lambda }}\left(\left|\psi \right\rangle \left\langle \psi \right|\right)\left|\psi \right\rangle \left\langle \psi \right|\right],$$
(1)

where the integral is taken uniformly over all pure quantum states \(\left|\psi \right\rangle\). One can interpret this quantity as measuring how much a generic quantum state changes when Λ acts on it, or equivalently as capturing the average behavior of Λ. The average fidelity is a standard metric used in reporting the quality of quantum operations in quantum computers. The goal of network benchmarking is to estimate quantities like F(ΛAB), the average fidelity of (the quantum channel modeling) a network link AB.

Finally we note that the average fidelity is closely related to another quantity23, which we call the depolarizing fidelity f(Λ). These two quantities are related as

$$F({{\Lambda }})=\frac{(d-1)f({{\Lambda }})+1}{d},$$
(2)

where d is the dimension of the underlying state space. The depolarizing fidelity does not have the clean operational interpretation of the average fidelity, but it will show up more naturally in the calculations below. Next, we move on to defining and analyzing the network benchmarking protocol.
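As an illustration of eqs. (1) and (2), the following minimal sketch estimates the average fidelity of a single-qubit channel by Monte Carlo sampling of Haar-random pure states and converts it to a depolarizing fidelity. The dephasing channel, the parameter `p`, and all function names are assumptions chosen for illustration; they do not appear in the protocol itself.

```python
import numpy as np

def haar_random_state(d, rng):
    """Sample a pure state uniformly by normalizing a complex Gaussian vector."""
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    return v / np.linalg.norm(v)

def dephasing(rho, p):
    """Illustrative single-qubit channel: apply Z with probability p."""
    Z = np.diag([1.0, -1.0])
    return (1 - p) * rho + p * Z @ rho @ Z

def average_fidelity_mc(channel, d, samples, seed=0):
    """Monte Carlo estimate of eq. (1): mean of <psi|channel(|psi><psi|)|psi>."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(samples):
        psi = haar_random_state(d, rng)
        rho = np.outer(psi, psi.conj())
        total += np.real(psi.conj() @ channel(rho) @ psi)
    return total / samples

def average_from_depolarizing(f, d):
    """Eq. (2): F = ((d - 1) f + 1) / d."""
    return ((d - 1) * f + 1) / d

def depolarizing_from_average(F, d):
    """Eq. (2) solved for f."""
    return (d * F - 1) / (d - 1)

p = 0.1
F_est = average_fidelity_mc(lambda rho: dephasing(rho, p), d=2, samples=20000)
f_est = depolarizing_from_average(F_est, d=2)
print(F_est, f_est)
```

For this particular channel the integral can be done by hand, giving F = 1 − 2p/3, so the estimate can be checked against a known value.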

Network benchmarking

In this section, we introduce network benchmarking. We will describe two versions of this protocol, a two-node protocol and a more general multi-node protocol. Network benchmarking can be seen as an adaptation of the randomized benchmarking protocol19,20,24 to quantum networks, and shares many of its characteristics and much of its theoretical analysis.

Consider two separated nodes A and B connected by quantum network links AB and BA, with associated quantum channels ΛAB, ΛBA. The goal of two-node network benchmarking is to estimate the average fidelities F(ΛAB) and F(ΛBA). However, we desire that the procedure estimating these quantities satisfies several properties. The first property is efficiency: we demand that the estimation procedure is light on resource use (measured in the number of times a network link is used), and independent of the capacity of the network link. By this, we mean that we want to be able to estimate the fidelity of links sending many-qubit states in parallel without an exponential explosion in resource use. The second property is resistance to state preparation and measurement errors. We will not assume that the initialization of states and the measuring of POVMs in nodes is perfect, and we demand that the estimation procedure output the correct result even when state preparation and measurement (SPAM) is imperfect. Ideally, we would also like to demand independence from the noise in quantum operations performed locally, but this is not possible. However, given that gate fidelities are typically much higher than state preparation and measurement fidelities in many physical platforms for network nodes, demanding only SPAM-robustness is a reasonable compromise.

Network benchmarking is not a device-independent protocol, and in order to guarantee that it outputs an estimate of the fidelity of the quantum network link, we have to make several assumptions on the behavior of the nodes and network link. These assumptions are essentially the same as those of standard randomized benchmarking; see ref. 25 for a general discussion of these assumptions. The central assumption we make is that of Markovianity: we will assume that the noise in the network link is always modeled by the same quantum channel, independent of the history of its use. We will similarly assume that the noise on state preparation ρA, measurements \({\{{E}_{A}^{(i)}\}}_{i\in I}\), and quantum gates UA is described by noise models that only depend on node A (and not on external variables like time, history, etc.). Note that this assumption of Markovianity was already implicit in our earlier description of the network model. Markovianity is a standard assumption in the benchmarking literature, but it is not universally valid. Non-Markovian effects (see ref. 26 for definitions and in-depth discussion) have been studied27,28 in the dynamics of NV centers (leading to a breakdown of the Markovianity assumption on the quantum gates). Non-Markovian effects can also occur in photon detection (i.e., measurement), for instance, through photons misclassified because of control pulses preceding the measurement (see e.g., Supplementary Fig. 3 in ref. 29).

We will also assume that the quantum gates UA have a so-called gate-independent noise model. This means we assume that there exists a quantum channel ΛA such that for all gates \(U\in {\mathbb{G}}\) the implementation of U is given by \({{{\Lambda }}}_{A}(U\rho {U}^{{\dagger} })\). We stress, however, that this is merely a technical assumption standard in the randomized benchmarking literature, adopted to make the proof of correctness of network benchmarking easier to understand. It can be removed at the cost of a considerable increase in mathematical complexity; see ref. 25 for a general treatment.

Two-node network benchmarking

The two-node network benchmarking protocol involves two nodes, A and B, connected by links AB and BA. This protocol produces an estimate of the (geometric) mean quality of the quantum channels ΛAB, ΛBA associated with the links. A formal specification of the two-node network benchmarking protocol is given in algorithm 1. An illustration of the steps of the protocol can also be found in Fig. 1b. Here, we give a more intuitive explanation of the steps taken.

The protocol begins with the initialization of a state ρA at node A. To this state, a quantum operation \({G}_{A}^{(1)}\) is applied and the resulting state is then sent (through ΛAB) to node B. Upon arrival at B another quantum operation \({G}_{B}^{(1)}\) is applied and the state is sent back to node A (through ΛBA). The quantum operations \({G}_{A}^{(1)},{G}_{B}^{(1)}\) must be chosen at random from a sufficiently large set of quantum operations \({\mathbb{G}}\). By sufficiently large we mean that the set must be at least a unitary two-design. A common choice for such a set is the multi-qubit Clifford group \({\mathbb{C}}\)24, which is also appropriate here. We will refer to the above sequence of “random operation at A - send to B- random operation at B- send to A” as a bounce. The protocol proceeds by performing such a bounce m times, where m is some pre-specified integer. After these m bounces a final operation \({G}_{A}^{({{{\rm{inv}}}})}\) is applied at node A after which the state is read out by a two-component POVM \(\{E,{\mathbb{1}}-E\}\). This operation \({G}_{A}^{({{{\rm{inv}}}})}\) is not chosen at random but is instead the inverse of the product of all preceding gates, plus some extra ending gate PA, in symbols

$${G}_{A}^{({{{\rm{inv}}}})}={P}_{A}{\left(\mathop{\prod }\limits_{i = 1}^{m}{G}_{B}^{(i)}{G}_{A}^{(i)}\right)}^{{\dagger} }.$$
(3)

This means that if, hypothetically, all gates and state transfer operations are perfectly noise-free, the overall operation applied to the initial state ρA is the ending gate PA. This ending gate must again be chosen at random, but this time from a restricted gateset of two operations: \({P}_{A}\in \{{\mathbb{1}},P\}\) where P is a unitary that sends ρA to a state orthogonal to ρA. If ρA is the all-zero state, a good choice for P would be the all-qubit Pauli X-gate. Upon measurement, a binary outcome b is produced, which is negated if PA = P and left unchanged if \({P}_{A}={\mathbb{1}}\). This is a post-processing trick originally proposed in ref. 24, making the processing of this output data easier (we will explain this in more detail in the section Network fidelity). The procedure outlined above must then be repeated for many different random choices of operations, to estimate the average outcome \({b}_{m}={\mathbb{E}}(b)\). Finally, the integer m must be varied, yielding a set of data \({\{{b}_{m}\}}_{m\in {\mathbb{M}}}\) where \({\mathbb{M}}\) is some list of integers.

Algorithm 1

The two-node network benchmarking protocol

 1: for \(m\in {\mathbb{M}}\) do

 2:  for nm from 1 to Nm do

 3:   Prepare a state ρA at node A

 4:   for i from 1 to m do

 5:    Apply a random gate \({G}_{A}^{(i)}\) to ρA

 6:    Transfer ρA to node B using ΛAB

 7:    Apply a random gate \({G}_{B}^{(i)}\) to ρA

 8:    Transfer ρA to node A using ΛBA

 9:   end for

10:   Choose PA randomly from the set \(\{{\mathbb{1}},P\}\)

11:   Apply \({G}_{A}^{({{{\rm{inv}}}})}={P}_{A}\,\,{\left(\mathop{\prod }\nolimits_{i = 1}^{m}{G}_{B}^{(i)}{G}_{A}^{(i)}\right)}^{{\dagger} }\) to ρA.

12:   Measure the state ρA using the POVM \(\{E,{\mathbb{1}}-E\}\)

13:   and record the outcome \({b}_{{n}_{m}}\in \{0,1\}\)

14:   if PA is equal to P then

15:    Set \({b}_{{n}_{m}}\) to \(-{b}_{{n}_{m}}\)

16:   end if

17:  end for

18:  Compute the mean outcome

$${b}_{m}=\frac{1}{{N}_{m}}\mathop{\sum }\limits_{{n}_{m}=1}^{{N}_{m}}{b}_{{n}_{m}}$$
(4)

19: end for

20: Output the list \({\{{b}_{m}\}}_{m\in {\mathbb{M}}}\)
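The steps of algorithm 1 can be sketched as a small density-matrix simulation for a single qubit. This is an illustrative toy under loudly stated assumptions, not the simulation code used for the results in this work: both links are modeled as depolarizing channels with hypothetical parameters `f_ab` and `f_ba`, local gates and SPAM are noiseless, the gateset is the 24-element single-qubit Clifford group (generated here from H and S), and instead of sampling PA and individual measurement bits the signed outcome is averaged exactly over both choices of PA. All function names are hypothetical.

```python
import numpy as np

def single_qubit_cliffords():
    """Generate the single-qubit Clifford group (modulo global phase) from H and S."""
    H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
    S = np.array([[1, 0], [0, 1j]], dtype=complex)
    group = [np.eye(2, dtype=complex)]
    frontier = list(group)
    while frontier:
        fresh = []
        for U in frontier:
            for g in (H, S):
                V = g @ U
                # |Tr(W^dagger V)| = 2 exactly when V equals W up to a phase
                known = any(abs(abs(np.trace(W.conj().T @ V)) - 2) < 1e-9 for W in group)
                if not known:
                    group.append(V)
                    fresh.append(V)
        frontier = fresh
    return group

def depolarize(rho, f):
    """Assumed link model: depolarizing channel with depolarizing fidelity f."""
    return f * rho + (1 - f) * np.trace(rho) * np.eye(2) / 2

def bounce_sequence(m, f_ab, f_ba, gates, rng):
    """One random sequence of m bounces; returns the signed expected outcome."""
    rho0 = np.array([[1, 0], [0, 0]], dtype=complex)   # initial state |0><0|
    E = rho0                                           # POVM element {E, 1 - E}
    X = np.array([[0, 1], [1, 0]], dtype=complex)      # P maps |0> to |1>
    rho, product = rho0, np.eye(2, dtype=complex)
    for _ in range(m):
        GA = gates[rng.integers(len(gates))]
        rho = depolarize(GA @ rho @ GA.conj().T, f_ab)  # gate at A, send A -> B
        GB = gates[rng.integers(len(gates))]
        rho = depolarize(GB @ rho @ GB.conj().T, f_ba)  # gate at B, send B -> A
        product = GB @ GA @ product
    outcome = 0.0
    for P, sign in ((np.eye(2), +1), (X, -1)):          # exact average over P_A
        G_inv = P @ product.conj().T
        final = G_inv @ rho @ G_inv.conj().T
        outcome += 0.5 * sign * np.real(np.trace(E @ final))
    return outcome

def network_benchmark(ms, n_seq, f_ab, f_ba, seed=0):
    """Return {m: b_m}, the mean outcome over n_seq random sequences per m."""
    rng = np.random.default_rng(seed)
    gates = single_qubit_cliffords()
    return {m: float(np.mean([bounce_sequence(m, f_ab, f_ba, gates, rng)
                              for _ in range(n_seq)])) for m in ms}

data = network_benchmark(ms=[1, 2, 4, 8], n_seq=20, f_ab=0.97, f_ba=0.95)
print(data)
```

Under these idealized assumptions every sequence returns the same value, (1/2)(f_ab f_ba)^m, so the decay is exactly a single exponential in m; with realistic, non-depolarizing noise the individual sequences would scatter around such a decay.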

As we will argue in the next section, the output data \({\{{b}_{m}\}}_{m\in {\mathbb{M}}}\) can be fitted to a single exponential

$${b}_{m}\,{ = }\,_{{{{\rm{fit}}}}}A{f}^{m}$$
(5)

where A depends on state preparation and measurement (SPAM) errors and f only depends on the noise incurred by the application of local gates and the channels ΛAB, ΛBA. We can extract the quantity f by performing a least-squares fit on the data \({\{{b}_{m}\}}_{m\in {\mathbb{M}}}\). We will call the quantity f the network link fidelity (associated with nodes A, B). In the next section, we will see that, under the assumption that the unitary operations at each node have noise that is the same for each operation, i.e., that there exist quantum channels ΛA, ΛB such that \({{{\Lambda }}}_{A}^{G}={{{\Lambda }}}_{A}\) and \({{{\Lambda }}}_{B}^{G}={{{\Lambda }}}_{B}\) for each gate G, the network link fidelity f can be written as \(f=f({{{\Lambda }}}_{A\to B}{{{\Lambda }}}_{A})f({{{\Lambda }}}_{B\to A}{{{\Lambda }}}_{B})\), with f(Λ) the depolarizing fidelity (as defined in eq. (2)). This means that f is related to the product of the depolarizing fidelities of ΛAB and ΛBA, but also depends on the local gate noise channels ΛA, ΛB. In practice, the local gates will have high fidelity relative to the communication links, so the network fidelity f will be dominated by the channels ΛAB, ΛBA.
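A minimal sketch of extracting f from the data is given below. It assumes strictly positive mean outcomes and uses a linear least-squares fit of log b_m against m, which is a simplification of a direct nonlinear least-squares fit to the model of eq. (5); the function name and the synthetic data are hypothetical.

```python
import numpy as np

def fit_single_exponential(ms, bs):
    """Fit b_m = A * f**m by linear least squares on log(b_m) = log(A) + m*log(f)."""
    ms = np.asarray(ms, dtype=float)
    bs = np.asarray(bs, dtype=float)
    if np.any(bs <= 0):
        raise ValueError("the log-linear fit needs strictly positive b_m")
    slope, intercept = np.polyfit(ms, np.log(bs), 1)
    return float(np.exp(intercept)), float(np.exp(slope))

# Synthetic, noise-free data with assumed values A = 0.48 and f = 0.9.
ms = np.arange(1, 21)
bs = 0.48 * 0.9 ** ms
A_fit, f_fit = fit_single_exponential(ms, bs)
print(A_fit, f_fit)
```

On noisy data the logarithm reweights the points, so a nonlinear fit of the untransformed model may be preferable when the b_m at large m are close to zero.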

Multi-node network benchmarking

The above protocol can be generalized to quantify the fidelity of a connected path of network nodes. This provides a quantum version of the classical “ping” command and could prove useful in day-to-day network operations. Consider nodes A1, …, AK that are connected by quantum channels \({{{\Lambda }}}_{{A}_{i}\to {A}_{i+1}}\) and \({{{\Lambda }}}_{{A}_{i+1}\to {A}_{i}}\) for i ∈ {1, …, K − 1}. The multi-node network benchmarking protocol works by sending a state from A1 to AK (via A2, A3, …) and then back to A1, with a random gate applied to this state at each node along the way. By performing this multi-node bounce several times, one can extract an estimate of the fidelity of the composite link connecting A1 and AK. The protocol is specified in algorithm 2.

Algorithm 2

The multi-node network benchmarking protocol

 1: for \(m\in {\mathbb{M}}\) do

 2:  for nm from 1 to Nm do

 3:   Prepare a state ρA at node A1

 4:   for i from 1 to m do

 5:    for k from 1 to K − 1 do

 6:     Apply a random gate \({G}_{{A}_{k}}^{(i,1)}\) to ρA

 7:     Transfer ρA to node Ak+1 using \({{{\Lambda }}}_{{A}_{k}\to {A}_{k+1}}\)

 8:    end for

 9:    for k from K − 1 to 1 do

10:     Apply a random gate \({G}_{{A}_{k+1}}^{(i,2)}\) to ρA

11:     Transfer ρA to node Ak using \({{{\Lambda }}}_{{A}_{k+1}\to {A}_{k}}\)

12:    end for

13:   end for

14:   Choose PA randomly from the set \(\{{\mathbb{1}},P\}\)

15:   Apply the inverse

16:  \({G}_{{A}_{1}}^{({{{\rm{inv}}}})}={P}_{A}{\left(\mathop{\prod }\nolimits_{i = 1}^{m}\mathop{\prod }\nolimits_{k = K}^{2}{G}_{{A}_{k}}^{(i,2)}\mathop{\prod }\nolimits_{k = 1}^{K-1}{G}_{{A}_{k}}^{(i,1)}\right)}^{{\dagger} }\) to ρA.

17:   Measure the state ρA using the POVM \(\{E,{\mathbb{1}}-E\}\)

18:   and record the outcome \({b}_{{n}_{m}}\in \{0,1\}\)

19:   if PA is equal to P then

20:    Set \({b}_{{n}_{m}}\) to \(-{b}_{{n}_{m}}\)

21:   end if

22:  end for

23:  Compute the mean outcome

$${b}_{m}=\frac{1}{{N}_{m}}\mathop{\sum }\limits_{{n}_{m}=1}^{{N}_{m}}{b}_{{n}_{m}}$$
(6)

24: end for

25: Output the list \({\{{b}_{m}\}}_{m\in {\mathbb{M}}}\)

As in the two-node case, the output data \({\{{b}_{m}\}}_{m\in {\mathbb{M}}}\) can be fitted to a single exponential

$${b}_{m}\,{ = }\,_{{{{\rm{fit}}}}}{A}_{{{{\rm{SPAM}}}}}{f}^{m}$$
(7)

Again assuming gate-independent noise for the local gates at each node, we can see (in an argument identical to that in the two-node case) that f will be given by

$$f=\mathop{\prod }\limits_{k=1}^{K-1}f({{{\Lambda }}}_{{A}_{k}\to {A}_{k+1}}{{{\Lambda }}}_{{A}_{k}})f({{{\Lambda }}}_{{A}_{k+1}\to {A}_{k}}{{{\Lambda }}}_{{A}_{k+1}}).$$
(8)

In other words, the output of the multi-node network benchmarking protocol is given by the product of the depolarizing fidelities of all intermediate communication links (up to local noise channels). We will refer to f as the network path fidelity (associated with the path A1, …, AK).
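As a small worked illustration of eq. (8), the sketch below multiplies per-link depolarizing fidelities along a path and evaluates the decay model of eq. (7). The numerical values, the SPAM constant, and the function names are hypothetical, and each listed fidelity is assumed to already include the local gate noise of the sending node.

```python
import math

def network_path_fidelity(forward, backward):
    """Eq. (8): product of the depolarizing fidelities of all links, both directions.

    forward[k]  stands for f(Lambda_{A_k -> A_{k+1}} Lambda_{A_k}),
    backward[k] stands for f(Lambda_{A_{k+1} -> A_k} Lambda_{A_{k+1}}).
    """
    if len(forward) != len(backward):
        raise ValueError("need one forward and one backward fidelity per link")
    return math.prod(forward) * math.prod(backward)

def predicted_decay(m, a_spam, f_path):
    """Eq. (7): expected mean outcome after m multi-node bounces."""
    return a_spam * f_path ** m

# Hypothetical three-node path A1 - A2 - A3 with symmetric links.
f_path = network_path_fidelity(forward=[0.97, 0.96], backward=[0.97, 0.96])
print(f_path, predicted_decay(5, a_spam=0.5, f_path=f_path))
```

Because the path fidelity is a product, it decays with the number of hops, which is the behavior the multi-node protocol is meant to expose.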

Discussion

In this work, we have presented the network benchmarking protocol, a robust and efficient tool for assessing the quality of network links between nodes in a quantum Internet. We have proposed two versions of the protocol: a two-node version, analyzing the quality of a single connection, and a multi-node version, analyzing the quality of a path of nodes in a network. We gave a mathematical analysis of these protocols, arguing that under some assumptions, they output a quantity related to the average fidelity of the quantum channels modeling the network links. We also argued that for a standard class of network link models, namely noisy quantum teleportation, the network fidelity can be exactly related to the average fidelity of the link. We supplemented this theoretical work with numerical simulations using the quantum network simulator NetSquid. From these simulations, we saw that network benchmarking works well in realistic environments. A natural next step would be to implement the network benchmarking protocol in real quantum networks, which are currently in development. On the theoretical side, it would be interesting to investigate further the use of network benchmarking as a tool for network discovery, integrating it as a subroutine in online routing algorithms for quantum networks30, which will have to take the quality of a network link into account when making routing decisions. Furthermore, one can conceive of variations of the network benchmarking protocol that measure other quality parameters such as the unitarity31 or the fidelity of some fixed interleaving unitary32. These could be constructed by adapting the computational benchmarking protocols (unitarity31 and interleaved32 benchmarking) that estimate these quantities to the network setting.

Methods

Network fidelity

In this section, we will argue that the two-node network benchmarking protocol proposed in algorithm 1 yields an output related to the product of the fidelities of the maps ΛAB and ΛBA. This argument will easily generalize to the multi-node case. We will, for this section, assume that the network obeys the property of gate-independent noise. This means we assume that a gate GA at node A acts as \({{{\Lambda }}}_{A}({G}_{A}{\rho }_{A}{G}_{A}^{{\dagger} })\) and a gate GB at node B acts as \({{{\Lambda }}}_{B}({G}_{B}{\rho }_{B}{G}_{B}^{{\dagger} })\), with the channels ΛA and ΛB the same for all \(G\in {\mathbb{G}}\). The arguments given here are closely related to those for randomized benchmarking19,20, and consequently the assumptions we make can be relaxed significantly by adapting the more modern treatments of randomized benchmarking25,33,34 to the network benchmarking setting, but we will not pursue this here.

Consider the average outcome bm of an m-bounce sub-protocol (for some \(m\in {\mathbb{M}}\)), as given in algorithm 1. This average outcome can be written out as

$$\begin{array}{ll}{b}_{m}={\mathbb{E}}\Bigg(\,{{{\rm{Tr}}}}\Bigg[{{{\Lambda }}}_{A}^{M}(E)\Bigg[{{{\Lambda }}}_{A}[{\mathbb{1}}-P]{\Bigg(\mathop{\prod }\limits_{i = 1}^{m}{G}_{B}^{(i)}{G}_{A}^{(i)}\Bigg)}^{{\dagger} }\\ \qquad\,\,\,\times\, {{{\Lambda }}}_{B\to A}{{{\Lambda }}}_{B}{G}_{B}^{(m)}{{{\Lambda }}}_{A\to B}\cdots {{{\Lambda }}}_{B}\\ \qquad\,\,\,\times \,{G}_{B}^{(1)}{{{\Lambda }}}_{A\to B}{{{\Lambda }}}_{A}{G}_{A}^{(1)}\Bigg](\rho )\Bigg]\,\Bigg),\end{array}$$
(9)

where the average is taken independently over \({G}_{A}^{(1)},\ldots ,{G}_{B}^{(m)}\). We can rewrite this quantity into something more manageable. Note first that, by linearity and independence, we can move the average over \({G}_{B}^{(m)}\) into the trace. Here we recognize the twirl operator

$$T({{{\Lambda }}}_{A\to B}{{{\Lambda }}}_{A})=\frac{1}{| {\mathbb{G}}| }\mathop{\sum}\limits_{{G}_{B}^{(m)}\in {\mathbb{G}}}{{G}_{B}^{(m)}}^{{\dagger} }{{{\Lambda }}}_{A\to B}{{{\Lambda }}}_{A}{G}_{B}^{(m)}$$
(10)

Now we can use the fact that \({\mathbb{G}}\) is a two-design to conclude that this twirl operator \(T({{{\Lambda }}}_{A\to B}{{{\Lambda }}}_{A})\) is a depolarizing channel with depolarizing fidelity \(f({{{\Lambda }}}_{A\to B}{{{\Lambda }}}_{A})\)23. Further, using the fact that a depolarizing channel commutes with unitary operations, we can perform this same trick for the remaining random gates \({G}_{A}^{(1)},\ldots ,{G}_{B}^{(m-1)},{G}_{A}^{(m)}\) to obtain

$$\begin{array}{ll}{b}_{m}={{{\rm{Tr}}}}\left({{{\Lambda }}}_{A}^{M}(E){\left[T({{{\Lambda }}}_{A\to B}{{{\Lambda }}}_{A})T({{{\Lambda }}}_{B\to A}{{{\Lambda }}}_{B})\right]}^{m}\right.\\ \qquad\,\,\,\left.({{{\Lambda }}}_{A}^{SP}({\rho }_{A})-P{{{\Lambda }}}_{A}^{SP}({\rho }_{A}){P}^{{\dagger} })\right).\end{array}$$
(11)
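The twirling step used above can be checked numerically for a single qubit. The sketch below evaluates the twirl of eq. (10) over the single-qubit Clifford group (a unitary two-design) for a dephasing channel and compares it with a depolarizing channel. The dephasing channel, its parameter `p`, the test state, and the function names are assumptions made for illustration; for this channel the depolarizing fidelity works out to 1 − 4p/3.

```python
import numpy as np

def single_qubit_cliffords():
    """Generate the single-qubit Clifford group (modulo global phase) from H and S."""
    H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
    S = np.array([[1, 0], [0, 1j]], dtype=complex)
    group = [np.eye(2, dtype=complex)]
    frontier = list(group)
    while frontier:
        fresh = []
        for U in frontier:
            for g in (H, S):
                V = g @ U
                # |Tr(W^dagger V)| = 2 exactly when V equals W up to a phase
                known = any(abs(abs(np.trace(W.conj().T @ V)) - 2) < 1e-9 for W in group)
                if not known:
                    group.append(V)
                    fresh.append(V)
        frontier = fresh
    return group

def dephasing(rho, p):
    """Illustrative non-depolarizing channel: apply Z with probability p."""
    Z = np.diag([1.0, -1.0]).astype(complex)
    return (1 - p) * rho + p * Z @ rho @ Z

def twirl(channel, gates, rho):
    """Eq. (10) applied to a state: (1/|G|) sum_G G^dagger channel(G rho G^dagger) G."""
    out = np.zeros((2, 2), dtype=complex)
    for G in gates:
        out += G.conj().T @ channel(G @ rho @ G.conj().T) @ G
    return out / len(gates)

p = 0.1
f = 1 - 4 * p / 3                                   # depolarizing fidelity of the dephasing channel
rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
gates = single_qubit_cliffords()
twirled = twirl(lambda r: dephasing(r, p), gates, rho)
expected = f * rho + (1 - f) * np.eye(2) / 2        # depolarizing channel with parameter f
print(np.allclose(twirled, expected))
```

The twirled channel keeps the fidelity of the original channel but discards all other structure, which is why the decay in network benchmarking depends on the links only through their depolarizing fidelities.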

Next we note that \({{{\rm{Tr}}}}({{{\Lambda }}}_{A}^{SP}({\rho }_{A})-P{{{\Lambda }}}_{A}^{SP}({\rho }_{A}){P}^{{\dagger} })=0\) (by cyclicity of the trace). Together with the fact that \({[T({{{\Lambda }}}_{A\to B}{{{\Lambda }}}_{A})T({{{\Lambda }}}_{B\to A}{{{\Lambda }}}_{B})]}^{m}\) is a depolarizing channel with depolarizing fidelity \({[f({{{\Lambda }}}_{A\to B}{{{\Lambda }}}_{A})f({{{\Lambda }}}_{B\to A}{{{\Lambda }}}_{B})]}^{m}\) this allows us to conclude that

$${b}_{m}={{{\rm{Tr}}}}\left[\left({{{\Lambda }}}_{A}^{M}(E)-{P}^{{\dagger} }{{{\Lambda }}}_{A}^{M}(E)P\right){{{\Lambda }}}_{A}^{SP}({\rho }_{A})\right]{\left[f({{{\Lambda }}}_{A\to B}{{{\Lambda }}}_{A})f({{{\Lambda }}}_{B\to A}{{{\Lambda }}}_{B})\right]}^{m}.$$

Hence we can obtain an estimate of the product \(f({{{\Lambda }}}_{A\to B}{{{\Lambda }}}_{A})f({{{\Lambda }}}_{B\to A}{{{\Lambda }}}_{B})\) by fitting the data \({\{{b}_{m}\}}_{m\in {\mathbb{M}}}\) to the formula

$${b}_{m}{ = }_{{{{\rm{fit}}}}}{A}_{{{{\rm{SPAM}}}}}{f}^{m}.$$
(12)

We can make a similar argument for the multi-node protocol, where we conclude that the average data \({\{{b}_{m}\}}_{m\in {\mathbb{M}}}\) can be described as

$${b}_{m}{ = }_{{{{\rm{fit}}}}}{A}_{{{{\rm{SPAM}}}}}{\left[{f}_{{A}_{1}{A}_{2}}\ldots {f}_{{A}_{K-1}{A}_{K}}{f}_{{A}_{K}{A}_{K-1}}\ldots {f}_{{A}_{2}{A}_{1}}\right]}^{m},$$

with \({f}_{{A}_{i-1}{A}_{i}}=f({{{\Lambda }}}_{{A}_{i-1}\to {A}_{i}}{{{\Lambda }}}_{{A}_{i-1}})\) where \({{{\Lambda }}}_{{A}_{i-1}}\) is the quantum channel modeling (gate-independent) local noise in the node Ai−1, and similarly for \({f}_{{A}_{i}{A}_{i-1}}\).

Symmetric fidelity and teleportation

The two-node network benchmarking protocol gives an estimate of the product of the depolarizing fidelities of the channels ΛAB and ΛBA modeling the links between node A and node B (up to local operation noise). However, in some relevant cases, the channels ΛBA and ΛAB have equal average fidelity (and thus depolarizing fidelity), in which case this average fidelity is directly accessible through network benchmarking. Here we discuss one important case where this is true, namely when the channels ΛAB and ΛBA are implemented through the quantum teleportation protocol using some pre-prepared entangled state ρAB (note that this is not necessarily a perfect maximally entangled state) between nodes A and B. Concretely we will prove the following lemma:

Lemma 1

Let ΛAB be the quantum channel implemented by teleportation using a state ρAB as a resource, and let ΛBA be the quantum channel implemented by teleportation using a state ρBA as a resource. If the local operations used in the teleportation process are noiseless, then F(ΛAB) = F(ΛBA).

Proof

We begin by noting that the average fidelity of any quantum channel Λ is related to its entanglement fidelity \(\left\langle {{\Phi }}\right|{\mathbb{1}}\otimes {{\Lambda }}({{\Phi }})\left|{{\Phi }}\right\rangle\) where Φ is the maximally entangled state, as (from ref. 23)

$$F({{\Lambda }})=\frac{d(\left\langle {{\Phi }}\right|{\mathbb{1}}\otimes {{\Lambda }}({{\Phi }})\left|{{\Phi }}\right\rangle )+1}{d+1}=\frac{d{F}_{e}({{\Lambda }})+1}{d+1}.$$
(13)

Next, we use a result from ref. 35: eq. (25) therein states that the entanglement fidelity of a channel ΛAB induced by teleportation (with perfect local operations) with a state ρAB is equal to the singlet fraction \({F}_{s}({\rho }_{AB})=\left\langle {{\Phi }}\right|{\rho }_{AB}\left|{{\Phi }}\right\rangle\) of the state ρAB. Similarly we have that Fe(ΛBA) = Fs(ρBA). Now noting that the singlet fraction is invariant under the interchange of A and B we have Fe(ΛAB) = Fs(ρAB) = Fs(ρBA) = Fe(ΛBA) and thus F(ΛAB) = F(ΛBA), which proves the lemma.

In this case, we can thus connect the network fidelity f, as measured by the two-node protocol, to the average fidelity of the network links A → B and B → A (assuming negligible contributions from local noise). We have

$$\begin{array}{ll}\sqrt{f}=\sqrt{f({{{\Lambda }}}_{A\to B})f({{{\Lambda }}}_{B\to A})}=f({{{\Lambda }}}_{A\to B})=f({{{\Lambda }}}_{B\to A})\\\quad\,\,\,\;=\,\frac{d{F}_{{{{\rm{avg}}}}}({{{\Lambda }}}_{A\to B})\,-\,1}{d\,-\,1},\end{array}$$
(14)

where we used eq. (2).
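Read in the other direction, eq. (14) converts a fitted network fidelity into an average link fidelity. The sketch below does this under the same assumptions as the text (symmetric links and negligible local noise); the function name and the input value are hypothetical.

```python
import math

def link_average_fidelity(f_network, d=2):
    """Invert eq. (14): sqrt(f) is the per-link depolarizing fidelity,
    which eq. (2) maps to the average fidelity F = ((d - 1) sqrt(f) + 1) / d."""
    if not 0 <= f_network <= 1:
        raise ValueError("expected a network fidelity between 0 and 1")
    f_link = math.sqrt(f_network)
    return ((d - 1) * f_link + 1) / d

# Hypothetical fitted value: f = 0.81 gives a per-link depolarizing fidelity of 0.9.
F_link = link_average_fidelity(0.81)
print(F_link)
```

If local gate noise is not negligible, the value obtained this way is a lower estimate of the link quality, since the gate noise is absorbed into f.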

Simulation results

In this section, we discuss the results of a simulation of network benchmarking on a model network using the quantum network simulator NetSquid21. NetSquid is an advanced discrete event simulator that allows for the testing of quantum network properties in realistic circumstances, taking into account noisy operations and state preparation and measurement errors, but also issues specific to networks such as delay-induced decoherence, packet loss, and protocol timing issues. The code that generates the results below can be found in ref. 36. The goal of this section is to show how network benchmarking can be applied in practice. To this end, we have constructed two different simulations inspired by real-world scenarios. The first simulation investigates the behavior of two-node network benchmarking in a scenario where two network nodes are connected by network links implemented by teleportation, and the second investigates the use of the multi-node protocol as an efficient method to detect the decay of quality as the number of nodes in a path increases. For both these simulations we choose the local gateset to be the single-qubit Clifford group \({{\mathbb{C}}}_{1}\). For both simulations, we also specify a noise model that is an abstracted and simplified version of the noise present in networks based on NV-centers29. We, however, emphasize that our intent is not to produce a detailed physical simulation of networks of this form (we do not take into account e.g., waiting times and non-deterministic entanglement creation), but rather to gain an intuition for the behavior of the network benchmarking protocol. Specifically, in both simulations, we will model physical T1 and T2 noise affecting qubits in the network nodes, model the quantum network links with a quantum channel emulating state transfer through teleportation (explained in more detail below), and omit other imperfections.

Teleportation-mediated link between two nodes

Figure 2a shows the outputs of two-node network benchmarking, as simulated in Netsquid. The links connecting A and B are here modeled by teleportation using a noisy entangled state of the form

$${\rho }_{AB}=\alpha \left|{{\Phi }}\right\rangle \,\left\langle {{\Phi }}\right|+(1-\alpha )\left|00\right\rangle \,\left\langle 00\right|,$$
(15)

where \(\left|{{\Phi }}\right\rangle\) is again the maximally entangled state and α is the bright state population of the qubit at the NV network node before entanglement generation. This state arises as a well-motivated model of single-photon heralded entanglement generation in NV-centers29. For our simulation, we choose a bright state population of α = 0.95 (slightly different from the value in ref. 29). Moreover, we model the qubits in the local nodes as being afflicted by standard T2 dephasing noise, with relevant values for 13C memory qubits in NV-center quantum processors being T2 = 12 ms (we technically also include T1 amplitude damping noise; however, this is not a critical factor in NV-centers37). Correspondingly we assume that applying native quantum operations on these memory qubits takes 39 μs (see Fig. 5 in ref. 37 for the above numbers). We note that since some of the gates in the single-qubit Clifford group must be compiled out of native operations, this is not a gate-independent noise model. The data in Fig. 2a is generated by running the two-node network benchmarking protocol for 40 random sequences for each number of bounces m (ranging from 1 to 20). NetSquid tracks density matrices, and can thus calculate the mean outcome for a random sequence directly. We add Gaussian noise to the data to simulate shot-noise for 4000 measurements per random sequence. The mean outcome for each random sequence of local gates is shown in light blue, and the average over all sequences is shown in dark blue. From the exponential decay fit, we obtain f = 0.899 ± 0.004 (95% Studentized confidence interval from the fit), which is in line with a fidelity dominated by the quality of the teleportation procedure.
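The data-processing step described above (ideal sequence means, added Gaussian shot-noise, exponential fit) can be sketched as follows. This is not the simulation code of ref. 36: it replaces the NetSquid model by an ideal decay A f^m, and the values f_true = 0.9 and A = 0.5, the worst-case shot-noise variance 1/(4 · shots), and the log-linear least-squares fit are all assumptions made for illustration.

```python
import numpy as np


def simulate_and_fit(f_true=0.9, A=0.5, m_max=20, n_seq=40, shots=4000, seed=0):
    """Generate noisy decay data A * f^m and fit f and A (illustrative only)."""
    rng = np.random.default_rng(seed)
    m = np.arange(1, m_max + 1)                 # number of bounces, 1..m_max
    ideal = A * f_true ** m                     # ideal mean outcome per m

    # Assumption: shot-noise on each sequence mean is Gaussian with the
    # worst-case binomial standard deviation sqrt(1 / (4 * shots)).
    sigma = np.sqrt(0.25 / shots)
    per_sequence = ideal[:, None] + rng.normal(0.0, sigma, size=(m.size, n_seq))

    # Average over the random sequences for each m.
    averages = per_sequence.mean(axis=1)

    # Log-linear least squares: log(A f^m) = log(A) + m * log(f).
    slope, intercept = np.polyfit(m, np.log(averages), 1)
    return float(np.exp(slope)), float(np.exp(intercept))


f_hat, A_hat = simulate_and_fit()
print(f"fitted f = {f_hat:.4f}, fitted A = {A_hat:.4f}")
```

A weighted or nonlinear fit would be a natural refinement, since the log transform amplifies the noise at large m; the unweighted version is used here only to keep the sketch short.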

Fig. 2: Network benchmarking simulations using NetSquid.

a Simulation in Netsquid of the two-node network benchmarking protocol (algorithm 1). The nodes A and B hold qubits afflicted by dephasing (T2) errors, as seen in NV-center quantum processors37. The channels ΛAB and ΛBA emulate teleportation with a noisy quantum state of the form eq. (15), with a bright state population of α = 0.95. The error bar on the decay rate is a 95% Studentized confidence interval constructed from the fit. b Simulation in Netsquid of the multi-node network benchmarking protocol (going from two (line markers) to six (hexagon markers) nodes in a linear configuration). The noise models for network links and node operations are as before. The error bars in the inset are 95% Studentized confidence intervals constructed from empirical variances. We observe that the network fidelity decays exponentially with the number of nodes.

Teleportation-mediated links between multiple nodes

Figure 2b shows the outputs of multi-node network benchmarking, as simulated in NetSquid. In this simulation, we performed a multi-node network benchmark on n nodes in a linear configuration, where n ranges between 2 and 6, with the links connecting the nodes modeled again by teleportation using the same parameters as before. The data in Fig. 2b is generated by running the n-node network benchmarking protocol for 40 random sequences for each number of bounces m (ranging from 1 to 9). From this, we can infer that the network fidelity decreases from 0.899 ± 0.04 at two nodes (line markers), to 0.56 ± 0.02 at six nodes (hexagon markers, 95% Studentized confidence interval from the fit). We note that empirically the network fidelity decays exponentially with the number of nodes (see inset in Fig. 2b). This points to a potential use of network benchmarking as a network discovery tool, in this case, to give heuristic estimates of upper limits on the distance quantum information can travel through a network before degrading, without having to necessarily explore the whole network.
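To illustrate the kind of heuristic estimate meant here, the sketch below fits a pure exponential f(n) = c · q^n through the two network fidelities quoted above (0.899 at two nodes, 0.56 at six nodes) and extrapolates to the path length at which the fidelity would cross a threshold. The exponential model, the use of only two data points, the neglect of the confidence intervals, and the threshold of 0.5 are all assumptions made for illustration.

```python
import math


def per_node_decay(n1, f1, n2, f2):
    """Decay factor q of the model f(n) = c * q**n through two data points."""
    return (f2 / f1) ** (1.0 / (n2 - n1))


def max_nodes(n1, f1, q, threshold):
    """Largest (real-valued) n with f(n) >= threshold under the same model."""
    return n1 + math.log(threshold / f1) / math.log(q)


# Values quoted in the text for two and six nodes.
q = per_node_decay(2, 0.899, 6, 0.56)
# Hypothetical threshold: network fidelity of at least 0.5.
n_limit = max_nodes(2, 0.899, q, threshold=0.5)
print(f"decay factor per added node: {q:.3f}")
print(f"extrapolated path length at threshold: {n_limit:.2f} nodes")
```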

Statistics of network benchmarking

In this section, we analyze the finite sampling properties of the network benchmarking protocol. This analysis will resemble earlier statistical analyses of standard randomized benchmarking24,38, with one key difference. In standard analyses, the accuracy of the fidelity estimate is given as a function of the number of measurements that must be performed. This ignores that some measurements might be more expensive to perform than others. In particular, one typically assumes that it is not more costly to obtain a sample from a long sequence of gates than it is to obtain a sample from a short sequence of gates.

In network benchmarking, however, this assumption is no longer reasonable, as the cost of transmitting a qubit over a long distance will be the dominant factor in the cost of a sample. Hence sampling a sequence containing m bounces (as specified in algorithm 1 and algorithm 2) will be approximately m times as expensive as sampling a sequence containing only one bounce. This means it is more appropriate to estimate the accuracy of the fidelity estimate produced by algorithm 1 as a function of the number of bounces. Taking this cost into account has strong consequences for the statistical properties of network benchmarking. In particular, we will argue that we cannot achieve “multiplicative accuracy” for the estimation of fidelity when taking the number of state transmissions as a cost metric. However, as seen in the simulations in section IV C, network benchmarking achieves good statistical accuracy for reasonable resource use in practice. Moreover, in the immediate future, network fidelities are expected to be reasonably low (in the 90–99% regime), so additive, and not multiplicative, accuracy is enough for practical purposes.
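The difference between the two cost metrics can be made concrete with a short count. The sketch below compares the number of samples with the number of bounces for an experiment with the parameters of the two-node simulation (40 sequences and 4000 measurements per sequence for each m from 1 to 20), under the assumption that every measurement is a separate run of the sequence; the function names are hypothetical.

```python
def total_samples(m_values, n_seq, shots):
    """Number of measured samples: one per sequence per shot per value of m."""
    return len(m_values) * n_seq * shots


def total_bounces(m_values, n_seq, shots):
    """Number of bounces: a sample from a length-m sequence costs m bounces."""
    return n_seq * shots * sum(m_values)


m_values = range(1, 21)
samples = total_samples(m_values, n_seq=40, shots=4000)
bounces = total_bounces(m_values, n_seq=40, shots=4000)
print(f"samples: {samples}")
print(f"bounces: {bounces}")
print(f"average bounces per sample: {bounces / samples}")
```

In this count the bounce cost exceeds the sample count by the mean sequence length, which is why the choice of sequence lengths matters once transmissions are the dominant cost.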

Relative accuracy estimation

One of the main selling points of standard randomized benchmarking is its ability to estimate the infidelity r = 1 − f, where f is the depolarizing fidelity measured by randomized benchmarking, to multiplicative precision. This means the estimator \(\hat{r}\) is distributed around its true value r with variance \(O({r}^{2})\)24,38, which means that estimation in the high fidelity regime (\(r\ll 1\)) is not more costly than estimation in the low fidelity regime. We will argue here that this behavior is critically dependent on the assumption that the cost of obtaining samples from a given gate sequence in a (network) benchmarking experiment is independent of the sequence length. As discussed above, this is a reasonable assumption for standard randomized benchmarking but not so much for network benchmarking. We point out, however, that the argument below works just as well for standard randomized benchmarking if one takes the number of gates implemented as a cost function (as opposed to the number of samples collected). The argument below is not strictly rigorous as we will be making standard statistical assumptions such as the normality of distributions, but we expect it can be made rigorous with sufficient work.

In two-node network benchmarking, we can define the network infidelity as r = 1 − f. Network benchmarking constructs an estimator \(\hat{r}\) for r by sampling the decay function \(A{f}^{m}\) for different sequence lengths m and then fitting an exponential through the resulting averages. Without loss of generality, we can assume the parameter A to be known, as perfect knowledge of a parameter in an estimation problem will never increase the difficulty of estimating another parameter. Now our goal is to give a lower bound on the estimation cost of f, given samples from distributions D(f, m) with mean \(A{f}^{m}\) and variance \({\mathbb{V}}(f,m)\). This distribution D(f, m) is the distribution sampled by executing steps 2−17 in algorithm 1. We will make an argument using the Cramer–Rao bound, which states that the variance of any unbiased estimator of f must be at least the inverse of the Fisher information, defined as

$$I(f)=\frac{{A}^{2}{m}^{2}{f}^{2m-2}}{{\mathbb{V}}(f,m)},$$
(16)

for some fixed m, where we assumed that D(f, m) is a Gaussian. This is a reasonable assumption since D(f, m) is defined as the distribution of the mean of many independent random variables. The central parameter that determines the Fisher information and, thus, the estimation cost is the variance \({\mathbb{V}}(f,m)\) of D(f, m). By the law of total variance, we decompose the variance of D(f, m) into three contributions

$${\mathbb{V}}(f,m)={{\mathbb{V}}}_{g}(f,m)+{{\mathbb{V}}}_{{\mathrm{meas}}}(f,m)+{{\mathbb{V}}}_{{\mathrm{diff}}}(f,m),$$
(17)

where \({{\mathbb{V}}}_{g}(f,m)\) is the variance due to the randomness in selecting a sequence of gates, \({{\mathbb{V}}}_{{\mathrm{meas}}}\) is the variance due to the estimation of the probability \(p(\overrightarrow{G})\) (this is often called shot-noise in the experimental literature) and \({{\mathbb{V}}}_{{\mathrm{diff}}}\) is the variance associated with the random choice of Pauli operator at the end of each sequence. \({{\mathbb{V}}}_{g}(f,m)\) and \({{\mathbb{V}}}_{{\mathrm{meas}}}(f,m)\) have dependencies on both f and m, making analysis difficult. However, we can lower bound both by zero (which never makes the inference task harder) and state that \({\mathbb{V}}(f,m)\ge {{\mathbb{V}}}_{{\mathrm{diff}}}(f,m)=1/(4\cdot 2)\), where the factor of 1/2 is due to the division by one-half in step 17 of the two-node protocol algorithm 1. Since the variance appears in the denominator of the Fisher information, we can thus upper bound the Fisher information I(f, m) of f in the distribution D(f, m) as

$$I(f,m)\le 8{A}^{2}{m}^{2}{f}^{2m-2}.$$
(18)

This is the Fisher information associated with a single sample at fixed m. We can consider the Fisher information per unit of sampling cost (which grows linearly with the sequence length m) by dividing by m, to get

$${I}_{{\mathrm{cost}}}(m,f)\le 8{A}^{2}m{f}^{2m-2}.$$
(19)

Now given that we want to lower bound the variance of the estimator, we are interested in the maximum of Icost(m, f) over m. It can be easily seen that this function has a unique maximum at \(m=\frac{-1}{2\log (f)}\). This means the maximal Fisher information is

$${I}_{{\mathrm{cost,max}}}(f)\le \frac{-4{A}^{2}}{\log (f)}{f}^{-1/\log (f)-2}.$$
(20)
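The location of the maximum can be checked numerically. The sketch below evaluates the right-hand side of eq. (19) on a grid of (real-valued) sequence lengths, compares the grid maximizer with \(m=-1/(2\log (f))\), and reports the maximal value multiplied by r for a few infidelities. The value A = 0.5 and the grid ranges are placeholders chosen for illustration.

```python
import numpy as np


def cost_information_bound(m, f, A=0.5):
    """Right-hand side of eq. (19): 8 A^2 m f^(2m - 2)."""
    return 8 * A**2 * m * f ** (2 * m - 2)


def optimal_m(f):
    """Stationary point of m * f^(2m - 2) in m: m = -1 / (2 log f)."""
    return -1.0 / (2.0 * np.log(f))


for r in (1e-1, 1e-2, 1e-3):
    f = 1 - r
    # Grid of real-valued sequence lengths, wide enough to contain the maximum.
    m_grid = np.linspace(0.01, 20 / r, 200_001)
    m_numeric = m_grid[np.argmax(cost_information_bound(m_grid, f))]
    m_star = optimal_m(f)
    i_max = cost_information_bound(m_star, f)
    print(f"r={r:g}: grid argmax m={m_numeric:.2f}, "
          f"analytic m={m_star:.2f}, r*I_max={r * i_max:.4f}")
```

In this sketch the product of r and the maximal value stays roughly constant as r decreases, so the bound on the information per unit cost grows like 1/r.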

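Writing f = 1 − r and writing out the Mercator series for the logarithm, we can now see that Icost,max(f) = O(1/r). Through the Cramer–Rao bound, this implies that the variance of any unbiased estimator of r, per unit of sampling cost, is bounded from below as

$${\mathbb{V}}(\hat{r})={{\Omega }}(r),$$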
(21)

providing additive, but not relative, estimation accuracy.