Bootstrapping quantum process tomography via a perturbative ansatz

Quantum process tomography has become increasingly critical as the need grows for robust verification and validation of candidate quantum processors, since it plays a key role in both performance assessment and debugging. However, as these processors grow in size, standard process tomography becomes an almost impossible task. Here, we present an approach for efficient quantum process tomography that uses a physically motivated ansatz for an unknown quantum process. Our ansatz bootstraps to an effective description for an unknown process on a multi-qubit processor from pairwise two-qubit tomographic data. Further, our approach can inherit insensitivity to system preparation and measurement error from the two-qubit tomography scheme. We benchmark our approach using numerical simulation of noisy three-qubit gates, and show that it produces highly accurate characterizations of quantum processes. Further, we demonstrate our approach experimentally on a superconducting quantum processor, building three-qubit gate reconstructions from two-qubit tomographic data.

In light of these achievements, the need for robust, accurate, and efficient validation and verification of quantum processors becomes ever more pressing. This is the natural domain of quantum state tomography (QST) and quantum process tomography (QPT). Respectively, QST and QPT seek to characterize the state of a quantum processor or the dynamical map of its evolution [19]. Unfortunately, naïve implementations of both QST and QPT require measurement of a number of observables that scales exponentially in the number of qubits. Practically, this scaling has limited full QST and QPT to small system sizes, e.g. [20,21], though this can be improved using approximate characterizations [22,23], or in situations with large amounts of symmetry [24,25].
Further compounding the difficulty of QPT, the most error-prone operations are often system preparation and measurement (SPAM), which can overwhelm the intrinsic error in high-fidelity quantum processes and hinder their characterization. Several SPAM-insensitive metrics exist, such as the widely successful randomized benchmarking [26][27][28][29] and its variants [30][31][32][33][34][35][36], as well as gate-set tomography (GST) [37][38][39]. Randomized benchmarking has the additional benefit of overcoming the exponential scaling of standard QPT, but at the cost of returning only a single number characterizing the quantum process.
In this work, we present an approach to efficient QPT that reduces the exponential scaling to quadratic scaling, while still returning a full process matrix describing the quantum process. We propose the Pairwise Perturbative Ansatz (PAPA), which describes the unknown quantum process as sequential two-qubit processes on all qubit pairs. We show how to fit the free parameters of our ansatz to data obtained from QPT of two-qubit subsets of the full system. When this data is provided by SPAM-insensitive tomography, such as GST, our approach becomes SPAM-insensitive as well as efficient.
* luke.c.govia@raytheon.com
The paper is organized as follows. In section II we provide background information on QPT and compare PAPA to existing QPT protocols. In sections III and IV we describe PAPA in detail, and outline how to obtain the tomographic data needed for a PAPA characterization. In section V we benchmark the PAPA approach using numerical simulation, and finally in section VI we present our conclusions.

II. BACKGROUND
A generic $N$-qubit quantum process, which we label as $\mathcal{E}$, has $16^N - 4^N$ free parameters, and the goal of QPT is to determine these free parameters. This makes naïve QPT an exponentially hard problem, as an exponential number of measurement settings (unique observables) is required to determine the free parameters. Even for small to modest $N$ this scaling is practically unfavorable, and QPT is very challenging experimentally.
Process tomography can be rephrased as state tomography of the Choi dual-state (via the Choi-Jamiołkowski isomorphism), which is the state formed when the unknown process acts on one half of a maximally entangled state in a Hilbert space of dimension $2^{2N}$, given by

$$\rho_{\mathcal{E}} = \frac{1}{2^N}\sum_{\mu,\nu} |\psi_\mu\rangle\langle\psi_\nu| \otimes \mathcal{E}\left(|\psi_\mu\rangle\langle\psi_\nu|\right), \tag{1}$$

where $\{|\psi_\mu\rangle\}$ is an orthonormal basis for $N$-qubit Hilbert space. Thus, one can use efficient state tomography methods for process tomography, such as compressed sensing [22,40,41] and matrix-product-state (MPS) parameterizations [23,42-44]. Unfortunately, the matrix completion algorithms that underlie these approaches can themselves be inefficient in run-time. This issue can be circumvented using constrained approaches, as in Refs. [23,43], which restrict to pure state descriptions of the unknown quantum state.

Figure 1. a) Pairwise Perturbative Ansatz (PAPA) tomography: for all qubit pairs, characterize the effective two-qubit process (Choi state $\sigma_S$) when the unknown $N$-qubit process $\mathcal{E}$ occurs, and all other qubits start in the maximally mixed state. b) Three-qubit PAPA+GST: characterized two-qubit gate-sets are bootstrapped to a three-qubit gate-set via PAPA.
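As a concrete illustration of the Choi-state construction, the following minimal sketch builds the normalized Choi state of a channel specified by Kraus operators. The function name `choi_state`, the Kraus-operator input, and the amplitude-damping example are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def choi_state(kraus_ops):
    """Normalized Choi state (1/d) * sum_{mu,nu} |mu><nu| (x) E(|mu><nu|).

    The channel E is assumed to be given by a list of d x d Kraus operators.
    """
    d = kraus_ops[0].shape[1]
    rho = np.zeros((d * d, d * d), dtype=complex)
    for mu in range(d):
        for nu in range(d):
            e = np.zeros((d, d), dtype=complex)
            e[mu, nu] = 1.0                      # matrix unit |mu><nu|
            out = sum(K @ e @ K.conj().T for K in kraus_ops)
            rho += np.kron(e, out)               # reference half (x) output half
    return rho / d

# Hypothetical example: single-qubit amplitude damping with decay probability gamma.
gamma = 0.1
K0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex)
K1 = np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)
rho_choi = choi_state([K0, K1])
```

For an $N$-qubit process the same function applies with $d = 2^N$, which makes the exponential growth of the object to be characterized explicit: the Choi state is a $4^N \times 4^N$ matrix.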
Both compressed sensing and MPS parameterizations implicitly assume an ansatz for the unknown quantum process, that it is either low rank, or has a matrix product structure (and thus correlations are not long range) respectively. Our pairwise perturbative ansatz assumes a different physical constraint on the unknown process: that it is intrinsically built from two-qubit processes on all pairs of qubits. Like the MPS approach, this implies that few-body QPT is sufficient to find a PAPA characterization of the unknown process. Unlike an MPS, PAPA has no locality constraint on correlations, and allows for long-range correlations, though these come about only via local interactions between qubit pairs. Further, we will see in the next section that the PAPA constraint is physically motivated, unlike the low rank restriction of compressed sensing.

III. ANSATZ FOR PROCESS TOMOGRAPHY
We propose to restrict the unknown Choi state by assuming an ansatz for its form. This restricts the number of free parameters in the unknown process a priori, and therefore restricts the number of measurement settings required.
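We will assume an ansatz where the unknown $N$-qubit process is written as a composition of two-qubit processes, consisting of quantum processes for each qubit pair in the system. This is most easily expressed in terms of the super-operator matrix representation $\hat{\mathcal{E}}$ of the quantum process $\mathcal{E}$, as the series composition becomes a product of matrices. This has the general form

$$\hat{\mathcal{E}} = \prod_{n=1}^{N-1}\prod_{k=1}^{N-n} \hat{\mathcal{E}}_{k,n+k}, \tag{2}$$

where $\mathcal{E}_{k,n+k}$ is an arbitrary two-qubit process on qubits $k$ and $(k+n)$ with no restrictions. The product runs over all pairs of qubits, of which there are $(N^2-N)/2$. Each of the unknown two-qubit processes can be written as

$$\hat{\mathcal{E}}_{k,n+k} = \sum_{i_{k,n},\,j_{k,n}} \chi^{j_{k,n}}_{i_{k,n}}\; \hat{\mathcal{I}}^{\otimes (k-1)} \otimes \hat{\Lambda}_{i_{k,n}} \otimes \hat{\mathcal{I}}^{\otimes (n-1)} \otimes \hat{\Lambda}_{j_{k,n}} \otimes \hat{\mathcal{I}}^{\otimes (N-n-k)}, \tag{3}$$

where $\{\hat{\Lambda}_{i_{k,n}}\}$ is a complete basis for single-qubit processes and $\mathcal{I}$ is the identity process. $\chi^{j_{k,n}}_{i_{k,n}}$ is an element of the $\chi$-matrix describing the two-qubit process, and the summation variables $i_{k,n}$ and $j_{k,n}$ are subscripted to emphasize that they correspond to a particular qubit pair.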
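The sequential pairwise structure of the ansatz can be sketched as follows. For simplicity this sketch represents each two-qubit process by a list of Kraus operators rather than by a $\chi$-matrix, and it applies the pairs in the order produced by `itertools.combinations`; both choices, and the names `embed_two_qubit` and `apply_papa`, are assumptions made for illustration only.

```python
import itertools
import numpy as np

def embed_two_qubit(op, a, b, n_qubits):
    """Embed a 4x4 operator acting on qubits a < b (0-indexed) into n_qubits."""
    op4 = op.reshape(2, 2, 2, 2)                 # axes: (out_a, out_b, in_a, in_b)
    full = np.eye(2 ** n_qubits, dtype=complex).reshape([2] * (2 * n_qubits))
    # Contract the operator's input legs with the row legs of qubits a and b.
    full = np.tensordot(op4, full, axes=([2, 3], [a, b]))
    full = np.moveaxis(full, [0, 1], [a, b])     # put the new legs back in place
    return full.reshape(2 ** n_qubits, 2 ** n_qubits)

def apply_papa(rho, kraus_by_pair, n_qubits):
    """Apply one two-qubit channel per qubit pair, in sequence."""
    for a, b in itertools.combinations(range(n_qubits), 2):
        ops = [embed_two_qubit(K, a, b, n_qubits) for K in kraus_by_pair[(a, b)]]
        rho = sum(K @ rho @ K.conj().T for K in ops)
    return rho

# Hypothetical example: an ideal CNOT on qubits (0, 1), identity on the other pairs.
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
I4 = np.eye(4, dtype=complex)
kraus_by_pair = {(0, 1): [CNOT], (0, 2): [I4], (1, 2): [I4]}

rho_in = np.zeros((8, 8), dtype=complex)
rho_in[4, 4] = 1.0                               # the state |100><100|
rho_out = apply_papa(rho_in, kraus_by_pair, 3)   # expected to be |110><110|
```

The loop visits $(N^2-N)/2$ pairs, which is the source of the quadratic parameter scaling of the ansatz.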
There are many possible ansätze for an unknown quantum process [22,23,40-44], but the form we have chosen is particularly well motivated physically. As it is the composition of two-qubit processes in sequence, it captures the natural two-body quantum operations that occur in a gate-based quantum computation. It can completely specify any ideal gate operation (single-layer quantum circuit built from one- and two-qubit gates), and will contain both single-qubit errors and correlated two-qubit errors as independent free parameters. It also describes processes that involve more than two qubits, but as combinations of two-qubit processes performed in sequence. Thus, it describes general processes in a perturbative fashion, built from one- and two-qubit processes.
While each arbitrary two-qubit process described by Eq. (3) is parameterized in terms of a basis with $16^2$ elements, its $\chi$-matrix has only $16^2 - 4^2 = 240$ free parameters. There are $\binom{N}{2} = (N^2-N)/2$ two-qubit subsets, and so the total number of free parameters in our ansatz is $120(N^2-N)$. As this scales quadratically with qubit number, PAPA is an efficient approach to QPT.
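The parameter counts quoted in the text can be tabulated directly. The sketch below simply evaluates the two formulas; the function names are illustrative.

```python
def full_qpt_parameters(n_qubits):
    """Free parameters of a generic n-qubit process: 16^N - 4^N."""
    return 16 ** n_qubits - 4 ** n_qubits

def papa_parameters(n_qubits):
    """Free parameters of the pairwise ansatz: 240 per qubit pair."""
    pairs = n_qubits * (n_qubits - 1) // 2
    return 240 * pairs

for n in (2, 3, 5, 10):
    print(n, full_qpt_parameters(n), papa_parameters(n))
```

For $N = 2$ the two counts coincide at 240, as they must, while for $N = 3$ the ansatz has 720 parameters against 4032 for a generic process.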
QPT with PAPA consists of determining the χ-matrix for each two-qubit process in the product in Eq. (2). Inspired by the local tomography used in [23,43], we will use the tomographic characterization of two-qubit processes on all pairs of qubits to determine these free parameters. In essence, from characterization of two-body processes, we bootstrap to a multi-qubit process of PAPA form.
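To compare the PAPA ansatz to two-qubit tomographic data, we must determine a notion of a two-qubit reduction of a process $\mathcal{E}$. This is most easily done in terms of the Choi state $\rho_{\mathcal{E}}$. For the two-qubit subset $S = \{m, p\}$ this takes the form

$$\rho_S = \mathrm{Tr}_{/S}\left[\rho_{\mathcal{E}}\right], \tag{4}$$

where by $\mathrm{Tr}_{/S}[\rho]$ we mean the partial trace of all qubits other than those in the set $S$, and it is important to note that the partial trace is applied to both "parts" of the Choi state. Using the orthogonality of the $N$-qubit basis, we see that

$$\mathrm{Tr}_{/S}\left[|\psi_\mu\rangle\langle\psi_\nu|\right] = \delta_{\mu_{/S},\nu_{/S}}\, |\psi_{\mu_S}\rangle\langle\psi_{\nu_S}|, \tag{5}$$

where the indices $\mu_S$ ($\mu_{/S}$) are the subset of indices in $\mu$ that correspond to the qubits inside (outside) of the subset $S$. Thus, the reduced Choi state of the unknown process can be written as

$$\rho_S = \frac{1}{4}\sum_{\mu_S,\nu_S} |\psi_{\mu_S}\rangle\langle\psi_{\nu_S}| \otimes \mathrm{Tr}_{/S}\left[\mathcal{E}\left(|\psi_{\mu_S}\rangle\langle\psi_{\nu_S}| \otimes \frac{I_{N-2}}{2^{N-2}}\right)\right], \tag{6}$$

where $I_{N-2}$ is the identity matrix of dimension $2^{N-2}$.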
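The equivalence between tracing out the spectator qubits from both halves of the Choi state and characterizing the effective two-qubit process with spectators in the maximally mixed state can be checked numerically. The following sketch does so for a hypothetical three-qubit unitary (two CNOTs in sequence) and the pair $S = \{0, 1\}$ in 0-indexed labelling; the helper names and the example process are assumptions for illustration.

```python
import numpy as np

def choi_state(channel, d):
    """(1/d) * sum_{mu,nu} |mu><nu| (x) channel(|mu><nu|) for a d-dimensional input."""
    rho = np.zeros((d * d, d * d), dtype=complex)
    for mu in range(d):
        for nu in range(d):
            e = np.zeros((d, d), dtype=complex)
            e[mu, nu] = 1.0
            rho += np.kron(e, channel(e))
    return rho / d

def partial_trace(rho, keep, n):
    """Trace out every qubit not listed in keep from an n-qubit density matrix."""
    t = rho.reshape([2] * (2 * n))
    for q in sorted(set(range(n)) - set(keep), reverse=True):
        m = t.ndim // 2                      # qubits still present
        t = np.trace(t, axis1=q, axis2=q + m)
    m = t.ndim // 2
    return t.reshape(2 ** m, 2 ** m)

I2 = np.eye(2, dtype=complex)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
U = np.kron(I2, CNOT) @ np.kron(CNOT, I2)    # hypothetical three-qubit process

def full_process(rho):
    return U @ rho @ U.conj().T

def effective_process(rho_s):
    """Process seen by qubits 0 and 1 when qubit 2 starts maximally mixed."""
    rho_in = np.kron(rho_s, I2 / 2)
    return partial_trace(full_process(rho_in), keep=[0, 1], n=3)

# The Choi state of a three-qubit process is a six-qubit state:
# qubits 0-2 form the reference half, qubits 3-5 the output half.
rho_E = choi_state(full_process, 8)
rho_S = partial_trace(rho_E, keep=[0, 1, 3, 4], n=6)   # trace qubit 2 from both halves
sigma_S = choi_state(effective_process, 4)
print(np.allclose(rho_S, sigma_S))
```

The two routes agree because summing the reference-half projectors over the spectator index produces the identity on the spectator input, which after normalization is the maximally mixed state.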
To determine the free parameters in the PAPA ansatz, for each pair of qubits we compare the two-qubit reduced Choi states described by Eq. (6) to the corresponding experimentally characterized two-qubit Choi state. Operationally, this amounts to performing two-qubit QPT on the $(N^2-N)/2$ pairs of qubits. Each of the pairwise characterized two-qubit processes is described by $16^2 - 4^2 = 240$ complex numbers, which gives a total of $120(N^2-N)$ complex numbers describing the two-qubit process characterization of all pairs of qubits.
Thus, we have exactly as many constraints (coming from experimental characterization) as there are free parameters in PAPA. This further motivates our choice of ansatz, as we have made use of all available data from two-qubit characterizations of the unknown multi-qubit process. In the following section we complete our description of PAPA tomography by describing what two-qubit processes must be characterized for each qubit pair in order to solve for the unknown parameters in our ansatz.

IV. CHARACTERIZING THE TWO-QUBIT PROCESSES
In the most general version of QPT, there is a completely unknown quantum process which one wishes to determine. Applying PAPA to this problem, the required two-qubit QPT is derived from the form of Eq. (6). For a pair of qubits defined by the subset $S$ we perform two-qubit QPT to characterize the effective process the qubits in $S$ experience when the unknown process $\mathcal{E}$ is implemented on all $N$ qubits (with all other qubits initialized in the maximally mixed state), as depicted in Fig. 1a).
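To see that Eq. (6) describes a valid two-qubit process, we describe the unknown $N$-qubit process in a basis of $N$-qubit processes as

$$\mathcal{E} = \sum_{\vec{i}} \chi_{\vec{i}}\; \Lambda_{i_1}\otimes\Lambda_{i_2}\otimes\cdots\otimes\Lambda_{i_N}, \tag{7}$$

where $\vec{i} = (i_1, \ldots, i_N)$. Substituting this expression into the partial trace in Eq. (6), we obtain (recall $S = \{m, p\}$)

$$\mathrm{Tr}_{/S}\left[\mathcal{E}\left(|\psi_{\mu_S}\rangle\langle\psi_{\nu_S}| \otimes \frac{I_{N-2}}{2^{N-2}}\right)\right] = \Lambda_S\left(|\psi_{\mu_S}\rangle\langle\psi_{\nu_S}|\right), \tag{8}$$

where we have defined $\Lambda_S = \sum_{\vec{i}} \tilde{\chi}_{\vec{i}}\, \Lambda_{i_m}\otimes\Lambda_{i_p}$, with $\tilde{\chi}_{\vec{i}} = \chi_{\vec{i}} \prod_{q\notin S} \mathrm{Tr}\left[\Lambda_{i_q}(I/2)\right]$ absorbing the traces over the spectator qubits. The reduced Choi state can then be written as

$$\rho_S = \frac{1}{4}\sum_{\mu_S,\nu_S} |\psi_{\mu_S}\rangle\langle\psi_{\nu_S}| \otimes \Lambda_S\left(|\psi_{\mu_S}\rangle\langle\psi_{\nu_S}|\right), \tag{9}$$

and it is clear that $\Lambda_S$ must describe a valid quantum process, as it is the composition of preparing the spectator qubits in the maximally mixed state, applying $\mathcal{E}$, and tracing out the spectators.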
In Eq. (6) we see that the qubits outside the qubit pair of interest (the spectator qubits) must be prepared in the maximally mixed state. If this is experimentally challenging, one can instead randomly sample spectator qubit preparations from the uniform distribution of the set of spectator qubit logical states. With sufficient sampling to generate accurate statistics, the normalized sum of the randomly sampled preparation states approaches the maximally mixed state for the spectator qubits. Thus, performing two-qubit QPT on the qubit pair of interest with spectator qubits prepared in a random logical state will characterize the desired effective process in Eq. (6).
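The convergence of randomly sampled spectator preparations to the maximally mixed state can be illustrated with a short simulation. The number of spectators, the number of samples, and the random seed below are arbitrary choices made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n_spectators = 2
dim = 2 ** n_spectators
shots = 20000

# Draw one logical (computational-basis) state per shot, uniformly at random.
samples = rng.integers(0, dim, size=shots)
counts = np.bincount(samples, minlength=dim)

# Normalized sum of the sampled projectors |b><b|.
rho_avg = np.diag(counts / shots)

# Largest element-wise deviation from the maximally mixed state I / dim.
deviation = np.max(np.abs(rho_avg - np.eye(dim) / dim))
print(deviation)
```

The deviation shrinks as the inverse square root of the number of samples, so the sampling overhead is set by the statistical accuracy required of the two-qubit QPT rather than by the number of spectator qubits alone.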
From two-qubit QPT we can obtain an experimentally characterized two-qubit Choi state, which we label $\sigma_S$. We equate this to our reduced Choi state for the unknown process, $\rho_S$, to determine the free parameters in the PAPA. In other words, we simultaneously solve the equations

$$\rho_S = \sigma_S \tag{10}$$

for every pair of qubits. Note that each $\rho_S$ depends on the $\chi$-matrix elements for all qubit pairs, i.e. all $\chi^{j_{k,n}}_{i_{k,n}}$, not just the qubit pair of the subset $S$. Thus, each two-qubit process characterization $\sigma_S$ constrains the global process, not just the component of the ansatz on the qubits in $S$. For this reason, we have labelled the reduced two-qubit processes as $\Lambda_S$ to distinguish them from the two-qubit processes that construct the PAPA, $\mathcal{E}_{k,n+k}$ in Eq. (2).

A. PAPA and Gate-Set-Tomography
The PAPA tomography approach described so far works well to obtain a bootstrapped description of an $N$-qubit process from characterization of the effective processes on all qubit pairs. However, often the problem at hand is not to characterize a completely unknown process, but to determine the actual process, $\mathcal{G}$, that occurs when we aim to implement a unitary gate, $\hat{G}$ (from here on we use calligraphic text for processes and Latin text for unitary gates).
Extending this to an entire gate-set via Gate-Set Tomography (GST), we obtain a set of processes $\{\mathcal{G}_i\}$ corresponding to the experimental implementation of an ideal gate-set $\{\hat{G}_i\}$. GST has the further benefit of excluding state-preparation and measurement (SPAM) errors from the processes $\{\mathcal{G}_i\}$ [38]. Note that for clarity we will use "gate-set" to refer to the processes $\{\mathcal{G}_i\}$, and "ideal gate-set" to refer to the unitary gates $\{\hat{G}_i\}$.
Combining PAPA with GST, we can perform GST on all qubit pairs to obtain a characterized gate-set for each pair, and then use PAPA to bootstrap to descriptions of $N$-qubit processes. To see why this is useful, consider the three-qubit gate $\hat{X}\otimes\hat{Y}\otimes\hat{X}$. Given characterized gate-sets with the relevant two-qubit gates, one way to describe the three-qubit process would be as a composition of two characterized two-qubit processes, where $\mathcal{G}_{AB}$ is the experimental process when we try to implement the gate $\hat{A}\otimes\hat{B}$. However, there is ambiguity in the correct decomposition of the three-qubit gate, and $\mathcal{G}_{X_1X_3}(\mathcal{G}_{Y_2I_3}(\rho))$ would be an equally valid description of the process. An issue arises, as it is unlikely that the three-qubit processes constructed from all possible two-qubit decompositions will agree with one another. Using PAPA avoids this issue, as it finds the three-qubit process of PAPA form that best agrees with the pairwise characterized processes, i.e. with $\mathcal{G}_{X_1Y_2}$, $\mathcal{G}_{Y_2X_3}$, and $\mathcal{G}_{X_1X_3}$. As such it captures context dependence between gate operations, such as when the effect on qubit 1 is different for the processes $\mathcal{G}_{X_1Y_2}$ and $\mathcal{G}_{X_1X_3}$. As an added benefit, one never has to implement the full $N$-qubit process, as one does when using PAPA without GST (as described in the previous section). Instead, from the characterized gate-sets on all qubit pairs, we can bootstrap to PAPA characterizations of the processes in an $N$-qubit gate-set, as represented in Fig. 1b).
While PAPA can in principle return a characterization of any $N$-qubit gate, when we restrict the pairwise two-qubit QPT to GST, the PAPA+GST combination can only characterize a limited set of $N$-qubit gates. Which $N$-qubit gates can be characterized with PAPA+GST is detailed further in Appendix A. The general requirement is that each two-qubit reduced process of the ideal $N$-qubit gate must be an incoherent mixture of two-qubit gates built from the ideal gate-set. For example, if the ideal gate is $\mathrm{CNOT}_{12}\otimes\hat{I}$, then as shown in Appendix A, the ideal gates $\hat{Z}\otimes\hat{I}$ and $\hat{I}\otimes\hat{I}$ need to be in the characterized gate-set for qubit pair 1-3.
Decomposing an $N$-qubit gate this way implicitly assumes the errors that make the implemented process $\mathcal{G}$ distinct from the ideal gate $\hat{G}$ are not strongly specific to the implementation of $\mathcal{G}$. This is easily satisfied if the errors are gate-independent, but some kinds of gate-dependent error are tolerable, such as context dependence in simultaneous single-qubit gates. For the $\mathrm{CNOT}_{12}\otimes\hat{I}$ gate considered previously, an example of a tolerable gate-dependent error would be a coherent error that occurs on qubit 1 both for an actual $\hat{Z}$-gate or an effective $\hat{Z}$-gate (as occurs in the reduced process on qubit 1 for the $\mathrm{CNOT}_{12}$ gate).
It is important to emphasize that neither of these issues is a limitation of PAPA, which can characterize any $N$-qubit process using pairwise two-qubit QPT, but of the two-qubit characterizations supplied to PAPA by GST. Nevertheless, there are many situations where PAPA+GST may be applicable, i.e. the ideal-gate decomposition is possible and the errors can be assumed to be captured by PAPA+GST, as we explore in section V A. For situations where PAPA+GST is not possible, PAPA can inherit SPAM-insensitivity from other SPAM-insensitive process tomography such as that using randomized benchmarking [45][46][47].

A. Noisy One- and Two-Qubit Gates
To test our PAPA approach to multi-qubit QPT, we numerically simulate "unknown" three-qubit processes, and then reconstruct the PAPA characterization of these processes. We consider several example processes formed by one of the ideal three-qubit gates $\hat{I}\otimes\hat{I}\otimes\hat{I}$, $\mathrm{CNOT}_{12}\otimes\hat{I}$, or $\hat{X}\otimes\hat{Y}\otimes\hat{X}$, followed by an error process. For the error process we consider two cases of gate-independent error: either a coherent error described by single-qubit rotations on all three qubits, or single-qubit decay and pure dephasing implemented by their standard Kraus operator representations [19].
In standard PAPA reconstruction, pairwise two-qubit QPT is used to characterize the reduced two-qubit process, and obtain $\sigma_S$ for each qubit pair. With PAPA+GST this is circumvented by using a GST characterized gate-set for each qubit pair to calculate $\sigma_S$, provided the ideal reduced two-qubit process can be built from gates in the ideal gate-set. For the example three-qubit ideal gates chosen, the required two-qubit gates are contained in the ideal gate-set $\mathrm{CNOT} + \{\hat{I},\hat{X},\hat{Y},\hat{Z}\}^{\otimes 2}$. We follow the PAPA+GST approach for our numerical tests, simulating the implementation of this gate-set on all qubit pairs, including the error process, and use results of these simulations as our GST reconstructed two-qubit gate-sets. We then use the characterized two-qubit gate-sets to calculate $\sigma_S$ for each qubit pair. We outline our approach in explicit detail in Appendix A.
We compare the PAPA+GST reconstruction for a noisy gate to the actual simulated noisy gate by calculating the trace distance between the Choi state of the PAPA-reconstructed three-qubit process, $\rho_{\mathcal{E}}$, and that for the actual process, $\rho_{\rm act}$. We also calculate the trace distance for each of the reconstructed two-qubit processes, comparing them to the actual two-qubit reduced processes. The results are shown in Fig. 2 for the seven candidate processes listed in the caption.
As the results show, the PAPA reconstructed process always improves upon the initial guess (ideal gate), both in terms of the trace distance for the full three-qubit process reconstruction, Fig. 2a), and the average of the trace distances for the two-qubit reconstructions, Fig. 2b). This improvement is typically around one order of magnitude, except in the case of the CNOT gate, which was the most difficult to reconstruct of the gates tested.
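For reference, a trace distance between two Choi states can be computed as in the sketch below. It assumes the standard definition $T(\rho,\sigma) = \frac{1}{2}\mathrm{Tr}|\rho-\sigma|$; the exact normalization used for the reported values is not stated in the text here, so the factor of one half is an assumption.

```python
import numpy as np

def trace_distance(rho, sigma):
    """T(rho, sigma) = (1/2) * sum of |eigenvalues| of the Hermitian difference."""
    diff = rho - sigma
    diff = (diff + diff.conj().T) / 2        # symmetrize against round-off
    return 0.5 * np.sum(np.abs(np.linalg.eigvalsh(diff)))

# Simple single-qubit checks with orthogonal and maximally mixed states.
zero = np.diag([1.0, 0.0])
one = np.diag([0.0, 1.0])
print(trace_distance(zero, one), trace_distance(zero, np.eye(2) / 2))
```

With this normalization the distance lies between 0 (identical states) and 1 (orthogonal states), which sets the scale against which an order-of-magnitude improvement over the initial guess can be read.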
The accuracy of the PAPA reconstructions of the simulated gates is set by the specifics of the classical numerical algorithm implemented (see Appendix B for details). If other algorithms [48,49] more tailored to quantum process reconstruction are used with PAPA we expect significant improvements in accuracy and runtime are possible.

B. Coherent Error in the Cross-Resonance Gate
We also perform a systematic test of the PAPA approach by examining coherent error in a cross-resonance (CR) implementation of a CNOT gate [50,51], with the ideal gate taking the form $\mathrm{CNOT}_{12}\otimes\hat{I}$. Referred to as a CR-CNOT, this ideal gate consists of the ideal CR-gate followed by single-qubit gates. The unitary describing the implemented gate in the presence of coherent error is given by $\hat{U}_{\rm CNOT} = \hat{U}^{1}_{-Z90}\,\hat{U}^{2}_{+X90}\,\hat{U}_{\rm CR}$, with $\hat{U}^{j}_{\pm\mu 90}$ a rotation of qubit $j$ of angle $\pm 90^\circ$ around the $\mu$-axis (which we assume to be perfect), and $\hat{U}_{\rm CR}$ the CR unitary with coherent error given in Eq. (17), where for compactness of notation we have suppressed the tensor product symbols, such that $\hat{Z}\hat{X}\hat{I} = \hat{Z}\otimes\hat{X}\otimes\hat{I}$. In Eq. (17), the angles $\beta$ and $\phi$ quantify the coherent error, with $\beta$ the angle of over-rotation of the desired CR-interaction between qubits 1-2, and $\phi$ the angle quantifying the effect of spurious ZZ-coupling between qubits 2-3. We consider the echoed CR-pulse of Ref. [52], such that the only remaining ZZ-coupling is between the target and idle qubits (i.e. 2 and 3). We use values of $\beta$ between $\pi/16$ and $\pi/8$ radians, which produce non-ideal gates with trace-overlap fidelity of 95-99%, and values of $\phi$ between $10^{-3}$ and $4\times10^{-3}$ radians. For a gate of 400 ns in duration, these values of $\phi$ correspond to spurious ZZ-couplings of 2.5-10 kHz.
From the decomposition of the effective two-qubit processes for the ideal CNOT gate given in Appendix A, it is clear that a CR-CNOT with coherent error does not satisfy the criteria for PAPA+GST. In particular, the error is strongly gate-dependent as it is intrinsic to the CR-interaction. For instance, no effect of the CR error would be seen in the implementation of the simultaneous single-qubit gate $\hat{Z}\otimes\hat{I}$ on qubits 1-3. As such, we must apply standard PAPA, and simulate QPT on the effective process for each pair of qubits during the implemented CR-CNOT. For this we assume no SPAM error, and in practice similar results can be achieved by applying other SPAM-insensitive process tomography approaches to the CR-CNOT [45][46][47].
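The quoted coupling range can be reproduced with a one-line conversion, under the assumption that the coupling strength is taken as the accumulated angle divided by the gate duration (i.e. without an additional factor of $2\pi$). That convention is inferred from the numbers in the text rather than stated there, and the function name is illustrative.

```python
def zz_rate_khz(phi_rad, gate_time_s):
    """Accumulated angle per unit time, in units of 10^3 per second."""
    return phi_rad / gate_time_s / 1e3

gate_time = 400e-9                      # 400 ns gate duration
rates = [zz_rate_khz(phi, gate_time) for phi in (1e-3, 4e-3)]
print(rates)
```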
The results of our simulations are shown in Fig. 3. As can be seen, for all values of β and φ tested the PAPA reconstruction is approximately an order of magnitude closer to the simulated unitary of Eq. (17) than the ideal gate (used as the initial guess). Thus, PAPA is a useful technique for benchmarking the performance of experimentally relevant implementations of entangling gates, such as the CR-CNOT widely used in circuit QED [53].

VI. CONCLUSION
We have presented here an approach to efficient and SPAM-insensitive quantum process tomography that relies on fitting tomographic data to a constrained ansatz for the unknown quantum process. Our physically motivated pairwise perturbative ansatz requires only two-qubit process tomography on all pairs of qubits, such that the total number of measurements scales only quadratically with qubit number. Further, our ansatz inherits SPAM-insensitivity from SPAM-insensitive two-qubit tomography, such as gate-set tomography [39] or RB gate tomography [45][46][47].
Testing via numerical simulations validates the usefulness of our tomographic approach on both a series of example gates, and the experimentally relevant CR-CNOT [51]. In typical cases, the resulting description of the unknown quantum process found by our ansatz is an order of magnitude more accurate than the naïve initial guess. In the future, we hope to improve the efficiency and accuracy of the classical algorithm underlying our reconstruction method [48,49].
It is worth noting that while we have chosen to build our ansatz for an $N$-qubit process from two-qubit processes, similar ansätze can be created from $K$-qubit processes for any $K < N$. These have measurement resource requirements that scale as a polynomial of order $K$, and are therefore still asymptotically efficient. We have focussed on the case $K = 2$ in our work as two-qubit process tomography is within current experimental capabilities. However, for larger system sizes, there will likely be an optimal $K > 2$ that reduces the number of qubit subsets, given by $\binom{N}{K}$, while maintaining a small enough $K$ that $K$-qubit QPT is experimentally feasible.
Finally, we comment briefly on the situations where PAPA may fail, and the fact that this actually gives useful information about the unknown process. Numerical reasons aside, PAPA reconstruction fails when the process being estimated is an operation that is not factorable to 2-body, or when non-Markovian noise is present. As such, PAPA reconstruction can be used as a form of model testing for error processes that entangle more than 2 qubits, or non-Markovian noise sources such as slow parameter drift. Similarly, PAPA+GST puts greater restrictions on the gate and context independence of the noise sources, and can be used as a model testing procedure for these error sources. This highlights the usefulness of ansatz-based approaches to QPT: even when they fail they provide useful information about the system.

Appendix A

In this appendix we discuss the set of $N$-qubit gates that can be characterized via bootstrapping with PAPA+GST. We will focus on the $N = 3$ case since the extension to $N > 3$ is straightforward from the three-qubit results. Consider an ideal three-qubit gate $\hat{U}$ written as an expansion in tensor products of single-qubit operators $\hat{U}_i$, where $\mathrm{Tr}(\hat{U}_i\hat{U}_j^\dagger) = 2\delta_{ij}$ such that $\{\hat{U}_i\}$ is an orthonormal basis for one-qubit operator space. We will label the ideal process for this gate as $\mathcal{U}$, and label the imperfect experimental implementation of this process as $\tilde{\mathcal{U}}$. For notational simplicity we break slightly from the nomenclature used in the main text, and throughout this appendix processes without tildes will be ideal, and those with tildes will be experimental implementations of the ideal process.
The Choi state of the ideal process, $\rho_{\mathcal{U}}$, is defined as in the main text. Then, as an example, the two-qubit reduction for qubits 1-2 is obtained by tracing out qubit 3 from both parts of the Choi state, where we use the fact that $\mathrm{Tr}(\hat{U}_i\hat{I}\hat{U}_j^\dagger) = 2\delta_{ij}$, and we label the resulting two-qubit process $\mathcal{U}_{12}$. The general PAPA approach would be to characterize the experimental implementation of the process $\mathcal{U}_{12}$, i.e. $\tilde{\mathcal{U}}_{12}$, via two-qubit QPT on qubits 1-2 when $\tilde{\mathcal{U}}$ occurs. The PAPA+GST approach is the situation where one does not want to perform two-qubit QPT for every unknown three-qubit process, but would rather bootstrap characterizations of three-qubit processes from existing two-qubit gate-set characterizations.
In the PAPA+GST approach, the two-qubit reduction $\rho_{\tilde{\mathcal{U}}_{1,2}}$ can be experimentally characterized if the ideal process $\mathcal{U}_{12}$ can be described as a convex sum of unitary processes, with each $\hat{G}_i$ in the GST characterized gate-set. If this is the case, then the reduced Choi state is given by the same convex sum of the Choi states $\sigma_{\tilde{\mathcal{G}}_i}$, where $\tilde{\mathcal{G}}_i$ is the experimental implementation of the gate $\hat{G}_i$, and each $\sigma_{\tilde{\mathcal{G}}_i}$ can be obtained from the GST gate-set which contains all $\tilde{\mathcal{G}}_i$.
For the ideal gate-set we have used in the main text, $\mathrm{CNOT} + \{\hat{I},\hat{X},\hat{Y},\hat{Z}\}^{\otimes 2}$, we will now show that any three-qubit quantum gate consisting of a single-layer circuit of these gates can be characterized using PAPA+GST. Three-qubit gates of the form $\hat{G}_1\otimes\hat{G}_2\otimes\hat{G}_3 \in \{\hat{I},\hat{X},\hat{Y},\hat{Z}\}^{\otimes 3}$ can obviously be parameterized by PAPA, as one can trivially show that the two-qubit processes to be characterized are the unitary gates $\hat{G}_1\otimes\hat{G}_2$, $\hat{G}_1\otimes\hat{G}_3$, and $\hat{G}_2\otimes\hat{G}_3$, which are all in the GST gate-sets.
For a three-qubit gate that involves a CNOT on two of the qubits, a bit more effort is required to show that the necessary two-qubit gates to be characterized are still in the GST gate-set. For example, consider the ideal gate $\hat{U} = \mathrm{CNOT}_{12}\otimes\hat{I}$ used in the main text. This has the ideal two-qubit reduced dual states

$$\rho_{\mathcal{U}_{1,2}} = \rho_{\rm CNOT}, \tag{A7}$$
$$\rho_{\mathcal{U}_{1,3}} = \frac{1}{2}\left(\rho_{\hat{I}\otimes\hat{I}} + \rho_{\hat{Z}\otimes\hat{I}}\right), \tag{A8}$$
$$\rho_{\mathcal{U}_{2,3}} = \frac{1}{2}\left(\rho_{\hat{I}\otimes\hat{I}} + \rho_{\hat{X}\otimes\hat{I}}\right), \tag{A9}$$

where $\rho_{\hat{G}}$ denotes the Choi state of the ideal unitary gate $\hat{G}$, and therefore the necessary gates to characterize are CNOT for qubits 1-2, $\hat{I}\otimes\hat{I}$ and $\hat{Z}\otimes\hat{I}$ for qubits 1-3, and $\hat{I}\otimes\hat{I}$ and $\hat{X}\otimes\hat{I}$ for qubits 2-3. As all of these gates belong to their respective GST gate-sets, a characterization of $\mathcal{U}$ can be bootstrapped using PAPA+GST. It is straightforward to show that this generalizes to all arrangements of the CNOT (i.e. on any pair of qubits), and any gate on the qubit not involved in the CNOT.

So far we have only commented on the ideal two-qubit gates that need to be characterized for a given $N$-qubit process, and not on the other criterion for PAPA+GST, namely tolerable error. The general criterion is not as strong as all error needing to be gate-independent. For instance, three-qubit gates such as $\hat{G}_1\otimes\hat{G}_2\otimes\hat{G}_3$ may have error that is dependent on the specific single-qubit gates implemented, as this will be captured in characterization of the two-qubit reductions. Similarly, if the error in a single-qubit gate depends on the gate implemented on another qubit (i.e. context dependence) this will also be captured by PAPA+GST.
The fact that both gate-dependent and context-dependent error fits within the PAPA+GST framework for simultaneous single-qubit gates comes from the fact that the physical implementation of the simultaneous single-qubit gates on $N$ qubits is the same as on two qubits. This is often not the case for an entangling gate such as $\mathrm{CNOT}_{12}\otimes\hat{I}$, where the physical implementation, such as a CR-CNOT, could be vastly different than the physical implementation of the gates in its reduced two-qubit decomposition, cf. Eqs. (A7)-(A9).
In the case of the CR-CNOT used in the main text, this is especially true, as the error model assumed, over-rotation and cross-talk, is intrinsic to the CR-CNOT, and as such GST characterization of the simultaneous single-qubit gates on qubit pairs 1-3 and 2-3 would not capture this error. Even a large difference in gate-length between entangling and simultaneous single-qubit gates can result in a discrepancy in their error due to decoherence, and is enough to make PAPA+GST inapplicable. In such situations standard PAPA should be used in combination with another SPAM-insensitive two-qubit QPT technique.

Figure 4. Absolute value of the element-wise difference between the "measured" Choi state, $\sigma_{12}$, and the PAPA reconstructed Choi state, $\rho_{12}$, for the effective process experienced by qubit pair 1-2 during a CNOT gate with single-qubit decoherence. Simulation parameters for the three-qubit process are the same as in Fig. 2.

Appendix B: Numerical Implementation
The computational task in PAPA characterization is the simultaneous solution of Eq. (10) for each pairwise reduction, from which we obtain the elements of the two-qubit $\chi$-matrices, $\chi^{j_{k,n}}_{i_{k,n}}$. These equations are nonlinear in general, and must be solved under the constraint that each of the two-qubit $\chi$-matrices describes a completely positive and trace-preserving (CPTP) map.
This implies that the $\chi$-matrix is a positive semi-definite matrix with trace 4 (dimension of two-qubit Hilbert space). Further, trace preservation (TP) requires an additional constraint, which to describe we need to parameterize a two-qubit process on the set $S$ in the usual way via its $\chi$-matrix

$$\mathcal{E}_S(\rho) = \sum_{p,r=1}^{16} \chi^S_{pr}\, \hat{E}_p\, \rho\, \hat{E}_r^\dagger, \tag{B1}$$

where $\{\hat{E}_p\}$ is a basis for two-qubit operator space. The TP constraint is then [19]

$$\sum_{p,r=1}^{16} \chi^S_{pr}\, \hat{E}_r^\dagger \hat{E}_p = \hat{I}. \tag{B2}$$

Comparing this to our previous parameterization of a two-qubit process in terms of one-qubit processes used in Eq. (3), for $S = \{k, n+k\}$, we see that the two parameterizations are related by splitting each index $i_{k,n}$ and $j_{k,n}$ into two parts.

To solve for the $\chi$-matrix elements under the CPTP constraints, we use a least-squares minimization approach implemented in MATLAB [54]. Here, the cost function for the least-squares minimization consists of two parts. The first, $C_1[\vec{\chi}]$, encodes the experimental characterizations of the two-qubit reductions, and simply consists of the element-wise difference between the measured two-qubit reduced Choi state for each pair of qubits and the current estimate for the two-qubit reduced Choi state generated by PAPA, where $\vec{\chi}$ is a vector of the $\chi$-matrices for the processes on all qubit pairs that make up the PAPA, and the difference is taken for all qubit pairs $S$. The second part of the cost function, $C_2[\vec{\chi}]$, encodes the CPTP constraints, and consists of the difference between the trace of each $\chi$-matrix estimate and four, the sum of all negative eigenvalues of the $\chi$-matrix estimate (to constrain positivity), and the elements of Eq. (B2). The least-squares minimization solves for the estimate $\vec{\chi}_{\rm est}$ that minimizes the combined cost function.

We note that this has the form of a semi-definite program (SDP). However, the operations involved in calculating $C_1$ are not obviously convex, and as a result the problem is not compatible with available convex-optimization packages. As such, we have not used this approach, but in future work hope to explore making the problem compatible with convex optimization.

Even with the CPTP constraints imposed, the $\chi$-matrix estimates returned by the numerical solver will not necessarily be positive semi-definite. As such, we apply a post-processing step where we diagonalize each $\chi$-matrix estimate, generating a set of eigenvalues $\lambda_i$ with corresponding eigenvectors $|v_i\rangle$. We can then create a positive semi-definite $\chi$-matrix estimate for each two-qubit process, $\tilde{\chi}^{\rm est}_S$, from this eigendecomposition, rescaled by a normalization factor $\mathcal{N}$ chosen to ensure $\mathrm{Tr}(\tilde{\chi}^{\rm est}_S) = 4$. These are what we use in the PAPA construction of the $N$-qubit gate.

Fig. 4 shows an example of the output from our implementation of the PAPA algorithm. For the CNOT gate with single-qubit decoherence described in the main text, it plots the difference between the measured (experimentally or in this case by simulation) Choi state, $\sigma_{12}$, and the PAPA reconstructed Choi state, $\rho_{12}$, for the effective process experienced by qubit pair 1-2. The element-wise difference is consistent with the magnitude of the trace distance reported in Fig. 2b).

Figure 5. Trace distance between the simulated Choi state and PAPA reconstructed Choi state for the three-qubit process, and reduced two-qubit processes as a function of the solver tolerance, tol. The CNOT gate with coherent error (red + and ×) and decoherence (blue triangles and circles) were used for these simulations. The vertical dashed line indicates the tolerance used for the simulations presented in the main text.
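The post-processing step can be sketched as follows. The text specifies diagonalization followed by renormalization to trace 4; how the eigenvalues are modified is not recoverable here, so this sketch assumes the common choice of setting negative eigenvalues to zero. The function name `project_chi` and the random test matrix are likewise illustrative.

```python
import numpy as np

def project_chi(chi, trace=4.0):
    """Return a positive semi-definite matrix with the requested trace.

    Assumption: negative eigenvalues are clipped to zero before rescaling.
    """
    chi = (chi + chi.conj().T) / 2               # enforce Hermiticity
    vals, vecs = np.linalg.eigh(chi)             # eigenvalues lambda_i, eigenvectors |v_i>
    vals = np.clip(vals, 0.0, None)              # discard negative eigenvalues
    chi_psd = (vecs * vals) @ vecs.conj().T      # sum_i lambda_i |v_i><v_i|
    return chi_psd * (trace / np.trace(chi_psd).real)

# Hypothetical solver output: a Hermitian 16x16 matrix with some negative eigenvalues.
rng = np.random.default_rng(0)
M = rng.normal(size=(16, 16)) + 1j * rng.normal(size=(16, 16))
chi_raw = (M + M.conj().T) / 2
chi_fixed = project_chi(chi_raw)
```

Clipping followed by rescaling is simple but is not the closest positive semi-definite matrix in any particular norm; other projections could be substituted without changing the rest of the procedure.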
Least-squares minimization requires an initial guess for the $\chi$-matrices, and we choose a decomposition of the ideal three-qubit gate as the initial guess. For the reconstructions presented in the main text, we found that their accuracy was mostly limited by numerical issues, such as the trade-off between the minimization tolerance and computation time. We observed a saturation in the trace distance for solver tolerance below a threshold value of tol $= 10^{-7}$, which we attribute to the solver becoming stuck in a local minimum, see Fig. 5.
In future work we hope to explore these numerical issues, and implement more efficient and accurate classical algorithms for the PAPA reconstruction. For instance, we would aim to prevent the solver from getting stuck in regions where the gradient of the cost function is below the tolerance threshold, but the solution accuracy is not. One route forward would be to adapt to PAPA more sophisticated optimization algorithms tailored for optimization over positive definite matrices, such as those using gradient descent [48,49].