## Introduction

Quantum computers1 can extend classical computational reach in diverse research fields, including quantum chemistry, material science, and even machine learning. Building on the technological advances made so far, such nontrivial quantum applications have been pursued on currently available devices, mainly through quantum-classical hybrid schemes2,3. These schemes combine the advantages of classical and quantum computation: quantum processors estimate expectation values of physical observables on certain states, and the results are fed back to a classical routine. The hybrid schemes can be applied to estimate the ground-state energies of molecules3,4,5, to simulate quantum models in materials6 and high-energy physics7, and to find approximate solutions of optimization problems8. Although it is anticipated that around a hundred well-behaved qubits are required for such schemes to outperform current classical counterparts in quantum chemistry9,10,11, the advantages are only possible with accurate quantum processors. However, expectation values obtained from the outputs of quantum devices inevitably deviate from their ideal values because of errors originating from both environmental fluctuations and operational imperfections. Therefore, techniques that estimate expectation values accurately on noisy quantum processors are of great importance.

Apart from physically improving the devices, the deviations in estimating expectation values can be suppressed on the algorithmic level. For example, quantum error correction12,13 provides a means of fault-tolerant quantum computation, which results in accurate expectation values. However, quantum error-correcting codes require complex coding schemes, a large number of physical qubits, and low error rates, which are still far from being affordable for near-term quantum technologies14,15. Consequently, it has not yet been demonstrated that quantum fault-tolerance protocols can increase the fidelity of computational operations in any physical implementation. Alternatively, for quantum algorithms that estimate expectation values, the reliability of the computation result can be improved by recently proposed error mitigation schemes16,17,18,19,20, without the challenging requirements of quantum error correction. The probabilistic error-cancellation method provides a comprehensive way to mitigate errors in expectation estimation tasks17,18,21. It begins by characterizing the imperfect operations on the quantum device with tomography and then cancels errors by sampling random quantum circuits according to a quasi-probability distribution, which is derived by reconstructing the ideal quantum operations from the characterized imperfect ones. Note that this method does not improve the physical quality of quantum states or gates; it reduces the error in the estimation of expectation values.

Here we construct a trapped-ion system with full controllability and investigate the universal validity of the probabilistic error-cancellation method in a general quantum computational context. We apply the method to every imperfect elementary quantum operation and benchmark the performance of error-mitigated quantum computation22. We observe significant improvements in the effective gate fidelities of single- and two-qubit gates, by an order of magnitude relative to those of the physical gates. Here, the effective gate fidelities are obtained by fitting the corresponding expectation values estimated with error mitigation; they are not actual physical gate fidelities.

## Results

### Paradigm of error-mitigated quantum computation

The paradigm of error-mitigated quantum computation is shown in Fig. 1. The noisy quantum device is treated as a multi-qubit black box in Fig. 1a, capable of preparing each qubit in an initial state $${\rho }_{0}$$, performing a set of single-qubit and two-qubit gates, and carrying out a two-outcome measurement on each qubit, which is described by a positive operator-valued measure $${\mathcal{M}}\equiv \{{E}_{0},I-{E}_{0}\}$$ with I being the 2 × 2 identity operator. These quantum operations are generally not accurate because of errors from operational imperfections and environmental fluctuations. As proposed in ref. 18, we perform gate set tomography23,24,25 and characterize the state preparation and measurement (SPAM) and the gates of the noisy quantum device by Gram matrices and Pauli transfer matrices (PTMs), respectively25, as shown in Fig. 1b. When we repeatedly execute a quantum circuit on such a noisy device to obtain the expectation values of observables of interest, the estimate deviates from the ideal value because of the imperfection of the quantum device, as shown in Fig. 1c. The correction of each noisy quantum operation can be decomposed into a combination of experimental basis operations (which we give later) weighted by quasi-probabilities, as shown in Fig. 1d. As some of the quasi-probabilities can be negative, we cannot physically implement the decomposition directly. However, the expectation of the decomposition can be estimated by sampling circuits with random basis operations according to the quasi-probabilities17,18. After running the random circuits with the corrections, the probability distribution of the output expectation value is shifted towards the ideal value, at the cost of an enlarged variance due to the presence of negative values in the quasi-probabilities18, as shown in Fig. 1c. The variance can be reduced by increasing the repetition number, which is the number of random-circuit instances.

### Experimental realization

In our experimental realization, the quantum hardware encapsulated in the black box is a trapped-ion system, where $${}^{171}{{\rm{Yb}}}^{+}$$ ions are trapped in a linear crystal and manipulated by global and individual laser beams, as shown in Fig. 1a. To encode quantum information, a pair of clock states in the ground-state manifold $${}^{2}{S}_{1/2}$$, i.e., $$\left|F=0,{m}_{F}=0\right\rangle$$ and $$\left|F=1,{m}_{F}=0\right\rangle$$, are denoted as the computational basis $$\left\{\left|0\right\rangle ,\left|1\right\rangle \right\}$$ of a qubit. At the beginning of executing a quantum circuit, each ion qubit is initialized to $$\left|0\right\rangle$$ by optical pumping. We implement single-qubit operations by Raman laser beams with a beatnote frequency close to the hyperfine splitting $${\omega }_{0}=2\pi \times 12.642821$$ GHz. In addition, the two-qubit operation, i.e., the Mølmer-Sørensen YY-gate (MSYY), is realized by driving transverse motional modes26,27, with frequencies in the x-direction $$\{{\nu }_{1},{\nu }_{2}\}=\{1.954,2.048\}$$ MHz. We apply amplitude-shaped28 bichromatic Raman beams with beatnote frequencies $${\omega }_{0}\pm \mu$$, where μ is set to the middle frequency of the two motional modes, and achieve the MSYY gate in 25 μs. We also realize the MS ZZ-gate (MSZZ) by adding single-qubit rotations before and after the MSYY gate29 (see Supplementary Fig. 4b). At the end of the execution, the internal states of the qubits are measured by state-dependent fluorescence detection30. To collect fluorescence photons, we use a photomultiplier tube in the single-qubit case and an electron-multiplying charge-coupled device (EMCCD) in the two-qubit case.

### Characterization of quantum device

We introduce the PTM representation for the mathematical description of an n-qubit noisy quantum device, where density operators ρ and physical observables E are represented by $${2}^{2n}$$-entry column vectors $$\left|\left.\rho \right\rangle \right\rangle$$ and row vectors $$\left\langle \left\langle E\right.\right|$$, and quantum gates G are represented by $${2}^{2n}\times {2}^{2n}$$ PTMs $${R}_{G}$$. Here, the expectation value of the observable $$\hat{E}$$ after applying the gate G to the initial state $$\hat{\rho }$$ is represented by $$\langle \langle E| {R}_{G}| \rho \rangle \rangle$$. PTMs can be determined by gate set tomography, which requires informationally complete data obtained from experiments with initial states from a basis set $${{\mathcal{S}}}_{n}\equiv {\{\left|0\right\rangle ,\left|1\right\rangle ,{\left|1\right\rangle }_{X},{\left|1\right\rangle }_{Y}\}}^{\otimes n}$$ and measurement of the observables from the n-qubit Pauli basis $${{\mathcal{P}}}_{n}={\{I,X,Y,Z\}}^{\otimes n}$$, where $${\left|1\right\rangle }_{X}$$ and $${\left|1\right\rangle }_{Y}$$ are eigenstates of the Pauli operators X and Y, respectively. Compared with quantum process tomography, gate set tomography properly accounts for SPAM errors, which is of great importance in high-accuracy quantum computation. In gate set tomography, the states in $${{\mathcal{S}}}_{n}$$ and the measurement of observables in $${{\mathcal{P}}}_{n}$$ are realized by using a set of fiducial gates $${{\mathcal{F}}}_{n}\equiv {\{I,{X}_{\pi },{Y}_{-\frac{\pi }{2}},{X}_{\frac{\pi }{2}}\}}^{\otimes n}$$ consisting of the identity operation and the X or Y axis rotations on each qubit, which are to be characterized together with the rest of the quantum operations. The single-qubit SPAM errors are reflected in the Gram matrix25, as shown in Fig. 2a, which is obtained by preparing the qubit in one of the states in $${{\mathcal{S}}}_{1}$$, $$\left|\left.{\rho }_{i}\right\rangle \right\rangle ={R}_{{F}_{i}}\left|\left.{\rho }_{0}\right\rangle \right\rangle$$, and measuring the expectation values of the operators in the single-qubit Pauli basis $${{\mathcal{P}}}_{1}$$, $$\left\langle \left\langle {E}_{i}\right.\right|=\left\langle \left\langle {E}_{0}\right.\right|{R}_{{F}_{i}}$$, where $${\rho }_{0}$$ and $${E}_{0}$$ are ideally associated with $$\left|0\right\rangle \left\langle 0\right|$$ and Z, respectively.
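The PTM bookkeeping described above can be sketched numerically for a single qubit. The snippet below is a minimal illustration, not the authors' code: the normalization of the vectors (entries $$Tr[{P}_{i}\rho ]$$ for states and $$Tr[E{P}_{i}]/2$$ for observables), the rotation sign convention, and all function names are assumptions chosen so that $$\langle \langle E| {R}_{G}| \rho \rangle \rangle =Tr[EG(\rho )]$$.

```python
import numpy as np

# Single-qubit Pauli basis {I, X, Y, Z}.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Y, Z]


def state_vector(rho):
    """Column vector |rho>> with entries Tr[P_i rho] (assumed normalization)."""
    return np.array([np.trace(P @ rho).real for P in PAULIS])


def observable_vector(E):
    """Row vector <<E| with entries Tr[E P_i] / 2, so <<E|rho>> = Tr[E rho]."""
    return np.array([np.trace(E @ P).real / 2 for P in PAULIS])


def ptm(U):
    """PTM of a unitary gate: R_ij = Tr[P_i U P_j U^dagger] / 2."""
    Ud = U.conj().T
    return np.array([[np.trace(Pi @ U @ Pj @ Ud).real / 2 for Pj in PAULIS]
                     for Pi in PAULIS])


def rotation(axis, theta):
    """exp(-i * theta * axis / 2) for a Pauli axis (assumed sign convention)."""
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * axis


rho0 = np.array([[1, 0], [0, 0]], dtype=complex)   # ideal |0><0|
R_gate = ptm(rotation(Y, np.pi / 2))               # Y-axis rotation by pi/2
expect_x = observable_vector(X) @ R_gate @ state_vector(rho0)
print(expect_x)
```

With these conventions a gate sequence becomes a product of 4 × 4 real matrices, which is what makes the noise inversion discussed later a linear-algebra problem.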

For single-qubit randomized benchmarking22, we design pulse sequences for implementing major-axis π pulses $$\{{X}_{\pm \pi },{Y}_{\pm \pi },{Z}_{\pm \pi }\}$$ and $$\frac{\pi }{2}$$ pulses $$\{{X}_{\pm \! \frac{\pi }{2}},{Y}_{\pm \! \frac{\pi }{2}}\}$$. Thus, the gate set for the single-qubit case is $${{\mathcal{G}}}_{1}=\{I,{X}_{\! \pm \! \pi },{Y}\! _{\pm \! \pi },{Z}\! _{\pm \! \pi },{X}\! _{\pm \! \frac{\pi }{2}},{Y}\! _{\pm \! \frac{\pi }{2}}\}$$, where I is the identity operation. The gate set for implementing two-qubit random circuits is $${{\mathcal{G}}}_{2}={{\mathcal{G}}}_{1}^{\otimes 2}\cup \{{{\rm{MS}}}_{YY},{{\rm{MS}}}_{ZZ}\}$$. We experimentally obtain the PTMs of all the gates in the gate set by maximizing a likelihood function with the assumption that Pauli errors are dominant in our devices (see Methods).

The reconstructed PTMs of $${X} \!_{\pm \! \frac{\pi }{2}}$$ and $${Y}\!_{\pm \! \frac{\pi }{2}}$$ for the single-qubit case and those of MSYY and MSZZ gates for the two-qubit case are shown in Fig. 2b, c, respectively (more data for the single-qubit case are in Supplementary Fig. 1a). We note that, for the gate set tomography of two qubits, we apply a two-step parameter estimation, as the infidelities of the single-qubit gates are about an order of magnitude lower than those of the two-qubit gates. We first determine the Pauli error rates for all the single-qubit gates in $${{\mathcal{G}}}_{1}^{\otimes 2}$$ as described above and then characterize the two-qubit gate MSYY based on the knowledge of the characterized single-qubit gates (see Methods). The MSZZ gate is derived from those results. Using these reconstructed PTMs, we numerically simulate the single-qubit randomized benchmarking and two-qubit random circuits on a classical computer. The numerically simulated and experimental data agree within the error bars of both, which validates the Pauli error assumption (see Supplementary Fig. 2).

The initial state, quantum gates, and measurement deviate from the ideal ones, as experimentally characterized by the Gram matrix and PTMs. Mathematically, we can reconstruct the ideal ones by a weighted combination of experimental operations17,18. As we cannot distinguish errors in state preparation from those in measurement, we ascribe all of the SPAM errors to state preparation and decompose the initial state $$\left|\left.{\rho }_{0}^{{\rm{id}}}\right\rangle \right\rangle ={\sum }_{i}{q}_{0,i}\left|\left.{\rho }_{i}\right\rangle \right\rangle$$. The quasi-probabilities $${q}_{0,i}$$ for the decomposition of the ideal single-qubit initial state are shown in Fig. 3a. For the two-qubit case, the SPAM errors are much more serious because of the EMCCD, and we calibrate the results to remove the SPAM errors as proposed in ref. 31. We prepare the system in the states $$\left|00\right\rangle$$ and $$\left|11\right\rangle$$, and measure the state fidelities of $$\left|0\right\rangle$$ and $$\left|1\right\rangle$$ for both qubits. The infidelities of these states give the SPAM error probability associated with each measurement outcome, which can then be used to remove the SPAM errors by data processing.

An ideal quantum gate $${G}_{\mathrm{{s}}}^{{\rm{id}}}$$ can be written as the experimental one followed by the inverse of noise operation, i.e., $${{R}_{{G}_{\mathrm{{s}}}^{{\rm{id}}}}={N}_{\mathrm{{s}}}^{-1}{R}_{{G}_{\mathrm{{s}}}}}$$, where the noise operation Ns introduces errors in the experimental gate $${R}_{{G}_{\mathrm{{s}}}}={N}_{\mathrm{{s}}}{R}_{{G}_{\mathrm{{s}}}}^{{\rm{id}}}$$. The inverse of the noise operation $${N}_{\mathrm{{s}}}^{-1}$$ is then decomposed by the experimental operations associated with the n-qubit Pauli group, $${N}_{\mathrm{{s}}}^{-1}={\sum }_{j}{q}_{s,j}{R}_{{P}_{j}}$$ with Pauli error assumption, where the quasi-probabilities qs,j are determined by a set of linear equations. We show decompositions of the inverse error operations for single-qubit gates $$\{{X}\! _{\pm \! \frac{\pi }{2}},{Y}\!\! _{\pm \! \frac{\pi }{2}}\}$$ in Fig. 3b (more data in Supplementary Fig. 1b) and for two-qubit gates {MSYY and MSZZ} in Fig. 3c.
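The set of linear equations for the quasi-probabilities can be illustrated for a single-qubit Pauli error channel. The sketch below makes a simplifying assumption that differs from the experiment: the correcting Pauli operations are taken to be ideal, so their PTMs are diagonal with entries ±1, whereas the text decomposes into experimentally characterized Pauli operations. The error rates `p` and all names are hypothetical.

```python
import numpy as np

# SIGNS[j, k] = +1 if Paulis P_j and P_k commute, else -1 (order I, X, Y, Z).
# Row j is the diagonal of the ideal PTM of the Pauli operation P_j.
SIGNS = np.array([[1, 1, 1, 1],
                  [1, 1, -1, -1],
                  [1, -1, 1, -1],
                  [1, -1, -1, 1]], dtype=float)


def pauli_channel_ptm(weights):
    """PTM of sum_j weights[j] * R_{P_j}; diagonal for ideal Pauli operations."""
    return np.diag(SIGNS.T @ weights)


def inverse_noise_quasiprobabilities(p):
    """Solve sum_j q_j R_{P_j} = N^{-1} for q, given Pauli error rates p."""
    eigenvalues = SIGNS.T @ p                 # diagonal of the noise PTM N
    return np.linalg.solve(SIGNS.T, 1.0 / eigenvalues)


p = np.array([0.997, 0.001, 0.001, 0.001])    # hypothetical Pauli error rates
q = inverse_noise_quasiprobabilities(p)
cost = np.abs(q).sum()                        # per-gate sampling overhead
print(q, cost)
```

The negative entries of `q` are what prevent a direct physical implementation, and the sum of their absolute values is the per-gate contribution to the sampling cost.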

### Benchmarking of the quantum error mitigation protocol

We benchmark the performance of the quantum error mitigation using a set of random computations, in the spirit of randomized benchmarking. Each specific computation starts with fully polarized initial states, $$\left|0\right\rangle$$ in the single-qubit case and $$\left|00\right\rangle$$ in the two-qubit case, and ends with measuring Z on each qubit. Between the SPAM, there is a sequence of randomly selected quantum gates. We note that the randomness in selecting the gate sequence is for the purpose of benchmarking the performance rather than correcting errors. For each specific computation, i.e., gate sequence, we apply the error mitigation and modify the gate sequence with random basis operations to correct errors. We remark that, for each specific computation, we observe the improvement on the computation accuracy by using the error mitigation.

For the single-qubit case, benchmarking computations are selected according to standard randomized benchmarking, i.e., a gate sequence of length L contains L computational gates and L + 1 interleaving identity or Pauli operations, uniformly drawn from the sets $$\{{X}\! _{\pm \! \frac{\pi }{2}},{Y}\! _{\pm \! \frac{\pi }{2}}\}$$ and $$\{I,{X}_{\pm \pi },{Y}_{\pm \pi },{Z}_{\pm \pi }\}$$, respectively. For each sequence length L, we choose four sequences whose ideal final states are eigenstates of the Pauli Z operator. We then repeatedly implement each of the sequences on a single trapped ion and measure the state fidelity between the ideal and experimentally prepared final states. In Fig. 4a, we show the dependence on the sequence length of the average fidelity without error mitigation, obtained by averaging the state fidelities over sequences of the same length. We numerically fit the average fidelity with an exponential function and obtain the error rate per single-qubit gate as $$(1.10\pm 0.12)\times {10}^{-3}$$.

To obtain an unbiased estimator of the expectation value, both the initial state and the 2L + 1 gates in the selected sequence need to be decomposed and resampled, where the initial state is replaced probabilistically by one of the states in $${{\mathcal{S}}}_{1}$$, and each experimental gate is followed by a random Pauli or identity operation drawn from $${{\mathcal{P}}}_{1}$$. Thus, for a specific computation with (2L + 1) gates, there are $${4}^{2L+2}$$ possible experimental settings. As the number of settings grows exponentially with the length of the random sequence, we use the Monte-Carlo method to compute the result by sampling random experimental settings, which are specified by an index i for the initial state $$\left|\left.{\rho }_{i}\right\rangle \right\rangle$$ and two (2L + 1)-entry index vectors $${\bf{a}}\equiv \left({a}_{1},\ldots ,{a}_{2L+1}\right)$$ and $${\bf{b}}\equiv \left({b}_{1},\ldots ,{b}_{2L+1}\right)$$ specifying the computation and the choices of the error-compensating operations. We note that for a specific computation, a is determined but b is random. The probability of an experimental setting $$\langle \langle {E}_{0}| {\prod }_{l=1}^{2L+1}{R}_{{P}_{{b}_{l}}}{R}_{{G}_{{a}_{l}}}| {\rho }_{i}\rangle \rangle$$, where $${G}_{{a}_{l}}\in {{\mathcal{G}}}_{1}$$ and $${P}_{{b}_{l}}\in {{\mathcal{P}}}_{1}$$, is $${C}^{-1}\left|{q}_{0,i}\left({\prod }_{l=1}^{2L+1}{q}_{{a}_{l},{b}_{l}}\right)\right|$$. Here, the rescaling factor $$C={\sum }_{i,\ldots ,\left({a}_{l},{b}_{l}\right),\ldots }\left|{q}_{0,i}\left({\prod }_{l=1}^{2L+1}{q}_{{a}_{l},{b}_{l}}\right)\right|\ge 1$$ characterizes the cost to mitigate the errors. The signs of the coefficients, i.e., $${\rm{sgn}}\left[{q}_{0,i}\left({\prod }_{l=1}^{2L+1}{q}_{{a}_{l},{b}_{l}}\right)\right]$$, are integrated into the measurement results of the random experiments (see Methods). In Fig. 4a, we present the error-mitigated single-qubit randomized benchmarking with length L up to 64 and show that the single-qubit gate error rate is effectively suppressed to $$(1.44\pm 5.28)\times {10}^{-5}$$.
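As a worked step, and assuming the sum defining C runs over the initial-state index i and the correction indices b for a fixed computation a, the absolute value of the product factorizes, so the cost can be evaluated without enumerating the exponentially many settings:

$$C=\left({\sum }_{i}\left|{q}_{0,i}\right|\right){\prod }_{l=1}^{2L+1}\left({\sum }_{b}\left|{q}_{{a}_{l},b}\right|\right).$$

If each decomposition is normalized (the quasi-probabilities of the initial state and of each gate sum to one, as expected for trace-preserving operations), every factor is at least 1 by the triangle inequality, which gives C ≥ 1 and shows that the cost grows multiplicatively with the number of gates.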

For the two-qubit case, we select four gate sequences as benchmarking computations for each length L. Each sequence contains L two-qubit gates uniformly drawn from the set {MSYY, MSZZ} with interleaving single-qubit gates32. The sequence is selected under the restriction that the ideal final state is an eigenstate of $${Z}^{\otimes 2}$$. Similar to the single-qubit case described above, we apply error mitigation to each of the two-qubit gate sequences with length L up to 6 and present the error-mitigated results in Fig. 4b, where the two-qubit gate error rate is effectively suppressed from $$(0.99\pm 0.06)\times {10}^{-2}$$ to $$(0.96\pm 0.10)\times {10}^{-3}$$.

## Discussion

Our work shows that, for the estimation of expectation values, the error mitigation technique, i.e., probabilistic error cancellation17,18,21, has the capacity to surpass the break-even point, where the effective gates are superior to their physical building blocks, at an affordable cost with respect to near-future quantum techniques. We note that error mitigation techniques are developed for intermediate-scale quantum computation. The cost of error mitigation increases with the circuit depth; therefore, techniques such as quantum error correction are still needed for large-scale fault-tolerant quantum computation. The effective infidelity after error mitigation comes from the Pauli error assumption, time-dependent systematic drift33 for both single-qubit and two-qubit cases, and crosstalk errors of single-qubit addressing operations for the two-qubit case (see Methods). Thus, further improvement requires both calibrating and stabilizing the quantum device. With technologies to tackle the crosstalk error, the probabilistic error-cancellation method of quantum error mitigation can be straightforwardly applied to systems with more qubits for realizing high-fidelity quantum computation.

## Methods

### Maximum-likelihood gate set tomography

To obtain the PTMs of all the gates in the gate set, we experimentally measure informationally complete data consisting of the average $${\bar{m}}_{ijk}$$ and variance $${\Delta }_{ijk}$$ of the expectation value $$\langle \langle {E}_{i}| {R}_{{G}_{j}}| {\rho }_{k}\rangle \rangle$$, which are obtained by repeating the corresponding experimental settings a sufficient number of times. We assume Pauli errors are dominant in our device, where each noisy quantum gate $${G}_{j}\in {{\mathcal{G}}}_{n}$$ is modeled as the ideal gate $${G}_{j}^{{\rm{id}}}$$ followed by a Pauli error channel. We use a maximum-likelihood estimation for the reconstruction of the PTMs of all the gates in the gate set, parameterized by the ansatz $${R}_{{G}_{j}}={N}_{j}{R}_{{G}_{j}}^{{\rm{id}}}$$, where $${N}_{j}={\sum }_{l}{p}_{j,l}{R}_{{P}_{l}}$$, with the gate-specific Pauli error rates $${p}_{j,l}$$ as variational parameters. With the ansatz for each gate, we calculate the ansatz prediction for the expectation value of each experimental setting, denoted as $${m}_{ijk}$$. We then define the following likelihood function25,

$${\mathcal{L}}=\prod _{i,j,k}\exp \left[-{({m}_{ijk}-{\bar{m}}_{ijk})}^{2}/{\Delta }_{ijk}^{2}\right],$$
(1)

which takes its maximum value when the experimental average values $${\bar{m}}_{ijk}$$ and the ansatz expectations $${m}_{ijk}$$ coincide with each other. Thus, the gate-specific Pauli error rates can be determined by maximizing the likelihood function, with which we construct the PTMs of the imperfect gates that are implementable in the quantum device.
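Maximizing the likelihood in Eq. (1) is equivalent to minimizing the weighted squared residuals in its exponent. The sketch below illustrates this for a single gate under strong simplifying assumptions that are not those of the experiment: SPAM is taken to be ideal, the states and observables are chosen for convenience, the Pauli operations in the noise model are ideal, and the data are synthetic and noise-free. In that case the predictions are linear in the Pauli error rates and the maximum is found by weighted least squares. All names and numbers are hypothetical.

```python
import numpy as np

# SIGNS[j] is the diagonal of the ideal PTM of Pauli P_j (order I, X, Y, Z).
SIGNS = np.array([[1, 1, 1, 1],
                  [1, 1, -1, -1],
                  [1, -1, 1, -1],
                  [1, -1, -1, 1]], dtype=float)
PAULI_PTMS = [np.diag(row) for row in SIGNS]

# Ideal probe states |0>, |1>, |+>, |+i> as PTM column vectors (Tr[P_i rho]).
STATES = np.array([[1, 0, 0, 1], [1, 0, 0, -1],
                   [1, 1, 0, 0], [1, 0, 1, 0]], dtype=float)
# Ideal observables I, X, Y, Z as PTM row vectors (Tr[E P_i] / 2).
OBSERVABLES = np.eye(4)


def ry_ptm(theta):
    """Ideal PTM of a Y-axis rotation by theta (assumed sign convention)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0, 0], [0, c, 0, s],
                     [0, 0, 1, 0], [0, -s, 0, c]], dtype=float)


def design_matrix(ideal_gate):
    """Rows (i, k), columns l: <<E_i| R_{P_l} R_G^id |rho_k>>."""
    design = np.array([[[e @ P @ ideal_gate @ s for P in PAULI_PTMS]
                        for s in STATES] for e in OBSERVABLES])
    return design.reshape(-1, 4)


def fit_pauli_rates(m_bar, delta, ideal_gate):
    """Maximize prod exp(-(m - m_bar)^2 / delta^2) with sum(p) = 1 imposed."""
    A = design_matrix(ideal_gate)
    w = 1.0 / delta.reshape(-1)
    lhs = (A[:, 1:] - A[:, [0]]) * w[:, None]     # p_0 eliminated
    rhs = (m_bar.reshape(-1) - A[:, 0]) * w
    p_rest, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return np.concatenate([[1.0 - p_rest.sum()], p_rest])


gate = ry_ptm(np.pi / 2)
p_true = np.array([0.99, 0.004, 0.003, 0.003])    # hypothetical error rates
m_bar = (design_matrix(gate) @ p_true).reshape(4, 4)   # synthetic data
delta = np.full((4, 4), 0.01)                     # hypothetical uncertainties
p_fit = fit_pauli_rates(m_bar, delta, gate)
print(p_fit)
```

When SPAM and fiducial errors are fitted jointly, as in the text, the predictions are no longer linear in the parameters and a numerical optimizer is needed instead of a single least-squares solve.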

### Characterization and decomposition of single-qubit gate set

We use gate set tomography to characterize the single-qubit operations. In the superoperator formalism, each experimental single-qubit operation $${R}_{{G}_{\mathrm{{s}}}}$$ can be described as an ideal 4 × 4 PTM followed by the PTM of a noise operation $${N}_{\mathrm{{s}}}$$. With the Pauli error assumption, each $${N}_{\mathrm{{s}}}$$ can be written as $${N}_{\mathrm{{s}}}={p}_{s,0}{R}_{I}^{{\rm{id}}}+{p}_{s,1}{R}_{X}^{{\rm{id}}}+{p}_{s,2}{R}_{Y}^{{\rm{id}}}+{p}_{s,3}{R}_{Z}^{{\rm{id}}}$$, where $${p}_{s,j}$$ are the Pauli error rates and $${\sum }_{j}{p}_{s,j}=1$$ for the trace-preserving condition. As there are 11 gates in $${{\mathcal{G}}}_{1}$$, $${{\mathcal{F}}}_{1}\subset {{\mathcal{G}}}_{1}$$, and the experimental initial state $${\rho }_{0}$$ can be characterized by 3 parameters, we need to obtain the values of 11 × 3 + 3 = 36 parameters. We run 3 × 11 × 4 different experimental settings specified by $$\langle \langle {E}_{0}| {R}_{{F}_{k}}{R}_{{G}_{j}}{R}_{{F}_{i}}| {\rho }_{0}\rangle \rangle$$ with 10,000 repetitions per setting to collect experimental data $${\bar{m}}_{ijk}$$, where i = 1, …, 4 for state preparation, j = 1, …, 11, and k = 1, 2, 3 for different measurement settings. The ansatz prediction $${m}_{ijk}=\langle \langle {E}_{0}| {N}_{{F}_{k}}{R}_{{F}_{k}}^{{\rm{id}}}{N}_{{G}_{j}}{R}_{{G}_{j}}^{{\rm{id}}}{N}_{{F}_{i}}{R}_{{F}_{i}}^{{\rm{id}}}| {\rho }_{0}\rangle \rangle$$ contains the Pauli error rates as variational parameters, which we numerically optimize to maximize the likelihood function in Eq. (1). The obtained PTMs are shown in Fig. 2b and Supplementary Fig. 1a.

Once we get experimental PTMs for single-qubit operations, we can derive the inverse of PTM of the noise operation as $${N}_{\mathrm{{s}}}^{-1}={R}_{{G}_{\mathrm{{s}}}}^{{\rm{id}}}{R}_{{G}_{\mathrm{{s}}}}^{-1}$$, which can be decomposed by the combination of PTMs of experimental Pauli operations with $${N}_{\mathrm{{s}}}^{-1}={q}_{s,0}{R}_{I}+{q}_{s,1}{R}_{X}+{q}_{s,2}{R}_{Y}+{q}_{s,3}{R}_{Z}$$. Then, the ideal operation can be decomposed by experimental operations as $${R}_{{G}_{\mathrm{{s}}}}^{{\rm{id}}}={q}_{s,0}{R}_{I}{R}_{{G}_{\mathrm{{s}}}}+{q}_{s,1}{R}_{X}{R}_{{G}_{\mathrm{{s}}}}+{q}_{s,2}{R}_{Y}{R}_{{G}_{\mathrm{{s}}}}+{q}_{s,3}{R}_{Z}{R}_{{G}_{\mathrm{{s}}}}.$$

### Characterization of the two-qubit gate set

The two-qubit gate set, i.e., $${{\mathcal{G}}}_{2}$$ includes single-qubit operations in $${{\mathcal{G}}}_{1}^{\otimes 2}$$ and two-qubit operations {MSYY and MSZZ}. As infidelities for the single-qubit gates are about an order lower than those of the two-qubit gates, it is reasonable to divide the maximum-likelihood estimation into two steps.

First, we treat each qubit in the two-qubit system as a single-qubit system and characterize the single-qubit gate set $${{\mathcal{G}}}_{1}$$ by gate set tomography, obtaining single-qubit PTMs. The two-qubit PTMs of the single-qubit operations in $${{\mathcal{G}}}_{1}^{\otimes 2}$$ are then obtained by a direct product of the single-qubit PTMs on both qubits. As the fiducial set $${{\mathcal{F}}}_{2}\subset {{\mathcal{G}}}_{1}^{\otimes 2}$$, the PTMs of the fiducial operations are determined at this step.

Second, we characterize the native two-qubit MSYY gate. Under the Pauli error assumption, the PTM of the experimental MSYY gate is decomposed as $${R}_{{{\rm{MS}}}_{YY}}={N}_{{{\rm{MS}}}_{YY}}{R}_{{{\rm{MS}}}_{YY}}^{{\rm{id}}}$$, where $${N}_{{{\rm{MS}}}_{YY}}$$ is the PTM of the Pauli error channel containing 16 two-qubit Pauli components. After considering the trace-preserving constraint, $${N}_{{{\rm{MS}}}_{YY}}$$ has 15 parameters, which are determined by linear equations connecting the ansatz prediction $$\langle \langle {E}_{0}^{\otimes 2}| {R}_{{F}_{k}}{N}_{{{\rm{MS}}}_{YY}}{R}_{{{\rm{MS}}}_{YY}}^{{\rm{id}}}{R}_{{F}_{i}}| {\rho }_{0}^{(1)}\otimes {\rho }_{0}^{(2)}\rangle \rangle$$ and the corresponding experimental results. To minimize the projection error, we choose 15 linearly independent equations out of 16 × 9 different settings, with most of the measured probabilities close to 0 or 1. Supplementary Fig. 3 shows the corresponding circuits for the experimental settings.

As the MSZZ is implemented by a MSYY gate sandwiched by proper single-qubit gates, the PTM of the experimental MSZZ gate is obtained by multiplying the PTMs of the corresponding experimental operations, i.e., $${R}_{{{\rm{MS}}}_{ZZ}}={R}_{{X}_{-\frac{\pi }{2}}\otimes {X}_{-\frac{\pi }{2}}}{R}_{{{\rm{MS}}}_{YY}}{R}_{{X}_{\frac{\pi }{2}}\otimes {X}_{\frac{\pi }{2}}}$$.
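The composition rule for the MSZZ gate can be checked on ideal PTMs. The following sketch assumes the conventions $${{\rm{MS}}}_{PP}={e}^{-i\frac{\pi }{4}P\otimes P}$$ and $${X}_{\theta }={e}^{-i\frac{\theta }{2}X}$$, which are not stated explicitly in the text, and uses hypothetical function names; it verifies that the product of the three ideal PTMs equals the ideal PTM of the ZZ interaction.

```python
import numpy as np
from itertools import product

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
# Two-qubit Pauli basis {I, X, Y, Z}^{(x)2}, 16 operators.
PAULIS_2 = [np.kron(a, b) for a, b in product([I2, X, Y, Z], repeat=2)]


def ptm(U):
    """Two-qubit PTM of a unitary: R_ij = Tr[P_i U P_j U^dagger] / 4."""
    Ud = U.conj().T
    return np.array([[np.trace(Pi @ U @ Pj @ Ud).real / 4 for Pj in PAULIS_2]
                     for Pi in PAULIS_2])


def rotation(P, theta):
    """exp(-i * theta * P / 2) for any operator P with P @ P = identity."""
    dim = P.shape[0]
    return np.cos(theta / 2) * np.eye(dim) - 1j * np.sin(theta / 2) * P


# Assumed conventions: MS_PP = exp(-i pi/4 P(x)P), X_theta = exp(-i theta X/2).
ms_yy = rotation(np.kron(Y, Y), np.pi / 2)
ms_zz = rotation(np.kron(Z, Z), np.pi / 2)
x_plus = np.kron(rotation(X, np.pi / 2), rotation(X, np.pi / 2))
x_minus = np.kron(rotation(X, -np.pi / 2), rotation(X, -np.pi / 2))

# The rightmost factor acts first, matching the operator ordering in the text.
composed = ptm(x_minus) @ ptm(ms_yy) @ ptm(x_plus)
print(np.allclose(composed, ptm(ms_zz)))
```

For the experimental gate the same product is taken with the characterized noisy PTMs in place of the ideal ones, so the errors of the four single-qubit rotations enter the MSZZ gate.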

### Probabilistic error-cancellation scheme

The concrete procedure of applying the probabilistic error cancellation to a given quantum computation task consists of the so-called characterization and calculation phases. The characterization phase is described above. In the calculation phase, we estimate expectation values of quantum circuits with the characterized imperfect quantum device. We first write down the unbiased estimator of the expectation value of a specific quantum circuit as $$\langle \langle {E}_{0}^{{\rm{id}}}| {R}_{{G}_{{a}_{L}}}^{{\rm{id}}}\ldots {R}_{{G}_{{a}_{1}}}^{{\rm{id}}}| {\rho }_{0}^{{\rm{id}}}\rangle \rangle$$, which can be expanded with the quasi-probability distributions obtained in the characterization phase as follows,

$$\langle \langle {E}_{0}^{{\rm{id}}}| {R}_{{G}_{{a}_{L}}}^{{\rm{id}}}\ldots {R}_{{G}_{{a}_{1}}}^{{\rm{id}}}| {\rho }_{0}^{{\rm{id}}}\rangle \rangle =\sum _{i}\sum _{{b}_{1},\ldots ,{b}_{L}}{q}_{0,i}{q}_{{a}_{1},{b}_{1}}\ldots {q}_{{a}_{L},{b}_{L}}\langle \langle {E}_{0}^{{\rm{id}}}| {R}_{{P}_{{b}_{L}}}{R}_{{G}_{{a}_{L}}}\ldots {R}_{{P}_{{b}_{1}}}{R}_{{G}_{{a}_{1}}}| {\rho }_{i}\rangle \rangle ,$$
(2)

where the expectation value $$\langle \langle {E}_{0}^{{\rm{id}}}| {R}_{{P}_{{b}_{L}}}{R}_{{G}_{{a}_{L}}}\ldots {R}_{{P}_{{b}_{1}}}{R}_{{G}_{{a}_{1}}}| {\rho }_{i}\rangle \rangle$$ can be obtained by repeating the specific experimental setting and averaging the measurement results. The straightforward way to evaluate the unbiased estimator is to sum over all possible settings. However, this is impractical, because the number of settings grows exponentially with the circuit depth. To alleviate the exponential growth, we rewrite the above expansion in terms of a probability distribution as follows,

$$\langle \langle {E}_{0}^{{\rm{id}}}| {R}_{{G}_{{a}_{L}}}^{{\rm{id}}}\ldots {R}_{{G}_{{a}_{1}}}^{{\rm{id}}}| {\rho }_{0}^{{\rm{id}}}\rangle \rangle ={C}_{{\bf{a}}}\sum _{i,{\bf{b}}}{P}_{{\bf{a}}}\left(i,{\bf{b}}\right)g\left(i,{\bf{a}},{\bf{b}}\right)\langle \langle {E}_{0}^{{\rm{id}}}| {R}_{{P}_{{b}_{L}}}{R}_{{G}_{{a}_{L}}}\ldots {R}_{{P}_{{b}_{1}}}{R}_{{G}_{{a}_{1}}}| {\rho }_{i}\rangle \rangle ,$$
(3)

with the short-hand notations $${\bf{a}}\equiv \left({a}_{1},\ldots ,{a}_{L}\right)$$ and $${\bf{b}}\equiv \left({b}_{1},\ldots ,{b}_{L}\right)$$, where $${C}_{{\bf{a}}}\equiv {\sum }_{i,{\bf{b}}}|{q}_{0,i}|{\prod }_{l}|{q}_{{a}_{l},{b}_{l}}|$$ is the rescaling factor, $${P}_{{\bf{a}}}\left(i,{\bf{b}}\right)=|{q}_{0,i}|{\prod }_{l}|{q}_{{a}_{l},{b}_{l}}|/{C}_{{\bf{a}}}$$ is the probability distribution, and $$g(i,{\bf{a}},{\bf{b}})={\rm{sgn}}({q}_{0,i}{\prod }_{l}{q}_{{a}_{l},{b}_{l}})$$ is the sign of the setting. Then, we use importance sampling to generate M experimental settings, specified by $$\left({i}_{m},{{\bf{b}}}_{m}\right)$$ with m = 1, …, M, according to the probability distribution $${P}_{{\bf{a}}}\left(i,{\bf{b}}\right)$$, and calculate the expectation value as follows,

$$\langle \langle {E}_{0}^{{\rm{id}}}| {R}_{{G}_{{a}_{L}}}^{{\rm{id}}}\ldots {R}_{{G}_{{a}_{1}}}^{{\rm{id}}}| {\rho }_{0}^{{\rm{id}}}\rangle \rangle =\frac{{C}_{{\bf{a}}}}{M}\sum _{m=1}^{M}g\left({i}_{m},{\bf{a}},{{\bf{b}}}_{m}\right)O\left({i}_{m},{\bf{a}},{{\bf{b}}}_{m}\right),$$
(4)

where $$O\left({i}_{m},{\bf{a}},{{\bf{b}}}_{m}\right)$$ is the result of the projective measurement of the m-th setting, being either 0 or 1 in our experiment.
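The sampling procedure of Eqs. (3) and (4) can be sketched with a small classical simulation. This is an illustrative toy model rather than the experimental protocol: SPAM is assumed ideal (so the initial-state index i is dropped), the correcting Pauli operations are assumed noise-free, every gate shares the same hypothetical Pauli error rates, and each sampled setting contributes its exact expectation value instead of single-shot outcomes.

```python
import numpy as np

rng = np.random.default_rng(0)

# SIGNS[j] is the diagonal of the ideal PTM of Pauli P_j (order I, X, Y, Z).
SIGNS = np.array([[1, 1, 1, 1],
                  [1, 1, -1, -1],
                  [1, -1, 1, -1],
                  [1, -1, -1, 1]], dtype=float)
PAULI_PTMS = [np.diag(row) for row in SIGNS]


def ry_ptm(theta):
    """Ideal PTM of a Y-axis rotation by theta (assumed sign convention)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0, 0], [0, c, 0, s],
                     [0, 0, 1, 0], [0, -s, 0, c]], dtype=float)


p = np.array([0.97, 0.01, 0.01, 0.01])        # hypothetical Pauli error rates
noise = np.diag(SIGNS @ p)                    # PTM of the noise operation N
q = np.linalg.solve(SIGNS, 1.0 / (SIGNS @ p)) # quasi-probabilities of N^{-1}
gamma = np.abs(q).sum()                       # per-gate cost

depth = 4                                     # four pi/2 rotations: identity
noisy_gate = noise @ ry_ptm(np.pi / 2)
rho0 = np.array([1.0, 0.0, 0.0, 1.0])         # |0><0| as a PTM vector
obs_z = np.array([0.0, 0.0, 0.0, 1.0])        # <<Z|


def run(corrections):
    """Expectation of Z when gate l is followed by Pauli corrections[l]."""
    v = rho0
    for b in corrections:
        v = PAULI_PTMS[b] @ noisy_gate @ v
    return obs_z @ v


C = gamma ** depth                            # rescaling factor C_a
M = 20000                                     # number of sampled settings
samples = rng.choice(4, size=(M, depth), p=np.abs(q) / gamma)
signs = np.prod(np.sign(q)[samples], axis=1)  # g(a, b_m)
values = np.array([run(b) for b in samples])
mitigated = C * np.mean(signs * values)       # estimator in the form of Eq. (4)
unmitigated = run([0] * depth)
print(unmitigated, mitigated)
```

In this toy model the unmitigated value is biased below the ideal value of 1, whereas the sampled estimator is centered on 1 with a spread that scales with C divided by the square root of M.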

### Simple example

In this section, we provide an illustrative example of applying the probabilistic error-cancellation technique to a simple quantum circuit. Suppose an experimenter plans to apply an ideal gate $${G}^{{\rm{id}}}\equiv [{e}^{-iY\frac{\pi }{4}}]$$ to an ideal initial state $${\rho }^{{\rm{id}}}\equiv \left|0\right\rangle \left\langle 0\right|$$ and obtain the ideal expectation value of the observable X, $${\left\langle X\right\rangle }^{{\rm{id}}}\equiv Tr[X{G}^{{\rm{id}}}({\rho }^{{\rm{id}}})]=1$$. However, as an example of a noisy quantum device, the actual initial state could be $$\rho =90 \% \left|0\right\rangle \left\langle 0\right|+10 \% \frac{I}{2}$$ and the actual gate could be $$G=80 \% {G}^{{\rm{id}}}+20 \% D$$, where $$D(\rho )=\frac{I}{2}$$. Then, the actual result is $$\left\langle X\right\rangle =Tr[XG(\rho )]=72 \%$$. With the error-cancellation procedure, the ideal initial state is decomposed as $${\rho }^{{\rm{id}}}=(\rho -10 \% \frac{I}{2})/90 \%$$ and the ideal gate is decomposed as $${G}^{{\rm{id}}}=(G-20 \% D)/80 \%$$. Expanding $$Tr[X{G}^{{\rm{id}}}({\rho }^{{\rm{id}}})]$$ with these two decompositions, the ideal expectation value can be obtained by $${\left\langle X\right\rangle }^{{\rm{id}}}=Tr[XG(\rho )]\times (1/72 \% )-Tr[XG(\frac{I}{2})]\times (10 \% /72 \% )-Tr[XD(\rho )]\times (20 \% /72 \% )+Tr[XD(\frac{I}{2})]\times (2 \% /72 \% )$$, where the last term is positive because it carries the product of two negative coefficients, and all four terms can be obtained by running the noisy quantum device. By computing each term on the noisy quantum device and substituting the results into the formula, we can obtain the ideal expectation value.

For a computation with multiple gates, the state preparation, measurement, and each gate can be treated in a similar way. Then, the formula for the ideal expectation value, i.e., a weighted summation of noisy computations, has a number of terms that grows exponentially with the gate number. Therefore, instead of evaluating each term, we compute the summation using the Monte-Carlo method.

In this example, we consider the depolarizing error model. The decomposition can be applied to general error models without correlations. The decomposition formula is obtained by inverting the noise. For the gate G, the noise is $$N=80 \% [I]+20 \% D$$, and $$G=N{G}^{{\rm{id}}}$$. The inverse of the noise is $${N}^{-1}=([I]-20 \% D)/80 \%$$. Then, the ideal gate is $${G}^{{\rm{id}}}={N}^{-1}G=(G-20 \% D)/80 \%$$.
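The numbers in this example can be checked directly with 2 × 2 density matrices. The sketch below uses hypothetical function names and takes the coefficient of the cross term as the product (−10%) × (−20%) = +2%, as follows from expanding the two decompositions; the map D is written with an explicit trace factor so that it is linear.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)

# Ideal gate exp(-i Y pi/4) and ideal initial state |0><0|.
U = np.cos(np.pi / 4) * I2 - 1j * np.sin(np.pi / 4) * Y
rho_id = np.array([[1, 0], [0, 0]], dtype=complex)


def G_id(rho):
    """Ideal gate as a channel."""
    return U @ rho @ U.conj().T


def D(rho):
    """Fully depolarizing map: D(rho) = Tr[rho] * I / 2."""
    return np.trace(rho) * I2 / 2


def G(rho):
    """Noisy gate: 80% ideal gate plus 20% depolarization."""
    return 0.8 * G_id(rho) + 0.2 * D(rho)


def expect_x(rho):
    return np.trace(X @ rho).real


rho = 0.9 * rho_id + 0.1 * I2 / 2             # noisy initial state
mixed = I2 / 2

ideal = expect_x(G_id(rho_id))
noisy = expect_x(G(rho))
mitigated = (expect_x(G(rho))
             - 0.10 * expect_x(G(mixed))
             - 0.20 * expect_x(D(rho))
             + 0.02 * expect_x(D(mixed))) / 0.72
print(ideal, noisy, mitigated)
```

Only the first term is nonzero for this particular observable, because X has zero expectation in the maximally mixed state; for a general observable all four terms contribute.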

### Analysis on residual errors

Theoretically, the error mitigation technique, combining probabilistic error cancellation and gate set tomography, is capable of completely rectifying the effect of errors in the estimation of expectation values. However, in our experiment, the effective error rates after error mitigation are $$(1.44\pm 5.28)\times {10}^{-5}$$ and $$(0.96\pm 0.10)\times {10}^{-3}$$ in the single-qubit and two-qubit cases, respectively. Generally speaking, the reasons for the residual errors include the Pauli error assumption, time-correlated systematic drift, and crosstalk errors between qubits.

In the single-qubit case, the residual errors mainly come from the introduction of the Pauli error model. To quantify the non-Pauli error rate, we simulate the dynamics of the same random sequences as those used in the experiment with the characterized experimental PTMs, which are obtained under the Pauli error assumption. The experimental and simulated data of average fidelity are shown in Supplementary Fig. 2a, which are then numerically fitted to extract the error rates. The difference between the simulated and experimental error rates for single-qubit gates is $$1.41\times {10}^{-5}$$, which is of the same order as the residual error rate in the single-qubit case. Meanwhile, the data show that the time-correlated systematic drift has a negligible effect and cannot be faithfully quantified within experimental and fitting errors.

In our experiment, we implement two different two-qubit gates, i.e., MSYY and MSZZ gates. To quantify the residual errors from the Pauli error assumption, we compare the dynamics of the simulated and experimental random two-qubit sequences, where the simulation is based on the characterized PTMs with the Pauli error assumption. The difference between the simulated and experimental error rates gives the estimate of the non-Pauli residual error rate, which is about $$0.20\times {10}^{-3}$$. As to the crosstalk errors, the situations for MSYY and MSZZ gates are quite different because of different implementation schemes. Specifically, a MSZZ gate is implemented by a MSYY gate sandwiched by proper single-qubit gates, which introduce qubit-crosstalk errors. We model the crosstalk effect by measuring an effective Rabi frequency $${\Omega }_{{\rm{eff}}}$$ on the neighboring ion induced by leakage laser intensities when a single-qubit gate is being implemented by lasers focused on one of the ions. The ratio $${\Omega }_{{\rm{eff}}}/\Omega$$, with Ω being the Rabi frequency of the target ion, thus quantifies the severity of crosstalk errors. As shown in Supplementary Fig. 4a, we numerically simulate the state fidelities of the original and error-mitigated MSYY and MSZZ gates. As expected, the numerical results show that MSYY gates, either original or error-mitigated ones, are insensitive to the crosstalk errors, whereas the fidelities of MSZZ gates degrade as the severity of crosstalk errors increases. According to the numerical results, the crosstalk residual error rate is about $$0.68\times {10}^{-3}$$ at the experimental level of qubit crosstalk. Finally, the remaining part of the residual error rate, $$0.08\times {10}^{-3}$$, is attributed to the time-correlated systematic drift.