Abstract
Spin qubits created from gate-defined silicon metal–oxide–semiconductor quantum dots are a promising architecture for quantum computation. The high single-qubit fidelities possible in these systems, combined with quantum error correcting codes, could potentially offer a route to fault-tolerant quantum computing. To achieve fault tolerance, however, gate error rates must be reduced to below a certain threshold and, in general, correlated errors must be removed. Here we show that pulse engineering techniques can be used to reduce the average Clifford gate error rates for silicon quantum dot spin qubits down to 0.043%. This represents a factor of three improvement over state-of-the-art silicon quantum dot devices and extends the randomized benchmarking coherence time to 9.4 ms. By including tomographically complete measurements in our randomized benchmarking, we infer a higher-order feature of the noise called the unitarity, which measures the coherence of noise. This, in turn, allows us to theoretically predict that average gate error rates as low as 0.026% may be achievable with further pulse improvements. These spin qubit fidelities are ultimately limited by incoherent noise, which we attribute to charge noise from the silicon device structure or the environment.
Main
The implementation of fault-tolerant quantum computing systems will require precise control of qubits and error rates below the tolerance requirement for quantum error correction. In particular, qubits must be manipulated, coupled and measured with error rates well below 1% (refs. ^{1,2}). Among semiconductor implementations, silicon quantum dot spin qubits have demonstrated average single-qubit Clifford gate error rates approaching this threshold^{3,4,5,6}, with error rates of 0.14% in isotopically enriched ^{28}Si/SiGe devices^{7}.
In these previous demonstrations, the gate fidelities were characterized using Clifford-based randomized benchmarking. Randomized benchmarking^{8,9,10,11,12,13} is the gold standard for quantifying the performance of quantum gates and can be used to efficiently obtain accurate estimates of the average gate fidelity, independently of state preparation and measurement (SPAM) errors. The standard method for randomized benchmarking, however, is based on measuring many random gate sequences and is therefore designed to provide only an average of the system and not any further details about the noise. To improve quantum gates further, information about the characteristics of the noise process, such as its frequency spectrum and its primary source (whether it comes from qubit interaction with the environment or control errors), would be useful. Quantum state tomography methods can provide such information but are generally inefficient and highly sensitive to SPAM errors. To overcome these challenges, variants of randomized benchmarking that quantify higher-order noise features, as well as the average gate fidelity, have been developed^{14,15,16}.
In an early example of this approach^{3}, the randomized benchmarking data, which demonstrated average Clifford gate fidelities of 99.59% in silicon metal–oxide–semiconductor (SiMOS) qubits, exhibited non-exponential decay features. These features were subsequently attributed to low-frequency detuning noise in the system^{17}. Thus, the randomized benchmarking approach can also provide details about the noise characteristics, which could be used to further reduce the gate infidelity. In particular, low-frequency noise can be addressed with pulse engineering techniques that exploit the quasi-static nature of the noise process. Such an approach could, in principle, lead to higher fidelities.
In this Article, we exploit recent developments in randomized benchmarking to give precise estimates of the average gate fidelity. We use pulse engineering techniques to increase the average Clifford gate fidelity of single-qubit gate operations from 99.83% to 99.96% on the same SiMOS quantum dot (Fig. 1). This increased gate fidelity represents a 3.2 times improvement compared with state-of-the-art silicon devices^{7}. In terms of coherence times, and compared to using standard square pulses, this leads to an improvement in randomized benchmarking coherence time \(T_2^{{\mathrm{RB}}}\) from 620 μs to 9.4 ms, an increase of 15 times. This improved coherence time is also 180 times longer than in state-of-the-art silicon devices (\(T_2^{{\mathrm{RB}}} = 52\,{\mathrm{\mu s}}\); ref. ^{7}); see Supplementary Table 1 for a detailed comparison with earlier work. Unlike filtering approaches based on dynamical-decoupling (DD) pulses such as Carr–Purcell–Meiboom–Gill (CPMG), which assume that the spin is in a certain state with no computational degree of freedom, the \(T_2^{{\mathrm{RB}}}\) gives a more practical benchmarking metric for qubits serving as memories. However, \(T_2^{{\mathrm{RB}}}\) is still much shorter than the spontaneous emission time of the qubit (T_{1} ≈ 1 s). Accordingly, the qubit performance is not limited by relaxation processes.
Furthermore, by using tomographic measurements in our randomized benchmarking, we are able to quantify the unitarity^{14} of the noise: a higher-order feature of the noise that quantifies the average change in the purity of a state, averaged over a given gate set (Fig. 2). Extracting the unitarity can help quantify the coherence in the noise independently of the error rate. By measuring both the unitarity and the average error rate, we can estimate how much of an experimental error budget is due to control errors and low-frequency noise and how much is due to uncorrectable decoherence. Our measurements demonstrate that the improved gate performance via pulse engineering is primarily due to the reduced unitary component of the noise, which also suggests that greater gains could potentially be made from further improvements in pulse engineering and control.
Pulse engineering and calibration
Figure 1 provides details of our qubit experiment and the shaped optimized pulse used. Our previous study into the cause of qubit gate infidelity for SiMOS qubits^{17} identified low-frequency drift in the qubit detuning as the dominant noise term. The timescale of this drift process is very long compared to the timescale for control of the qubit, enabling pulse engineering techniques to be used to identify compensating pulses for this noise term. Specifically, we use gradient ascent pulse engineering (GRAPE)^{18} to identify pulses for our qubit control that are robust against low-frequency detuning noise. This method uses a theoretical model for the noise, together with gradient ascent methods to identify (locally) optimal pulses.
We identified seven improved Clifford gate operators by using this procedure, as detailed in the Methods and Fig. 1c. The full set of 24 single-qubit Clifford gates can be achieved simply by phase-shifting one of the seven basic operators (manipulating the sign of Ω_{x} and Ω_{y} and/or swapping them). For example, a Y gate can be constructed by swapping Ω_{x} and Ω_{y} of the X gate.
Four different controllers, illustrated in Fig. 3, were used to ensure that the spin qubit environment and control parameters did not drift during the entire 35 h experiment (see Methods for full details). Calibration data in Fig. 3f suggest that the main source of noise, \(\epsilon _z\), comes from ^{29}Si nuclear spins, where the change of resonance frequency has a strong step-like behaviour and no clear correlation with charge rearrangement (Fig. 3b,d). Similar nuclear-spin-like behaviour has been observed in the same device while operating in two-qubit mode^{19}.
Qubit tomography
Figure 2a presents an example of a single-shot sequence during randomized benchmarking. The projection pulses that measure each spin projection are shown in Fig. 2b. They are designed in a way that can be easily multiplexed and have built-in echoing ability. To confirm that these projection pulses are able to correctly construct a robust density matrix, a tomographic Rabi chevron map was constructed, as shown in Fig. 2c,d, combined with the calibration technique described above. To make the correlation between XYZ axes clearer, the colour-coded density matrix map integrates all the spin projection maps into one. The measured and simulated data (non-fitted) in Fig. 2f appear nearly identical, apart from the fact that the simulated map has less background white noise. We can still see XY phase oscillation for both sets of data, even at far detuning (Δf_{ESR} > 3 MHz), where a normal Z projection-only Rabi map would appear to have no readout signal. The coloured Rabi chevron measurement serves as a strong validation of our tomographic readout, feedback control and microwave calibration, and confirms the data quality in our randomized benchmarking experiment.
Randomized benchmarking
We assessed the performance of our improved gates using randomized benchmarking^{8,9,10,11,12}. Randomized benchmarking and its variants are fully scalable protocols that allow for the partial characterization of quantum devices. Here we use a variant of randomized benchmarking to determine the average gate fidelity as well as the coherence (unitarity) of the noise^{14}. An overview of randomized benchmarking is provided in the Methods.
The results of our randomized benchmarking experiments, determining the average gate fidelities of both the original (square) pulses scheme S and the improved optimized pulses scheme O, are shown in Fig. 4. Both pulse schemes are performed for each of the measurement projections in an alternating manner, using the identical square projection pulses shown in Fig. 2b, with calibrations activated. For scheme S, this gave a measured randomized benchmarking decay factor (p) of 99.66(5)%, which equates to an average per-Clifford fidelity of 99.83(2)%. With scheme O, this resulted in a decay factor of 99.914(9)%, which equates to an average per-Clifford fidelity of 99.957(4)%, where the error indicates the 95% confidence levels. For comparison purposes we note that the literature often reports not only a Clifford gate fidelity but also a fidelity based on gate generators. Here we report only the fidelity returned by randomized benchmarking, namely the per-Clifford fidelity. The relevant comparison fidelities are therefore the 99.96% achieved here, compared to 99.86% (ref. ^{7}), 99.90% (ref. ^{20}) and 99.24% (refs. ^{3,17}). Fitting assumptions and methods are detailed in the Methods. Bayesian analysis was carried out, leading to the tight credible regions seen in Fig. 4a.
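The conversion from decay factor to per-Clifford fidelity quoted above can be checked with a short sketch. It assumes the standard randomized benchmarking relation F = 1 − (1 − p)(d − 1)/d, which the text does not spell out; the function name is illustrative only.

```python
def average_fidelity(p: float, d: int = 2) -> float:
    """Average gate fidelity from an RB decay factor p for a d-dimensional system.

    Assumes the standard relation F = 1 - (1 - p) * (d - 1) / d
    (an assumption here, not quoted from the text).
    """
    return 1.0 - (1.0 - p) * (d - 1) / d


if __name__ == "__main__":
    # Decay factors reported for the square (S) and optimized (O) schemes.
    for label, p in (("scheme S", 0.9966), ("scheme O", 0.99914)):
        print(f"{label}: p = {p}, F = {average_fidelity(p):.5f}")
```

For a single qubit (d = 2) the relation reduces to F = (1 + p)/2, which reproduces the quoted 99.83% and 99.957% from the two decay factors.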
Unitarity and coherence
The data obtained from the tomographic measurements in our randomized benchmarking experiments allow us to determine the unitarity, which is a higher-order feature of the noise afflicting the system^{14}. The unitarity can be used to distinguish ‘unitary’ errors, which may arise, for example, from control errors and/or low-frequency noise, from stochastic errors (which are generally associated with high-frequency noise). The low-frequency noise can be treated as a non-stochastic error based on the assumption that the noise in each single shot is identical during each tomography sequence time, which is the time in which 120 single shots are measured in order to reconstruct a single density matrix.
For a system of dimension d, the unitarity is defined as an integral of pure states (ψ) over the Haar measure:
\[u({\cal E}) = \frac{d}{{d - 1}}\int {\mathrm{d}}\psi \,{\mathrm{Tr}}\left[ {{\cal E}^{\prime} (\psi )^\dagger {\cal E}^{\prime} (\psi )} \right],\quad {\cal E}^{\prime} (\psi ) = {\cal E}(\psi ) - {\mathrm{Tr}}[{\cal E}(\psi )]\frac{\mathbb{1}}{d} \tag{1}\]
and provides a measure to characterize the noise within the range 0–1: completely coherent noise corresponds to where the unitarity achieves its maximum value of 1; completely depolarizing noise corresponds to its minimum value. The minimum value depends on the fidelity and we have \(u({\cal E}) \ge [1 - \frac{{dr}}{{(d - 1)}}]^2\), which is saturated by a completely depolarizing channel with average infidelity r = 1 − F. Another way to think of unitarity is to note its equivalence to the averaged squared length of the generalized Bloch vector after applying \({\cal E}\) with the component due to the identity subtracted. We can define a new quantity, the incoherence^{16}, which is related to the unitarity as follows:
\[\omega ({\cal E}) = \frac{{d - 1}}{d}\left( {1 - \sqrt {u({\cal E})} } \right) \tag{2}\]
The incoherence is defined so that it takes a maximum value given by the infidelity and a minimum value of 0 (purely coherent noise), so that \(0 \le \omega ({\cal E}) \le r({\cal E})\). The value of the incoherence represents the minimum infidelity that might be achievable if one had perfect unitary control over the system. The incoherence (when compared to the infidelity) directly gives an indication of the amount of the infidelity that can be attributed to incoherent (statistical) noise sources. See Fig. 5d for a geometrical comparison between infidelity and incoherence.
The incoherence can therefore be used to estimate useful information about the type of noise afflicting the system, as well as to provide a guide as to how much improvement in fidelity can be achieved by correcting purely coherent errors (such as over-rotations). Furthermore, it can be used to provide tighter bounds on the likely diamond distance of the average noise channel^{21,22} and to reduce uncertainty in the interleaved benchmarking protocol^{23} (Fig. 4b).
The incoherence (equation (2)) for scheme S is 0.53(10) × 10^{−3} and for scheme O is 0.25(08) × 10^{−3}. Using the incoherence allows a direct comparison with the reported infidelities (Fig. 4a). With scheme S the incoherence is ~30% of the infidelity and for the improved pulses 61% of the infidelity. Two conclusions can be drawn from this. First, the data provide strong, quantitative evidence that the improved pulses have reduced the errors on the gates primarily by reducing coherent errors. Second, we observe that the infidelity for the optimized pulses is below the incoherence for the square pulses, and that there are still coherent errors in the improved gates. Therefore, by using scheme O we have not only improved our unitary control but have also reduced the incoherent noise in the system.
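The relation between unitarity, incoherence and infidelity can be explored numerically with the sketch below. It assumes the incoherence takes the form ω = (d − 1)/d · (1 − √u), which is consistent with the bounds 0 ≤ ω ≤ r and with the depolarizing limit u = [1 − dr/(d − 1)]² stated above; the function names are hypothetical.

```python
import math


def depolarizing_unitarity(r: float, d: int = 2) -> float:
    """Lower bound on the unitarity at infidelity r (completely depolarizing noise)."""
    return (1.0 - d * r / (d - 1)) ** 2


def incoherence(u: float, d: int = 2) -> float:
    """Incoherence from unitarity, assuming omega = (d - 1) / d * (1 - sqrt(u))."""
    return (d - 1) / d * (1.0 - math.sqrt(u))


def incoherent_fraction(omega: float, r: float) -> float:
    """Share of the infidelity r that is attributed to incoherent noise."""
    return omega / r


if __name__ == "__main__":
    # Infidelities (1 - F) and incoherences as reported for the two schemes.
    reported = {"scheme S": (1.7e-3, 0.53e-3), "scheme O": (0.43e-3, 0.25e-3)}
    for label, (r, omega) in reported.items():
        print(f"{label}: omega / r = {incoherent_fraction(omega, r):.2f}")
```

A purely depolarizing channel gives ω = r (all of the infidelity is incoherent), while a purely unitary error gives ω = 0, so the ratio ω/r indicates how much room remains for improved unitary control.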
Figure 4c presents an intuitive explanation of these results. The shaded regions in each row correspond to the effective frequency of noise that can couple into the pulsing schemes, where different schemes act as different noise filters over their fidelities and coherences. Scheme O minimizes the effect of noise on timescales greater than 8 μs, decreasing the infidelity of the system and the coherence of the remaining infidelity. The small trade-off, however, is that the pulse-optimized gates are slightly more susceptible to higher-frequency noise, up to the bandwidth of the pulse, leaving us with some coherent noise. Because the noise spectrum generally follows a 1/f trend, with less noise power at higher frequencies, the trade-off is worthwhile. Finally, near-d.c. imperfections such as miscalibrations and microwave phase errors will also contribute to degrade the fidelity, but with lesser impact on its incoherence.
Conclusion
We have shown that the unitarity can be used to characterize noise and also as a tool to increase gate fidelities. The Clifford gate fidelities for a single qubit reported here (99.957%) are higher than those of other contemporary electronic platforms for scalable quantum computers, such as donor spin qubits (99.90%)^{20} and superconducting qubits (99.89%)^{24}. (For this comparison, we have normalized all reported fidelities to per-Clifford fidelities, rather than fidelities for non-composite pulses.) In addition, these fidelities have the potential to reach the level of atomic spin platforms such as trapped ion qubits (99.993%)^{25} and nitrogen-vacancy centre qubits in diamond (99.995%)^{26}. Specifically, the data indicate that, with the improved pulses, the fidelity of the gates could be as high as 99.974% if perfect unitary control can be achieved. Furthermore, if combined with suitable qubit driving (such as the π-pulse time of 120 ns achieved in previous experiments^{7}), the long \(T_2^{{\mathrm{RB}}}\) reported here would lead to a control fidelity exceeding 99.998% in silicon.
Methods
Stochastic GRAPE
We model our qubit system using the Hamiltonian \(H = \Omega _x\sigma _x + \Omega _y\sigma _y + \epsilon _z\sigma _z\), where Ω_{x}/Ω_{y} are the I/Q (in-phase/quadrature) microwave amplitudes, and \(\epsilon _z\) is a fixed (d.c.) random variable representing the Z detuning for a single pulse sequence.
The amplitudes Ω_{x} and Ω_{y}, as functions of time, are the two controls available that define our shaped microwave pulse. In each iteration of GRAPE, we calculate the derivative \(\frac{{\delta \Psi }}{{\delta \Omega }}\) of the target operator fidelity Ψ with respect to each sample point of Ω_{x} and Ω_{y}, and update them accordingly to maximize Ψ. Our GRAPE implementation is stochastic, sampling \(\epsilon _z\) on every iteration from a Gaussian distribution of noise strength \(\frac{1}{{2T_2^ \ast }} = 16.7\,{\mathrm{kHz}}\), where \(T_2^ \ast = 30\,{\mathrm{\mu s}}\). (Note that \(\epsilon _z\) is constant within a single iteration.) In our search for improved pulses, we constrain the maximum pulse length to 8 μs, four times longer than a square π pulse. The amplitude of each pulse is also constrained by \(\Omega _x^2 + \Omega _y^2 \le \Omega _{{\mathrm{max}}}^2\), where \(\Omega _{{\mathrm{max}}} = \frac{1}{{2T_\pi }} = \frac{1}{{2 \times 1.75\,\mu {\mathrm{s}}}} = 285.7\,{\mathrm{kHz}}\) is the maximum allowed effective B_{1} amplitude.
Each optimized pulse is constructed from 800 Ω samples at a sample interval of 10 ns, giving a total length of 8 μs.
For a given, small, learning factor η, a single iteration step can be written as follows:

(1) Randomize \(\epsilon _z\)
(2) Calculate \(\frac{{\delta \Psi }}{{\delta \Omega }}\) for all Ω pointwise, with the current Hamiltonian H
(3) Update \(\Omega \to \Omega + \eta \frac{{\delta \Psi }}{{\delta \Omega }}\)
(4) Filter Ω for smoothness and bound condition \(\Omega _{{\mathrm{max}}}^2 \ge \Omega _x^2 + \Omega _y^2\)
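The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the MATLAB implementation used for the experiment: frequencies are expressed in MHz and times in μs, a segment propagator is taken as exp[−iπ Δt (Ω_x σ_x + Ω_y σ_y + ε_z σ_z)] so that a square pulse at Ω_max performs a π rotation in T_π, the gradient is the usual first-order GRAPE approximation, the smoothness filter is a three-point moving average, and the demonstration uses 80 samples of 0.1 μs rather than 800 samples of 10 ns. All function names, the learning factor and the iteration count are hypothetical and untuned.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def step_unitary(wx, wy, ez, dt):
    """Propagator exp(-i*pi*dt*(wx*SX + wy*SY + ez*SZ)) in closed form."""
    norm = np.sqrt(wx**2 + wy**2 + ez**2)
    if norm == 0.0:
        return I2.copy()
    axis = (wx * SX + wy * SY + ez * SZ) / norm
    angle = np.pi * dt * norm
    return np.cos(angle) * I2 - 1j * np.sin(angle) * axis


def fidelity_and_gradient(omega, ez, dt, target):
    """Return Psi = |Tr(target^dag U)|^2 / 4 and its first-order gradient.

    omega has shape (n, 2): columns are the Omega_x and Omega_y samples.
    """
    n = len(omega)
    steps = [step_unitary(wx, wy, ez, dt) for wx, wy in omega]
    fwd = [I2]                      # fwd[k] = product of the first k segments
    for u in steps:
        fwd.append(u @ fwd[-1])
    bwd = [I2]                      # bwd[j] = product of the last j segments
    for u in reversed(steps):
        bwd.append(bwd[-1] @ u)
    overlap = np.trace(target.conj().T @ fwd[-1])
    grad = np.zeros((n, 2))
    for k in range(n):
        left = target.conj().T @ bwd[n - 1 - k]
        for j, sigma in enumerate((SX, SY)):
            d_overlap = np.trace(left @ (-1j * np.pi * dt * sigma) @ fwd[k + 1])
            grad[k, j] = 0.5 * np.real(np.conj(overlap) * d_overlap)
    return abs(overlap) ** 2 / 4.0, grad


def grape_step(omega, rng, dt, target, eta, sigma_ez, omega_max):
    """One stochastic GRAPE iteration following steps (1) to (4)."""
    ez = rng.normal(0.0, sigma_ez)                              # (1) randomize detuning
    _, grad = fidelity_and_gradient(omega, ez, dt, target)      # (2) gradient
    omega = omega + eta * grad                                  # (3) ascent update
    kernel = np.array([0.25, 0.5, 0.25])                        # (4) assumed smoothing filter
    omega = np.column_stack(
        [np.convolve(omega[:, j], kernel, mode="same") for j in range(2)]
    )
    amp = np.hypot(omega[:, 0], omega[:, 1])
    scale = np.minimum(1.0, omega_max / np.maximum(amp, 1e-30))  # amplitude bound
    return omega * scale[:, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, dt = 80, 0.1                      # 8 us pulse, coarser than in the text
    omega_max = 1.0 / (2 * 1.75)         # MHz, i.e. 285.7 kHz
    sigma_ez = 1.0 / (2 * 30.0)          # MHz, i.e. 16.7 kHz
    omega = 0.1 * omega_max * rng.standard_normal((n, 2))
    for _ in range(200):
        omega = grape_step(omega, rng, dt, SX, 0.05, sigma_ez, omega_max)
    print("X-gate fidelity at zero detuning:", fidelity_and_gradient(omega, 0.0, dt, SX)[0])
```

Because a fresh ε_z is drawn on every iteration, the update follows a noisy estimate of the gradient of the detuning-averaged fidelity, which is what steers the search towards pulses that are robust against quasi-static detuning rather than optimal for a single detuning value.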
The pulse optimization can perform roughly 100 iterations per second with MATLAB; within a few minutes, solutions with infidelities close to those of the pulses in Fig. 1c can be found. Here, we optimize seven basic Clifford gate operators using the GRAPE method described above. These basic gates can be expanded to the complete group of 24 Clifford gates by phase-shifting one of the seven basic operators (manipulating the sign of Ω_{x} and Ω_{y}, and/or swapping them). For example, a Y gate can be constructed by swapping Ω_{x} and Ω_{y} of the X gate. Figure 1c shows the optimized Clifford gates that were found and used for the randomized benchmarking experiment. The normal square pulses in black are plotted in the same scale for comparison. See the Supplementary Information for analysis of the expected performance of these gate pulses.
Experimental setup
The device being measured is the same as that described in refs. ^{19,27}, fabricated on an isotopically enriched 900 nm ^{28}Si epilayer^{28} with an 800 ppm residual concentration of ^{29}Si with multilevel gate-stack silicon MOS technology^{29,30}. The measurements were conducted in a wet dilution refrigerator with base temperature of T = 20 mK. Stanford Research Systems SIM928 rechargeable isolated voltage sources were used to supply all the d.c. voltages, and a LeCroy ArbStudio 1104 arbitrary waveform generator (AWG) was combined with the d.c. voltages through a resistive voltage divider (1/5 for d.c., 1/25 for AWG). The shaped microwave pulses were delivered by an Agilent E8267D vector signal generator; we used its own internal AWG for IQ modulation. The SET current signals were detected by a FEMTO transimpedance amplifier DLPCA-200 and finally acquired using an Alazar ATS9440 waveform digitizer with a PCIe interface.
Supplementary Fig. 3 presents the stability diagram and read/control point for the qubit. Notice that there is a faint horizontal transition that shows there is a quantum dot sitting under G2; this is being controlled in our other work on two-qubit randomized benchmarking^{19}.
Randomized benchmarking
Randomized benchmarking sequence
We perform the randomized benchmarking experiment using the methods described earlier. The results are shown in the main text (Fig. 4) and in Fig. 5. We also present the data as follows: for every measurement acquisition of a randomized benchmarking sequence, we obtain a density matrix that is reconstructed via 120 single shot spin readouts with tomographic measurement (see main text). The density matrix can be rotated in such a way that its expected final state would have aligned to spin up (+Z), followed by removing the XY phase angle while maintaining its magnitude. This produces a realigned partial density matrix map that is colour encoded in Fig. 5a according to the colour semicircle in Fig. 5d. The maps are grouped in different interleaved gates and contain the complete measurement data set for every single acquisition that is studied in this Article, before any averaging and analysis take place. To present the measurement data in as raw a form as possible, other than the realigned phase information being taken away (because it only has trivial physical meaning in a randomized benchmarking experiment), no other corrections including SPAM error renormalization were performed. The colour point that has higher brightness means the measured final state from a randomized benchmarking sequence has higher fidelity. If the colour red is mixed in the data point, this suggests unitary errors have occurred that may result in a measurement having low fidelity but a high coherence/unitarity (analysis in Fig. 4). Note that there is no colour saturation in Fig. 5a, meaning there are no data compression losses unless through the limitations of the viewing/printing device for this Article. However, the colour semicircle at high coherence/visibility is saturated, but no experimental data points lie within those regions. The grey boxes at the top right corner for each map in Fig. 5a are unperformed data points due to early termination of the measurement.
Figure 5b,c describes how the whole randomized benchmarking experiment is stepped through in time, and the numbers in the figure represent the order of stepping. To begin, note that the Clifford gate sequences in every single data point shown in Fig. 5a are re-randomized and different. Processes (1) (square) and (2) (optimized) step through the different interleaved gates in Fig. 5b. We start from the standard square pulse reference (no interleaved gate) with a randomized Clifford gate sequence. Once tomographic readout acquisition is done, we move on to the next interleaved gate, I, and regenerate a new randomized Clifford gate sequence with the same sequence length, m, which takes about 1.7 s (at short m). Process (1) concludes once the acquisition for the last interleaved gate, −Y/2, is completed; the same measurement is then repeated with the GRAPE-optimized pulses, referred to as process (2). At the end of (1) + (2), the frequency and power calibration runs to adjust the qubit environment (see ‘Feedback and calibration’). A total of 16 acquisitions are cycled through (eight interleaved gates and two types of Clifford gate pulse) and this takes about 40 s (at short m) including the calibration. When the interleaved gate cycle is done, we move to Fig. 5c where process (3) starts. Process (3) is a simple five-repetition sequence of (1) + (2) + calibration; this is repeated on the y axis in Fig. 5a and takes ~3 min (at short m) to complete. Process (4) changes m after completion of process (3), stepping through [1–6, 8, 10, 13, 16, 20, 25, 32, 40, 50, 63, 79, 100, 126, 158, 200, 251, 316, 398, 501, 631, 794, 1,000, 1,259, 1,585, 1,995, 2,512, 3,162] sequentially, a total of 33 steps, as shown on the x axis of Fig. 5a, and takes ~250 min to complete. Finally, process (5) repeats all the above a total of nine times and stacks up on the y axis of Fig. 5a, with a final product of 45 rows.
The complete measurement can be expressed as process stack ((1) + (2) + calibration) × (3) × (4) × (5), and lasts for 35 h.
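The bookkeeping of this process stack can be reproduced with a few lines of Python. The sketch relies on an observation that is not stated in the text: the listed sequence lengths coincide with the distinct rounded values of 10^(k/10) for k = 0 to 35, that is, ten roughly logarithmic steps per decade. The function name is illustrative.

```python
def sequence_lengths(max_tenths: int = 35):
    """Distinct rounded values of 10**(k/10) for k = 0..max_tenths (assumed rule)."""
    return sorted({round(10 ** (k / 10)) for k in range(max_tenths + 1)})


# Sequence lengths m as listed for process (4).
LISTED_LENGTHS = [
    1, 2, 3, 4, 5, 6, 8, 10, 13, 16, 20, 25, 32, 40, 50, 63, 79, 100, 126,
    158, 200, 251, 316, 398, 501, 631, 794, 1000, 1259, 1585, 1995, 2512, 3162,
]

ACQUISITIONS_PER_CYCLE = 8 * 2   # eight interleaved gates x two pulse schemes
REPEATS_PER_LENGTH = 5           # process (3)
OUTER_REPEATS = 9                # process (5)

rows = REPEATS_PER_LENGTH * OUTER_REPEATS
total_acquisitions = (
    ACQUISITIONS_PER_CYCLE * REPEATS_PER_LENGTH * len(LISTED_LENGTHS) * OUTER_REPEATS
)

if __name__ == "__main__":
    print("lengths match assumed rule:", sequence_lengths() == LISTED_LENGTHS)
    print("rows:", rows, "total acquisitions:", total_acquisitions)
```

With 33 sequence lengths, five inner repetitions and nine outer repetitions, the stack yields the 45 rows described above and 23,760 tomographic acquisitions in total.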
Eliminating the nuisance parameter B
We note that using tomographic measurements also allows a variation of the RB protocol similar to variations previously discussed in the literature^{8,9,10,17,20}. For any particular sequence, the tomographic measurements at the end of the sequence include not only a measurement that corresponds to the expected ‘maximal-overlap’ measurement of the state, but also one that corresponds to a ‘minimal-overlap’ measurement. This ‘minimal-overlap’ measurement can be included by replacing \(\bar q(m,s)\) with \(1 - \bar q(m,s)\) for each such measurement and combining the result into the average estimate of the survival probability for each sequence length, m. If this is done, the constant B is mapped to (B + (1 − B))/2 = 1/2. This removal of the SPAM parameter B leaves only two free parameters with which to fit the data, leading to tighter credible regions for the parameter of interest (p).
The randomized benchmarking procedure described above was carried out for 33 different sequence lengths of m (Fig. 4). The survival percentage for each sequence of a particular length was averaged (as discussed above) and a weighted least-squares nonlinear fit was performed to the data, using Supplementary equation (2), with B set to 0.5. The data points were weighted by the inverse variance of the observed data at a particular m.
To take into account possible gate-dependent noise, the nonlinear fit to the data was reanalysed, this time ignoring m of less than four (these are the only m that are likely to be noticeably affected by gate-dependent noise)^{13}, with no significant impact on the results. To finalize the analysis, QInfer^{31,32} was used to analyse the data using Bayesian techniques (a sequential Monte Carlo estimation) of the parameter p. As can be seen in Fig. 4b, the credible region found is in accordance with the least-squares fit methods. This provides an indication as to the correctness of the model, which might not be the case if the system were still impacted by low-frequency noise^{33}. Finally, we note that the use of repeat sequences complicates the analysis surrounding the use of least-squares estimates and the Bayesian techniques used by QInfer. However, using bootstrapping methods on the data confirms the robustness of the estimates.
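A weighted fit of this kind might look like the sketch below. Supplementary equation (2) is not reproduced here, so the code assumes the standard zeroth-order decay model q(m) = A p^m + B with B fixed at 0.5; the helper names and starting values are hypothetical, and SciPy's `curve_fit` stands in for whatever fitting routine was actually used.

```python
import numpy as np
from scipy.optimize import curve_fit


def fold_minimal_overlap(q_min):
    """Map 'minimal-overlap' survival estimates onto the 'maximal-overlap' scale."""
    return 1.0 - np.asarray(q_min, dtype=float)


def rb_model(m, a, p):
    """Assumed decay model A * p**m + B with the SPAM offset B fixed at 0.5."""
    return a * p ** np.asarray(m, dtype=float) + 0.5


def fit_decay(m, survival, variance):
    """Weighted nonlinear least squares; weights are the inverse variances."""
    popt, pcov = curve_fit(
        rb_model,
        np.asarray(m, dtype=float),
        np.asarray(survival, dtype=float),
        p0=(0.4, 0.999),                       # illustrative starting point
        sigma=np.sqrt(np.asarray(variance, dtype=float)),
        bounds=([0.0, 0.0], [1.0, 1.0]),
    )
    return popt, pcov


if __name__ == "__main__":
    # Synthetic, noise-free data for illustration only (not measured values).
    m = np.array([1, 2, 4, 8, 16, 32, 63, 126, 251, 501, 1000, 1995, 3162])
    data = rb_model(m, 0.45, 0.99914)
    (a_fit, p_fit), _ = fit_decay(m, data, np.ones_like(data))
    print(f"A = {a_fit:.4f}, p = {p_fit:.5f}")
```

Passing the standard deviations through `sigma` makes `curve_fit` minimize the residuals weighted by inverse variance, matching the weighting described above; fixing B removes one free parameter, which is what tightens the estimate of p.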
Determining the unitarity from tomographic measurements
The tomographic measurements allow the unitarity of the average noise channel to be measured^{14}. The protocol is similar to a randomized benchmarking experiment, except that no inverting gate is applied and the resulting state is best measured as an average over the non-identity Pauli operators, known as the purity measurement. For a single qubit this can be accomplished by measuring \({\cal Q} = \left\langle {S_x} \right\rangle ^2 + \left\langle {S_y} \right\rangle ^2 + \left\langle {S_z} \right\rangle ^2\), where each expectation value is taken with respect to the state in question. The projective measurements carried out by the tomography allow us to make numerical estimates for each of the components of the purity measurement and thus for \({\cal Q}\). Then, using the techniques discussed above, this is fit to a curve of the form \({\cal Q}(m) = A + Bu({\cal E})^{(m - 1)}\), where \(u({\cal E})\) is the unitarity and A and B are parameters that absorb SPAM noise.
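The purity measurement and its decay model can be written out as a short sketch. It assumes the expectation values are normalized as Pauli expectations (so that a pure state gives Q = 1) and that each component is estimated from the spin-up probability along the corresponding axis; the text does not specify either convention, and the function names are illustrative.

```python
import numpy as np


def bloch_from_up_probabilities(p_x, p_y, p_z):
    """Bloch vector from spin-up probabilities along X, Y and Z: <sigma> = 2P - 1."""
    return 2.0 * np.array([p_x, p_y, p_z], dtype=float) - 1.0


def purity_measure(bloch):
    """Q = <S_x>^2 + <S_y>^2 + <S_z>^2 in the assumed Pauli normalization."""
    return float(np.dot(bloch, bloch))


def purity_decay(m, a, b, u):
    """Decay model Q(m) = A + B * u**(m - 1), with u the unitarity."""
    return a + b * u ** (np.asarray(m, dtype=float) - 1.0)


if __name__ == "__main__":
    # A state polarized along +X, and a completely mixed state.
    print(purity_measure(bloch_from_up_probabilities(1.0, 0.5, 0.5)))
    print(purity_measure(bloch_from_up_probabilities(0.5, 0.5, 0.5)))
```

Because Q depends only on the length of the Bloch vector, it is insensitive to which direction a unitary error has rotated the state, which is why its decay rate isolates the incoherent part of the noise.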
Feedback and calibration
We implemented four different controllers to ensure that the spin qubit environment and parameters do not drift. The first two controllers are responsible for the spin-to-charge readout process, and the other two for the Hamiltonian coefficients. Figure 3 shows how all four controllers and their respective parameters change throughout the whole 35 h of the randomized benchmarking experiment. Figure 3a is the schematic of the circuitry controlling the sensor current I_{sensor}. The difference between I_{sensor} and the desired sensing point I_{ref} is passed through a gain of β, and fed back into V_{TG}. This controller ensures the sensing signal I_{sensor} is always sitting on the most sensitive point for blip detection. Figure 3c presents a schematic of the circuitry controlling the dark blip count, blip_{dark}. The dark blip count refers to the excess blips that occur even when the qubit spin is down, gathered as the blip detection count in the latter half of the readout time window. Dark counts usually occur when the readout level is not biased in the middle of the Zeeman splitting energy. A dark blip count that is too high or too low may cause the readout visibility to become saturated and will affect the analysis of the randomized benchmarking decay rate. Here we set blip_{ref} to 0.16 for maximum readout visibility for the controller. The above two controllers are automatically applied by doing extra analysis of I_{sensor} traces for each acquisition (a single digitizer data transfer of collective single-shot traces of I_{sensor}), and do not require additional adjustment. The next two controllers require interleaved measurements that are independent of the randomized benchmarking sequence. These are done periodically, after every 16 acquisitions. Here, we can modify our Hamiltonian into
\[H = \Omega _{{\mathrm{ESR}}}\Omega _{{\mathrm{drift}}}(\Omega _x\sigma _x + \Omega _y\sigma _y) + (\epsilon _z + f_{{\mathrm{ESR}}})\sigma _z\]
where f_{ESR} is the ESR centre frequency adjustment, which enters as an additional coefficient of σ_{z}. This can cancel the effect of the detuning noise offset, \(\epsilon _z\). Ω_{drift} is the effective physical ESR amplitude multiplier, which drifts over time and is balanced by Ω_{ESR} through the controller to maintain the relation Ω_{ESR}Ω_{drift} = 1. We also have the relation \(\Omega^{\prime}_{x,y} = \Omega _{{\mathrm{ESR}}}\Omega _{x,y}\), which is shown in Fig. 1a. f_{ESR} is updated by measuring the difference between two control sequences, as shown in Fig. 3e. One sequence is X/2 → Y/2 with a 0.2 μs gap, while the other has the Y/2 changed to −Y/2. The two calibration sequences would have equal spin-up probability (close to 0.5) if no resonance frequency offset exists, and will have different probabilities if \(f_{{\mathrm{ESR}}} + \epsilon _z \ne 0\), regardless of other SPAM errors. We then take the spin-up probability difference of these two sequences and feed this back into f_{ESR} with a certain stable gain, so that the controller enforces \(f_{{\mathrm{ESR}}} + \epsilon _z\sim 0\); this works because \(\epsilon _z\) drifts very slowly over the calibration period (in the range of minutes). Similarly, after calibrating f_{ESR} we perform another calibration sequence pair (shown in Fig. 3g), where the first has X/2 repeated 32 times, followed by another X/2 at the end, and the second has −X/2 at the end instead. Given that the \(f_{{\mathrm{ESR}}} + \epsilon _z\) term is negligible at this stage, the spin-up probabilities of these two sequences are also close to 0.5 and are equal only when Ω_{ESR}Ω_{drift} = 1. Any difference in these two probabilities is fed back into Ω_{ESR}. The repetition number of 32 is chosen for higher accuracy in calibrating Ω_{ESR} while still maintaining a stable controller. A higher repetition number gives better accuracy but with less tolerance of the drift range, and can result in the same spin-up probability at Ω_{ESR}Ω_{drift} = A, where A is a number close to 1. On average, 16 acquisitions take around 35 s and the two calibrations of f_{ESR} and Ω_{ESR} take ~5 s each. Figure 3b,d,f,h presents plots of feedback values over the measurement time period for the controllers shown on the left of the figure.
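The f_{ESR} feedback loop can be illustrated with a toy integral controller. The model below is an assumption made for illustration: it takes the spin-up probability difference between the two calibration sequences to be proportional to sin(2π δ τ), where δ = f_{ESR} + ε_z and τ = 0.2 μs is the gap, and it ignores detuning during the pulses themselves as well as readout noise. The names, the sign convention and the gain value are hypothetical.

```python
import math

GAP_S = 0.2e-6  # free-evolution gap between the two half-rotations (from the text)


def probability_difference(offset_hz, gap_s=GAP_S, visibility=1.0):
    """Toy model of P_up(X/2, Y/2) - P_up(X/2, -Y/2) for a frequency offset."""
    return visibility * math.sin(2.0 * math.pi * offset_hz * gap_s)


def calibrate_f_esr(f_esr_hz, epsilon_z_hz, gain_hz, cycles):
    """Integral feedback: each cycle subtracts gain * measured difference."""
    history = []
    for _ in range(cycles):
        error = probability_difference(f_esr_hz + epsilon_z_hz)
        f_esr_hz -= gain_hz * error
        history.append(f_esr_hz)
    return f_esr_hz, history


if __name__ == "__main__":
    # Gain chosen so that a small offset is roughly halved on every cycle.
    gain = 0.5 / (2.0 * math.pi * GAP_S)
    f_final, _ = calibrate_f_esr(0.0, 10e3, gain, cycles=20)
    print("residual offset (Hz):", f_final + 10e3)
```

In this linearized picture the loop is stable when the product of gain and 2πτ lies between 0 and 2; a larger gain corrects jumps faster but amplifies readout noise, which is one way to read the requirement of a "certain stable gain".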
During the randomized benchmarking measurement, traces of V_{TG}, V_{G1} and f_{ESR} appear to be binary/step-like, suggesting that changes in the qubit environment are event-like rather than drifting. The cause of these jump events could include local charge rearrangement, battery switching of gate sources or local nuclear spin flips. However, Ω_{ESR} appears to follow a drift-like mechanism, which we believe is due to the high sensitivity of the microwave source to temperature and the power supply. Interestingly, we observe no clear correlation among the four traces. This is a strong indication that the f_{ESR} jumps of the qubit come from nuclear spin flips rather than local charge rearrangement, which would require a big offset in readout level V_{G1} with a given Stark shift level^{27}.
Data availability
The data sets generated during and/or analysed during the current study are available from the corresponding authors on reasonable request.
Code availability
The analysis code that supports the findings of the current study is available from the corresponding authors on reasonable request.
References
 1.
Knill, E. Quantum computing with realistically noisy devices. Nature 434, 39–44 (2005).
 2.
Fowler, A. G., Mariantoni, M., Martinis, J. M. & Cleland, A. N. Surface codes: towards practical large-scale quantum computation. Phys. Rev. A 86, 032324 (2012).
 3.
Veldhorst, M. et al. An addressable quantum dot qubit with fault-tolerant control-fidelity. Nat. Nanotechnol. 9, 981–985 (2014).
 4.
Kawakami, E. et al. Gate fidelity and coherence of an electron spin in an Si/SiGe quantum dot with micromagnet. Proc. Natl Acad. Sci. USA 113, 11738–11743 (2016).
 5.
Watson, T. F. et al. A programmable two-qubit quantum processor in silicon. Nature 555, 633–637 (2018).
 6.
Zajac, D. M. et al. Resonantly driven CNOT gate for electron spins. Science 359, 439–442 (2017).
 7.
Yoneda, J. et al. A quantum-dot spin qubit with coherence limited by charge noise and fidelity higher than 99.9%. Nat. Nanotechnol. 13, 102–106 (2017).
 8.
Emerson, J., Alicki, R. & Życzkowski, K. Scalable noise estimation with random unitary operators. J. Opt. B 7, S347–S352 (2005).
 9.
Knill, E. et al. Randomized benchmarking of quantum gates. Phys. Rev. A 77, 012307 (2008).
 10.
Dankert, C., Cleve, R., Emerson, J. & Livine, E. Exact and approximate unitary 2-designs and their application to fidelity estimation. Phys. Rev. A 80, 012304 (2009).
 11.
Magesan, E., Gambetta, J. M. & Emerson, J. Scalable and robust randomized benchmarking of quantum processes. Phys. Rev. Lett. 106, 180504 (2011).
 12.
Magesan, E., Gambetta, J. M. & Emerson, J. Characterizing quantum gates via randomized benchmarking. Phys. Rev. A 85, 042311 (2012).
 13.
Wallman, J. J. Randomized benchmarking with gate-dependent noise. Quantum 2, 47 (2018).
 14.
Wallman, J. J., Granade, C., Harper, R. & Flammia, S. T. Estimating the coherence of noise. New J. Phys. 17, 113020 (2015).
 15.
Kimmel, S., da Silva, M. P., Ryan, C. A., Johnson, B. R. & Ohki, T. Robust extraction of tomographic information via randomized benchmarking. Phys. Rev. X 4, 011050 (2014).
 16.
Feng, G. et al. Estimating the coherence of noise in quantum control of a solid-state qubit. Phys. Rev. Lett. 117, 260501 (2016).
 17.
Fogarty, M. A. et al. Nonexponential fidelity decay in randomized benchmarking with low-frequency noise. Phys. Rev. A 92, 022326 (2015).
 18.
Khaneja, N., Reiss, T., Kehlet, C., Schulte-Herbrüggen, T. & Glaser, S. J. Optimal control of coupled spin dynamics: design of NMR pulse sequences by gradient ascent algorithms. J. Magn. Reson. 172, 296–305 (2005).
 19.
Huang, W. et al. Fidelity benchmarks for two-qubit gates in silicon. Nature (in the press); preprint available at https://arxiv.org/abs/1805.05027
 20.
Muhonen, J. T. et al. Quantifying the quantum gate fidelity of single-atom spin qubits in silicon by randomized benchmarking. J. Phys. Condens. Matter 27, 154205 (2015).
 21.
Wallman, J. J. Bounding experimental quantum error rates relative to fault-tolerant thresholds. Preprint at https://arxiv.org/abs/1511.00727 (2015).
 22.
Kueng, R., Long, D. M., Doherty, A. C. & Flammia, S. T. Comparing experiments to the fault-tolerance threshold. Phys. Rev. Lett. 117, 170502 (2016).
 23.
Dugas, A. C., Wallman, J. J. & Emerson, J. Efficiently characterizing the total error in quantum circuits. Preprint at https://arxiv.org/abs/1610.05296v1 (2018).
 24.
Barends, R. et al. Superconducting quantum circuits at the surface code threshold for fault tolerance. Nature 508, 500–503 (2014).
 25.
Ballance, C. J., Harty, T. P., Linke, N. M., Sepiol, M. A. & Lucas, D. M. High-fidelity quantum logic gates using trapped-ion hyperfine qubits. Phys. Rev. Lett. 117, 060504 (2016).
 26.
Rong, X. et al. Experimental fault-tolerant universal quantum gates with solid-state spins under ambient conditions. Nat. Commun. 6, 8748 (2015).
 27.
Chan, K. W. et al. Assessment of a silicon quantum dot spin qubit environment via noise spectroscopy. Phys. Rev. Appl. 10, 044017 (2018).
 28.
Itoh, K. M. & Watanabe, H. Isotope engineering of silicon and diamond for quantum computing and sensing applications. MRS Commun. 4, 143–157 (2014).
 29.
Angus, S. J., Ferguson, A. J., Dzurak, A. S. & Clark, R. G. Gate-defined quantum dots in intrinsic silicon. Nano Lett. 7, 2051–2055 (2007).
 30.
Lim, W. H. et al. Observation of the single-electron regime in a highly tunable silicon quantum dot. Appl. Phys. Lett. 95, 242102 (2009).
 31.
Granade, C., Ferrie, C. & Cory, D. G. Accelerated randomized benchmarking. New J. Phys. 17, 013042 (2015).
 32.
Granade, C. et al. QInfer: statistical inference software for quantum applications. Quantum 1, 5 (2017).
 33.
Ball, H., Stace, T. M., Flammia, S. T. & Biercuk, M. J. Effect of noise correlations on randomized benchmarking. Phys. Rev. A 93, 022303 (2016).
 34.
Magesan, E. et al. Efficient measurement of quantum gate error by interleaved randomized benchmarking. Phys. Rev. Lett. 109, 080505 (2012).
Acknowledgements
The authors acknowledge support from the US Army Research Office (W911NF-13-1-0024, W911NF-14-1-0098, W911NF-14-1-0103 and W911NF-17-1-0198), the Australian Research Council (CE170100009 and CE170100012) and the NSW Node of the Australian National Fabrication Facility. B.H. acknowledges support from the Netherlands Organization for Scientific Research (NWO) through a Rubicon Grant. K.M.I. acknowledges support from a Grant-in-Aid for Scientific Research by MEXT, NanoQuine, FIRST and the JSPS Core-to-Core Program. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Office or the US Government.
Author information
Contributions
C.H.Y. conceived and designed the GRAPE pulse sequences and the feedback control systems for the experiments. C.H.Y. and K.W.C. performed the experiments. C.H.Y., R.H., T.E., S.T.F. and S.D.B. analysed the data. K.W.C. and F.E.H. fabricated the device. K.M.I. prepared and supplied the ^{28}Si epilayer wafer. All authors contributed materials, analysis and/or tools. C.H.Y., R.H., S.T.F., S.D.B. and A.S.D. wrote the paper with input from all co-authors. A.S.D. supervised the project.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Figs. 1–2, Supplementary equations 1–2, Supplementary Table 1
Cite this article
Yang, C.H., Chan, K.W., Harper, R. et al. Silicon qubit fidelities approaching incoherent noise limits via pulse engineering. Nat Electron 2, 151–158 (2019). https://doi.org/10.1038/s41928-019-0234-1