Quantum error correction (QEC) promises to exponentially suppress uncorrelated errors in quantum computing devices, bridging the gap between achievable physical error rates and the low logical error rates required for useful quantum algorithms1,2,3. The surface code is a promising candidate for experimental implementations of QEC, in which a repetitive stabilizer circuit protects a logical qubit state.

Superconducting transmon qubits4,5 represent a leading platform for implementing surface-code QEC. There have been recent demonstrations of architectures compatible with QEC and capable of scaling6,7,8,9,10,11,12,13,14,15,16,17. However, a transmon is only weakly nonlinear, with transitions between successive states closely spaced in frequency. Transitions from the qubit computational states to higher-energy leakage states are, therefore, difficult to avoid. These leakage states can be substantially populated by single-qubit gates18,19, entangling gates20,21,22,23,24 and measurement25,26.

Leakage is particularly dangerous in the context of QEC27,28,29,30,31,32. A key underlying assumption of QEC is that the physical errors to be suppressed are sufficiently uncorrelated in both space and time. Contrary to this requirement, a qubit in a leakage state can induce errors on multiple neighbouring qubits, even causing them to leak as well33. The correlated spread of errors through the device represents a major problem for experimental QEC. Identifying and post-selecting out leakage events has permitted cutting-edge experiments on the surface code15,16. Partial leakage removal has been integrated into surface-code circuits14,17. However, all these experiments displayed a characteristic rise in the number of detected errors as the code progressed, indicative of an accumulating leakage population in the device. A demonstration of leakage removal from all qubits in a surface-code circuit has not yet been reported. Further, stabilizing the leakage populations such that error rates do not grow over time is a requirement for scalable QEC, and this remains an important open challenge.

Here, we study and remove the effects of leakage in a surface-code circuit on an array of transmon qubits. First, we detail the dynamics of leakage in the QEC circuit and the spread of errors through space and time. We quantify the effect of leaked qubits undergoing multi-qubit interactions, which is the primary vehicle for the spatial propagation of leakage. Second, we demonstrate the effective removal of leakage from all qubits involved in the surface-code circuit. We introduce a leakage removal operation that is capable of removing leakage on both measure and data qubits. Although this operation does not affect the rates of leakage generation, we show that the residual leakage populations averaged over all qubits are suppressed to below 1 × 10−3 when it is used and that they do not grow as the code is extended in time. Finally, we show that removing leakage improves logical performance. In a distance-21 bit-flip code with leakage removal, the injected leakage impacts logical performance equivalently to injected Pauli errors. This confirms that leakage removal is effective in suppressing the correlated nature of leakage-induced errors. Then, using a distance-3 surface code, we find that leakage removal both decreases the rate of logical errors and prevents the code performance from declining over time, demonstrating that QEC can be stable when carried out over many cycles. We extrapolate this behaviour to larger code distances operating well below the threshold, and we find that the injected leakage impacts logical error rates in the same fashion as uncorrelated Pauli errors. In summary, leakage removal overcomes an important obstacle to growing QEC to algorithmically relevant scales.

Characterizing the spread of leakage

Leakage states (Fig. 1a) are particularly problematic in structured QEC circuits because they are long-lived and spread through the device, inducing correlated errors in both space and time. The surface-code circuit displayed in Fig. 1b shows a single cycle that consists of a number of moments. A moment is a grouping of gates operated concurrently in time. Four such moments correspond to CZ gates used to measure the surface-code stabilizers. In the QEC experiments presented in this work, leakage is primarily generated by entangling gate errors and measurement25,26,34. Then, when a qubit in the circuit leaks, subsequent gates involving that qubit produce additional errors.

Fig. 1: Leakage in a structured QEC circuit.

a, The energy potential of a transmon qubit, illustrating the computational energy levels \(\left\vert 0\right\rangle\) and \(\left\vert 1\right\rangle\) (blue) and the leakage levels \(\left\vert 2\right\rangle\) and higher (red). b, The circuit for surface-code QEC. A square grid comprises measure qubits (light blue circles) and data qubits (orange squares). The cycle consists of four layers of entangling gates, along with intervening single-qubit rotations, followed by a measurement (M) and reset (R). The reset operation here is shown across all qubits. It could be implemented as single-qubit operations on the measure qubit or include entangling operations with various neighbouring data qubits. c, The time decay (main, blue) and spatial spread (inset) of leakage in a distance-3 surface code following the injection of \(\left\vert 2\right\rangle\) on the central data qubit. Each cycle takes approximately 1 μs. The leakage population is measured at the end of each cycle. The expected decay of \(\left\vert 2\right\rangle\) from T1 relaxation on the leaked qubit alone is indicated (dashed red). The excess leakage population is defined as the leakage population in the presence of injection minus the leakage population in the absence of injection.

Figure 1c illustrates the dynamics of leakage in a distance-3 surface-code circuit. At the cycle labelled 0, we fully inject \(\left\vert 2\right\rangle\) by performing a \(\left\vert 1\right\rangle \to \left\vert 2\right\rangle\) rotation on the central data qubit, producing an expected \(\left\vert 2\right\rangle\) population near 50%. We do not expect the \(\left\vert 2\right\rangle\) population to exceed 50% because we prepare each data qubit in a pseudorandom initial basis state, thus achieving an average \(\left\vert 1\right\rangle\) population of 50% before injection. It takes many surface-code cycles before this injected leakage population decays sufficiently, with an exponential decay constant around 4.4 cycles. This decay is nonetheless somewhat faster than the expected decay from the T1 relaxation of \(\left\vert 2\right\rangle\) alone. The inset shows that the leakage population does not stay on the injected qubit but is also transported to neighbouring qubits as the circuit progresses. At the small code distance being considered, this transport is enough to affect every qubit involved in the circuit.
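As a rough illustration of what a decay constant of 4.4 cycles implies, the sketch below evaluates a single-exponential model of the excess leakage population. The single-exponential form, the function names and the 10−3 target are assumptions made for illustration only; the measured curve in Fig. 1c also reflects transport to neighbouring qubits.

```python
import math

DECAY_CONSTANT_CYCLES = 4.4  # decay constant quoted in the text for Fig. 1c


def excess_leakage(with_injection, without_injection):
    """Excess leakage: population with injection minus population without."""
    return with_injection - without_injection


def remaining_excess(initial_excess, cycles, tau=DECAY_CONSTANT_CYCLES):
    """Assumed single-exponential model of the excess leakage after `cycles`."""
    return initial_excess * math.exp(-cycles / tau)


def cycles_to_reach(initial_excess, target, tau=DECAY_CONSTANT_CYCLES):
    """Smallest whole number of cycles for the model to fall to `target`."""
    return math.ceil(tau * math.log(initial_excess / target))


if __name__ == "__main__":
    # Starting from the ~50% injected population, how long until 1e-3 remains?
    print(cycles_to_reach(0.5, 1e-3))
```

Under these assumptions, an injected population of 50% needs a few tens of cycles to fall to the 10−3 level, which is why an unremoved leakage event affects many rounds of measurement.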

If there is no attempt to remove it, a single leakage event will persist for many rounds and will spread a notable distance through the device, affecting many measurements and inducing many error detection events. The number of uncorrelated errors required to produce the same effect is the decomposed weight of the leakage event28. This high weight of leakage events when decomposed into uncorrelated errors makes them especially problematic for QEC.

The precise dynamics of leakage depends primarily on the details of the entangling gate used in the circuit. Here, we focus on the diabatic CZ gate used in the Sycamore architecture14,17,35. This gate involves biasing qubits to satisfy the resonance conditions indicated in Fig. 2a. The interaction strength is tuned to achieve a rotation of 2π in \(\left\vert 11\right\rangle \leftrightarrow \left\vert 20\right\rangle\). We follow the convention that the higher-energy qubit state is listed first in two-qubit states \(\left\vert \mathrm{HL}\right\rangle\). This resonance condition also aligns other resonances that involve leakage states. In particular, the \(\left\vert 30\right\rangle \leftrightarrow \left\vert 12\right\rangle\) resonance enables a two-photon process, which allows \(\left\vert 2\right\rangle\) on the lower-energy qubit to move to \(\left\vert 3\right\rangle\) on the higher-energy qubit. Similarly, the \(\left\vert 31\right\rangle \leftrightarrow \left\vert 22\right\rangle\) resonance enables \(\left\vert 3\right\rangle\) on the higher-energy qubit to cause the lower-energy qubit to leak to \(\left\vert 2\right\rangle\), whereas the higher-energy qubit remains leaked in \(\left\vert 2\right\rangle\). These so-called leakage transport processes allow leakage to spread, even in a single QEC cycle.
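The alignment of these resonances can be checked with a short calculation. The sketch below assumes the standard Duffing-oscillator approximation for the transmon levels, E_n = nω + ηn(n − 1)/2, with both qubits sharing the same nonlinearity η and detuned by |η|. The frequency values are hypothetical and chosen only to keep the arithmetic exact.

```python
ETA = -200                          # MHz, hypothetical common nonlinearity
OMEGA_LOW = 6000                    # MHz, hypothetical lower-energy qubit frequency
OMEGA_HIGH = OMEGA_LOW + abs(ETA)   # qubits detuned by |eta| (CZ condition)


def level_energy(n, omega, eta):
    """Energy of level n in the Duffing approximation (units of MHz)."""
    # n * (n - 1) is always even, so the integer division is exact.
    return n * omega + eta * n * (n - 1) // 2


def pair_energy(high_level, low_level):
    """Energy of |HL> with the higher-energy qubit listed first."""
    return (level_energy(high_level, OMEGA_HIGH, ETA)
            + level_energy(low_level, OMEGA_LOW, ETA))


def detuning(state_a, state_b):
    """Energy difference between two two-qubit states."""
    return pair_energy(*state_a) - pair_energy(*state_b)


RESONANCES = [
    ((2, 0), (1, 1)),  # intended CZ resonance
    ((3, 1), (2, 2)),  # direct leakage transport
    ((3, 0), (1, 2)),  # two-photon leakage transport
    ((4, 2), (3, 3)),  # higher resonance
]

if __name__ == "__main__":
    for a, b in RESONANCES:
        print(a, b, detuning(a, b))
```

In this model every listed pair is exactly degenerate once the \(\left\vert 20\right\rangle \leftrightarrow \left\vert 11\right\rangle\) condition is met, whereas a pair such as \(\left\vert 10\right\rangle\) and \(\left\vert 01\right\rangle\) remains detuned by |η|. Degeneracy alone does not give the transport amplitude, which depends on the coupling and gate length.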

Fig. 2: Leakage transport and phase errors in CZ gates.

a, The eigenenergy ladder for a pair of qubits satisfying the resonance condition for a diabatic CZ gate, where the qubits are detuned by their common nonlinearity \(\left\vert \eta \right\vert\). We denote the two-qubit states \(\left\vert \mathrm{HL}\right\rangle\) with the higher (lower) energy qubit first (second). In addition to the intended resonance (\(\left\vert 20\right\rangle \leftrightarrow \left\vert 11\right\rangle\); blue arrow), higher levels also satisfy a resonance condition, either directly (\(\left\vert 31\right\rangle \leftrightarrow \left\vert 22\right\rangle\); orange arrow) or mediated by a two-photon process (\(\left\vert 30\right\rangle \leftrightarrow \left\vert 12\right\rangle\); red arrows). b, The relative population transport (net change in state populations) ΔPt for the diabatic CZ gate, including the first two leakage levels. The rotation in \(\left\vert 20\right\rangle \leftrightarrow \left\vert 11\right\rangle\) has been calibrated to 2π. Highlighted are the off-diagonal elements due to the couplings between higher levels, with average relative population transport \(\overline{| {{\Delta }}{P}_\mathrm{t}| }\) shown below. c, The two circuits used to measure the relative population transport shown in b. We subtract the population transport Pt in the baseline experiment without a CZ gate (right) from the experiment with a CZ gate (left). d, The circuit for the modified Ramsey experiment shown in e with an interleaved CZ gate to a neighbouring qubit at a higher frequency, followed by tomography on the lower frequency qubit. e, The measured phase shift ϕ during the modified Ramsey experiment with the neighbouring qubit prepared in \(\left\vert 0\right\rangle\) (blue), \(\left\vert 1\right\rangle\) (green), or \(\left\vert 2\right\rangle\) (red) shown in an empirical cumulative distribution function (ECDF) over 20 qubit pairs, with the mean value indicated by the dashed line. The CZ gate should produce a phase shift of ϕ = 0 for an input \(\left\vert 0\right\rangle\) and a shift of ϕ = π for an input \(\left\vert 1\right\rangle\). A spurious phase shift of ϕ ≈ 0.65π is produced when the higher-energy qubit is prepared in \(\left\vert 2\right\rangle\).

The amount of leakage transport a gate produces is not normally calibrated and therefore depends on the chosen gate length and the effective coupling between levels. Figure 2b shows how a calibrated CZ gate affects populations, as measured by the circuits shown in Fig. 2c. In this device, we find around 18% of the population of \(\left\vert 30\right\rangle\) is transported to \(\left\vert 12\right\rangle\) and vice versa. The transport population is around 61% for \(\left\vert 31\right\rangle \leftrightarrow \left\vert 22\right\rangle\). We can also see the first indications of the expected higher resonances, such as \(\left\vert 42\right\rangle \leftrightarrow \left\vert 33\right\rangle\). Data for each individual experiment and further characterization of the readout can be found in Supplementary Information Section 1.

Even in the absence of leakage transport, we find that leakage induces additional errors in the CZ gate. When the higher-energy qubit is in \(\left\vert 2\right\rangle\) and the lower-energy qubit is in the computational basis, leakage transport is not possible but a considerable phase error is imparted on the non-leaked qubit. When a CZ gate is applied as in Fig. 2d with the higher-energy qubit in \(\left\vert 0\right\rangle\), we expect to see no phase shift (ϕ = 0) on the lower-energy qubit. With the higher-energy qubit prepared in \(\left\vert 1\right\rangle\), we expect to see a phase shift of ϕ = π, indicating a well-calibrated CZ gate. Figure 2e shows the relative phase for 20 pairs of qubits. When computational states \(\left\vert 0\right\rangle\) and \(\left\vert 1\right\rangle\) are prepared, we see tight groupings around the expected phase shifts ϕ = 0 and ϕ = π, respectively. However, when a leakage state is prepared on the higher-energy qubit, we see a phase shift of ϕ ≈ 0.65π. This represents a considerable computational error on the non-leaked qubit, and is a notable source of errors to be detected and corrected as leakage spreads.

These results illuminate the dangers of leakage. A single leakage event on any qubit will expose many CZ gates to a leaked input state before it decays sufficiently. Each of these interactions has a substantial probability of introducing new computational errors, moving the leakage to another qubit or inducing additional leakage on previously non-leaked qubits. In QEC circuits, these effects are damaging enough that they must be included in simulations to achieve good agreement with experimental performance17. Accordingly, we are motivated to remove leakage in the code circuit so as to suppress these effects.

Suppressing leakage populations in a QEC circuit

Having better understood the dangers of leakage in QEC circuits, we turn to removing it. An unconditional reset gate can remove all energy from a qubit, including when it starts in a leakage state, and can be applied to the measure qubits at the end of each cycle36,37,38,39,40. However, our study of leakage transport motivates the need to remove leakage from the data qubits as well. Leaving the computational state intact is incompatible with an unconditional reset and requires a more delicate leakage removal operation.

Three broad approaches for leakage removal have been proposed: swap type28,31,41, in which the roles of measure and data qubits are exchanged at a regular interval by additional operations; feedback type32,33, in which the leakage is identified classically from measurement patterns and feedback is applied to return the qubit to the computational subspace; and direct type42, in which an operation is used to remove leakage from a qubit without disturbing the computational states. In light of our findings on leakage transport, swap-type strategies become more difficult to justify; only half the qubits are reset in each cycle, and so leakage may still move between qubits and thereby spread through time. Similarly, the conditional nature of feedback-type approaches prevents them from fully solving the leakage problem. Leakage states cause several errors before they are noticed and corrected. Hence, we pursue a direct removal approach.

In the following sections, we present and compare three leakage removal strategies. First, No Reset forgoes any operations at the end of the cycle, representing the best case for a simple Pauli error model but the worst case for leakage. Second, MLR applies multi-level reset (MLR) gates39 on measure qubits immediately after measurement at the end of every cycle. This adds error to the cycle because the data qubits idle while the gate is performed, but it has previously been shown to remove leakage population and improve logical performance compared with the baseline No Reset strategy39. Finally, in DQLR, we perform an MLR on the measure qubits followed by a data qubit leakage removal (DQLR) operation, which is made up of two constituent gates. The first is a LeakageISWAP gate, which is a two-qubit interaction like the diabatic CZ gate but which executes an ISWAP in the \(\left\vert 11\right\rangle -\left\vert 20\right\rangle\) basis. We choose the frequency arrangement to transport leakage excitations from the data qubit to the measure qubit. The second is a fast reset gate on the measure qubit, which removes any excitation transported by the LeakageISWAP. Additional details of the DQLR process and constituent operations are included in Supplementary Information Section 2. Note that each strategy is named after the operation it adds: the MLR strategy adds MLR to No Reset, and the DQLR strategy adds the DQLR operation on top of MLR.
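The two-step DQLR operation can be followed with simple level bookkeeping. The sketch below is an idealized picture, assuming a perfect swap within the \(\left\vert 11\right\rangle -\left\vert 20\right\rangle\) pair (data qubit listed first), a perfect reset and no phases or gate errors; the function names are hypothetical and the actual operations are described in Supplementary Information Section 2.

```python
def leakage_iswap(data_level, measure_level):
    """Idealized full swap between |20> and |11>; other states are untouched."""
    state = (data_level, measure_level)
    if state == (2, 0):
        return (1, 1)  # leakage excitation moves to the measure qubit
    if state == (1, 1):
        return (2, 0)
    return state


def reset_measure(data_level, measure_level):
    """Idealized reset: the measure qubit returns to |0>."""
    return (data_level, 0)


def dqlr(data_level, measure_level=0):
    """LeakageISWAP followed by measure-qubit reset (measure starts in |0>)."""
    return reset_measure(*leakage_iswap(data_level, measure_level))


if __name__ == "__main__":
    for level in (0, 1, 2):
        print(level, "->", dqlr(level))
```

In this picture the computational states of the data qubit pass through unchanged because the measure qubit has already been reset to \(\left\vert 0\right\rangle\), whereas a data qubit in \(\left\vert 2\right\rangle\) is returned to the computational subspace.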

To compare the leakage dynamics for the three strategies, we implemented a distance-3 surface code on a Sycamore processor. We measured the evolution of the leakage population as the surface code progresses by truncating the circuit in time and performing a measurement that can resolve \(\left\vert 2\right\rangle\) on all qubits39. For Fig. 3a, we performed this truncation at the end of each surface-code cycle (top of Fig. 3b). With No Reset, we observed a gradual rise in the leakage populations over all qubits, reaching an average leakage population of nearly 5% for data qubits and nearly 3% for measure qubits over 30 cycles. Note that, even after 30 cycles, the leakage populations had not stabilized but continued to grow. Using MLR reduced the average measure qubit leakage populations to about 3 × 10−4, but the average data qubit populations still rose to over 1.5%. Using DQLR suppressed the average leakage populations to around 10−3 for data qubits and less than 10−4 for measure qubits. Most importantly, DQLR maintained these levels throughout the full 30 cycles.

Fig. 3: Leakage population during surface-code execution.

a, Average leakage populations for data qubits (squares) and measure qubits (circles) measured at the end of each surface-code cycle with No Reset (red), MLR (green) and DQLR (blue). Error bars correspond to 1 standard deviation. b, Top, The surface-code circuit shown for a pair of neighbouring measure and data qubits. Each surface-code cycle is highlighted (rounded black rectangles). Bottom, A single surface-code cycle showing each moment in the cycle, each separated by grey dashed lines. c, Leakage populations after each moment in the cycle for the MLR (green) and DQLR (blue) leakage removal strategies, averaged over data qubits (squares) and measure qubits (circles) and over cycles 25–30. Error bars correspond to 1 standard deviation. Avg, average.

We can use the same technique to study the dynamics of leakage within a surface-code cycle by truncating the circuit at each moment midway through a cycle (bottom of Fig. 3b). Each surface-code cycle consists of ten moments, and Fig. 3c shows the leakage population measured after each moment in the cycle, averaged over cycles 25–30 when the leakage populations have stabilized. We omit the No Reset strategy here, as its leakage populations do not stabilize. With MLR, the average leakage population for the data qubits saturates to a stable value around 1.5%, consistent with Fig. 3a. However, the average measure qubit leakage population starts each cycle at a very low value near 2 × 10−4, grows over the course of the cycle as operations produce leakage and is then reduced back to its initial low value by the reset procedure. Thus, we estimate that the operations produce a leakage population of around 5 × 10−3 in each cycle. With DQLR, the leakage populations for both measure and data qubits grow over the course of the cycle and are removed by the reset procedure. The data qubits start each cycle with a leakage population of around 1 × 10−3, which again increases to around 5 × 10−3 immediately following measurement before being removed. The measure qubits attain even lower leakage populations than with MLR.

These results demonstrate that our DQLR procedure successfully suppresses steady-state leakage populations to previously unachievable levels and stabilizes those levels over the course of a long QEC circuit. The removal strategy also contains the leakage dynamics to a single cycle. However, the residual ability for leakage to spread and induce correlated errors within a single round30 should be the subject of further study.

Effect on QEC logical performance

Having achieved low leakage populations in both data qubits and measure qubits with our DQLR procedure, we turn to evaluating the logical performance. We consider two codes providing complementary information: a distance-21 bit-flip code and a distance-3 surface code. Our physical qubit error rates place the surface code close to the threshold, whereas the bit-flip code is well below the threshold14,17. The vastly lower logical error rates for the bit-flip code give us finer resolution on the effect of leakage within the code. In contrast, the surface code is a more challenging circuit for calibration and operation and is sensitive to both bit-flip and phase-flip errors, providing an environment in which more potentially adverse effects of a reset can be detected and measured.

Figure 4a shows the logical error probability of a distance-21 bit-flip code carried out to 60 cycles while introducing both leakage and Pauli errors. We injected leakage population PL into all qubits by applying a \(\left\vert 1\right\rangle \leftrightarrow \left\vert 2\right\rangle\) rotation on each qubit immediately after the first Hadamard gate layer (Fig. 4b, left), where the rotation angle θL is

$${\theta }_\mathrm{L}=2\sin^{-1}\left(\sqrt{2{P}_\mathrm{L}}\right).$$

We compare PL to injected Pauli error ‘population’ PP, which is produced by X and Z rotations on the data and measure qubits (Fig. 4b, right), respectively, taking advantage of the classical nature of the bit-flip code. The Pauli error rotation angle θP is

$${\theta }_\mathrm{P}=2\sin^{-1}\left(\sqrt{{P}_\mathrm{P}}\right),$$

where the missing factor of 2 relative to the definition of leakage population accounts for Pauli rotations always affecting the qubit state in the computational basis, whereas leakage injection applies only to the qubit population in \(\left\vert 1\right\rangle\). We fitted the experimental data and numerical simulations to an offset power law as a guide, as detailed in Section 5 of the Supplementary Information.
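The two injection angles can be evaluated directly from these expressions. The sketch below is illustrative only: the function names are hypothetical, and the check on the leakage angle assumes an average \(\left\vert 1\right\rangle\) population of 50%, which is the origin of the factor of 2.

```python
import math


def leakage_injection_angle(p_leak):
    """theta_L = 2 asin(sqrt(2 P_L)); defined for 0 <= P_L <= 0.5."""
    if not 0.0 <= p_leak <= 0.5:
        raise ValueError("P_L must lie between 0 and 0.5")
    return 2.0 * math.asin(math.sqrt(2.0 * p_leak))


def pauli_injection_angle(p_pauli):
    """theta_P = 2 asin(sqrt(P_P)); defined for 0 <= P_P <= 1."""
    if not 0.0 <= p_pauli <= 1.0:
        raise ValueError("P_P must lie between 0 and 1")
    return 2.0 * math.asin(math.sqrt(p_pauli))


def leakage_population(theta, p_one=0.5):
    """Population moved to |2> by a |1> <-> |2> rotation of angle theta,
    assuming a fraction p_one of the population starts in |1>."""
    return p_one * math.sin(theta / 2.0) ** 2


if __name__ == "__main__":
    theta = leakage_injection_angle(0.005)
    print(theta, leakage_population(theta))
```

Inverting the angle with `leakage_population` recovers the requested PL, and the largest injectable leakage population of 50% corresponds to a full π rotation.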

Fig. 4: Bit-flip code logical performance and dependence on injected errors.

a, Logical error probability for a distance-21 bit-flip code run for 60 cycles, under the effect of either injected leakage (dark circles) or injected Pauli errors (light squares). Three leakage removal strategies, No Reset (red, unfilled), MLR (green, semi-filled) and DQLR (blue, filled), are considered. Lines are fits to experimental data using a power law with an offset. Below, Highlights of fits to experimental data (solid) and numerical simulations (dashed) for the MLR and DQLR strategies. b, Circuits for the bit-flip code, showing the error injection locations for both leakage (left) and Pauli errors (right). Exp., experimental; Sim., simulated.

With No Reset, even small amounts of injected leakage population of less than 1% cause the logical error probability to rise above 40%. This is in contrast with correctable Pauli errors, which can be introduced to around 5% population before similar logical error probabilities are encountered. With MLR, the logical error probability is drastically lowered without injection, consistent with previous measurements in bit-flip codes39. Still, the logical error probability rises much more rapidly when injecting leakage compared to injecting Pauli errors. We attribute this to unmitigated leakage accumulation on the data qubits, which leads to a high decomposed weight of uncorrelated errors and ultimately logical errors. When we prevent this leakage buildup with DQLR, we observe a much smaller difference between the code’s response to injected leakage and its response to injected Pauli errors. This is strong evidence that the DQLR operation has successfully reduced the decomposed weight of a leakage event to near 1. In this situation, leakage has around the same influence on logical performance as an equivalent amount of Pauli error and has been prevented from effectively spreading and inducing correlated errors.

Also note the good agreement between data and numerical simulation for injected leakage and Pauli errors, quantifying our understanding of the effects of leakage in the code with both MLR and DQLR strategies. In both cases, note that we slightly underestimated the logical error induced by the injected leakage, illustrating the difficulties of fully capturing the effect of correlated errors even with DQLR preventing a substantial spread across cycles and emphasizing the importance of future work on leakage dynamics inside a single cycle. Nonetheless, the close correspondence of the Pauli simulation to the injected leakage experimental data for DQLR helps justify future Pauli simulations as useful estimates of final code performance when leakage is removed in each cycle.

Figure 5a shows the average detection probabilities corresponding to the weight-4 stabilizers in the distance-3 surface code. Detection probabilities are the fraction of the total number of experiments in which an error was detected on a given stabilizer. With No Reset, the buildup of the leakage population produces more errors as the code progresses, creating a rising pattern of detection probability. With MLR, a large portion of this rise is mitigated, but the detection probability still rises by 2.5 percentage points over the course of the first 15 cycles. With DQLR, the detection probability immediately stabilizes to around 18% and remains steady throughout the code duration. We attribute this to the recurrent removal of leakage on all qubits, which prevents any growth of the leakage populations and the resulting correlated errors over time. This resolves a key concern in state-of-the-art QEC15,16,17 in which detection probabilities were found to rise even with partial leakage removal or post-selection. These results confirm the relationship between rising detection probability and rising leakage populations and demonstrate the resolution of this effect.
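The detection probability defined above can be computed from raw stabilizer outcomes along the following lines. This is a minimal sketch that assumes a detection event is registered whenever a stabilizer outcome differs from its value in the previous cycle; the treatment of the first and last cycles and the data layout are simplifications, and the outcome values are made up for illustration.

```python
def detection_events(outcomes):
    """Detection events for one stabilizer in one experiment: 1 wherever
    consecutive stabilizer outcomes (0 or 1) differ."""
    return [a ^ b for a, b in zip(outcomes, outcomes[1:])]


def detection_probability(shots):
    """Fraction of experiments with a detection event, per cycle boundary.

    `shots` is a list of equal-length outcome lists, one per experiment.
    """
    events = [detection_events(outcomes) for outcomes in shots]
    return [sum(column) / len(events) for column in zip(*events)]


if __name__ == "__main__":
    # Four hypothetical experiments, four cycles each, for one stabilizer.
    shots = [
        [0, 0, 1, 1],
        [0, 1, 1, 1],
        [0, 0, 0, 0],
        [0, 0, 1, 0],
    ]
    print(detection_probability(shots))
```

Averaging this quantity over the weight-4 stabilizers gives one curve per leakage removal strategy, as in Fig. 5a.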

Fig. 5: Surface-code logical performance and dependence on injected errors.

a, Measured detection probability averaged for the weight-4 stabilizers in a distance-3 surface code under the three leakage removal strategies studied in this work. Lines are connections between data points. b, Measured logical error probability for a distance-3 surface code run for 15 cycles, for different injected leakage populations and the three different leakage removal strategies studied in this work. Solid lines are fits to an offset power law. The inset shows that the circuit has an included layer where leakage is injected by performing a \(\left\vert 1\right\rangle \leftrightarrow \left\vert 2\right\rangle\) rotation. c, Comparison of the simulated dependence of the surface-code error budget 1/Λ5/7 (the inverse of the exponential error suppression factor between a distance-5 and distance-7 surface code) with an injected leakage population for MLR (green) and DQLR (blue). Solid lines are fits to a ratio of offset power laws, whereas the dotted light blue line is a linear fit of the data when using DQLR. Red dashed line indicates where Λ5/7 = 1.

In Fig. 5b, we compare the three leakage removal strategies by measuring the logical error probability of a distance-3 surface code after 15 cycles. At 0% injected leakage, the circuit corresponds to the standard code circuit with an additional idle where the injection is otherwise inserted. Over the range of injected leakage population values, No Reset exhibits the worst logical performance, followed by MLR, with DQLR having the lowest logical error probability. This confirms that DQLR reduces logical errors by suppressing correlated errors from leakage, despite the additional cycle time and errors introduced by the DQLR operations. Further, the logical performance of No Reset and MLR degraded faster with increasing injected leakage than that of DQLR.

To study the performance of the surface code in a regime further below the threshold, we turned to numerical simulations of distance-5 and distance-7 surface codes. To consider scaling performance, we use the exponential error suppression factor Λ5/7, defined as Λ5/7 = ε5/ε7, where ε5 and ε7 are the logical error rates for a distance-5 and distance-7 surface code, respectively. In Fig. 5c, we investigate Λ5/7 for a hypothetical device with lower component errors than what is currently realizable (see Supplementary Information Section 6 for details). In particular, we set the intrinsic leakage rates to zero and varied the probability of leakage injection. With no leakage in the system, Λ5/7 ≈ 7.2, independent of the leakage removal strategy. However, when injecting up to 4 × 10−3 leakage population per round (comparable to intrinsic leakage rates in current devices), the surface-code error budget 1/Λ5/7 (ref. 17) rises rapidly and nonlinearly for MLR. With DQLR, in contrast, the leakage increases 1/Λ5/7 much more slowly and with a near-linear dependence on the injected leakage population, characteristic of an uncorrelated error source14,17. With this ability to maintain effective error suppression in the presence of leakage, DQLR successfully mitigates the danger that correlated leakage-induced errors pose to scalable QEC.
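For concreteness, the suppression factor and the error budget follow from two logical error rates as sketched below. The function names and the numerical values are hypothetical and are not taken from the simulations; the only relations used are Λ5/7 = ε5/ε7 and the line Λ5/7 = 1 marked in Fig. 5c.

```python
def suppression_factor(eps_5, eps_7):
    """Lambda_5/7 = eps_5 / eps_7 for distance-5 and distance-7 codes."""
    return eps_5 / eps_7


def error_budget(eps_5, eps_7):
    """Surface-code error budget 1 / Lambda_5/7."""
    return eps_7 / eps_5


def suppresses_errors(eps_5, eps_7):
    """True when growing the code from distance 5 to 7 lowers the logical
    error rate, that is, when Lambda_5/7 > 1."""
    return suppression_factor(eps_5, eps_7) > 1.0


if __name__ == "__main__":
    eps_5, eps_7 = 8e-4, 1e-4  # hypothetical logical error rates
    print(suppression_factor(eps_5, eps_7), error_budget(eps_5, eps_7))
```

An error source that adds to 1/Λ5/7 in proportion to its strength appears as a straight line in such a plot, which is the behaviour attributed to DQLR above.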

Summary and outlook

We have demonstrated the effective removal of leakage from all qubits involved in a surface-code QEC circuit. We have shown that when leakage is removed on all qubits, correlated leakage-induced errors are suppressed, and the logical performance of the code improves outright and stabilizes in time. Our results confirm the conjecture that the growth in logical errors over time is attributable to leakage, and we did not uncover any other major source of logical error that grows as the code continues in time.

With these findings, we resolve the longstanding concern that qubits with weak nonlinearity cannot successfully implement QEC at long times due to correlated leakage-induced errors. As such, we confirm that large arrays of transmon qubits are a viable and promising architecture for QEC at scale.