Real-time processing of stabilizer measurements in a bit-flip code

Although qubit coherence times and gate fidelities are continuously improving, logical encoding is essential to achieve fault tolerance in quantum computing. In most encoding schemes, correcting or tracking errors throughout the computation is necessary to implement a universal gate set without adding significant delays in the processor. Here, we realize a classical control architecture for the fast extraction of errors based on multiple cycles of stabilizer measurements and subsequent correction. We demonstrate its application on a minimal bit-flip code with five transmon qubits, showing that real-time decoding and correction based on multiple stabilizers is superior in both speed and fidelity to repeated correction based on individual cycles. Furthermore, the encoded qubit can be rapidly measured, thus enabling conditional operations that rely on feed forward, such as logical gates. This co-processing of classical and quantum information will be crucial in running a logical circuit at its full speed to outpace error accumulation.


INTRODUCTION
Fault-tolerant quantum computation offers the potential for vast computational advantages over classical computing for a variety of problems [1]. The implementation of quantum error correction (QEC) is the first step toward practical realization of any of these applications. This typically requires detecting the occurrence of an error by performing a stabilizer measurement, followed by either a corrective action on the physical device (active QEC), or a frame update in software (passive QEC) [2–14]. In either case, these checkpoints must occur regularly to protect a quantum state throughout the computation. To do that, one needs to rapidly measure the stabilizers with high fidelity and without disrupting the encoded qubit. In spite of these challenges, recent progress has been made in repetitive stabilizer measurements across a diverse range of physical architectures, including trapped ions [15,16], superconducting qubits [17–21], and defects in diamond [22].
Furthermore, when the stabilizer measurements cannot be trusted because they are themselves error prone, one can introduce a decoder that uses information from multiple rounds of stabilizer measurements [39,40] or from the spatial connectivity of the device [41] to determine the appropriate correction. Unfortunately, performing the decoding calculation in software at a high level of the hardware stack hinders low-latency correction and fast feedforward control due to the communication and computation overhead [17].
In this work, we overcome this bottleneck by performing both QEC decoding and control with custom low-latency hardware, which acts as a classical co-processor to our quantum processor. We demonstrate repeated active correction, as well as real-time decoding of multi-round stabilizer measurements. We show that the decoding strategy successfully mitigates stabilizer errors, and identifies the encoded state with a latency far below the qubit coherence times, while matching the results obtained by post-processing on a conventional computer.

Repeated stabilizer measurements on a five-qubit device
For our demonstration, we implement a three-qubit code that corrects bit-flip errors (X), and is sufficient to encode one logical bit of classical memory. We use an IBM five-transmon device similar to ibmqx2 (refs [42,43]; Fig. 1), of which three transmons (D_1, D_2, D_3) are used as data qubits, and two (A_t, A_b) as ancilla qubits to measure the stabilizers. Each qubit is coupled to a dedicated resonator for readout and control. Additional resonators dispersively couple D_1, D_2 with A_t and D_2, D_3 with A_b.
We perform CNOT gates between data and ancilla qubits by using a sequence of single-qubit gates, and a ZX_90 rotation driven by the cross-resonance interaction [44]. By applying two CNOT gates in succession, controlled by two different data qubits with a single ancilla as the target, the parity of the data qubit pair is mapped onto the ancilla state. The same protocol is applied simultaneously to both data qubit pairs, with the shared qubit D_2 interacting first with A_t, then with A_b (Fig. 1b). The ancilla measurement result a_{t,b} = 0 (1) ideally corresponds to even (odd) parity for the corresponding pair. We refer to the complete sequence comprising four CNOT gates and ancilla measurement as a single error correction cycle. The result of each cycle (the measurements a_t, a_b) is a syndrome that identifies which data qubit (if any) has most likely been subjected to an X error.
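The parity mapping described above can be illustrated with a purely classical sketch that tracks computational basis states only. The function names `cnot` and `stabilizer_cycle` are hypothetical, and the sketch assumes ideal gates and ancillas that start in 0; it is not a model of the cross-resonance implementation.

```python
def cnot(control, target):
    """Classical action of a CNOT on basis states: the target is XORed with the control."""
    return control, target ^ control


def stabilizer_cycle(d1, d2, d3, a_t=0, a_b=0):
    """Map the parities of (D1, D2) and (D2, D3) onto the ancillas A_t and A_b."""
    _, a_t = cnot(d1, a_t)  # D1 controls A_t
    _, a_t = cnot(d2, a_t)  # D2 controls A_t
    _, a_b = cnot(d2, a_b)  # D2 controls A_b
    _, a_b = cnot(d3, a_b)  # D3 controls A_b
    return a_t, a_b
```

For the prepared state 111, a flip of D_1 gives `stabilizer_cycle(0, 1, 1) == (1, 0)`, while the error-free state returns `(0, 0)`.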
Key to preserving a logical state is the capability of repeating such stabilizer measurements [16,17,20,21], which has two technical requirements. First, the two ancilla qubits must be reused at every cycle, either by resetting them to the ground state [45,46] or by tracking their state. For either reset or state tracking by measurement, we need to ensure that the readout process is nondestructive, i.e., the result is consistent with the qubit state at the end of the measurement. This sets an upper limit to the allowable photon number, and therefore to the readout fidelity (Supplementary Methods). Second, the readout cavities must be depleted of photons before starting the new cycle to prevent gate errors. To accelerate the cavity relaxation to its steady state (near vacuum), we employ the CLEAR technique [47] for the slower resonator coupled to A_b (Supplementary Table 2), reducing its average photon population to below 0.1 in 600 ns. Altogether, we measure a single-round joint stabilizer readout fidelity of 0.61, averaged over the eight computational states of the data qubits (Supplementary Fig. 3).

Real-time processing of measurement results
An integral part of our experiment is the interdependence of qubit readout and control, mediated by fast processing of the measurement results. For this purpose, we use a combination of custom-made and off-the-shelf hardware, consisting of pulse sequencers, receivers, and a processor (Fig. 1c), all based on field-programmable gate arrays (FPGAs). In particular, the storage capability of the processor FPGA enables expansion beyond one-time feedback protocols [18,48,49], where conditional actions rely on single, or joint but simultaneous, measurements. During each QEC cycle, digital-to-analog converters in the pulse sequencers produce a pre-programmed series of gate and measurement pulse envelopes. Each returning readout signal is captured by a receiver channel via an analog-to-digital converter, where it is integrated and compared against a calibrated threshold to determine the qubit state [48]. The processor collects all the digitized results and stores them in memory. After a preset number N of cycles have been executed, the processor feeds the stored values to an internal custom calculation engine. The engine function result is broadcast to the pulse sequencers, which conditionally apply a corresponding set of gates. The overall latency to store and process the classical data and to issue a conditional pulse is 590 ns, a small fraction of coherence times (Supplementary Methods).
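The data flow described above (integrate, threshold, store, then call an engine after N cycles) can be sketched in software as follows. This is only an illustration of the control logic, not the FPGA implementation; `discriminate`, `Processor`, `push`, and the integration `kernel` are hypothetical names introduced here.

```python
def discriminate(samples, kernel, threshold):
    """Integrate a readout record against a kernel and threshold it to a bit."""
    integrated = sum(s * k for s, k in zip(samples, kernel))
    return 1 if integrated > threshold else 0


class Processor:
    """Stores digitized results and calls an engine function after n_cycles."""

    def __init__(self, n_cycles, engine):
        self.n_cycles = n_cycles
        self.engine = engine  # hypothetical stand-in for the calculation engine
        self.memory = []

    def push(self, bits):
        """Store one cycle of results; return the engine output once N are stored."""
        self.memory.append(tuple(bits))
        if len(self.memory) == self.n_cycles:
            result = self.engine(self.memory)
            self.memory = []  # ready for the next block of cycles
            return result
        return None
```

In this sketch the value returned by `push` on the Nth cycle plays the role of the result broadcast to the pulse sequencers.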
Preserving an encoded qubit state by active error correction
We will explore three distinct approaches to the bit-flip code. In all cases, we first prepare the logical excited state |111⟩. Next, we apply one of the following schemes: (i) uncorrected, in which the cycle is performed but no correction is applied to the data qubits based on the syndrome measurements, and the ancilla qubits are not reset; (ii) repeated error correction (REC), in which, at the end of each cycle, a correction gate is conditionally applied to the data qubits based on the last syndrome results only, and the ancillas are reset; (iii) decoder error correction (DEC), in which N cycles are performed without ancilla reset or corrective gates, and the set of syndromes from all N cycles is used to determine and apply the optimal correction via a decoder [40]. To assess how well each code has protected the prepared state after a desired number of cycles, we perform a logical data measurement. This involves measuring the constituent physical data qubits and computing the majority function over the digitized results {d_1, d_2, d_3}. In cases (i) and (ii), the majority function is calculated offline. In case (iii), the processor computes both the decoding and majority functions sequentially, making the result available for further conditional operations.
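As a minimal sketch of the logical data measurement, the majority function over the three digitized results can be written as below. The function name `majority` is chosen here for illustration.

```python
def majority(d1, d2, d3):
    """Return the logical bit as the majority vote over three physical results."""
    return 1 if d1 + d2 + d3 >= 2 else 0
```

A single bit flip on the prepared state 111, for example the record (1, 0, 1), still yields a logical 1.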
We begin by comparing the REC protocol to the uncorrected case (Fig. 2). For REC, there is a one-to-one relation between the two-bit syndrome value {a_t, a_b} and one of the three possible corrective X gates (in blue in Fig. 2a), or no gate at all. The same syndrome value is used to actively reset the ancilla qubits for use in subsequent cycles (Fig. 2b).
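The one-to-one relation between syndrome and correction can be sketched as a four-entry table. The mapping below follows from A_t checking the pair (D_1, D_2) and A_b checking (D_2, D_3); the names `CORRECTION` and `rec_step` are hypothetical, and the ancilla reset is not modeled.

```python
# Syndrome (a_t, a_b) -> data qubit that receives a corrective X (None: no gate).
CORRECTION = {
    (0, 0): None,
    (1, 0): "D1",  # only the D1-D2 parity is odd
    (1, 1): "D2",  # both parities are odd: the shared qubit flipped
    (0, 1): "D3",  # only the D2-D3 parity is odd
}


def rec_step(data, syndrome):
    """Apply the single-cycle correction to a dict of classical data bits."""
    corrected = dict(data)
    target = CORRECTION[syndrome]
    if target is not None:
        corrected[target] ^= 1  # classical action of an X gate
    return corrected
```

Because the table acts on the last syndrome only, a faulty syndrome such as a spurious (1, 0) flips D_1 even when no data error occurred.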
When error detection is based on a single round of stabilizer measurements, it is impossible to distinguish between the targeted data errors and stabilizer measurement errors, largely caused by imperfect CNOT gates and ancilla readout. Thus, the resulting active correction directly propagates syndrome errors to the data qubits. These errors dominate in the case of d_1 and d_3, whose average values decay faster with the number of cycles than without active correction (Fig. 2b). Conversely, the larger intrinsic error per cycle for d_2 (due to its shorter T_1) is partially compensated by the protocol. Overall, this gain nearly balances out the errors introduced by the active error correction, as shown by comparing the results of the majority function (Fig. 2c). REC can be thought of as repeated one-time feedback, where the processor storage and calculation engine are unused. The added latency is considerable: for each cycle, the stabilizer results are aggregated by the processor and forwarded to the pulse sequencers (400 ns), followed by correction and reset operations (160 ns).
Improvements in logical state protection are achieved by correlating multiple stabilizer measurements using the DEC protocol. In ref. [17], a simplified minimum-weight perfect-matching decoder [40] was used to post-process the syndrome results, and differentiate between true data bit flips and false positives. We apply the same method (Fig. 3), but with the crucial difference that the results are processed in real time. Specifically, the processor acquires stabilizer measurement results for N cycles, and uses the engine to decode them into the appropriate set of X gates using a precomputed lookup table. These corrections are then applied by the pulse sequencers on the data qubits. Finally, the data qubits are measured as in Fig. 2, with the majority function also computed on the processor. Whereas for N ≤ 2 the decoder is ineffective, as there are not enough records to identify ancilla readout errors, a gap emerges at larger N (Fig. 3c) in favor of the decoder. Furthermore, this scheme eliminates the per-cycle latency cost; the latest ancilla results can be processed while the next cycle is executing. The total additional latency becomes fixed at 1300 ns (590 ns for each of the two processor engine calls, 120 ns for corrective gates), approximately equal to that accrued over two REC cycles (Table 1).
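To illustrate how a precomputed lookup table can separate data flips from measurement errors, the sketch below builds one by brute force. It is a stand-in for, not a reproduction of, the simplified matching decoder used in the experiment. It assumes that each record already holds the two parities per cycle (any ancilla state tracking has been undone), that at most one data qubit flips per cycle, and that data flips and measurement errors carry equal unit weight; ties are broken by enumeration order. All names are hypothetical.

```python
from itertools import product

# Syndrome (a_t, a_b) caused by a flip of each data qubit; None means no flip.
SYNDROME_OF = {None: (0, 0), "D1": (1, 0), "D2": (1, 1), "D3": (0, 1)}


def hypothesis(events, history):
    """Return (weight, net flips) for one sequence of per-cycle data flips."""
    flips, parity, weight = set(), (0, 0), 0
    for event, observed in zip(events, history):
        if event is not None:
            flips ^= {event}  # two flips of the same qubit cancel
            weight += 1
        s = SYNDROME_OF[event]
        parity = (parity[0] ^ s[0], parity[1] ^ s[1])
        # Any mismatch with the record is attributed to a measurement error.
        weight += (parity[0] ^ observed[0]) + (parity[1] ^ observed[1])
    return weight, frozenset(flips)


def decode(history):
    """Pick the lowest-weight explanation of a syndrome history."""
    candidates = (hypothesis(ev, history) for ev in product(SYNDROME_OF, repeat=len(history)))
    return min(candidates, key=lambda c: c[0])[1]


def build_lookup_table(n_cycles):
    """Precompute the correction for every possible syndrome history."""
    rounds = list(product((0, 1), repeat=2))
    return {h: decode(h) for h in product(rounds, repeat=n_cycles)}
```

With three cycles, a syndrome that persists, such as ((1, 0), (1, 0), (1, 0)), decodes to a flip of D1, whereas an isolated one, ((0, 0), (1, 0), (0, 0)), is treated as a measurement error and yields no correction.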
In an effort to minimize errors by avoiding unnecessary quantum gates, it is important to move as many operations as possible to the classical hardware. In this case, we note that measuring the data qubits immediately after conditional X gates is equivalent to inverting the classical measurement result. Therefore, in a final experiment (Fig. 4, triangles), we dispense with the active correction and instead filter the d_i results based on the decoder output. This corresponds to a Pauli frame update (PFU) [2] applied just before the measurement. The slight reduction in latency (120 ns) and in error rate due to the absence of these pulses yields a consistent 1–2% improvement for all N over the actively corrected case. The results match those obtained by post-processing all the data in software (diamonds), confirming that the fast classical loop works as expected.
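The equivalence between a conditional X gate followed by measurement and a classical inversion of the result can be sketched as follows. The function `apply_pauli_frame` and the dict-based representation of the results are illustrative assumptions.

```python
def apply_pauli_frame(measured, flips):
    """Invert the classical results of the qubits the decoder marked for X."""
    return {qubit: bit ^ (qubit in flips) for qubit, bit in measured.items()}
```

For instance, if the decoder output marks D1 and the raw record is {D1: 0, D2: 1, D3: 1}, the frame-corrected record is all ones, with no pulse applied to the device.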
Finally, we evaluate the decoder against the majority result we would obtain by replacing the QEC gates and measurements with an idle time of equal duration. The result shows that, for all N > 1, DEC has a higher success probability of determining the initial state compared to free decay (Fig. 4, gray curve).

DISCUSSION
Although the experiment ends with the measurement of the data qubits, the readily available majority result may be used to condition additional operations on a second encoded qubit. This would be the case, for instance, to teleport an S gate using a logical ancilla [27]. More generally, the ability to update the Pauli frame in real time will be essential to implement quantum algorithms at the logical level. Since not all gates can be transversal in any given code [50], conditional operations based on the current frame can be used to complete the universal gate set (e.g., T gates in the surface code [30]).

Fig. 2 Real-time repeated error correction with ancilla reset. a Gate sequence for independent cycles of active error correction. b Average digitized result for each of D_1 (red), D_2 (blue), and D_3 (green) measurements after initialization in |111⟩ and N cycles of the sequence in a (full symbols). The results are compared to those obtained with the same circuit, but where data correction and ancilla reset operations are omitted (empty symbols). c Average majority vote m for the data in b. The closed-loop circuit (full symbols) does not improve over open loop (empty). In both b and c, solid (dashed) curves are obtained from numerical simulation for the corrected (uncorrected) case (Supplementary Methods). Error bars: range over five repetitions of the experiment (3000 shots each).
In conclusion, we have demonstrated the repeated measurement and real-time processing of stabilizers for a minimal bit-flip code. An intertwined readout and control system provides a real-time interface to the quantum processor, converting a series of stabilizer results to the current Pauli frame without interrupting the execution of a potential algorithm. This approach is not limited to superconducting qubits, but is applicable to any quantum computing platform that faces coherence-limited operation.
Finally, we touch upon the applicability of the presented control architecture to larger circuits. Clearly, the use of a look-up table as a decoder is a limiting factor, with the number of entries scaling exponentially with both circuit depth and width. However, we predict that 3-4 cycles of a small surface code [40] are within reach of this approach, provided that the processor is upgraded with commercially available, albeit significantly larger, memory. Developing efficient decoders for the fault-tolerant scale is an active area of research, with promising results addressing both low-latency and scalability requirements [51].

Fig. 4 Comparison between different QEC schemes. Measured average majority result m for REC (circles, from Fig. 2) and DEC with final active correction (squares, Fig. 3). The results obtained by replacing the correction in DEC with a PFU (triangles) match those obtained by post-processing (diamonds) the uncorrected data, using the same decoder. In all DEC cases and for N > 1, m stays above the result expected from an equivalent idle time (gray curve). Inset: difference between the average majority result for DEC with PFU and REC. Error bars: range over five repetitions of the experiment (3000 shots each).
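The exponential scaling of a look-up-table decoder noted in the discussion can be made concrete with a short calculation. It assumes one table entry per possible syndrome history, i.e., one bit per stabilizer per cycle; the choice of 8 stabilizers for a small surface code is an assumption made here (the count for a distance-3 code), not a figure from the text, and `lookup_entries` is a hypothetical name.

```python
def lookup_entries(n_stabilizers, n_cycles):
    """Number of distinct syndrome histories, one table entry each."""
    return 2 ** (n_stabilizers * n_cycles)


# Bit-flip code of this work: 2 stabilizers.
bit_flip_entries = [lookup_entries(2, n) for n in (1, 2, 3, 4, 5)]

# Assumed small surface code with 8 stabilizers, for 3 and 4 cycles.
surface_entries = [lookup_entries(8, n) for n in (3, 4)]
```

Under these assumptions the bit-flip table holds 4^N entries, while the assumed surface-code table grows from 2^24 to 2^32 entries between 3 and 4 cycles, which is why memory size becomes the limiting resource.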

Classical control hardware
The real-time protocols presented above rely on the interconnection between the receivers (two Innovative Integration X6-1000M digitizers), the processor (BBN Trigger Distribution Module, TDM, in ref. [48]), and the pulse sequencers (BBN Arbitrary Pulse Sequencers, APS2). The event sequence and the communication between those instruments, as well as their interface with the qubit device (hosted in a Bluefors BF-LD400), are detailed in the Supplementary Methods.

Numerical simulations
In this section, we describe the model and methods used to obtain the numerical simulation results shown in the main text. We chose not to do a full time-dependent master equation simulation of the error correction, as such an open system simulation is numerically intensive. Further, for the real-time error correction with ancilla reset, interspersing strong measurement and conditional operations within a time-dependent evolution is a nontrivial task. Instead, we use a simulation model that is approximate, but with well-controlled error that does not significantly reduce the accuracy of our results. We aim for qualitative agreement with the experimental results, using a model with no fit parameters (i.e., we do not search among models for a best fit to the data), with all free parameters in our model determined by independent characterization of the device. Each round of the error correction can be thought of as an entangling operation (CNOT gates), followed by a measurement operation, followed by an optional correction and ancilla reset operation. This can be represented by the following composition of linear maps:

ρ_{i+1} = (R_m ∘ M ∘ E)(ρ_i),

where ρ_i is the state after the ith round of error correction, and E, M, and R_m are the entangling, measurement, and correction/reset operations, respectively, with the correction/reset operation being conditional on the vector of ancilla measurement outcomes m. The modeling of these operations is described in detail in the Supplementary Methods.

Fig. 1 Stabilizer measurements on a five-transmon device. a Schematics of the device implementing the bit-flip code with the data qubits D_1, D_2, D_3 and the ancilla qubits A_t, A_b. Triangles represent the bus resonators coupling the qubits at their vertices. A Josephson Parametric Amplifier [52] (Converter [53]) enhances the readout of A_t (A_b). b Gate and measurement sequence for one round of stabilizer measurements. CNOT gates map the parity of D_1-D_2 (D_2-D_3) onto A_t (A_b) and are applied concurrently two at a time. c Simplified setup diagram highlighting the closed loop central to active error correction and decoding. For QEC cycles n ≤ N, the processor stores the stabilizer results {a_t, a_b}_n acquired by the receivers. When n = N, it executes a custom function (here decoder) and broadcasts the result back to the pulse sequencers for conditional gates (here X). The same framework is used to execute a logical data measurement, where the majority function is applied on a single acquisition {d_1, d_2, d_3}.

Fig. 3 Real-time decoder-based error correction and logical measurement. a Gate sequence with active correction, based on N rounds of stabilizer measurements, followed by simultaneous data qubit measurements and majority function. The sequence in each cycle is identical to the uncorrected case in Fig. 2 (including the cavity depletion, not shown). The shaded areas highlight the real-time processing. b Average results d_1, d_2, d_3 following the conditional X gates in a (solid symbols), compared to the uncorrected case (same as Fig. 2b; empty). Dash-dotted (dashed) curves are obtained from numerical simulation for the real-time corrected (uncorrected) case. c Average majority vote m for the data in b. Error bars: range over five repetitions of the experiment (3000 shots each).

Table 1.
The latency includes the time for classical processing, as well as active correction, where applicable.
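The latency figures quoted in the main text can be cross-checked with a short calculation. The constants below are taken from the text; the assumption that DEC involves exactly two engine calls (decoder and majority) follows from the stated 1300 ns total, and the function names are hypothetical.

```python
REC_PROCESSING_NS = 400  # aggregate and forward stabilizer results, per cycle
REC_GATES_NS = 160       # correction and reset operations, per cycle
ENGINE_CALL_NS = 590     # one processor engine call
DEC_GATES_NS = 120       # corrective gates after decoding


def rec_latency_ns(n_cycles):
    """Added latency for repeated error correction, which grows with N."""
    return n_cycles * (REC_PROCESSING_NS + REC_GATES_NS)


def dec_latency_ns(pauli_frame_update=False):
    """Fixed added latency for decoder-based correction (two engine calls)."""
    gates = 0 if pauli_frame_update else DEC_GATES_NS
    return 2 * ENGINE_CALL_NS + gates
```

With these numbers, two REC cycles add 1120 ns, DEC with active correction adds 1300 ns regardless of N, and replacing the correction with a Pauli frame update reduces this to 1180 ns.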