Silicon CMOS architecture for a spin-based quantum computer

Recent advances in quantum error correction codes for fault-tolerant quantum computing and physical realizations of high-fidelity qubits in multiple platforms give promise for the construction of a quantum computer based on millions of interacting qubits. However, the classical-quantum interface remains a nascent field of exploration. Here, we propose an architecture for a silicon-based quantum computer processor based on complementary metal-oxide-semiconductor (CMOS) technology. We show how a transistor-based control circuit together with charge-storage electrodes can be used to operate a dense and scalable two-dimensional qubit system. The qubits are defined by the spin state of a single electron confined in quantum dots, coupled via exchange interactions, controlled using a microwave cavity, and measured via gate-based dispersive readout. We implement a spin qubit surface code, showing the prospects for universal quantum computation. We discuss the challenges and focus areas that need to be addressed, providing a path for large-scale quantum computing.

The most promising routes towards large-scale universal quantum computing all require quantum error correction (QEC) [1], a technique that enables the simulation of ideal quantum computation using realistic noisy qubits, provided that the errors are below a fault-tolerant threshold. Using the most forgiving methods, such as the two-dimensional surface code [2], these error thresholds can be as high as 1% [3], a level that is now routinely achieved across several qubit platforms [4-10]. However, these approaches also require a platform that can be scaled up to very large numbers of qubits, of order $10^8$. This currently creates one of the most stringent barriers in the field, even for the most promising platforms. Here, we propose a method to overcome this hurdle, using spin qubits in silicon and taking direct advantage of CMOS technology. While silicon was recognized early on as a promising platform in the seminal work of Kane [11], leading to many novel architectures [12-18], a key and contrasting feature of our approach is that each architectural component is based on existing devices and commercially available technology to provide a scalable solution. We show that it is possible to construct a highly dense two-dimensional qubit array starting from a single silicon-on-insulator (SOI) wafer.
Silicon CMOS integrated circuits (ICs) are the prototypical example of scalable electronic platforms, now holding transistor counts exceeding billions. This remarkable level of integration is based upon decades of advances in silicon materials technologies, and these will also be crucial in the development of high-quality spin qubits. A key architectural aspect of ICs has been the use of parallel addressing via word lines and bit lines, facilitating rapid read and write operations on large 2D arrays of bits. Unfortunately, this method cannot directly be applied to scale qubit arrays. Unlike transistors, the tolerance levels of qubits are small, thereby requiring individual tunability. However, as we show here, the highly repetitive nature of error correction methods like the surface code enables the use of an advanced protocol for parallel addressing. Individual qubit stabilization is further obtained via floating memory gate electrodes that can be routinely reset, similar to dynamic random access memory (DRAM) systems. Together, these allow the design of a platform where the number of addressing lines increases in a scalable manner proportional to $\sqrt{N}$, where $N$ is the number of qubits.
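The $\sqrt{N}$ scaling of the addressing lines can be illustrated with a minimal sketch. It assumes a square array addressed by a fixed number of lines per row and per column; the function name and the two per-row/per-column parameters are hypothetical and do not reflect the actual wiring of the module described later.

```python
import math

def addressing_lines(n_qubits: int, lines_per_row: int = 1, lines_per_col: int = 1) -> int:
    """Rough word + bit line count for a square array holding n_qubits.

    Assumption (not from the paper): the array is square and every row and
    column needs a fixed number of lines, so the total grows as sqrt(N).
    """
    side = math.isqrt(n_qubits - 1) + 1  # smallest integer side with side**2 >= n_qubits
    return (lines_per_row + lines_per_col) * side

for n in (480, 10**6, 10**8):
    print(f"N = {n:>9d}  ->  ~{addressing_lines(n)} addressing lines")
```

Under these assumptions, increasing the qubit count by a factor of 100 increases the line count only by a factor of about 10, in contrast to one dedicated line per qubit.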

I. PHYSICAL ARCHITECTURE
The general architecture we propose is depicted in Fig. 1. We start with an SOI wafer, where the top layers host the classical circuitry, the isotopically enriched silicon-28 bottom layer holds the quantum circuit, and these are interconnected via metal lines which penetrate the oxide region, see Fig. 1a. The fabrication could be performed monolithically, from a single wafer, or include flip-chip technologies to enable the construction of the two circuits separately. We focus here on single spin qubits confined in quantum dots [10]. For complete qubit control, one data line ($D_{2i}$) is interconnected to each corresponding qubit ($Q_i$) to tune the qubit resonance frequency ($\nu_i$), while a second ($D_{1i}$) interconnects to each J-gate to control the exchange coupling between qubits, shown in Fig. 1b. To provide individual, row, or global qubit addressing, the data lines are controlled by a combination of word lines ($W$) and bit lines ($B$). Assuming the minimal width of, and separation between, the gates and doped regions is equal to the minimum feature size $\lambda$, the classical circuit occupies an area $80\lambda^2$ per qubit (see Supplementary Information section 1 for further details). A feature size of 7 nm would require a minimum qubit size of ≈ 63 nm × 63 nm, consistent with experimental realizations of silicon quantum dot qubits [10,19]. Large foundries are now capable of manufacturing some features down to this size, but ongoing advances in down-scaling will be needed to fabricate the classical devices assumed here, and so the development of such a quantum computer will therefore need to proceed hand-in-hand with the ongoing advances in semiconductor technology.

FIG. 1. Physical quantum processor. a A silicon-on-insulator (SOI) wafer is processed, such that the bottom layer of isotopically enriched silicon-28 contains the 2D qubit array and the top layer of silicon forms the transistors to operate the qubits. These are interconnected through the oxide regions using polysilicon vias. b Electrical circuit for the control of one Q-gate and one J-gate, allowing the required individual, row-by-row, or global operations, as explained in the main text. c Physical architecture to operate one unit module containing 480 qubits. The inset on the bottom right shows a plan view cross-section through the qubit plane. Each J-gate and qubit is connected via the circuit shown in (b).
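The qubit footprint quoted in this section follows from the stated area of $80\lambda^2$ per qubit. The short sketch below reproduces that arithmetic under the assumption that the footprint is square; the function name is illustrative only.

```python
import math

AREA_PER_QUBIT_IN_LAMBDA2 = 80.0  # classical circuit area per qubit, in units of lambda^2 (from the text)

def qubit_pitch_nm(feature_size_nm: float, area_in_lambda2: float = AREA_PER_QUBIT_IN_LAMBDA2) -> float:
    """Side length of a square footprint of the given area (assumes a square qubit cell)."""
    return math.sqrt(area_in_lambda2) * feature_size_nm

pitch = qubit_pitch_nm(7.0)
print(f"lambda = 7 nm  ->  qubit pitch ~ {pitch:.1f} nm")
```

For a 7 nm feature size this gives a pitch of $\sqrt{80} \times 7$ nm, which rounds to the 63 nm stated above.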
Generally, the most compact classical circuits have different geometries from quantum circuits. The situation is further complicated by the geometrical layout of the metal connection lines, determined by the quantum error correction implementation. We have overcome the complexity in scaling these differently sized circuit components via the use of vertically stacked interconnection layers, and as the number of qubits increases, the three layers become spatially identical. This point is reached upon expanding the structure to host 480 qubits, as shown in Fig. 1c (see the Supplementary Information section 2 for further details), and beyond this further scaling becomes a straightforward replication of this 480-qubit module. A full quantum processor would then contain multiple modules and the edges would be connected to a doped silicon region, serving as an electron reservoir, from which electrons may be sequentially loaded into the qubit array as is done in charge-coupled devices [21]. The word and bit lines of the integrated quantum processor chip will then be connected to classical control and measurement electronics [22] that can reside next to or further away from the quantum chip depending on their level of power dissipation.

II. ELECTRICAL OPERATION
We now turn to the electrical operation of the qubit module, Fig. 2, and consider surface code operation, Fig. 3. A single electron is loaded into each quantum dot by addressing the corresponding word and bit lines. The electron occupancy is verified by gate-based dispersive readout, as shown in Fig. 2c and described further below. We assume that the complete structure is maintained at cryogenic temperatures (∼1 K or less) inside an electron spin resonance (ESR) cavity, which will be used to apply qubit control pulses. Each qubit must be calibrated to its desired qubit resonance frequency by tuning the associated floating memory gate, using electrical g-factor control, as has been demonstrated experimentally [10]. The surface code operation we discuss here requires a total of six different resonance frequencies (see Fig. 3). The qubit gates ($Q_{ij}$) are calibrated to voltages such that the exchange coupling between adjacent qubits is negligible when the intermediate J-gates are set at an "off" bias point, and for which there is a common value of exchange when the J-gates are set to an "on" bias. Global (i.e. parallel) control is a crucial aspect for large-scale operation. The use of floating memory gates in the architecture proposed here has the significant advantage of enabling the individual tuning of qubits, while requiring only a minimal number of control lines that can then be set to common bias levels, thus enabling global operations.

II.a. Gate-based dispersive readout and initialization
Two popular methods for spin qubit readout are based on spin-to-charge conversion: readout based on the Zeeman energy (using a reservoir) [23] and readout based on the singlet-triplet energy (via Pauli spin blockade) [24]. Both approaches can be made compatible with our control circuit, but readout based on Pauli spin blockade offers a number of advantages: a larger energy scale, leading to higher readout fidelity; no necessity for a large electron reservoir; and no requirement for a large magnetic field, so that the qubit resonance frequency can be freely chosen (e.g. operation can be at a rather low frequency of order one GHz). We therefore propose to use Pauli spin blockade for parity readout between two spin qubits.
Dispersive readout [25] is generally considered for multi-dot qubits such as singlet-triplet qubits [24], but here we envision the readout of single spins by exploiting Pauli spin blockade. Single spin states can be projected onto singlet-triplet states using a reference neighbour dot, thus allowing a parity measurement between two qubits. Starting from the (0,2) singlet ground state, qubit initialization is obtained by adiabatically moving to the (1,1) state, which results in a spin-down state in the dot with the larger g-factor [20]. The adiabaticity here is with respect to both the tunnel coupling and the resonance energy difference between the qubits, which can be larger than 100 MHz [26]. Qubit readout occurs via the reverse process of initialization. Depending on the target qubit spin state, the (0,2) singlet state will be partly occupied. This will result in a capacitance that is dependent on the state of the target qubit, which can be observed in the reflected power in the RF circuit connected to a nearby gate [25], see Fig. 2.
The readout is performed in a row-by-row manner and the parity analyzers are connected to the data lines $D_{2i}$ via bias tees, see Fig. 2d. Using classical circuitry, it is possible to frequency multiplex an entire row [27] so that only one RF analyzer circuit is needed; however, it could be more convenient to use separate analyzers for each bit line, as depicted in Fig. 2a. Operating dispersive readout at 1 GHz enables readout on the order of 10-100 ns, such that a large qubit array can be read out well within the single-qubit coherence time of 28 ms in $^{28}$Si substrates [10].
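To make the timing argument concrete, the following sketch estimates what fraction of the 28 ms coherence time is consumed by sequential row-by-row readout at the upper bound of 100 ns per row. The row counts used (24 rows for one module, $10^4$ rows for a square array of $10^8$ qubits) are illustrative assumptions, and the estimate ignores any overhead between rows.

```python
def readout_fraction(n_rows: int, t_row_s: float, t_coherence_s: float = 28e-3) -> float:
    """Fraction of the coherence time used when n_rows are read out one after another.

    Assumption: rows are read strictly sequentially with no switching overhead.
    """
    return n_rows * t_row_s / t_coherence_s

T_ROW = 100e-9  # upper end of the 10-100 ns readout time quoted in the text

module = readout_fraction(24, T_ROW)      # hypothetical: 24 rows in one 480-qubit module
large = readout_fraction(10_000, T_ROW)   # hypothetical: square array of 1e8 qubits

print(f"one module:   {module:.2e} of the coherence time")
print(f"1e8 qubits:   {large:.3f} of the coherence time")
```

Under these assumptions even the large array is read out in about 1 ms, a few percent of the quoted coherence time.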
To be able to perform parallel operations, an integrated 3D arrangement of the addressing and qubit structures is required, such that a certain combination of word lines and bit lines will address the same particular qubit in each unit cell. This is implemented in the schematic in Fig. 2, where the unit cell has a size 2×3 (2 data qubits and 4 measurement qubits). Then, 20×24 qubits may be addressed using input lines on a 9×54 grid (see Supplementary Information section 2). To deselect individual qubits, the J-gates surrounding the relevant qubits are deactivated, thereby isolating them from the data qubits and creating an additional degree of freedom in the array for quantum computation. This protocol will be particularly relevant for operation of the defect-based surface code, as described in section II.b.

II.b. Surface code operations
Surface codes are among the most promising methods for quantum error correction [1,3]. In our approach, an alternating arrangement of data and measurement qubits is used, where two data qubits interact with four measurement qubit neighbours, and the surface code unit cell becomes as shown in Fig. 3a. Two measurement qubits together enable a parity readout step, and this implementation is thus slightly larger than the usual surface code unit cell of four qubits. The measurement qubits are initialized to I by adiabatically moving from the (0,2) charge state to the (1,1) charge state, as discussed in section II.a. Single-qubit operations and the two-qubit CPHASE and SWAP operations are then performed, followed by dispersive readout. The complete surface code cycle for quantum dot qubits, see Fig. 3b, then involves ten steps.
The focus of the work presented here is the realization of a 2D qubit array and we envision that many surface code schemes and even analog quantum simulator algorithms can be constructed based on our design. We therefore do not undertake here a detailed analysis of the particular error thresholds associated with our surface code implementation. However, since our implementation is based on general surface codes and the number of operations is comparable with those previously reported [3], we expect that the fault-tolerant error thresholds will largely remain the same; see Supplementary Information section 3 for a comparison between general surface codes and the spin qubit surface code as presented here.
Recent demonstrations of single- and two-qubit gates in silicon [10,20] thereby provide significant scope to meet all the required fault-tolerant thresholds. Further improvements in two-qubit fidelities are conceivable, for example via operation at the charge symmetry point for a pair of quantum dot qubits [28,29].
To perform logical quantum operations on the qubit module with a defect-based surface code, qubit deselection is required to create holes for braiding operations [3]. Individual qubit (de)selection is enabled by the circuit shown in Fig. 2c, using word and bit lines $W_{1j}$ and $B_{1i}$. The required holes will be limited, as most physical qubits will be used to create the logical qubits. The infrequent nature of required qubit (de)selection allows for this to be done individually, rather than globally, and we achieve this by deactivating the associated J-gates, thereby isolating the associated data qubits from their measurement qubits.

FIG. 3. Surface code operation. a Single unit cell, containing six qubits: two data qubits, D1 and D2, and four measurement qubits, Z1, Z2, X1 and X2. Each of these qubit classes has a well-defined independent qubit resonance frequency. b Surface code operation based on this unit cell. Note that the labels A, B, C and D refer to the data qubits associated with the respective measurement qubit (see also a). A single cycle of initialization, control, and readout corresponds to ten steps. An additional SWAP operation is included, compared to standard surface code operation [3], for the qubit readout step. The single-qubit operations H are Hadamard-like and compensate for ẑ-axis rotations that can occur during the CPHASE [20]. See Supplementary Information section 3 for qubit operation and for details on the comparison between the general surface code and this quantum dot surface code.

III. HEAT DISSIPATION
A critical factor for almost any large-scale computing platform is cooling power. While it is not within the scope of this manuscript to calculate the total power dissipation, which will depend on the exact layout of the architecture, we estimate here the dynamic power produced by the J-gates, which is likely the largest source of dissipation. The power dissipation of a single surface code unit cell, shown in Fig. 3, is given by $P = C V^2 \alpha f$, with $C$ the capacitance of the floating memory, $V$ the switching voltage, and $\alpha$ the activity factor relative to the surface code clock cycle with frequency $f \approx 0.1$ MHz (assuming Rabi frequencies on the order of 1 MHz [10]). The surface code unit cell is operated using 54 transistors and during a full cycle the J-gate activity is $\alpha = 12$. The floating gate electrodes may be periodically refreshed, as in DRAM technology, but we estimate that for high-fidelity qubit operation RC times beyond one second will be required to avoid significant drifts during operation. We assume this requires a capacitance $C \approx 1$ pF, with an associated Johnson-Nyquist thermal noise $V_{\mathrm{thermal}} = \sqrt{k_B T / C} \approx 1$ µV, providing a tolerable level [20]. Assuming a switching voltage $V = 0.2$ V then results in a power dissipation for a single unit cell of ≈ 50 nW.
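The numbers in this estimate can be checked with a few lines of arithmetic. The sketch below evaluates $P = C V^2 \alpha f$ and the thermal noise $\sqrt{k_B T / C}$ for the values given in the text; the temperature of 0.1 K used for the noise estimate is an assumption, chosen because it reproduces the quoted ≈ 1 µV, and the function names are illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def dynamic_power_w(capacitance_f: float, voltage_v: float, activity: float, clock_hz: float) -> float:
    """Dynamic switching power P = C * V^2 * alpha * f."""
    return capacitance_f * voltage_v**2 * activity * clock_hz

def thermal_noise_v(temperature_k: float, capacitance_f: float) -> float:
    """RMS thermal (kT/C) voltage noise on a capacitor: sqrt(k_B * T / C)."""
    return math.sqrt(K_B * temperature_k / capacitance_f)

C = 1e-12      # floating memory capacitance, 1 pF (from the text)
V = 0.2        # switching voltage in volts (from the text)
ALPHA = 12     # J-gate activity per surface code cycle (from the text)
F = 0.1e6      # surface code clock frequency, 0.1 MHz (from the text)
T = 0.1        # assumed temperature in kelvin (not stated alongside the noise figure)

p_cell = dynamic_power_w(C, V, ALPHA, F)
v_noise = thermal_noise_v(T, C)
print(f"power per unit cell: {p_cell * 1e9:.0f} nW")
print(f"thermal noise:       {v_noise * 1e6:.2f} uV")
```

The result is 48 nW per unit cell, consistent with the ≈ 50 nW quoted above, and a noise level close to 1 µV at the assumed temperature.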
The available cooling power depends on the dilution refrigerator, but will ultimately be limited by the thermal conductivity of the addressing lines in the upper layers of the circuit. The thickness will depend on the exact implementation, but assuming ten to twenty stacked metallic layers we estimate that the total thickness of the lines will be below 5 µm. Polysilicon at temperatures close to zero kelvin can have a thermal conductivity $\kappa = 100$ W/(m K), and sufficient cooling power will thus be available at temperatures above ≈ 0.1 K. Silicon metal-oxide-semiconductor (MOS) spin qubits can have a significant advantage for qubit operation at higher temperature, due to large energy scales of excited states and measured valley splittings, exceeding 10 K [30]. Further reductions in the required cooling power can be made by reducing the operation voltage, which is foreseeable at cryogenic temperatures, but possibly also by operating the transistors as single-electron transistors [31], thereby significantly lowering the switching voltage.

IV. DISCUSSION
The architecture shown here demonstrates that an array of single electron spins confined to quantum dots in isotopically purified silicon can be controlled using a scalable number of control lines.We have shown that the often argued compatibility of silicon spin qubits with standard CMOS technology is non-trivial.However, in the case of quantum dot qubits, the fabrication can be made consistent with standard CMOS technology and be scaled up to contain thousands to even millions of qubits.Provided that the down-scaling of CMOS transistors continues as anticipated, the control and measurement circuitry described can be integrated with qubits of a size that have already been experimentally demonstrated [10,19,20].The combination of ESR control, exchange coupling and dispersive readout of this design enables surface code operations to be performed using this platform.A key advantage is the possibility of global qubit control, so that many qubits can be addressed within the qubit coherence time.
The proposed architecture is based on the current experimental status of silicon qubits and requires multiple transistors per qubit, significantly challenging CMOS manufacturing capabilities.Advancements in device uniformity and reproducibility could lower the number of required transistors.For example, with more uniform qubits the tuning circuitry and associated floating gates might not be needed.Additionally, operating at low magnetic fields will result in uniform qubit frequencies, avoiding the need for g-factor tuning.This limits functionality, since single-qubit gates can then be applied only globally, but universal computing is still possible using the local two-qubit gates.We anticipate that 2D arrays with such limited functionality can be realized in the near future, and will aid in the development of the universal quantum processor as presented here.
The architecture for control and operation presented here is highly generic and can be implemented in a number of platforms, including spin qubits based on either Si/SiO$_2$ or Si/SiGe heterostructures, and various modes of operation such as single spin qubits [10,19], singlet-triplet qubits [32], exchange-only [33] or hybrid qubits [34]. The system we considered here requires only local exchange interaction, but the architecture could also be incorporated in larger architectures that include long-range qubit coupling [13,35-37], for example to interconnect quantum structures as presented here. While we consider the fabrication on a single SOI wafer, a more advanced and complex fabrication process could include multiple stacked layers to allow for more complex classical electronics per qubit, or for a separate control circuit that is purely dedicated to calibration and stability. A more sophisticated design could also include frequency multiplexing along a row, allowing global readout. While the full fabrication and operation of our architecture is a formidable task, we believe that the detailed description together with the key requirements identified here pave the way towards an era of large-scale quantum computation, using the same silicon chip technology that has defined our current information age.

Section 2. Matching planes (from qubit to address line)
In drawing the physical qubit structure we have assumed a single linewidth, for both gate definition and gate separation. This allows us to define a single parameter $\lambda$, set by the feature size of the fabrication platform. While a 2D qubit plane takes on a square shape due to the square (or circular) shape of the qubits, we found that this is generally not the case for the most compact classical control layers. These different aspect ratios of the planes are matched using the vias, and the planes can take on the same shape after expanding to a larger number of qubits, as described below.
We start with the basic control structure, which connects to a qubit and two J-gates, see Supplementary Figure 5. The aspect ratio of the control structure is 4λ × 20λ. In order to match with a square qubit, we extend the control structure to a set of 20 × 9, and the resulting structure is shown in Supplementary Figure 6. This control structure addresses a qubit array 20 × 4, which has the same footprint. However, in order to match the surface code protocol shown in the main text, Fig. 3, we again have to extend the structure to hold 54 × 9 classical control structures for 24 × 20 qubits (note the presence of 6 redundant classical control structures that appear after matching the aspect ratio). The resulting, completely extendable, structure is shown in Supplementary Figure 7.
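The plane-matching arithmetic for the 480-qubit module can be checked as follows. This is a sketch under two assumptions that are not spelled out in the text: that the 54 control structures lie along the 4λ direction and the 9 along the 20λ direction, and that each qubit occupies a square of side $\sqrt{80}\lambda$. All names are illustrative.

```python
import math

CTRL_W, CTRL_H = 4, 20                      # control structure size in units of lambda (from the text)
QUBIT_PITCH = math.sqrt(CTRL_W * CTRL_H)    # square qubit with the same 80 lambda^2 area (assumption)

def match_planes(ctrl_grid, qubit_grid):
    """Compare footprints (in lambda) and element counts of control and qubit planes."""
    n_ctrl_x, n_ctrl_y = ctrl_grid
    n_q_x, n_q_y = qubit_grid
    ctrl_footprint = (n_ctrl_x * CTRL_W, n_ctrl_y * CTRL_H)
    qubit_footprint = (n_q_x * QUBIT_PITCH, n_q_y * QUBIT_PITCH)
    redundant = n_ctrl_x * n_ctrl_y - n_q_x * n_q_y
    return ctrl_footprint, qubit_footprint, redundant

ctrl_fp, qubit_fp, redundant = match_planes((54, 9), (24, 20))
print("control plane:", ctrl_fp, "lambda")
print("qubit plane:  ", tuple(round(x, 1) for x in qubit_fp), "lambda")
print("redundant control structures:", redundant)
```

With these assumptions the two planes agree to within about 1% in each direction, and the 486 control structures exceed the 480 qubits by the 6 redundant structures noted above.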
Section 3. Qubit operation
In this section we discuss the qubit operation in more detail.

Qubit initialization and readout
There are currently two popular methods for qubit readout based on spin-to-charge conversion within the spin qubit community: readout based on the Zeeman energy (using a reservoir) and readout based on the singlet-triplet energy (via Pauli spin blockade). While both approaches can be made compatible with our control circuit, readout based on Pauli spin blockade offers the advantages of a larger energy scale (higher readout fidelity), no necessity for a large electron reservoir, and a qubit resonance frequency that can be independently optimized (e.g. operation can be at low frequency). Two electron spins residing in adjacent dots can be coupled by turning on the J-gate. In an adiabatic experiment, single spin states can then be converted onto the singlet-triplet axis. The triplet states have charge occupancy (1,1), whereas the singlet states are in the (0,2) state. The resulting difference in capacitive coupling to the floating gate can be used for dispersive readout, i.e. the reflected power in an RF setup will depend on the qubit spin state. Adiabatic separation of the two electrons then initializes the qubits for the following cycle.

Single-qubit logic operations
We assume that the complete 2D plane is positioned inside a cavity. In order to perform qubit operations, e.g. surface code operation, six individual qubit resonance frequencies are needed to individually control the qubit subsets (the z-axis qubits Z1 and Z2, the data qubits D1 and D2, and the x-axis qubits X1 and X2) as shown in Fig. 3 of the main text. These operations are controlled globally via the cavity. Individual qubit tuning is controlled electrically via g-factor control [1]. This tuning allows the qubits to be calibrated into the required subsets, but also allows qubits to be actively (de)selected to create the 'holes' that are essential in surface code operation [3].

Two-qubit logic operations
Two-qubit operations are achieved by electrically controlling the tunnel coupling and/or detuning energy, as experimentally realized in [4]. By turning the interaction on, the qubits will acquire a time-integrated phase dependent on the spin state of the coupled qubit. This allows either a SWAP or a CPHASE operation to be created, set by the interaction strength and the respective qubit resonance frequency difference. These two-qubit gates then allow the surface code cycle to be performed, as shown in the main text, Fig. 3b.

Surface code with quantum dot spin qubits
The general surface code cycle [3] is shown in Supplementary Figure 8a, which contains a sequence of CNOT operations together with single-qubit Hadamards, readout and initialization steps. For the spin qubit approach, we realize these CNOT operations via a combination of CPHASE and two single-qubit pulses, shown in Supplementary Figure 8b. For the readout phase, we implement two 'reference' qubits, which are another two quantum dots. The measurement qubits can then be read out via dispersive charge detection and the resulting sequence is shown in Supplementary Figure 8c. Here, we have included a SWAP operation, such that initialization can be adiabatically achieved using the tuned g-factor difference (see section 3, qubit initialization and readout, for details).
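The decomposition of a CNOT into a CPHASE and two single-qubit pulses can be verified numerically. The sketch below uses ideal gates only: an exact Hadamard and an exact controlled-Z, rather than the Hadamard-like pulses that additionally absorb the ẑ rotations of a physical CPHASE. The helper functions are illustrative, written in plain Python to stay self-contained.

```python
import math

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def kron(a, b):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]                    # ideal Hadamard
I2 = [[1, 0], [0, 1]]
CZ = [[1, 0, 0, 0], [0, 1, 0, 0],        # ideal CPHASE (controlled-Z)
      [0, 0, 1, 0], [0, 0, 0, -1]]

H_ON_TARGET = kron(I2, H)                # Hadamard on the second (target) qubit only
cnot = matmul(H_ON_TARGET, matmul(CZ, H_ON_TARGET))

EXPECTED_CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
is_cnot = all(abs(cnot[i][j] - EXPECTED_CNOT[i][j]) < 1e-12
              for i in range(4) for j in range(4))
print("H . CZ . H on target equals CNOT:", is_cnot)
```

The check relies on the identity $H Z H = X$: sandwiching the controlled-Z between Hadamards on the target turns the conditional phase flip into a conditional bit flip.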

FIG. 2. Electrical circuit and qubit addressing scheme. a Electrical wiring of the 480-qubit module. The word lines (W), bit lines (B) and data lines (D) can be addressed to enable global control, to couple and read out row-by-row, and to individually (de)select qubits. The W and B lines are grouped in five and the D lines in three, such that a combination of these forms the lines of the electrical circuit of a single extendable structure, consisting of a single qubit and two J-gates. The zigzag structure in (a) accommodates the different aspect ratios of qubit size and control size, and ensures consistency with surface code operation (see Supplementary Information sections 2 and 3 for further details). The purple rectangle displays the region that is occupied by 6 qubits, corresponding to a surface code unit cell (see Fig. 3a). Note that the word lines are connected to the qubits in an alternating arrangement in order to make the circuit compatible with our spin qubit surface code scheme. b Typical operation protocol of the electrical circuit shown in c and d. Individual qubit selection can be performed via lines W1 and B1 that (de)charge floating electrodes (M1 in c) and (dis)connect the data lines from the corresponding J-gates. Two-qubit operations are performed by activating the associated lines W2 and B2 and sending a pulse through data line D1. Global single-qubit operations can be applied by broadcasting an ESR pulse at the resonance frequencies of the corresponding subgroup of qubits at any time of the sequence. Readout is enabled via the lines W2, B2, W3, and B3. Then a pulse turns on the selected J-gates, and RF readout is performed via the data line D2 connected to the qubit. The electrical circuits in c and d show the corresponding structures to control the qubits and the exchange coupling between them. The floating memories M1 and M2 maintain the desired electric fields on the respective J and Q gates and may be periodically refreshed.

FIG. 4. Control element, transistor circuit. a Control element for a single qubit and two J-gates (the electrical circuit is depicted in the main text, Fig. 1b). The grey elements correspond to the transistor switches that allow a line to be activated. The scale λ is the feature size, which is taken to be constant for each metal or dielectric layer. b Corresponding qubit and associated J-gates.

FIG. 5. Control element, word and bit lines. Control element for a single qubit (zoomed-out version of Supplementary Figure 4). In a 2D quantum dot array with nearest-neighbour coupling, the basic elementary scalable structure is one qubit and two J-gates.

FIG. 8. Surface code operation. a General surface code operation. b Surface code cycle after decomposing the CNOT into CPHASE and Hadamard operations. CPHASE operations with quantum dot qubits usually result in additional ẑ rotations, which can be corrected using single-qubit gates. Here this is included in the Hadamard, resulting in a Hadamard-like operation. c Surface code on a 2D quantum dot array.