Benchmarking an 11-qubit quantum computer

The field of quantum computing has grown from concept to demonstration devices over the past 20 years. Universal quantum computing offers efficiency in approaching problems of scientific and commercial interest, such as factoring large numbers, searching databases, simulating intractable models from quantum physics, and optimizing complex cost functions. Here, we present an 11-qubit fully-connected, programmable quantum computer in a trapped ion system composed of 13 $^{171}$Yb$^{+}$ ions. We demonstrate average single-qubit gate fidelities of 99.5%, average two-qubit-gate fidelities of 97.5%, and SPAM errors of 0.7%. To illustrate the capabilities of this universal platform and provide a basis for comparison with similarly-sized devices, we compile the Bernstein-Vazirani and Hidden Shift algorithms into our native gates and execute them on the hardware with average success rates of 78% and 35%, respectively. These algorithms serve as excellent benchmarks for any type of quantum hardware, and show that our system outperforms all other currently available hardware.

Small universal quantum computers that can execute textbook quantum circuits exist in both academic [1][2][3][4][5] and industrial [6][7][8][9][10] settings. With a range of 2-72 qubits and sufficient fidelity for only tens of entangling gates, these devices and the underlying qubit implementations can be difficult to compare. Even within the trapped ion platform, there is large diversity in atomic species, system architectures, and gate implementations. Trapped ion systems with one to two qubits have shown single-qubit gate fidelities of 99.9999% 11 with microwave-based operations and better than 99.99% fidelity with laser-based operations 12,13, state preparation and measurement (SPAM) error below $10^{-4}$ 11,14, and two-qubit gates with fidelities exceeding 99.9% 12,13. Algorithms have been executed on up to seven trapped-ion qubits 15 and, while not optimized for universal quantum computing, quantum simulators with more than 50 ions have modeled fundamental quantum systems including Ising chains 16 and quantum magnetism 17.
Benchmarking across implementations needs to be both universal across platforms and agnostic to the differences in the underlying hardware. In traditional computing, the performance of computers is measured by executing a set of benchmark problems representing various use-case scenarios, to provide users with an estimate of how the computers would perform in their specific applications. Canonical quantum algorithms demonstrate an unambiguous advantage of quantum computers over classical computation, and provide verifiable outcomes to assess successful execution of the algorithm. Therefore, they can serve as ideal candidate problems for benchmarking the performance of any quantum computer. These benchmark algorithms exercise the full hardware/software stack. A hardware-specific compiler breaks down algorithms into the target hardware's native gate set, optimizing for qubit connectivity, gate times, and coherence 18 to enhance the system's performance. After execution on the hardware, the measurements can be directly compared with the expected output state to determine the accuracy of the device. This accuracy can then be compared with other devices that have compiled and run the same algorithm 19.
We benchmark two algorithms on an IonQ trapped-ion quantum computer, shown schematically in Fig. 1. Our qubit register is composed of a chain of trapped $^{171}$Yb$^{+}$ ions, spatially confined near a microfabricated surface electrode trap 20, separating this work from similar implementations in more macroscopic traps 3,18. By using a microfabricated trap, the underlying hardware of this quantum computer is more extensible than a traditional macroscopic trap. This is due in large part to the highly reproducible nature of microfabricated devices. In addition to this advantage, microfabricated surface traps have many more control electrodes, which allows for fine control of the trapping potential. This becomes important in practice when trying to maintain equal ion spacing in long chains of ions. To the best of our knowledge, the largest similar algorithm implementation using a surface electrode trap was limited to three qubits 21; for this work, we loaded 13 ions, the middle 11 of which were used as qubits. The two end ions allowed for a more uniform spacing of the central 11 ions. However, on this same apparatus we have successfully loaded over 150 ions and have performed selective single-qubit rotations on subsets of chains of up to 79 qubits. The choice to use 11 qubits was informed by the number of gates required for full oracle implementations, our underlying gate fidelities, and the time required to run all of the oracles.

Results
Gate implementation and characterization. The ions are laser-cooled close to their motional ground state using a combination of Doppler and resolved sideband cooling. We encode quantum information into the hyperfine sublevels $|0\rangle$ and $|1\rangle$. At the beginning of each computation, each qubit is initialized to $|0\rangle$ via optical pumping with high accuracy. After qubit operations (described below), we read out the state of all of the qubits simultaneously by directing laser light resonant with the $^{2}S_{1/2}\,|F=1\rangle$ to $^{2}P_{1/2}$ transition, imaging each ion onto an independent detector and thresholding the photon counts to determine if each qubit was in the $|1\rangle$ (spin up) or $|0\rangle$ (spin down) state. Thresholding is done by taking a histogram of the collected photons and discriminating between collecting on average zero photons for the $|0\rangle$ state and 10 photons on average for the $|1\rangle$ state. Thresholding is a sufficient discriminating function in our system because our detectors are highly isolated from one another, resulting in detection crosstalk between adjacent ions below a part in $10^{4}$.
Fig. 1 [...] illuminate all of the ions during cooling, initialization, and detection. Each ion's fluorescence is imaged through a 0.6 numerical aperture lens (detection optics) and directed onto individual photomultiplier tube channels. Two linearly polarized counterpropagating 355 nm Raman beams are aligned to each qubit ion: a globally addressing beam that couples to all of the qubits (red) and an individual addressing beam that is focused onto each ion (blue). Acousto-optic modulators (AOMs) modulate the frequency and amplitude of each of these beams to generate single-qubit rotations and XX-gates between arbitrary pairs of qubit ions.
A two-photon Raman transition drives single-qubit and two-qubit coherent operations by applying a pair of counterpropagating beams from a mode-locked pulsed 355 nm laser 22. One of these beams globally addresses all of the ions simultaneously, while the other beam addresses any of the ions individually (Fig. 1). The individually addressing beams pass through a multi-channel acousto-optic modulator (AOM), which allows for the simultaneous modulation of the phase, frequency, and amplitude of each beam. To perform a single-qubit gate, we tune the frequency difference between Raman beams to resonantly drive a spin-flip transition ($|1\rangle \leftrightarrow |0\rangle$). In order to perform a two-qubit gate, we off-resonantly drive motional sideband transitions to generate an XX-interaction 23. Both the global and individual beams are directed over the trap surface perpendicular to the axis of the ion chain to excite one principal axis of motion transverse to the chain axis. Individual addressing allows us to perform single-qubit and two-qubit gates on any targeted qubits.
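The photon-count thresholding used for readout earlier in this section can be illustrated with a short sketch. It assumes Poisson-distributed photon counts and takes the bright-state mean of 10 photons from the text; the dark-state mean, the threshold value, and the function names are hypothetical choices made only for illustration. It is an idealized model and is not expected to reproduce the measured SPAM errors.

```python
import math


def poisson_cdf(k_max: int, mean: float) -> float:
    """P(count <= k_max) for a Poisson distribution with the given mean."""
    return sum(
        math.exp(-mean) * mean**k / math.factorial(k) for k in range(k_max + 1)
    )


def threshold_errors(threshold: int, dark_mean: float, bright_mean: float):
    """Return (P(dark read as bright), P(bright read as dark)).

    A shot is labelled |1> (bright) when its photon count exceeds `threshold`.
    """
    dark_as_bright = 1.0 - poisson_cdf(threshold, dark_mean)
    bright_as_dark = poisson_cdf(threshold, bright_mean)
    return dark_as_bright, bright_as_dark


# bright_mean = 10 is quoted in the text; dark_mean and threshold are assumed.
dark_err, bright_err = threshold_errors(threshold=1, dark_mean=0.05, bright_mean=10.0)
print(f"dark->bright: {dark_err:.2e}, bright->dark: {bright_err:.2e}")
```

Raising the threshold trades dark-state errors for bright-state errors, which is why the two means need to be well separated for a single cut to work.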
Native two-qubit entangling XX-gates are achieved by driving a spin-dependent force 23 . Using an amplitude-modulated (AM) pulse on any selected pair of qubits, we address multiple transverse motional modes of the ion chain to mediate a spin-spin Ising interaction between qubits 24 . To achieve high fidelity, the amplitude modulation is calculated to simultaneously decouple all motional modes from the spin at the end of the gate operation. Additionally, these pulse shapes are designed to provide robustness against frequency drift of motional modes and suppress residual off-resonant carrier excitation during the XX-gate [24][25][26][27] . This gate, in conjunction with single-qubit rotations, forms a universal gate set for performing circuit model quantum computation. Since the XX-gates are mediated by the collective motion of the ion chain, we have all-to-all connectivity between qubits, allowing two-qubit gates to be executed between any qubit pair (Fig. 2a).
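As a small illustration of the native entangling gate, the sketch below builds the ideal unitary $\exp(-i\theta\, X \otimes X)$ and applies it to $|00\rangle$. The angle convention ($\theta = \pi/4$ for maximal entanglement) and the function names are assumptions of this sketch; it models only the ideal gate, not the amplitude-modulated pulse that implements it.

```python
import cmath
import math


def xx_gate(theta: float):
    """Matrix of exp(-i*theta*X(x)X) in the basis |00>, |01>, |10>, |11>.

    Uses exp(-i*theta*XX) = cos(theta)*I - i*sin(theta)*XX, which holds
    because (X(x)X)^2 is the identity.
    """
    c, s = math.cos(theta), -1j * math.sin(theta)
    return [
        [c, 0, 0, s],
        [0, c, s, 0],
        [0, s, c, 0],
        [s, 0, 0, c],
    ]


def apply(matrix, state):
    """Multiply a square matrix by a state vector."""
    return [sum(m * a for m, a in zip(row, state)) for row in matrix]


# Act on |00> with theta = pi/4 (assumed convention for the fully entangling gate).
bell = apply(xx_gate(math.pi / 4), [1, 0, 0, 0])
populations = [abs(a) ** 2 for a in bell]
phase = cmath.phase(bell[3] / bell[0])  # relative phase of the |11> component
print(populations, phase)
```

Under this convention the output has equal weight on $|00\rangle$ and $|11\rangle$ with a relative phase of $-\pi/2$, that is, a Bell state of the form $(|00\rangle + e^{i\phi}|11\rangle)/\sqrt{2}$.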
We perform randomized benchmarking 28 to characterize the single-qubit operations on each ion of the 11-qubit chain. We apply a randomly chosen sequence of $\pi/2$ gates with length $L$ about the X and Y axes. In between each of these $\pi/2$ gates, we either add a $\pi$ rotation about the X, Y, or Z-axis, or an identity operation (leaving the qubit idle for the duration of a gate). A final $\pi/2$ gate is chosen such that the final state is in the Z computational basis (i.e. $|0\rangle$ or $|1\rangle$). We measure the overlap between the measured and expected output states across 500 iterations for at least 24 sequences for each $L \in \{2, 4, 6, 8, 10, 12\}$. The fidelity of our single-qubit $\pi/2$ gate is then determined by fitting the resulting overlap as a function of sequence length to a power law, $B p^{L} + \frac{1}{2}$. Here, the base $p$ is the gate fidelity and the intercept $B + \frac{1}{2}$ is the SPAM fidelity, equivalent to measuring the ion after a single $\pi$ rotation when it is in either state $|0\rangle$ or $|1\rangle$. For a chain of 11 qubits, we measure an average single-qubit fidelity of 99.5% (Fig. 2b) and an average SPAM fidelity of 99.3%.
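The power-law model $B p^{L} + \frac{1}{2}$ can be fitted in a few lines. The sketch below is a minimal illustration on synthetic, noise-free data; it uses a log-linear least-squares fit, which is a simplification chosen here and not necessarily the fitting procedure used for the reported values. The function name and the parameter values are illustrative.

```python
import math


def fit_rb_decay(lengths, survival):
    """Fit survival = B * p**L + 1/2 by least squares on log(survival - 1/2).

    Returns (p, B). Every survival value must exceed 1/2.
    """
    xs = list(lengths)
    ys = [math.log(s - 0.5) for s in survival]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return math.exp(slope), math.exp(intercept)


# Synthetic data generated from illustrative parameters (not measured data).
lengths = [2, 4, 6, 8, 10, 12]
true_p, true_B = 0.995, 0.493
survival = [true_B * true_p**L + 0.5 for L in lengths]

p_fit, B_fit = fit_rb_decay(lengths, survival)
print(f"p = {p_fit:.4f}, SPAM fidelity = {B_fit + 0.5:.4f}")
```

With noisy experimental data, a weighted nonlinear fit of the same model would usually be preferable, since taking the logarithm distorts the error bars.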
To quantify the performance of our two-qubit gates and estimate their fidelity, we measure the state fidelity of the Bell state $\frac{1}{\sqrt{2}}\left(|00\rangle + e^{i\phi}|11\rangle\right)$ prepared using a single XX-gate by performing partial tomography of the state 12,13. The diagonal terms of the two-qubit density matrix are extracted by measuring the populations in the even parity states. The populations are measured when the overall AM pulse height for the XX-gate is tuned to achieve maximal entanglement such that the even-parity two-qubit states, $P_{00}$ and $P_{11}$, are equal ($P_{00} = P_{11}$). The off-diagonal elements are obtained from the amplitude $\Phi$ of a parity oscillation, where the parity is given by $P_{00} + P_{11} - P_{01} - P_{10}$ ($P_{01}$ and $P_{10}$ are the populations of the odd parity two-qubit states). The fidelity can then be calculated as $F = \frac{1}{2}\left(P_{00} + P_{11} + \Phi\right)$ 13. We use maximum-likelihood estimation on experimentally observed data to extract the parameters of the fidelity expression 13. We have performed this analysis for all 55 pairs of qubits in the 11-qubit chain (Fig. 2c) and measure an average fidelity of 97.5% with a minimum and maximum fidelity of $95.1^{+0.5}_{-0.7}\%$ and $98.9^{+0.1}_{-0.3}\%$, respectively. The uncertainty here is determined by a statistical confidence interval on the maximum-likelihood estimation. The reported fidelity represents a lower bound of the Bell state creation as we do not correct for SPAM errors on the two-qubit states or errors in single-qubit rotations used to observe the parity oscillations of the Bell state, which on average are 0.7% and 0.5% respectively.
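The fidelity expression $F = \frac{1}{2}(P_{00} + P_{11} + \Phi)$ is simple enough to evaluate directly. The sketch below plugs in illustrative numbers (not data from this work) and skips the maximum-likelihood step described above; the function names are hypothetical.

```python
def parity(p00: float, p01: float, p10: float, p11: float) -> float:
    """Parity signal P00 + P11 - P01 - P10 of a two-qubit population vector."""
    return p00 + p11 - p01 - p10


def bell_state_fidelity(p00: float, p11: float, parity_contrast: float) -> float:
    """F = (P00 + P11 + Phi) / 2, with Phi the parity-oscillation amplitude."""
    return 0.5 * (p00 + p11 + parity_contrast)


# Illustrative values only.
fidelity = bell_state_fidelity(p00=0.49, p11=0.49, parity_contrast=0.97)
print(f"F = {fidelity:.3f}")
```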
Bernstein-Vazirani (BV). To benchmark our system, we implement two well-known algorithms: the BV and Hidden Shift (HS). Both of these algorithms have previously been run on trapped-ion 3,18,21 and superconducting 4,18,19 systems of up to five qubits. By comparing the results of each algorithm to the ideal result, we obtain a direct measure of the system performance, which accounts for our native gates, connectivity, coherence times, gate duration, and all other isolated metrics of system performance. These results can be used as part of a suite of algorithms to compare our hardware with other systems. The qubit number in these results is higher than any comparable published BV or HS results using a programmable quantum computer 3,4,18,19,21. The BV algorithm is an oracle problem in which the user tries to determine an unknown bit string $c$ of size $N$, implemented by a specific oracle. The algorithm takes a binary input string $x$ and performs a controlled inversion of an ancillary bit or qubit based on the bit-wise product of the input and the unknown bit string $c$ modulo two, $f(x) = c \cdot x \pmod{2}$ 29. For a quantum BV implementation (example shown in Fig. 3a), a single quantum query is sufficient to determine the bit string $c$ 30. This is a linear improvement over the best classical algorithm, which requires $N$ queries. The BV algorithm was developed to help separate a class of problems that can be solved in polynomial time on a quantum computer with bounded errors, bounded-error quantum polynomial (BQP), from its classical counterpart. For an algorithm to belong to BQP, it must succeed with probability at least 2/3 on all possible inputs after only a polynomial number of repetitions. This implies that the single-shot success probability must exceed 1/2 for all inputs, which allows reaching the 2/3 threshold by classical majority vote on multiple repetitions. In this way, the 2/3 success threshold required for BQP may be met with a polynomial number of queries 29.
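To make the single-query property concrete, the following sketch simulates the textbook BV circuit on an ideal, noise-free state vector. It is not the compiled native-gate circuit run on the hardware: the ancilla is folded into a phase oracle $(-1)^{c \cdot x}$, and the function names and the example bit string are chosen here for illustration.

```python
import math


def hadamard_all(state):
    """Apply a Hadamard to every qubit (fast Walsh-Hadamard transform)."""
    state = list(state)
    n = len(state)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = state[i], state[i + h]
                state[i], state[i + h] = a + b, a - b
        h *= 2
    norm = math.sqrt(n)
    return [amp / norm for amp in state]


def bernstein_vazirani(hidden: int, num_bits: int) -> int:
    """Simulate one quantum query and return the most probable output string."""
    dim = 2**num_bits
    state = [1.0] + [0.0] * (dim - 1)  # register starts in |0...0>
    state = hadamard_all(state)  # uniform superposition over all x
    # Phase oracle: an ancilla in |-> turns f(x) = c.x mod 2 into a sign.
    state = [amp * (-1) ** bin(x & hidden).count("1") for x, amp in enumerate(state)]
    state = hadamard_all(state)  # interference concentrates the weight on c
    probabilities = [amp**2 for amp in state]
    return max(range(dim), key=probabilities.__getitem__)


c = 0b1011001110  # example hidden string, chosen arbitrarily
print(format(bernstein_vazirani(c, 10), "010b"))
```

In the ideal case the output equals the hidden string with probability one; on hardware, gate and measurement errors spread weight onto other strings, which is what the measured success probabilities quantify.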
We compile the BV algorithm into our native gate set, comprised of single-qubit rotations and two-qubit XX-gates.
Optimization during compilation reduces the number of needed gates compared to naively translating the textbook circuit from CNOT gates into rotations and XX-gates. The compilation exploits the full connectivity of our qubits, since we do not need SWAP operations. The implementation of BV requires a single-qubit ancilla and a register of $N$ qubits. There are $2^{N}$ possible bit strings; for our 10-qubit register there are therefore 1024 possible oracle implementations. We measured each implementation 500 times, conditioned on the measured ancilla state, and plot the output distribution in Fig. 3b. Each oracle implementation has, depending on the unknown bit string $c$, between 0 and 10 two-qubit gates between the ancilla and the qubit register, corresponding to the number of ones in the binary representation of the unknown bit string. The process matrix that maps the encoded oracle to the measured output state is nearly diagonal, resulting in a highly peaked distribution at the encoded oracle. For our system, the average overlap between output state and unknown bit string is 78% (Fig. 3c), and 87.8% of oracle implementations achieve the 2/3 success criterion defined by BQP. Conditioning the output on the ancilla state results in a 5.1 percentage point increase in the raw success probability, from 73% to 78%, and a 14.5 percentage point increase in the fraction of oracle implementations above the BQP threshold, from 73.3% to 87.8%. The average overlap in Fig. 3c decreases with the number of two-qubit gates needed in the oracle. The off-diagonal components of the process matrix show errors, since these should all have zero population. In Fig. 3b, the dominant error is single-qubit bit-flips from $|1\rangle$ to $|0\rangle$ during the measurement process, which appear as faint diagonals in the lower left quadrant of the figure. However, even for the oracle implementation where we have the lowest success probability, the next-most-probable state is still four times less likely than the correct string.
Fig. 3 [...] c Shows the probability (inset plot) of detecting the encoded hidden bit string for all 1024 oracle implementations, as a function of the number of ones in the binary representation of the unknown bit string, which is equivalent to the number of two-qubit gates ($n$) and is at most 10 for this algorithm. The boxplots highlight the minimum, first quartile, median, third quartile, and maximum of the data. Note that there is only one oracle implementation each for $n = 0$ and $n = 10$, which explains the lower observed variances for these points. In contrast, there are many more oracles that consist of five two-qubit gates, where each included gate has slightly different fidelity. This leads to increased variance across the full set of five two-qubit gate oracle implementations. The shaded area spans the expected fidelity (excluding crosstalk errors) $F_{2Q}^{\,n}\, F_{1Q}^{\,2(n+1)}\, F_{\mathrm{SPAM}}^{\,10}$ (where $F_{2Q}$ is the fidelity of two-qubit gates, $F_{1Q}$ is the fidelity of single-qubit gates, and $F_{\mathrm{SPAM}}$ is the average SPAM fidelity) if all of our gates share the best measured fidelity or, alternatively, all share the worst fidelity. The result of a shared average fidelity is plotted as a dashed line. The average probability of success is 78% with 899 out of the 1024 oracle implementations exceeding the 2/3 BQP single-shot success threshold.
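The expected-fidelity expression quoted in the Fig. 3c caption, $F_{2Q}^{\,n} F_{1Q}^{\,2(n+1)} F_{\mathrm{SPAM}}^{\,10}$, can be evaluated with the average fidelities reported in the text. The sketch below is only an illustration of that product; the function name is hypothetical, and crosstalk is ignored, as in the caption.

```python
def bv_expected_success(n, f_2q, f_1q, f_spam, register_size=10):
    """Evaluate F_2Q^n * F_1Q^(2(n+1)) * F_SPAM^register_size.

    n is the number of two-qubit gates in the oracle (ones in the bit string).
    """
    return f_2q**n * f_1q ** (2 * (n + 1)) * f_spam**register_size


# Average fidelities quoted in the text: 97.5%, 99.5%, and 99.3% (SPAM).
estimates = [
    bv_expected_success(n, f_2q=0.975, f_1q=0.995, f_spam=0.993)
    for n in range(11)
]
for n, est in enumerate(estimates):
    print(n, round(est, 3))
```

Each additional one in the hidden string multiplies the estimate by $F_{2Q} F_{1Q}^{2}$, which is why the expected success falls steadily with $n$.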
Hidden shift. The HS algorithm consists of two N-bit to N-bit function oracles $f$ and $g$, which are the same up to a shift by a hidden bit string $s$, such that $g(x) = f(x + s)$. The goal is to determine the hidden shift $s$ by querying the oracles. In our implementation 31 of the HS algorithm, the oracles are inner product or bent functions $f = \sum_{i} x_{2i-1} x_{2i}$ and $g = f(x + s)$, where $x$ is the input and $x_{i}$ is the $i$-th bit of $x$ (an example is shown in Fig. 4a). Classically it can be shown that determining the shift $s$ requires $\sqrt{2^{N}}$ queries, where $N$ is the length of the bit string $s$. On a quantum computer, in principle, the shift can be read out in a single query 31,32. In contrast to the BV algorithm, the quantum implementation of the HS algorithm shows an exponential reduction in the number of queries to the oracle compared to a classical computer 31.
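A noise-free state-vector sketch shows how the shift emerges from a single query to $g$. It follows the standard textbook construction (Hadamards, phase oracle for $g$, Hadamards, phase oracle for the dual of $f$, Hadamards) and relies on the fact that the inner-product bent function is its own dual; it is not the compiled native-gate circuit. Addition $x + s$ is taken bitwise modulo two (XOR), and all function names are chosen here for illustration.

```python
import math


def hadamard_all(state):
    """Apply a Hadamard to every qubit (fast Walsh-Hadamard transform)."""
    state = list(state)
    n = len(state)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = state[i], state[i + h]
                state[i], state[i + h] = a + b, a - b
        h *= 2
    norm = math.sqrt(n)
    return [amp / norm for amp in state]


def bent(x: int, num_bits: int) -> int:
    """Inner-product bent function: XOR of products of adjacent bit pairs."""
    total = 0
    for i in range(0, num_bits, 2):
        total ^= (x >> i) & (x >> (i + 1)) & 1
    return total


def hidden_shift(shift: int, num_bits: int) -> int:
    """Return the most probable output of the ideal circuit (num_bits even)."""
    dim = 2**num_bits
    state = hadamard_all([1.0] + [0.0] * (dim - 1))
    # Phase oracle for g(x) = f(x XOR s).
    state = [a * (-1) ** bent(x ^ shift, num_bits) for x, a in enumerate(state)]
    state = hadamard_all(state)
    # Phase oracle for the dual of f, which equals f for this bent function.
    state = [a * (-1) ** bent(x, num_bits) for x, a in enumerate(state)]
    state = hadamard_all(state)
    probabilities = [a**2 for a in state]
    return max(range(dim), key=probabilities.__getitem__)


s = 0b1111101010  # the shift shown in the Fig. 4a example
print(format(hidden_shift(s, 10), "010b"))
```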
As with the BV algorithm, we compile the HS algorithm into our native gates. There are $2^{N}$ available oracle implementations corresponding to the $2^{N}$ possible hidden bit strings $s$. We execute all 1024 possible implementations on our 10-qubit register (Fig. 4b).
The correct output state is the state corresponding to the hidden shift. The average overlap between the output state and $s$ was 35% (Fig. 4c), and of the 1024 oracles, 1017 had most likely output states corresponding to the shift. The success probability for HS is lower and more uniform than that of BV because all of the oracles have the same number of two-qubit gates (10) and many more single-qubit gates (25-40). Every oracle implementation in HS has at least as many gates as the most challenging BV oracle implementation and therefore is more difficult. Given our average single-qubit and two-qubit fidelities, we would not expect to surpass the BQP threshold for the HS oracles. However, the successful determination of the shift was achieved much more frequently than if we sampled a classical distribution where the success probability would have been 0.1%.

Discussion
In summary, although several superconducting quantum computing platforms, for instance those of IBM and Rigetti, offer large qubit numbers, we have constructed the most powerful programmable quantum computer to date that has demonstrated algorithms with success rates above the BQP threshold. We have used a trapped ion quantum computer to perform the largest quantum implementations of the BV and HS algorithms. Using a 10-qubit register, we implement all 1024 possible oracles for each algorithm. We exceed the BQP threshold for 87.8% of the oracle implementations in the BV algorithm, an application designed to define this complexity class. Our worst-case oracle implementation, when taking into account detection and preparation error, had a success probability of 50.2%. This implies that it would take fewer than 11,500 repetitions to reach the BQP threshold on our worst-case oracle. We also demonstrate 35% overlap between the measured and expected output states in the implementation of the HS algorithm, which is a more demanding application due to its higher gate count and exponential speedup over its classical analog. The success of both algorithms is a result of high-fidelity native gates and efficient gate compilation and compression in the fully connected ion trap system. The demonstration of these two canonical algorithms is a starting point for benchmarking any quantum computer. Computing real problems on larger systems with more qubits will require more gates of even higher quality, and standard algorithms similar to those demonstrated here will likely play a crucial role in benchmarking future quantum computers.
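The amplification-by-repetition argument behind the BQP threshold can be explored numerically. The sketch below computes the probability that a strict majority of independent shots returns the correct string, given a single-shot success probability. Pooling every incorrect string into one "wrong" outcome is a simplifying assumption of this sketch (a plurality vote over distinct strings would succeed at least as often), and the sketch does not attempt to reproduce the repetition count quoted above. The function name is hypothetical.

```python
import math


def majority_success(p: float, repetitions: int) -> float:
    """Probability that more than half of `repetitions` shots are correct.

    Each shot succeeds independently with probability p (0 < p < 1).
    Terms are evaluated in log space so large repetition counts do not underflow.
    """
    log_p, log_q = math.log(p), math.log(1.0 - p)
    total = 0.0
    for k in range(repetitions // 2 + 1, repetitions + 1):
        log_term = (
            math.lgamma(repetitions + 1)
            - math.lgamma(k + 1)
            - math.lgamma(repetitions - k + 1)
            + k * log_p
            + (repetitions - k) * log_q
        )
        total += math.exp(log_term)
    return total


# Illustrative single-shot probability; more repetitions sharpen the vote.
for reps in (1, 3, 101):
    print(reps, majority_success(0.6, reps))
```

The closer the single-shot probability is to 1/2, the more repetitions are needed before the majority vote clears 2/3, which is why the worst-case oracle dominates the repetition budget.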

Data availability
The data presented in this manuscript are available from the corresponding author upon reasonable request.
Fig. 4 Hidden Shift (HS) algorithm implementation on 10 qubits. a Shows a textbook implementation of the HS algorithm with hidden shift 1111101010. The circuit for each oracle was measured at least 50 times. We trace out the spectator ion and interpret the binary output state of the 10-qubit register as an integer. The full output distribution is shown in b. c Shows the probability of detecting the encoded shift $s$ for each of the 1024 oracle implementations versus the number of single-qubit gates ($m$). The shaded area represents the expected fidelity $F_{2Q}^{\,10}\, F_{1Q}^{\,m}\, F_{\mathrm{SPAM}}^{\,10}$ (where $F_{2Q}$ is the fidelity of two-qubit gates, $F_{1Q}$ is the fidelity of single-qubit gates, and $F_{\mathrm{SPAM}}$ is the average SPAM fidelity) if all of our gates share the best measured fidelity or, alternatively, all share the worst fidelity. Additionally, the success probability is reduced by crosstalk onto adjacent ions from the individually addressing Raman beams. This error impacts the result of the HS oracles more than the BV oracles. The result of a shared average fidelity is plotted as a dashed line. The average probability of success is 35%, and 1017 of the 1024 oracle implementations correctly return the hidden shift as the maximal probability state.