Introduction

Cross-platform quantum state comparisons are critical in the early stages of developing quantum computing systems, as they can expose particular types of hardware-specific errors and inform the fabrication of next-generation devices. There are straightforward methods for comparing generic output from different quantum computers, such as coherently swapping information between them1,2,3,4,5 and full quantum state tomography6. However, these schemes require either establishing a coherent quantum channel between the systems7, which may be impossible with highly disparate hardware types, or transforming quantum states into classical measurements, which requires resources that scale exponentially with system size.

Recently, a new type of cross-platform comparison based on randomized measurements has been proposed8,9. While this approach still scales exponentially with the number of qubits, it has a significantly smaller exponent prefactor than full quantum state tomography10, allowing it to scale to larger quantum computing systems.

Here, we demonstrate a cross-platform comparison based on randomized measurements8,9,11, performed independently at different times and locations on several disparate quantum computers built by different teams using different technologies, and compare the outcomes of four families of quantum circuits.

To quantify the comparison, we use the cross-platform fidelity defined as8,12

$$\mathcal{F}(\rho_1,\,\rho_2)=\frac{\mathrm{tr}[\rho_1\rho_2]}{\sqrt{\mathrm{tr}[\rho_1^2]\,\mathrm{tr}[\rho_2^2]}},$$
(1)

where ρi is the density matrix of the desired N-qubit quantum state produced by system i. To evaluate this fidelity, for each system we first initialize N qubits in the state \(\left|0,\,0,\ldots,0\right\rangle\) and apply the unitary V to nominally prepare the desired quantum state on each platform. To perform randomized measurements, we measure the quantum states in MU different bases. In particular, we sample MU distinct combinations of random single-qubit rotations U = u1 ⊗ u2 ⊗ ⋯ ⊗ uN and append them to the circuit that implements V, as shown in Fig. 1a. Finally, we perform projective measurements in the computational basis. For each rotation setting U, the measurements are repeated MS times ("shots") on each platform. We infer the cross-platform fidelity defined in Eq. (1) from the randomized-measurement results either via the statistical correlations between the randomized measurements8 (Protocol I in Methods) or by constructing an approximate classical representation of the quantum state from the randomized measurements, the so-called classical shadow11,13 (Protocol II in Methods).
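
When the density matrices are known, e.g., in classical simulation, the fidelity of Eq. (1) can be evaluated directly. A minimal numpy sketch (the function and variable names are ours, not from the experiment; the depolarized state is purely illustrative):

```python
import numpy as np

def cross_platform_fidelity(rho1, rho2):
    """Eq. (1): tr[rho1 rho2] / sqrt(tr[rho1^2] tr[rho2^2])."""
    overlap = np.trace(rho1 @ rho2).real
    purities = np.trace(rho1 @ rho1).real * np.trace(rho2 @ rho2).real
    return overlap / np.sqrt(purities)

# Illustration: the ideal |00> state versus a depolarized copy of it.
N = 2
dim = 2 ** N
psi = np.zeros(dim)
psi[0] = 1.0
rho_pure = np.outer(psi, psi)
p = 0.1  # illustrative depolarizing strength
rho_noisy = (1 - p) * rho_pure + p * np.eye(dim) / dim
```

Because the definition normalizes by the purities, a state compared against itself always gives fidelity 1, even if it is mixed.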

Fig. 1: Schematic diagram of the cross-platform comparison.

a Test quantum circuit, represented by the unitary operator V for state preparation, with a random rotation ui appended to each qubit i for measurement in a random basis. b The circuits are transpiled for the different quantum platforms into their corresponding native gates. Each of the MU circuits is repeated MS times on each platform. c The measurement results are sent to a central data repository, where the fidelities defined in Eq. (1) are computed. d As an example, the cross-platform fidelity results for a 5-qubit GHZ state, including a row of comparisons between each of the six hardware systems and theory (labeled "simulation"). Entry i, j corresponds to the cross-platform fidelity between platform-i and platform-j. The cross-platform fidelity is inferred from MU = 100 randomized measurements and MS = 2000 repetitions for each U.

We use four ion-trap platforms, the University of Maryland (UMD) EURIQA system14 (referred to as UMD_1), the University of Maryland TIQC system15 (UMD_2), and two IonQ quantum computers16,17 (IonQ_1, IonQ_2), as well as five separate IBM superconducting quantum computing systems hosted in New York, ibmq_belem (IBM_1), ibmq_casablanca (IBM_2), ibmq_melbourne (IBM_3), ibmq_quito (IBM_4), and ibmq_rome (IBM_5)18. See Supplementary Information Sec. S4 for more details of these systems, which includes refs. 14, 18,19,20,21,22.

We first demonstrate the application of randomized measurements for comparing 5-qubit GHZ (Greenberger-Horne-Zeilinger) states23 generated on different platforms and the ideal 5-qubit GHZ state obtained from classical simulation. Using the same protocol, we also compare states generated with three random circuits of different width and depth, each sharing a similar construction to circuits used in quantum volume (QV) measurements24.

Results

We first measure the cross-platform fidelity to compare 5-qubit GHZ states. Specifically, the circuit that prepares the GHZ state is appended with a total of \(3^5=243\) different sets of single-qubit Clifford gates. These appended circuits complete all the measurements needed for quantum state tomography. Each appended circuit is repeated for MS = 2000 shots. We sample MU = 100 out of the 243 different Us to calculate the cross-platform fidelity defined in Eq. (1) (Fig. 1d). We see that our method has sufficient resolution to reveal performance differences between the platforms. In Supplementary Information Sec. S2, we benchmark our method against full quantum state tomography by computing the fidelity as a function of MU. The comparison shows that the fidelity obtained via randomized measurements rapidly approaches that obtained via full quantum state tomography.

We present cross-platform fidelity results for 7- and 13-qubit QV-like circuits24. QV circuits have been studied extensively, both theoretically and experimentally24,25,26, making them an ideal choice for the cross-platform comparison. Also, quantum volume provides a single-number metric for the overall performance of a quantum computer. However, our randomized-measurement scheme obtains more information about the prepared state; in particular, by using the classical post-processing scheme presented in ref. 11, we can estimate many observables from the randomized-measurement data. An N-qubit QV circuit consists of d = N layers: each layer contains a random permutation of the qubit labels, followed by random two-qubit gates acting on the resulting neighboring pairs of qubits. In our study, we call circuits of this construction, but with an independently chosen circuit depth d, QV-like circuits. Specifically, a QV-like circuit can be written as a unitary operation \(V=\prod_{i=1}^{d}V^{(i)}\), where \(V^{(i)}=V^{i}_{\pi_i(N'-1),\pi_i(N')}\otimes\cdots\otimes V^{i}_{\pi_i(1),\pi_i(2)}\) and \(N'=2\lfloor N/2\rfloor\). Here πi is a random permutation sampled from the permutation group SN. The unitary operation \(V^{i}_{a,b}\) is a random two-qubit gate acting on qubits a and b, sampled from SU(4). The circuit diagram of an example QV-like circuit is shown in Fig. 2a. In this experiment, we infer the fidelity for 7-qubit QV-like states with d = 2 and d = 3 and for a 13-qubit QV-like state with d = 2.
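
As an illustration of the QV-like construction above, the following numpy sketch prepares such a state for small N by sampling Haar-random two-qubit gates (via the standard QR-decomposition construction) and applying them to randomly permuted qubit pairs. This statevector simulation and its function names are ours, not the authors' code:

```python
import numpy as np

def haar_unitary(dim, rng):
    """Haar-random unitary via QR decomposition of a complex Gaussian matrix."""
    z = (rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))  # fix the phases of R's diagonal

def apply_two_qubit(psi, gate, a, b, n):
    """Apply a 4x4 gate to qubits a and b of an n-qubit statevector."""
    psi = np.moveaxis(psi.reshape([2] * n), (a, b), (0, 1)).reshape(4, -1)
    psi = (gate @ psi).reshape([2, 2] + [2] * (n - 2))
    return np.moveaxis(psi, (0, 1), (a, b)).reshape(-1)

def qv_like_state(n, depth, rng):
    """Statevector of a QV-like circuit: each layer permutes the qubit labels
    and applies Haar-random SU(4) gates to the resulting neighboring pairs."""
    psi = np.zeros(2 ** n, dtype=complex)
    psi[0] = 1.0
    for _ in range(depth):
        perm = rng.permutation(n)            # random relabeling pi_i
        for k in range(0, 2 * (n // 2), 2):  # one qubit idles when n is odd
            psi = apply_two_qubit(psi, haar_unitary(4, rng), perm[k], perm[k + 1], n)
    return psi

rng = np.random.default_rng(0)
psi = qv_like_state(7, 2, rng)  # an N = 7, d = 2 QV-like state
```

On hardware, each layer's permutation must be realized within the device connectivity, which is what introduces the SWAP overhead discussed below.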

Fig. 2: The cross-platform fidelity for 7-qubit and 13-qubit QV-like circuits.

a The quantum volume circuit diagram for d = 3. The d = 2 case omits the operations in the dashed rectangle. b–d Cross-platform fidelity between different quantum computers. Entry i, j corresponds to the cross-platform fidelity \(\mathcal{F}(\rho_i,\,\rho_j)\) between platform-i and platform-j as defined in Eq. (1). b N = 7 and d = 2; c N = 7 and d = 3; d N = 13 and d = 2.

Similar to the GHZ case, we first distribute the circuits, synthesize them into device-specific native gates, and allow optimizations and error mitigation that satisfy the aforementioned state-preparation rule.

On each platform, we append the circuit with MU = 500 different Us sampled using the greedy method. Outcomes are measured in the computational basis for MS = 2000 shots. The cross-platform fidelities for d = 2 and d = 3 are shown in Fig. 2b, c. Our results verify that, with only a fraction of the number of measurements required for full quantum state tomography, we can estimate the fidelities with sufficiently high precision to resolve clear differences among the platforms.

We also infer the cross-platform fidelity for a 13-qubit QV-like circuit with d = 2. The results are shown in Fig. 2d. Here we use MU = 1000 and MS = 2000, in contrast with the much larger MU = 3^13 = 1,594,323 needed for full quantum state tomography.

We find several interesting features in the cross-platform fidelities of the 7-qubit QV-like results. First, we observe that the cross-platform fidelity drops significantly when the number of layers increases from d = 2 to d = 3 on the IBM quantum computers. The drop may be due to the restricted nearest-neighbor connectivity of superconducting quantum computers27, which requires additional SWAP-gate overhead to execute the permutation operations. In Supplementary Information Sec. S3, we numerically evaluate the number of entangling gates as a function of the number of layers d for different connectivity graphs. We see that, according to IBM's native compiler Qiskit (see Supplementary Information Sec. S3 and Sec. S6 for measurement error calibration), extra entangling gates are used to perform two-qubit gates between non-nearest-neighbor qubits on superconducting platforms, resulting in extra errors.

As shown in Fig. 2c, the cross-platform fidelity between IBM_2 and IBM_3 is higher than the cross-platform fidelity between either of them and the ion-trap systems (or classical simulation). This motivates us to study whether quantum states generated on different devices tend to resemble each other when the two devices share the same underlying technology. We therefore perform a further analysis of this phenomenon, which we refer to as intra-technology similarity.

We first study the fidelity between subsystems of the 7-qubit QV-like states prepared on different quantum computers for both d = 2 and d = 3. The subsystem fidelity provides a scalable way to estimate an upper bound on the full-system fidelity, since the cost of measuring all possible subsystem fidelities of a fixed subsystem size scales polynomially with the full system size. For a given subsystem, we use the same data collected for the full system, but trace out the qubits outside the subsystem of interest. The results are presented in Fig. 3a. We observe that, for any given subsystem size, the cross-platform fidelity between platforms based on the same technology is higher.
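
Tracing out qubits and evaluating Eq. (1) on the reduced states can be sketched in numpy as follows (our own minimal implementation, checked here on a 3-qubit GHZ state, whose single-qubit marginals are maximally mixed):

```python
import numpy as np

def partial_trace(rho, keep, n):
    """Reduced density matrix of an n-qubit rho on the qubits listed in `keep`."""
    rho = rho.reshape([2] * (2 * n))
    for q in sorted(set(range(n)) - set(keep), reverse=True):
        m = rho.ndim // 2
        rho = np.trace(rho, axis1=q, axis2=m + q)  # trace out qubit q
    d = 2 ** len(keep)
    return rho.reshape(d, d)

def subsystem_fidelity(rho1, rho2, keep, n):
    """Eq. (1) evaluated on the reduced states of the chosen subsystem."""
    r1, r2 = partial_trace(rho1, keep, n), partial_trace(rho2, keep, n)
    return np.trace(r1 @ r2).real / np.sqrt(
        np.trace(r1 @ r1).real * np.trace(r2 @ r2).real)

# Illustration with a 3-qubit GHZ state: every single-qubit marginal is I/2.
psi = np.zeros(8)
psi[0] = psi[7] = 1 / np.sqrt(2)
rho_ghz = np.outer(psi, psi)
reduced = partial_trace(rho_ghz, [0], 3)
```

In the experiment the reduced states are not reconstructed explicitly; the same randomized-measurement data are post-processed with the qubits outside the subsystem discarded, which is what makes the approach scalable.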

Fig. 3: The cross-platform fidelity for subsystems and intra-technology similarity.

a The cross-platform fidelity between subsystems prepared on different quantum computers. Left: 7-qubit quantum volume circuit of two layers. Right: 7-qubit quantum volume circuit of three layers. The mean for each subsystem size is calculated via bootstrap resampling. b The projection of the randomized-measurement dataset onto the first two principal axes, PC1 and PC2. Triangle markers denote the 7-qubit quantum volume state with d = 2; circle markers the state with d = 3. Magenta, orange, and violet correspond to simulation, trapped-ion, and IBM systems, respectively.

To further characterize the intra-technology similarity, we perform principal component analysis28 (PCA) on the randomized-measurement data for the 7-qubit quantum volume states with d = 2 and d = 3 from all the platforms. PCA is commonly used to reduce the dimensionality of a dataset and has been applied extensively in signal processing, for example in face recognition and audio compression. When implementing PCA, we project the dataset onto the first few principal components to obtain lower-dimensional data while preserving as much of the variation as possible.

To prepare the data for PCA, we randomly sample 1000 shots from the randomized-measurement data out of MU × MS = 1,000,000 for each platform. We identify the set of Pauli strings whose expectation values can be evaluated using the sample. We then evaluate the expectation values of these identified Pauli strings by averaging over the samples, and repeat the sampling Nsample = 500 times without replacement to obtain Nsample data points in the 4^N-dimensional feature space. The feature vectors represent averaged classical shadows of the quantum states generated by the quantum computers11,29. We rotate the feature space to find the first two principal axes, the directions of the two largest variances in the dataset. Figure 3b shows the projection of the Nsample data points onto the first two principal axes. We observe that the first principal component separates the two quantum volume states, while the second principal component distinguishes the technology that generated the states. The clustering of data from the same technology indicates that each technology may share noise characteristics that can be distinguished through the cross-platform fidelity and machine-learning techniques.
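
The projection onto the first two principal axes can be sketched with plain numpy via the SVD of the mean-centered data matrix; the synthetic clusters below merely stand in for the shadow-based feature vectors (all names and data are ours):

```python
import numpy as np

def project_onto_two_pcs(X):
    """Project rows of X (samples x features) onto the first two principal axes,
    obtained from the SVD of the mean-centered data matrix."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T  # rows of Vt are the principal axes

rng = np.random.default_rng(1)
# Two synthetic clusters standing in for feature vectors from two technologies.
A = rng.normal(1.0, 0.1, size=(50, 20))
B = rng.normal(-1.0, 0.1, size=(50, 20))
proj = project_onto_two_pcs(np.vstack([A, B]))
# The first principal component separates the two clusters.
```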

Discussion

In this manuscript, we experimentally performed cross-platform comparisons of four quantum states, characterizing the quantum states generated on different quantum computers with significantly fewer measurements than required by full quantum state tomography. To expand our understanding of the intra-technology similarity, more quantum states should be studied, in particular states designed to probe the effect of different settings on the cross-platform comparison results. Our method could be extended to additional technological platforms such as Rydberg atoms and photonic quantum computers30. With the large volume of quantum data generated by the randomized-measurement protocol, we have only begun to explore the possibilities that machine-learning techniques can offer. We envision that extensions of our method will be indispensable for quantitatively comparing near-term quantum computers, especially across different qubit technologies.

Methods

Inference of cross-platform fidelity

Here we briefly introduce the two protocols used for inferring cross-platform fidelity (Eq. 1) from randomized measurements. In Protocol I, we calculate the second-order cross-correlations8 between the outcomes of the two platforms i and j via the relation

$$\mathrm{Tr}[\rho_i\rho_j]=2^N\sum_{s,s^{\prime}}(-2)^{-D[s,s^{\prime}]}\;\overline{P_U^{(i)}(s)\,P_U^{(j)}(s^{\prime})},$$
(2)

where i, j ∈ {1, 2}, s = s1, s2, …, sN is the bit string of the binary measurement outcomes sk of the kth qubit, \(D[s,s^{\prime}]\) is the Hamming distance between s and \(s^{\prime}\), \(P_U^{(i)}(s)=\mathrm{Tr}[U\rho_i U^{\dagger}|s\rangle\langle s|]\), and the overline denotes the average over random unitaries U.
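
A direct implementation of the estimator in Eq. (2), assuming the per-unitary outcome probabilities have already been estimated, can be sketched as follows. The single-qubit check at the end verifies that the estimator recovers the purity of a pure state when averaged over Haar-random rotations; all names are ours, and exact probabilities are used for simplicity:

```python
import numpy as np

def overlap_protocol_one(P1, P2):
    """Eq. (2): P1, P2 are (M_U, 2**N) arrays of computational-basis probabilities
    measured on the two platforms under the same random unitaries U."""
    dim = P1.shape[1]
    idx = np.arange(dim)
    # Hamming distances D[s, s'] between all pairs of bitstrings.
    D = np.array([[bin(a ^ b).count("1") for b in idx] for a in idx])
    W = (-2.0) ** (-D)
    return dim * np.mean([p1 @ W @ p2 for p1, p2 in zip(P1, P2)])

def haar2(rng):
    """Haar-random 2x2 unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

# Single-qubit check: for rho = |0><0| and exact probabilities P_U(s) = |<s|u|0>|^2,
# the estimator converges to tr[rho^2] = 1 as the number of unitaries grows.
rng = np.random.default_rng(0)
M_U = 20000
probs = np.array([np.abs(haar2(rng)[:, 0]) ** 2 for _ in range(M_U)])
est = overlap_protocol_one(probs, probs)
```

With finite shots per U, the raw product of empirical probabilities is biased for i = j, which is why the unbiased purity estimators mentioned below are needed.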

For Protocol II, we reconstruct the classical shadow of the quantum state for each measurement shot as \(\hat{\rho}=\bigotimes_{k=1}^{N}(3u_k^{\dagger}|s_k\rangle\langle s_k|u_k-I)\), where I is the 2 × 2 identity matrix11,13. The overlap can be calculated as11

$$\mathrm{Tr}[\rho_i\rho_j]=\overline{\mathrm{Tr}[\hat{\rho}_i\hat{\rho}_j]},$$
(3)

where i, j ∈ {1, 2} and the overline denotes the average over all the experimental realizations. We note that, for both protocols, unbiased estimators are necessary when calculating the purities (i = j)8,11 using Eqs. (2) and (3).
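
The classical-shadow construction and the single-shot overlap of Eq. (3) can be sketched in numpy; because the trace of a tensor product factorizes, the N-qubit overlap is a product of single-qubit traces. The code and its check (unbiasedness of the single-qubit shadow for a pure state) are our own illustration:

```python
import numpy as np

def shadow_factor(u, s):
    """Single-qubit factor of the classical shadow: 3 u^dag |s><s| u - I."""
    ket = u.conj().T[:, s]  # u^dag |s>
    return 3 * np.outer(ket, ket.conj()) - np.eye(2)

def shot_overlap(us_i, ss_i, us_j, ss_j):
    """Tr[rho_hat_i rho_hat_j] for one shot per platform; the trace of a tensor
    product factorizes into a product of single-qubit traces."""
    val = 1.0
    for u1, s1, u2, s2 in zip(us_i, ss_i, us_j, ss_j):
        val *= np.trace(shadow_factor(u1, s1) @ shadow_factor(u2, s2)).real
    return val

def haar2(rng):
    """Haar-random 2x2 unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

# Unbiasedness check for rho = |0><0|: the shadow averaged over Haar-random u
# and Born-rule outcomes s converges to rho itself.
rng = np.random.default_rng(2)
shots = 20000
avg = np.zeros((2, 2), dtype=complex)
for _ in range(shots):
    u = haar2(rng)
    s = int(rng.random() < np.abs(u[1, 0]) ** 2)  # P(s=1) = |<1|u|0>|^2
    avg += shadow_factor(u, s) / shots
```

Averaging `shot_overlap` over all pairs of experimental realizations yields the overlap of Eq. (3).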

While the fidelities inferred from the two protocols are identical in the asymptotic limit M = MS × MU → ∞, the fidelity error of Protocol II converges faster in the number of random unitaries11. We therefore implement Protocol II for the 5- and 7-qubit experiments. However, this protocol is more costly in post-processing, so for the 13-qubit experiment we post-process the results with Protocol I.

We explore two different schemes for sampling the single-qubit unitary rotations U: a random method and a greedy method. In the regime MS ≫ 2^N, we observe that the greedy method outperforms the random method (see Supplementary Information Sec. S1, which includes refs. 8, 11, 31). Therefore, for N = 5, 7, we sample the single-qubit unitary operations with the greedy method. For N = 13, we use the random method, because the total number of measurements needed to satisfy MS ≫ 2^N becomes too large. The specified target states and rotations are sent to each platform as shown in Fig. 1b, c. The circuit that implements the specified unitary UV is synthesized and optimized for each platform in terms of its native gates.

When preparing a quantum state on a quantum system, one can apply various error-mitigation and circuit-optimization techniques. While these techniques can greatly simplify the circuit and reduce the noise in the measurement outcomes, they can make the definition of state preparation ambiguous. For example, when we prepare a GHZ state and perform the projective measurement in the computational basis, we can defer the CNOT gates immediately preceding the measurement to post-processing instead of physically applying them. Although one can still obtain the same expectation value for any observable with such a circuit optimization, the GHZ state is never actually prepared on the quantum computer. To standardize the comparison, in this study we permit arbitrary error-mitigation and circuit-optimization techniques provided that the target state \(|\psi_{\mathrm{target}}\rangle=V|0\rangle\) is prepared at the end of the state-preparation stage.

After performing the experiments, the results are sent to a data repository. Finally, we process the results and calculate the cross-platform fidelities. The statistical uncertainty of the measured fidelity is inferred directly from the measurement results via a bootstrap resampling technique32. The bootstrap resampling allows us to evaluate the statistical fluctuation of the measurements together with the system performance fluctuation within the duration of the data taking, which is typically two to three days.
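
The uncertainty estimate can be sketched with a textbook bootstrap over per-unitary estimates (a simplification of the full procedure; the names and the synthetic data are ours):

```python
import numpy as np

def bootstrap_std(per_unitary_estimates, n_boot=1000, seed=0):
    """Bootstrap standard error of a mean built from per-unitary estimates:
    resample the M_U values with replacement and recompute the mean."""
    rng = np.random.default_rng(seed)
    x = np.asarray(per_unitary_estimates)
    means = np.array([x[rng.integers(0, x.size, x.size)].mean()
                      for _ in range(n_boot)])
    return means.std(ddof=1)

# Illustration: M_U = 100 noisy per-unitary overlap estimates around 0.9.
rng = np.random.default_rng(3)
vals = 0.9 + 0.05 * rng.standard_normal(100)
err = bootstrap_std(vals)  # close to 0.05 / sqrt(100) = 0.005
```

Because the resampling acts on the measured data directly, the resulting error bars fold in both shot noise and any drift of the hardware over the days of data taking.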