Cross-platform comparison of arbitrary quantum states

As we approach the era of quantum advantage, when quantum computers (QCs) can outperform any classical computer on particular tasks, there remains the difficult challenge of how to validate their performance. While algorithmic success can be easily verified in some instances, such as number factoring or oracular algorithms, these approaches only provide pass/fail information about the execution of specific tasks on a single QC. On the other hand, a comparison between different QCs executing nominally the same arbitrary circuit provides an insight for generic validation: a quantum computation is only as valid as the agreement between the results produced on different QCs. Such an approach is also at the heart of evaluating metrological standards such as disparate atomic clocks. In this paper, we report a cross-platform QC comparison using randomized and correlated measurements that yields a wealth of information about the QC systems. We execute several quantum circuits on widely different physical QC platforms and analyze the cross-platform state fidelities.

Cross-platform quantum circuit comparisons are critical in the early stages of developing QC systems, as they may expose particular types of hardware-specific errors and also inform the fabrication of next-generation devices. There are straightforward methods for comparing generic output from different quantum computers, such as coherently swapping information between them 5 and full quantum state tomography 6. However, these schemes require either establishing a coherent quantum channel between the systems 7, which may be impossible with highly disparate hardware types, or transforming quantum states to classical measurements, requiring resources that scale exponentially with system size.
Recently, a new type of cross-platform comparison based on randomized measurements has been proposed 8,9. While this approach still scales exponentially with the number of qubits, it has a significantly smaller prefactor in the exponential scaling compared with full quantum state tomography, allowing it to reach larger quantum computer systems.
Here, we demonstrate a cross-platform comparison based on randomized measurements 8-10, obtained independently at different times and locations on several disparate quantum computers built by different teams using different technologies, comparing the outcomes of four families of quantum circuits. We use four ion-trap platforms, two at the University of Maryland (UMD 1 and UMD 2) and two IonQ systems (IonQ 1 and IonQ 2), as well as five separate IBM superconducting quantum computing systems hosted in New York: ibmq belem (IBM 1), ibmq casablanca (IBM 2), ibmq melbourne (IBM 3), ibmq quito (IBM 4), and ibmq rome (IBM 5) 15. See Supplementary Information Sec. S4 for more details of these systems.
We first demonstrate the application of randomized measurements for comparing 5-qubit GHZ (Greenberger-Horne-Zeilinger) states 16 generated on different platforms and the ideal 5-qubit GHZ state obtained from classical simulation. Using the same protocol, we also compare states generated with three random circuits of different width and depth, each sharing a similar construction to circuits used in quantum volume (QV) measurements 17.
The cross-platform fidelity that we use is defined as 8,18

$$\mathcal{F}(\rho_i, \rho_j) = \frac{\mathrm{Tr}(\rho_i \rho_j)}{\max\{\mathrm{Tr}(\rho_i^2),\, \mathrm{Tr}(\rho_j^2)\}}, \qquad (1)$$

where ρ_i is the density matrix of the desired quantum state produced by system i. To evaluate this fidelity, for each system, we first initialize N qubits in the state |0, 0, ..., 0⟩ and apply the unitary V to nominally prepare the desired quantum state on each platform.
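When density matrices are available, for instance from classical simulation, Eq. (1) can be evaluated directly. A minimal numerical sketch (function and variable names are ours, not from the paper):

```python
import numpy as np

def cross_platform_fidelity(rho_i, rho_j):
    """F(rho_i, rho_j) = Tr(rho_i rho_j) / max{Tr(rho_i^2), Tr(rho_j^2)}."""
    overlap = np.trace(rho_i @ rho_j).real
    purities = (np.trace(rho_i @ rho_i).real,
                np.trace(rho_j @ rho_j).real)
    return overlap / max(purities)
```

For two identical pure states the fidelity is 1; a pure state compared against the maximally mixed state gives the mixed state's purity over 1.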
In order to measure the quantum states in M_U different bases, we sample M_U distinct combinations of random single-qubit rotations U = u_1 ⊗ u_2 ⊗ ⋯ ⊗ u_N and append them to the circuit that implements V, as shown in Fig. 1a. Finally, we perform projective measurements in the computational basis. For each rotation setting U, the measurements are repeated M_S times ("shots") on each platform.
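The measurement step above can be sketched with a statevector simulation: sample random single-qubit rotations, tensor them into U, and draw computational-basis shots. This is our own illustrative sketch (Haar-random rotations as a stand-in for whatever single-qubit ensemble a platform uses):

```python
import numpy as np

def haar_1q(rng):
    """Haar-random single-qubit unitary via QR of a complex Ginibre
    matrix, with the phases of R fixed so Q is Haar-distributed."""
    z = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

def randomized_measurement(psi, n_qubits, m_shots, rng):
    """Apply U = u_1 ⊗ ... ⊗ u_N to |psi> and sample M_S
    computational-basis outcomes for this rotation setting U."""
    us = [haar_1q(rng) for _ in range(n_qubits)]
    full_u = us[0]
    for u in us[1:]:
        full_u = np.kron(full_u, u)
    probs = np.abs(full_u @ psi) ** 2
    probs /= probs.sum()                      # guard against rounding
    shots = rng.choice(2 ** n_qubits, size=m_shots, p=probs)
    return us, shots
```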
The fidelity can be inferred from the randomized measurement results via either the statistical correlations between the randomized measurements 8 (Protocol I) or the construction of an approximate classical representation of the quantum state from the randomized measurements, the so-called classical shadow 10,19 (Protocol II). In Protocol I, we calculate the second-order cross-correlations 8 between the outcomes of the two platforms i and j via the relation

$$\mathrm{Tr}(\rho_i \rho_j) = 2^N \sum_{s,s'} (-2)^{-D[s,s']}\, \overline{P_U^{(i)}(s)\, P_U^{(j)}(s')}, \qquad (2)$$

where i, j ∈ {1, 2}, s = s_1, s_2, ..., s_N is the bit string of the binary measurement outcomes s_k of the kth qubit, D[s, s'] is the Hamming distance between s and s', and P_U^{(i)}(s) is the probability of observing the bit string s on platform i after applying the rotation U. In Protocol II, each measurement outcome is converted into a classical-shadow snapshot $\hat{\rho}_U^{(i)} = \bigotimes_{k=1}^{N} \left(3\, u_k^\dagger |s_k\rangle\langle s_k| u_k - \mathbb{1}\right)$, and the overlap can be calculated as 10

$$\mathrm{Tr}(\rho_i \rho_j) = \overline{\mathrm{Tr}\!\left(\hat{\rho}_U^{(i)}\, \hat{\rho}_{U'}^{(j)}\right)}, \qquad (3)$$

where i, j ∈ {1, 2} and the overline denotes the average over all the experimental realizations.
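The Protocol I estimator can be sketched numerically, assuming the per-unitary outcome probabilities P_U(s) have already been estimated from the shots (all names here are ours):

```python
import numpy as np
from itertools import product

def hamming(s, t):
    """Hamming distance between two bit strings given as tuples."""
    return sum(a != b for a, b in zip(s, t))

def cross_overlap(probs_i, probs_j, n_qubits):
    """Estimate Tr(rho_i rho_j) from randomized-measurement data:
    2^N * mean_U sum_{s,s'} (-2)^{-D[s,s']} P_U^i(s) P_U^j(s').

    probs_i, probs_j: arrays of shape (M_U, 2**n_qubits); row U holds
    the estimated outcome probabilities for rotation setting U."""
    dim = 2 ** n_qubits
    strings = list(product((0, 1), repeat=n_qubits))
    # (-2)^{-Hamming distance} kernel over all pairs of bit strings
    kernel = np.array([[(-2.0) ** (-hamming(s, t)) for t in strings]
                       for s in strings])
    per_unitary = np.einsum('us,st,ut->u', probs_i, kernel, probs_j)
    return dim * per_unitary.mean()
```

As a sanity check, feeding in uniform probabilities (the maximally mixed state) returns its purity 2^(-N).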
We note that, for both protocols, unbiased estimators are necessary when calculating the purities (the i = j case) 8,10 using Eqs. (2) and (3).
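For Protocol II, the single-shot shadow snapshot ⊗_k(3 u_k†|s_k⟩⟨s_k|u_k − 1) can be built as a dense matrix. A sketch under the standard local-measurement shadow construction (function name ours):

```python
import numpy as np

def shadow_snapshot(us, bits):
    """Single-shot classical shadow for local randomized measurements:
    ⊗_k (3 u_k† |s_k><s_k| u_k − I), one factor per qubit."""
    rho = np.array([[1.0 + 0.0j]])
    for u, s in zip(us, bits):
        ket = u.conj().T[:, [s]]                  # column vector u† |s>
        rho = np.kron(rho, 3 * ket @ ket.conj().T - np.eye(2))
    return rho
```

Each snapshot has unit trace; averaging many snapshots approximates the measured state, and cross-traces of snapshots from two platforms give the overlap in Eq. (3).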
While the fidelity inferred from the two protocols is identical in the asymptotic limit with M = M_S × M_U → ∞, the fidelity error inferred from Protocol II converges faster in the number of random unitaries 10. Therefore, we implement Protocol II for the 5- and 7-qubit experiments. However, this protocol is more costly for post-processing. Therefore, for the 13-qubit experiment, we post-process the result with Protocol I.
We explore two different schemes for sampling the single-qubit unitary rotations U, a random method and a greedy method. In the regime M_S ≫ 2^N, we observe that the greedy method outperforms the random method (see Supplementary Information Sec. S1).
Therefore, for N = 5, 7, we sample the single-qubit unitary operations with the greedy method. For N = 13, we use the random method, because to satisfy M_S ≫ 2^N the total number of measurements becomes too large. The specified target states and rotations are sent to each platform as shown in Fig. 1b,c. The circuit that implements the specified unitary UV is synthesized and optimized for each platform in terms of its native gates.
When preparing a quantum state on a quantum system, one can perform various error-mitigation and circuit optimization techniques. While these techniques can greatly simplify the circuit and reduce the noise of the measurement outcomes, they can make the definition of state preparation ambiguous. For example, when we prepare a GHZ state and perform the projective measurement in the computational basis, we can defer the CNOT gates placed right before the measurement to the post-processing, instead of physically applying them. Although one can still obtain the same expectation value for any observable using such a circuit optimization technique, the GHZ state is not actually prepared in the quantum computer. In order to standardize the comparison, in this study, we require that one can perform arbitrary error-mitigation and circuit optimization techniques provided that the target state |ψ_target⟩ = V|0⟩ is prepared at the end of the state-preparation stage.
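The deferred-CNOT trick described above amounts to classical post-processing of the measured bit strings: a CNOT commuted past a computational-basis measurement becomes an XOR on the outcome bits. A hypothetical helper illustrating why no entangling gate need be physically applied:

```python
def defer_cnot(bits, control, target):
    """Classically apply a CNOT that was deferred past a
    computational-basis measurement: XOR the control bit into
    the target bit of the measured outcome."""
    out = list(bits)
    out[target] ^= out[control]
    return tuple(out)
```

The post-processed outcomes reproduce the same statistics as physically applying the gate before measurement, which is exactly why this optimization would violate the state-preparation rule adopted in the paper.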
After performing the experiments, the results are sent to a data repository. Finally, we process the results and calculate the cross-platform fidelities. The statistical uncertainty of the measured fidelity is inferred directly from the measurement results via a bootstrap resampling technique 20. The bootstrap resampling allows us to evaluate the statistical fluctuation of the measurements as well as the fluctuation of system performance within the duration of the data taking, which is typically two to three days. However, we note that it does not capture system performance variations on longer time scales.
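A generic bootstrap error bar of the kind described above can be sketched as follows (a standard resampling recipe, not the authors' code; names are ours):

```python
import numpy as np

def bootstrap_error(records, estimator, n_boot=200, seed=0):
    """Bootstrap standard error of `estimator` over measurement
    records: resample the records with replacement n_boot times
    and take the standard deviation of the resulting estimates."""
    rng = np.random.default_rng(seed)
    n = len(records)
    estimates = [estimator(records[rng.integers(0, n, size=n)])
                 for _ in range(n_boot)]
    return float(np.std(estimates, ddof=1))
```

Here `estimator` would map a resampled set of measurement records to a fidelity estimate; the spread of those estimates is the quoted uncertainty.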
We first measure the cross-platform fidelity to compare 5-qubit GHZ states. Specifically, the circuit that prepares the GHZ state is appended with a total of 243 different sets of single-qubit Clifford gates. Each appended circuit is repeated for M_S = 2000 shots.
We sample M_U = 100 out of the 243 different U's to calculate the cross-platform fidelity defined in Eq. (1) (Fig. 1d). We see that our method has sufficient resolution to reveal the performance differences between platforms. In Supplementary Information Sec. S2, we benchmark our method against full quantum state tomography by computing the fidelity as a function of M_U. The comparison shows that the fidelity obtained via randomized measurements rapidly approaches that obtained via full quantum state tomography.
We present cross-platform fidelity results for 7- and 13-qubit QV circuits 17. QV circuits have been studied extensively, both theoretically and experimentally 17,21,22, making them an ideal choice for the cross-platform comparison. An N-qubit QV circuit consists of d layers: each layer contains a random permutation of the qubit labels, followed by random two-qubit gates among every other neighboring pair of qubits. Specifically, a QV circuit can be written as a unitary operation

$$V = \prod_{i=1}^{d} V^{(i)}, \qquad V^{(i)} = V^{(i)}_{\pi_i(1),\pi_i(2)} \otimes \cdots \otimes V^{(i)}_{\pi_i(N'-1),\pi_i(N')},$$

where N' = 2⌊N/2⌋. The operation π_i is a random permutation sampled from the permutation group S_N. The unitary operation V^{(i)}_{a,b} is a random two-qubit gate acting on qubits a and b and sampled from SU(4). The circuit diagram of an example QV circuit is shown in Fig. 2a. In this experiment, we infer the fidelity for 7-qubit QV states with d = 2 and d = 3 and a 13-qubit QV state with d = 2.
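One QV layer (random qubit relabeling followed by Haar-random SU(4) gates on neighboring pairs) can be sketched as a dense matrix; a real implementation would use a circuit framework, and the helper names below are ours:

```python
import numpy as np

def haar_unitary(dim, rng):
    """Haar-random unitary via QR of a complex Ginibre matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

def qv_layer(n_qubits, rng):
    """One quantum-volume layer as a dense 2^N x 2^N unitary: a random
    permutation of the qubit labels, then random two-qubit gates on
    neighboring pairs of the permuted register."""
    dim = 2 ** n_qubits
    perm = rng.permutation(n_qubits)
    # permutation operator on computational-basis states
    p_op = np.zeros((dim, dim))
    for s in range(dim):
        bits = [(s >> k) & 1 for k in range(n_qubits)]
        t = sum(bits[perm[k]] << k for k in range(n_qubits))
        p_op[t, s] = 1.0
    gates = np.eye(1)
    for _ in range(n_qubits // 2):
        gates = np.kron(gates, haar_unitary(4, rng))
    if n_qubits % 2:                      # one idle qubit when N is odd
        gates = np.kron(gates, np.eye(2))
    return gates @ p_op
```

A full QV circuit is then the product of d such layers, each with fresh randomness.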
Similar to the GHZ case, we first distribute the circuits, synthesize them into device-specific native gates, and allow optimizations/error-mitigation that satisfy the aforementioned state-preparation rule.
On each platform, we append the circuit with M_U = 500 different U's sampled using the greedy method. Outcomes are measured in the computational basis for M_S = 2000 shots.
The cross-platform fidelities for d = 2 and d = 3 are shown in Figs. 2c,d. Our results verify that, with only a fraction of the number of measurements required to perform full quantum state tomography, we can estimate the fidelities with sufficiently high precision to see clear differences among them.
We also infer the cross-platform fidelity for a 13-qubit QV circuit with d = 2. The cross-platform fidelity between IBM 2 and IBM 3 is higher than the cross-platform fidelity between either of them and the ion-trap systems (and classical simulation), as shown in Fig. 2c. This motivates us to study whether quantum states generated from different devices tend to be similar to each other if the underlying technology of the two devices is the same. Therefore, we perform a further analysis to investigate this phenomenon, which we refer to as intra-technology similarity.
We first study the fidelity between subsystems of the 7-qubit QV states prepared on different quantum computers for both d = 2 and d = 3. The subsystem fidelity provides a scalable way to estimate an upper bound for the full-system fidelity, since the cost of measuring all possible subsystem fidelities of a fixed subsystem size scales polynomially with the full system size. For a given subsystem, we use the same data collected for the full system, but trace out the qubits not within the subsystem of interest. The results are presented in Fig. 3a. We observe that, for a given subsystem size, the cross-platform fidelity between platforms based on the same technology is higher.
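Numerically, the tracing-out step can be sketched for a simulated density matrix as follows (a generic partial-trace routine; the function name is ours):

```python
import numpy as np

def partial_trace(rho, n_qubits, keep):
    """Reduced density matrix on the qubits listed in `keep`,
    tracing out all other qubits of an n-qubit density matrix."""
    traced = [q for q in range(n_qubits) if q not in keep]
    t = rho.reshape([2] * (2 * n_qubits))
    # trace highest-index qubits first so remaining axes keep their
    # positions; axis q pairs with its column partner q + ndim/2
    for q in sorted(traced, reverse=True):
        t = np.trace(t, axis1=q, axis2=q + t.ndim // 2)
    dim = 2 ** len(keep)
    return t.reshape(dim, dim)
```

For example, tracing one qubit out of a two-qubit Bell state leaves the maximally mixed single-qubit state.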
To further characterize the intra-technology similarity, we perform principal component analysis 24 (PCA) on the randomized measurement data for the 7-qubit quantum volume states with d = 2 and d = 3 from all the platforms. PCA is commonly used to reduce the dimensionality of a dataset. It has been applied extensively in signal processing, such as human face recognition and audio compression. When implementing PCA, we project the dataset onto the first few principal components to obtain lower-dimensional data while preserving as much of the variation as possible.
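The projection step can be sketched with an SVD; this is a generic PCA implementation under our own naming, not the authors' code:

```python
import numpy as np

def project_onto_pcs(features, n_components=2):
    """Center the feature vectors (rows of `features`) and project
    them onto the leading principal axes found via an SVD."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T
```

Because singular values are returned in descending order, the first projected coordinate always carries at least as much variance as the second.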
To prepare the data for PCA, we randomly sample 1000 shots from the randomized measurement data out of M_U × M_S = 1,000,000 for each platform. We identify the set of Pauli strings whose expectation values can be evaluated using the sample. We then evaluate the expectation values of these identified Pauli strings by taking the average over the samples, and repeat the sampling N_sample = 500 times without replacement to make N_sample data points in the 4^N-dimensional feature space. The feature vectors represent averaged classical shadows of the quantum states generated from the quantum computers 10,25. We perform a rotation on the feature space and find the first two principal axes, which are the axes that show the two most significant variances of the dataset.

In this manuscript, we experimentally performed the cross-platform comparison of four quantum states, allowing the characterization of the quantum states generated from different quantum computers with significantly fewer measurements than those required by full quantum state tomography. To expand our understanding of the intra-technology similarity, more quantum states should be studied. Our method could be extended to additional technological platforms such as Rydberg atoms and photonic quantum computers. With the large volume of quantum data generated from the randomized measurement protocol, we have only begun to explore the possibilities that machine learning techniques can offer.
We envision that extensions of our method will be indispensable in quantitatively comparing near-term quantum computers, especially across different qubit technologies.
S1. GREEDY METHOD FOR SAMPLING RANDOM UNITARIES

To sample a set of unitary operators U, we generate a sequence of unitary operators while maximizing the distance between the random unitaries. Specifically, we define a distance d(u, v) between two unitary operators, and we generate the M_U unitary operators {u_i}, where 1 ≤ i ≤ M_U, sequentially. For i = 1, we sample a unitary operator randomly from V. For i > 1, we search for a unitary operator u_i that minimizes the cost function C(u_i; u_1, ..., u_{i−1}) = −Σ_{j=1}^{i−1} d(u_i, u_j). In order to minimize the cost function efficiently, we randomly generate N_sample distinct unitary operators u_{i,x}, where 1 ≤ x ≤ N_sample, and define u_i = argmin_{u_{i,x}} C(u_{i,x}; u_1, ..., u_{i−1}). In practice, we find that N_sample = 200 is enough to find the minimum for N = 7 and V = Cl(2)^⊗N, where Cl(2) is the single-qubit Clifford group. The greedy method is summarized in Algorithm 1.

We compare the two different methods of sampling the random unitaries U: random sampling and the greedy method. Using these two methods, we evaluate the fidelity between the state prepared on the UMD 1 system and that prepared on the IBM 1 system, by sampling subsets of various sizes M_U from the full state tomography measurements. Fig. S1 shows the error of the fidelity estimate between UMD 1 and IBM 1 as a function of M_U for M_S = 2000. We see that the greedy method outperforms the random method in this regime.
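The greedy selection loop can be sketched as follows. The specific distance metric did not survive extraction, so a Frobenius-norm distance is used here purely as a stand-in assumption, and all names are ours:

```python
import numpy as np

def greedy_unitaries(draw, m_u, n_sample, rng, dist=None):
    """Greedy sampling: pick each new unitary from n_sample random
    candidates so as to minimize C(u; chosen) = -sum of distances to
    the already chosen unitaries. `dist` defaults to the Frobenius
    norm (an assumption; the paper's metric is not reproduced here)."""
    if dist is None:
        dist = lambda u, v: np.linalg.norm(u - v)
    chosen = [draw(rng)]                  # i = 1: a fully random draw
    for _ in range(m_u - 1):
        cands = [draw(rng) for _ in range(n_sample)]
        # minimizing C is equivalent to maximizing the total distance
        chosen.append(max(cands,
                          key=lambda u: sum(dist(u, w) for w in chosen)))
    return chosen
```

Here `draw(rng)` would sample one random unitary from the chosen set V, e.g. a tensor product of single-qubit Cliffords.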

S2. FULL STATE TOMOGRAPHY VS. RANDOMIZED MEASUREMENT FOR 5-QUBIT GHZ STATE
Here, we compare the cross-platform fidelity obtained from full state tomography and that from the randomized measurements on the 5-qubit GHZ state prepared on different platforms. We perform the full state tomography on a platform by measuring all 243 independent 5-qubit Pauli operators. To do so, we first independently generate the 5-qubit GHZ state circuits on each platform, with all the optimizations that satisfy the application-based criterion described in the main text. Then we append different single-qubit rotations to the circuit to create the 243 different circuits. Each of the circuits gives the projective measurement result of one of the 243 independent 5-qubit Pauli operators. We set M_S = 2000 for all the platforms. For the randomized measurement, because a random Pauli basis measurement is equivalent to a randomized measurement with single-qubit Clifford gates 10, we directly sample from the 243 Pauli basis measurements used for the full state tomography.
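Each of the 243 = 3^5 settings yields Pauli expectation values. From bit-string shots taken in a Pauli operator's eigenbasis, the expectation is the average parity over the operator's support; a minimal sketch (names ours):

```python
import numpy as np

def pauli_expectation(shots, support):
    """<P> estimated from bit strings measured in P's eigenbasis:
    the average of (-1)^(sum of outcome bits on P's support).

    shots: integer array of shape (n_shots, n_qubits) of 0/1 outcomes.
    support: indices of the qubits on which P acts nontrivially."""
    shots = np.asarray(shots)
    parity = (-1.0) ** shots[:, list(support)].sum(axis=1)
    return float(parity.mean())
```

Collecting these expectations over all 243 settings reconstructs the 5-qubit density matrix in the usual tomographic way.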

S4. QUANTUM SYSTEMS
In this section we detail the quantum systems used in this study.

IBM Quantum Experience
We use the IBM Quantum Experience service to access several of their superconducting quantum computers 15. The ones used are ibmq belem (IBM 1), ibmq casablanca (IBM 2), ibmq melbourne (IBM 3), ibmq quito (IBM 4), and ibmq rome (IBM 5).

TI EURIQA (UMD 1)
The Error-corrected Universal Reconfigurable Ion-trap Quantum Archetype (EURIQA) is a trapped-ion quantum computer currently located at the University of Maryland. This quantum computer supports up to thirteen qubits in a single chain of fifteen trapped 171Yb+ ions in a microfabricated chip trap 28. The system achieves native single-qubit gate fidelities of 99.96% and two-qubit XX gate fidelities of 98.5-99.3% 11. On this platform, we compile the circuits to its native gate set through KAK decomposition. We optimize the qubit assignment through exhaustive search to minimize the anticipated noise of entangling gates. No SPAM correction was applied in post-processing.

TI UMD (UMD 2)
The second trapped-ion quantum computer system at Maryland is part of the TIQC (Trapped Ion Quantum Computation) team. This quantum computer supports up to nine qubits made of a single chain of 171Yb+ ions trapped in a linear Paul trap with blade electrodes 29. Typical single- and two-qubit gate fidelities are 99.5(2)% and 98-99%, respectively. On this platform, we compile the quantum volume circuits to its native gate set through KAK decomposition. We apply SPAM correction to mitigate the detection noise, assuming that the preparation noise is negligible.

IonQ (IonQ 1 and IonQ 2)
The commercial trapped-ion quantum systems used by IonQ contain eleven fully connected qubits in a single chain of 171Yb+ ions trapped in a linear Paul trap with surface electrodes 29. The single-qubit fidelities are 99.7% for both systems at the time of measurement, while the two-qubit fidelities are 95-96% and 96-97% for IonQ 1 and IonQ 2, respectively. On this platform, we apply the technique described in Ref. 30 to optimize the circuits. Quantum volume circuits were decomposed in terms of partially entangling MS gates. No SPAM correction was applied in post-processing.

FIG. 1. Schematic diagram of the cross-platform comparison. a Test quantum circuit, represented by a unitary operator V for state preparation, with appended random rotations u_i on each qubit i for measurements in a random (particular) basis. b The circuits are transpiled for different quantum platforms into their corresponding native gates. Each of the M_U circuits is repeated M_S times on each platform. c The measurement results are sent to a central data repository for processing the fidelities defined in Eq. (1). As an example, d shows the cross-platform fidelity results for a 5-qubit GHZ state, including a row of comparisons between each of the six hardware systems and theory (labeled "simulation"). Entry i, j corresponds to the cross-platform fidelity between platform i and platform j. The cross-platform fidelity is inferred from M_U = 100 randomized measurements and M_S = 2000 repetitions for each U.
FIG. 2. a The quantum volume circuit diagram for d = 3.The d = 2 case does not have the operations in the dashed rectangle.b to d Cross-platform fidelity between different quantum computers.Entry i, j corresponds to the cross-platform fidelity F(ρ i , ρ j ) between platform-i and platform-j as defined in Eq. 1. b N = 7 and d = 2; c N = 7 and d = 3; d N = 13 and d = 2.

FIG. 3 .
FIG. 3. a The cross-platform fidelity between subsystems prepared on different quantum computers. Left: 7-qubit quantum volume circuit of 2 layers. Right: 7-qubit quantum volume circuit of 3 layers. The mean and error for each subsystem size are calculated via bootstrap re-sampling. b The projection of the randomized measurement dataset onto the first two principal axes, PC_1 and PC_2. Triangle markers are the 7-qubit quantum volume state with d = 2. Circle markers are the 7-qubit quantum volume state with d = 3. Magenta, orange, and violet correspond to simulation, trapped-ion, and IBM systems, respectively.

Algorithm 1: Greedy sampling of random unitaries
Input: Number of random unitaries M_U, a set of unitary operators S.
Output: M_U random unitary operations for randomized measurement {u_i}, where 1 ≤ i ≤ M_U.
1: Sample u_1 randomly from S.
2: for i = 2 to M_U do
3:   Find a unitary u_i ∈ S that minimizes the cost function C(u_i; u_1, ..., u_{i−1}).
4: end for
5: return {u_i}

FIG. S2. Fidelity error, |F_e − F|, for 5 randomly selected 5-qubit GHZ state cross-platform fidelities implemented on different platforms vs. the number of randomized measurements M_U. The number of measurements is M_S = 2000 for all cases.
FIG. S3.(a) Connectivity graph of IBM 2, IBM 3, and trapped ion (UMD 1 as an example) (b) Average number of two-qubit (entangling) gates needed to implement quantum volume circuits of layer d, on different quantum computers.The trapped ion quantum computers have the same all-to-all connectivity.
All the IBM systems use superconducting transmon qubits. The native gate sets are made of arbitrary single-qubit rotations and nearest-neighbor two-qubit CNOT gates according to the connectivity graph. The error of single-qubit gates in the IBM systems ranges from 3.32 × 10^−4 to 5.03 × 10^−2, and the two-qubit errors range from 7.47 × 10^−3 to 1.07 × 10^−1. Detailed specifications of each quantum device, including the qubit-connectivity diagram, can be found at https://quantum-computing.ibm.com/. On this platform, the synthesis and circuit optimization are implemented using the Qiskit open-source software 27.