Abstract
Variational hybrid quantum-classical algorithms are promising candidates for near-term implementation on quantum computers. In these algorithms, a quantum computer evaluates the cost of a gate sequence (with speedup over classical cost evaluation), and a classical computer uses this information to adjust the parameters of the gate sequence. Here we present such an algorithm for quantum state diagonalization. State diagonalization has applications in condensed matter physics (e.g., entanglement spectroscopy) as well as in machine learning (e.g., principal component analysis). For a quantum state ρ and gate sequence U, our cost function quantifies how far \(U\rho U^\dagger\) is from being diagonal. We introduce short-depth quantum circuits to quantify our cost. Minimizing this cost returns a gate sequence that approximately diagonalizes ρ. One can then read out approximations of the largest eigenvalues, and the associated eigenvectors, of ρ. As a proof-of-principle, we implement our algorithm on Rigetti’s quantum computer to diagonalize one-qubit states and on a simulator to find the entanglement spectrum of the Heisenberg model ground state.
Introduction
The future applications of quantum computers, assuming that large-scale, fault-tolerant versions will eventually be realized, are manifold. From a mathematical perspective, applications include number theory,^{1} linear algebra,^{2,3,4} differential equations,^{5,6} and optimization.^{7} From a physical perspective, applications include electronic structure determination^{8,9} for molecules and materials and real-time simulation of quantum dynamical processes^{10} such as protein folding and photoexcitation events. Naturally, some of these applications are more long-term than others. Factoring and solving linear systems of equations are typically viewed as longer-term applications due to their high resource requirements. On the other hand, approximate optimization and the determination of electronic structure may be nearer-term applications, and could even serve as demonstrations of quantum supremacy in the near future.^{11,12}
A major aspect of quantum algorithms research is to make applications of interest more near term by reducing quantum resource requirements including qubit count, circuit depth, numbers of gates, and numbers of measurements. A powerful strategy for this purpose is algorithm hybridization, where a fully quantum algorithm is turned into a hybrid quantum-classical algorithm.^{13} The benefit of hybridization is twofold, both reducing the resources (hence allowing implementation on smaller hardware) as well as increasing accuracy (by outsourcing calculations to “error-free” classical computers).
Variational hybrid algorithms are a class of quantum-classical algorithms that involve minimizing a cost function that depends on the parameters of a quantum gate sequence. Cost evaluation occurs on the quantum computer, with speedup over classical cost evaluation, and the classical computer uses this cost information to adjust the parameters of the gate sequence. Variational hybrid algorithms have been proposed for Hamiltonian ground state and excited state preparation,^{8,14,15} approximate optimization,^{7} error correction,^{16} quantum data compression,^{17,18} quantum simulation,^{19,20} and quantum compiling.^{21} A key feature of such algorithms is their near-term relevance, since only the subroutine of cost evaluation occurs on the quantum computer, while the optimization procedure is entirely classical, and hence standard classical optimization tools can be employed.
In this work, we consider the application of diagonalizing quantum states. In condensed matter physics, diagonalizing states is useful for identifying properties of topological quantum phases—a field known as entanglement spectroscopy.^{22} In data science and machine learning, diagonalizing the covariance matrix (which could be encoded in a quantum state^{2,23}) is frequently employed for principal component analysis (PCA). PCA identifies features that capture the largest variance in one’s data and hence allows for dimensionality reduction.^{24}
Classical methods for diagonalization typically scale polynomially in the matrix dimension.^{25} Similarly, the number of measurements required for quantum state tomography—a general method for fully characterizing a quantum state—scales polynomially in the dimension. Interestingly, Lloyd et al. proposed a quantum algorithm for diagonalizing quantum states that can potentially perform exponentially faster than these methods.^{2} Namely, their algorithm, called quantum principal component analysis (qPCA), gives an exponential speedup for low-rank matrices. qPCA employs quantum phase estimation combined with density matrix exponentiation. These subroutines require a significant number of qubits and gates, making qPCA difficult to implement in the near term, despite its long-term promise.
Here, we propose a variational hybrid algorithm for quantum state diagonalization. For a given state ρ, our algorithm is composed of three steps: (i) Train the parameters α of a gate sequence U_{p}(α) such that \(\tilde \rho = U_p({\boldsymbol{\alpha }}_{{\mathrm{opt}}})\rho U_p({\boldsymbol{\alpha }}_{{\mathrm{opt}}})^\dagger\) is approximately diagonal, where α_{opt} is the optimal value of α obtained, (ii) Read out the largest eigenvalues of ρ by measuring in the eigenbasis (i.e., by measuring \(\tilde \rho\) in the standard basis), and (iii) Prepare the eigenvectors associated with the largest eigenvalues. We call this the variational quantum state diagonalization (VQSD) algorithm. VQSD is a near-term algorithm with the same practical benefits as other variational hybrid algorithms. Employing a layered ansatz for U_{p}(α) (where p is the number of layers) allows one to obtain a hierarchy of approximations for the eigenvalues and eigenvectors. We therefore think of VQSD as an approximate diagonalization algorithm.
We carefully choose our cost function C to have the following properties: (i) C is faithful (i.e., it vanishes if and only if \(\tilde \rho\) is diagonal), (ii) C is efficiently computable on a quantum computer, (iii) C has operational meanings such that it upper bounds the eigenvalue and eigenvector error (see Sec. II A), and (iv) C scales well for training purposes in the sense that its gradient does not vanish exponentially in the number of qubits. The precise definition of C is given in Sec. II A and involves a difference of purities for different states. To compute C, we introduce short-depth quantum circuits that likely have applications outside the context of VQSD.
To illustrate our method, we implement VQSD on Rigetti’s 8-qubit quantum computer. We successfully diagonalize one-qubit pure states using this quantum computer. To highlight future applications (when larger quantum computers are made available), we implement VQSD on a simulator to perform entanglement spectroscopy on the ground state of the one-dimensional (1D) Heisenberg model composed of 12 spins.
Our paper is organized as follows. Section II outlines the VQSD algorithm and presents its implementation. In Sec. III, we give a comparison to the qPCA algorithm, and we elaborate on future applications. Section IV presents our methods for quantifying diagonalization and for optimizing our cost function.
Results
The VQSD algorithm
Overall structure
Figure 1 shows the structure of the VQSD algorithm. The goal of VQSD is to take, as its input, an n-qubit density matrix ρ given as a quantum state and then output approximations of the m largest eigenvalues and their associated eigenvectors. Here, m will typically be much less than 2^{n}, the matrix dimension of ρ, although the user is free to increase m at the price of increased algorithmic complexity (discussed below). The outputted eigenvalues will be in classical form, i.e., will be stored on a classical computer. In contrast, the outputted eigenvectors will be in quantum form, i.e., will be prepared on a quantum computer. This is necessary because the eigenvectors would have 2^{n} entries if they were stored on a classical computer, which is intractable for large n. Nevertheless, one can characterize important aspects of these eigenvectors with a polynomial number of measurements on the quantum computer.
Similar to classical eigensolvers, the VQSD algorithm is an approximate or iterative diagonalization algorithm. Classical eigenvalue algorithms are necessarily iterative, not exact. This can be seen by noting that computing eigenvalues is equivalent to computing roots of a polynomial equation (namely the characteristic polynomial of the matrix) and that no closedform solution exists for the roots of general polynomials of degree greater than or equal to five.^{25} Iterative algorithms are useful in that they allow for a tradeoff between runtime and accuracy. Higher degrees of accuracy can be achieved at the cost of more iterations (equivalently, longer runtime), or short runtime can be achieved at the cost of lower accuracy. This flexibility is desirable in that it allows the user of the algorithm to dictate the quality of the solutions found.
The iterative feature of VQSD arises via a layered ansatz for the diagonalizing unitary. This idea similarly appears in other variational hybrid algorithms, such as the Quantum Approximate Optimization Algorithm.^{7} Specifically, VQSD diagonalizes ρ by variationally updating a parameterized unitary U_{p}(α) such that

\(\tilde \rho _p({\boldsymbol{\alpha }}) = U_p({\boldsymbol{\alpha }})\rho U_p({\boldsymbol{\alpha }})^\dagger\)  (1)

is (approximately) diagonal at the optimal value α_{opt}. (For brevity we often write \(\tilde \rho\) for \(\tilde \rho _p({\boldsymbol{\alpha }})\).) We assume a layered ansatz of the form

\(U_p({\boldsymbol{\alpha }}) = L_1({\boldsymbol{\alpha }}_1)L_2({\boldsymbol{\alpha }}_2) \cdots L_p({\boldsymbol{\alpha }}_p).\)  (2)
Here, p is a hyperparameter that sets the number of layers L_{i}(α_{i}), and each α_{i} is a set of optimization parameters that corresponds to internal gate angles within the layer. The parameter α in (1) refers to the collection of all α_{i} for i = 1, …, p. Once the optimization procedure is finished and returns the optimal parameters α_{opt}, one can then run a particular quantum circuit (shown in Fig. 1c and discussed below) N_{readout} times to approximately determine the eigenvalues of ρ. The precision (i.e., the number of significant digits) of each eigenvalue increases with N_{readout} and with the eigenvalue’s magnitude. Hence, for small N_{readout} only the largest eigenvalues of ρ will be precisely characterized, so there is a connection between N_{readout} and how many eigenvalues, m, are determined. The hyperparameter p is a refinement parameter, meaning that the accuracy of the eigensystem (eigenvalues and eigenvectors) typically increases as p increases. We formalize this argument as follows.
Let C denote our cost function, defined below in (10), which we are trying to minimize. In general, the cost C will be nonincreasing (i.e., will either decrease or stay constant) in p. One can ensure that this is true by taking the optimal parameters learned for p layers as the starting point for the optimization of p + 1 layers and by setting α_{p+1} such that L_{p+1} (α_{p+1}) is an identity. This strategy also avoids barren plateaus^{26,27} and helps to mitigate the problem of local minima, as we discuss in Appendix B of Supplementary Material (SM).
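The layer-growing strategy just described can be sketched classically: the p-layer optimum seeds the (p + 1)-layer search, with the new layer's angles chosen so it acts as the identity. The array layout and the zero-angle-equals-identity rotation convention below are illustrative assumptions, not the paper's code:

```python
import numpy as np

def extend_ansatz(alpha_opt, params_per_layer):
    """Seed the (p+1)-layer optimization with the p-layer optimum.
    The new layer's angles are zero so that, under the (assumed)
    convention that zero-angle rotation gates reduce to the identity,
    the initial (p+1)-layer cost equals the optimal p-layer cost."""
    return np.concatenate([alpha_opt, np.zeros(params_per_layer)])

alpha_p = np.array([0.3, 1.2, 0.7])    # optimal angles after training p layers
alpha_p1 = extend_ansatz(alpha_p, 3)   # warm start for p + 1 layers
```

Because the new layer initially does nothing, the cost can only decrease (or stay constant) as p grows, which is the monotonicity property used in the text.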
Next, we argue that C is closely connected to the accuracy of the eigensystem. Specifically, it gives an upper bound on the eigensystem error. Hence, one obtains an increasingly tighter upper bound on the eigensystem error as C decreases (equivalently, as p increases). To quantify eigenvalue error, we define

\(\Delta _\lambda = \sum_{i = 1}^{d} (\lambda _i - \tilde \lambda _i)^2,\)  (3)

where d = 2^{n}, and {λ_{i}} and \(\{ \tilde \lambda _i\}\) are the true and inferred eigenvalues, respectively. Here, i is an index that orders the eigenvalues in decreasing order, i.e., \(\lambda _i \ge \lambda _{i + 1}\) and \(\tilde \lambda _i \ge \tilde \lambda _{i + 1}\) for all i ∈ {1, …, d − 1}. To quantify eigenvector error, we define

\(\Delta _v = \sum_i \langle \delta _i|\delta _i\rangle ,\quad {\mathrm{with}}\quad |\delta _i\rangle = {\Pi}_i^ \bot \rho |\tilde v_i\rangle .\)  (4)

Here, \(|\tilde v_i\rangle\) is the inferred eigenvector associated with \(\tilde \lambda _i\), and \({\Pi}_i^ \bot = 1 - |\tilde v_i\rangle \langle \tilde v_i|\) is the projector onto the subspace orthogonal to \(|\tilde v_i\rangle\). Hence, \(|\delta _i\rangle\) is a vector whose norm quantifies the component of \(\rho |\tilde v_i\rangle\) that is orthogonal to \(|\tilde v_i\rangle\), or in other words, how far \(|\tilde v_i\rangle\) is from being an eigenvector of ρ.
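For small matrices, both error measures can be checked classically. The following numpy sketch (illustrative only, not the quantum subroutine) evaluates Δ_λ and Δ_v for a candidate eigenbasis:

```python
import numpy as np

def eigensystem_errors(rho, V):
    """Classically evaluate the eigenvalue error (Eq. (3)) and the
    weighted eigenvector error (Eq. (4)) for a candidate eigenbasis.
    Columns of V are the inferred eigenvectors |v_i>."""
    # Inferred eigenvalues lambda_i = <v_i| rho |v_i>
    lam_tilde = np.real(np.diag(V.conj().T @ rho @ V))
    lam = np.sort(np.linalg.eigvalsh(rho))[::-1]          # true spectrum
    delta_lam = np.sum((lam - np.sort(lam_tilde)[::-1]) ** 2)
    delta_v = 0.0
    for i in range(V.shape[1]):
        v = V[:, i]
        resid = rho @ v - lam_tilde[i] * v   # component of rho|v> orthogonal to |v>
        delta_v += np.real(resid.conj() @ resid)
    return delta_lam, delta_v

# Both errors vanish in the true eigenbasis of a test state
rho = np.array([[0.7, 0.3], [0.3, 0.3]])
_, V_true = np.linalg.eigh(rho)
dl, dv = eigensystem_errors(rho, V_true)
```

In the true eigenbasis both quantities are zero; any other basis yields strictly positive errors.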
As proven in Sec. IV A, our cost function upper bounds the eigenvalue and eigenvector error up to a proportionality factor β,

\(C \ge \beta \Delta _\lambda \quad {\mathrm{and}}\quad C \ge \beta \Delta _v.\)  (5)
Because C is nonincreasing in p, the upper bound in (5) is nonincreasing in p and goes to zero if C goes to zero.
We remark that Δ_{v} can be interpreted as a weighted eigenvector error, where eigenvectors with larger eigenvalues are weighted more heavily in the sum. This is a useful feature since it implies that lowering the cost C will force the eigenvectors with the largest eigenvalues to be highly accurate. In many applications, such eigenvectors are precisely the ones of interest. (See Sec. II B for an illustration of this feature).
The various steps in the VQSD algorithm are shown schematically in Fig. 1. There are essentially three main steps: (1) an optimization loop that minimizes the cost C via back-and-forth communication between a classical and quantum computer, where the former adjusts α and the latter computes C for U_{p}(α), (2) a readout procedure for approximations of the m largest eigenvalues, which involves running a quantum circuit and then classically analyzing the statistics, and (3) a preparation procedure to prepare approximations of the eigenvectors associated with the m largest eigenvalues. In the following subsections, we elaborate on each of these procedures.
Parameter optimization loop
Naturally, there are many ways to parameterize U_{p}(α). Ideally one would like the number of parameters to grow at most polynomially in both n and p. Figure 2 presents an example ansatz that satisfies this condition. Each layer L_{i} is broken down into layers of two-body gates that can be performed in parallel. These two-body gates can be further broken down into parameterized one-body gates, for example, with the construction in ref. ^{28}. We discuss a different approach to parameterize U_{p}(α) in Appendix B of SM.
For a given ansatz, such as the one in Fig. 2, parameter optimization involves evaluating the cost C on a quantum computer for an initial choice of parameters and then modifying the parameters on a classical computer in an iterative feedback loop. The goal is to find

\({\boldsymbol{\alpha }}_{{\mathrm{opt}}} = \mathop{\arg\min}\limits_{\boldsymbol{\alpha }} \,C(U_p({\boldsymbol{\alpha }})).\)  (6)
The classical optimization routine used for updating the parameters can involve either gradient-free or gradient-based methods. In Sec. IV B, we explore this further and discuss our optimization methods.
In Eq. (6), C(U_{p}(α)) quantifies how far the state \(\tilde \rho _p({\boldsymbol{\alpha }})\) is from being diagonal. There are many ways to define such a cost function, and in fact there is an entire field of research on coherence measures that has introduced various such quantities.^{29} We aim for a cost that is efficiently computable with a quantum-classical system, and hence we consider a cost that can be expressed in terms of purities. (It is well known that a quantum computer can find the purity Tr(σ^{2}) of an n-qubit state σ with complexity scaling only linearly in n, an exponential speedup over classical computation^{30,31}). Two such cost functions, whose individual merits we discuss in Sec. IV A, are

\(C_1(U_p({\boldsymbol{\alpha }})) = {\mathrm{Tr}}(\rho ^2) - {\mathrm{Tr}}({\cal{Z}}(\tilde \rho )^2),\)  (7)

\(C_2(U_p({\boldsymbol{\alpha }})) = {\mathrm{Tr}}(\rho ^2) - \frac{1}{n}\sum_{j = 1}^{n} {\mathrm{Tr}}({\cal{Z}}_j(\tilde \rho )^2).\)  (8)
Here, \({\cal{Z}}\) and \({\cal{Z}}_j\) are quantum channels that dephase (i.e., destroy the off-diagonal elements) in the global standard basis and in the local standard basis on qubit j, respectively. Importantly, the two functions vanish under the same conditions:

\(C_1(U_p({\boldsymbol{\alpha }})) = 0 \iff C_2(U_p({\boldsymbol{\alpha }})) = 0 \iff \tilde \rho = {\cal{Z}}(\tilde \rho ).\)  (9)
So the global minima of C_{1} and C_{2} coincide and correspond precisely to unitaries U_{p}(α) that diagonalize ρ (i.e., unitaries such that \(\tilde \rho\) is diagonal).
As elaborated in Sec. IV A, C_{1} has operational meanings: it bounds our eigenvalue error, \(C_1 \ge \Delta _\lambda\), and it is equivalent to our eigenvector error, C_{1} = Δ_{v}. However, its landscape tends to be insensitive to changes in U_{p}(α) for large n. In contrast, we are not aware of a direct operational meaning for C_{2}, aside from its bound on C_{1} given by \(C_2 \ge (1/n)C_1\). However, the landscape for C_{2} is more sensitive to changes in U_{p}(α), making it useful for training U_{p}(α) when n is large. Due to these contrasting merits of C_{1} and C_{2}, we define our overall cost function C as a weighted average of these two functions,

\(C(U_p({\boldsymbol{\alpha }})) = qC_1(U_p({\boldsymbol{\alpha }})) + (1 - q)C_2(U_p({\boldsymbol{\alpha }})),\)  (10)
where q ∈ [0, 1] is a free parameter that allows one to tailor the VQSD method to the scale of one’s problem. For small n, one can set q ≈ 1, since the landscape for C_{1} is not too flat for small n and, as noted above, C_{1} is an operationally relevant quantity. For large n, one can set q to be small, since the landscape for C_{2} will provide the gradient needed to train U_{p}(α). The overall cost maintains the operational meaning in (5) with

\(\beta = q + (1 - q)/n.\)
Appendix B illustrates the advantages of training with different values of q.
Computing C amounts to evaluating the purities of various quantum states on a quantum computer and then doing some simple classical postprocessing that scales linearly in n. This can be seen from Eqs. (7) and (8). The first term, Tr(ρ^{2}), in C_{1} and C_{2} is independent of U_{p}(α). Hence, Tr(ρ^{2}) can be evaluated outside of the optimization loop in Fig. 1 using the Destructive Swap Test (see Sec. IV A for the circuit diagram). Inside the loop, we only need to compute \({\mathrm{Tr}}({\cal{Z}}(\tilde \rho )^2)\) and \({\mathrm{Tr}}({\cal{Z}}_j(\tilde \rho )^2)\) for all j. Each of these terms is computed by first preparing two copies of \(\tilde \rho\) and then implementing quantum circuits whose depths are constant in n. For example, the circuit for computing \({\mathrm{Tr}}({\cal{Z}}(\tilde \rho )^2)\) is shown in Fig. 1b, and surprisingly it has a depth of only one gate. We call it the Diagonalized Inner Product (DIP) Test. The circuit for computing \({\mathrm{Tr}}({\cal{Z}}_j(\tilde \rho )^2)\) is similar, and we call it the Partially Diagonalized Inner Product (PDIP) Test. We elaborate on both of these circuits in Sec. IV A.
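For small n, the purity-based costs of Eqs. (7) and (8) can be verified classically by applying the dephasing channels directly to the density matrix. The helper names below are our own, and the classical evaluation is a sketch of the definitions, not of the DIP/PDIP circuits:

```python
import numpy as np

def dephase(rho, qubits, n):
    """Zero the off-diagonal elements of rho between basis states that
    differ on any of the given qubits (all qubits = full dephasing Z)."""
    d = rho.shape[0]
    keep = np.ones((d, d), dtype=bool)
    for j in qubits:
        bit = (np.arange(d) >> (n - 1 - j)) & 1
        keep &= bit[:, None] == bit[None, :]
    return np.where(keep, rho, 0.0)

def costs(rho_tilde, n):
    """C1 and C2 of Eqs. (7)-(8), evaluated classically for small n.
    Unitarity gives Tr(rho^2) = Tr(rho_tilde^2)."""
    purity = np.real(np.trace(rho_tilde @ rho_tilde))
    Z = dephase(rho_tilde, range(n), n)
    C1 = purity - np.real(np.trace(Z @ Z))
    local = sum(np.real(np.trace(Zj @ Zj))
                for Zj in (dephase(rho_tilde, [j], n) for j in range(n)))
    C2 = purity - local / n
    return C1, C2

plus = np.array([[0.5, 0.5], [0.5, 0.5]])   # |+><+|, maximally off-diagonal
C1_plus, C2_plus = costs(plus, 1)
```

A diagonal state gives C1 = C2 = 0, matching Eq. (9), while the plus state gives the maximal one-qubit value of 1/2 for both costs.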
Eigenvalue readout
After finding the optimal diagonalizing unitary U_{p}(α_{opt}), one can use it to read out approximations of the eigenvalues of ρ. Figure 1c shows the circuit for this readout. One prepares a single copy of ρ and then acts with U_{p}(α_{opt}) to prepare \(\tilde \rho _p({\boldsymbol{\alpha }}_{{\mathrm{opt}}})\). Measuring in the standard basis {|z〉}, where z = z_{1}z_{2}… z_{n} is a bitstring of length n, gives a set of probabilities \(\{ \tilde \lambda _{\boldsymbol{z}}\}\) with

\(\tilde \lambda _{\boldsymbol{z}} = \langle {\boldsymbol{z}}|\tilde \rho _p({\boldsymbol{\alpha }}_{{\mathrm{opt}}})|{\boldsymbol{z}}\rangle .\)
We take the \(\tilde \lambda _{\boldsymbol{z}}\) as the inferred eigenvalues of ρ. We emphasize that the \(\tilde \lambda _{\boldsymbol{z}}\) are the diagonal elements, not the eigenvalues, of \(\tilde \rho _p({\boldsymbol{\alpha }}_{{\mathrm{opt}}})\).
Each run of the circuit in Fig. 1c generates a bitstring z corresponding to the measurement outcomes. If one obtains z with frequency f_{z} for N_{readout} total runs, then

\(\tilde \lambda _{\boldsymbol{z}}^{{\mathrm{est}}} = f_{\boldsymbol{z}}/N_{{\mathrm{readout}}}\)
gives an estimate for \(\tilde \lambda _{\boldsymbol{z}}\). The statistical deviation of \(\tilde \lambda _{\boldsymbol{z}}^{{\mathrm{est}}}\) from \(\tilde \lambda _{\boldsymbol{z}}\) goes with \(1{\mathrm{/}}\sqrt {N_{{\mathrm{readout}}}}\). The relative error \(\epsilon _{\boldsymbol{z}}\) (i.e., the ratio of the statistical error on \(\tilde \lambda _{\boldsymbol{z}}^{{\mathrm{est}}}\) to the value of \(\tilde \lambda _{\boldsymbol{z}}^{{\mathrm{est}}}\)) then goes as

\(\epsilon _{\boldsymbol{z}} \approx 1/\sqrt {f_{\boldsymbol{z}}} .\)
This implies that events z with higher frequency f_{z} have lower relative error. In other words, the larger the inferred eigenvalue \(\tilde \lambda _{\boldsymbol{z}}\), the lower the relative error, and hence the more precisely it is determined from the experiment. When running VQSD, one can pre-decide on the desired values of N_{readout} and a threshold for the relative error, denoted \(\epsilon _{\mathrm{max}}\). This error threshold \(\epsilon _{\mathrm{max}}\) will then determine m, i.e., how many of the largest eigenvalues get precisely characterized. So \(m = m(N_{{\mathrm{readout}}},\epsilon _{\textrm{max}},\{ \tilde \lambda _{\boldsymbol{z}}\} )\) is a function of N_{readout}, \(\epsilon _{\mathrm{max}}\), and the set of inferred eigenvalues \(\{ \tilde \lambda _{\boldsymbol{z}}\}\). Precisely, we take \(m = |{\tilde{\lambda} }^{{\mathrm{est}}}|\), the cardinality of the following set:

\({\tilde{\lambda} }^{{\mathrm{est}}} = \{ \tilde \lambda _{\boldsymbol{z}}^{{\mathrm{est}}}:\epsilon _{\boldsymbol{z}} \le \epsilon _{{\mathrm{max}}}\} ,\)  (15)
which is the set of inferred eigenvalues that were estimated with the desired precision.
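The readout statistics can be mimicked classically by sampling bitstrings from a hypothetical inferred spectrum. The spectrum and thresholds below are illustrative numbers, and the acceptance rule uses the relative-error estimate \(1/\sqrt{f_{\boldsymbol z}}\) discussed above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inferred spectrum (diagonal of rho-tilde after training);
# these values are illustrative, not experimental data
lam_tilde = {"00": 0.75, "01": 0.20, "10": 0.04, "11": 0.01}

N_readout = 10_000
counts = rng.multinomial(N_readout, list(lam_tilde.values()))
estimates = {z: f / N_readout for z, f in zip(lam_tilde, counts)}

# Keep only eigenvalues whose relative statistical error ~ 1/sqrt(f_z)
# falls below the chosen threshold
eps_max = 0.03
accepted = {z: estimates[z] for z, f in zip(lam_tilde, counts)
            if f > 0 and 1.0 / np.sqrt(f) <= eps_max}
m = len(accepted)    # number of precisely characterized eigenvalues
```

With these numbers only the two largest eigenvalues pass the threshold, illustrating how m shrinks for fixed N_{readout} as the spectrum decays.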
Eigenvector preparation
The final step of VQSD is to prepare the eigenvectors associated with the m largest eigenvalues, i.e., the eigenvalues in the set in Eq. (15). Let \({\boldsymbol{Z}} = \{ {\boldsymbol{z}}:\tilde \lambda _{\boldsymbol{z}}^{{\mathrm{est}}} \in {\tilde{\lambda} }^{{\mathrm{est}}}\}\) be the set of bitstrings z associated with the eigenvalues in \({\tilde{\lambda} }^{{\mathrm{est}}}\). (Note that these bitstrings are obtained directly from the measurement outcomes of the circuit in Fig. 1c, i.e., the outcomes become the bitstring z). For each z ∈ Z, one can prepare the following state, which we take as the inferred eigenvector associated with our estimate of the inferred eigenvalue \(\tilde \lambda _{\boldsymbol{z}}^{{\mathrm{est}}}\),

\(|\tilde v_{\boldsymbol{z}}\rangle = U_p({\boldsymbol{\alpha }}_{{\mathrm{opt}}})^\dagger |{\boldsymbol{z}}\rangle = U_p({\boldsymbol{\alpha }}_{{\mathrm{opt}}})^\dagger \left( X^{z_1} \otimes \cdots \otimes X^{z_n} \right)|0\rangle ^{ \otimes n}.\)  (17)
The circuit for preparing this state is shown in Fig. 1d. As noted in (17), one first prepares |z〉 by acting with X operators raised to the appropriate powers, and then one acts with \(U_p({\boldsymbol{\alpha }}_{{\mathrm{opt}}})^\dagger\) to rotate from the standard basis to the inferred eigenbasis.
Once they are prepared on the quantum computer, each inferred eigenvector \(|\tilde v_{\boldsymbol{z}}\rangle\) can be characterized by measuring expectation values of interest. That is, important physical features such as energy or entanglement (e.g., entanglement witnesses) are associated with some Hermitian observable M, and one can evaluate the expectation value \(\langle \tilde v_{\boldsymbol{z}}|M|\tilde v_{\boldsymbol{z}}\rangle\) to learn about these features.
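A classical sketch of this preparation step, with a Hadamard standing in for the optimized unitary and Pauli-Z for the observable M (both assumptions for illustration):

```python
import numpy as np

def prepare_eigenvector(U_opt, z_bits):
    """Classical sketch of Fig. 1d: build |v_z> = U_p(alpha_opt)^dagger |z>,
    where |z> comes from |0...0> by applying X on each qubit whose bit is 1."""
    X = np.array([[0.0, 1.0], [1.0, 0.0]])
    prep = np.array([[1.0]])
    for b in z_bits:
        prep = np.kron(prep, X if b else np.eye(2))
    zero = np.zeros(2 ** len(z_bits))
    zero[0] = 1.0
    return U_opt.conj().T @ (prep @ zero)

# One-qubit example: Hadamard as the (hypothetical) diagonalizing unitary
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
v = prepare_eigenvector(H, [0])                  # H^dagger |0> = |+>
M = np.array([[1.0, 0.0], [0.0, -1.0]])          # Pauli-Z as observable M
expval = np.real(v.conj() @ M @ v)               # <v_z| M |v_z>
```

On hardware the expectation value would of course come from repeated measurements rather than from the state vector.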
Implementations
Here we present our implementations of VQSD, first for a one-qubit state on a cloud quantum computer to show that it is amenable to currently available hardware. Then, to illustrate the scaling to larger, more interesting problems, we implement VQSD on a simulator for the 12-spin ground state of the Heisenberg model. See Appendices A and B of SM for further details. The code used to generate some of the examples presented here and in SM can be accessed from ref. ^{32}
Onequbit state
We now discuss the results of applying VQSD to the one-qubit plus state ρ = |+〉〈+| on the 8Q-Agave quantum computer provided by Rigetti.^{33} Because the problem size is small (n = 1), we set q = 1 in the cost function (10). Since ρ is a pure state, the cost function is

\(C = C_1 = 1 - {\mathrm{Tr}}({\cal{Z}}(\tilde \rho )^2).\)
For U_{p}(α), we take p = 1, for which the layered ansatz becomes an arbitrary single qubit rotation.
The results of VQSD for this state are shown in Fig. 3. In Fig. 3a, the solid curve shows the cost versus the number of iterations in the parameter optimization loop, and the dashed curves show the inferred eigenvalues of ρ at each iteration. Here we used the Powell optimization algorithm; see Sec. IV B for more details. As can be seen, the cost decreases to a small value near zero, and the eigenvalue estimates simultaneously converge to the correct values of zero and one. Hence, VQSD successfully diagonalized this state.
Figure 3b shows the landscape of the optimization problem on Rigetti’s 8Q-Agave quantum computer, Rigetti’s noisy simulator, and a noiseless simulator. Here, we varied the angle α in the diagonalizing unitary U(α) = R_{x}(π/2)R_{z}(α) and computed the cost at each value of this angle. The landscape on the quantum computer has local minima near the optimal angles α = π/2, 3π/2, but the cost at these minima is not zero. This explains why we obtain the correct eigenvalues in Fig. 3a even though the cost there is nonzero; the nonzero cost can be attributed to a combination of decoherence, gate infidelity, and measurement error. As shown in Fig. 3b, the 8Q-Agave quantum computer retuned during our data collection, and after this retuning, the landscape of the quantum computer matched that of the noisy simulator significantly better.
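The noiseless version of this landscape can be reproduced classically. The sketch below scans the cost \(C_1(\alpha)\) for U(α) = R_x(π/2)R_z(α) acting on ρ = |+〉〈+|; a coarse grid scan stands in for the Powell optimizer used on hardware:

```python
import numpy as np

def cost(alpha):
    """Noiseless C1 for U(alpha) = Rx(pi/2) Rz(alpha) acting on the
    plus state, mirroring the landscape scan of Fig. 3b."""
    Rz = np.diag([np.exp(-1j * alpha / 2), np.exp(1j * alpha / 2)])
    Rx = np.array([[1, -1j], [-1j, 1]]) / np.sqrt(2)     # Rx(pi/2)
    plus = np.array([1.0, 1.0]) / np.sqrt(2)
    rho = np.outer(plus, plus)                           # |+><+|
    U = Rx @ Rz
    diag = np.real(np.diag(U @ rho @ U.conj().T))
    return np.real(np.trace(rho @ rho)) - np.sum(diag ** 2)

# Coarse grid scan over the full period of the angle
alphas = np.linspace(0.0, 2 * np.pi, 721)
values = np.array([cost(a) for a in alphas])
alpha_opt = alphas[np.argmin(values)]
```

Analytically this landscape is \(C_1(\alpha) = \cos^2(\alpha)/2\), with zeros at α = π/2 and 3π/2, matching the minima seen in the figure.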
Heisenberg model ground state
While current noise levels of quantum hardware limit our implementations of VQSD to small problem sizes, we can explore larger problem sizes on a simulator. An important application of VQSD is to study the entanglement in condensed matter systems, and we highlight this application in the following example.
Let us consider the ground state of the 1D Heisenberg model, the Hamiltonian of which is

\(H = \sum_{j = 1}^{2n} {\boldsymbol{S}}^{(j)} \cdot {\boldsymbol{S}}^{(j + 1)},\)
with \({\boldsymbol{S}}^{(j)} = (1/2)(\sigma _x^{(j)}\hat x + \sigma _y^{(j)}\hat y + \sigma _z^{(j)}\hat z)\) and periodic boundary conditions, S^{(2n+1)} = S^{(1)}. Performing entanglement spectroscopy on the ground state |ψ〉_{AB} involves diagonalizing the reduced state ρ = Tr_{B}(|ψ〉〈ψ|_{AB}). Here we consider a total of eight spins (2n = 8). We take A to be a subset of four nearest-neighbor spins, and B is the complement of A.
The results of applying VQSD to the four-spin reduced state ρ via a simulator are shown in Fig. 4. Panel (a) plots the inferred eigenvalues versus the number of layers p in our ansatz (see Fig. 2). One can see that the inferred eigenvalues converge to their theoretical values as p increases. Panel (b) plots the inferred eigenvalues resolved by their associated quantum numbers (z-component of total spin). This plot illustrates the feature we noted previously: minimizing our cost first minimizes the eigenvector error for those eigenvectors with the largest eigenvalues. Overall, our VQSD implementation returned roughly the correct values for both the eigenvalues and their quantum numbers. Resolving not only the eigenvalues but also their quantum numbers is important for entanglement spectroscopy,^{22} and clearly VQSD can do this.
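For readers who want the exact spectrum targeted in this example, the following numpy sketch builds the eight-spin Heisenberg ring, extracts its ground state, and diagonalizes the four-spin reduced state by brute force (feasible only at this small size; VQSD replaces the final diagonalization on larger systems):

```python
import numpy as np

# Spin-1/2 operators S = sigma / 2
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2

def two_site(op, i, j, N):
    """Place the same 2x2 operator on sites i and j of an N-spin chain
    (identity elsewhere), via a Kronecker product."""
    out = np.array([[1.0 + 0j]])
    for s in range(N):
        out = np.kron(out, op if s in (i, j) else np.eye(2))
    return out

N = 8  # total number of spins (2n = 8 in the text)
H = sum(two_site(S, j, (j + 1) % N, N)      # S^(j) . S^(j+1), periodic
        for j in range(N) for S in (sx, sy, sz))

# Ground state, then the reduced state of the block A = spins 0-3
evals, evecs = np.linalg.eigh(H)
psi = evecs[:, 0].reshape(16, 16)           # rows index A, columns index B
rho_A = psi @ psi.conj().T                  # rho = Tr_B |psi><psi|
ent_spectrum = np.sort(np.real(np.linalg.eigvalsh(rho_A)))[::-1]
```

The partial trace is done by reshaping the state vector into a 16 × 16 matrix whose rows index the A spins, so that \(\rho_A = MM^\dagger\).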
In Appendix B of SM we discuss an alternative approach employing a variable ansatz for U_{p}(α), and we present results of applying this approach to a six-qubit reduced state of the 12-qubit ground state of the Heisenberg model.
Discussion
We emphasize that VQSD is meant for states ρ that have either low rank or possibly high rank but low entropy H(ρ) = −Tr(ρ log ρ). This is because the eigenvalue readout step of VQSD would be exponentially complex for states with high entropy. In other words, for high entropy states, if one efficiently implemented the eigenvalue readout step (with N_{readout} polynomial in n), then very few eigenvalues would get characterized with the desired precision. In Appendix B of SM we discuss the complexity of VQSD for particular example states.
Examples of states for which VQSD is expected to be efficient include density matrices computed from ground states of 1D, local, gapped Hamiltonians. Also, thermal states of some 1D systems in a many-body localized phase at low enough temperature are expected to be diagonalizable by VQSD. These states have rapidly decaying spectra and are eigendecomposed into states obeying a 1D area law.^{34,35,36} This means that every eigenstate can be prepared by a constant depth circuit in alternating ansatz form,^{35} and hence VQSD will be able to diagonalize it.
Comparison to literature
Diagonalizing quantum states with classical methods would require exponentially large memory to store the density matrix, and the matrix operations needed for diagonalization would be exponentially costly. VQSD avoids both of these scaling issues.
Another quantum algorithm that extracts the eigenvalues and eigenvectors of a quantum state is qPCA.^{2} Similar to VQSD, qPCA has the potential for exponential speedup over classical diagonalization for particular classes of quantum states. Like VQSD, the speedup in qPCA is contingent on ρ being a low-entropy state.
We performed a simple implementation of qPCA to get a sense for how it compares to VQSD; see Appendix B in SM for details. In particular, just as we did for Fig. 3, we considered the one-qubit plus state ρ = |+〉〈+|. We implemented qPCA for this state on Rigetti’s noisy simulator (whose noise is meant to mimic that of their 8Q-Agave quantum computer). The circuit that we implemented applied one controlled-exponential-swap gate (in order to approximately exponentiate ρ, as discussed in ref. ^{2}). We employed a machine-learning approach^{37} to compile the controlled-exponential-swap gate into a short-depth gate sequence (see Appendix B in SM). With this circuit we inferred the two eigenvalues of ρ to be approximately 0.8 and 0.2. Hence, for this simple example, qPCA gave eigenvalues that were slightly off from the true values of 1 and 0, while VQSD was able to obtain the correct eigenvalues, as discussed in Fig. 3.
Future applications
As noted in ref. ^{2}, one application of quantum state diagonalization is benchmarking of quantum noise processes, i.e., quantum process tomography. Here one prepares the Choi state by sending half of a maximally entangled state through the process of interest. One can apply VQSD to the resulting Choi state to learn about the noise process, which may be particularly useful for benchmarking near-term quantum computers.
A special case of VQSD is variational state preparation. That is, if one applies VQSD to a pure state ρ = |ψ〉〈ψ|, then one can learn the unitary U(α) that maps |ψ〉 to a standard basis state. Inverting this unitary allows one to map a standard basis state (and hence the state |0〉^{⊗n}) to the state |ψ〉, which is known as state preparation. Hence, if one is given |ψ〉 in quantum form, then VQSD can potentially find a short-depth circuit that approximately prepares |ψ〉. Variational quantum compiling algorithms that were very recently proposed^{21,38} may also be used for this same purpose, and hence it would be interesting to compare VQSD to these algorithms for this special case. Additionally, in this special case one could use VQSD and these other algorithms as an error mitigation tool, i.e., to find a short-depth state preparation that achieves higher accuracy than the original state preparation.
In machine learning, PCA is a subroutine in supervised and unsupervised learning algorithms and also has many direct applications. PCA inputs a data matrix X and finds a new basis such that the variance is maximal along the new basis vectors. One can show that this amounts to finding the eigenvectors of the covariance matrix E[XX^{T}] with the largest eigenvalues, where E denotes expectation value. Thus PCA involves diagonalizing a positive-semidefinite matrix, E[XX^{T}]. Hence VQSD can perform this task provided one has access to QRAM^{23} to prepare the covariance matrix as a quantum state. PCA can reduce the dimension of X as well as filter out noise in data. In addition, nonlinear (kernel) PCA can be used on data that is not linearly separable. Very recent work by Tang^{39} suggests that classical algorithms could be improved for PCA of low-rank matrices, and potentially obtain similar scaling as qPCA and VQSD. Hence future work is needed to compare these different approaches to PCA.
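The covariance-diagonalization step of PCA can be sketched classically with numpy (toy data, all values illustrative; VQSD would replace the eigendecomposition when the covariance matrix is encoded in a quantum state):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 500 samples lying mostly along one direction in 3-D,
# plus small isotropic noise
samples = rng.normal(size=(500, 1)) @ np.array([[3.0, 1.0, 0.5]])
samples += 0.05 * rng.normal(size=(500, 3))
X = samples - samples.mean(axis=0)            # center the data

cov = X.T @ X / len(X)                        # empirical covariance E[X X^T]
evals, evecs = np.linalg.eigh(cov)
order = np.argsort(evals)[::-1]               # sort eigenpairs, largest first
evals, evecs = evals[order], evecs[:, order]

explained = evals[0] / evals.sum()            # variance captured by the top PC
```

For this data the top principal component captures nearly all of the variance and aligns with the planted direction, which is exactly the low-rank regime where qPCA and VQSD are expected to help.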
Perhaps the most important near-term application of VQSD is to study condensed matter physics. In particular, we propose that one can apply the variational quantum eigensolver^{8} to prepare the ground state of a many-body system, and then one can follow this with the VQSD algorithm to characterize the entanglement in this state. Ultimately this approach could elucidate key properties of condensed matter phases. In particular, VQSD allows for entanglement spectroscopy, which has direct application to the identification of topological order.^{22} Extracting both the eigenvalues and eigenvectors is useful for entanglement spectroscopy,^{22} and we illustrated this capability of VQSD in Fig. 4. Finally, an interesting future research direction is to check how discrepancies in the preparation of multiple copies affect the performance of the diagonalization.
Methods
Diagonalization test circuits
Here we elaborate on the cost functions C_{1} and C_{2} and present shortdepth quantum circuits to compute them.
C_{1} and the DIP test
The function C_{1} defined in Eq. (7) has several intuitive interpretations. These interpretations make it clear that C_{1} quantifies how far a state is from being diagonal. In particular, let \(D_{{\mathrm{HS}}}(A,B): = {\mathrm{Tr}}\left( {(A - B)^\dagger (A - B)} \right)\) denote the Hilbert–Schmidt distance. Then we can write
In other words, C_{1} is (1) the minimum distance between \(\tilde \rho\) and the set of diagonal states \({\cal{D}}\), (2) the distance from \(\tilde \rho\) to \({\cal{Z}}(\tilde \rho )\), and (3) the sum of the absolute squares of the off-diagonal elements of \(\tilde \rho\).
C_{1} can also be written as the eigenvector error in Eq. (4) as follows. For an inferred eigenvector \(|\tilde v_{\boldsymbol{z}}\rangle\), we define \(|\delta _{\boldsymbol{z}}\rangle = \rho |\tilde v_{\boldsymbol{z}}\rangle - \tilde \lambda _{\boldsymbol{z}}|\tilde v_{\boldsymbol{z}}\rangle\) and write the eigenvector error as
since \(\langle \tilde v_{\boldsymbol{z}}|\rho |\tilde v_{\boldsymbol{z}}\rangle = \tilde \lambda _{\boldsymbol{z}}\). Summing over all z gives
which proves the bound in Eq. (5) for q = 1.
In addition, C_{1} bounds the eigenvalue error defined in Eq. (3). Let \({\tilde{\lambda} } = (\tilde \lambda _1, \ldots ,\tilde \lambda _d)\) and λ = (λ_{1}, …, λ_{d}) denote the inferred and actual eigenvalues of ρ, respectively, both arranged in decreasing order. In this notation we have
Since the eigenvalues of a density matrix majorize its diagonal elements, \({\boldsymbol{\lambda }} \succ {\tilde{\lambda} }\), and the dot product with an ordered vector is a Schur convex function, we have
Hence from Eq. (29) and Eq. (30) we obtain the bound
which corresponds to the bound in Eq. (5) for the special case of q = 1.
For computational purposes, we use the difference of purities interpretation of C_{1} given in Eq. (7). The Tr(ρ^{2}) term is independent of U_{p}(α). Hence it only needs to be evaluated once, outside of the parameter optimization loop. It can be computed via the expectation value of the swap operator S on two copies of ρ, using the identity
This expectation value is found with a depth-two quantum circuit that essentially corresponds to a Bell-basis measurement, with classical postprocessing that scales linearly in the number of qubits.^{37,40} This is shown in Fig. 5a. We call this procedure the Destructive Swap Test, since it is like the Swap Test, but the measurement occurs on the original systems instead of on an ancilla.
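The identity behind this measurement, Tr(ρσ) = Tr(S(ρ ⊗ σ)), can be checked directly with a small NumPy sketch (our illustration; the paper's circuits estimate this quantity on hardware rather than by matrix algebra):

```python
import numpy as np

def swap_operator(d):
    """SWAP on two d-dimensional systems: S |i>|j> = |j>|i>."""
    S = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            S[j * d + i, i * d + j] = 1.0
    return S

def random_density_matrix(d, rng):
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = A @ A.conj().T          # positive semidefinite
    return rho / np.trace(rho)    # unit trace

rng = np.random.default_rng(0)
rho, sigma = random_density_matrix(2, rng), random_density_matrix(2, rng)
lhs = np.trace(rho @ sigma)
rhs = np.trace(swap_operator(2) @ np.kron(rho, sigma))
print(np.isclose(lhs, rhs))  # the swap trick reproduces Tr(rho sigma)
```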
Similarly, the \({\mathrm{Tr}}({\cal{Z}}(\tilde \rho )^2)\) term could be evaluated by first dephasing \(\tilde \rho\) and then performing the Destructive Swap Test, which would involve a depth-three quantum circuit with linear classical postprocessing. This approach was noted in ref. ^{41}. However, there exists a simpler circuit, which we call the Diagonalized Inner Product (DIP) Test. The DIP Test involves a depth-one quantum circuit with no classical postprocessing. An abstract version of this circuit is shown in Fig. 5b, for two states σ and τ. The proof that this circuit computes \({\mathrm{Tr}}({\cal{Z}}(\sigma ){\cal{Z}}(\tau ))\) is given in Appendix B of SM. For our application we will set \(\sigma = \tau = \tilde \rho\), for which this circuit gives \({\mathrm{Tr}}({\cal{Z}}(\tilde \rho )^2)\).
In summary, C_{1} is efficiently computed by using the Destructive Swap Test for the Tr(ρ^{2}) term and the DIP Test for the \({\mathrm{Tr}}({\cal{Z}}(\tilde \rho )^2)\) term.
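As a classical cross-check of this decomposition (our sketch, not the paper's hardware procedure), C_{1} can be evaluated directly from a density matrix as the difference of the two purities:

```python
import numpy as np

def c1_cost(rho_tilde):
    """C_1 = Tr(rho~^2) - Tr(Z(rho~)^2): equals the sum of absolute
    squares of the off-diagonal elements of rho~."""
    purity = np.trace(rho_tilde @ rho_tilde).real
    dephased_purity = float(np.sum(np.abs(np.diag(rho_tilde)) ** 2))
    return purity - dephased_purity

# One-qubit example with off-diagonal element 0.3:
rho = np.array([[0.6, 0.3], [0.3, 0.4]])
print(c1_cost(rho))  # 2 * 0.3**2 = 0.18
```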
C_{2} and the PDIP test
Like C_{1}, C_{2} can also be rewritten in terms of the Hilbert–Schmidt distance. Namely, C_{2} is the average distance of \(\tilde \rho\) to each locally dephased state \({\cal{Z}}_j(\tilde \rho )\):
where \({\cal{Z}}_j( \cdot ) = \mathop {\sum}\nolimits_z (|z\rangle \langle z|_j \otimes 1_{k \ne j})( \cdot )(|z\rangle \langle z|_j \otimes 1_{k \ne j})\). Naturally, one would expect that \(C_2 \le C_1\), since \(\tilde \rho\) should be closer to each locally dephased state than to the fully dephased state. Indeed this is true and can be seen from:
However, C_{1} and C_{2} vanish under precisely the same conditions, as noted in Eq. (9). One can see this by noting that C_{2} also upper bounds (1/n)C_{1} and hence we have
Combining the upper bound in Eq. (35) with the relations in Eq. (26) and Eq. (31) gives the bounds in Eq. (5) with β defined in Eq. (11). The upper bound in Eq. (35) is proved as follows. Let z = z_{1}…z_{n} and \({\boldsymbol{z}}^{\prime} = z_1^{\prime} \ldots z_n^{\prime}\) be n-bit strings. Let \({\cal{S}}\) be the set of all pairs (z, z′) such that z ≠ z′, and let \({\cal{S}}_j\) be the set of all pairs (z, z′) such that \(z_j \ne z_j^{\prime}\). Then we have \(C_1 = \mathop {\sum}\nolimits_{({\boldsymbol{z}},{\boldsymbol{z}}^{\prime} ) \in {\cal{S}}} |\langle {\boldsymbol{z}}|\tilde \rho |{\boldsymbol{z}}^{\prime} \rangle |^{2}\), and
where \({\cal{S}}^U = \mathop {\bigcup}\nolimits_{j = 1}^n {{\cal{S}}_j}\) is the union of all the \({\cal{S}}_j\) sets. The inequality in Eq. (37) arises from the fact that the \({\cal{S}}_j\) sets have nontrivial intersections with each other, and hence some terms are discarded when considering only the union \({\cal{S}}^U\). The last equality follows from the fact that \({\cal{S}}^U = {\cal{S}}\), i.e., the set of all bitstring pairs that differ from each other (\({\cal{S}}\)) coincides with the set of all bitstring pairs that differ in at least one position (\({\cal{S}}^U\)).
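The resulting relations, (1/n)C_{1} ≤ C_{2} ≤ C_{1}, can be verified numerically on random states with a small sketch (our illustration; the qubit-indexing convention is ours):

```python
import numpy as np

def dephase_qubit(rho, j, n):
    """Z_j: zero the elements whose row/column indices differ in bit j
    (qubit 0 taken as the most significant bit)."""
    out = rho.copy()
    for r in range(2 ** n):
        for c in range(2 ** n):
            if ((r >> (n - 1 - j)) & 1) != ((c >> (n - 1 - j)) & 1):
                out[r, c] = 0.0
    return out

def c1(rho):
    # Tr(rho^2) - Tr(Z(rho)^2); the fully dephased purity is sum |diag|^2
    return (np.trace(rho @ rho) - np.sum(np.abs(np.diag(rho)) ** 2)).real

def c2(rho, n):
    purity = np.trace(rho @ rho).real
    return float(np.mean([purity - np.trace(dephase_qubit(rho, j, n) @
                          dephase_qubit(rho, j, n)).real for j in range(n)]))

n = 2
rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
rho = A @ A.conj().T
rho = rho / np.trace(rho)                  # random two-qubit state
print(c1(rho) / n <= c2(rho, n) <= c1(rho))
```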
Writing C_{2} in terms of purities, as in Eq. (8), shows how it can be computed on a quantum computer. As in the case of C_{1}, the first term in Eq. (8) is computed with the Destructive Swap Test. For the second term in Eq. (8), each purity \({\mathrm{Tr}}({\cal{Z}}_j(\tilde \rho )^2)\) could also be evaluated with the Destructive Swap Test, by first locally dephasing the appropriate qubit. However, we present a slightly improved circuit to compute these purities that we call the PDIP Test. The PDIP Test is shown in Fig. 5c for the general case of feeding in two distinct states σ and τ with the goal of computing the inner product between \({\cal{Z}}_{\boldsymbol{j}}(\sigma )\) and \({\cal{Z}}_{\boldsymbol{j}}(\tau )\). For generality we let l, with \(0 \le l \le n\), denote the number of qubits being locally dephased for this computation. If l > 0, we define j = (j_{1}, …, j_{l}) as a vector of indices that indicates which qubits are being locally dephased. The PDIP Test is a hybrid of the Destructive Swap Test and the DIP Test, corresponding to the former when l = 0 and the latter when l = n. Hence, it generalizes both the Destructive Swap Test and the DIP Test. Namely, the PDIP Test performs the DIP Test on the qubits appearing in j and performs the Destructive Swap Test on the qubits not appearing in j. The proof that the PDIP Test computes \({\mathrm{Tr}}({\cal{Z}}_{\boldsymbol{j}}(\sigma ){\cal{Z}}_{\boldsymbol{j}}(\tau ))\), and hence \({\mathrm{Tr}}({\cal{Z}}_{\boldsymbol{j}}(\tilde \rho )^2)\) when \(\sigma = \tau = \tilde \rho\), is given in Appendix B of SM.
C_{1} versus C_{2}
Here we discuss the contrasting merits of the functions C_{1} and C_{2}, hence motivating our cost definition in Eq. (10).
As noted previously, C_{2} does not have an operational meaning like C_{1}. In addition, the circuit for computing C_{1} is more efficient than that for C_{2}. The circuit in Fig. 5b for computing the second term in C_{1} has a gate depth of one, with n CNOT gates, n measurements, and no classical postprocessing. The circuit in Fig. 5c for computing the second term in C_{2} has a gate depth of two, with n CNOT gates, n − 1 Hadamard gates, 2n − 1 measurements, and classical postprocessing whose complexity scales linearly in n. So in every aspect, the circuit for computing C_{1} is less complex than that for C_{2}. This implies that C_{1} can be computed with greater accuracy than C_{2} on a noisy quantum computer.
On the other hand, consider how the landscapes of C_{1} and C_{2} scale with n. As a simple example, suppose ρ = |0⟩⟨0| ⊗ ⋯ ⊗ |0⟩⟨0|. Suppose one takes a single-parameter ansatz for U, such that U(θ) = R_{X}(θ) ⊗ ⋯ ⊗ R_{X}(θ), where R_{X}(θ) is a rotation about the X-axis of the Bloch sphere by angle θ. For this example,
where \(x(\theta ) = {\mathrm{Tr}}({\cal{Z}}(R_X(\theta )|0\rangle \langle 0|R_X(\theta )^\dagger )^2) = (1 + \cos ^2\theta )/2\). If θ is not an integer multiple of π, then x(θ) < 1, and x(θ)^{n} is exponentially suppressed for large n. In other words, for large n, the landscape of x(θ)^{n} becomes similar to that of a delta function: it is zero for all θ except multiples of π. Hence, for large n, it becomes difficult to train the unitary U(θ) because the gradient vanishes for most θ. This is just an illustrative example, but the issue is general. Generally speaking, for large n, the function C_{1} has a sharp gradient near its global minima, and the gradient vanishes far away from these minima. Ultimately this limits C_{1}'s utility as a training function for large n.
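The vanishing gradient can be made explicit with a short sketch (our illustration) evaluating C_{1} = 1 − x(θ)^{n} and its θ-derivative for this product ansatz:

```python
import numpy as np

# Dephased purity of R_X(theta)|0> and its derivative in theta.
x = lambda t: (1 + np.cos(t) ** 2) / 2
dx = lambda t: -np.cos(t) * np.sin(t)

theta = np.pi / 4                    # here x(theta) = 0.75
for n in (2, 10, 50):
    c1 = 1 - x(theta) ** n           # Tr(rho^2) = 1 for this pure product state
    grad = -n * x(theta) ** (n - 1) * dx(theta)
    print(f"n={n:2d}  C1={c1:.6f}  |dC1/dtheta|={abs(grad):.2e}")
```

Already at n = 50 the gradient magnitude at θ = π/4 is of order 10^{−5}, illustrating the flattening of the landscape.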
In contrast, C_{2} does not suffer from this issue. For the example in the previous paragraph,
which is independent of n. So for this example the gradient of C_{2} does not vanish as n increases, and hence C_{2} can be used to train θ. More generally, the landscape of C_{2} is less barren than that of C_{1} for large n. We can argue this in particular for states ρ that have low rank or low entropy. The second term in Eq. (8), which is the term that provides the variability with α, does not vanish even for large n, since (as shown in Appendix B of SM):
Here, \(H(\rho ) = - {\mathrm{Tr}}(\rho \log _2\rho )\) is the von Neumann entropy, and r is the rank of ρ. So as long as ρ has low entropy or low rank, the second term in C_{2} will not vanish. Note that a similar bound does not exist for the second term in C_{1}, which does tend to vanish for large n.
Optimization methods
Finding α_{opt} in Eq. (6) is a major component of VQSD. While many works have benchmarked classical optimization algorithms (e.g., ref. ^{42}), work on the particular case of optimization for variational hybrid algorithms^{43} is limited and needs further study.^{44} Both gradient-based and gradient-free methods are possible, but gradient-based methods may not work as well with noisy data. Additionally, ref. ^{26} notes that the gradients of a large class of circuit ansätze vanish when the number of parameters becomes large. These and other issues (e.g., sensitivity to initial conditions, number of function evaluations) should be considered when choosing an optimization method.
In our preliminary numerical analyses (see Appendix B in SM), we found that the Powell optimization algorithm^{45} performed best on both the quantum computer and simulator implementations of VQSD. This derivative-free algorithm performs a bidirectional search along each parameter direction using Brent's method. Our studies showed that Powell's method performed best in terms of convergence, sensitivity to initial conditions, and number of correct solutions found. The implementation of Powell's algorithm used in this paper can be found in the open-source Python package SciPy Optimize.^{46} Finally, Appendix B of SM shows how our layered ansatz for U_{p}(α), as well as proper initialization of U_{p}(α), helps mitigate the problem of local minima.
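As a minimal illustration of this optimization loop (a classical simulation with our own toy single-qubit ansatz, not the paper's hardware experiment), one can hand a C_{1}-style cost to SciPy's Powell method:

```python
import numpy as np
from scipy.optimize import minimize

plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho = np.outer(plus, plus)                  # |+><+|, the state to diagonalize

def ry(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])      # single-qubit Y rotation

def cost(params):
    # C_1 = Tr(rho^2) - Tr(Z(U rho U^dag)^2), with Tr(rho^2) = 1 here
    rho_t = ry(params[0]) @ rho @ ry(params[0]).T
    return 1.0 - float(np.sum(np.diag(rho_t) ** 2))

result = minimize(cost, x0=[0.1], method="Powell")
print(result.fun)  # close to zero: R_Y(pi/2) rotates |+> onto a basis state
```

Here the classical optimizer plays the role of the outer loop, while `cost` stands in for the quantum evaluation of C_{1}.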
Data availability
Data generated and analyzed during the current study are available from the corresponding author upon reasonable request.
Code availability
The code used to generate some of the examples presented here and in the Supplementary Material can be accessed from ref. ^{32}.
Change history
02 August 2019
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
References
1. Shor, P. W. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM Rev. 41, 303–332 (1999).
2. Lloyd, S., Mohseni, M. & Rebentrost, P. Quantum principal component analysis. Nat. Phys. 10, 631–633 (2014).
3. Harrow, A. W., Hassidim, A. & Lloyd, S. Quantum algorithm for linear systems of equations. Phys. Rev. Lett. 103, 150502 (2009).
4. Rebentrost, P., Steffens, A., Marvian, I. & Lloyd, S. Quantum singular-value decomposition of nonsparse low-rank matrices. Phys. Rev. A 97, 012327 (2018).
5. Leyton, S. K. & Osborne, T. J. A quantum algorithm to solve nonlinear differential equations. arXiv:0812.4423, https://arxiv.org/abs/0812.4423 (2008).
6. Berry, D. W. High-order quantum algorithm for solving linear differential equations. J. Phys. A 47, 105301 (2014).
7. Farhi, E., Goldstone, J. & Gutmann, S. A quantum approximate optimization algorithm. arXiv:1411.4028, https://arxiv.org/abs/1411.4028 (2014).
8. Peruzzo, A. et al. A variational eigenvalue solver on a photonic quantum processor. Nat. Commun. 5, 4213 (2014).
9. Kandala, A. et al. Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets. Nature 549, 242 (2017).
10. Berry, D. W., Childs, A. M., Cleve, R., Kothari, R. & Somma, R. D. Simulating Hamiltonian dynamics with a truncated Taylor series. Phys. Rev. Lett. 114, 090502 (2015).
11. Preskill, J. Quantum computing and the entanglement frontier. arXiv:1203.5813, https://arxiv.org/abs/1203.5813 (2012).
12. Harrow, A. W. & Montanaro, A. Quantum computational supremacy. Nature 549, 203 (2017).
13. Bravyi, S., Smith, G. & Smolin, J. A. Trading classical and quantum computational resources. Phys. Rev. X 6, 021043 (2016).
14. Higgott, O., Wang, D. & Brierley, S. Variational quantum computation of excited states. arXiv:1805.08138, https://arxiv.org/abs/1805.08138 (2018).
15. Endo, S., Jones, T., McArdle, S., Yuan, X. & Benjamin, S. Variational quantum algorithms for discovering Hamiltonian spectra. arXiv:1806.05707, https://arxiv.org/abs/1806.05707 (2018).
16. Johnson, P. D., Romero, J., Olson, J., Cao, Y. & Aspuru-Guzik, A. QVECTOR: an algorithm for device-tailored quantum error correction. arXiv:1711.02249, https://arxiv.org/abs/1711.02249 (2017).
17. Romero, J., Olson, J. P. & Aspuru-Guzik, A. Quantum autoencoders for efficient compression of quantum data. Quantum Sci. Technol. 2, 045001 (2017).
18. Khoshaman, A., Vinci, W., Denis, B., Andriyash, E. & Amin, M. H. Quantum variational autoencoder. Quantum Sci. Technol. 4, 014001 (2018).
19. Li, Y. & Benjamin, S. C. Efficient variational quantum simulator incorporating active error minimization. Phys. Rev. X 7, 021050 (2017).
20. Kokail, C. et al. Self-verifying variational quantum simulation of the lattice Schwinger model. Nature 569, 355 (2019).
21. Khatri, S. et al. Quantum-assisted quantum compiling. Quantum 3, 140 (2019).
22. Li, H. & Haldane, F. D. M. Entanglement spectrum as a generalization of entanglement entropy: identification of topological order in non-abelian fractional quantum Hall effect states. Phys. Rev. Lett. 101, 010504 (2008).
23. Giovannetti, V., Lloyd, S. & Maccone, L. Quantum random access memory. Phys. Rev. Lett. 100, 160501 (2008).
24. Pearson, K. On lines and planes of closest fit to systems of points in space. Lond., Edinb., Dublin Philos. Mag. J. Sci. 2, 559–572 (1901).
25. Trefethen, L. N. & Bau, D. Numerical Linear Algebra (SIAM, Philadelphia, PA, 1997).
26. McClean, J. R., Boixo, S., Smelyanskiy, V. N., Babbush, R. & Neven, H. Barren plateaus in quantum neural network training landscapes. Nat. Commun. 9, 4812 (2018).
27. Grant, E., Wossnig, L., Ostaszewski, M. & Benedetti, M. An initialization strategy for addressing barren plateaus in parametrized quantum circuits. arXiv:1903.05076, https://arxiv.org/abs/1903.05076 (2019).
28. Vatan, F. & Williams, C. Optimal quantum circuits for general two-qubit gates. Phys. Rev. A 69, 032315 (2004).
29. Baumgratz, T., Cramer, M. & Plenio, M. B. Quantifying coherence. Phys. Rev. Lett. 113, 140401 (2014).
30. Buhrman, H., Cleve, R., Watrous, J. & De Wolf, R. Quantum fingerprinting. Phys. Rev. Lett. 87, 167902 (2001).
31. Gottesman, D. & Chuang, I. Quantum digital signatures. arXiv:quant-ph/0105032, https://arxiv.org/abs/quant-ph/0105032 (2001).
32. VQSD source code. https://github.com/rmlarose/vqsd.
33. Smith, R. S., Curtis, M. J. & Zeng, W. J. A practical quantum instruction set architecture. arXiv:1608.03355, https://arxiv.org/abs/1608.03355 (2016).
34. Hastings, M. B. An area law for one-dimensional quantum systems. J. Stat. Mech.: Theory Exp. 2007, 08024 (2007).
35. Bauer, B. & Nayak, C. Area laws in a many-body localized state and its implications for topological order. J. Stat. Mech.: Theory Exp. 2013, 09005 (2013).
36. Grover, T. Certain general constraints on the many-body localization transition. arXiv:1405.1471, https://arxiv.org/abs/1405.1471 (2014).
37. Cincio, L., Subaşı, Y., Sornborger, A. T. & Coles, P. J. Learning the quantum algorithm for state overlap. New J. Phys. 20, 113022 (2018).
38. Jones, T. & Benjamin, S. C. Quantum compilation and circuit optimisation via energy dissipation. arXiv:1811.03147, https://arxiv.org/abs/1811.03147 (2018).
39. Tang, E. Quantum-inspired classical algorithms for principal component analysis and supervised clustering. arXiv:1811.00414, https://arxiv.org/abs/1811.00414 (2018).
40. Garcia-Escartin, J. C. & Chamorro-Posada, P. Swap test and Hong-Ou-Mandel effect are equivalent. Phys. Rev. A 87, 052330 (2013).
41. Smith, G. et al. Quantifying coherence and entanglement via simple measurements. arXiv:1707.09928, https://arxiv.org/abs/1707.09928 (2017).
42. Rios, L. M. & Sahinidis, N. V. Derivative-free optimization: a review of algorithms and comparison of software implementations. J. Glob. Optim. 56, 1247–1293 (2013).
43. Guerreschi, G. G. & Smelyanskiy, M. Practical optimization for hybrid quantum-classical algorithms. arXiv:1701.01450, https://arxiv.org/abs/1701.01450 (2017).
44. McClean, J. R., Romero, J., Babbush, R. & Aspuru-Guzik, A. The theory of variational hybrid quantum-classical algorithms. New J. Phys. 18, 023023 (2016).
45. Powell, M. J. D. A fast algorithm for nonlinearly constrained optimization calculations. In Numerical Analysis, Lecture Notes in Mathematics (ed. Watson, G. A.) 144–157 (Springer, Berlin, 1978).
46. SciPy optimization and root finding. https://docs.scipy.org/doc/scipy/reference/optimize.html (2018).
Acknowledgements
We thank Rigetti for providing access to their quantum computer. The views expressed in this paper are those of the authors and do not reflect those of Rigetti. R.L., É.O.N.J., and A.T. acknowledge support from the U.S. Department of Energy through a quantum computing program sponsored by the LANL Information Science & Technology Institute. R.L. acknowledges support from an Engineering Distinguished Fellowship through Michigan State University. L.C. was supported by the U.S. Department of Energy through the J. Robert Oppenheimer fellowship. P.J.C. was supported by the LANL ASC Beyond Moore's Law project. L.C. and P.J.C. were also supported by the LDRD program at Los Alamos National Laboratory, by the U.S. Department of Energy (DOE), Office of Science, Office of Advanced Scientific Computing Research, and by the U.S. DOE, Office of Science, Basic Energy Sciences, Materials Sciences and Engineering Division, Condensed Matter Theory Program.
Author information
Contributions
R.L., A.T., and L.C. implemented the algorithms and performed numerical analysis. L.C. and P.J.C. designed the project. P.J.C. proposed the cost function and proved the analytical results. R.L., A.T., É.O.N.J., L.C., and P.J.C. contributed to data analysis, as well as writing and editing the final paper.
Corresponding author
Correspondence to Lukasz Cincio.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
A correction to this article is available online at https://doi.org/10.1038/s41534-019-0178-3.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
LaRose, R., Tikku, A., O'Neel-Judy, É. et al. Variational quantum state diagonalization. npj Quantum Inf 5, 57 (2019). doi:10.1038/s41534-019-0167-6