Sign-problem free quantum stochastic series expansion algorithm on a quantum computer

A quantum implementation of the Stochastic Series Expansion (SSE) Monte Carlo method is proposed, and is shown to offer significant advantages over classical implementations of SSE. In particular, for problems where classical SSE encounters the sign problem, the cost of implementing a Monte Carlo iteration scales only linearly with system size in quantum SSE, while it may scale exponentially with system size in classical SSE. In cases where classical SSE can be efficiently implemented, quantum SSE still offers an advantage by allowing for more general observables to be measured.


INTRODUCTION
Quantum Monte Carlo methods have evolved to be indispensable in the study of strongly correlated many-body systems where the interplay between competing interactions result in quantum phases that are not observed in their non-interacting counterparts. In the study of strongly correlated systems, computational methods are vital, as standard analytical techniques based on single particle picture or perturbation theory are often rendered ineffective. Among the different numerical techniques, quantum Monte Carlo methods have proven to be very powerful for their ability to simulate a wide range of realistic microscopic Hamiltonians on relatively large system sizes at all temperatures and in all dimensions.
The phenomenal development of quantum information theory over the past two decades and the advent of quantum computers in the past couple of years have significantly broadened the potential for simulating strongly interacting quantum many-body systems. The ability to represent superposition of states directly on a quantum computer promises exponential speedup of quantum algorithms over their classical counterparts. Algorithms that exploit quantum hardware to speed up simulations of the thermal Gibbs state of many-body systems have previously been explored in refs. 1-8 , but we are still at a nascent stage of harnessing the power of quantum computation in studying correlated systems.
The Stochastic Series Expansion (SSE) 9-12 method is a widely used Quantum Monte Carlo (QMC) method for simulating models of quantum many-body systems. It is based on sampling the series expansion of expðÀβHÞ up to a sufficiently high order. A significant advantage of SSE is that expectation values that are obtained via this method are exact, up to statistical errors. Other approaches include the world line method [13][14][15][16] , and the Density Matrix Renormalization Group (DMRG) method 16,17,18 . In this article, we propose an implementation of SSE on a quantum computer and compare its efficiency to conventional SSE on a classical computer. We refer to the former as quantum SSE and the latter as classical SSE and demonstrate several advantages that quantum SSE has over classical SSE. In particular, it will be argued that the no-branching requirement 19 from classical SSE can be relaxed in quantum SSE, which leads to important consequences for the simulation of many-body systems. This quantum advantage stems from the ability of quantum processors to prepare nontrivial superpositions of quantum states. In the subsequent discussion, it is assumed the simulated object is a quantum many-body system with N particles and k-local interactions.
First, lifting the no-branching requirement in quantum SSE means that we are no longer limited to basis states that permit a diagonal representation. This has the effect of allowing more general quantum observables to be measured in quantum SSE.
The second consequence is that quantum SSE always leads to nonnegative weights, which are directly sampled from the measurement probabilities of quantum circuits that scale linearly with system size. By comparison, classical SSE by itself cannot guarantee nonnegative weights without imposing strict conditions such as no-branching. This implies that quantum computers may be able to simulate many-body systems currently inaccessible to classical SSE methods due to the famous sign problem 20,21 . Notably, the Quantum Metropolis Sampling (QMS) 3 algorithm also avoids the sign problem by repeated use of the quantum phase estimation algorithm 22 . However, quantum phase estimation requires deep quantum circuits, and approximates the unitary operation U ¼ expðiHtÞ via the Suzuki-Trotter decomposition 23 . This necessarily introduces a systematic error, unlike exact QMC methods such as SSE, which does not involve Trotterization. The same argument applies to any putative quantum version of Trotter-based algorithms such as the World Line QMC. Although the Continuous Time (CT) World Line method 16,24 avoids Trotterization, the simpler structure of the SSE operator string compared to CT QMC makes it better suited for implementation on quantum architectures, especially on near term, noisy quantum processors which require simpler quantum circuits in general.
This article is structured as follows: First, we introduce the broad ideas underlying the SSE QMC method. Second, we will describe a possible SSE implementation on a quantum computer, first for a simpler special case, then for the more general case. Third, we simulate the quantum SSE algorithm for one dimensional spin chains and compare it with exact results. Finally, we discuss how the no-branching condition and the sign problem affects classical SSE, and evaluate the advantages that quantum SSE offers over classical SSE. 1

Stochastic series expansion
The canonical SSE is a finite temperature quantum Monte Carlo algorithm based on the stochastic sampling of diagonal matrix elements in the Taylor series expansion of the density matrix in a suitably chosen basis 19 . For efficient implementation, it is necessary to decompose the Hamiltonian as H = −∑ b H b . The partition function is written as where f α j ig is a complete set of states, b denotes the operator string b n …b 1 and M is some sufficiently large cutoff in the expansion power. Assuming that each term α h jH bn H b1 α j i is nonnegative, the algorithm proceeds by randomly sampling the configuration space C :¼ fðn; b; αÞ 8 n; b; αg. The thermal expectation value of any operator, O, is given by where p C :¼ β n n! α h jH bn H b1 α j i=Z. Finding 〈f(O, C)〉 for any O is not necessarily trivial, but when O is diagonal in the chosen basis, f ðO; CÞ :¼ α h jO α j i is an unbiased estimator that is readily evaluated.
The SSE method is numerically exact up to statistical errors because the truncation of the Taylor series expansion of the density matrix does not introduce any systematic error. While in principle, operator strings for every expansion power n should be considered, their weight decreases rapidly with β n n! for large values of n ≫ β, and is not reachable in practice when a finite number of MC steps are performed. A finite simulation always has some maximum operator string length that can be found empirically via the Monte Carlo updating scheme described as follows: The maximum expansion power depends on the values of the inverse temperature, β, and the Hamiltonian parameters, and is determined empirically by dynamically adjusting the operator string length during the equilibration stage of the simulation. M is then set as some higher number that is never reached by the finite simulation. Importantly, because no operator string length sampled ever reaches M, the truncated terms do not contribute to the final statistics of the simulation so there is no systematic contribution to the error, i.e., the only errors are statistical. In contrast, Trotterization with a finite number of steps always contribute errors regardless of the number of samples obtained, so the error contribution is inherently systematic.

SSE on a quantum computer
We now propose a method of implementing the SSE Monte Carlo algorithm on a quantum computer. The classical implementation of the SSE method requires that H b satisfy a no-branching condition in order for the algorithm to be efficient (see Discussion). However, this requirement is no longer necessary on quantum computers as they naturally allow for superpositions of a large number of states. We can therefore choose a more convenient decomposition. In general, it is always possible to decompose any Hamiltonian as a sum of products of Pauli matrices: where in general b i = 0, 1, 2, 3 and σ 0 :¼ 1, σ 1 := σ x , σ 2 := σ y and σ 3 := σ z . Note that in this notation, we used the upper index to label the ith Pauli matrix in the product. This is different from the lower index used to label the operator string b = b n …b 1 in H bn H b1 . In order to illustrate the quantum SSE method, we first consider a special case where the operators h bi mutually commute, for example, b i = 0, 1 (1 and σ x ). Specifically, we consider the Hamiltonian Such problems can already be nontrivial. For instance, in ref. 25 , the Hamiltonian (4) was considered as an example of a many-body system that is NP hard to simulate for certain lattice configurations. We , which ensures that H b is always positive semidefinite. We can readily verify that {H b } forms a set of mutually commuting semidefinite operators: where we used the fact that b i = 0, 1 and σ ðiÞ b i can only be either be 1 or σ x and they mutually commute. The positive semidefiniteness and commutativity of H b guarantees that a product of such operators H bn H b1 is also positive semidefinite (see Supplementary Information under "Positive semidefiniteness: special case"). Making H b positive semidefinite is equivalent to adding a constant to the Hamiltonian H ! H þ k1 where k: = ∑ b |h b |, such that the total Hamiltonian is also positive semidefinite. With the positivity of α h jH bn H b1 α j i assured, we need a method of sampling the relative weight of a given configuration (n, b, α). In classical SSE, positive weights cannot be assured except for some special choices of α j i (see Discussion). Here, we describe a quantum implementation which ensures positivity for arbitrary α j i. A quantum algorithm relies on defining appropriate states and unitary operations that can be implemented efficiently on a quantum computer. In the following, we outline how to develop an efficient estimator of the relative weight of a configuration. Let us consider a state with (N + n) qubits of the form: , N is the number of particles in the system we are simulating, n is the expansion power in SSE, A = A 1 …A N denotes the simulated system, and A i with i ∈ {1, 2, …N} denotes the ith particle in the simulated system. Observe that and 1 A . We define the following controlled unitary operation: By calculating the expectation value hψ in jU A;Bi jψ in i, one can evaluate an estimator for the relative weight The spectrum of H bi =jh bi j lies in the range [0, 2] and so the spectrum of H bn H b1 =jh bn h b1 j is within [0, 2 n ]. The projected amplitude is therefore not necessarily exponentially small due to the 1/2 n factor even for relatively large expansion orders n. In order to find a bounded error estimate for q(n, b, α), quantum circuits are repeatedly prepared and then measured in the computational basis t times. An example of the type of quantum circuit being K.C. Tan et al. measured is shown in Fig. 1b. Each of these measurements yield a classical bitstring that is a series of 0s or 1s and out of these t bitstrings, we count the number of bitstrings that are all 0s. The ratio to the total number of samples t estimates q(n, b, α). Since this is a binary distribution, the sample variance scales with~1/t, so estimating q(n, b, α) to target precision ε requires t~1/ε 2 samples in general. In this way, the configuration weights can be estimated to any target degree of numerical precision.
Alternatively, we can also perform a quantum subroutine called amplitude estimation 26 (see Supplementary Information under "Amplitude estimation"). In general, to estimate the probability p to any desired precision ϵ with success probability 1 − δ, the subroutine needs to be invoke certain unitary operations a total of t = t(ϵ, δ) times, where t(ϵ, δ) only depends on the desired precision ϵ and success probability 1 − δ. In this case, the variance scales with~1/t 2 , where t is now the number of times the unitary operations are applied rather than the number of independent samples.

Stochastic sampling of operator space
Once the relative weight of a configuration C is sampled, the Monte Carlo simulation proceeds by stochastically sampling the operator space via the Metropolis method. This consists of randomly selecting some new configuration C 0 , and then accepting the newly chosen configuration with probability P accept ðC ! C 0 Þ :¼ min WðC 0 Þ=WðCÞ; 1 ½ where W(C) is the weight of a configuration C = (n, b, α). It is given by the following expression where q(n, b, α) is the probability sampled in Eq. 9. In the Metropolis algorithm, it is implicitly assumed that the probability of selecting C 0 when the current configuration is C is the same as the probability of selecting C when the current configuration is C 0 , i.e., P select ðC ! C 0 Þ ¼ P select ðC 0 ! CÞ. Each of the variables n, b, α is updated separately using the ratio WðC 0 Þ=WðCÞ as the probability of accepting an update. In quantum SSE, the update probabilities are determined by ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi qðn 0 ; b 0 ; α 0 Þ=qðn; b; αÞ p , which are obtained by sampling the probabilities q(n, b, α) from the quantum circuit (see Eq. 9). Explicit expressions for the update probabilities are provided in the Supplementary Information under "Metropolis acceptance weights". The calculation of the update probabilities does not involve the 2 n multiplicative factor in Eq. 10, so an exponential blowup in the sampling error does not occur.

Quantum implementation of SSE for general Hamiltonians
In the section "SSE on a quantum computer" we demonstrated a special case implementation of quantum SSE where the quantity α h jH bn H b1 α j i is guaranteed to be non-negative. For general Hamiltonians, this may not always be possible because the operator H bn H b1 may not be Hermitian. In this section, we discuss an implementation for more general Hamiltonians.
Recall the expression for the expectation value of an operator: hOi ¼ X  Therefore, in order to implement quantum SSE, we only need to sample the real portion of α h jH bn H b1 α j i and ensure that it is nonnegative. We show that this can be done by adding a sufficiently large constant to the Hamiltonian (see Supplementary Information under "Positive semidefiniteness: general case").
Suppose M ≥ n is the cutoff in the expansion power (see Eq. 1).
For a fixed M, let . We note that this is an unequal superposition of 2 unitary operations that depends on the cutoff value M.
We introduce the state where As before, by introducing appropriate unitary operators, the relative weight of the configurations can be evaluated. In addition to U A;Bi , we further define the unitary V AB,C , which is controlled by qubit C: For any given expansion power n, we can verify the expression: which allows us to define an estimator for the relative weight: Re αA h jHb n H b 1 αA j i ð2M þ 1Þ n jhb n hb 1 j 2 (15) Note that the spectrum of H bi =jh bi j is in the range [0, 2M + 1] so the absolute value of Re α A h jH bn H b1 α A j i=jh bn h b1 j is within the range [0, (2M + 1) n ]. The configuration weights, and hence Re α A h jH bn H b1 α A j i, are always nonnegative. From the above arguments, we see that the configurations can be sampled by measuring the probability q(n, b, α). The Metropolis portion of the simulation then proceeds as before, where the acceptance probability depends on the ratio ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi qðn 0 ; b 0 ; α 0 Þ=qðn; b; αÞ p . Note that the above argument finds a sufficiently large constant to add to the Hamiltonian to avoid negative weights. This constant may be too large for many specific problems. We expect that the minimum constant that is required can be optimized on a case by case basis. We also highlight that adding a constant to the Hamiltonian is only necessary when the initial Hamiltonian contains negatively weighted configurations. The above arguments show that quantum SSE is able to avoid negative weights for general Hamiltonians and arbitrary basis states α j i, which is not possible in general for classical SSE except for special cases.
Example: 1D antiferromagnetic spin-1/2 chain To demonstrate the proposed algorithm, we simulated the Hamiltonian of one dimensional periodic spin chains with N = 3, 4, 5 sites. The algorithm was implemented using the quantum simulation toolkit Qiskit 27 and the measured observables were compared with results from exact diagonalization. The Hamiltonian with antiferromagnetic exchange is given by where J > 0 and b(i) is the i-th site of the b-th bond (see Fig. 1a). The classical SSE implementation violates the no-branching condition and suffers from the sign problem if we do not use the eigenstates of σ x to construct the basis states, α j i. In quantum SSE this constraint no longer exists as the no-branching requirement is lifted and the string of bond operators have positive-semidefinite weights. To illustrate this, we choose the basis states α j i to be product states of σ z eigenvectors (i.e., products of " j i; # j i) rotated by Hadamard gates and non-Clifford phase gates T ¼ 100e iπ=4 À Á . In general, quantum circuits with non-Clifford gates are known to be difficult to simulate classically 28,29 .
The energy calculations from quantum SSE as a function of the number of Metropolis iterations are shown Fig. 1c-e for site numbers N = 3, 4, 5 respectively at β = 1. For all the cases considered, it was observed that the mean energy computed via quantum SSE converges towards the exact finite temperature energy of the system obtained from exact diagonalization, thus demonstrating the validity of the algorithm.

DISCUSSION
Recall that implementing SSE Monte Carlo requires each term α h jH bn H b1 α j i in the expansion to be nonnegative. In general, this cannot be always guaranteed except for special cases. This results in the infamous sign problem 20,21 which severely restricts the applicability of QMC methods.
In the classical implementation of SSE, the sign problem arises from the so-called no-branching condition. This is the requirement that H b α j i / α 0 j i, where α 0 j i is also a basis vector. In other words, we always have to use a decomposition of H = ∑ b H b such that H b does not create superpositions of basis states. For any given basis, this means that every H b satisfying the no-branching requirement can be classified as a diagonal operator satisfying H b α j i / α j i for every α, or an off-diagonal update satisfying H b α j i / α 0 j i where α ≠ α 0 for some α.
A diagonal update can always be made positive by adding a sufficiently large constant. This is because if H b is a diagonal update, then H 0 b α j i :¼ ðH b þ k1Þ α j i / α j i is also a diagonal update.
On the other hand, we see that if H b is an off-diagonal update, adding a constant will necessarily create a superposition of basis states, since ðH b þ k1Þ α j i / h b;α α 0 j i þ k α j i where α ≠ α 0 . This means that we cannot guarantee that H b is always positive semidefinite for off-diagonal updates, without violating the nobranching condition. This in turn implies that α h jH bn H b1 α j i is not necessarily positive, which is the sign problem.
From the above, we see that the sign problem exists because of the no-branching requirement. If we avoid the sign problem by lifting no-branching requirement, one will have to keep track of all the off-diagonal elements of H bn H b1 α j i. In the worst case, the computational resources required to keep track of an arbitrary superposition of basis states is of the order OðexpðNÞÞ, where N is the number of particles.
The primary benefit of the quantum SSE method is that it does not require the no-branching condition, as quantum computers naturally allows for the creation of superpositions of quantum states. This allows us to sample the relative weights of a given configuration directly, without needing to keep track of all the offdiagonal elements. By lifting the no-branching requirement, we can always ensure that the relative weights are nonnegative, thus also avoiding the sign problem. We have shown this for the special case where the Hamiltonian can be decomposed into products of 1 or σ x , as well as for more general Hamiltonians.
We now discuss several advantages that quantum SSE offers over classical SSE. The first advantage is that a much wider range of observables can be measured using quantum SSE. Both quantum and classical SSE compute statistical averages most easily when the observable O is diagonal in the basis α j i. Unlike classical SSE, however, quantum SSE does not require the nobranching condition so there is no longer any limitations on the choice of basis states α j i. For any given operator O, we can now choose any basis f α j ig that diagonalizes the observable O, as long as the the basis state α j i can be efficiently prepared on the quantum computer. The corresponding estimator for the observable O is then f ðO; CÞ :¼ α h jO α j i. Consequently, quantum SSE allows more general quantum observables to be measured compared to classical SSE. An example of this is when O ¼ ϕ j i ϕ h j for a known quantum state ϕ j i. In this case, O is the projector onto the state ϕ j i and hOi ¼ ϕ h je ÀβH 0 =Z ϕ j i is the overlap between ϕ j i and the thermal state e ÀβH 0 =Z. In general, finding the state overlap is not easily implementable using classical SSE. In the Shastry-Sutherland model [30][31][32] for instance, this can be used to directly verify that the ground state is a product of singlet pairs. This is achieved by by letting ϕ j i be a product of singlets and then sampling the expectation values using quantum SSE.
A second advantage of quantum SSE is the low circuit complexity required to estimate the configuration weight. Consider the computational cost of implementing quantum SSE for the special case discussed in the section "SSE on a quantum computer". Here, sampling α A h jH bn H b1 α A j i given some operator string b requires a total of n unitary operations to be performed, multiplied by the number of samples t for any target numerical precision. Combine this with the fact that 〈n〉, the average expansion power in SSE, is proportional to the system energy and scales with βN, and the end result is that the total number of quantum gates required to sample the configuration weight requires OðnÞ $ OðNÞ number of operations, i.e., it scales linearly with system size. A similar argument can also be made if we employ the amplitude estimation algorithm (see Supplementary Information under "Amplitude estimation"). An analysis for general Hamiltonians will lead to the same conclusion, since essentially the same set of unitary operations are performed, except with an additional control operation. We therefore expect the size of the quantum circuits in quantum SSE to scale with OðNÞ in general.
Compare this to the classical SSE algorithm. In classical SSE, the cost of sampling the configuration weight is OðNÞ only when the no-branching condition is satisfied and there is no sign problem. As previously discussed, a consequence of this is that the no-branching condition fixes the basis α j i, which limits the kinds of observables that classical SSE can measure. More generally, without the nobranching condition, classical SSE encounters the sign problem and the computational cost of avoiding negative weights is generally $ OðexpðNÞÞ. By comparison, the complexity of the quantum circuit is OðNÞ if the basis state α j i is generated by a circuit of size OðNÞ, or poly(N) if α j i is generated by a circuit of size poly(N). Taking into account all such possible basis states, we expect the improvement in terms of computational complexity to be exponential.
The third advantage of quantum SSE is that it can be efficiently parallelized to a short depth quantum circuit. Short depth circuits belong to the quantum complexity class QNC, which is the class of quantum circuits with polylogarithmic depth. In order to see this, observe that the unitary operations U A;Bi and V AB,C are composed of layers of controlled-Pauli operations. In general, such operations are known to belong to the quantum complexity class QNC 133 , which has depth complexity Oðlog NÞ. This means that the depth complexity of quantum SSE belongs to QNC 1 for any α j i generated in logarithmic depth, or more generally QNC if α j i is generated in polylogarithmic depth. In particular, the example in Fig. 1 belongs to QNC 1 . Note that this circuit contains non-Clifford T gates and that non-Clifford gates are not efficiently simulable on classical computers 28,29 . Such short depth quantum circuits are also especially well suited for implementation on NISQ processors 34 with limited coherence times. To the best of our knowledge, we are not aware of a similar result for classical SSE.
In summary, we proposed a quantum implementation of the SSE Monte Carlo algorithm and compared it to its classical counterpart. It was shown that the cost of implementing a single Monte Carlo update in quantum SSE scales linearly with the number of particles N. We compare this to the classical implementation of SSE, where certain many-body systems exhibit the sign problem and incurs an additional cost that scales exponentially with N. The quantum algorithm was able to avoid this by ensuring that the weight of the configuration is always positive, regardless of the chosen basis. This suggests that quantum computers can significantly speed up the simulation of complex quantum many-body systems. Even when the sign problem is not present in classical SSE, quantum SSE can still be advantageous, since it allows for more general observables to be measured. This was demonstrated via a numerical simulation of a 1D spin-1/2 chain using the quantum SSE algorithm in combination with a basis that is typically hard to implement in classical SSE. In all cases considered, it was shown that quantum SSE converged to the correct results obtained from exact diagonalization.
It is known that estimating the ground state energy of a k-local Hamiltonian is QMA complete 25,35,36 . Here, we have shown that quantum SSE can perform bounded error estimates of the configuration weights in polynomial time, which is not always possible in classical SSE. While this quantum speedup removes a major bottleneck in classical SSE, it does not necessarily imply an exponentially fast rate of convergence to the ground state energy, nor that QMA complete problems can be solved in polynomial time in general. Nonetheless, the quantum SSE algorithm shows that quantum computers are promising tools for accelerating the SSE Monte Carlo simulation in many scenarios. This may provide a pathway for probing the quantum properties of many-body systems that are currently inaccessible to existing classical techniques.

METHODS
We describe in detail how the simulation in Section "Example: 1D antiferromagnetic spin chain" was performed. After adding identity operators to the bond operators to make them positive-semidefinite, the effective Hamiltonian for J = 1 in the quantum SSE simulation is where H b ¼ 1 À σ bð1Þ x σ bð2Þ x . The controlled unitary operator U b A;Bi performs the map, that were introduced in the section "SSE on a quantum computer". The expectation value of operator strings H b1 H b2 H bn -needed to evaluate the partition function-is related to U b A;Bi according to (|h b | = 1 ∀ b since J = 1): An example quantum circuit performing this measurement is shown in Fig. 1b. In this example circuit, there are six physical qubits where qubits q 0 , q 1 , q 2 are ancillae, and qubits q 3 , q 4 , q 5 represent the system. The quantum circuit determines the expectation value of string of U b3 A;q 2 U b2 A;q 1 U b1 A;q 0 for a three site periodic system when n = 3. We will describe in detail the steps involved in the circuit in Fig. 1b. In Step I, the ancilla qubits q 0 , q 1 and q 3 are prepared in the states À q 0 ; À q 1 ; À q 2 respectively, using Hadamard and Pauli X gates. In Step II the system qubits q 3 , q 4 and q 5 representing the spin-1/2 sites of the physical spin chain are prepared in some initial state α j i A by applying either the identity or the Pauli X-gate followed by a Hadamard and the non-Clifford T-gate. This generates basis states via a constant depth, non-Clifford quantum circuit. In Step III the unitary operations U b3 A;q 2 ; U b2 A;q 1 and U b1 A;q 0 are completed by sequentially applying CNOT operations onto α j i A À q 0 À q 1 À q 2 .