Classical variational simulation of the Quantum Approximate Optimization Algorithm

A key open question in quantum computing is whether quantum algorithms can potentially offer a significant advantage over classical algorithms for tasks of practical interest. Understanding the limits of classical computing in simulating quantum systems is an important component of addressing this question. We introduce a method to simulate layered quantum circuits consisting of parametrized gates, an architecture behind many variational quantum algorithms suitable for near-term quantum computers. A neural-network parametrization of the many-qubit wave function is used, focusing on states relevant for the Quantum Approximate Optimization Algorithm (QAOA). For the largest circuits simulated, we reach 54 qubits at 4 QAOA layers, approximately implementing 324 RZZ gates and 216 RX gates without requiring large-scale computational resources. For larger systems, our approach can be used to provide accurate QAOA simulations at previously unexplored parameter values and to benchmark the next generation of experiments in the Noisy Intermediate-Scale Quantum (NISQ) era.


I. INTRODUCTION
The past decade has seen a fast development of quantum technologies and the achievement of an unprecedented level of control in quantum hardware [1], clearing the way for demonstrations of quantum computing applications for practical uses. However, near-term applications face some of the limitations intrinsic to the current generation of quantum computers, often referred to as Noisy Intermediate-Scale Quantum (NISQ) hardware [2]. In this regime, a limited qubit count and absence of quantum error correction constrain the kind of applications that can be successfully realized. Despite these limitations, hybrid classical-quantum algorithms [3][4][5][6] have been identified as the ideal candidates to assess the first possible advantage of quantum computing in practical applications [7][8][9][10].
The Quantum Approximate Optimization Algorithm (QAOA) [5] is a notable example of variational quantum algorithm with prospects of quantum speedup on nearterm devices. Devised to take advantage of quantum effects to solve combinatorial optimization problems, it has been extensively theoretically characterized [11][12][13][14][15][16], and also experimentally realized on state-of-the-art NISQ hardware [17]. While the general presence of quantum advantage in quantum optimization algorithms remains an open question [18][19][20][21], QAOA has gained popularity as a quantum hardware benchmark [22][23][24][25]. As its desired output is essentially a classical state, the question arises whether a specialized classical algorithm can efficiently simulate it [26], at least near the variational optimum. In this paper, we use a variational parametrization of the many-qubit state based on Neural Network Quantum States (NQS) [27] and extend the method of Ref. [28] to simulate QAOA. This approach trades the need for exact brute force exponentially scaling classical simulation with an approximate, yet accurate, classical variational description of the quantum circuit. In turn, we obtain an heuristic classical method that can significantly expand the possibilities to simulate NISQ-era quantum optimization algorithms. We successfully simulate the Max-Cut QAOA circuit [5,11,17] for 54 qubits at depth p = 4 and use the method to perform a variational parameter sweep on a 1D cut of the parameter space. The method is contrasted with state-of-the-art classical simulations based on low-rank Clifford group decompositions [26], whose complexity is exponential in the number of non-Clifford gates as well as tensor-based approaches [29]. Instead, limitations of the approach are discussed in terms of the QAOA parameter space and its relation to different initializations of the stochastic optimization method used in this work.

A. The Quantum Approximate Optimization Algorithm
The Quantum Approximate Optimization Algorithm (QAOA) is a variational quantum algorithm for approximately solving discrete combinatorial optimization problems. Since its inception in the seminal work of Farhi, Goldstone and Gutmann [5,12], QAOA has been applied to Maximum Cut (Max-Cut) problems. With competing classical algorithms [30] offering exact performance bounds for all graphs, an open question remains -can QAOA perform better by increasing the number of free parameters?
In this work, we study a quadratic cost function [31,32] associated with a Max-Cut problem. If we consider a graph G = (V, E) with edges E and vertices V , the Max- Cut of the graph G is defined by the following operator: where w ij are the edge weights and Z i are Pauli operators. The classical bitstring B that minimizes B| C |B is the graph partition with the maximum cut. QAOA approximates such a quantum state through a quantum circuit of predefined depth p: where |+ is a symmetric superposition of all computational basis states: |+ = H ⊗N |0 ⊗N for N qubits. The set of 2p real numbers γ i and β i for i = 1 . . . p define the variational parameters to be optimized over by an external classical optimizer. The unitary gates defining the parametrized quantum circuit read U B (β) = i∈V e −iβXi and U C (γ) = e −iγC . Optimal variational parameters γ and β are then found through an outer-loop classical optimizer of the following quantum expectation value: It is known that, for QAOA cost operators of the general form C = k C k (Z 1 , . . . , Z N ), the optimal value asymptotically converges to the minimum value: where C p is the optimal cost value at QAOA depth p and B are classical bit strings. With modern simulations and implementations still being restricted to lower p values, it is unclear how large p has to get in practice before QAOA becomes comparable with its classical competition.
In this work we consider 3-regular graphs with all weights w ij set to unity at QAOA depths of p = 1, 2, 4.

B. Classical Variational Simulation
Consider a quantum system consisting of N qubits. The Hilbert space is spanned by the computational basis {|B : B ∈ {0, 1} N } of classical bit strings B = (B 1 , . . . , B N ). A general state can be expanded in this basis as |ψ = B ψ(B) |B . The convention Z i |B = (−1) Bi |B is adopted. In order to perform approximate classical simulations of the QAOA quantum circuit, we use a neural-network representation of the manybody wavefunction ψ(B) associated with this system, and specifically adopt a shallow network of the Restricted Boltzmann Machine (RBM) type: [34][35][36] The RBM provides a classical variational representa-

FIG. 2.
Benchmarking the cost function for 20 qubits. a: The exact variational QAOA landscape at p = 1 of a random 20-qubit instance of a 3-regular graph is presented, calculated using the analytical cost formula (see Appendix A). The optimum is found using a gradient-based optimizer [33] and marked. The restricted cut along the constant-β line and at optimal γ is more closely studied in panel b. b: RBM-based output wavefunctions are contrasted with exact results. c: A similar variational landscape cut is presented at p = 2. Optimal p = 2 QAOA parameters are calculated using numerical derivatives and a gradient-based optimizer. Parameters γ1, β1 and β2 are fixed at their optimal values while the cost function γ2-dependence is investigated. We note that our approach is able to accurately reproduce the increased proximity to the combinatorial optimum associated with increasing QAOA depth p. tion of the quantum state [27]. It is parametrized by a set of complex parameters θ = {a, b, W } -visible biases a = (a 1 , . . . , a N ), hidden biases b = (b 1 , . . . , b N h ) and weights W = (W j,k : j = 1 . . . N, k = 1 . . . N h ). The complex-valued ansatz given in Eq. 5 is, in general, not normalized.
We note that the N -qubit |+ state required for initializing QAOA can always be exactly implemented by setting all variational parameters to 0. That choice ensures that the wavefunction ansatz given in Eq. 5 is constant across all computational basis states, as required. The advantage of using the ansatz given in Eq. 5 as an Nqubit state is that a subset of one-and two-qubit gates can be exactly implemented as mappings between different sets of variational parameters θ → θ . In general, such mapping corresponding to an abstract gate G is found as the solution of the following nonlinear equation: for all bitstrings B and any constant C, if a solution exists. For example, consider the Pauli Z gate acting on qubit i. In that case, Eq. 6 reads e a i Bi = C(−1) Bi e aiBi after trivial simplification. The solution is a i = a i +iπ for C = 1, with all other parameters remaining unchanged. In addition, one can exactly implement a subset of twoqubit gates by introducing an additional hidden unit coupled only to the two qubits in question. Labeling the new unit by c, we can implement the RZZ gate relevant for QAOA. The gate is given as RZZ(φ) = e −iφZiZj ∝ diag(1, e iφ , e iφ , 1) up to a global phase. The replacement rules read: where A(φ) = Arccosh e iφ and C = 2. Derivations of replacement rules for these and other common one and two-qubit gates can be found in Sec. IV. Not all gates can be applied through solving Eq. 6. Most notably, gates that form superpositions belong in this category, including U B (β) = i e −iβXi required for running QAOA. This happens simply because a linear combination of two or more RBMs cannot be exactly represented by a single new RBM through a simple variational parameter change. To simulate those gates, we employ a variational stochastic optimization scheme.
For larger p, extra hidden units introduced when applying U C (γ) at each layer can result in a large number of associated parameters to optimize over that are not strictly required for accurate output state approximations. So to keep the parameter count in check, we insert a model compression step which halves the number of hidden units immediately after applying U C doubles it. Specifically we create an RBM with fewer hidden units and fit it to the output distribution of the larger RBM (output of U C ). Exact circuit placement of compression steps are shown on Fig. 1 and details are provided in Sec. IV. As a result of the compression step, we are able to keep the number of hidden units in our RBM ansatz constant, explicitly controlling the variational parameter count.

C. Simulation results for 20 qubits
In this section we present our simulation results for Max-Cut QAOA on random regular graphs of order N [40][41][42]. In addition, we discuss model limitations and its relation to current state-of-the-art simulations.
QAOA angles γ, β are required as an input of our RBM-based simulator. At p = 1, we base our parameter choices on the position of global optimum that can be computed exactly (see Appendix A). For p > 1, we resort to direct numerical evaluation of the cost function as given in Eq. 1 from either the complete state vector of the system (number of qubits permitting) or from importance-sampling the output state as represented by a RBM. For all p, we find the optimal angles using Adam [33] with either exact gradients or their finite-difference approximations.
We begin by studying the performance of our approach on a 20-qubit system corresponding to the Max-Cut problem on a 3-regular graph of order N = 20. In that case, access to exact numerical wavefunctions is not yet severely restricted by the number of qubits. That makes it a suitable test-case. The results can be found in Fig. 2.
In Fig. 2, we present the cost function for several values of QAOA angles, as computed by the RBM-based simulator. Each panel shows cost functions from one typical random 3-regular graph instance. We observe that cost landscapes, optimal angles and algorithm performance do not change appreciably between different random graph instances. We can see that our approach reproduces variations in the cost landscape associated with different choices of QAOA angles at both p = 1 and p = 2. At p = 1, an exact formula (see Appendix A) is available for comparison of cost function values. We report that, at optimal angles, the overall final fidelity (overlap squared) is consistently above 94% for all random graph instances we simulate.
In addition to cost function values, we also benchmark our RBM-based approach by computing fidelities between our variational states and exact simulations. In Fig. 3 we show the dependence of fidelity on the number of qubits and circuit depth p. While, in general, it is hard to analytically predict the behavior of these fidelities, we nonetheless remark that with relatively small NQS we can already achieve fidelities in excess of 92% for all system sizes considered for exact benchmarks.

D. Simulation results for 54 qubits
Our approach can be readily extended to system sizes that are not easily amenable to exact classical simulation. To show this, in Fig. 4 we show the case of N = 54 10  qubits. This number of qubits corresponds, for example, to what implemented by Google's Sycamore processor, while our approach shares no other implementation details with that specific platform. For the system of N = 54 qubits, we closely reproduce the exact error curve (see Appendix A) at p = 1, implementing 81 RZZ (e −iγZ⊗Z ) gates exactly and 54 RX (e −iβX ) gates approximately, using the described optimization method. We also perform simulations at p = 2 and p = 4 and obtain corresponding approximate QAOA cost function values. At p = 4, we exactly implement 324 RZZ gates and approximately implement 216 RX gates. This circuit size and depth is such that there is no available experimental or numerically exact result to compare against. The accuracy of our approach can nonetheless be quantified using intermediate variational fidelity estimates. These fidelities are exactly the cost functions (see Sec. IV) we optimize, separately for each qubit. In Fig. 4 (panel b) we show the optimal variational fidelities (see Eq. 8) found when approximating the action of RX gates with the RBM wave function. At optimal γ 4 (minimum of p = 4 curve at Fig. 4, panel a), the lowest variational fidelity reached was above 98%, for a typical random graph instance shown at Fig. 4. As noted earlier, exact final states of 54-qubit systems are intractable so we are unable to report or estimate the full many-qubit fidelity benchmark results.
We remark that the stochastic optimization performance is sensitive to choices of QAOA angles away from optimum (see Fig. 4 right). In general, we report that the fidelity between the RBM state (Eq. 5) and the exact N -qubit state (Eq. 2) decreases as one departs from optimal by changing γ and β.
For larger values of QAOA angles, the associated optimization procedure is more difficult to perform, resulting where exact state vectors are intractable for direct comparison. In these simulations, β1, β2 and γ1 are kept at their optimal values. Optimal γ2 value (for given β1, β2, γ1) is shown with a dashed line. We note that the fidelity estimates begin to drop approximately as γ2 increases beyond the optimal value. A similar qubit-by-qubit trend can be noticed across all system sizes and depths p we studied.
in a lower fidelity (see the dark patch in Fig. 4, panel b). We find that optimal angles were always small enough not to be in the low-performance region. Therefore, this model is less accurate when studying QAOA states away from the variational optimum. However, even in regions with lowest fidelities, RBM-based QAOA states are able to approximate cost well, as can be seen in Fig. 2 and Fig. 4.
As an additional hint to the high quality of the variational approximation, we capture the QAOA approximation of the actual combinatorial optimum. A tight upper bound on that optimum was calculated to be C opt = −69 for 54 qubits by directly optimizing an RBM to represent the ground state of the cost operator defined in Eq. 3.

E. Comparison with other methods
In modern sum-over-Cliffords/Metropolis simulators, computational complexity grows exponentially with the number of non-Clifford gates. With the RZZ gate being a non-Clifford operation, even our 20-qubit toy example, exactly implementing 60 RZZ gates at p = 2, is approaching the limit of what those simulators can do [26]. In addition, that limit is greatly exceeded by the larger, 54-qubit system we study next, implementing 162 RZZ gates. State-of-the-art tensor-based approaches [29] have been used to simulate larger circuits but are ineffective in the case of non-planar graphs.
Another very important tensor-based method is the Matrix Product State (MPS) variational representation of the many-qubit state. This is is a low-entanglement representation of quantum states, whose accuracy is controlled by the so-called bond-dimension. Routinely adopted to simulate ground states of one-dimensional systems with high accuracy [43][44][45], extensions of this approach to simulate challenging circuits have also been recently put forward [46]. In Fig. 5, our approach is compared with an MPS ansatz. We establish that for small systems, MPS provides reliable results with relatively small bond dimensions. For larger systems, however, our approach significantly outperforms MPS-based circuit simulation methods both in terms of memory requirements (fewer parameters) and overall runtime. This is to be expected in terms of entanglement capacity of MPS wave functions, that are not specifically optimized to handle non-one dimensional interaction graphs, as in this specific case at hand.
For a more direct comparison, we estimate the MPS bond dimension required for reaching RBM performance at p = 2 and 54 qubits to be ∼ 10 4 (see Fig. 5), amounting to ∼ 10 10 complex parameters (≈ 160 GB of storage) while our RBM approach uses ≈ 4500 parameters (≈ 70 kB of storage). In addition, we expect the MPS number of parameters to grow with depth p because of additional entanglement, while RBM sizes heuristically scale weakly (constant in our simulations) with p and can be controlled mid-simulation using our compression step. It should be noted that the output MPS bond dimension depends on the specific implementation of the MPS simulator, namely, qubit ordering and the number of "swap" gates applied to correct for the non-planar nature of the In the 20-qubit case, we see quick convergence to the QAOA cost optimum with increasing bond dimension. Approximation ratio with of the RBM output is shown on the y-axis. However, on a 54-qubit graph, MPS accuracy increases approximately logarithmically with bond dimension. An approximation of the MPS bond dimension required for reaching RBM performance is extrapolated to be ≈ 1.5 × 10 4 which amounts to ∼ 10 10 free parameters.
underlying graph, and that a more efficient implementations might be found. However, determining the optimal implementation is itself a difficult problem and, given the entanglement of a generic circuit we simulate, it would likely produce a model with orders of magnitude more parameters than a RBM-based approach.

III. DISCUSSION
In this work, we introduce a classical variational method for simulating QAOA, a hybrid quantumclassical approach for solving combinatorial optimizations with prospects of quantum speedup on near-term devices. We employ a self-contained approximate simulator based on NQS methods borrowed from many-body quantum physics, departing from the traditional exact simulations of this class of quantum circuits.
We successfully explore previously unreachable regions in the QAOA parameter space, owing to good performance of our method near optimal QAOA angles. Model limitations are discussed in terms of lower fidelities in quantum state reproduction away from said optimum. Because of such different area of applicability and relative low computational cost, the method is introduced as complementary to established numerical methods of classical simulation of quantum circuits.
Classical variational simulations of quantum algorithms provide a natural way to both benchmark and understand the limitations of near-future quantum hardware. On the algorithmic side, our approach can help answer a fundamentally open question in the field, namely whether QAOA can outperform classical optimization algorithms or quantum-inspired classical algorithms based on artificial neural networks [47][48][49].

A. Exact application of one-qubit Pauli gates
As mentioned in the main text, some one-qubit gates gates can be applied exactly to the RBM ansatz given in Eq. 5. Here we discuss the specific case of Pauli gates. Parameter replacement rules we use to directly apply one-qubit gates can be obtained by solving Eq. 6 given in the main text. Consider for example the Pauli X i or NOT i gate acting on qubit i. It can be applied by satisfying the following system of equations: for B i = 0, 1. The solution is: with all other parameters remaining unchanged. A similar solution can be found for the Pauli Y gate: with all other parameters remaining unchanged as well.
For the Pauli Z gate, as described in the main text, one needs to solve e a i Bi = (−1) Bi e aiBi . The solution is simply More generally, it is possible to apply exactly an arbitrary Z rotation gate, as given in matrix form as: where the proportionality is up to a global phase factor. Similar to the Pauli Z i gate, this gate can be implemented on qubit i by solving e a i Bi = e iϕBi e aiBi . The solution is simply: with all other parameters besides a i remaining unchanged. This expression reduces to the Pauli Z gate replacement rules for ϕ = π as required.

B. Exact application of two-qubit gates
We apply two-qubit gates between qubits k and l by adding an additional hidden unit (labeled by c) to the RBM before solving Eq. 6 from the main text. The extra hidden unit couples only to qubits in question, leaving all previously existing parameters unchanged. In that special case, the equation reduces to An important two-qubit gate we can apply exactly are ZZ rotations. The gate RZZ is key for being able to implement the first step in the QAOA algorithm. The definition is: where the proportionality factor is again a global phase. The related matrix element for a RZZ kl gate between qubits k and l is B k B l | RZZ kl (ϕ) |B k B l = e iϕB k ⊕B l where ⊕ stands for the classical exclusive or (XOR) operation. Then, one solution to Eq. 15 reads: where A(ϕ) = Arccosh e iϕ and C = 2.

C. Approximate gate application
Here we provide model details and show how to approximately apply quantum gates that cannot be implemented through methods described in Sec. IV A. In this work we use the Stochastic Reconfiguration (SR) [37] algorithm to approximately apply quantum gates to the RBM ansatz. To that end, we write the "infidelity" between our RBM ansatz and the target state φ, D(ψ θ , φ) = 1 − F (ψ θ , φ), as an expectation value of an effective hamiltonian operator H φ eff : We call the hermitian operator given in Eq. 18 a "hamiltonian" only because the target quantum state |ψ is encoded into it as the eigenstate corresponding to the smallest eigenvalue. Our optimization scheme focuses on finding small parameter updates ∆ k that locally approximate the action of the imaginary time evolution operator associated with H φ eff , thus filtering out the target state: where C is an arbitrary constant included because our variational states (Eq. 5, main text) are not normalized. Choosing both η and ∆ to be small, one can expand both sides to linear order in those variables and solve the resulting linear system for all components of ∆, after eliminating C first. After some simplification, one arrives at the following parameter at each loop iteration (indexed with t): where stochastic estimations of gradients of the cost function D(ψ θ , φ) can be obtained through samples from |ψ θ | 2 at each loop iteration through: Here, O k is defined as a diagonal operator in the computational basis such that Averages over ψ are commonly defined as · ψ ≡ ψ|·|ψ / ψ|ψ . Furthermore, the S-matrix appearing in Eq. 20 reads: and corresponds to the Quantum Geometric Tensor or Quantum Fisher Information (also see Ref. [50] for a detailed description and connection with the natural gradient method in classical machine learning [51]). Exact computations of averages over N qubit states ψ θ and φ at each optimization step range from impractical to intractable, even for moderate N . Therefore, we evaluate those averages by importance-sampling the probability distributions associated with the variational ansatz |ψ θ | 2 and the target state |φ| 2 at each optimization step t. All of the above expectation values are evaluated using Markov Chain Monte Carlo (MCMC) [38,39] sampling with basic single-spin flip local updates. An overview of the sampling method can be found in [52]. In order to use those techniques, we rewrite Eq. 21 as: (23) In our experiments with less than 20 qubits, we take 8 000 MCMC samples from 4 independent chains (totaling 32 000 samples) for gradient evaluation. Between each two recorded samples, we take N MCMC steps (for N qubits). For the 54-qubit experiment, we take 2 000 MCMC samples 4 independent chains because of increased computational difficulty of sampling. The entire Eq. 23 is manifestly invariant to rescaling of ψ θ and φ, removing the need to ever compute normalization constants. We remark that the prefactor in Eq. 23 is identically equal to the fidelity given in Eq. 8 in the main text.
allowing us to keep track of cost function values during optimization with no additional computational cost.
The second step consists of multiplying the variational derivative with the inverse of the S-matrix (Eq. 22) corresponding to a stochastic estimation of a metric tensor on the hermitian parameter manifold. Thereby, the usual gradient is transformed into the natural gradient on that manifold. However, the S-matrix is stochastically estimated and it can happen that it is singular. To regularize it, we replace S with S + 1, ensuring that the resulting linear system has a unique solution. We choose = 10 −3 throughout. The optimization procedure is summarized in Appendix B.
In order to keep the number of hidden units reasonable, we employ a compression step at each QAOA layer (after the first). Immediately after applying the U C (γ k ) gate in layer k to the RBM ψ θ (and thereby introducing the unwanted parameters), we go through the following steps: 1. Construct a new RBMψ θ .
2. Initializeψ θ to exactly represent the state U C 1 k j≤k γ j |+ . Doing this introduces half the number hidden units that are already present in ψ θ .
In essence, we use optimization algorithm with the "larger" ψ θ as the target state φ. The optimization results in a new RBM state with fewer hidden units that closely approximates the old RBM with fidelity > 0.98 in all our tests. We then proceed to simulate the rest of the QAOA circuit and apply the same compression procedure again when the number of parameters increases again. The exact schedule of applying this procedure in context of different QAOA layers can be seen on Fig. 1 in the main text.
We choose the initial state for the optimization as an exactly reproducible RBM state that has non-zero overlap with the target (larger) RBM. In principle, any other such state would work, but we heuristically find this one to be a reliable choice across all p values studied. Alternatively, one can just initializeψ θ to U C (γ ) |+ with γ = argmax γ F (ψ θ , U C (γ) |+ ), using an efficient 1D optimizer to solve for γ before starting to optimize the full RBM.

DATA AVAILABILITY
The authors declare that the data supporting the findings of this study are available within the paper.

CODE AVAILABILITY
Our Python code is available on GitHub to reproduce the results presented in this paper through the following URL: github.com/Matematija/QubitRBM. Then, we can express the MaxCut QAOA cost function at p = 1 as In what follows, we will make repeated use of the following identities: The innermost product can easily be expanded using Eq. A4 twice: The first term vanishes when averaged against ρ 0 . The second and third need to be treated separately. First we look at expressions of the form: where we denoted the set of all neighbors of node k by N (k), used Eq. A6 and Eq. A5. In Eq. A8, tracing out Paulis not associated to either node k or its neighbors is trivial and produces a factor of unity. Furthermore, tracing out immediate neighbors of k other than l produces a factor of cos q k (2γ) where q k + 1 is the degree of node k. Keeping those factors aside, we are left with: Tr kl 1 k + X k 2 1 l + X l 2 Y k Z l (cos 2γ − i sin 2γ Z k Z l ) = sin 2γ .
Therefore, the final contribution from the second term in Eq. A7 is 1 2 sin(4β) Tr ρ 0 e iγC (Y k Z l + Y l Z k ) e −iγC = 1 2 sin(2γ) sin(4β) [cos q k (2γ) + cos q l (2γ)] (A10) Looking at the last factor in Eq. A7, we compute: where we separate the factor in C corresponding to the edge (k, l) out before using Eq. A6. In the second equality, the factor corresponding to the edge (k, l) cancels and others are absorbed into product expressions. Like before, operators corresponding to nodes other than k or l or their neighbors can be traced out trivially. The next step is to trace out Paulis that are neighbors of k but not l. Each of those contributes with a factor of cos(2γ) so we get a factor of cos q k −∆ kl (2γ) for node k and a corresponding factor for node l resulting in cos q k +q l −2∆ kl (2γ).
Here, we denoted the number of common neighbors of k and l by ∆ kl and label their set by CN (k, l) in what remains of Eq. A11: Tr ρ 0 e iγC Y k Y l e −iγC = cos q k +q l −2∆ kl (2γ) Tr where ρ 0 is the reduced density operator. Using Eq. A5, one can trace out nodes in CN (k, l) to obtain a factor of (cos 2 2γ + sin 2 2γ Z k Z l ) ∆ kl . Finally, we expand that expression using the binomial theorem and interchange the trace operation with the sum. The trace in Eq. A12 becomes Putting all of the ingredients together, we have: C(γ, β) = 1 2 k,l sin(4β) sin(2γ) (cos q k (2γ) + cos q l (2γ)) + sin 2 (2β) cos q k +q l −2∆ kl (2γ)(1 − cos ∆ kl (4γ)) (A14)