Quantum State Preparation of Normal Distributions using Matrix Product States

State preparation is a necessary component of many quantum algorithms. In this work, we combine a method for efficiently representing smooth, differentiable probability distributions using matrix product states with recently discovered techniques for initializing quantum states to approximate matrix product states. Using this, we generate quantum states encoding a class of normal probability distributions in a trapped-ion quantum computer for up to 20 qubits. We provide an in-depth analysis of the different sources of error which contribute to the overall fidelity of this state preparation procedure. Our work provides a hardware study of scalable distribution loading, which is the basis of a wide range of algorithms that promise quantum advantage.


Introduction
Quantum state preparation is an important step in many quantum algorithms, such as Monte-Carlo methods for quantum computers 1, quantum algorithms for systems of linear equations 2, various quantum machine learning algorithms 3,4, and Hamiltonian simulation 5,6, which are expected to have broad applications in a variety of fields. In order for these algorithms to provide quantum advantage, this state preparation procedure must be performed efficiently and with sufficiently low noise.
Many near-term techniques have previously been proposed to realize quantum state preparation, including numerical integration 7,8 and quantum generative adversarial networks 9,10. Refs. 8-10 also experimentally implemented their state preparation protocols; however, the efficiency of these protocols has not been studied in depth. There have also been fault-tolerant proposals for quantum state preparation [11][12][13][14], with theoretical guarantees in terms of convergence, circuit depth, and the number of ancillas. However, due to their high resource requirements, these techniques cannot be realized experimentally on near-term devices. On the other hand, Refs. 15, 16 considered black-box state preparation, which assumes the existence of an oracle that produces the target coefficients; this unfortunately does not allow an end-to-end implementation.
In this work, we make use of matrix product states (MPS), an interesting class of quantum wave functions whose probability amplitudes can be described using a specific tensor structure, and which underlie some of the most powerful classical simulation algorithms. Consider the general problem of preparing an N-qubit state |Ψ⟩ whose amplitudes c_x are set to specified values. It is known that, in general, such a circuit must include a number of gates that scales exponentially with N 28. This is true even if the goal is only to approximately prepare a state |Ψ′⟩ to some fixed precision ε such that ∥|Ψ′⟩ − |Ψ⟩∥ ≤ ε, where ∥ · ∥ is understood as the L2 norm.
However, if we restrict ourselves to certain families of wave functions with specific structure, it becomes possible to efficiently prepare the state with a circuit containing only a number of quantum gates that scales polynomially with N. For instance, Grover and Rudolph 7 gave a procedure for efficiently preparing wavefunctions whose amplitudes are described by an efficiently integrable probability distribution. However, Ref. 29 pointed out that the Grover-Rudolph method does not give the quadratic speedup in quantum Monte-Carlo algorithms when classical Monte-Carlo integration is used to determine the gate angles in the state preparation circuit. Ref. 30 presented a quantum algorithm to efficiently prepare normal distributions using Mid-Circuit Measurement and Reuse. Ref. 12 presented a quantum algorithm that prepares any N-qubit quantum state with a Θ(N)-depth circuit, at the cost of an exponential number of ancillary qubits. Ref. 14 described a state preparation algorithm that uses the quantum eigenvalue transformation and obtains promising gate complexity; however, an actual Toffoli gate count of O(10^4) means it will remain out of reach in the near future.
In Ref. 25, it was shown that if the probability distribution to be encoded is a smooth differentiable function, then there exists an efficient preparation scheme based on an encoding of the probability distribution as a matrix product state tensor network. Matrix product state wave functions are a class of quantum states on which efficient classical computations can be performed even when the number of qubits in the system is large. As we will describe in this section, there exist efficient methods of generating MPS representations of low-degree polynomial functions, as well as methods for approximately preparing MPS wave functions using low-depth quantum circuits. In this work, we use this encoding scheme to prepare states which approximate normal probability distributions. With this technique, we are also able to make use of the well-known family of piece-wise polynomial Irwin-Hall distributions, which approximate the normal distribution to arbitrary accuracy.

MPS formalism
A Matrix Product State (MPS) is a wave function of the form |ψ⟩ = ∑_{σ_1,...,σ_N} M^{[1]σ_1} M^{[2]σ_2} ⋯ M^{[N]σ_N} |σ_1 σ_2 ... σ_N⟩ (Eq. 2), where the terms M^{[j]σ_j} are χ × χ matrices, indexed by the physical variable σ_j and contracted over the internal bond indices. We call the maximum value of the bond index, χ, the bond dimension of the MPS. Wave functions which can be represented by a bond-dimension-χ MPS are completely defined by only d · N matrices of dimension χ × χ, and can therefore be stored using only d · N · χ² complex numbers instead of storing all 2^N complex amplitudes directly.
Only a small subset of wave functions can be represented exactly with a finite bond-dimension MPS. In particular, MPS wave functions are very good approximations for quantum states with low entanglement: the entanglement entropy S of an MPS wave function with fixed bond dimension χ is bounded by S ≤ log(χ).
In general, if we let χ = 2^N, we can represent any wavefunction |ψ⟩ using the MPS form given by Eq. 2. One of the most important features of matrix product states is the ability to compress such a wavefunction in a controlled manner to create a fixed, small-χ approximation to |ψ⟩. The most straightforward approach is the singular value decomposition (SVD) compression scheme described in Ref. 17. We start with the wave function whose coefficients are given by the N-index tensor ψ_{σ_1,σ_2,...,σ_N}. This tensor is reshaped into a rectangular matrix ψ_{σ_1,(σ_2...σ_N)}. A singular value decomposition on this matrix allows us to write ψ = U S V†, and we restrict the sum to include only the largest m singular values of the diagonal matrix S. We then reshape the remaining tensor to form the matrix ψ_{(a_1 σ_2),(σ_3...σ_N)}, apply a singular value decomposition to this matrix, and repeat the process for all qubits i, resulting in the decomposition given above. For a matrix with m > n, the cost of a single SVD is Õ(mn²), which implies that the full compression sweep costs O(Nχ²χ′) when truncating from bond dimension χ to bond dimension χ′. There also exist more sophisticated compression techniques 17 which may be more effective in certain cases. For example, the iterative variational compression algorithm solves a series of χ′² × χ′² linear equations which depend only on the truncated bond dimension χ′. Although deriving these equations involves contracting over the original bonds of size χ, at cost O(χ′χ²), this method can lead to a large practical speedup in certain situations.
For each matrix, the so-called truncation error is given by the sum of the squares of the discarded singular values, and controls the fidelity of this compression method 17. For a state |ψ⟩ and allowed error ε, we say that it can be approximately represented as an MPS if, for arbitrary N, there exists a fixed-χ MPS |ψ̃⟩ such that ∥|ψ⟩ − |ψ̃⟩∥ ≤ ε. In this case, efficient classical computations can be performed on |ψ̃⟩.
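The SVD compression sweep described above can be sketched in a few lines of numpy. This is an illustration with our own conventions (site tensors stored as (left bond, physical, right bond) arrays), not the authors' code:

```python
import numpy as np

def state_to_mps(psi, chi_max):
    """Decompose a 2^N state vector into an MPS by repeated SVDs,
    keeping at most chi_max singular values at each bond."""
    N = int(np.log2(psi.size))
    tensors, err = [], 0.0
    rest, left = psi.reshape(1, -1), 1
    for _ in range(N - 1):
        U, S, Vh = np.linalg.svd(rest.reshape(left * 2, -1), full_matrices=False)
        m = min(chi_max, len(S))
        err += np.sum(S[m:] ** 2)                 # discarded singular weight
        tensors.append(U[:, :m].reshape(left, 2, m))
        rest, left = S[:m, None] * Vh[:m], m
    tensors.append(rest.reshape(left, 2, 1))      # final site
    return tensors, err

def mps_to_state(tensors):
    """Contract an MPS back into a full state vector (big-endian)."""
    out = tensors[0]
    for T in tensors[1:]:
        out = np.tensordot(out, T, axes=([-1], [0]))
    return out.reshape(-1)
```

For Gaussian-shaped amplitudes, even a very small χ reproduces the state with high fidelity, in line with the low-entanglement discussion below.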

Smooth Differentiable Functions
It turns out that many states that are input to quantum algorithms inherently possess a low degree of entanglement and can therefore be represented using matrix product states. For example, common distributions used in Monte Carlo methods include the uniform distribution, the normal distribution, and the log-normal distribution.
Most importantly for our purposes, smooth differentiable real-valued functions which are appropriately encoded into the amplitudes of quantum states satisfy this low-entanglement property. Consider a normalized real smooth probability distribution f(x) on an interval [a, b], discretized at the points x_k = a + k/h, where h = (2^N − 1)/(b − a) and the integer k = k(σ) is represented by a binary bit-string σ = σ_1 σ_2 ... σ_N so that k = ∑_{j=1}^{N} σ_j 2^{N−j}. The discretized amplitude-encoded wave function then takes the form |ψ⟩ ∝ ∑_{k=0}^{2^N−1} √(f(x_k)) |k⟩, where |k⟩ is the integer representation of the computational basis state |σ⟩ in big-endian notation. For a fixed number of qubits N, the wave function |ψ⟩ encodes a discretized version of the probability distribution f(x), sampled at the discrete points x_k.
Consider the effect of adding one additional qubit to the state |ψ⟩, so that |σ_1 σ_2 ... σ_N⟩ → |σ_1 σ_2 ... σ_N σ_{N+1}⟩. Within the big-endian binary encoding scheme, adding one additional quantum register doubles the density of the discretized points in f(x_k). Furthermore, since the added qubit encodes the least significant bit of k, the location of the previous 2^N points is not changed by adding one more qubit to the system.
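A minimal numpy sketch of this amplitude encoding, using the grid convention h = (2^N − 1)/(b − a) from above (the helper name is ours, not from the paper):

```python
import numpy as np

def amplitude_encode(f, a, b, N):
    """Discretize a pdf f on [a, b] at 2^N points and return the grid x_k
    together with the normalized amplitudes sqrt(f(x_k)) (big-endian k)."""
    k = np.arange(2 ** N)
    h = (2 ** N - 1) / (b - a)
    x = a + k / h                      # x_k runs from a to b inclusive
    amps = np.sqrt(f(x))
    return x, amps / np.linalg.norm(amps)
```

The squared amplitudes then form a properly normalized discrete probability distribution over the 2^N grid points.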
In Ref. 24, the entanglement entropy of these discretized amplitude-encoded wave functions was carefully analyzed. It was found that adding one additional qubit leads to only a small increase in the entanglement of the state |ψ⟩, and that this change is controlled by the maximum value of the derivative of f(x). Therefore, for each additional qubit we add to our encoding, the discretization error is cut in half, but the entanglement entropy S(ρ) of the state increases by only a vanishingly small amount. Such low-entanglement functions admit explicit low bond-dimension MPS representations. To construct the MPS encoding of a degree-p polynomial f(x), we first define the auxiliary variables t_i(σ_i), defined for a given bit-string k = k(σ).
The elements of the MPS tensors M^{[j]σ_j}_{α_{j−1},α_j} can then be written in closed form in terms of the binomial coefficients C^j_i, with all bond indices α_i ∈ (0, ..., p); these expressions are directly adapted from Ref. 25. The construction extends to piece-wise polynomials, where the function on the ℓ-th region is a degree-p polynomial f_ℓ(x). To encode this function, we first represent f_ℓ(x) as an MPS on the full domain [a, b], using Eq.'s 13-16, and then 'zero out' elements of the tensors M^{[j]σ_j}_{α_{j−1}α_j} corresponding to regions outside the domain of region ℓ. To do this, first let ℓ be represented by a k-bit string b = b_1 b_2 ... b_k. All matrices M^{[j],σ_j}_{α_{j−1},α_j} with j ≤ k and b_j ≠ σ_j are replaced by a zero matrix of equal size, while all matrices with j > k are left unchanged. This has the effect of setting all amplitudes outside the domain of region ℓ to zero.
Finally, we can put all these pieces together to write an efficient encoding scheme for representing a piece-wise degree-p polynomial function of 2^k separate domains as a low bond-dimension MPS state. For each sub-region of the full domain [a, b], we encode f_ℓ(x) into the MPS M_ℓ. We then use the property that two Matrix Product States, M_1 and M_2, with bond dimensions χ and χ′, can be added together to form an MPS M_3 = M_1 + M_2 of bond dimension χ + χ′. Therefore, we may add all the Matrix Product States on the 2^k sub-domains together to get a final representation M_T. M_T is an efficient approximation to the target quantum state which we wish to prepare using a quantum device. The procedure for generating M_T is summarized in Algorithm 1.
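The MPS addition used here is a direct-sum construction: boundary tensors are concatenated along the bond, and interior tensors are placed block-diagonally. A numpy sketch (our own conventions, with site tensors shaped (left bond, physical, right bond)):

```python
import numpy as np

def mps_add(A, B):
    """Add two MPSs; the result has bond dimension chi_A + chi_B."""
    out, N = [], len(A)
    for j, (Ta, Tb) in enumerate(zip(A, B)):
        la, d, ra = Ta.shape
        lb, _, rb = Tb.shape
        if j == 0:
            T = np.concatenate([Ta, Tb], axis=2)    # concat right bonds
        elif j == N - 1:
            T = np.concatenate([Ta, Tb], axis=0)    # concat left bonds
        else:
            T = np.zeros((la + lb, d, ra + rb), dtype=Ta.dtype)
            T[:la, :, :ra] = Ta                     # block-diagonal interior
            T[la:, :, ra:] = Tb
        out.append(T)
    return out
```

Because the cross blocks are zero, contracting the summed MPS yields exactly the sum of the two original state vectors.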

Algorithm 1 MPS Encoding Procedure
Input: a degree-p piece-wise polynomial function f(x) defined on 2^k sub-domains of [a, b]
Output: an MPS approximation M_T of the amplitude-encoded state
1: for each sub-domain ℓ, encode f_ℓ(x) as an MPS on [a, b] and zero out tensor elements outside sub-domain ℓ to obtain M_ℓ
2: sum the components, M_T = ∑_ℓ M_ℓ
3: truncate M_T to the target bond dimension

Quantum Circuits for MPS States
We now describe how we can generate low-depth quantum circuits which prepare MPS wave functions. A bond-dimension-χ MPS state can be exactly created with a quantum circuit composed of N local unitary operators, each acting on m = log2(χ) + 1 qubits. While in principle this allows us to efficiently map a given MPS to a polynomial-depth quantum circuit, the constant-factor overhead of compiling arbitrary m-qubit quantum gates to a basic set of one- and two-qubit gates quickly becomes infeasible for NISQ devices.
Consequently, a number of proposals for approximately generating bond-dimension-χ MPS states using low-depth quantum circuits have been put forward 26,[31][32][33][34]. In this work, we take the iterative approach of Ref. 26. In this method, a high bond-dimension MPS wave function is iteratively approximated by applying D layers of local unitary operators. Each layer of gates is designed to prepare a χ = 2 MPS wave function and can therefore be efficiently implemented using only two-qubit unitary operators. The procedure for constructing the specific set of two-qubit unitary operators which exactly prepares a χ = 2 MPS state is described in Ref. 26, and results in a circuit architecture of the form shown in Fig. 1 a). When several of these layers are combined together, a circuit in the form of Fig. 1 b) is able to approximately prepare a χ ≫ 2 MPS wave function. The key idea of the algorithm is that each layer of gates also acts as a "disentangler" operator which can be applied to the original wavefunction.

Algorithm 2 Iterative Circuit Preparation: starting from the target |ψ_0⟩ = |ψ⟩, at each iteration i we truncate |ψ_{i−1}⟩ to a bond-dimension-2 MPS |ψ̃⟩, generate the circuit layer U_i which exactly prepares |ψ̃⟩, and set |ψ_i⟩ = U_i† |ψ_{i−1}⟩.

The truncation procedure applied in Algorithm 2 is the SVD compression method described in section 2.2. Each layer of unitary gates U_i approximately prepares the target wave function |ψ_{i−1}⟩; therefore, the adjoint circuit layer U_i† approximately takes |ψ_{i−1}⟩ to the product state |0⟩. After each iteration, the entanglement of the wave function |ψ_i⟩ is therefore lower than that of the previous wave function |ψ_{i−1}⟩, and the bond-dimension-2 approximations |ψ̃⟩ become increasingly accurate. The final state we produce is |ψ_prep⟩ = U_1 U_2 ⋯ U_D |0⟩, and the fidelity of this state preparation procedure is given by F = |⟨ψ| U_1 U_2 ⋯ U_D |0⟩|².
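The building block of each layer, the exact staircase circuit that prepares a bond-dimension-2 MPS with two-qubit gates, can be sketched in numpy. This is our own illustration of the idea; the index conventions and the unitary-completion step are our choices, not necessarily those of Ref. 26:

```python
import numpy as np

def left_canonicalize(tensors):
    """Left-to-right QR sweep bringing an MPS (list of (l, 2, r) real
    tensors) into left-canonical form, normalizing the state."""
    A = [np.array(t, dtype=float) for t in tensors]
    for j in range(len(A) - 1):
        l, d, r = A[j].shape
        Q, R = np.linalg.qr(A[j].reshape(l * d, r))
        A[j] = Q.reshape(l, d, -1)
        A[j + 1] = np.tensordot(R, A[j + 1], axes=([1], [0]))
    A[-1] /= np.linalg.norm(A[-1])
    return A

def complete_unitary(iso, rng):
    """Extend a (4, k) real isometry to a 4x4 orthogonal matrix."""
    d, k = iso.shape
    rand = rng.normal(size=(d, d - k))
    rand -= iso @ (iso.T @ rand)          # project out existing columns
    Q, _ = np.linalg.qr(rand)
    return np.concatenate([iso, Q], axis=1)

def chi2_mps_to_gates(tensors, rng):
    """Exact staircase: |psi> = G_{0,1} G_{1,2} ... G_{N-2,N-1} |0...0>,
    gate g acting on qubits (g, g+1), with G_{N-2,N-1} applied first."""
    A = left_canonicalize(tensors)
    gates = []
    # first gate absorbs the first two site tensors
    first = np.tensordot(A[0], A[1], axes=([2], [0])).reshape(4, -1)
    gates.append(complete_unitary(first, rng))
    for j in range(2, len(A)):            # one tensor per remaining gate
        l, d, r = A[j].shape
        gates.append(complete_unitary(A[j].reshape(l * d, r), rng))
    return gates

def prepare_state(gates, N):
    """Apply the staircase to |0...0> (qubit 0 most significant)."""
    state = np.zeros(2 ** N)
    state[0] = 1.0
    for g in reversed(range(len(gates))):
        psi = state.reshape(2 ** g, 4, 2 ** (N - g - 2))
        state = np.einsum('ab,ibk->iak', gates[g], psi).reshape(-1)
    return state
```

Left-canonical form guarantees that the designated gate columns are orthonormal, so each 4 x k block really does extend to a unitary.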
Therefore, the accuracy of this preparation method depends on the ability of the unitary operators to disentangle the wave function. In section 2.6, we look at how the approximation error decreases as the number of layers increases for wave functions described by a normal distribution. For these wave functions, a small number of layers is generally sufficient to prepare a good approximate wave function. Therefore, using the techniques described in this section, we can efficiently create low-depth quantum circuits which approximately prepare the target normal distribution wave functions on a large number of qubits. After executing this circuit, a projective measurement on all qubits generates a single bit-string, which can be converted, using Eq. 6, to a sample x_k drawn from the encoded probability distribution.

Irwin-Hall Distribution
The Irwin-Hall distribution is the continuous probability distribution of the sum of n i.i.d. U(0, 1) random variables, X_n = ∑_{k=1}^{n} U_k. Here U(0, 1) is the uniform random variable over [0, 1].
The probability density function (pdf) is given by f_{X_n}(x) = (1/(2(n − 1)!)) ∑_{k=0}^{n} (−1)^k C(n, k) (x − k)^{n−1} sgn(x − k), where C(n, k) is the binomial coefficient and sgn(x − k) denotes the sign function, equal to −1, 0, or 1 for negative, zero, and positive arguments respectively. From here, we see that the pdf of X_n is piece-wise polynomial, with n pieces, each polynomial being of degree n − 1.
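The closed-form pdf can be evaluated directly (a sketch; note the alternating sum suffers catastrophic cancellation in double precision for large n, so it is only reliable for moderate orders):

```python
import numpy as np
from math import comb, factorial

def irwin_hall_pdf(x, n):
    """Closed-form Irwin-Hall pdf: n piece-wise polynomials of degree n-1.
    Numerically reliable only for moderate n (alternating-sum cancellation)."""
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for k in range(n + 1):
        total += (-1) ** k * comb(n, k) * (x - k) ** (n - 1) * np.sign(x - k)
    return total / (2 * factorial(n - 1))
```

A quick numerical check confirms the distribution integrates to one with mean n/2 and variance n/12, as used below.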
By the Central Limit Theorem, as n increases, the Irwin-Hall distribution converges to a normal distribution with mean µ = n/2 and variance σ² = n/12. Formally, (X_n − n/2)/√(n/12) → N(0, 1) in distribution, where N(µ, σ²) denotes the normal distribution with mean µ and variance σ². For a sequence of real-valued random variables, convergence in distribution means lim_{n→∞} F_n(x) = F(x) at every point x at which F is continuous. Here F_n, F denote cumulative distribution functions (cdf). In Section 2.7, we present a more rigorous convergence result for the pdf.

Distance Measures and Statistical Analysis
We use common distance measures such as the L1 and sup norms between the pdf/cdf of the Irwin-Hall distributions and the corresponding normal distributions. This can be done when the pdf/cdf is available analytically or numerically. When experiments are performed, there is no access to the intrinsic quantum state, as only measurement samples are collected. In this case, we use the 1-sided Kolmogorov-Smirnov test and obtain the KS statistic, defined as D_n = sup_x |F_n(x) − F(x)|. Here F_n(x) is the empirical cdf derived from the samples, and F is the cdf of the ideal distribution to which it is compared. The KS statistic can be thought of as an approximation of the sup norm between the cdfs. It can also be used to test whether the two underlying distributions F_n(x) and F(x) differ from each other with statistical significance. When a finite number of samples s are taken from both distributions, the null hypothesis that F_n(x) = F(x) is accepted if D_n ≤ √(−ln(α/2)/s), where α is the significance level, commonly taken to be α = 0.05.
We note that there are alternative statistical tests, such as the Shapiro-Wilk test or the Anderson-Darling test. However, their test statistics do not translate as directly into a distance between distributions as the KS statistic does.
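Computing the KS statistic from samples is straightforward; a short sketch (our own helper, evaluating the empirical cdf on both sides of each jump so the supremum is not missed):

```python
import numpy as np

def ks_statistic(samples, cdf):
    """Sup-norm distance between the empirical cdf of the samples and a
    reference cdf, checked on both sides of each empirical-cdf jump."""
    x = np.sort(np.asarray(samples, dtype=float))
    s = len(x)
    F = cdf(x)
    upper = np.arange(1, s + 1) / s - F   # ecdf just after each sample
    lower = F - np.arange(0, s) / s       # ecdf just before each sample
    return max(upper.max(), lower.max())
```

For samples actually drawn from F, the statistic decays like 1/sqrt(s), which is why the acceptance threshold above shrinks with the sample size.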
A thorough understanding of the errors associated with state preparation is important. In an application such as Monte Carlo methods, one often needs to obtain results to a desired accuracy; therefore, the state preparation must produce the corresponding distribution to a precision compatible with the desired application.

Irwin-Hall error analysis
Define the standardized random variable Z_n = (X_n − n/2)/√(n/12). Let f_{Z_n} be the pdf of Z_n and f_N be the pdf of N(0, 1). The proof of the corresponding convergence bound is given in the Appendix.
The error between the pdfs of the Irwin-Hall distributions and the corresponding normal distribution is also analyzed numerically. Since f_{Z_n} has bounded support on [−√(3n), √(3n)], and f_N(x) → 0 monotonically as |x| → ∞, it is sufficient to consider a bounded domain. We take a fine grid of points in this domain and numerically evaluate f_{Z_n} and f_N. Based on the numerical calculation shown in Fig. 2 a), we estimate the log-log slope of the max error to be −1.57 and that of the average error to be −2.06. We also list similar results for the cdf.
The error between the cdfs of the Irwin-Hall distributions and the corresponding normal distribution is analyzed numerically as well, based on the numerical calculation shown in Fig. 2 b). We also compare the theoretical error scaling of our method against other methods. In Ref. 30, it was numerically shown that the KL divergence between the distribution produced by the algorithm and the corresponding exact analytical normal distribution scales as Θ(1/t²), where t counts the iterations and can be thought of as proportional to the circuit depth. Since log(p/q) ≈ (p − q)/q for p ≈ q, the 1-norm between the distribution produced by Ref. 30 and the corresponding exact analytical normal distribution also scales as Θ(1/t²). However, this algorithm uses the Mid-Circuit Measurement and Reuse scheme, and is thus not suitable for use within quantum Monte Carlo algorithms 1.
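The pdf-error scaling study can be sketched numerically. The grid, the range of orders, and the fitting choices here are ours, and since the closed-form pdf is unstable for large n, the small-n fit below need not reproduce the paper's slope of −1.57 exactly:

```python
import numpy as np
from math import comb, factorial

def irwin_hall_pdf(x, n):
    """Closed-form Irwin-Hall pdf (reliable only for moderate n)."""
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for k in range(n + 1):
        total += (-1) ** k * comb(n, k) * (x - k) ** (n - 1) * np.sign(x - k)
    return total / (2 * factorial(n - 1))

def max_pdf_error(n, m=20001):
    """Sup-norm distance between the standardized Irwin-Hall pdf and N(0,1)."""
    z = np.linspace(-6.0, 6.0, m)
    sd = np.sqrt(n / 12.0)
    f_z = irwin_hall_pdf(z * sd + n / 2.0, n) * sd   # change of variables
    f_n = np.exp(-z ** 2 / 2) / np.sqrt(2 * np.pi)
    return np.max(np.abs(f_z - f_n))

ns = np.array([4, 6, 8, 10])
errs = np.array([max_pdf_error(n) for n in ns])
slope = np.polyfit(np.log(ns), np.log(errs), 1)[0]   # power-law exponent
```

The fitted slope confirms a clear power-law decay of the max error with the Irwin-Hall order.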

Discretization error
We also briefly discuss the discretization error. The normal distribution is a continuous probability distribution. However, in modern computers, decimal numbers are typically represented as 32/64/128-bit floats, so some discretization is inherent.
When the distribution is loaded onto a quantum state, there is further discretization due to the finite number of qubits. As an example, consider an Irwin-Hall order of n = 16. We discretize N(0, 1) and the standardized Irwin-Hall distribution Z_n over the interval [−3, 3] for up to 23 qubits, and compare against the corresponding N(0, 1) over (−∞, ∞) in Fig. 3. As we can see, the errors from the discretized Irwin-Hall distribution and from the discretized normal coincide below 9 qubits. For the discretized Irwin-Hall distribution, however, the error plateaus after 12-13 qubits, indicating that the error then mostly comes from the Irwin-Hall approximation to the normal distribution rather than from the discretization.

MPS Circuit Approximation Error Analysis
We also examine the convergence of our iterative MPS loading scheme to the ideal normal distribution as the circuit depth is increased. First, we make a distinction between the target Irwin-Hall distribution, which is the state we load into the untruncated MPS wave function, and the ideal normal distribution, which is the ultimate target distribution. As the number of layers in the iterative MPS circuit construction is increased, the approximation error between the output of the quantum circuit, |ψ⟩, and the target Irwin-Hall distribution decreases as a power law. We also look at the KS statistic between the output and target distributions, which measures the maximum difference between the cumulative distribution functions of the two distributions. This metric has additional experimental relevance, as it can be directly estimated from a finite number of measurements taken in the computational basis on the prepared quantum state. In Fig. 4 a), we show the behavior of the KS statistic between |ψ⟩ and the target Irwin-Hall distribution as the number of MPS layers increases. Again, we see that the general trend decreases as a power law with increasing layer depth. However, in this case the KS statistic is not a smooth function of depth and is not strictly decreasing. This simply demonstrates that state fidelity is not always in one-to-one correspondence with all measures of distance between the prepared and target output distributions.
This fact is further emphasized when we compare distributions prepared by our quantum circuit simulation and the ideal normal distribution. We plot this comparison in Fig. 4 b), for n = 8, 16, 32 and 64 and N = 14. At high circuit depth, where we apply many circuit layers of the iterative MPS state preparation procedure, we very closely reproduce the exact Irwin-Hall distribution in the amplitudes of the quantum state. We find that in all cases, a single layer of the MPS circuit already achieves a fairly low value for the KS statistic, comparable with the error originating solely from the Irwin-Hall approximation as shown in Fig. 3. We also find that differences between the target Irwin-Hall distribution and the target ideal distribution in many cases offset the differences between the prepared distribution and the target Irwin-Hall distribution. This results in relatively strong fluctuations in the KS statistic value with increasing circuit depth. In all cases, the Irwin-Hall error dominates the MPS approximation error beyond depth D = 5. In fact, it is only for n = 32 and n = 64 that there appears to be any significant benefit in going beyond D = 1.

In summary, in this section we studied the error introduced to our state preparation routine by the Irwin-Hall approximation, qubit discretization, and the MPS circuit preparation routine. We find that for the Irwin-Hall orders which we consider, a circuit with a number of gates linear in the number of qubits N and with only a small number of layers D is sufficient to prepare a probability distribution with error comparable to that introduced by the Irwin-Hall approximation. We believe these values of n, N and D will be sufficient for most practical applications in the near future. Of course, a more accurate approximation of the normal distribution will require a higher circuit depth to prepare. In the next subsection we provide an analysis of how these quantities must scale in this higher-precision limit. However, any easy-to-prepare probability distribution which approximates the exact normal distribution will likely be well approximated by a finite-depth MPS circuit.

Resource Estimations
Here we list the computational resources of our algorithm, both classical and quantum, in Table 1, with respect to ε, defined in Eq. 5 as ∥|ψ⟩ − |ψ̃⟩∥ ≤ ε, where |ψ⟩ is the wave function with amplitudes proportional to the square root of the ideal normal distribution and |ψ̃⟩ our MPS state. Note that we assume both the Irwin-Hall approximation and the MPS approximation introduce an error of O(ε). First, we numerically study the relation between ε and the Irwin-Hall order n, as shown in Fig. 5; Table 1 summarizes the computational complexities of the different components of the algorithm with respect to ε. We derive ε = O(n^−1.07), and therefore n = O(ε^−1/1.07). For an Irwin-Hall distribution of order n, our MPS is a sum of n piece-wise MPS components, each with bond dimension n + 1. Each piece-wise MPS component involves the calculation of Nn² coefficients, so that the total classical cost of this calculation is Õ(Nn³). The sum of these components is then an MPS with χ = n(n + 1), and the cost of generating this MPS state is Nχ² ∼ Nn⁴, which is the dominant contribution in this portion of the calculation. This scaling can be reduced by truncating the MPS bond dimension after each addition of the piece-wise MPS components, each of which has χ_p = (n + 1), instead of only truncating after summing all components. In this case, there are n computations, each with complexity χ_p² = n², leading to an overall complexity of Nn³. The classical processing cost associated with truncating the MPS to bond dimension χ′ < χ is Nχ′³ using the variational compression scheme and Nχ²χ′ using the SVD compression method 17. When generating the quantum circuit, we must repeatedly disentangle the MPS state D times, then truncate to bond dimension χ′ = 2, which has complexity Õ(Nχ′²D). Therefore, overall, the cost associated with generating the quantum circuit for preparing an Irwin-Hall distribution of order n is Õ(Nn⁴ + ND), where we ignore the constant cost associated with χ′. We saw that n = O(ε^−1/1.07). We also saw that the discretization error decays exponentially with the number of qubits N, so that ε ∼ e^−γN for some constant γ, and hence N ∼ log(1/ε). In the Appendix, we show that ε² ∼ 1/D^b for some power b, which we numerically estimate to be b ≥ 1.15 for the Irwin-Hall functions studied in this work. Therefore we can write D ∼ ε^−1.74, so that the classical processing of the MPS manipulations scales as Õ(log(1/ε)[ε^−4/1.07 + ε^−1.74]) = Õ(ε^−3.74) after dropping polylog and subleading terms.
The quantum circuit corresponding to D rounds of MPS approximations has circuit depth O(DN). In the Appendix, we show that for a given ε, the error associated with the MPS circuit approximation to the exact Irwin-Hall wave function is essentially independent of both the number of qubits N and the Irwin-Hall order n. Furthermore, we show that ε decays like a power of the circuit depth D, so that ε² ∼ 1/D^b for some constant b, and we numerically find b ≥ 1.15. Therefore, the quantum circuit complexity is given by Õ(ND) = Õ(log(1/ε) ε^−1.74) = Õ(ε^−1.74).
We expect it is possible to improve the complexity scaling of both the classical MPS approximation and the MPS circuit depth D. For example, if we truncate the MPS bond dimension χ after each addition of the piece-wise MPS components, the MPS approximation scales as Õ(ε^−2.80) instead of Õ(ε^−3.74). We believe it is possible to improve the MPS circuit depth D through better approximation methods for low-depth circuit construction, for example using Ref. 31. In this sense, the exponent b in the scaling Õ(ND) = Õ(ε^−2/b) should be thought of as an upper bound for this analysis. Furthermore, we see in the Appendix that if we were to use the exact circuit construction for an MPS of bond dimension χ′, instead of the low-depth circuit approximation, the error appears to decrease exponentially as ε ∼ e^−b′χ′ for some constant b′ (distinct from the exponent b above). The depth of an exact MPS circuit construction scales like D ∼ Õ(χ′²), which would imply that the circuit complexity scales like Õ(log²(1/ε)).
While the constant overhead of the exact MPS circuit implementation implies it is less practical at moderate values of ε than the low-depth approximations, it may provide a large advantage when trying to prepare states with very small approximation error ε.
Note that normal distributions with different parameters can all be obtained from the standard one by rescaling and shifting, so the parameters of the normal distribution do not contribute to the circuit complexity here.

Applicability to Monte Carlo Integration
The resource estimation of our algorithm allows an end-to-end error analysis of applications in which our algorithm can serve as a subroutine. As an example, we discuss in detail how our algorithm would affect the results of a Monte Carlo integration.
As a reminder, we assume ∥|ψ⟩ − |ψ̃⟩∥ ≤ ε, where |ψ⟩ is the wave function with amplitudes proportional to the square root of the ideal normal distribution and |ψ̃⟩ our MPS state. We also implicitly assume both the Irwin-Hall approximation and the MPS approximation introduce an error of O(ε).
In classical Monte Carlo integration, we estimate E[g] by (1/s) ∑_{i=1}^{s} g(x_i), with x_i sampled from the desired distribution. The central limit theorem states that √s [(1/s) ∑_i g(x_i) − E[g]] → N(0, σ²) in distribution, where s is the number of samples and σ² = E[g²] − E[g]². So the error of Monte Carlo integration is O(σ/√s). This assumes x_i is sampled from the desired distribution. In our case, we use the Irwin-Hall distribution together with the MPS approximation to approximate N, which introduces additional error, as the g(x_i) are now biased estimators. The additional error is O(ε), as we show in the Appendix.
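The two error sources can be seen directly in a small numerical experiment. The choice of test function g(x) = x⁴ and the implementation of the Irwin-Hall approximation by summing uniforms are our own illustration, not the paper's setup:

```python
import numpy as np

# Sketch: Monte Carlo estimate of E[g(X)] for X ~ N(0, 1), but sampling
# from the standardized Irwin-Hall approximation Z_n instead. The result
# carries both O(sigma/sqrt(s)) statistical error and a bias from the
# approximating distribution.
rng = np.random.default_rng(7)
n, s = 16, 1_000_000
# Z_n: standardized sum of n uniforms, approximating N(0, 1)
z = (rng.uniform(size=(s, n)).sum(axis=1) - n / 2) / np.sqrt(n / 12)
g = lambda x: x ** 4                   # E[g] = 3 exactly under N(0, 1)
estimate = g(z).mean()
# Z_n has excess kurtosis -6/(5n), so E[Z_n^4] = 3 - 6/(5n) = 2.925 for
# n = 16: the ~0.075 gap is the bias; the statistical error here is ~0.01.
```

The bias term shrinks with the Irwin-Hall order n, while the statistical term shrinks with the sample count s, matching the decomposition above.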
We may also express the complexities of classical and quantum Monte Carlo integration in terms of δ, the accuracy that we want the Monte Carlo integration to achieve. Classically, assuming sampling from a normal distribution takes O(1), the complexity of a Monte Carlo simulation is proportional to the number of samples s, which is O(δ^−2). We note that our assumption on the sampling complexity of the normal distribution is fairly strong and unlikely to hold exactly; more analysis of the error dependence of classical sampling algorithms for normal distributions, such as the Box-Muller transform 36 or the Ziggurat algorithm 37, is needed.
In quantum Monte Carlo integration, we have t = Õ(δ^−1) calls to the state preparation routine. Assuming the additive error introduced by the state preparation algorithm should be comparable to δ, we set O(ε) = δ. Thus we can express the computational resources of our algorithm in terms of δ, as summarized below.

Table 2. Computational complexities of different components of the algorithm with respect to δ
The end-to-end complexity of a quantum Monte Carlo algorithm leveraging our state preparation subroutine will be Õ(δ^−2.74), as the state preparation subroutine is used t times. The one-time classical pre-processing of our state preparation subroutine is dominated by the MPS approximation, which scales as Õ(δ^−3.74). In practice, it is possible to complete the classical pre-processing beforehand and keep the quantum subroutine as part of a standard state preparation toolkit.
As seen from this analysis (Table 2), quantum Monte Carlo integration using our current implementation of state preparation does not provide an asymptotic advantage over classical Monte Carlo integration. However, as pointed out in Section 2.10, exact MPS constructions can be used in the low-δ limit, which should improve the complexity.

Experimental Demonstration
In this section, we show the experimental demonstration of our state preparation procedure on the IonQ Aria generation of trapped-ion quantum computers. First, in Fig. 6, we demonstrate our ability to prepare normal probability distributions with different variances on quantum states with 10 to 20 qubits. This involves preparing all 2^10 to 2^20 amplitudes of the quantum wave function. By using the MPS state preparation method, all amplitudes can be set by applying only between 20 and 120 CNOT gates. To the best of our knowledge, this is the largest demonstration of state preparation for applications on quantum hardware to date. (The different variances follow from the Irwin-Hall order: since X_n → N(n/2, n/12), the standard deviation of the distribution is √(n/12).) We see that the experimental measurements give good visual agreement with the clean simulation and the ideal pdf in all cases. As quantum gate noise is the largest source of error in the algorithm, we find that the best results are always obtained for a single layer of the MPS iterative preparation procedure: in general, the fewer gates applied, the higher the fidelity of the output distribution. We can visualize this behavior in Fig. 6. The shortest-depth circuits we apply prepare 10-qubit systems with MPS depth 1. We show the case n = 8, and see very strong agreement with the expected histogram of 10000 samples from the noiseless simulator.
The effect of the noise is to slightly broaden the normal peak and to create additional samples at the tails of the distribution.
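As a sanity check on the underlying approximation (this is a classical illustration, not part of the quantum procedure), the order-n Irwin-Hall distribution is simply the sum of n independent Uniform(0, 1) variables, with mean n/2 and variance n/12, and so approaches a normal distribution as n grows:

```python
import numpy as np

def irwin_hall_samples(n, size, seed=0):
    """Draw samples from the order-n Irwin-Hall distribution,
    i.e. the sum of n independent Uniform(0, 1) variables."""
    rng = np.random.default_rng(seed)
    return rng.random((size, n)).sum(axis=1)

# For n = 8 the distribution is already close to N(4, 8/12).
samples = irwin_hall_samples(8, 100_000)
print(samples.mean())  # close to 4.0
print(samples.var())   # close to 8/12 ≈ 0.667
```

This convergence is what lets the circuit target the Irwin-Hall distribution while the sampled output remains visually indistinguishable from the intended normal pdf.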
The same general behavior is seen when preparing a 20-qubit system with MPS depth 1. We show the case when n = 16. In this case, the number of basic gates is roughly doubled, resulting in a higher rate of noise and a broader measured output distribution. However, we still see a strong visual overlap with the simulated noiseless distribution and the ideal pdf.
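The gate counts quoted above follow directly from the circuit structure: as stated in the caption of Fig. 6, each circuit uses 2 × (N − 1) × D CX gates. A quick check:

```python
def cx_count(n_qubits: int, mps_depth: int) -> int:
    # Each MPS layer applies (N - 1) two-qubit unitaries, and each
    # real-valued two-qubit unitary compiles to 2 CNOT gates,
    # giving 2 * (N - 1) * D CNOTs in total.
    return 2 * (n_qubits - 1) * mps_depth

print(cx_count(10, 1))  # 18  -- the shallowest circuits in the experiments
print(cx_count(20, 1))  # 38  -- roughly double, as noted above
print(cx_count(20, 3))  # 114 -- the deepest circuits in the experiments
```

This linear scaling in both N and D is why the deepest 20-qubit circuits sit near the top of the 20-120 CNOT range quoted earlier.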
Finally, in order to better visualize the effects of noise, we also show the output for preparing a 10-qubit system at depth 3. We see that in this case, the noise is significantly higher.
We now quantify the effect of noise over the full set of prepared normal distributions. We show the results in Fig. 7. We evaluate the quality of our experimental state preparation by comparing the finite number of measurements from the quantum circuit to measurements sampled from an ideal normal distribution using the Kolmogorov-Smirnov (KS) statistic. In Fig. 7a) we show the KS statistic for the n = 16 normal distributions for 10-20 qubits and 1-3 MPS layers. As mentioned, the best results are found for a single MPS layer. The best KS statistic value we find is D_n ≈ 0.09, which increases to around D_n ≈ 0.15 for the largest system sizes. We can compare these values to the value required to pass the KS test of the equality of the measured and target distributions given by Eq. 28. Using this equation, we can expect to pass this test with statistical significance α = 0.05 when the number of samples, s, taken from both distributions is s ≤ 450 for D_n = 0.09, and s ≤ 150 for D_n = 0.15.
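The sample-size thresholds above can be reproduced from the standard two-sample KS critical value; for equal sample sizes s it is c(α)·√(2/s), with c(0.05) ≈ 1.358. (The exact form of Eq. 28 is given in the paper; this sketch assumes the standard asymptotic approximation.)

```python
import math

def max_samples_passing_ks(d_stat, c_alpha=1.358):
    """Largest equal sample size s for which a measured KS statistic
    d_stat still falls below the two-sample critical value
    c(alpha) * sqrt((s + s) / (s * s)) = c(alpha) * sqrt(2 / s),
    using c(0.05) ≈ 1.358."""
    return math.floor(2 * (c_alpha / d_stat) ** 2)

print(max_samples_passing_ks(0.09))  # 455, consistent with s <= 450
print(max_samples_passing_ks(0.15))  # 163, consistent with s <= 150
```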
In contrast to the noiseless case, on the noisy quantum computer the value of the KS statistic increases with the number of layers and the number of qubits. That is, the noise generated from the additional gates overcomes the higher fidelity of the underlying normal approximation when preparing states using a higher number of MPS layers. There is a clear increase in the KS statistic both as the number of qubits and the number of MPS layers increases. In general, the largest source of noise on near-term quantum computers comes from the execution of two-qubit gates. The overall effect of this noise can be clearly seen in Fig. 7b), where we plot the KS statistic against the total number of two-qubit gates in the associated quantum circuit. While there can be significant variance in the noise from circuit to circuit, likely originating from small fluctuations in system performance over time, the overall trend is consistent. The KS statistic ranges from 0.09 to 0.25 in the worst case, and appears to increase linearly with the total number of two-qubit gates in the circuit.
Note that while we did not extend this pattern down to zero gates, the y-intercept is expected to be nonzero due to the presence of state-preparation and measurement (SPAM) errors. While we expect SPAM errors to be relatively low on ion trap quantum computers, they still likely make a nontrivial contribution to the overall error. Also, we cannot reliably apply our state preparation procedure in the zero-gate limit, so the approximation error may be a large contributor to the overall error in this limit. Therefore, the zero-gate extrapolation error is likely a combination of both SPAM errors and the state preparation approximation error.
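The zero-gate extrapolation described above amounts to a simple linear fit of the KS statistic against two-qubit gate count. The sketch below uses illustrative data points (not the measured values, which appear in Fig. 7b)) spanning the reported ranges of roughly 18-114 gates and KS values of 0.09-0.25:

```python
import numpy as np

# Illustrative (not measured) data points in the reported ranges:
# KS statistic grows roughly linearly from ~0.09 to ~0.25 as the
# number of two-qubit gates grows from ~18 to ~114.
gates = np.array([18, 38, 54, 76, 114])
ks = np.array([0.09, 0.12, 0.15, 0.19, 0.25])

slope, intercept = np.polyfit(gates, ks, 1)
# The zero-gate intercept bundles SPAM error together with the MPS
# approximation error, as discussed in the text.
print(f"KS increase per two-qubit gate: {slope:.5f}")
print(f"extrapolated zero-gate KS: {intercept:.3f}")
```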

Discussion
Our paper gives a procedure that produces a unitary U such that U|0^N⟩ has probability amplitudes proportional to a normal distribution.
Our circuit for U does not involve any ancilla qubits and succeeds with certainty. This has a number of advantages over other methods in applications where U needs to be applied iteratively 1. If an algorithm only applies U successfully with probability 1 − δ 13,30, then m sequential applications of U would succeed with probability (1 − δ)^m, which may not be sufficient for a practical application. If an algorithm uses k ancillary qubits 12,13,30 such that the ancillary qubits cannot be discarded or reused between successive applications of U, then m sequential applications of U would need km ancillary qubits, which is undesirable since quantum computers will be resource-constrained for the foreseeable future.
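To make the compounding concrete, even a modest per-call failure rate erodes quickly over repeated applications:

```python
# A protocol that succeeds with probability 1 - delta per call
# succeeds on all of m sequential calls with probability (1 - delta)^m.
delta = 0.01
for m in (10, 100, 1000):
    print(m, (1 - delta) ** m)  # m = 100 already gives only ~0.366
```

This is why a deterministic, ancilla-free construction is preferable whenever U appears inside an iterated subroutine such as amplitude estimation.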
Our Matrix Product State based procedure, on the other hand, uses only N qubits to prepare a discretized probability distribution on 2^N points, and succeeds with probability 1 in the limit of noiseless quantum gates. The trade-off for these advantages is that we can only approximately prepare the distribution to fixed accuracy, using a quantum circuit with a number of gates linear in the number of qubits. The main sources of this error are approximating our target function with a piece-wise polynomial function and approximately preparing the low bond dimension MPS wave function with a short-depth quantum circuit.
Throughout this work, we carefully analyzed the error originating from both of these sources. We chose to approximate the normal distribution using the order-n Irwin-Hall distribution, which gives a fast, deterministic method for generating the appropriate unitary circuit U. We found that the error from this approximation decreases polynomially with n. In order to approximately encode this Irwin-Hall distribution into the wave function amplitudes using a short-depth circuit, we use the iterative circuit construction method of Ref. 26. While our method does not give the best theoretical complexity in terms of ε (as defined in Section 2.1), the short circuit depth makes it far more appealing than other methods with better asymptotic complexity. Moreover, intrinsic restrictions on hardware accuracy and discretization errors arising from floating-point arithmetic mean that it may not be possible to reach infinitesimal ε in a practical application.
We also note that the circuit structure of our algorithm, as depicted in Fig. 1, assumes only nearest-neighbour interactions on the underlying device. This implies that our algorithm can be implemented on a wide range of hardware architectures, such as trapped ion devices and superconducting devices, without any additional swap gates.
We experimentally validated this loading technique on a trapped ion quantum computer for a range of circuits with varying width and depth. We were able to successfully generate the target wave functions on circuits with up to 20 qubits. Furthermore, the measured KS statistics were comparable to previous state preparation experiments 9, while acting on circuits with many more qubits. We also studied the scaling of the KS statistic with circuit depth, and found that, as expected, hardware noise is currently the largest source of error in this state preparation procedure, limiting the effectiveness of applying higher-depth loading circuits. With our current experiments, we are approaching the capacity to generate single precision random quantum variables, although we expect a large improvement in gate-error rates is still required to take full advantage of such high precision numbers.
A number of possible directions stemming from this work immediately present themselves for future research. One possibility is to study improvements to the MPS approximation method for low depth circuit construction. Another is to extend this work to study more general probability distributions. This involves searching for simple piece-wise polynomial representations of these distributions with controllable approximation errors. Finally, we leave to future research a full end-to-end implementation of a quantum amplitude estimation algorithm which makes use of this state preparation subroutine, and an associated analysis of the error rates required to achieve quantum advantage using these approaches.

DATA AVAILABILITY
The data supporting the study's findings are available from the corresponding author upon reasonable request.

CODE AVAILABILITY
The code supporting the study's findings is available from the corresponding author upon reasonable request.

ACKNOWLEDGEMENT
This work is a collaboration between the Fidelity Center for Applied Technology, Fidelity Labs, LLC., and IonQ Inc. The Fidelity publishing approval number for this paper is 1075809.2.0.

tensors, and we use the Einstein summation convention that repeated indices are summed over. Each tensor contains a 'physical' index σ_i ∈ [1, d] and 'bond' indices α_i ∈ [1, χ] 17. Here d is the local dimension of the quantum state, so that d = 2 for qubits.
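The contraction of these tensors into the full state vector can be sketched with NumPy's einsum; the toy tensors below (shapes and index labels chosen for illustration) are random, not the paper's MPS:

```python
import numpy as np

# A toy MPS for N = 3 qubits with bond dimension chi = 2 and local
# dimension d = 2. Tensor shapes: (d, chi), (chi, d, chi), (chi, d).
rng = np.random.default_rng(0)
A1 = rng.standard_normal((2, 2))
A2 = rng.standard_normal((2, 2, 2))
A3 = rng.standard_normal((2, 2))

# Contract the bond indices (Einstein summation over the repeated
# labels a and b) to recover the full 2^N amplitude tensor.
psi = np.einsum('sa,atb,bu->stu', A1, A2, A3).reshape(-1)
psi /= np.linalg.norm(psi)  # normalize the state vector

print(psi.shape)  # (8,) -- all 2^3 amplitudes
```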

f(x) defined on an interval [a, b]. A discretized version of this function can be encoded into the amplitudes of an N-qubit quantum register. Specifically, throughout this work, we use the big-endian binary encoding on the interval [a, b], such that
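The encoding can be sketched as follows; the exact grid convention (endpoint handling, any half-cell offset) is an assumption here and may differ from the paper's Eq. for the encoding:

```python
import numpy as np

def encode_function(f, a, b, n_qubits):
    """Discretize f on 2^N grid points over [a, b) and return the
    normalized amplitude vector. The grid index j is read as a
    big-endian bit string, matching the encoding used in the text."""
    grid = np.linspace(a, b, 2 ** n_qubits, endpoint=False)
    amps = f(grid)
    return amps / np.linalg.norm(amps)

# Example: an (unnormalized, truncated) standard normal pdf
# on [-4, 4] encoded with 6 qubits.
amps = encode_function(lambda x: np.exp(-x**2 / 2), -4.0, 4.0, 6)
print(len(amps))                  # 64 amplitudes
print(np.sum(np.abs(amps) ** 2))  # 1.0
```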

Figure 1. The circuit structure for preparing an MPS wave function. a) An MPS state with all bond dimensions equal to 2 can be exactly constructed using (N − 1) 2-qubit unitary operators plus one single-qubit rotation, arranged in the pictured order. b) A higher bond dimension MPS state can be approximately prepared by repeatedly applying single-layer MPS approximations, following the construction of Ref. 26. c) If the entries of the MPS state are real, then all 2-qubit gates U are O(4) operators and can be implemented using single-qubit rotations and only 2 CNOT gates if det(U) = +1, or d) 2 CNOT gates plus a SWAP operator if det(U) = −1.

Figure 2. Distance between Irwin-Hall distributions and normal distributions. (a) Distance in pdf between the Irwin-Hall distribution and the normal distribution as the Irwin-Hall order n grows. (b) Distance in cdf between the Irwin-Hall distribution and the normal distribution as the Irwin-Hall order n grows.

Figure 3. Error (sup norm) in cdf when comparing the discretized Irwin-Hall and discretized normal distribution pdfs to the undiscretized normal distribution.

Figure 4. The error introduced by the matrix product state circuit when approximating the target ideal normal distribution. a) The error between the distribution of the prepared wave function and the target Irwin-Hall distribution as the number of MPS layers increases; the error decays as 1/D^α for large D with α = 1. We show both the wave function infidelity and the KS statistic between the output probability distributions. b) The KS statistic between the prepared distributions and the ideal normal distributions for different Irwin-Hall orders as a function of MPS circuit depth. These plots are based on simulations of the quantum circuits with N = 14 qubits.

Figure 5. Distance between the ideal state and the Irwin-Hall approximated state in L2 norm (ε as defined in Eqn. 5).

Figure 6. Histograms of measurements for normal distributions which were prepared on the quantum computer. Experiments were run on IonQ Aria on circuits with between 10 and 20 qubits, with MPS depth D between 1 and 3. Each circuit uses 2 × (N − 1) × D CX gates, and 10000 shots were taken on each circuit. Here we show three illustrative implementations, with (a) N = 10, n = 8, D = 1, (b) N = 20, n = 16, D = 1 and (c) N = 10, n = 16, D = 3. In all figures, samples taken from the QPU are shown in orange, and can be compared to an equal number of samples drawn from the noiseless simulator in blue, and to the ideal pdf shown in gray.

Figure 7. The KS statistic between the ideal distribution and the distribution sampled from the experimental implementation. a) Scaling of the KS statistic with circuit width (number of qubits) for MPS depths D = 1, 2, 3 and n = 16. As explained in the text, with current device noise levels, optimal state preparation occurs with D = 1. b) Scaling of the KS statistic with the number of two-qubit gates demonstrates that these circuit operations are the dominant source of noise in the state preparation procedure.
In this case, Ref. 27 explicitly gives the elements of the matrices M in terms of the coefficients a_i.