Precise programmable quantum simulations with optical lattices

We present an efficient approach to precisely simulate tight binding models with optical lattices, based on programmable digital-micromirror-device (DMD) techniques. Our approach consists of a subroutine of Wegner-flow enabled precise extraction of a tight-binding model for a given optical potential, and a reverse engineering step of adjusting the potential for a targeting model, for both of which we develop classical algorithms to achieve high precision and high efficiency. With renormalization of Wannier functions and high band effects systematically calibrated in our protocol, we show the tight-binding models with programmable onsite energies and tunnelings can be precisely simulated with optical lattices integrated with the DMD techniques. With numerical simulation, we demonstrate that our approach would facilitate quantum simulation of localization physics with adequate programmability and atom-based boson sampling for illustration of quantum computational advantage. We expect this approach would pave a way towards large-scale and precise programmable quantum simulations based on optical lattices.


INTRODUCTION
Quantum simulation and quantum computing have been attracting tremendous attention in recent years. Among the rapidly advancing quantum hardwares 1 , cold atoms provide a quantum simulation platform with excellent controllability and scalability [2][3][4][5] . In the last two decades, cold atom based quantum simulations have achieved fantastic progress not only along the line of conceptually exotic physics such as artificial gauge fields [6][7][8] , and topological matters 9 , but also along the line of simulating computationally difficult problems such as BEC-BCS crossover 10 , High-Tc physics [11][12][13][14][15] , and non-equilibrium dynamics 16 , where its exceptional quantum advantage has been demonstrated.
In quantum simulations aiming for demonstration of fundamental physical concepts, it is not crucial to precisely calibrate the system. However, in order to use quantum simulations to solve computationally difficult problems, it is required to make the simulation precise-for example in the study of quantum criticality and in solving spin-glass problems, the physical properties of interest are sensitive to Hamiltonian parameters. And in quantum simulations of many-body localization using an incommensurate optical lattice, it has been found that calibration problems cause qualitative disagreement [17][18][19][20] with the targeting Aubry-Andre (AA) model 21,22 . This issue also arises generically in using speckle-pattern induced disorder optical potentials to simulate localization physics [23][24][25][26][27][28][29][30][31][32][33][34] , as the onsite energies and tunnelings are not programmable, let alone the simulation precision.
Here, we consider integration of the recently developed DMD techniques in controlling optical potentials 13,[35][36][37][38] to optical lattices, and calibrate the platform towards precise programmable quantum simulations. We develop an efficient algorithm, which can systematically construct an inhomogeneous optical potential to precisely simulate a given tight binding lattice model, i.e., both the onsite energies and the tunnelings are made precisely programmable. Its efficiency relies on the physical locality. For benchmarking, we provide detailed numerical results for AA and Anderson localization (AL) models, where we show our approach has adequate programmability and systematically eliminates calibration errors. We show that our approach can also be used to implement atom-based quantum sampling algorithms such as boson sampling 39,40 and determinantal point process 41,42 , having promising applications to quantum machine learning. Our protocol provides precise programmability to the quantum platform of optical lattice, which is intrinsically demanded for quantum simulations aiming for computationally difficult problems.

Theory setup
For atoms confined in an optical potential, the Hamiltonian description is Here we have separated the optical potential into a primary part V p ðxÞ ¼ Vp 2 cosð2kxÞ created by standard counter propagating laser beams and an additional potential V D (x) created by DMD 13,[35][36][37][38] or sub-wavelength potential 43,44 techniques. The primary part has lattice translation symmetry with the lattice spacing determined by the forming laser wavelength. Hereafter, we use the lattice constant a = π/k as the length unit and the photon recoil energy of the lattice E R = ℏ 2 k 2 /2m as the energy unit. The added potential V D (x) in general has no homogeneity, and with the present technology it is typically much weaker than the primary lattice. A targeting tight-binding Hamiltonian matrix for the continuous system to simulate is referred to as H ? , which contains onsite energies ϵ i and tunnelings J hii 0 i , with i; i 0 labeling lattice sites determined by the primary optical potential. In the following, we describe our numerical method to reverse engineer V D (x) and V p (x) that makes the precise tight-binding model description of H in Eq. (1) our target, H ? . 1 Firstly, we describe our method for efficient extraction of a tight-binding model of the continuous Hamiltonian H to obtain a single band effective Hamiltonian. Without the inhomogeneous potential V D (x), the precise tight binding model of the system can be efficiently constructed by introducing Bloch modes, because different modes with different lattice momenta are decoupled due to lattice translation symmetry. We denote the Hamiltonian matrix in the Wannier function basis {w m (x − x i )} as H mi;m 0 i 0 , with m; m 0 labeling different bands running from zero (lowest band) to a high-band cutoff M c , and i; i 0 the Wannier function localized centers (or equivalently the lattice sites of the primary lattice). This Hamiltonian matrix takes a block diagonalized form with the decoupled blocks corresponding to different bands 45,46 . However, in the presence of an inhomogeneous potential V D (x), the generated matrix elements R dx w m ðx À x i ÞV D ðxÞw m 0 ðx À x i 0 Þ induce inter-band couplings. We propose to use Wegner flow 47,48 to decouple different bands, which then produces a precise tightbinding model.
Following the flow from l = 0 to +∞, HðlÞ converges to a matrix that commutes with G because Tr½HðlÞ À G 2 ! 0; d dl Tr½HðlÞ À G 2 ¼ À2Tr½η y ðlÞηðlÞ 0: This means the coupling between the m = 0 block of the matrix H and other blocks monotonically converges to 0. A more thorough analysis shows an exponential convergence with a convergence speed inversely proportional to the band gap (see "Methods" section). We remark that although there are different ways of choosing the generator η, other generators may not produce a Wegner flow with strict exponential convergence, which we prove for the particular generator used here. This means our approach is applicable as long as the inhomogeneous potential V D (x) is not too strong to close the band gap. The finite-depth flow equation generates a local unitary that defines a precise tight-binding model as the converged m = 0 Hamiltonian block, which is denoted as the single band effective Hamiltonian H eff .
Secondly, we develop a numerical optimization method to adjust the potential V D (x) to minimize the difference between H eff and H ? . We choose a Frobenius-norm based cost function f = f 0 + λ 1 f 1 , where f 0 and f 1 are Frobenius norms for the difference in the onsite energies and tunnelings, respectively, and a hyperparameter λ 1 is introduced to afford extra weight to the tunneling for better optimization-performance. In our numerics, we parameterize where L is the number of periods of the primary lattice, and e V n , e ϕ n are variational parameters. The maximal spatial frequency of the DMD potential is about twice over the frequency of the primary lattice. Regarding the phase stabilization, it is required to stabilize the phase between the DMD and the primary lattice potential, which has been achieved in recent experiments 49 . We start from a random initialization, obtain H eff through Wegner flow, and then update the optical potential through a gradient descent method. This procedure is iterated until the cost function is below a threshold of our request. Furthermore, our method is highly efficient by making use of locality. Considering a system with large system size, instead of performing the Wegner-flow for the full problem which then has a computation complexity of O(L 3 ), we split the system into small pieces, with an individual length L p . The adjacent pieces have about one third of their length overlapped with each other. We optimize the optical potential to reproduce the precise tightbinding model piece-by-piece, and then glue them together. This is sensible because of the locality in the problem-the onsite energy at one site and the tunnelings between two sites are both determined by their neighboring potential, following the finitedepth Wegner flow. Note that one problem arises that the potential may not be smooth in the overlapping regions, as the obtained potential could be inconsistent in optimizing the two adjacent pieces. To solve this problem, we add λ 2 f 2 to the cost function, where f 2 is the Frobenius norm of the difference of the potential in the overlapping region obtained in the optimization of its belonging two pieces (see "Methods" section). The piece-by-piece procedure is swept back-and-forth for convergence, analogous to the optimization in the standard density-matrix-renormalization group calculation 50 . In the sweeping process, we find a monotonic decrease in the difference between H eff and H ? in the whole system, and that the converged optical potential is smooth. The computation complexity scaling is thus reduced to O(L).
Although this work focuses on one-dimensional lattices, the developed approach is adaptable to high-dimensional lattices as well (see Supplementary Note 1 and Supplementary Fig. 1). The sweeping process in the piece-by-piece optimization has more choices for higher-dimensional systems. How to perform the sweeping in an optimal way is worth future study.
Application to quantum simulation of AA model In the study of quantum localization physics, AA model has been investigated extensively in both theory and experiment [17][18][19][20]28,49,[51][52][53][54][55][56] . Its Hamiltonian reads as where b y i (b i ) denotes the creation (annihilation) operator on a lattice site i, α is an irrational number, J AA is the site-independent tunneling, ϵ AA describes the strength of the onsite energies, and ϕ is an arbitrary phase. Here, we choose α as the golden ratio ð ffiffi ffi 5 p À 1Þ=2, which is approximated by the Fibonacci sequence (F n ) as F n /F n+1 in a finitesize calculation. Because of its energy independent duality defined by a Fourier transform, the model exhibits a phase transition from all wave-function localized to all extended, which makes it natural place to examine one-dimensional localization criticality.
In the optical lattice experiment 17 , the AA model Hamiltonian is achieved by using an incommensurate bichromatic potential, a primary lattice perturbed by a second weak incommensurate lattice with V D;rough ðxÞ ¼ V 1 cosð2αkxÞ=2 following our notation in Eq. (1). However, its corresponding tight-binding model is not a precise AA model-there are corrections making tunnelings inhomogeneous and generating higher-order harmonics, which generically breaks the central ingredient of duality of the AA model 57 . The effects of such corrections have been established both in theory 18 and experiment 19,20 . This problem can be solved by using our precise quantum simulation method.
Taking the general form of V D (x) in Eq. (4), we find the variational minimization for the precise quantum simulation of the AA model automatically reduces to a more specific form since all other parameters except e V 1;2 in Eq. (4) are found to X. Qiu et al.
vanish. As an example, we consider a specific model H ? AA with parameters J AA = −0.0308E R , ϵ AA = 0.0841E R , α ≈ F n /F n+1 (F n = 34 and F n+1 = 55), and ϕ = −πα. This target model is reached by Fig. 1a, we show the optical potentials corresponding to V D,rough and V D,precise for comparison. We find that the resultant onsite energies are approximately the same (Fig. 1b), yet with the potential V D,precise giving a more precise solution. More drastically, the tunnelings out of our potential with V D,precise (x) are precisely homogeneous, with a relative inhomogeneity below 10 −4 (Fig. 1c). This cannot be achieved with the potential of V D,rough (x).
We also emphasize here that our constructed potential V D, precise (x) possesses a vanishing derivative at the individual sites, as exhibited in Fig. 1a. This is crucial to experiments as a potential with finite derivative at the position of atoms would make the system more susceptible to shaking-induced heating processes 49 .
Anderson localization with programmable disorder potential To further demonstrate the precise programmability enabled by our method, we also carry out an application to quantum simulation of Anderson localization models whose previous experimental realization by speckle pattern lacks programmability [23][24][25][26][27][28][29][30][31][32][33][34] . The Hamiltonians of 1D AL models are given as We consider three different cases: (a) random onsite model with h i 2 ½Àe ϵ AL =2; e ϵ AL =2 and t i = J AL being homogeneous, (b) random hopping model with h i = ϵ AL homogenous and t i 2 ½J AL À e J AL =2; J AL þ e J AL =2, and (c) both onsite energies and tunnelings being random with h i 2 ½Àe ϵ AL =2; e ϵ AL =2 and t i 2 ½J AL À e J AL =2; J AL þ e J AL =2. The random onsite energies and tunnelings are drawn according to a uniform distribution. In Fig. 2, we show all the three different cases of AL model can be precisely achieved with our optimization method. The absolute errors in the tight-binding model compared to the target one is made smaller than 10 −5 in units of recoil energy, which demonstrates the precise programmability of our scheme.
Regarding the efficiency of the optimization method, we find the cost function is typically evaluated for 10 4 times before reaching convergence. For the actual CPU time cost, the results in Fig. 2a are obtained in 30 s with MATLAB implementation on an Intel processor of 8-core Xeon E5-2640 v4 CPU at 2.40 GHz, similarly for Fig. 2b,c. The method is thus confirmed to be highly efficient.
One immediate application of the programmable quantum simulation of Anderson localization is to study the anomalous localization in the random hopping model. Unlike the random onsite model, where all states are localized in one dimension, the random hopping model has delocalized states at band center 58,59 . But it is extremely difficult to perform quantum simulation of this pure random hopping model with the speckle-pattern approach lacking programmability, since the unavoidable inhomogeneity in the onsite energy will make all states localized. We randomly generate 2000 disorder samples for the hopping, and compute the corresponding potential V D (x) using our optimization method. The averaged inverse participation ratio (IPR) which diagnoses localization to delocalization transition 60 is calculated, with the results shown in Fig. 3. We find quantitative agreement of results obtained for the continuous potential with the targeting tightbinding model. The discrepancy can be further improved by increasing the lattice depth or allocating more numerical resources.
Implementation of boson sampling and determinantal point process Boson sampling is a promising candidate to demonstrate quantum computational advantage for its established exponential complexity on a classical computer 39,40,61 . Its experimental implementation has been achieved in linear photonic 62 , trapped ion 63 , and quantum-dot devices 64 . Here we show that boson sampling could also be implemented with bosonic atoms confined in an optical lattice using our developed precise programmability. One advantage of atomic realization is that one can replace bosonic atoms by their fermionic isotopes, which then performs quantum sampling for determinantal point process 41 . This then provides one way to verify the quantum advantageous boson sampling because the simulation of determinantal point process is efficient on a classical computer 41,42 .
Here, we consider a standard boson sampling problem with m input modes and n identical bosons, where the n bosons are oneto-one injected into the first n modes as the input state, and then let evolve under an m × m Haar-random unitary U. In the dilute limit (n ≪ m), where each output mode contains at most one particle, the probability of a specific output Fock-state configuration S is p(S) = |per(U S )| 2 , with per meaning the permanent, and U S a submatrix of U selected according to the input and output configurations 39 .
To experimentally realize the Haar-random unitary U with an optical lattice, we adapt the decomposition in ref. 65 , where the random unitary is constructed by multiplication of a series of building blocks of two-mode unitary operations. For the optical lattice implementation, we develop a different construction from photonic realization 66 (see "Methods" section). We choose the two-mode building blocks as Here we have the Pauli matrices σ ðp;qÞ x ¼ q j i p h j þ p j i q h j and σ ðp;qÞ z ¼ q j i q h j À p j i p h j, p, q ∈ {1, 2, …, m}, the quantum states p j i and q j i represent the Wannier functions in the optical lattice, and AA . Here, we choose the hyperparameter λ 1 = 1, the system size L = 55, and periodic boundary condition. In our numerical calculation, we choose high-band cutoff M c = 2, and confirm that the including higher bands do not affect the presented results.  =ϵ z , which are positive with proper construction (see "Methods" section). For p − q > 1, a more involved construction is required, which is provided in "Methods" section. The building blocks of T (p, q) ultimately realize any random unitary, and the total evolution time is just the summation of all the evolution time of the building blocks.
For experimental implementation, we consider Li atoms 67,68 confined in a lattice formed by a laser with wavelength 1064 nm, the recoil energy is E R = 2π × 25.12 kHz with 7 Li. For a mode number m = 10 and number of atoms n = 3, we find that the average total evolution time is estimated to be 0.1 second taking J x = −0.01E R and ϵ z = 0.2E R . We emphasize here that the current lifetime of cold atoms is about one second, which permits quantum sampling with the mode number as large as m = 30. We also confirm that the overall harmonic potential confinement in standard cold atom experiments can be corrected by our precise quantum simulation scheme (see Supplementary Note 2 and Supplementary Fig. 2), which is required in performing programmable quantum sampling. Denoting the probabilities corresponding to the theory and the simulated DMD-based experimental realization as p 1 (S) and p 2 (S), respectively, the sampling precision is characterized by a measure of similarity G ¼ P S ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi p 1 ðSÞp 2 ðSÞ p , and a measure of distance D = (1/2) P s |p 1 (S) − p 2 (S)|. The numerical results are shown in Fig. 4a. We find quantitative agreement between the simulated experimental realization and the theory prediction, which implies the precision achieved with our scheme is adequate to perform boson sampling experiments.
We also study the case with fermionic atoms, which then realize the determinantal point process 41 . The results are shown in Fig.  4b, where we also find the quantitative agreement between the simulated experimental realization and the theory prediction. It is worth noting here that even when the classical simulation of boson sampling is unavailable for a large particle number, the experiment with fermions allows one way to detect errors when the device is erroneous, since the determinantal point process can be efficiently simulated on both classical computers 41,42 and quantum devices. In Fig. 4c, we set an error in the Hamiltonian (1) compared to the target one. Here, we choose the hyper-parameter λ 1 = 100, the system size L = 10, the depth of the primary lattice V p = 8E R , and periodic boundary condition. In our numerical calculation, we choose high-band cutoff M c = 2, and confirm that the including higher bands do not affect the presented results.  (1) and the tight binding randomhopping model, respectively. Here, we choose the hyper-parameter λ 1 = 100, λ 2 = 0.5, the system size L = 100, the depth of the primary lattice V p = 8E R , and periodic boundary condition. The whole system is split into a number of pieces with L p = 10, and the adjacent pieces overlap with each other over 4 sites. In our numerical calculation, we choose high-band cutoff M c = 2, and confirm that the including higher bands do not affect the presented results. x , whose amplitude is deliberately set to be 70% of the correct value in order to check whether the error can be detected. We find that the error is indeed detectable by comparing the quantum sampling with the exact results.

DISCUSSION
We have proposed a scheme for precisely simulating lattice models with optical lattices, whose potentials can be manipulated through the high-resolution DMD techniques. We have developed a Wegner-flow method to extract the precise tight-binding model of a continuous potential, and a scalable optimization method for the reverse engineering of the optical potential whose tight binding model precisely matches a targeting model. The performance is demonstrated with concrete examples of AA and Anderson models, and quantum sampling problems. Our approach implies optical lattices can be upgraded towards high-precision programmable quantum simulations by integrating with DMD techniques.
The precise programmable quantum simulation enabled by our scheme make the optical lattice rather flexible. For disorder physics, having programmable disorder allows for more systematic study of the localization transition, especially for cases where the rare disorder Griffith effects are important for example in understanding disordered Weyl semimetals 69 , and many-body localization mobility edge 70,71 .
Our proposing setup also paves a way to building a programmable quantum annealer with optical lattices. Considering spinor atoms in a deep lattice with strong interaction, programmable tunnelings imply programmable spin-exchange. For interacting systems, our approach is directly adaptable in the limit of interaction much weaker than the band gap, because the Wannier functions are easily accessible in the Wegner flow (see Supplementary Note 3 and Supplementary Fig. 3). We expect the approach can be further generalized for generic interacting systems, which is worth future investigation.

Exponential convergence of Wegner flow
As the efficiency of our method relies on the convergence behavior of Wegner flow, in this section we prove that the convergence is exponential, and that the convergence speed is inversely proportional to the band gap -it has a lower bound inversely proportional to the band gap to be more precise.
Note that we use Wegner flow to decouple the lowest band from the rest. The flow converges when the coupling between the lowest and excited bands vanish. To analyze such couplings, we rewrite the Hamiltonian matrix in terms of the lowest and excited band blocks and their couplings as Following our constructed Wegner flow, we have the flow equation for the coupling matrix C as dC dl ¼ D ð0Þ C À CD ð1Þ : The overall strength of these couplings in C are quantified by the trace Tr½C y C, whose l-dependence obeys d dl Tr½C y C ¼ 2Tr½CC y D ð0Þ À C y CD ð1Þ < À2 d Having a finite gap between the lowest and the first excited bands, we have d ð1Þ min À d ð0Þ max > 0. The exponential convergence of the couplings between the lowest and excited bands is then assured. The convergence speed is larger than a value inversely proportional to the band gap. In practical calculations, the Frobenius norm of the Wegner flow generator η flows to 10 −6 after 300 calculations of commutators of sparse matrices for the results presented in the paper.
The piece-by-piece optimization method In this section, we provide the details of the piece-by-piece optimization method. Taking a system having L number of periods-the period is defined according to the primary lattice, the starting points of the periods are labeled as (X 0 , X 1 , X 2 , …, X L−1 ). We split the system into smaller pieces with a piece-size L p . Two adjacent pieces have a finite overlap region with size M p . The i-th piece contains the periods from X iðLpÀMpÞ to X iðLpÀMpÞþLp . Its overlap with the lefthand [righthand] side (i − 1)-th [(i + 1)-th] piece is from X iðLpÀMpÞ to X ðiÀ1ÞðLpÀMpÞþLp [from X ðiþ1ÞðLpÀMpÞ to X iðLpÀMpÞþLp ]. In optimizing the optical potential at i-th piece for the targeting tight-binding model in that local region, we introduce an additional cost function λ 2 f 2 , with λ 2 a hyper-parameter, and where V D,i (x) is the variational potential in optimizing the i-th piece. The f 2 cost function is introduced to minimize the inconsistency of the potential in the overlap region with the neighboring (i − 1)-th and (i + 1)-th pieces. Since the constraint on the consistency is not implemented strict, there will still be leftover inconsistency between V D,i (x) and V D,i ± 1 (x) in a single run. To solve this problem, we perform a back-and-forth sweeping processwe first carry out optimization in a forward direction from the leftmost piece to the rightmost, and then in a backward direction from the rightmost to leftmost. This sweeping process is iterated for potential convergence. In our numerics, we find convergence with three to four times of sweeping. We then glue all the pieces together and construct the global optical potential. It is confirmed that this procedure gives the correct potential whose tight binding model is the targeting model.

Decomposition of a Haar-random unitary with optical lattice accessible operations
Here we describe how to adapt the decomposition of the Haar-random unitary in ref. 65 to optical lattice implementation. An m × m Haar-random unitary U(m) is decomposed into The order of matrix multiplication using ∏ is defined to be from left to right, for example Here, T (p,q) and e U through Eq.  with ϵ z = 0.2E R . c H d ¼ i 8π log ðU diag Þ, with U diag defined in Eq. (14), corresponding to the decomposition of the Haar-random unitary in Fig. 4 (see the notation in "Methods" section). The first row shows the reverse-engineered optical potentials V D (x). The middle row shows the onsite energies (circles) and the tunnelings (crosses) of the targeting tight binding model. The 'dashed' and 'dash-dotted' lines mark the zero value of on-site energies and tunnelings, respectively. And the last row shows the absolute errors in the tight-binding models extracted from the continuous Hamiltonian in Eq. (1) compared to the target one. Here, we choose the hyper-parameter λ 1 = 20 and the depth of the primary lattice V p = 20E R . In our numerical calculation, we choose high-band cutoff M c = 2, and confirm that the including higher bands do not affect the presented results.
It is straightforward to show that U (n + 1,n) and its hermitian conjugate can also be obtained through time evolution operators, i.e., We see that in order to build a general m × m Haar-random unitary U(m), both the number of H x and H z gates we need are (m − 1)(3m − 4)/2. And also a gate U diag is needed, which can be achieved through evolving the Hamiltonian H d ¼ i 8π log ðU diag Þ with the time τ d = ℏ8π/E R . In Fig. 5, we show all the Hamiltonians of typical quantum gates can be precisely achieved with our optimization method, and the absolute errors in the tight-binding model compared to the target one is made smaller than 10 −5 in units of recoil energy.

DATA AVAILABILITY
Data is available upon reasonable request.