Efficient world-line-based quantum Monte Carlo method without Hubbard–Stratonovich transformation

By precisely writing down the matrix element of the local Boltzmann operator (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathrm{e}}^{-\tau h}$$\end{document}e-τh, where \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$h$$\end{document}h is the Hermitian conjugate pairs of off-diagonal operators), we have proposed a new path integral formulation for quantum field theory and developed a corresponding Monte Carlo algorithm. With the current formula, the Hubbard–Stratonovich transformation is not necessary, accordingly the determinant calculation is not needed, which can improve the computational efficiency. The results show that, the simulation time has the square-law scaling with system sizes, which is comparable with the usual first-principles calculations. The current formula also improves the accuracy of the Suzuki–Trotter decomposition. As an example, we have studied the one-dimensional half-filled Hubbard model at finite temperature. The obtained results are in excellent agreement with the known solutions. The new formula and Monte Carlo algorithm could be applied to various studies in future.

www.nature.com/scientificreports/ as well as determinants, increases the computation time heavily, which results in the computation time scaling cubic with the system size. In parallel to determinantal formulation mentioned above, the world-line formulation represents another catalog of QMC [47][48][49][50][51] . In the world-line QMC, the direct-space and imaginary-time is used to interpolate the representation of the fermion fields. The advantage of the world-line QMC is avoiding the time-consuming process of evaluating fermion determinants. However, in the world-line QMC, a closed path with non-zero weight is not always easy to sample in many-body wavefunction (WF) spaces. At present, several successful examples seem to be limited to a few specific Hamiltonians, and a universal algorithm is waiting to propose. To further reduce the gap between the experimental studies and QMC calculations, more efficient numerical methods are needed to meet the current demands for studying SCQMB systems. Thus, there is an urgent need to either develop new numerical methods or optimize known ones. In this paper, we propose a new world-line QMC method by introducing a representation of the path integral formula in quantum field theory. This method can be used to calculate various properties of a system at finite temperature. The new formulation does not require the HS transformation, and not require determinants, therefore the calculation time can be significantly reduced. The results of the test calculation on the one-dimensional Hubbard model are in excellent agreement with exact values 52,53 .

Proposed formula
To illustrate our method, we chose the simple but representative Hubbard model as an example. It is worth noting that our method can be directly extended to any model or Hamiltonian. The Hamiltonian of the Hubbard model reads: where c + iσ (c iσ ) denotes the creation (annihilation) of an electron with spin σ =↑, ↓ at the i-th lattice site. The first term on the right-hand side of Eq. (1) represents the one-body term, which accounts for the hopping of electrons between different sites, and t > 0 is the hopping amplitude. The second term is the two-body on-site Coulomb interaction, where U represents the interaction strength, and n iσ = c † iσ c iσ is the number operator for spin σ at the site i. For convenience, the spin index ( σ ) is omitted in the following description; yet it is included later on to prevent confusion.
One of the key steps of our method is to combine each off-diagonal term in the Hamiltonian and its Hermite conjugate into pairs, namely h ij = −t c † i c j + c † j c i . Clearly h ij is therefore a Hermitian operator. In the case of a general Hamiltonian, h ij can be made up of the pair of a quartic fermion operator and its Hermitian conjugation. The purpose of this combination is to make h ij as a Hermitian operator, and its eigenfunction can be easily obtained.
A many-body WF in the occupation number representation is labeled as |ijK� . Here, the occupancy of the site i and j is explicitly given, while the occupancy of the rest of the sites is represented by K. For the site i and j, the occupation has four cases, which are labeled as |ijK�, |ijK�, |ijK�, and|ijK� . Here, i(j) indicates that there is no electron occupying the site i (j) site, while i (j) indicates an electron occupying the site i (j) site.
It is easy to prove that h ij has only two eigenstates with non-zero eigenvalues: where the eigenvalues are equal to −θ ij t and θ ij t , respectively. θ ij is the sign produced by the particle exchange as h ij acts on |ijK� . When an even number of exchanges occur, θ ij = 1 , otherwise, θ ij = −1 . Since c † i c j ijK� = θ ij ijK�, c † j c i |ijK� = θ ij |ijK� and c † i c j |ijK� = 0, c † j c i |ijK� = 0 , we have h ij |ϕ ij � ± = ∓θ ij t|ϕ ij � ± . The remaining WFs orthogonal to |ϕ ij � + and |ϕ ij � − are also the eigenstates of h ij , however, the corresponding eigenvalues are zero.
Since the operator h ij and e −τ h ij must share the common eigenvectors, the non-zero matrix elements of the local Boltzmann operator (LBO, e −τ h ij , where τ can be any number) are only present in the following cases: where the remaining matrix elements of e −τ h ij are equal to zero. Since the operator e −τ Un i↑ n i↓ is diagonal, the non-zero matrix element reads: Since t > 0 , the matrix elements in Eqs. (3)-(4) are always positive, regardless of the value of θ ij . The matrix element in Eq. (2) is negative if θ ij is negative, otherwise, it is positive. Equation (2) produces the off-diagonal scattering in WF for the site i and j, but Eqs. (3)-(4) are the diagonal scattering in WF. www.nature.com/scientificreports/ The partition function of a quantum many-body system is expressed as Z = Tr e −βH , where β is the inverse temperature (or imaginary time) and Tr refers the trace of e −βH in WFs space. According to the standard path integral formula, the imaginary time (β) is divided into m time slices, where the partition function becomes Z = Tr e −τ H m , with the time step τ = β/m . Using the Suzuki-Trotter decomposition [44][45][46] , the operator e −τ H can be further decomposed as.
Finally, the partition function reads: To calculate the partition function, a complete set of states are inserted between LBOs. For the purpose of our discussion, we number each LBO ( e −τ h , h = h ij orUn i↑ n i↓ ) in Eq. (6) from right to left as, 1st, 2nd, … i-th LBO. The WF following the k-th LBO is then denoted by the k-th WF. A closed world line (a closed WF sequence, or a closed path) in the WF space is labeled as ω = {· · · |s� · · · } , where |s� is the s-th WF. ρ(ω) is the associated Boltzmann weight of the world line ω.
Because only the matrix elements in Eqs.
(2)-(5) are non-zero, for any closed world line, the Boltzmann weight has the following form: where n − , n + , n 0 correspond to the number of occurrences of the matrix elements in Eqs.

New Monte Carlo algorithm
To implement the new formula presented in Eqs. (6)-(7) into the QMC simulations, we have subsequently developed an efficient algorithm. Although it is easy to calculate the weighting of each path, it is relatively difficult to find a closed world line with a non-zero weight due to the fact that the weight of most paths is zero.
The choice of each WF is very significant for obtaining a closed world line with a non-zero weight. Here, we design an algorithm similar to the world-line algorithm 54 and the multiple time threading algorithm 55 . The current QMC algorithm contains two steps. To illustrate our method, we present an example in Fig. 1, in which the 4-site one-dimensional Hubbard model at the half-filled case is shown. Suppose ω o = {· · · |s� · · · } is a closed world line from the last QMC step, the red line in Fig. 1 marks ω o . The first step is to generate an intermediate world line ( ω ′ ) from a randomly selected WF in ω o . Specifically, a WF, say |r� , is randomly selected from ω o . In Fig. 1, the randomly chosen WF is marked by the arrow A. Then |r� is scattered by the r-th LBO ( e −τ h r ), and a new WF |r + 1� ′ is generated by |r + 1� ′ = e −τ h r |r� . Next, the (r + 2)-th WF |r + 2� ′ is generated by |r + 2� ′ = e −τ h r+1 |r + 1� ′ . This process continues until all the LBOs are cycled in the same sequence as in Eq. (6). In Fig. 1, this process starts from the arrow A to the right hand.
In the above scattering process, e −τ Un i↑ n i↓ does not produce any bifurcation because it is diagonal. However, for the operator e −τ h ij , if one of the i-th or j-th sites is occupied and the other is empty, the scattering will be bifurcated. One side of the bifurcation corresponds to Eq. (2), in which the occupancy of the i-th and j-th sites is exchanged before and after scattering (corresponding to the diagonal line in Fig. 1). While the other path corresponds to Eq. (3), in which the wave function is unchanged before and after scattering (corresponding to the horizontal line in Fig. 1). For the bifurcation, we use a similar heat-bath algorithm 56  . Evidently ρ HB (−) + ρ HB (+) = 1 . Note ρ HB (±) is different from that in Ref. 54 . www.nature.com/scientificreports/ After the scattering process finished, the intermediate world line ω ′ = · · · |s� ′ · · · is successfully generated.
The blue line in Fig. 1 is the intermediate world line generated by the above scattering process. It needs to point out that, ω ′ may not be a closed world line. In fact, in most cases, it is an open world line. In Fig. 1, ω ′ is opened at the arrow A. The key point is that ω ′ may have multiple intersections with ω o . The intersection means at which the WF is identical in both ω o and ω ′ . In Fig. 1, the arrow A and B mark the two intersections. In the second step, the fragment between two randomly chosen intersections in ω ′ is used to replace the corresponding part in ω o , then a new closed world line ( ω n ) is constructed, illustrating in the lower panel of Fig. 1.
The acceptance rate of the new world line is determined by the ratio of two factors, namely accpt = ρ(ω n) ρ(ω o) · 1 ρ HB , where ρ(ω n )andρ(ω o ) is the Boltzmann weight of the new and old world lines according to Eq. (7). And ρ HB is the total weight attached to the heat-bath sampling equal to The product contains all the contribution of each bifurcation in the fragment of ω ′ used in the new closed world line ω n . ρ HB should be deducted from the acceptance rate. If the change is accepted, the updated WFs are implemented. Otherwise, the unchanged WFs are implemented.
There are a few differences between the current method and previous ones 48,54 . (1) Except for the initially selected WF |r� from ω o , the scattering process is irrelevant to the rest WFs in ω o , which is different from Ref. 54 .
(2) In the current method, the sequence of e −τ h ij in Eq. (6) can be arranged in any way, the only requirement is to combine each off-diagonal term in the Hamiltonian and its Hermite conjugate into pairs; (3) In the current method, there are two steps to generate a new closed world line ( ω n ). The first step is to generate an intermediate world line ( ω ′ ) from a exist closed world line ( ω o ) by scattering process. The second step is to construct ω n from ω ′ and ω o . The current procedure does not care whether ω ′ is closed or not, but ω ′ has at least two intersections with ω o . In previous method 54 , the scattering process is aimed to directly generate a closed world line in a single step. Because of this requirement, the previous method 54 usually needs a specific break-up or rearrangement of Hamiltonian. In comparison, the current algorithm is more straight forward than previous ones, and can be easily extended to any Hamiltonian. (4) As first feeling, one may expect there should be few intersections between ω ′ and ω o . However, the probability of finding a new closed world line using the current method is remarkably high, i.e., close to 100%. This may derive from the contribution of the Hermite pairs used in our method. (5) Because the scattering keeps the number of particles unchanged, similar to Ref. 54 , the current method also works in a canonical ensemble, which is different from Ref. 48 .

Tests on Hubbard model
The Hubbard model is a representative for studying typical SCQMB systems. By varying the strength of U, the Hubbard model can describe different systems from the weakly coupled case to strongly coupled case. Usually U = 8, 4 and 2 correspond to the strongly coupled, the intermediate coupled, and the weak coupled cases, respectively 57 . For both the strongly and intermediate coupled cases, the mean field method fails. In this paper, our major calculations will be concentrated on U = 8 and 4. Many studies based on the Hubbard model have been carried out to investigate the metal-insulator transition, superconductivity, and magnetic properties caused by electronic correlation. Lieb and Wu 58 have obtained the exact solution of the ground state for the half-filled one-dimensional Hubbard model. In the last few decades, various theoretical calculations have been carried out to study the one-dimensional Hubbard model. However, few studies have been conducted using this model at finite temperatures 52,53,[59][60][61][62][63][64] .
To illustrate the reliability of our new formula and the new QMC algorithm, we have studied the one-dimensional half-filled Hubbard model at finite temperature. In all calculations, t is taken to have units of energy and is set to t = 1.0. The one-dimensional Hubbard model with the number of lattice size of N = 6, 12, and 24 have been systematically studied. For each system, the strength of the interaction has also been investigated for U = 2, 4, 6 and 8. To determine how the simulation time scaling with N, we also calculate a few larger systems with N up to 96. For most simulations, the total number of QMC steps at each temperature or m is more than 10 7 , where the first third of the steps are used to equilibrate the system and the remaining two thirds of the steps are used to calculate the physical properties.
It needs to point out that, like most QMC methods, our new formula could not give a general solution for the sign problem too. However, for particular special cases, the sign problem is not encountered 26,65 , for example, in one-dimensional Hubbard model. By choosing appropriate boundary conditions (periodic or antiperiodic) in one-dimensional Hubbard model, θ ij in Eq. (2) can be always positive, thus the sign problem can be avoided in current method. This is why we choose the one-dimensional Hubbard model.  www.nature.com/scientificreports/ with a difference of approximately 1% and 5% for U = 4 and 8, respectively. Owing to the intrinsic characteristics of the path integral formula, the exact value can be obtained only as τ approaching to zero. To estimate the QMC energy at τ = 0, the extrapolation to τ = 0 is performed. For this purpose, the data shown in Fig. 2 is fitted by E = E 0 + a * M −c , where E 0 , aandc are fitting parameters. We find the value of c is around 1.5, which is weakly dependent on U and slightly reduces with the increase of N. E 0 is the extrapolated energy at τ = 0 . The exact energy ( E e ) and extrapolated energy ( E 0 ) are shown in Fig. 2. The results show that our QMC method does converge to the exact value at τ = 0.
We have tested the convergence speed for several systems with different N and U. To compare the convergence speed, we define a convergence criterion ( τ * ), at which the derivative of energy with m is less than 0.0005. This  www.nature.com/scientificreports/ criterion is equivalent to that the change in energy is less than 0.0005 as m increases by a unit. Figure 3 shows how τ * changing with U and N. It can be seen that the convergence speed does not change significantly with N, but decreases with the increase of U. For most cases, τ = 0.05 is already a good approximation. In the following QMC simulations, τ is fixed at the value of 0.05. Since our QMC method is independent of determinant, the simulation time should have much better scaling with N and m. To check this point, we have calculated the simulation time as a function of N and m at fixed U = 4. These calculations are performed on a desktop computer with the CPU basic frequency of 3.20 GHz (Intel Core I7-8700) and a serial QMC program. The simulation time for 200 thousand MC steps is calculated for various systems, which is summarized in Fig. 4. From this figure, one can see that, the simulation time has the linear scaling with m (upper panel of Fig. 4) and the square-law scaling with N (lower panel of Fig. 4). This computational cost is far lower than other QMC methods involving determinant calculations, and is comparable with the common first-principles calculations. It should be stressed that, our current QMC code can be further improved for the higher efficiency.
Although the Hubbard model is a benchmark system for testing various QMC methods, a single model may be not enough to demonstrate the advantage of our method. To remedy this issue, we have done an analysis of the computational complexity for our method. The amount of calculation mainly consists of two parts: (1) finding a new closed path. This part needs to calculate the scattering of all LBOs to adjoint WFs, which needs Nm operations; And after each scattering, the comparison between the new and old WFs is preformed, which needs N operations. Thus, the total amount of calculations is scaling as N 2 m. (2) The calculation of Eq. (7). In this step, the scattering matrix of each LBO is calculated, the corresponding computational costs is proportional to Nm too. Combining these two parts, the total amount of calculation scales as N 2 m. This is also consistent with our test results (Fig. 4).
In the following, we will present detailed calculations of various physics quantities for the system with six lattice sites. Figure 5 depicts the energy, double occupancy, and specific heat via temperature for U = 4 (left panel) and 8 (right panel) of the system with six lattice sites. The energy calculated by the QMC simulation is in excellent agreement with the exact value for the entire range of temperatures within the error bar (upper panel of Fig. 5). Compared to U = 4, there is an evident plateau in the temperature range of 1.0 to 2.0 for U = 8. The plateau reflects the fact that the on-site Coulomb interaction has a strong effect in suppressing the occurrence of double occupancy. The change in double occupancy with temperature in middle panel of Fig. 5 supports this conclusion.
From the middle panel of Fig. 5, it can be seen that, from the high temperature to the lower temperature, the double occupancy first decreases and then increases for U = 4 and U = 8. Although the double occupancy increases at low temperatures, the total energy is further reduced with a corresponding decrease in temperature. This reflects the fact that a small increase of the delocalization doublon further decreases the total energy 52,66 . The minimum indicates the degree of localization of electrons is the largest, which corresponds to the maximum in local magnetic moment in upper panel of Fig. 6. For U = 4, the increase in the double occupancy is more evident www.nature.com/scientificreports/ than when U = 8 at lower temperatures. At a fixed temperature, the double occupancy is larger than it is for U = 8, demonstrating how the on-site interaction has a noticeable impact on the formation of the double occupations. The specific heat as a function of temperature is shown in the lower panel of Fig. 5. To calculate the specific heat, the exponential fitting method 67-69 is adopted with the fitting form of E(T) = E(0) + M n=1 c n e −nα T , where E(0) , c n and α are the fitting parameters. In this study, the value of M was 8. From Fig. 5, it can be seen that there are two obvious peaks in the specific heat for U = 8. Specifically, there is a narrow peak at low temperatures and a broad peak at high temperatures. In contrast, for U = 4 the two peaks become much closer and begin to merge together. The stronger interaction strength U, the more obvious the peak. The structure of the obtained specific heat peak is consistent with previous findings 52,61,62 . It is believed that this feature in the specific heat is associated with the spin-wave excitations at low temperatures and the single-particle excitations at high temperatures. Thus, spin fluctuations and charge fluctuations are dominant at low and high temperatures, respectively, which can be highlighted by the correlation functions. The trends of spin correlations in low panel in Fig. 6 have been correlated with the peaks in specific heat. The results we have obtained are consistent with known values. Figure 6 shows the local moment ( L α=0 ) and spin correlation functions ( L α=1,2 ) as a function of temperature for U = 4 (left panel) and U = 8 (right panel) of the system with six lattice sites. With an increasing temperature, L 0 reaches its maximum at a certain temperature and then gradually decreases. The maximum value obtained for L 0 indicates that at this temperature, the degree of localization of electrons is the largest. The degree of delocalization of electrons reflects the formation of doublons; therefore, the trends of local moment and double occupancy are reversed, as shown in the upper panels of Fig. 6 and the middle panels of Fig. 5. From the lower panel of Fig. 6, one can see that L 1 is less than zero, which indicates an antiferromagnetic order at a finite temperature, which leads to the emergence of a specific heat peak at a lower temperature. As the temperature increases, L 1 decreases and tends to zero, reflecting a weakened antiferromagnetic order; this is in agreement with the exact results. In contrast, L 2 is greater than zero and gradually reduces to zero with a corresponding increase in temperature. It can be seen that, the error bar in L 1 andL 2 for U = 8 is relatively large, which may be due to the limited simulation time, or the intrinsic large fluctuations in spin correlation functions. Fortunately, the trends of L 1 andL 2 are in general consistent with the results of Shiba 52,53 .

Summary
We have proposed a path integral formula in field theory and a corresponding world-line quantum Monte Carlo algorithm. The remarkable feature of the current method is that neither determinants nor the HS transformation is needed, which does strongly improve the accuracy and efficiency of Monte Carlo simulations. As an example, we have calculated the thermodynamic quantities and correlation functions of the one-dimensional Hubbard model at finite temperature. Our results are in excellent agreement with the exact values, confirming the reliability of our method. The most encouraging thing is that the computational cost has the square-law scaling with the size of systems. We believe that the current approach could be widely used in future.

Figure 6.
Local moment ( L α=0 ) and spin correlation functions ( L α=1,2 ) as a function of temperature for U = 4 (left panel) and U = 8 (right panel) of the system with six lattice sites. L 0 , L 1 , and L 2 represent the local magnetic moment, the nearest-neighbor spin correlation, and the next-nearest-neighbor spin correlation, respectively.