
# Solving quasiparticle band spectra of real solids using neural-network quantum states

## Abstract

Establishing a predictive ab initio method for solid systems is one of the fundamental goals in condensed matter physics and computational materials science. The central challenge is how to encode a highly complex quantum many-body wave function compactly. Here, we demonstrate that artificial neural networks, known for their overwhelming expressibility in the context of machine learning, are an excellent tool for first-principles calculations of extended periodic materials. We show that the ground-state energies of real solids in one-, two-, and three-dimensional systems are simulated precisely, reaching chemical accuracy. The highlight of our work is that the quasiparticle band spectra, which are both essential and peculiar to solid-state systems, can be efficiently extracted with a computational technique designed to exploit the low-lying energy structure from neural networks. This work opens up a path to elucidate the intriguing and complex many-body phenomena in solid-state systems.

## Introduction

Artificial neural networks (ANNs) are a class of expressive mathematical models originally designed to imitate the high computing power of the human brain. Driven by the outstanding success over existing data processing methods in the field of machine intelligence1,2,3, ANNs have been used in a wide range of applications, from physical science4,5,6,7,8, medical diagnosis, to astronomical observations. Remarkable among numerous factors underlying their performance is their ability to perform efficient feature extraction from high-dimensional data.

As universal approximators, ANNs have a rich expressive power, which can also be exemplified by encoding complicated quantum correlations9. Carleo and Troyer10 showed that ANNs, employed as a quantum many-body wave-function ansatz, can solve strongly correlated lattice systems at a state-of-the-art level. Such quantum-state ansatze, often referred to as neural quantum states (NQS), capture quantum entanglement that even scales extensively11. The use of such a powerful nonlinear parametrization has been keenly investigated in the quantum physics community: both equilibrium12,13 and out-of-equilibrium14,15,16,17 properties, extension of the network structure18,19,20, and quantum tomography21,22,23,24. Meanwhile, we point out that the application of ANNs to fermionic systems is much less explored, despite their practical significance, such as the modeling of real materials and the experimental realizability in quantum simulators25,26. The proof of concept for small molecular systems was first presented by Choo et al.27, who applied ANNs to solve the many-body Schrödinger equation governed by the second-quantized Hamiltonian for molecular orbitals. Only a few further implementations have simulated electronic structures using ANNs28,29,30,31. Thus, a crucial question remains to be answered: are ANNs powerful enough to represent the electronic structures of real solid materials? This is related to one of the fundamental problems in condensed-matter physics and computational materials science; namely, establishing a predictive ab initio method for solids or surfaces. In particular, it must be demonstrated that the ANNs are capable of investigating the thermodynamic limit.

We stress that no current first-principles method can take into account both weak and strong electron correlations compactly and sufficiently. For instance, it is well known that the accuracy of the de facto standard method, density functional theory (DFT), is semi-quantitative and it is very difficult to improve significantly32,33. Many-body-wave-function-based methodologies are, in contrast, systematically improvable. Such techniques, mainly based on coupled-cluster (CC) theory (or many-body perturbation theory)34, have been successful for the electronic states of molecules. This has encouraged the application of quantum chemical methods to solid-state physics35,36. However, methods such as CC specialize in describing weak electronic correlations, and only work well for electronic states where the mean-field approximation is valid.

Methods for dealing with strongly correlated electrons, called multireference theory, also exist in quantum chemistry37, but these assume that the number of strongly correlated electrons is small. Such a condition usually holds in the case of molecules, because the strongly correlated electrons are often localized and limited in number. In contrast, there can be a large number of moderately or strongly correlated electrons in solid-state systems, owing to their high symmetry and dense structure. Based on its success in spin systems, it is natural to expect that the NQS have the potential to compactly describe a variety of electron correlations appearing in first-principles calculations of solids with a moderate computational cost (see Fig. 1 for a schematic diagram of the hierarchy of quantum chemical methods38,39,40,41,42).

In this work, we demonstrate that neural-network-based many-body wave functions can readily simulate the essence of first-principles calculations for extended periodic materials: the ground-state and excited-state properties. The second-quantized fermionic Hamiltonian is transformed into a spin representation, such that the problematic sign structure of fermions, which usually imposes severe limits on the numerical accuracy, is naturally encoded. Employing the variational Monte Carlo (VMC)-based stochastic optimization, we show that the thermodynamic limit of a one-dimensional system can be simulated within chemical accuracy. For real solids in both two and three dimensions, the static electronic correlation in the minimal active space is compactly represented by the NQS. Our work’s main contribution is that multiple excited states, forming quasiparticle band spectra, are computed by constructing an effective Hamiltonian in the truncated Hilbert space. To the best of our knowledge, we offer the first demonstration that the NQS can be applied to simulate low-lying eigenstates in the identical-quantum-number sector.

## Results

### Second-quantization representation of solid systems

To alleviate the notorious difficulty of simulating the many-body problem of solid systems, we employ a linear combination of the single-particle basis. Namely, we construct crystalline orbitals (COs) using the solution of the crystalline Hartree–Fock (HF) equation43,44. The second-quantization form of the many-body fermionic Hamiltonian is

$$H =\; \mathop{\sum}\limits_{pq}\mathop{\sum}\limits_{{\bf{k}}}{t}_{pq}^{{\bf{k}}}{c}_{p{\bf{k}}}^{\dagger }{c}_{q{\bf{k}}}\\ +\frac{1}{2}\mathop{\sum}\limits_{pqrs}\mathop{\sum }\limits_{{{\bf{k}}}_{p}{{\bf{k}}}_{q}{{\bf{k}}}_{r}{{\bf{k}}}_{s}}^{\prime}{v}_{pqrs}^{{{\bf{k}}}_{p}{{\bf{k}}}_{q}{{\bf{k}}}_{r}{{\bf{k}}}_{s}}{c}_{p{{\bf{k}}}_{p}}^{\dagger }{c}_{q{{\bf{k}}}_{q}}{c}_{r{{\bf{k}}}_{r}}^{\dagger }{c}_{s{{\bf{k}}}_{s}},$$
(1)

where $${c}_{p{\bf{k}}}$$ ($${c}_{p{\bf{k}}}^{\dagger }$$) denotes the annihilation (creation) operator of an electron on the p-th CO with crystal momentum k. Here, the anticommutation relation $$\{{c}_{p{{\bf{k}}}_{p}},{c}_{q{{\bf{k}}}_{q}}^{\dagger }\}={\delta }_{pq}{\delta }_{{{\bf{k}}}_{p}{{\bf{k}}}_{q}}$$ is imposed, and one-body (two-body) integrals are given as $${t}_{pq}^{{\bf{k}}}$$ ($${v}_{pqrs}^{{{\bf{k}}}_{p}{{\bf{k}}}_{q}{{\bf{k}}}_{r}{{\bf{k}}}_{s}}$$). For simplicity, hereafter we denote the combined suffix as μ = (p, k). While the general framework of the crystalline HF equation is common with that for molecular systems, it must be noted that the contribution from the reciprocal lattice vector G = 0 requires extra numerical care owing to the divergence of the exchange integrals. In this work, we employ the crystalline Gaussian-based atomic functions as the single-particle basis. The Gaussian density fitting technique is applied to efficiently compute the two-body integrals45.

The summation in the first term of Eq. (1) is taken over a uniform grid, which is typically obtained by shifting the k’s obeying the Monkhorst–Pack rule46. Note that the number Nk of sampled k-points can be arbitrary. The primed summation in the second term satisfies the conservation of crystal momentum, which follows from translational invariance:

$${{\bf{k}}}_{p}+{{\bf{k}}}_{r}-{{\bf{k}}}_{q}-{{\bf{k}}}_{s}\in {\mathcal{G}},$$
(2)

where $${\mathcal{G}}$$ is the set of reciprocal lattice vectors. With the number of COs at each k-point denoted as N, the total number of terms in Eq. (1) is given as $${\mathcal{O}}({N}^{4}{N}_{k}^{3})$$.
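The momentum-conservation rule of Eq. (2) can be checked by brute force on a small grid. The following is a minimal sketch, assuming a one-dimensional Monkhorst–Pack grid in fractional coordinates and ignoring the orbital indices; the function names are illustrative and not taken from the original work. It counts the k-point quadruples that survive the primed summation, which reproduces the $${N}_{k}^{3}$$ factor in the term count.

```python
import itertools
from fractions import Fraction


def monkhorst_pack_1d(nk):
    """Fractional coordinates u_r = (2r - nk - 1) / (2 nk) for r = 1..nk."""
    return [Fraction(2 * r - nk - 1, 2 * nk) for r in range(1, nk + 1)]


def conserving_quadruples(kpts):
    """Index quadruples (p, q, r, s) with k_p + k_r - k_q - k_s equal to an
    integer, i.e. a reciprocal lattice vector in fractional units."""
    kept = []
    for p, q, r, s in itertools.product(range(len(kpts)), repeat=4):
        total = kpts[p] + kpts[r] - kpts[q] - kpts[s]
        if total.denominator == 1:  # exact integer test with rationals
            kept.append((p, q, r, s))
    return kept


nk = 4
quads = conserving_quadruples(monkhorst_pack_1d(nk))
# Of the nk**4 possible quadruples, only nk**3 conserve crystal momentum.
print(len(quads), nk ** 3)
```

Exact rational arithmetic is used here so that the integer test does not depend on floating-point tolerances.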

To solve the fermionic many-body Hamiltonian (1), we must explicitly impose the antisymmetric sign structure in the quantum state. Here, we map the Hamiltonian into the spin-1/2 representation such that the sign structure is encoded in the operators rather than the quantum states, as Choo et al.27 considered in their application of the NQS to small molecules. The Jordan–Wigner (JW) transformation47 defines the relation of fermionic and spin operators as $${c}_{\mu }^{(\dagger )}={(-1)}^{\mu -1}{\prod }_{\nu \,{<}\,\mu }{\sigma }_{\nu }^{z}{\sigma }_{\mu }^{+(-)}$$, where $${\sigma }_{\mu }^{+(-)}$$ is the raising (lowering) operator of the μ-th spin. Such a mapping yields a nonlocal spin Hamiltonian

$$H=\mathop{\sum}\limits_{Q}{c}_{Q}{P}_{Q},$$
(3)

where $${P}_{Q}\in {\bigotimes }_{\mu }\{I,X,Y,Z\}$$ is a product of Pauli matrices for a corresponding Pauli string Q.

Let us make two remarks on the application of the JW transformation. First, the use of the fermion-to-spin transformation for stochastic variational calculations was initially considered in the context of near-term quantum computers48, including the application to real solids49,50,51, while the spin-to-fermion mapping has long been applied in the condensed-matter and statistical physics communities, e.g., to solve exactly soluble quantum spin models. Second, the JW transformation merely generates the spin operator representation of the Hamiltonian (1) and does not alter the computational basis. The evaluation of physical observables in the Monte Carlo approach by the occupation-number basis of the fermionic representation is identical to that by the spin computational basis of the spin representation. This is not the case when we apply other transformations developed in quantum information, such as the Bravyi–Kitaev transformation52.
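To make the fermion-to-spin mapping concrete, the sketch below builds Jordan–Wigner matrices for a few modes and verifies the canonical anticommutation relations numerically. It assumes a common textbook convention (occupied mode encoded as the second basis state, no extra phase factor), which may differ from the convention above by the mode-dependent sign $${(-1)}^{\mu -1}$$ and by the labeling of raising and lowering operators; neither choice affects the anticommutators. All names are illustrative.

```python
from functools import reduce

import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
LOWER = np.array([[0.0, 1.0], [0.0, 0.0]])  # |0><1|: empties an occupied mode


def jw_annihilation(mu, n_modes):
    """Matrix of c_mu (0-based index): Z on every mode nu < mu, LOWER on mu,
    identity on the remaining modes."""
    factors = [Z] * mu + [LOWER] + [I2] * (n_modes - mu - 1)
    return reduce(np.kron, factors)


def anticommutator(a, b):
    return a @ b + b @ a


n_modes = 3
c = [jw_annihilation(mu, n_modes) for mu in range(n_modes)]
dim = 2 ** n_modes
for p in range(n_modes):
    for q in range(n_modes):
        expected = np.eye(dim) if p == q else np.zeros((dim, dim))
        # The matrices are real, so the transpose is the Hermitian conjugate.
        assert np.allclose(anticommutator(c[p], c[q].T), expected)
        assert np.allclose(anticommutator(c[p], c[q]), 0.0)
print("anticommutation relations hold for", n_modes, "modes")
```

The string of Z operators is what makes the resulting spin Hamiltonian nonlocal: an operator on mode μ touches every mode before it.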

### Ground states in the thermodynamic limit

In general, it is classically intractable to solve for the ground state of the many-body Hamiltonian defined in Eq. (1) or (3). Here we alternatively rely on a variational method that exemplifies the expressive power of neural networks. Namely, a neural network is used as a variational many-body wave-function ansatz. It is optimized so that the expectation value of the energy, estimated via the Monte Carlo simulation, is minimized by approximating the imaginary-time evolution. Such a technique, called variational Monte Carlo (VMC), has been successfully applied to condensed-matter systems53,54,55,56 and quantum chemistry problems57,58, leading to state-of-the-art numerical analysis on strongly correlated phenomena. The choice of the variational ansatz plays a key role for the accuracy, which, as has been pointed out by Carleo and Troyer10, can be significantly improved by using neural networks.

Let us briefly review the general protocol of VMC for simulating ground states in many-body spin systems using the quantum-state ansatz based on the restricted Boltzmann machine (RBM)59. First, we introduce the quantum many-body wave function expressed as follows10,

$$\left|{{{\Psi }}}_{\theta }^{{\rm{RBM}}}\right\rangle = \frac{1}{Z}{\sum }_{{\boldsymbol{\sigma }}}{{{\Psi }}}_{\theta }^{{\rm{RBM}}}({\boldsymbol{\sigma }})\left|{\boldsymbol{\sigma }}\right\rangle ,\\ {{{\Psi }}}_{\theta }^{{\rm{RBM}}}({\boldsymbol{\sigma }}) = \mathop{\sum}\limits_{h}\exp \left(\mathop{\sum}\limits_{\mu \nu }{W}_{\mu \nu }{\sigma }_{\mu }{h}_{\nu }+\mathop{\sum}\limits_{\mu }{a}_{\mu }{\sigma }_{\mu }+\mathop{\sum}\limits_{\nu }{b}_{\nu }{h}_{\nu }\right),$$
(4)

where $${{{\Psi }}}_{\theta }^{{\rm{RBM}}}(\sigma )$$ is the unnormalized amplitude for a spin configuration $$\sigma \in {\{-1,+1\}}^{{N}_{v}}$$, Nv = N Nk is the total number of spin orbitals, and $$Z=\sqrt{{\sum }_{\sigma }| {{{\Psi }}}_{\theta }^{{\rm{RBM}}}(\sigma ){| }^{2}}$$ is the normalization factor. We denote the set of complex variational parameters as θ = {Wμν, aμ, bν}, where the interaction Wμν denotes the virtual coupling between the spin σμ and the auxiliary degrees of freedom, or the hidden spin hν. One-body terms aμ and bν are also introduced to enhance the expressive power of the RBM state. In the present work, we find that it suffices to take the total number of hidden spins as Nh = Nv, and therefore the number of complex variational parameters is $$({N}_{v}^{2}+2{N}_{v})$$ in total. The all-to-all connectivity between σ and h allows the RBM state to capture complicated quantum correlations such as topological orders13,60, spin-liquid behaviours61,62,63, and electronic structures in small molecular systems27,28.
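Because the hidden spins in Eq. (4) do not couple to each other, the sum over h factorizes and the amplitude can be evaluated in closed form as $$\exp ({\sum }_{\mu }{a}_{\mu }{\sigma }_{\mu }){\prod }_{\nu }2\cosh ({b}_{\nu }+{\sum }_{\mu }{W}_{\mu \nu }{\sigma }_{\mu })$$. The following sketch, with hypothetical function names and randomly chosen toy parameters, compares this closed form against an explicit sum over all hidden configurations, assuming hidden spins take the values ±1.

```python
import itertools

import numpy as np


def rbm_amplitude_bruteforce(sigma, W, a, b):
    """Explicit sum over all 2**n_h hidden configurations h in {-1, +1}^n_h."""
    total = 0.0 + 0.0j
    for h in itertools.product([-1, 1], repeat=len(b)):
        h = np.array(h)
        total += np.exp(sigma @ W @ h + a @ sigma + b @ h)
    return total


def rbm_amplitude(sigma, W, a, b):
    """Closed form obtained by tracing out each hidden spin independently."""
    theta = b + sigma @ W
    return np.exp(a @ sigma) * np.prod(2.0 * np.cosh(theta))


rng = np.random.default_rng(0)
n_v = n_h = 4  # N_h = N_v, as in the text
W = 0.1 * (rng.normal(size=(n_v, n_h)) + 1j * rng.normal(size=(n_v, n_h)))
a = 0.1 * (rng.normal(size=n_v) + 1j * rng.normal(size=n_v))
b = 0.1 * (rng.normal(size=n_h) + 1j * rng.normal(size=n_h))
sigma = np.array([1, -1, 1, -1])

print(rbm_amplitude(sigma, W, a, b))
print(rbm_amplitude_bruteforce(sigma, W, a, b))
print("parameters:", W.size + a.size + b.size)  # N_v**2 + 2 N_v
```

The closed form costs a number of operations proportional to the number of parameters per amplitude, which is what makes Monte Carlo sampling of the ansatz practical.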

Using the RBM state (4) as the many-body variational ansatz, the ground-state problem is solved in the VMC framework. In particular, we rely on the stochastic reconfiguration technique64 to approximate the imaginary-time evolution as

$$\left|{{{\Psi }}}_{{\mathrm{GS}}}\right\rangle \propto \mathop{{{\lim}}}\limits_{\tau \to \infty }{e}^{-\tau H}\left|{{{\Psi }}}_{0}\right\rangle \sim \left|\mathop{{{\Psi }}}\nolimits_{{\theta }_{0}+{\sum }_{k}{{\Delta }}{\theta }_{k}}^{{\rm{RBM}}}\right\rangle ,$$
(5)

where the parameter update at the k-th step Δθk is given by the Monte Carlo simulation, and the initial state $$\left|{{{\Psi }}}_{0}\right\rangle$$ is taken as the HF state in our simulation. Detailed information on the implementation and optimization techniques is provided in “Methods”.

As a first demonstration, we provide the potential energy curve for a one-dimensional system whose electronic correlation varies drastically as the geometry is changed. Concretely, we consider a linear hydrogen chain with homogeneous atom separation dH in a minimal basis set (STO-3G)65,66. Figure 2a presents the result of the calculation using the RBM state as well as the second-order Møller–Plesset perturbation theory (MP2)67, the coupled-cluster singles and doubles (CCSD)41,68, and CCSD with perturbative triple excitations (CCSD(T))69, which is considered the gold standard in modern quantum chemistry. While the weakly correlated regime at near-equilibrium is simulated quite well by all the conventional methods, we see that they start to break down as the correlation grows in the intermediate dH regime, not to mention the Mott-insulating large dH regime. In sharp contrast, the RBM state precisely describes the electronic correlation and achieves chemical accuracy at any atom separation dH. Here, two k-points are sampled from each unit cell, which contains four hydrogen atoms so that the interactions between nearby sites are reflected explicitly in the model.

To further illustrate the RBM state’s power and reliability, we calculate the energy in the thermodynamic limit by extrapolating Nk → ∞ in a system with a single atom per unit cell. The numerical result at near-equilibrium (dH = 2.0aB) is shown in Fig. 2b. We confirm the excellent agreement with conventional methods by comparing the result with the FCI for Nk ≤ 8 and CCSD for 10 ≤ Nk ≤ 18. Clearly, the thermodynamic limit is simulated as precisely as the finite-size system.

Next, we provide a demonstration for both 2D and 3D real solids: graphene and the lithium hydride (LiH) crystal in the rocksalt structure. Here, we restrict the active space at each k-point to its highest occupied CO and lowest unoccupied CO. The results for graphene [Fig. 3a] and the crystalline LiH [Fig. 3b] are both in remarkable agreement with the FCI or CCSD(T). Clearly, the RBM ansatz gives a quantitatively accurate description, which may allow crystal structure determinations of weakly to moderately correlated real solid systems.

### Quasiparticle band structure from the one-particle excitation

Interest beyond the ground-state electronic structures in solids is diverse: the response to electromagnetic fields, impurity effects, phononic dispersions, and so on. Here, we focus on the band structure, which is a peculiar yet fundamental property that characterizes solid systems. We stress that variational calculations even for the lowest bandgap, which can be experimentally measured from photoemissions, are scarce, not to mention the simulation of the band spectra based on stochastic methods70. Furthermore, to the best of our knowledge, there is no NQS simulation of excited states in the identical sector of quantum numbers except the first excited state19. This motivates us to perform the first attempt to calculate multiple low-lying states and deepen our understanding of the representability of the NQS beyond the well-studied regimes.

In general, the calculation of band structures is based on the assumption that the system is weakly to moderately correlated. In other words, the mean-field approximation is qualitatively valid, so that one-particle excitations dominate the low-lying spectrum. By employing such a picture in a quantum many-body context, we can also simulate the band structure via quasiparticle excitations. We take a similar approach here and compute the band structure from the single-particle linear-response behavior of the ground state.

Let us construct an appropriately truncated Hilbert space which captures the low-lying states in a stochastic manner. It is justified from the above argument that we consider a subspace spanned by a set of non-orthonormal bases $$\{{R}_{\alpha }\left|{{{\Psi }}}_{{\rm{GS}}}\right\rangle \}$$, where Rα denotes the α-th single-particle excitation operator. Here, the valence (conduction) bands are obtained from the ionization (electron attachment) operators $$\{{c}_{p{{\bf{k}}}_{p}}\}$$ ($$\{{c}_{p{{\bf{k}}}_{p}}^{\dagger }\}$$), which allows us to compute the quasiparticle band with an additional computational cost of $${\mathcal{O}}({N}_{v}^{3})$$. Although it is possible to include higher-order excitation operators, here we avoid them from the viewpoint of computational cost and size inconsistency. It can be shown that the diagonalization of the effective Hamiltonian in the non-orthonormal basis amounts to solving the following generalized eigenvalue equation71,

$$\widetilde{H}C=\widetilde{S} CE$$
(6)

where $$E={\rm{diag}}({E}_{1},...,{E}_{{N}_{v}})$$ contains the eigenvalues and C is an array of eigenvectors. The matrix elements of the non-hermitian matrix $$\widetilde{H}$$ and the metric $$\widetilde{S}$$ are estimated via the Monte Carlo sampling as expectation values:

$${\widetilde{H}}_{\alpha \beta }=\left\langle {{{\Psi }}}_{{\rm{{\theta }}^{* }}}^{{\rm{RBM}}}| {R}_{\alpha }^{\dagger }H{R}_{\beta }| {{{\Psi }}}_{{\rm{{\theta }}^{* }}}^{{\rm{RBM}}}\right\rangle ,$$
(7)
$$\widetilde{S}_{\alpha \beta }=\left\langle {{{\Psi }}}_{{\rm{{\theta }}^{* }}}^{{\rm{RBM}}}| {R}_{\alpha }^{\dagger }{R}_{\beta }| {{{\Psi }}}_{{\rm{{\theta }}^{* }}}^{{\rm{RBM}}}\right\rangle ,$$
(8)

where the ground state is now replaced by the RBM ansatz $$\left|{{{\Psi }}}_{{\theta }^{* }}^{{\rm{RBM}}}\right\rangle$$, with the optimized variational parameter θ*. In the field of quantum chemistry, this procedure is referred to as the internally contracted multireference configuration interaction72,73.
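The subspace diagonalization of Eqs. (6)–(8) can be illustrated on a toy problem. The sketch below is an assumption-laden stand-in: it uses a small random real symmetric matrix as the Hamiltonian, its exact ground state in place of the optimized RBM, random matrices in place of the excitation operators Rα, and exact matrix elements instead of Monte Carlo estimates. The generalized eigenvalue problem is solved by canonical orthogonalization of the metric, which is one common choice and not necessarily the solver used in the original work.

```python
import numpy as np


def solve_generalized(h_eff, s_eff, tol=1e-10):
    """Solve h_eff C = s_eff C E by canonical orthogonalization, discarding
    metric eigenvalues below tol (near-linear dependence of the basis)."""
    s_vals, s_vecs = np.linalg.eigh(s_eff)
    keep = s_vals > tol
    x = s_vecs[:, keep] / np.sqrt(s_vals[keep])  # x^dagger s_eff x = identity
    energies, vecs = np.linalg.eigh(x.conj().T @ h_eff @ x)
    return energies, x @ vecs


rng = np.random.default_rng(0)
dim, n_ops = 8, 3
a = rng.normal(size=(dim, dim))
h = (a + a.T) / 2  # toy Hamiltonian (illustrative only)
exact, eigvecs = np.linalg.eigh(h)
psi_gs = eigvecs[:, 0]

# Hypothetical excitation operators: random matrices standing in for R_alpha.
ops = rng.normal(size=(n_ops, dim, dim))
basis = np.stack([r @ psi_gs for r in ops], axis=1)  # columns R_alpha |psi_gs>

h_eff = basis.conj().T @ h @ basis  # analogue of Eq. (7)
s_eff = basis.conj().T @ basis      # analogue of Eq. (8)
energies, coeffs = solve_generalized(h_eff, s_eff)
print(energies)
```

In a stochastic setting the metric is noisy, so the threshold `tol` would in practice need to be chosen relative to the sampling error; the value used here is only suitable for exact matrix elements.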

To enhance the numerical reliability, we incorporate the effect of orbital relaxation by estimating the bandgap from the extended Koopmans’ theorem74,75,76. The energies are shifted so that the first valence and conduction bands coincide with the energy difference ΔEIP and ΔEEA as

$$\left\{\begin{array}{lll}{{\Delta }}{E}^{{\mathrm{IP}}}&=&{E}_{{\mathrm{GS}}}^{{N}_{v}}-{E}_{{\mathrm{GS}}}^{{N}_{v}-1},\\ {{\Delta }}{E}^{{\mathrm{EA}}}&=&{E}_{{\mathrm{GS}}}^{{N}_{v}+1}-{E}_{{\mathrm{GS}}}^{{N}_{v}},\end{array}\right.$$
(9)

where $${E}_{{\mathrm{GS}}}^{n}$$ is the energy of the RBM optimized in the particle-number sector n (See “Methods”).

We provide a demonstration for the quasiparticle band structure of polyacetylene [Fig. 4a] using the STO-3G basis set. The result is compared with a variant of the equation-of-motion coupled-cluster theories (EOM-CC): ionization-potential (electron-attached) EOM-CC (IP-EOM-CC, EA-EOM-CC), which considers up to 2-hole and 1-particle (2-particle and 1-hole) excitations41. The agreement with EOM-CCSD(T)(a)*77 is very good for the first valence and conduction bands, while it becomes slightly worse for higher excitations. As is shown in Fig. 4b, the first conduction band is simulated almost within chemical accuracy, which is partly due to the cancellation of the optimization errors induced by Eq. (9). Meanwhile, Fig. 4c indicates that errors in the higher excitations can be an order of magnitude larger in the worst case, which cannot be explained merely from the variational simulation error. Rather, it can be understood as a systematic error originating in the insufficiency of the truncated Hilbert space; there is a trade-off between the computational cost and the accuracy. Systematic improvement can be expected from using higher-order excitation operators, e.g., two-electron excitation operators $$\{{c}_{p{{\bf{k}}}_{p}}^{\dagger }{c}_{q{{\bf{k}}}_{q}}\}$$ for the lowest energy state in the particle-number sectors (Nv ± 1).

## Conclusion

We have shown that a shallow neural network with a moderate number of variational parameters allows us to perform the essence of first-principles calculations in solid systems, i.e., the ground-state property and the quasiparticle band spectra. In the weakly to moderately correlated regions of the linear hydrogen chain, we have demonstrated that even the thermodynamic limit can be simulated using the RBM state. The representability of the RBM is also exhibited in the strongly correlated regions, where the standard approaches break down. We have furthermore shown that the electronic structures of real solids in both 2D and 3D can be described accurately. Finally, we have successfully obtained the quasiparticle band spectra of a polymer in the linear-response regime. To the best of our knowledge, this is the first demonstration that NQS are capable of computing multiple excited states, in addition to precise ground-state simulations that reach chemical accuracy.

Numerous future directions can be envisioned. We remark on the following three points. First is the extension towards the complete basis limit. While we have here focused on relatively simple basis sets, the quantitative prediction and comparison with experiments would necessarily require larger basis sets. Working in the continuum space is a possibility, but the calculation would be much more involved than in molecular systems. Second is the systematic improvement of the calculations for excited states. It is intriguing to investigate the quantitative performance: whether higher-order subspace expansions can be efficiently implemented, how the accuracy compares with other excited-state calculation frameworks such as the equation-of-motion and time-dependent linear response78, and so on. Third is the behavior of physical observables. One may want to know the optical/magnetoelectric/thermal responses, so that experimental results can be directly compared. If the system is either quasi-static or static, those properties can be evaluated as derivatives of the energy with respect to an external perturbation (e.g., electric field)79.

The main bottleneck that prevents the simulation by the NQS in larger systems is the sampling efficiency. As mentioned by Choo et al. for the case of RBM27, and as known before in the VMC community, accurate calculations for relatively weak electronic correlations in the HF basis require an increasingly large number of Monte Carlo samples, because the amplitudes for multi-electron excitations are small. One may consider applying efficient sampling techniques, such as parallel tempering or heat-bath configuration interaction80, or even employing non-HF bases.

## Methods

### Stochastic imaginary-time evolution by variational Monte Carlo

Given an initial state $$\left|{{{\Psi }}}_{0}\right\rangle$$ whose overlap with the true ground state is nonzero (and desirably not exponentially small), the ground state $$\left|{{{\Psi }}}_{{\rm{GS}}}\right\rangle$$ can be simulated as

$$\begin{array}{r}\left|{{{\Psi }}}_{{\mathrm{GS}}}\right\rangle \propto \mathop{{{\lim}}}\limits_{N\to \infty }\mathop{{{\lim}}}\limits_{\eta \to 0}\left(\mathop{\prod }\limits_{k=1}^{N}{e}^{-\eta H}\right)\left|{{{\Psi }}}_{0}\right\rangle ,\end{array}$$
(10)

where H is the Hamiltonian of the system and η is a "learning rate" that determines the step of the imaginary-time evolution. The exact simulation of Eq. (10) for generic quantum many-body systems becomes exponentially inefficient as the system size grows. Hence, we approximate the quantum state by a variational ansatz $$\left|{{{\Psi }}}_{\theta }\right\rangle$$ and consider the update rule of the parameters θ such that Eq. (10) is realized approximately.
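The projection in Eq. (10) can be demonstrated exactly on a matrix small enough to handle without any variational ansatz. The sketch below is a toy illustration under stated assumptions: the Hamiltonian is a random real symmetric matrix with a prescribed spectrum, and the step operator is assembled from a full eigendecomposition, which is feasible only for tiny systems and is precisely what the variational approach avoids.

```python
import numpy as np


def imaginary_time_projection(h, psi0, eta=0.05, n_steps=2000):
    """Apply exp(-eta * h) n_steps times to psi0, renormalizing each step."""
    evals, evecs = np.linalg.eigh(h)
    # exp(-eta * h) from the eigendecomposition (only possible for tiny h).
    step = (evecs * np.exp(-eta * evals)) @ evecs.conj().T
    psi = psi0 / np.linalg.norm(psi0)
    for _ in range(n_steps):
        psi = step @ psi
        psi = psi / np.linalg.norm(psi)
    return psi


# Toy symmetric matrix with known spectrum 0, 1, ..., 5 (illustrative only).
rng = np.random.default_rng(1)
q, _ = np.linalg.qr(rng.normal(size=(6, 6)))
h = q @ np.diag(np.arange(6.0)) @ q.T
psi0 = rng.normal(size=6)  # generic start with nonzero ground-state overlap

psi = imaginary_time_projection(h, psi0)
energy = psi @ h @ psi
overlap = abs(q[:, 0] @ psi)
print(energy, overlap)
```

Excited-state components decay as the exponential of minus the total imaginary time multiplied by the excitation gap, so the requirement of a nonzero initial overlap and the role of the gap are both visible in this toy setting.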

There are numerous variational principles that dictate the parameter updates. Here, we choose the stochastic reconfiguration method64,81, which uses the Fubini-Study metric $${\mathcal{F}}$$ to measure the difference between the exact and variational imaginary-time evolution. Given a set of variational parameters θ, the update δθ is determined as

$$\delta \theta = \, \mathop{{\rm{arg}}\ {{\min}}}\limits_{{{\Delta }}}\left({\mathcal{F}}\left[{e}^{-\eta \hat{H}}\left|{{{\Psi }}}_{\theta }\right\rangle ,\left|{{{\Psi }}}_{\theta +{{\Delta }}}\right\rangle \right]\right)\\ = \, -\eta {g}^{-1}f$$
(11)

where $${\mathcal{F}}[\left|\psi \right\rangle ,\left|\phi \right\rangle ]=\arccos (\sqrt{\left\langle \psi | \phi \right\rangle \left\langle \phi | \psi \right\rangle /\left\langle \psi | \psi \right\rangle \left\langle \phi | \phi \right\rangle })$$ and elements of the generic force fi and the geometric tensor gij are given as

$${f}_{i}={\partial }_{i}\frac{\left\langle {{{\Psi }}}_{\theta }| H| {{{\Psi }}}_{\theta }\right\rangle }{\left\langle {{{\Psi }}}_{\theta }| {{{\Psi }}}_{\theta }\right\rangle },$$
(12)
$${g}_{ij}=\frac{\left\langle {\partial }_{i}{{{\Psi }}}_{\theta }| {\partial }_{j}{{{\Psi }}}_{\theta }\right\rangle }{\left\langle {{{\Psi }}}_{\theta }| {{{\Psi }}}_{\theta }\right\rangle }-\frac{\left\langle {\partial }_{i}{{{\Psi }}}_{\theta }| {{{\Psi }}}_{\theta }\right\rangle }{\left\langle {{{\Psi }}}_{\theta }| {{{\Psi }}}_{\theta }\right\rangle }\frac{\left\langle {{{\Psi }}}_{\theta }| {\partial }_{j}{{{\Psi }}}_{\theta }\right\rangle }{\left\langle {{{\Psi }}}_{\theta }| {{{\Psi }}}_{\theta }\right\rangle },$$
(13)

where ∂i is the derivative with respect to the i-th element of the parameter θi. It is noteworthy that the geometric tensor g is the extension of the Fisher information to quantum states. The stochastic gradient method based on g, or the Fisher information, was independently developed in the machine learning community81, and is frequently referred to as the natural gradient method.
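In practice, the force and the geometric tensor are commonly estimated as covariances of the logarithmic derivatives $${O}_{i}(\sigma )={\partial }_{i}{\mathrm{ln}}\,{{\Psi }}_{\theta }(\sigma )$$ over Monte Carlo samples. The sketch below assumes this standard sample estimator, including its prefactor convention for complex parameters, and adds a small diagonal shift to keep the linear solve well conditioned. The input arrays are random placeholders rather than output of a real simulation, and the function name is hypothetical.

```python
import numpy as np


def sr_update(log_derivs, e_loc, eta=0.01, eps=1e-3):
    """One stochastic-reconfiguration step from sampled quantities.

    log_derivs: (n_samples, n_params) array of O_i(sigma)
    e_loc:      (n_samples,) array of local energies
    Returns (delta_theta, g, f) with delta_theta = -eta (g + eps I)^{-1} f.
    """
    n_samples, n_params = log_derivs.shape
    o_c = log_derivs - log_derivs.mean(axis=0)  # centered log-derivatives
    e_c = e_loc - e_loc.mean()                  # centered local energies
    g = o_c.conj().T @ o_c / n_samples          # <O_i* O_j> - <O_i*><O_j>
    f = o_c.conj().T @ e_c / n_samples          # <O_i* E_loc> - <O_i*><E_loc>
    delta = -eta * np.linalg.solve(g + eps * np.eye(n_params), f)
    return delta, g, f


# Placeholder samples (random numbers, for shape and sanity checks only).
rng = np.random.default_rng(0)
log_derivs = rng.normal(size=(500, 5)) + 1j * rng.normal(size=(500, 5))
e_loc = rng.normal(size=500) + 1j * rng.normal(size=500)
delta, g, f = sr_update(log_derivs, e_loc)
print(delta.shape, np.allclose(g, g.conj().T))
```

A useful sanity check of this estimator is the zero-variance property: if the local energy is the same for every sample, as it would be for an exact eigenstate, the force and hence the update vanish.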

Note that both f and g can be estimated efficiently using Monte Carlo sampling. Indeed, any physical observable O can be estimated for a quantum state $$\left|{{\Psi }}\right\rangle$$ as

$$\left\langle O\right\rangle =\frac{\left\langle {{\Psi }}| O| {{\Psi }}\right\rangle }{\left\langle {{\Psi }}| {{\Psi }}\right\rangle }=\frac{{\sum }_{\sigma }| {{\Psi }}(\sigma ){| }^{2}{O}_{{\rm{loc}}}(\sigma )}{{\sum }_{\sigma }| {{\Psi }}(\sigma ){| }^{2}}=\mathop{\sum}\limits_{\sigma }p(\sigma ){O}_{{\rm{loc}}}(\sigma ),$$
(14)

where $${O}_{{\rm{loc}}}(\sigma )={\sum }_{\sigma ^{\prime} }\frac{{{\Psi }}(\sigma ^{\prime} )}{{{\Psi }}(\sigma )}\left\langle \sigma | O| \sigma ^{\prime} \right\rangle$$ is introduced to enable the simulation of the expectation value from classical sampling over the probability distribution $$p(\sigma )=| {{\Psi }}(\sigma ){| }^{2}/{\sum }_{\sigma }| {{\Psi }}(\sigma ){| }^{2}$$. Using the Metropolis–Hastings algorithm with particle-number conservation, we typically sample $${\mathcal{O}}(1{0}^{5})$$ to $${\mathcal{O}}(1{0}^{7})$$ spin configurations to estimate p(σ). Each configuration is drawn every 10–20 Monte Carlo steps so that the autocorrelation, and hence the sampling error, is sufficiently small when the optimization converges.
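The estimator of Eq. (14) and a particle-number-conserving Metropolis walk can be sketched on a toy problem small enough to check by exact enumeration. Everything in the example is an illustrative assumption: the wave function is a simple positive function rather than an RBM, the observable is a nearest-neighbor hopping operator for hard-core particles on a four-site ring (chosen to avoid fermionic signs), and the proposal moves one particle from an occupied site to an empty site.

```python
import itertools

import numpy as np

N_SITES, N_PART = 4, 2


def amplitude(sigma):
    """Hypothetical toy wave function penalizing neighboring occupied sites."""
    s = np.asarray(sigma)
    return np.exp(-0.5 * np.sum(s * np.roll(s, 1)))


def connected(sigma):
    """Yield (sigma', <sigma|O|sigma'>) for O = -sum_i (b_i^dag b_{i+1} + h.c.)."""
    for i in range(N_SITES):
        j = (i + 1) % N_SITES
        if sigma[i] != sigma[j]:
            new = list(sigma)
            new[i], new[j] = new[j], new[i]
            yield tuple(new), -1.0


def local_value(sigma):
    return sum(w * amplitude(sp) / amplitude(sigma) for sp, w in connected(sigma))


def exact_expectation():
    configs = [c for c in itertools.product([0, 1], repeat=N_SITES)
               if sum(c) == N_PART]
    p = np.array([amplitude(c) ** 2 for c in configs])
    p /= p.sum()
    return float(np.dot(p, [local_value(c) for c in configs]))


def metropolis_expectation(n_samples=5000, thin=10, seed=0):
    """Keep one configuration every `thin` steps to reduce autocorrelation."""
    rng = np.random.default_rng(seed)
    sigma = (1, 1, 0, 0)
    values = []
    for step in range(n_samples * thin):
        occupied = [i for i, s in enumerate(sigma) if s == 1]
        empty = [i for i, s in enumerate(sigma) if s == 0]
        i, j = rng.choice(occupied), rng.choice(empty)
        new = list(sigma)
        new[i], new[j] = 0, 1  # move one particle: particle number is conserved
        new = tuple(new)
        if rng.random() < (amplitude(new) / amplitude(sigma)) ** 2:
            sigma = new
        if step % thin == 0:
            values.append(local_value(sigma))
    return float(np.mean(values))


print(exact_expectation(), metropolis_expectation())
```

Because the numbers of occupied and empty sites never change, the proposal is symmetric and the plain Metropolis acceptance ratio is sufficient in this toy setting.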

Three technical remarks are in order. First, we take the initial state $$\left|{{{\Psi }}}_{0}\right\rangle (=\left|{{{\Psi }}}_{{\theta }_{0}}^{{\rm{RBM}}}\right\rangle )$$ as the HF state such that the overlap with the ground state is nonzero. Small noise is added to avoid the vanishing-gradient problem, which arises when the parameters of the RBM state are tuned to express a single computational basis state exactly. Second, to stabilize the optimization, a small number ϵ is uniformly added to the diagonal elements of g as gii → gii + ϵ. While a large ϵ is beneficial in early iterations, it is necessary to decrease it; otherwise, the optimization may end up in undesirable local minima. Therefore, ϵ is initially set as $${\mathcal{O}}(1{0}^{-2})$$ and gradually decreased to $${\mathcal{O}}(1{0}^{-3})$$ after several hundred steps. Third, we find that it is crucial to adopt an appropriate scheduling of η to speed up the optimization and, more importantly, avoid local minima. In the present work, we exclusively employ the RMSProp method82, which adaptively modifies η according to the magnitude of the gradient.

### Energy corrections by the extended Koopmans’ theorem

In Fig. 5, we visualize the effect of the corrections to the energy bands by the extended Koopmans’ theorem, which are defined in Eq. (9) in the main text as

$$\left\{\begin{array}{lll}{{\Delta }}{E}^{{\mathrm{IP}}}&=&{E}_{{\mathrm{GS}}}^{{N}_{v}}-{E}_{{\mathrm{GS}}}^{{N}_{v}-1},\\ {{\Delta }}{E}^{{\mathrm{EA}}}&=&{E}_{{\mathrm{GS}}}^{{N}_{v}+1}-{E}_{{\mathrm{GS}}}^{{N}_{v}},\end{array}\right.$$

where $${E}_{{\mathrm{GS}}}^{n}$$ is the energy of the RBM optimized in the particle-number sector n. Here, panels (a) and (b) indicate the first conduction and valence bands, respectively. In both bands, we observe a systematic deviation, which we attribute to the lack of orbital relaxation effect caused by the removal or addition of a single electron. The order of the correction ΔE ~ 0.05 Ha is comparable to that of the electronic correlation (~0.1 Ha).

## Data availability

The data that support the findings of this study are available from the corresponding author upon request.

## Code availability

The code written for and used in this study is available from the corresponding author upon reasonable request.

## References

1. Krizhevsky, A., Sutskever, I. & Hinton, G. E. In Neural Information Processing Systems, Vol. 25 (eds. F. Pereira, C. J. C. Burges, L. Bottou & K. Q. Weinberger) (Curran Associates, Inc., 2012).
2. Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning (MIT Press, 2016).
3. Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016).
4. Carleo, G. et al. Machine learning and the physical sciences. Rev. Mod. Phys. 91, 045002 (2019).
5. Das Sarma, S., Deng, D.-L. & Duan, L.-M. Machine learning meets quantum physics. Phys. Today 72, 48–54 (2019).
6. Carrasquilla, J. Machine learning for quantum matter. Adv. Phys. X 5, 1797528 (2020).
7. Carrasquilla, J. & Melko, R. G. Machine learning phases of matter. Nat. Phys. 13, 431–434 (2017).
8. Yoshioka, N., Akagi, Y. & Katsura, H. Learning disordered topological phases by statistical recovery of symmetry. Phys. Rev. B 97, 205110 (2018).
9. Schuld, M., Sinayskiy, I. & Petruccione, F. An introduction to quantum machine learning. Contemp. Phys. 56, 172–185 (2014).
10. Carleo, G. & Troyer, M. Solving the quantum many-body problem with artificial neural networks. Science 355, 602–606 (2017).
11. Deng, D.-L., Li, X. & Das Sarma, S. Quantum entanglement in neural network states. Phys. Rev. X 7, 021021 (2017).
12. Nomura, Y., Darmawan, A. S., Yamaji, Y. & Imada, M. Restricted Boltzmann machine learning for solving strongly correlated quantum systems. Phys. Rev. B 96, 205152 (2017).
13. Deng, D.-L., Li, X. & Das Sarma, S. Machine learning topological states. Phys. Rev. B 96, 195145 (2017).
14. Yoshioka, N. & Hamazaki, R. Constructing neural stationary states for open quantum many-body systems. Phys. Rev. B 99, 214306 (2019).
15. Hartmann, M. J. & Carleo, G. Neural-network approach to dissipative quantum many-body dynamics. Phys. Rev. Lett. 122, 250502 (2019).
16. Nagy, A. & Savona, V. Variational quantum Monte Carlo method with a neural-network ansatz for open quantum systems. Phys. Rev. Lett. 122, 250501 (2019).
17. Vicentini, F., Biella, A., Regnault, N. & Ciuti, C. Variational neural-network ansatz for steady states in open quantum systems. Phys. Rev. Lett. 122, 250503 (2019).
18. Gao, X. & Duan, L.-M. Efficient representation of quantum many-body states with deep neural networks. Nat. Commun. 8, 662 (2017).
19. Choo, K., Carleo, G., Regnault, N. & Neupert, T. Symmetries and many-body excitations with neural-network quantum states. Phys. Rev. Lett. 121, 167204 (2018).
20. Levine, Y., Sharir, O., Cohen, N. & Shashua, A. Quantum entanglement in deep learning architectures. Phys. Rev. Lett. 122, 065301 (2019).
21. Torlai, G. et al. Neural-network quantum state tomography. Nat. Phys. 14, 447–450 (2018).
22. Torlai, G. & Melko, R. G. Latent space purification via neural density operators. Phys. Rev. Lett. 120, 240503 (2018).
23. Melkani, A., Gneiting, C. & Nori, F. Eigenstate extraction with neural-network tomography. Phys. Rev. A 102, 022412 (2020).
24. Ahmed, S., Muñoz, C. S., Nori, F. & Kockum, A. F. Quantum state tomography with conditional generative adversarial networks. arXiv:2008.03240 (2020).
25. Georgescu, I. M., Ashhab, S. & Nori, F. Quantum simulation. Rev. Mod. Phys. 86, 153–185 (2014).
26. Mazurenko, A. et al. A cold-atom Fermi–Hubbard antiferromagnet. Nature 545, 462–466 (2017).
27. Choo, K., Mezzacapo, A. & Carleo, G. Fermionic neural-network states for ab-initio electronic structure. Nat. Commun. 11, 2368 (2020).
28. Yang, P.-J., Sugiyama, M., Tsuda, K. & Yanai, T. Artificial neural networks applied as molecular wave function solvers. J. Chem. Theory Comput. 16, 3513–3529 (2020).
29. Pfau, D., Spencer, J. S., Matthews, A. G. D. G. & Foulkes, W. M. C. Ab initio solution of the many-electron Schrödinger equation with deep neural networks. Phys. Rev. Res. 2, 033429 (2020).
30. Hermann, J., Schätzle, Z. & Noé, F. Deep-neural-network solution of the electronic Schrödinger equation. Nat. Chem. 12, 891–897 (2020).
31. Spencer, J. S., Pfau, D., Botev, A. & Foulkes, W. M. C. Better, faster fermionic neural networks. arXiv:2011.07125 (2020).
32. Medvedev, M. G., Bushmarinov, I. S., Sun, J., Perdew, J. P. & Lyssenko, K. A. Density functional theory is straying from the path toward the exact functional. Science 355, 49–52 (2017).
33. Mardirossian, N. & Head-Gordon, M. Thirty years of density functional theory in computational chemistry: an overview and extensive assessment of 200 density functionals. Mol. Phys. 115, 2315–2372 (2017).
34. Shavitt, I. & Bartlett, R. J. Many-Body Methods in Chemistry and Physics: MBPT and Coupled-Cluster Theory, Cambridge Molecular Science (Cambridge University Press, 2009).
35. Gruber, T., Liao, K., Tsatsoulis, T., Hummel, F. & Grüneis, A. Applying the coupled-cluster ansatz to solids and surfaces in the thermodynamic limit. Phys. Rev. X 8, 021043 (2018).
36. Zhang, I. Y. & Grüneis, A. Coupled cluster theory in materials science. Front. Mater. 6, 123 (2019).
37. Roos, B., Lindh, R., Malmqvist, P., Veryazov, V. & Widmark, P.-O. Multiconfigurational Quantum Chemistry (John Wiley & Sons, 2016).
38. Hirata, S., Podeszwa, R., Tobita, M. & Bartlett, R. J. Coupled-cluster singles and doubles for extended systems. J. Chem. Phys. 120, 2581–2592 (2004).
39. Zgid, D. & Chan, G. K.-L. Dynamical mean-field theory from a quantum chemical perspective. J. Chem. Phys. 134, 094115 (2011).
40. Liao, K. & Grüneis, A. Communication: Finite size correction in periodic coupled cluster theory calculations of solids. J. Chem. Phys. 145, 141102 (2016).
41. McClain, J., Sun, Q., Chan, G. K.-L. & Berkelbach, T. C. Gaussian-based coupled-cluster theory for the ground-state and band structure of solids. J. Chem. Theory Comput. 13, 1209 (2017).
42. Booth, G. H., Grüneis, A., Kresse, G. & Alavi, A. Towards an exact description of electronic wavefunctions in real solids. Nature 493, 365–370 (2013).
43. Re, G. D., Ladik, J. & Biczo, G. Self-consistent-field tight-binding treatment of polymers. I. Infinite three-dimensional case. Phys. Rev. 155, 997 (1967).
44. Andre, J. M. Self-consistent field theory for the electronic structure of polymers. J. Chem. Phys. 50, 1536–1542 (1969).
45. Sun, Q., Berkelbach, T. C., McClain, J. D. & Chan, G. K.-L. Gaussian and plane-wave mixed density fitting for periodic systems. J. Chem. Phys. 147, 164119 (2017).
46. Monkhorst, H. J. & Pack, J. D. Special points for Brillouin-zone integrations. Phys. Rev. B 13, 5188 (1976).
47. Jordan, P. & Wigner, E. Über das Paulische äquivalenzverbot. Z. Phys. 47, 631–651 (1928).
48. Peruzzo, A. et al. A variational eigenvalue solver on a photonic quantum processor. Nat. Commun. 5, 4213 (2014).
49. Liu, J., Wan, L., Li, Z. & Yang, J. Simulating periodic systems on quantum computer. arXiv http://arxiv.org/abs/2008.02946 (2020).
50. Manrique, D. Z., Khan, I. T., Yamamoto, K., Wichitwechkarn, V. & Ramo, D. M. Momentum-space unitary coupled cluster and translational quantum subspace expansion for periodic systems on quantum computers. arXiv http://arxiv.org/abs/2008.08694 (2020).
51. Yoshioka, N., Nakagawa, Y. O., Ohnishi, Y. & Mizukami, W. Variational quantum simulation for periodic materials. arXiv http://arxiv.org/abs/2008.09492 (2020).
52. Bravyi, S. B. & Kitaev, A. Y. Fermionic quantum computation. Ann. Phys. (NY) 298, 210–226 (2002).
53. McMillan, W. L. Ground state of liquid He4. Phys. Rev. 138, A442–A451 (1965).
54. Kolorenč, J. & Mitas, L. Applications of quantum Monte Carlo methods in condensed systems. Rep. Prog. Phys. 74, 026502 (2011).
55. Sorella, S. et al. Superconductivity in the two-dimensional t-J model. Phys. Rev. Lett. 88, 117002 (2002).
56. Misawa, T. & Imada, M. Origin of high-Tc superconductivity in doped Hubbard models and their extensions: roles of uniform charge fluctuations. Phys. Rev. B 90, 115137 (2014).
57. Hammond, B. L., Lester, W. A. & Reynolds, P. J. Monte Carlo Methods in Ab Initio Quantum Chemistry Vol. 1 (World Scientific, 1994).
58. Foulkes, W., Mitas, L., Needs, R. J. & Rajagopal, G. Quantum Monte Carlo simulations of solids. Rev. Mod. Phys. 73, 33 (2001).
59. Smolensky, P. Parallel Distributed Processing: Volume 1: Foundations 194 (MIT Press, 1986).
60. Glasser, I., Pancotti, N., August, M., Rodriguez, I. D. & Cirac, J. I. Neural-network quantum states, string-bond states, and chiral topological states. Phys. Rev. X 8, 011006 (2018).
61. Choo, K., Neupert, T. & Carleo, G. Two-dimensional frustrated J1 − J2 model studied with neural network quantum states. Phys. Rev. B 100, 125124 (2019).
62. Ferrari, F., Becca, F. & Carrasquilla, J. Neural Gutzwiller-projected variational wave functions. Phys. Rev. B 100, 125131 (2019).
63. Nomura, Y. & Imada, M. Dirac-type nodal spin liquid revealed by machine learning. arXiv http://arxiv.org/abs/2005.14142 (2020).
64. Sorella, S. Generalized Lanczos algorithm for variational quantum Monte Carlo. Phys. Rev. B 64, 024512 (2001).
65. Motta, M. et al. Towards the solution of the many-electron problem in real materials: equation of state of the hydrogen chain with state-of-the-art many-body methods. Phys. Rev. X 7, 042308 (2017).
66. Motta, M. et al. (Simons Collaboration on the Many-Electron Problem). Ground-state properties of the hydrogen chain: dimerization, insulator-to-metal transition, and magnetic phases. Phys. Rev. X 10, 031058 (2020).
67. Sun, J.-Q. & Bartlett, R. J. Second-order many-body perturbation-theory calculations in extended systems. J. Chem. Phys. 104, 8553–8565 (1996).
68. Hirata, S., Grabowski, I., Tobita, M. & Bartlett, R. J. Highly accurate treatment of electron correlation in polymers: coupled-cluster and many-body perturbation theories. Chem. Phys. Lett. 345, 475–480 (2001).
69. Grüneis, A. et al. Natural orbitals for wave function based correlated calculations using a plane wave basis set. J. Chem. Theory Comput. 7, 2780–2785 (2011).
70. Ma, F., Zhang, S. & Krakauer, H. Excited state calculations in solids by auxiliary-field quantum Monte Carlo. New J. Phys. 15, 093017 (2013).
71. McClean, J. R., Kimchi-Schwartz, M. E., Carter, J. & de Jong, W. A. Hybrid quantum-classical hierarchy for mitigation of decoherence and determination of excited states. Phys. Rev. A 95, 042308 (2017).
72. Werner, H.-J. & Reinsch, E.-A. The self-consistent electron pairs method for multiconfiguration reference state functions. J. Chem. Phys. 76, 3144–3156 (1982).
73. Werner, H.-J. & Knowles, P. J. An efficient internally contracted multiconfiguration–reference configuration interaction method. J. Chem. Phys. 89, 5803–5814 (1988).
74. Day, O. W., Smith, D. W. & Garrod, C. A generalization of the Hartree–Fock one-particle potential. Int. J. Quantum Chem. 8, 501–509 (1974).
75. Smith, D. W. & Day, O. W. Extension of Koopmans’ theorem. I. Derivation. J. Chem. Phys. 62, 113–114 (1975).
76. Morrell, M. M., Parr, R. G. & Levy, M. Calculation of ionization potentials from density matrices and natural functions, and the long-range behavior of natural orbitals and electron density. J. Chem. Phys. 62, 549–554 (1975).
77. Matthews, D. A. & Stanton, J. F. A new approach to approximate equation-of-motion coupled cluster with triple excitations. J. Chem. Phys. 145, 124102 (2016).
78. Mussard, B. et al. Time-dependent linear-response variational Monte Carlo. Adv. Quantum Chem. 76, 255–270 (2018).
79. Pulay, P. Analytical derivatives, forces, force constants, molecular geometries, and related response properties in electronic structure theory. Comput. Mol. Sci. 4, 169–181 (2014).
80. Holmes, A. A., Tubman, N. M. & Umrigar, C. J. Heat-bath configuration interaction: an efficient selected configuration interaction algorithm inspired by heat-bath sampling. J. Chem. Theory Comput. 12, 3674–3680 (2016).
81. Amari, S.-I., Kurata, K. & Nagaoka, H. Information geometry of Boltzmann machines. IEEE Trans. Neural Netw. 3, 260 (1992).
82. Hinton, G., Srivastava, N. & Swersky, K. Neural networks for machine learning (Coursera, Video lectures, 2012).
83. McClean, J. et al. OpenFermion: the electronic structure package for quantum computers. Quantum Sci. Technol. 5, 034014 (2020).
84. Sun, Q. et al. PySCF: the Python-based simulations of chemistry framework. Comput. Mol. Sci. 8, e1340 (2018).
85. Carleo, G. et al. NetKet: a machine learning toolkit for many-body quantum systems. SoftwareX 10, 100311 (2019).

## Acknowledgements

We thank Kenny Choo, Antonio Mezzacapo, and James Spencer for fruitful discussions. This work was supported by MEXT Quantum Leap Flagship Program (MEXT Q-LEAP) Grant Numbers JPMXS0118067394 and JPMXS0120319794. N.Y. is supported by the Japan Science and Technology Agency (JST) (via the Q-LEAP program). W.M. wishes to thank the Japan Society for the Promotion of Science (JSPS) KAKENHI No. 18K14181 and JST PRESTO No. JPMJPR191A. F.N. is supported in part by NTT Research, the Army Research Office (ARO) (Grant No. W911NF-18-1-0358), the Japan Science and Technology Agency (JST) (via the CREST Grant No. JPMJCR1676), the Japan Society for the Promotion of Science (JSPS) (via the KAKENHI Grant No. JP20H00134 and the JSPS-RFBR Grant No. JPJSBP120194828), the Asian Office of Aerospace Research and Development (AOARD) (via Grant No. FA2386-20-1-4069), and the Foundational Questions Institute Fund (FQXi) (via Grant No. FQXi-IAF19-06). Numerical calculations were performed using OpenFermion83, PySCF (v1.7.1)84, and NetKet85. Some calculations were performed using the supercomputer systems at RIKEN (HOKUSAI GreatWave), the Institute of Solid State Physics at the University of Tokyo, and the Research Institute for Information Technology (RIIT) at Kyushu University, Japan.

## Author information

### Contributions

N.Y. and W.M. conceived the project and contributed equally to the numerical simulations. W.M. and F.N. supervised the research. All authors discussed the results and contributed to writing the paper.

## Ethics declarations

### Competing interests

The authors declare no competing interests.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Rights and permissions

Yoshioka, N., Mizukami, W. & Nori, F. Solving quasiparticle band spectra of real solids using neural-network quantum states. Commun Phys 4, 106 (2021). https://doi.org/10.1038/s42005-021-00609-0
