Introduction

Theoretical approaches at the quantum mechanical level are essential to the full understanding of the matter and its properties. Although the wavefunction of a given system – and thus any property of that system – may in principle be obtained by solving the Schrödinger equation, this exact method is extremely costly for many-particle interacting systems, despite the availability of high computational resources. This difficulty is enhanced further when the theoretical simulations incorporate also spatial inhomogeneities – such as borders, impurities, confining potentials and disorder – which are sometimes desired1 but in most of the cases unavoidable in realistic systems and experimental setups.

In this context the density-functional theory (DFT)2,3,4 appears as an interesting and powerful tool. Within DFT the particle density of a system with N interacting particles is used as the central quantity, instead of the wavefunction. Thus the task of obtaining Ψ(r1, r2, …, rN) – a 3N-dimensional function – is reduced to obtain n(r) – a 3-dimensional function. This illustrates how much simpler DFT calculations are in comparison to wavefunction-based quantum calculations. The one-to-one mapping between n and Ψ is guaranteed by the Hohenberg-Kohn (HK) theorem2, which has been recently formally proved to be valid also for lattice systems5,6. As a consequence of the HK theorem all the observables are a density functional and therefore may in principle be obtained via DFT calculations. The theorem does not provide however any hint of how to obtain both, density of the system and density functional of any desired property.

The Kohn-Sham (KS) scheme7 is a clever iterative mapping commonly used to obtain the density of an interacting system. It maps the many-body interacting system into a ficticious non-interacting one via the so-called KS potential vKS. Here vKS is constructed to reproduce the density n(r) of the interacting system. Since vKS depends on the exchange-correlation energy and the latter is a density functional typically unknown exactly, in practical calculations one needs to make use of approximations to obtain the particle density. Hence, the performance of the DFT results depends on the approximations and functionals used.

Therefore there is a very active community in DFT dedicated to the development and the optimization of density functionals for practical DFT calculations. Less exploited tools in this research line are however the artificial neural networks (ANN)8. ANN are computational models that refer to the human brain functioning and are capable of calculating non-linear mathematical functions. It resembles the biological brain: it is able to be trained with a dataset and afterwards generalize the learning to other data of the same problem, being a type of machine learning used to estimate parameters and to recognize patterns.

Among the advantages of ANN one can mention the precision of the results, the easy implementation, the fast calculations and the ability of generalizing from examples9. Once an ANN has been trained and validated satisfactorily, it is able to predict new outputs from different input data (in the same domain, but different from those used in the training). ANN are used in many areas including neurocomputation, chemical engineering, industrial applications, medicine, chemistry and physics10,11,12,13,14,15,16,17.

Here we apply ANN’s concepts to estimate the ground-state energy of fermionic interacting particles in homogeneous chains as described by the one-dimensional Hubbard model. We designed an ANN functional which is able to recover the numerically exact energy of the Hubbard by less than 0.15% for all the regimes of filling factors, 0 ≤ n ≤ 1, all the regimes of magnetization, 0 ≤ m ≤ n, and a vast regime of interactions, 0 ≤ U ≤ 10t, including strongly correlated materials, with typical interaction U~6t, where t is the hopping term. Our ANN functional was also proven to be superior to current parameterizations, used as input in local and non-local approximations for DFT methods. Additionally, we have used the homogeneous ANN functional as input in DFT calculations for investigating finite, localized impurities and harmonically confined systems. We find that while our ANN calculations are notably more accurate than other similarly simple and fast DFT approaches, it is two orders of magnitude faster. Thus the ANN model proposed here could be used as a practical and reliable tool in DFT calculations for such inhomogeneous fermionic systems, including superlattices18,19 and disordered chains20,21, for which exact calculations are nontrivial.

Theoretical and Computational Methods

We consider one-dimensional homogeneous chains described by the fermionic Hubbard model22,

$$\hat{H}=-\,t\,\sum _{\langle ij\rangle \sigma }\,({\hat{c}}_{i\sigma }^{\dagger }{\hat{c}}_{j\sigma }+H.\,c.\,)+U\,\sum _{i}\,{\hat{n}}_{i\uparrow }{\hat{n}}_{i\downarrow },$$
(1)

where U is the on-site interaction and t is the nearest-neighbor hopping term, while \({\hat{c}}_{i\sigma }^{\dagger }\) (\({\hat{c}}_{i\sigma }\)) is the creation (annihilation) operator of fermions at site i with spin \(\sigma =\uparrow \,,\downarrow \), and \({\hat{n}}_{i\sigma }={\hat{c}}_{i\sigma }^{\dagger }{\hat{c}}_{i\sigma }\) the number operator. The average charge density or filling factor is given by \(n=N/L=\langle {\hat{n}}_{\uparrow }\rangle +\langle {\hat{n}}_{\downarrow }\rangle \), while the magnetization or spin density is \(m=\langle {\hat{n}}_{\uparrow }\rangle -\langle {\hat{n}}_{\downarrow }\rangle \), where \(N={N}_{\uparrow }+{N}_{\downarrow }\) is the total number of particles and L the chain size.

This is the simplest model to describe itinerant and interacting electrons in a chain, such as in solids and in nanostructures23,24,25, or cold atoms in an optical lattice26,27,28. Despite its simplicity the Hubbard model describes important phenomena29, such as the Mott transition from metal to insulator (for U > 0, at n = 1)30, the transition from Bardeen-Cooper-Schrieffer superfluid to Bose-Einstein condensate (for U < 0, at m = 0)30 and exotic superfluidity (for U < 0 and m ≠ 0)28,31,32.

Nevertheless the exact analytical solution of the model is in general unknown – except for some specific conditions33 – thus one has to perform numerical calculations. For spatially homogeneous chains, L = ∞, one can obtain an exact fully numerical (FN) solution by solving the Lieb-Wu integrals34. For finite chains, using exact diagonalization we are limited to small chains, \(L\lesssim 15\), while with density-matrix renormalization group (DMRG)35, which is almost exact, one can solve chains of \(L\lesssim 200\) sites. DMRG are however computationally expensive calculations: a single calculation might take several hours in a high-performance computing cluster.

Thus DFT appears as a powerful tool to solve larger chains at a lower computational cost: a typical DFT calculation for the Hubbard model takes no longer than 1 minute. DFT may also be applied to solve inhomogeneous chains, i.e., in the presence not only of boundaries, but also impurities, interfaces, superlattices, harmonic confinement and disorder. This is possible thanks to – local or non-local – approximations to the exchange-correlation functional, which require, as an input, a functional for the per-site ground-state energy of the homogeneous system.

In particular, the local-spin-density approximation (LSDA) for the ground-state energy is given by

$${e}_{0}^{inh}\approx {e}_{0}^{LSDA}=\frac{1}{L}\,{\sum _{i=1}^{L}{e}_{0}^{{\hom }}[n,m,U]|}_{\begin{array}{l}n\to {n}_{i}\\ m\to {m}_{i}\end{array}}$$
(2)

where \({e}_{0}^{hom}\) is the homogeneous functional for the ground-state energy density and (ni, mi) are local densities at site i of the inhomogeneous system. Thus Eq. (2) approximates the energy of inhomogeneous chains by evaluating the energy of the homogeneous system with the site-by-site replacement n → ni and m → mi. In practical DFT, using the self-consistent KS scheme7, it implies the evaluation of partial derivatives of the energy with respect to ni and mi (to obtain vKS) in each of the iterations of the self-consistency cycle. Therefore, in order to solve inhomogeneous Hubbard chains within DFT-LSDA, the first step is to obtain a simple and reliable functional \({e}_{0}^{hom}[n,m,U]\) for the homogeneous limit.

Although the fully numerical (FN) solutions of the Lieb-Wu integrals34 (exact only for infinite chains) are certainly a trusty functional for the DFT-LSDA treatment of inhomogeneous systems, they are computationally expensive. Besides, the integrals are not direct functions of (n, m) and hence we do not have total freedom to choose specific values to be used site by site in each iteration of the KS cycles. Therefore, for each interaction U, one has to produce a huge amount of FN data to cover all the regimes of ni and mi in order to obtain the vKS numerical derivatives with high precision.

So one possibility to overcome this numerical difficulty is then to adopt a reliable analytical expression for the homogeneous functional. An example for this approach is the analytical FVC functional33, which extended previous expressions36 by incorporating the magnetization, has been currently used as a practical tool for DFT-LSDA calculations20,37,38,39,40. The FVC parametrization properly recovers all the analytical known limiting cases (for U = 0, U = ∞ and for the combined set n = 1 and m = 0), correctly predicts a positive Mott gap at n = 1 for any U > 0, and is also more precise in general than the previous expressions36. While for most of the parameters the FVC functional presents relatively low deviations from numerically exact results (~2%), its accuracy is considerably worse for the weakly interacting regime. Thus, the ANN functional proposed here is a simpler and faster numerical approach to inhomogeneous systems within DFT calculations, with superior accuracy than FVC and computationally cheaper than FN and DMRG.

Results and Discussion

The training of our neural model was performed with the Levenberg-Marquartd algorithm41. The process involves the random split of the exact data in three sets: 70% for the training itself, 15% for the validation and the remaining 15% for testing the ANN performance. The input parameters of our ANN model are the variables density n, magnetization m and interaction U, while the output is the per-site ground-state energy \({e}_{0}^{ANN}(n,m,U)\). We consider 20,891 numerically exact Lieb-Wu results within the regimes 0 ≤ U ≤ 10t, 0 ≤ n ≤ 1 and 0 ≤ m ≤ n. Thus the possible topologies of our ANN are of type: 3 input neurons {n, m, U} – a certain number of hidden layers – a single output neuron \(\{{e}_{0}^{ANN}(n,m,U)\}\). Several topologies were studied, e.g., with a single hidden layer – 3-5-1, 3-10-1 and 3-20-1 – and with two hidden layers, 3-12-12-1 and 3-10-10-1.

Our target performance was below 10−6 for the mean squared error, which was obtained after extensive training by the topology 3-20-1 with 1000 epochs (we choose to stop the training in 1000 epochs to avoid over-fitting by the network, which could lead to a poorer generalization performance), as shown in Fig. 1. This architecture contains one hidden layer composed of twenty neurons, with tangent sigmoid transfer function, while the output layer has only one node with a linear transfer function. The trained neural network can be found online free of charge as Supplementary Material (ANN_to_Hubbard.mat and Read_me.pdf files).

Figure 1
figure 1

Structure of our ANN model, 3-20-1, and its performance during training.

Now we compare the perfomance of our ANN model with the FVC parametrization, for all the regimes of parameters. We start by considering non-magnetic systems. In Fig. 2 we show the energy as a function of the density for a strongly correlated case, with U = 6t. While the overall behavior is well described by both, ANN and FVC, we find that the ANN deviations are below ~0.1% (see inset), smaller than the FVC ones, which in some cases are 10 times bigger. This higher performance of ANN over FVC for strongly interacting systems is surprising, since at this regime FVC is recognized as sufficiently precise33.

Figure 2
figure 2

Per-site ground-state energy as a function of the density for the FVC and ANN functionals in comparison to the numerically exact FN result. The inset shows the percentage deviations, obtained via \(100\times |({e}_{0}^{approx}-{e}_{0}^{exact})/{e}_{0}^{exact}|\). In all cases the interaction is U = 6t, the magnetization is m ~ 0 and D is the average percentage deviation.

Figure 3 presents the energy as a function of the interaction for non-magnetic chains. It reveals that the excellent performance of our ANN model is not restricted to the strongly correlated systems: within 0 ≤ U ≤ 10t and m ~ 0 the ANN deviates by less than ~0.08% for all the interactions, while FVC deviates on average ~0.5%.

Figure 3
figure 3

Per-site ground-state energy as a function of the on-site interaction U (in units of t), for the FVC and ANN functionals in comparison to the numerically exact FN result. Here n ~ 0.5 and magnetization m ~ 0. The inset shows the percentage deviations, obtained via \(100\times |({e}_{0}^{approx}-{e}_{0}^{exact})/{e}_{0}^{exact}|\), while D is the average percentage deviation.

We also compare the performance of ANN and FVC for magnetic systems. Figure 4 presents the energy as a function of the magnetization for several interactions. We find that while FVC in some cases fails to recover even the qualitative behavior of the energy, our ANN model properly describes the exact trend. Quantitatively, while FVC deviations reach up to ~7% (for U = 0.5t), the ANN model remains reliable, with deviations inferior to 0.15%. We also calculate numerically the magnetic susceptibility χ, related to the energy by \({\chi }^{-1}={\partial }^{2}e/\partial {m}^{2}{|}_{m=0}\). For the case of U = 0.5t we find χFN = 0.896, χANN = 0.825 and χFVC = 1.335, illustrating that the higher accuracy on the total energy leads in general to more precise observables.

Figure 4
figure 4

Per-site ground-state energy as a function of the magnetization m, for the FVC and ANN functionals in comparison to the numerically exact FN result. Here n ~ 0.5 and D is the average percentage deviation.

Finally, we illustrate the use of our ANN functional within DFT-LSDA calculations by applying it to inhomogeneous (i) unconfined finite chains, (ii) localized impurity systems (\({V}_{i}=V{\delta }_{i,{i}_{V}}\)) and (iii) harmonically confined chains (Vi = k(i − i0)2). The harmonic potential is particularly relevant to simulate state-of-the-art experiments with cold fermionic atoms in optical lattices. We compare the performance of DFT-LSDA – with each functional ANN, FVC and FN – to DMRG calculations.

Figure 5 summarizes the essential features of our analysis to the several inhomogeneous test cases. For finite unconfined chains, Fig. 5a, the typical Friedel-like oscillations42 on the local densities are in general well reproduced by all the three functionals. The average percentage deviations in this case are similar, slightly larger for FVC, because the chain is relatively large, L = 40. For smaller chains and/or with inhomogeneous external potentials Vi, as in Fig. 5b,c, we see that the FVC is clearly the less accurate functional. Figure 5d, for the harmonically confined case, shows that the deviations on the ground-state energy follow similar trend: although the deviation oscillates with the harmonic strength k for the approaches, the FVC deviations are systematically larger than the FN and ANN ones. Thus, our results demonstrate that the ANN approach has a very good balance between precision and computational cost: it is two orders of magnitude faster than FN and DMRG and significantly more precise than FVC.

Figure 5
figure 5

Density profile for (a) a finite chain of size L = 40 and n = 0.8, (b) a localized impurity system, with L = 15, impurity strength V = 2t and n = 0.4, and (c) a harmonically confined chain with L = 40, harmonic curvature k = 0.01 and n = 0.8, while (d) shows the per-site ground-state energy as a function of the harmonic confinement curvature for chains of size L = 40 and n = 0.8. In all cases we adopted open boundary conditions, U = 1t and m ~ 0. The results are either exact DMRG or approximated DFT-LSDA, with one of the functionals: FN, FVC or ANN. The percentage deviations are obtained via \(100\times |({e}_{0}^{approx}-{e}_{0}^{exact})/{e}_{0}^{exact}|\) and D is the average percentage deviation.

Conclusions

In summary, we have developed a simple, fast and reliable artificial neural network functional for the ground-state energy of the homogeneous fermionic Hubbard chains. We have demonstrated the power of the ANN functional in DFT methods by exploring finite, localized impurities and harmonically confined systems. The results show that our ANN approach not only properly captures the qualitative behavior of energy and density profiles for all the inhomogeneous systems, but also leads to more accurate results than previous parameterizations. This excellent performance is similar to other numerical solutions (FN and DMRG), but at a fraction of the computational cost. Therefore our ANN functional is a practical tool to be implemented in DFT calculations, such as in local and non-local density approximations, for inhomogeneous Hubbard chains. Our approach is particularly useful to investigate disordered systems, where many realizations are needed for each randomly localized impurities scenario, what makes analyses via exact methods computationally prohibitive.