Artificial neural networks for density-functional optimizations in fermionic systems

In this work we propose an artificial neural network functional to the ground-state energy of fermionic interacting particles in homogeneous chains described by the Hubbard model. Our neural network functional was proven to have an excellent performance: it deviates from numerically exact calculations by less than 0.15% for a vast regime of interactions and for all the regimes of filling factors and magnetizations. When compared to analytical functionals, the neural functional was found to be more precise for all the regimes of parameters, being particularly superior at the weakly interacting regime: where the analytical parametrization fails the most, ~7%, against only ~0.1% for the neural network. We have also applied our homogeneous functional to finite, localized impurities and harmonically confined systems within density-functional theory (DFT) methods. The results show that while our artificial neural network approach is substantially more accurate than other equivalently simple and fast DFT treatments, it has similar performance than more costly DFT calculations and other independent many-body calculations, at a fraction of the computational cost.

Theoretical approaches at the quantum mechanical level are essential to the full understanding of the matter and its properties. Although the wavefunction of a given system -and thus any property of that system -may in principle be obtained by solving the Schrödinger equation, this exact method is extremely costly for many-particle interacting systems, despite the availability of high computational resources. This difficulty is enhanced further when the theoretical simulations incorporate also spatial inhomogeneities -such as borders, impurities, confining potentials and disorder -which are sometimes desired 1 but in most of the cases unavoidable in realistic systems and experimental setups.
In this context the density-functional theory (DFT) 2-4 appears as an interesting and powerful tool. Within DFT the particle density of a system with N interacting particles is used as the central quantity, instead of the wavefunction. Thus the task of obtaining Ψ(r 1 , r 2 , …, r N ) -a 3N-dimensional function -is reduced to obtain n(r) -a 3-dimensional function. This illustrates how much simpler DFT calculations are in comparison to wavefunction-based quantum calculations. The one-to-one mapping between n and Ψ is guaranteed by the Hohenberg-Kohn (HK) theorem 2 , which has been recently formally proved to be valid also for lattice systems 5,6 . As a consequence of the HK theorem all the observables are a density functional and therefore may in principle be obtained via DFT calculations. The theorem does not provide however any hint of how to obtain both, density of the system and density functional of any desired property.
The Kohn-Sham (KS) scheme 7 is a clever iterative mapping commonly used to obtain the density of an interacting system. It maps the many-body interacting system into a ficticious non-interacting one via the so-called KS potential v KS . Here v KS is constructed to reproduce the density n(r) of the interacting system. Since v KS depends on the exchange-correlation energy and the latter is a density functional typically unknown exactly, in practical calculations one needs to make use of approximations to obtain the particle density. Hence, the performance of the DFT results depends on the approximations and functionals used.
Therefore there is a very active community in DFT dedicated to the development and the optimization of density functionals for practical DFT calculations. Less exploited tools in this research line are however the artificial neural networks (ANN) 8 . ANN are computational models that refer to the human brain functioning and are capable of calculating non-linear mathematical functions. It resembles the biological brain: it is able to be trained with a dataset and afterwards generalize the learning to other data of the same problem, being a type of machine learning used to estimate parameters and to recognize patterns. Among the advantages of ANN one can mention the precision of the results, the easy implementation, the fast calculations and the ability of generalizing from examples 9 . Once an ANN has been trained and validated satisfactorily, it is able to predict new outputs from different input data (in the same domain, but different from those used in the training). ANN are used in many areas including neurocomputation, chemical engineering, industrial applications, medicine, chemistry and physics [10][11][12][13][14][15][16][17] .
Here we apply ANN's concepts to estimate the ground-state energy of fermionic interacting particles in homogeneous chains as described by the one-dimensional Hubbard model. We designed an ANN functional which is able to recover the numerically exact energy of the Hubbard by less than 0.15% for all the regimes of filling factors, 0 ≤ n ≤ 1, all the regimes of magnetization, 0 ≤ m ≤ n, and a vast regime of interactions, 0 ≤ U ≤ 10t, including strongly correlated materials, with typical interaction U~6t, where t is the hopping term. Our ANN functional was also proven to be superior to current parameterizations, used as input in local and non-local approximations for DFT methods. Additionally, we have used the homogeneous ANN functional as input in DFT calculations for investigating finite, localized impurities and harmonically confined systems. We find that while our ANN calculations are notably more accurate than other similarly simple and fast DFT approaches, it is two orders of magnitude faster. Thus the ANN model proposed here could be used as a practical and reliable tool in DFT calculations for such inhomogeneous fermionic systems, including superlattices 18,19 and disordered chains 20,21 , for which exact calculations are nontrivial.

Theoretical and Computational Methods
We consider one-dimensional homogeneous chains described by the fermionic Hubbard model 22 , N N N is the total number of particles and L the chain size. This is the simplest model to describe itinerant and interacting electrons in a chain, such as in solids and in nanostructures [23][24][25] , or cold atoms in an optical lattice [26][27][28] . Despite its simplicity the Hubbard model describes important phenomena 29 , such as the Mott transition from metal to insulator (for U > 0, at n = 1) 30 , the transition from Bardeen-Cooper-Schrieffer superfluid to Bose-Einstein condensate (for U < 0, at m = 0) 30 and exotic superfluidity (for U < 0 and m ≠ 0) 28,31,32 .
Nevertheless the exact analytical solution of the model is in general unknown -except for some specific conditions 33 -thus one has to perform numerical calculations. For spatially homogeneous chains, L = ∞, one can obtain an exact fully numerical (FN) solution by solving the Lieb-Wu integrals 34 . For finite chains, using exact diagonalization we are limited to small chains,  L 15, while with density-matrix renormalization group (DMRG) 35 , which is almost exact, one can solve chains of  L 200 sites. DMRG are however computationally expensive calculations: a single calculation might take several hours in a high-performance computing cluster.
Thus DFT appears as a powerful tool to solve larger chains at a lower computational cost: a typical DFT calculation for the Hubbard model takes no longer than 1 minute. DFT may also be applied to solve inhomogeneous chains, i.e., in the presence not only of boundaries, but also impurities, interfaces, superlattices, harmonic confinement and disorder. This is possible thanks to -local or non-local -approximations to the exchange-correlation functional, which require, as an input, a functional for the per-site ground-state energy of the homogeneous system.
In particular, the local-spin-density approximation (LSDA) for the ground-state energy is given by is the homogeneous functional for the ground-state energy density and (n i , m i ) are local densities at site i of the inhomogeneous system. Thus Eq. (2) approximates the energy of inhomogeneous chains by evaluating the energy of the homogeneous system with the site-by-site replacement n → n i and m → m i . In practical DFT, using the self-consistent KS scheme 7 , it implies the evaluation of partial derivatives of the energy with respect to n i and m i (to obtain v KS ) in each of the iterations of the self-consistency cycle. Therefore, in order to solve inhomogeneous Hubbard chains within DFT-LSDA, the first step is to obtain a simple and reliable functional e n m U [ , , ] hom 0 for the homogeneous limit. Although the fully numerical (FN) solutions of the Lieb-Wu integrals 34 (exact only for infinite chains) are certainly a trusty functional for the DFT-LSDA treatment of inhomogeneous systems, they are computationally expensive. Besides, the integrals are not direct functions of (n, m) and hence we do not have total freedom to choose specific values to be used site by site in each iteration of the KS cycles. Therefore, for each interaction U, one has to produce a huge amount of FN data to cover all the regimes of n i and m i in order to obtain the v KS numerical derivatives with high precision.
So one possibility to overcome this numerical difficulty is then to adopt a reliable analytical expression for the homogeneous functional. An example for this approach is the analytical FVC functional 33 , which extended previous expressions 36  and is also more precise in general than the previous expressions 36 . While for most of the parameters the FVC functional presents relatively low deviations from numerically exact results (~2%), its accuracy is considerably worse for the weakly interacting regime. Thus, the ANN functional proposed here is a simpler and faster numerical approach to inhomogeneous systems within DFT calculations, with superior accuracy than FVC and computationally cheaper than FN and DMRG.

Results and Discussion
The training of our neural model was performed with the Levenberg-Marquartd algorithm 41 . The process involves the random split of the exact data in three sets: 70% for the training itself, 15% for the validation and the remaining 15% for testing the ANN performance. The input parameters of our ANN model are the variables density n, magnetization m and interaction U, while the output is the per-site ground-state energy e n m U ( , , ) . Several topologies were studied, e.g., with a single hidden layer -3-5-1, 3-10-1 and 3-20-1 -and with two hidden layers, 3-12-12-1 and 3-10-10-1.
Our target performance was below 10 −6 for the mean squared error, which was obtained after extensive training by the topology 3-20-1 with 1000 epochs (we choose to stop the training in 1000 epochs to avoid over-fitting by the network, which could lead to a poorer generalization performance), as shown in Fig. 1. This architecture contains one hidden layer composed of twenty neurons, with tangent sigmoid transfer function, while the output layer has only one node with a linear transfer function. The trained neural network can be found online free of charge as Supplementary Material (ANN_to_Hubbard.mat and Read_me.pdf files). Now we compare the perfomance of our ANN model with the FVC parametrization, for all the regimes of parameters. We start by considering non-magnetic systems. In Fig. 2 we show the energy as a function of the density for a strongly correlated case, with U = 6t. While the overall behavior is well described by both, ANN and FVC, we find that the ANN deviations are below ~0.1% (see inset), smaller than the FVC ones, which in some cases are 10 times bigger. This higher performance of ANN over FVC for strongly interacting systems is surprising, since at this regime FVC is recognized as sufficiently precise 33 . Figure 3 presents the energy as a function of the interaction for non-magnetic chains. It reveals that the excellent performance of our ANN model is not restricted to the strongly correlated systems: within 0 ≤ U ≤ 10t and m ~ 0 the ANN deviates by less than ~0.08% for all the interactions, while FVC deviates on average ~0.5%.
We also compare the performance of ANN and FVC for magnetic systems. Figure 4 presents the energy as a function of the magnetization for several interactions. We find that while FVC in some cases fails to recover even the qualitative behavior of the energy, our ANN model properly describes the exact trend. Quantitatively, while FVC deviations reach up to ~7% (for U = 0.5t), the ANN model remains reliable, with deviations inferior to 0.15%. We also calculate numerically the magnetic susceptibility χ, related to the energy by χ = ∂ ∂ | − = e m / m 1 2 2 0 . For the case of U = 0.5t we find χ FN = 0.896, χ ANN = 0.825 and χ FVC = 1.335, illustrating that the higher accuracy on the total energy leads in general to more precise observables.
Finally, we illustrate the use of our ANN functional within DFT-LSDA calculations by applying it to inhomogeneous (i) unconfined finite chains, (ii) localized impurity systems ( δ = V V i ii , V ) and (iii) harmonically confined chains (V i = k(i − i 0 ) 2 ). The harmonic potential is particularly relevant to simulate state-of-the-art experiments with cold fermionic atoms in optical lattices. We compare the performance of DFT-LSDA -with each functional ANN, FVC and FN -to DMRG calculations. Figure 5 summarizes the essential features of our analysis to the several inhomogeneous test cases. For finite unconfined chains, Fig. 5a, the typical Friedel-like oscillations 42 on the local densities are in general well reproduced by all the three functionals. The average percentage deviations in this case are similar, slightly larger for FVC, because the chain is relatively large, L = 40. For smaller chains and/or with inhomogeneous external potentials V i , as in Fig. 5b,c, we see that the FVC is clearly the less accurate functional. Figure 5d, for the harmonically confined case, shows that the deviations on the ground-state energy follow similar trend: although the deviation oscillates with the harmonic strength k for the approaches, the FVC deviations are systematically larger than the FN and ANN ones. Thus, our results demonstrate that the ANN approach has a very good balance between precision and computational cost: it is two orders of magnitude faster than FN and DMRG and significantly more precise than FVC.

Conclusions
In summary, we have developed a simple, fast and reliable artificial neural network functional for the ground-state energy of the homogeneous fermionic Hubbard chains. We have demonstrated the power of the ANN functional in DFT methods by exploring finite, localized impurities and harmonically confined systems. The results show that our ANN approach not only properly captures the qualitative behavior of energy and density profiles for all the inhomogeneous systems, but also leads to more accurate results than previous parameterizations. This excellent performance is similar to other numerical solutions (FN and DMRG), but at a fraction of the computational cost. Therefore our ANN functional is a practical tool to be implemented in DFT calculations, such as in local and non-local density approximations, for inhomogeneous Hubbard chains. Our approach is particularly useful to investigate disordered systems, where many realizations are needed for each randomly localized impurities scenario, what makes analyses via exact methods computationally prohibitive.

Data Availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.