Learning excited states from ground states by using an artificial neural network

Excited states are quantum states distinct from their corresponding ground states, and spectroscopy methods that probe excited states are widely used in materials characterization. Understanding the spectra that reflect excited states is thus of great importance for materials science. However, such spectra remain difficult to interpret because excited states usually have different atomic or electronic configurations from their corresponding ground states. If excited states could be predicted from ground states, the knowledge of the excited states would be improved. Here, we used an artificial neural network to predict the excited states of the core-electron absorption spectra from their ground states. Our model correctly learned and predicted the excited states from their ground states, with a computational efficiency several thousand times higher than that of conventional simulation. Furthermore, it showed excellent transferability to other materials. We also obtained two physical insights about excited states: core-hole effects of amorphous silicon oxides are stronger than those of crystalline silicon oxides, and the excited-ground state relationships of some metal oxides are similar to those of the silicon oxides. Neither insight could be obtained by conventional spectral simulation alone; both emerged only through machine learning.


INTRODUCTION
Excited states are quantum states of an atom or electron with higher energy than its ground state. Excited states can be generated by irradiation, such as by photons or electrons, and can be measured by spectroscopic techniques, such as absorption or emission spectroscopy. As the fine profiles of these excited-state-related spectra reflect interactions with external fields as well as atomic configurations and chemical bonding, spectroscopy methods that assess excited states have been extensively used in materials characterization [1][2][3][4][5].
While spectroscopy is indispensable for analyzing the atomic and electronic structures of materials, understanding spectra that reflect excited states is always difficult because excited states usually have atomic or electronic configurations that are different from their corresponding ground states. To learn about excited states, theoretical simulations of spectra have therefore been developed. Among spectroscopy methods, core-electron absorption spectroscopy using electrons or X-rays, namely electron energy-loss near-edge structure (ELNES) and X-ray absorption near-edge structure (XANES), is particularly powerful [6][7][8][9][10][11][12][13][14], and simulations of these spectra have therefore been performed 10,15,16. Combining ELNES/XANES with simulation allows local atomic and electronic structures to be analyzed, but doing so is not straightforward because such simulations must treat both the ground and final states, and often must consider two-particle and multi-particle interactions 17, which makes the computational cost several hundred or several thousand times higher than that of a ground-state-only calculation. If excited states could be predicted from ground states, the knowledge of the excited states and the understanding of spectral features would both be improved.
In this manuscript, we used an artificial neural network (ANN) to predict excited states from ground states. Among spectroscopy methods that assess excited states, we focused on the core-electron excited states, namely ELNES/XANES spectra, and designed an ANN model to predict the fine ELNES/XANES profiles of crystalline materials from their ground-state density of states. After training the ANN model on ELNES/XANES data from crystalline materials, we transferred it to predict the ELNES/XANES of amorphous and other materials. The "transferability" of our model provided insights into the excited states, such as how atomic configurations affect their excited-state electronic structures.

Architecture of our artificial neural network
We believe that ANNs are suitable for predicting ELNES/XANES spectra because ANNs can easily handle multidimensional input-output pairs, and because they can incorporate complicated, nonlinear correlations between spectral features. Figure 1 shows the architecture of our ANN. It uses a simple feedforward architecture, in which the input layer receives information obtained at low computational cost and the output layer (i.e., the target to learn) is a spectrum reflecting the excited state. We used the partial density of states (PDOS) at the ground state as the input and the intensities of the ELNES/XANES spectra at each energy as the output, because the core-electron excitation is known to follow Fermi's golden rule. Indeed, the ground-state PDOS can be easily calculated using the primitive cell, which takes only several seconds or minutes to compute; in contrast, direct simulation of ELNES/XANES takes several hours or days because simulating both the ground and excited states requires a sufficiently large supercell. The details of our ANN and the preprocessing of PDOS and ELNES/XANES are provided in "Methods".
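As an illustration of the feedforward mapping described above, the following minimal sketch passes a 160-point ground-state PDOS through one ReLU hidden layer to a 160-point spectrum. It is not the authors' implementation: the hidden width of 64, the random initialization, the linear output layer, and all function names are assumptions made for this example, and the input is a random placeholder rather than a real PDOS. Dropout, which acts only during training, is omitted.

```python
import random

N_ENERGY = 160  # grid points per PDOS/spectrum, as stated in "Methods"


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights (illustrative init)."""
    scale = (2.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Affine map: one weighted sum plus bias per output node."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(x):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in x]


def predict(pdos, layers):
    """Forward pass: ReLU on hidden layers, linear output layer (assumed)."""
    h = pdos
    for layer in layers[:-1]:
        h = relu(dense(h, layer))
    return dense(h, layers[-1])


rng = random.Random(0)
hidden_nodes = 64  # hypothetical width; the tuned values are in the Supplementary Tables
layers = [
    init_layer(N_ENERGY, hidden_nodes, rng),
    init_layer(hidden_nodes, N_ENERGY, rng),
]
pdos = [rng.random() for _ in range(N_ENERGY)]  # placeholder input, not a real PDOS
spectrum = predict(pdos, layers)  # untrained output; only the shapes are meaningful
```

With untrained weights the output carries no physical meaning; the sketch only shows how one intensity per energy point enters and leaves the network.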
Database and processing
In this study, we prepared spectrum datasets by simulation because they do not include accidental noise or errors. We selected the oxygen-K (O-K) edge of silicon oxide polymorphs for three reasons: (1) Silicon oxide has many polymorphs, so a large dataset can be prepared. (2) O-K edges can be correctly calculated by a simple one-particle calculation based on the generalized gradient approximation of density functional theory (DFT-GGA). (3) Silicon oxides are used in various applications such as windows, catalysts, and batteries 35,36. To prepare these datasets, we extracted 188 silicon oxide crystals from the Materials Project database 37 and then calculated 1171 O-K edge spectra from their multiple oxygen sites. We also simulated the O-K edges of amorphous silicon dioxide in order to understand how its various atomic configurations affect the magnitude of its excited state. We constructed amorphous structures with 72 silicon and 144 oxygen atoms by using a classical molecular dynamics simulation. The detailed simulation conditions for calculating O-K edges and the procedure for constructing the amorphous structures are provided in "Methods".

DISCUSSION
We evaluated our ANN-based predictive model by assessing the mean-squared error (MSE) of the test dataset. In Fig. 2a, we show the obtained MSEs, where the x-axis is the data sample index and the y-axis is the sorted MSE. More than 90 % of the test samples had relatively small MSEs of <0.4. Overall, our ANN correctly predicted the ELNES/XANES spectra from the ground-state PDOS. Notably, our model only requires information about the PDOS of the primitive cell at the ground state, so this calculation is several thousand times more efficient than a conventional first-principles ELNES/XANES simulation.
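The evaluation described above can be sketched as follows: compute one MSE per test spectrum, sort the values as in Fig. 2a, and report the fraction below a threshold. The function names and the three-point toy spectra are hypothetical and serve only to show the arithmetic; they are not data from this study.

```python
def mean_squared_error(predicted, reference):
    """Mean of squared point-wise differences between two spectra."""
    if len(predicted) != len(reference):
        raise ValueError("spectra must share the same energy grid")
    return sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(predicted)


def sorted_mses(predictions, references):
    """One MSE per sample, sorted in ascending order (the y-axis of Fig. 2a)."""
    return sorted(mean_squared_error(p, r) for p, r in zip(predictions, references))


def fraction_below(mses, threshold):
    """Fraction of samples whose MSE is strictly below the threshold."""
    return sum(1 for m in mses if m < threshold) / len(mses)


# Toy three-point "spectra" (made up for illustration only).
predictions = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
references = [[1.0, 2.0, 3.0], [1.0, 1.0, 1.0], [1.0, 1.0, 3.0]]
mses = sorted_mses(predictions, references)
share = fraction_below(mses, 0.4)  # same threshold as quoted in the text
```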
In addition, we extracted some predicted spectra from Fig. 2a and analyzed them in detail. Figure 2b-d shows three predicted spectra, A, B, and C, respectively, each corresponding to a data sample marked by an arrow in Fig. 2a. In each panel, the predicted spectrum is the deep-green line and the calculated result is the light-green line. Predicted spectrum A (MSE = 0.01) closely matched the correct spectrum (Fig. 2b). Notably, predicted spectrum B (MSE = 0.38) also captured the correct spectral features, such as peak positions, sufficiently well (Fig. 2c), but predicted spectrum C (MSE = 1.2) did not have the correct peak positions (Fig. 2d).
Fig. 1 Schematic of our NN architecture for predicting ELNES/XANES spectra. The architecture is composed of three types of layers: input, hidden, and output. Each circle represents a layer node, through which information enters and exits.
We further investigated the well-predicted spectrum A by comparing the ELNES/XANES (green line) to the ground-state PDOS (blue line) in Fig. 2b. In this figure, we denote each peak in the ELNES/XANES as a, b, c, and d on the green line; we denote each peak in the ground-state PDOS as aʹ, bʹ, cʹ, and dʹ on the blue line. Comparing these peak positions shows that the ground-state PDOS peaks, aʹ-dʹ, are shifted to lower energy in the ELNES/XANES, where they appear as peaks a-d. These peak shifts are shown by the left-facing arrows in Fig. 2b. We now describe the physical mechanism behind these shifts to lower energy. As described in the Introduction, the ELNES/XANES is generated from the hole in the core orbital, referred to as a core-hole 38,39. In our case, the O-K edge has a core-hole in its O 1s orbital. The core-hole gives rise to a strong Coulomb interaction from the nucleus, so that the wave functions become largely localized near the excited atom; this shifts the peaks to lower energy, i.e., from aʹ-dʹ to a-d in Fig. 2b. Also, because of the stronger Coulomb interaction of the nucleus, the lower-energy peaks generally have larger shifts than the higher-energy peaks do. Indeed, as shown by the length of each left-facing arrow in Fig. 2b, our ANN reproduced this behavior. In other words, our ANN, which correctly predicted the ELNES/XANES spectra from the ground-state PDOS, learned the excited state derived from the core-hole.
So far, we have confirmed that our ANN correctly learned and predicted the excited states from the ground states; now, we will apply the trained ANN model. Once a machine learning (ML) model is trained on a given dataset, it can be used to make predictions from another dataset. We therefore transferred our ANN trained on crystalline silicon oxides to predict the excited states of amorphous silicon oxides (see "Methods"). Figure 3a shows the MSEs of the amorphous materials as gray symbols; these MSEs are dozens of times larger than those of the crystalline materials shown in Fig. 2a. In the following, we will investigate these error trends and provide a way to improve the prediction performance.
Similar to Fig. 2, Fig. 3b-d shows three extracted spectra, D, E, and F, denoted by arrows. The predicted spectrum (green line) in Fig. 3b, which had the lowest MSE of 0.3, closely matched the correct spectrum (light-green line). However, the main peaks of the predicted spectra in Fig. 3c, d were underestimated compared to the correct spectra. These misprediction trends mean that the strength of the excited-state effect learned from crystalline silicon oxides was too weak for the amorphous silicon oxides. In other words, the amorphous silicon oxide is subject to larger core-hole effects than the crystalline silicon oxides, so the predictive model trained on the crystalline data underestimated the core-hole effects. Consistent with this interpretation, we observed "overestimations" when we trained a model on the amorphous data and applied it to the crystalline data (see Supplementary Fig. 1). This finding cannot be obtained by conventional ELNES/XANES simulation alone, because such a simulation calculates the excited state directly, and the excited state of the crystalline material cannot be "transferred" to the amorphous material.
To address this underestimation, we consider the shielding effects, which are involved in the core-hole effects described above. Indeed, the core-hole is commonly derived from "one" electronic hole at the core state, so the strength of the Coulomb interactions from the nucleus should be similar among these O-K edges. Thus, we could speculate that the changes in the core-hole effects can be ascribed to differences in the shielding effects.
In the following, we consider using the band gap as additional input information because it is a representative quantity that can indicate how an electric field (e.g., one generated by the core-hole) influences the electronic structure (i.e., the shielding effects).
To include the band gap in the input of our ANN, we obtained it from the GGA-PBE simulation at the ground state, which means that no additional calculations are needed.
Specifically, we concatenated with the inputs the PDOS intensities from the middle of the band gap (i.e., midway between the valence band top and the conduction band bottom) to the threshold of the conduction bands. Thus, the band gap information was naturally included in the input of our ANN; that is, the range of zero intensities at the beginning of the input corresponded to the band gap information.
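One way to picture this encoding is to cut the PDOS window so that it starts at the mid-gap energy: the leading run of zero intensities then spans half the band gap. The sketch below is only an illustration under stated assumptions, namely a uniform energy grid, an idealized (unbroadened) PDOS that is exactly zero inside the gap, and hypothetical names and toy values throughout. It is not the authors' preprocessing code.

```python
def midgap_aligned_pdos(pdos, e_min, step, vbm, cbm, n_points):
    """Return n_points PDOS values starting at the mid-gap energy.

    pdos  : intensities on a uniform grid starting at e_min (eV) with spacing step
    vbm   : valence band maximum (eV); cbm : conduction band minimum (eV)
    """
    e_mid = 0.5 * (vbm + cbm)
    start = round((e_mid - e_min) / step)  # grid index closest to mid-gap
    if start < 0 or start + n_points > len(pdos):
        raise ValueError("requested window lies outside the PDOS grid")
    return pdos[start:start + n_points]


def leading_zero_width(values, step, tol=1e-8):
    """Energy width (eV) of the initial run of near-zero intensities."""
    count = 0
    for v in values:
        if abs(v) > tol:
            break
        count += 1
    return count * step


# Toy PDOS on a grid from -5 eV in 0.1 eV steps: flat bands with a 6 eV gap
# between 0 eV (index 50) and 6 eV (index 110). Values are made up.
toy_pdos = [0.0 if 50 < i < 110 else 1.0 for i in range(201)]
window = midgap_aligned_pdos(toy_pdos, e_min=-5.0, step=0.1, vbm=0.0, cbm=6.0, n_points=100)
half_gap = leading_zero_width(window, step=0.1)  # about 3 eV, half of the toy gap
```

In a Gaussian-broadened PDOS the in-gap intensities are small rather than exactly zero, so the tolerance `tol` would need to be chosen accordingly.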
Considering the band gap as described above, we trained the excited states again on the crystalline dataset and then predicted those of crystalline and amorphous materials. The results are summarized in Fig. 3a, where orange and green symbols show the MSEs of the crystalline and amorphous materials with a band gap, respectively, while the gray symbols show those of the amorphous material without a band gap. While the MSEs of the crystalline material with a band gap (orange) show little improvement (details in Supplementary Fig. 2), those of the amorphous material with a band gap were significantly improved (from gray to green). As a result, the MSEs of the amorphous material with a band gap were close to those of the crystalline material with a band gap.
To analyze the above results, we compared two predictions for the same data sample: the spectrum predicted without a band gap, shown in Fig. 3c, and the spectrum predicted with a band gap, shown in Fig. 3e. The prediction in Fig. 3e was more accurate than that in Fig. 3c. In particular, considering the band gap corrected the underestimation of the peak position, namely the shift of the excited-state peaks to lower energy.
To validate the above model that considered the band gap, we compared it to a model that considered the valence band as an input in addition to the band gap. The valence and conduction bands were the PDOS from −20 eV to 15 eV, with 0 eV set to the middle of the band gap. The result of this model is shown by purple symbols in Fig. 3a; it did not show more improvement, indicating that the band gap is the key to predicting the excited state derived from the core-hole.
Finally, we considered applying the ANN model to various other materials. In the above results and discussion, the predictive model could be applied to both crystalline and amorphous materials even though it was trained only on crystalline silicon oxides. Thus, we applied the trained model to various metal oxides, including Li2O (anti-fluorite), MgO (rock salt), Al2O3 (corundum), and GeO2 (α-GeO2), to predict their ELNES/XANES. Note that the local coordination numbers of the metal/oxygen in Li2O, MgO, Al2O3, and GeO2 are 4/8, 6/6, 6/4, and 4/2, respectively, whereas the variety of the local structures of silicon and oxygen in the training data is limited; namely, silicon/oxygen mostly form a 4/2 coordination. Figure 4 shows the predicted and correct O-K edges of Li2O, MgO, Al2O3, and GeO2. The model was trained on crystalline silicon oxides considering both the band gap and valence state. The O-K edges of these oxides were correctly reproduced, even though they had different elements and local structures from the silicon oxides. This successful prediction could provide new insight into the excited state: the impact of the oxygen 1s core-hole on the electronic structure is similar among these oxides. We believe that our model for the ELNES/XANES in this study was "transferable" and feasible for these oxides. However, we cannot say that the present prediction model is a "universal" model that is transferable to any material. Supplementary Fig. 3 shows the predicted O-K edge of SrTiO3. Unlike the oxides in Fig. 4, SrTiO3 is composed of a relatively heavy element (Sr), a 3d transition metal (Ti), and oxygen; that is, the character of its constituent elements is quite different from that of the Si-O system. Because an excitonic simulation using the Bethe-Salpeter equation is necessary to reproduce the O-K edge of SrTiO3 40, our prediction model cannot reproduce its spectral features. This result indicates that the present prediction model is transferable only to similar oxides and cannot be applied to complex materials.
In summary, we used a neural network to predict the excited state for an ELNES/XANES spectrum from the PDOS of the ground state. Our neural network model trained on crystalline silicon oxides accurately reproduced the peaks and intensities of the excited-state ELNES/XANES spectra with only information about the ground-state PDOS of the conduction bands, which means that our method can simulate the excited state at much lower computational cost. However, it initially failed to predict the spectra of amorphous silicon oxides. With detailed analysis, we found that the misprediction arose because the core-hole effects in the amorphous silicon oxides are stronger than those in the crystalline silicon oxides. To correct this underestimation, we introduced the band gap as an input, which let us predict the excited states of the amorphous materials correctly. Furthermore, our model can be used to predict the excited states of other metal oxides with different chemical bonds and local structures.
Our study gives insights into the excited state, such as that the excited state of amorphous silicon oxides is stronger than that of crystalline silicon oxides, and that their difference can be understood by considering the band gap. Furthermore, the excited states of the metal oxides Li2O, MgO, Al2O3, and GeO2 are similar to those of the silicon oxides.
In conclusion, machine learning is an effective way to predict excited states. We predicted the excited state of the core-electron excitation here, and we believe that a similar strategy will work well for other excitations. Excited states are closely related to spectroscopy, chemical reactions, and material function, so machine learning will help deepen our understanding of the mechanisms of the excited state.

Details of artificial neural network
To optimize the learning parameters in the ANN, we used backpropagation based on the Adam scheme 41, minimizing the mean squared errors. ReLU was used as an activation function, and the dropout rates were fixed at 0.5 in the hidden layers not linked to the output layer. Hyperparameters, such as the number of hidden layers, the number of nodes in each hidden layer, and the regularization parameters, were tuned by fivefold cross-validation. We considered up to three hidden layers, each with various numbers of nodes. Also, we tried various values for the weight decay rate of regularization (10⁻⁵, 10⁻⁶, 10⁻⁷, 10⁻⁸, 10⁻⁹, and 0.0), resulting in 108 combinations. The architectures and best parameters are listed in Supplementary Tables 1 and 2.

Preprocessing of PDOS and ELNES/XANES
All the calculated PDOS and core-electron excited spectra were broadened by a Gaussian with a deviation of 0.5 eV. The crystal dataset was randomly shuffled and divided into a working subset and test data at a ratio of 9:1, and the working subset was then randomly divided into training and validation data of equal size. To create a dataset, i.e., the input-output pairs for our ANN described above, we aligned each PDOS and core-electron excitation spectrum so that its threshold lay at 0 eV, and sampled the intensities from −1 to 15 eV in increments of 0.1 eV, resulting in 160 dimensions.
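The preprocessing steps above can be sketched as follows. Several details are assumptions made for this example and are not confirmed by the text: the "deviation" is read as the Gaussian standard deviation, the 160-point grid is taken to run from −1.0 to 14.9 eV (upper endpoint excluded), and the helper names, random seed, and toy line spectrum are hypothetical.

```python
import math
import random

SIGMA = 0.5                             # eV, assumed to be the Gaussian standard deviation
E_MIN, STEP, N_POINTS = -1.0, 0.1, 160  # grid: -1.0 ... 14.9 eV (endpoint assumed excluded)


def energy_grid():
    """Uniform energy grid relative to the threshold at 0 eV."""
    return [E_MIN + STEP * i for i in range(N_POINTS)]


def broaden(lines, grid, sigma=SIGMA):
    """Sum of unit-area Gaussians centered at each (energy, weight) line."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return [
        sum(w * norm * math.exp(-0.5 * ((e - e0) / sigma) ** 2) for e0, w in lines)
        for e in grid
    ]


def split_dataset(samples, seed=0):
    """Shuffle, hold out about 10% as test data, halve the rest into train/validation."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = round(0.1 * len(shuffled))
    test, rest = shuffled[:n_test], shuffled[n_test:]
    half = len(rest) // 2
    return rest[:half], rest[half:], test


grid = energy_grid()
toy_spectrum = broaden([(2.0, 1.0)], grid)           # single made-up line at 2 eV
train, valid, test = split_dataset(range(1171))      # indices standing in for the 1171 spectra
```

For 1171 samples this split yields 527 training, 527 validation, and 117 test samples; the exact counts in the study may differ depending on rounding.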

Calculation of ELNES/XANES spectra
We calculated the PDOS at the ground state and the O-K edge spectrum of each site at the excited state by a first-principles plane-wave basis pseudopotential method with the CASTEP code 42. The pseudopotential method was selected because it is fast and sufficiently accurate 43. We used the GGA-PBE as the exchange-correlation functional with a cut-off energy of 500 eV. To calculate the core-electron excited state, the on-the-fly pseudopotential based on the CASTEP database was applied to the excited oxygen atom in the supercell. To minimize interactions among excited atoms under periodic boundary conditions, we used sufficiently large supercells, larger than 8 Å, in all cases.