MnEdgeNet for accurate decomposition of mixed oxidation states for Mn XAS and EELS L2,3 edges without reference and calibration

Accurate decomposition of mixed Mn oxidation states is highly important for characterizing the electronic structures, charge transfer, and redox centers of electronic, electrocatalytic, and energy storage materials that contain Mn. Electron energy loss spectroscopy (EELS) and soft X-ray absorption spectroscopy (XAS) measurements of the Mn L2,3 edges are widely used for this purpose. Although measuring the Mn L2,3 edges is straightforward provided the sample is prepared properly, accurately decomposing the mixed valence states of Mn remains non-trivial. For both EELS and XAS, 2+, 3+, and 4+ reference spectra need to be taken on the same instrument or beamline, preferably in the same experimental session, because the instrumental resolution and the energy axis offset can vary from one session to another. To circumvent this hurdle, we adopted a deep learning approach and developed a calibration-free and reference-free method to decompose the oxidation states of Mn L2,3 edges for both EELS and XAS. A deep learning regression model is trained to accurately predict the composition of the mixed valence state of Mn. To synthesize physics-informed, ground-truth-labeled training datasets, we created a forward model that takes into account plural scattering, instrumental broadening, noise, and energy axis offset. With that, we created a 1.2 million-spectrum database with 1-by-3 oxidation state composition ground truth vectors. The library includes a sufficient variety of data, covering both EELS and XAS spectra. By training on this large database, our convolutional neural network achieves 85% accuracy on the validation dataset. We tested the model and found it robust against noise (down to a PSNR of 10) and plural scattering (up to t/λ = 1). We further validated the model against spectral data that were not used in training.
In particular, the model shows high accuracy and high sensitivity for the decomposition of Mn3O4, MnO, Mn2O3, and MnO2. The accurate decomposition of Mn3O4 experimental data shows the model is quantitatively correct and can be deployed for real experimental data. Our model will not only be a valuable tool to researchers and material scientists but also can assist experienced electron microscopists and synchrotron scientists in the automated analysis of Mn L edge data.

the references which prevents accurate oxidation state decomposition. In order to avoid the problem, standard reference samples such as MnO, Mn2O3, and MnO2 need to be measured in the same experimental session to avoid any energy offsets as well as changes in instrumental broadening 9,11. Even with this procedure, other factors could prevent proper energy axis calibration; for example, temperature fluctuations would result in an energy shift in the monochromator for XAS experiments. Basically, if the XAS measurements are separated by multiple hours, the spectra taken could have a slight energy offset. In EELS, the energy offset could change more rapidly and is more unpredictable than in XAS. Typically, the energy offset is very sensitive to the DC stray field. For example, the passing of a heavy-duty truck or the movement of a nearby elevator could change the energy offset if the TEM instrument is not fully shielded. This problem is now mitigated with dualEELS instruments, but there are still many single-EELS instruments in active service. Moreover, all historical data were acquired without the dualEELS correction. In addition, the nonlinearity of the parallel EELS spectrometer is nontrivial because it is present not only in the dispersion device, i.e., the magnetic prism, but also in the magnification lenses, a series of quadrupoles. Therefore, it is extremely difficult to calibrate the energy onset of EELS edges unless strict protocols are followed, as described by Tan et al. 11.
Another complication is that the near-edge fine structures in EELS change with sample thickness due to plural scattering. As the sample gets thicker, signals close to the edge onset are multiply scattered to higher energy losses. This results in a shape change of the spectrum 11. For example, for the L2,3 edges of the late 3d transition metals, the L2/L3 ratio increases as the sample gets thicker; this problem has rendered the reference-free L2,3 ratio method inaccurate for EELS 11. In addition, for XAS, the background and the near-edge structures could be different between the Total Electron Yield (TEY) and Partial Fluorescence Yield (PFY) modes. TEY mode measures the total number of emitted electrons resulting from the absorption of X-rays, while PFY measures the fluorescence emitted by the sample as a result of X-ray absorption. That also renders the L2,3 ratio method unreliable. Moreover, for early 3d transition metals, there are no established reference-free methods because of the L2,3 anomaly.
For both EELS and XAS spectroscopy, one interesting observation is that human operators with sufficient training can identify spectral features and assign oxidation states to transition metal L2,3 edges with high confidence. This suggests that deep learning could be successful in solving the L2,3 oxidation state decomposition problem. Pate et al. in 2021 discussed using deep learning to denoise high frame rate spectra 12. Chatzidakis and Botton in 2019 introduced the idea of translation invariance for classifying EELS edges 13. They built a convolutional neural network (CNN) for oxidation state classification and showed that, with translation-invariant training, moving the energy axis does not change the Mn2+, 3+, and 4+ oxidation state classification. This is a very important step in demonstrating that spectral features are like spatial features in images: they can be classified by a CNN regardless of their absolute energy positions in the spectrum. However, there are still problems that remain to be resolved: (1) how to quantitatively decompose mixed oxidation states; (2) whether it is possible to build one model that works for both XAS and EELS, which have drastically different energy resolutions; (3) whether it is possible to build a model that is not affected by plural scattering, i.e., the thickness effect in EELS.
To address the three challenges defined above, in this study, we present a reference-free, calibration-free deep learning approach to accurately determine the oxidation state decomposition of 3d transition metals based on the L2,3 near-edge fine structures. To demonstrate the validity of the method, we use Mn as an example because Mn is technologically important in catalysis, energy storage, and electronic materials. Also, Mn oxides are a good case study because their three different oxidation states lead to notable variations in the fine structures of the Mn L2,3 edge. Determining the composition of the mixed oxidation states is extremely important for understanding the charge transfer phenomena happening at device interfaces. The method we present in this study is not a simple classification of Mn2+, 3+, and 4+ edges but an accurate and quantitative decomposition of the mixed Mn oxidation states. Instead of having a classification/binary type label, we created a three-element ground truth vector that quantitatively describes the composition of Mn2+, 3+, and 4+ in each Mn spectrum, i.e., [%Mn2+, %Mn3+, %Mn4+].
To achieve this goal, we synthesized a spectrum library from 38 experimental spectra (23 EELS and 15 XAS). The library contains 1.2 million spectra, 50% of which are synthesized from XAS data and the other 50% from EELS data. In building the mixed oxidation state library, we paid special attention to normalizing the Mn L2,3 edges correctly and to including experiment-like uncertainties such as Gaussian and Lorentzian type instrumental broadening, energy offset, and detector noise. To include the plural scattering effect in the training library, we developed a forward model to correctly introduce the thickness effect to the L2,3 edges. Using this physics-informed large training library, we show that the deep convolutional regression model we trained is robust against plural scattering and noise. The overall accuracy of the model in determining the mixed valence state reaches 85% on the validation data set. We also validated the model on "unknown unknowns", i.e., Mn3O4 spectra that have never been used for training or validation. The accurate decomposition of Mn3O4 experimental data shows the model is quantitatively correct and can be deployed for real experimental data.

Methods
In this Methods section, we describe (1) how to build a ground-truth oxidation state labeled Mn edge library, (2) how to construct the neural network, and (3) how to train it.
For building the library, the technical challenges lie in (1) how to obtain a wide variety of XAS and EELS Mn2+, Mn3+, and Mn4+ reference spectra; (2) how to normalize or ratio the 2+, 3+, and 4+ spectra correctly; (3) how to include the plural scattering effect (thickness effect) of EELS in the training sets; (4) how to include experimental uncertainties such as instrumental broadening, energy offset, and noise.
Collection of Mn reference spectra. To have a sufficient variety of data that can capture the features of the EELS and XAS Mn 2+, 3+, and 4+ edges, we digitized 23 experimental EELS and 15 XAS Mn spectra, 38 in total, documented in six publications, using WebPlotDigitizer 20. Fig. 1 presents all spectra that were used for making the training library. (The Mn 2.67+ spectrum was not included in the training library.) Table 1 lists the compounds for which we digitized the spectra and their original references. All data are standardized to range from 630.5 eV to 669.4 eV with 0.1 eV increments (338 data points). For missing data, the left side of the spectrum is padded with zeros and the right side is padded with the end value of the spectrum.
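The resampling and padding convention described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `standardize_axis`, the use of linear interpolation, and the toy grid are hypothetical and not taken from the original work.

```python
import numpy as np

def standardize_axis(energy, intensity, grid):
    """Resample a digitized spectrum onto a common energy grid.

    Points below the measured range are padded with zero and points above
    it with the last measured value, following the text. Linear
    interpolation inside the range is an assumption.
    """
    energy = np.asarray(energy, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    order = np.argsort(energy)  # np.interp needs increasing x values
    energy, intensity = energy[order], intensity[order]
    return np.interp(grid, energy, intensity, left=0.0, right=intensity[-1])

# Toy example on a coarse 1 eV grid (hypothetical numbers).
grid = np.array([636.0, 637.0, 638.0, 639.0, 640.0])
resampled = standardize_axis([637.5, 638.5, 639.5], [1.0, 3.0, 2.0], grid)
```

In practice, `grid` would be the common 0.1 eV axis used for the whole library.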
Normalization of 2+, 3+, 4+ reference spectra. In order to quantitatively combine the 2+, 3+, and 4+ Mn spectra, they need to be normalized to the correct ratio. To achieve that, we normalize the Mn L3 edge according to the d-hole number. Elemental Mn has an electron configuration of [Ar] 3d5 4s2. Therefore, Mn2+, Mn3+, and Mn4+ have electron configurations of [Ar] 3d5, [Ar] 3d4, and [Ar] 3d3, respectively. Because the d shell can hold 10 electrons, the numbers of holes for Mn2+, Mn3+, and Mn4+ are 5, 6, and 7, respectively. Therefore, the area under the L3 peak and above the continuous background should be proportional to the d-hole number. The continuous background under the L2,3 edge can be modeled by two step functions whose heights follow the 2:1 population ratio of the filled 2p3/2 and 2p1/2 orbitals (four and two electrons, respectively). The d-hole area can be calculated after the background is subtracted from the spectrum (Fig. 2). With this procedure to find the d-hole area, we can correctly ratio the 2+, 3+, and 4+ spectra.
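A minimal sketch of the d-hole normalization is given below. Several choices are assumptions for illustration only: the smoothed (tanh) step shape, the step width, taking the last data point as the post-edge background height, the edge positions, and the L3 integration window. All function names are hypothetical.

```python
import numpy as np

D_HOLES = {2: 5, 3: 6, 4: 7}  # 3d holes of Mn2+, Mn3+, Mn4+ (from the text)

def step_background(energy, e_l3, e_l2, height, width=0.5):
    """Two smoothed steps at the L3 and L2 onsets with 2:1 heights.

    The tanh shape and the width are assumptions, not the authors' choice.
    """
    def step(e0):
        return 0.5 * (1.0 + np.tanh((energy - e0) / width))
    return height * (2.0 / 3.0 * step(e_l3) + 1.0 / 3.0 * step(e_l2))

def l3_area(energy, spectrum, e_l3, e_l2, window):
    """Area of the L3 peak above the double-step background."""
    # Assumption: the last point of the spectrum is pure continuum.
    peak = spectrum - step_background(energy, e_l3, e_l2, spectrum[-1])
    mask = (energy >= window[0]) & (energy <= window[1])
    return float(peak[mask].sum() * (energy[1] - energy[0]))

def normalize_to_d_holes(energy, spectrum, valence, e_l3, e_l2, window):
    """Scale the spectrum so that its L3 area equals the d-hole number."""
    area = l3_area(energy, spectrum, e_l3, e_l2, window)
    return spectrum * (D_HOLES[valence] / area)

# Toy Mn4+-like spectrum: two Gaussian white lines on a step background.
energy = 630.0 + 0.1 * np.arange(400)
toy = (3.0 * np.exp(-(energy - 642.0) ** 2 / 2.0)
       + 1.5 * np.exp(-(energy - 653.0) ** 2 / 2.0)
       + step_background(energy, 642.0, 653.0, 0.6))
scaled = normalize_to_d_holes(energy, toy, 4, 642.0, 653.0, (638.0, 648.0))
```

After this scaling, spectra of different valence carry L3 areas in the ratio 5:6:7, so they can be combined with physically meaningful weights.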
Ground truth labeled library. After the d-hole ratio normalization, we can correctly combine the Mn 2+, 3+, and 4+ component spectra to form a new spectrum with a known ground truth oxidation state composition and average oxidation state. In making the ground truth labeled spectra, we only combined Mn spectral components that were digitized from the same publication source, because the spectra to be combined should share the same instrumental resolution.
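As an illustration of how a labeled mixed spectrum could be formed, the sketch below assumes the combination is a composition-weighted sum of the d-hole-normalized components and that the average oxidation state is the composition-weighted mean of 2, 3, and 4. The exact expression used by the authors is not reproduced in this text, and the function name `mix_spectra` is hypothetical.

```python
import numpy as np

VALENCES = np.array([2.0, 3.0, 4.0])

def mix_spectra(components, composition):
    """Combine normalized 2+, 3+, 4+ spectra into one labeled spectrum.

    components : array of shape (3, n_points), rows ordered 2+, 3+, 4+
    composition: three non-negative weights, renormalized to sum to 1
    Returns the mixed spectrum, the ground truth vector, and the
    average oxidation state (assumed to be the weighted mean valence).
    """
    composition = np.asarray(composition, dtype=float)
    composition = composition / composition.sum()
    spectrum = composition @ np.asarray(components, dtype=float)
    return spectrum, composition, float(composition @ VALENCES)

# Toy components and a Mn3O4-like label (Mn2+ : Mn3+ = 1 : 2).
components = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0]])
mixed, label, avg_state = mix_spectra(components, [1.0, 2.0, 0.0])
```

With the 1:2 composition, the average oxidation state evaluates to 8/3, i.e., about 2.67.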
The composition of the training library is detailed in Table 2. A total of 1,200,000 synthetic spectra are included in the library.
For XAS, similar short-range and long-range broadening happens due to the monochromator. Therefore, we introduce a two-parameter instrument broadening kernel, also known as the point spread function, PSF(E), as follows:

$$\mathrm{PSF}(E) = G(E) \otimes L(E)$$

where ⊗ stands for convolution and

$$G(E) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{E^{2}}{2\sigma^{2}}\right), \qquad L(E) = \frac{1}{\pi}\,\frac{w/2}{E^{2} + (w/2)^{2}}.$$

Basically, the instrumental point spread function is a convolution of a Gaussian function with a Lorentzian function. The full width at half maximum (FWHM) of the Lorentzian function is w and the FWHM of the Gaussian function is $2\sqrt{2\ln 2}\,\sigma$. The combined FWHM is approximately $\sqrt{\mathrm{FWHM}_{\mathrm{Lorentzian}}^{2} + \mathrm{FWHM}_{\mathrm{Gaussian}}^{2}}$. It is worth noting that the inclusion of the Lorentzian tails in the point spread kernel is very important for making the synthesized spectra resemble the experimental ones. An example of such a broadening effect on a Mn2+ L2,3 edge is shown in Fig. 3.
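The Gaussian-Lorentzian point spread function described above can be sketched numerically as follows. The kernel half-width, the edge-replicating padding, and the function names are assumptions made for illustration, not details from the original work.

```python
import numpy as np

def voigt_psf(de, fwhm_gauss, fwhm_lorentz, half_width=20.0):
    """Discrete PSF: a Gaussian convolved with a Lorentzian, unit sum.

    de is the energy step in eV; half_width (eV) truncates the Lorentzian
    tails and is a hypothetical choice.
    """
    n = int(round(half_width / de))
    e = de * np.arange(-n, n + 1)  # symmetric axis, odd length
    sigma = fwhm_gauss / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    gauss = np.exp(-e**2 / (2.0 * sigma**2))
    lorentz = (fwhm_lorentz / 2.0) / (e**2 + (fwhm_lorentz / 2.0) ** 2)
    psf = np.convolve(gauss / gauss.sum(), lorentz / lorentz.sum(), mode="same")
    return psf / psf.sum()

def broaden(spectrum, psf):
    """Convolve a spectrum with the PSF, replicating the edge values."""
    pad = (len(psf) - 1) // 2
    padded = np.pad(spectrum, pad, mode="edge")
    return np.convolve(padded, psf, mode="valid")

psf = voigt_psf(de=0.1, fwhm_gauss=0.5, fwhm_lorentz=0.3)
flat = broaden(np.ones(338), psf)
```

Edge replication keeps the post-edge continuum from dropping artificially at the end of the energy window; zero padding would be another reasonable choice.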

Plural scattering in EELS.
If the single scattering probability function is P(E), plural scattering as a function of thickness, t, in EELS can be described by the following differential equation

$$\frac{dS(E, t)}{dt} = \int S(E', t)\, P(E - E')\, dE'$$

and the boundary condition is S(E, t = 0) = δ(E) in the ideally monochromatic condition. In the practical situation where the incoming electron beam has an energy spread, we can use the point spread function given in the last section as the initial energy profile, i.e.,

$$S(E, t = 0) = \mathrm{PSF}(E).$$
Once we obtain a numerical representation of P(E), the spectral function S(E, t) at any given thickness t can be calculated numerically.
This equation allows us to calculate the low-loss spectral function numerically. Once we obtain the low-loss spectral function, the core-loss spectrum is a convolution of the core-loss single scattering probability distribution, P_core-loss(E), with the low-loss spectral function, i.e., S(E, t).
Figure 4 shows the change of the low-loss function as a function of normalized thickness (t/λ, where λ is the inelastic mean free path) and how the Mn L2,3 edge evolves.
In this modeling, we use an average plasmon loss energy of 25 eV and approximate P(E) by an asymmetric function whose left side is a Gaussian function and whose right side is a Lorentzian loss function. To be more exact, we also modeled the Mn M edge and superimposed it onto the plasmon loss.
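A numerical sketch of the plural scattering forward model is given below. It makes several assumptions that should be read as such: P(E) is normalized to unit probability per inelastic mean free path, a loss term is included so that total intensity is conserved (which gives the standard Poisson solution in Fourier space instead of a step-by-step integration), the plasmon widths are hypothetical, and the Mn M edge is omitted.

```python
import numpy as np

def plasmon_profile(loss, e_p=25.0, sigma_left=2.0, gamma_right=4.0):
    """Asymmetric P(E): Gaussian below e_p, Lorentzian above (widths assumed)."""
    left = np.exp(-(loss - e_p) ** 2 / (2.0 * sigma_left**2))
    right = gamma_right**2 / ((loss - e_p) ** 2 + gamma_right**2)
    p = np.where(loss < e_p, left, right)
    return p / p.sum()  # unit scattering probability per mean free path

def low_loss(s0, p_single, t_over_lambda):
    """Low-loss function S(E, t) from the initial profile s0 = S(E, 0).

    Uses S_hat(t) = S_hat(0) * exp((t / lambda) * (P_hat - 1)), assuming
    intensity conservation. p_single must start at zero energy loss.
    """
    n = len(s0)
    nfft = 2 * n  # zero padding limits wrap-around of the circular convolution
    s_hat = np.fft.rfft(s0, nfft) * np.exp(
        t_over_lambda * (np.fft.rfft(p_single, nfft) - 1.0))
    return np.fft.irfft(s_hat, nfft)[:n]

def thicken_core_loss(core_single, low_loss_spectrum, zlp_index):
    """Convolve the single-scattering core-loss edge with the low-loss function."""
    full = np.convolve(core_single, low_loss_spectrum, mode="full")
    return full[zlp_index:zlp_index + len(core_single)]

de = 0.1
loss = de * np.arange(1001)             # 0 to 100 eV energy-loss axis
p_single = plasmon_profile(loss)
zlp = np.zeros_like(loss)
zlp[50] = 1.0                           # idealized zero-loss peak
thick = low_loss(zlp, p_single, 1.0)    # t / lambda = 1
```

Under these assumptions, the unscattered fraction at t/λ = 1 is exp(−1), and the remaining intensity is redistributed to one, two, and more plasmon losses. In a fuller sketch, `zlp` would be replaced by the instrumental point spread function.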
Other augmentations: energy shift and noise. Both EELS and XAS are subject to an inaccurate energy axis. To take this into account, we apply a random shift augmentation of the energy axis to the ground truth labeled spectra. With this augmentation, the model becomes translation invariant: it is sensitive only to the spectral shape and insensitive to the absolute energy onset of the L2,3 edge.
For noise, we modeled the noise as white noise (Gaussian noise) combined with salt-and-pepper noise (impulse noise). Both noises are additive to the spectrum. We use the linear definition of PSNR as:
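The two augmentations can be sketched as follows. The PSNR definition used here (peak signal divided by the standard deviation of the Gaussian noise) is an assumption, since the defining expression is not reproduced in this text. The impulse amplitude, the edge-replicating shift, and the function names are likewise hypothetical.

```python
import numpy as np

def random_shift(spectrum, max_shift, rng):
    """Shift the spectrum by a random number of channels, replicating edges."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    idx = np.clip(np.arange(len(spectrum)) - shift, 0, len(spectrum) - 1)
    return spectrum[idx], shift

def add_noise(spectrum, psnr, impulse_prob, rng):
    """Add Gaussian noise plus sparse salt-and-pepper impulses.

    Assumption: linear PSNR = peak signal / Gaussian noise sigma.
    """
    peak = float(spectrum.max())
    noisy = spectrum + rng.normal(0.0, peak / psnr, size=spectrum.shape)
    hits = rng.random(spectrum.shape) < impulse_prob
    # Impulses of +/- peak height are a hypothetical amplitude choice.
    noisy[hits] += rng.choice([-peak, peak], size=int(hits.sum()))
    return noisy

rng = np.random.default_rng(0)
ramp = np.arange(10.0)
shifted, shift = random_shift(ramp, max_shift=3, rng=rng)
noisy = add_noise(np.full(20000, 10.0), psnr=10.0, impulse_prob=0.0, rng=rng)
```

Because the label vector is unchanged by either operation, each augmented spectrum keeps the ground truth of the spectrum it was derived from.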

Summary of augmentation
In Table 3, we summarize the augmentation operations done to the ground truth labeled library.
Network structure. How our brains process or identify a spectral feature is very similar to how they recognize spatial features in an image. Inspired by this, we adopted the convolutional layers that are used in image classification for feature extraction. Then we connected the features to fully connected layers (also known as dense layers) for composition regression. The input is the one-dimensional spectrum, and the output is a 3-element composition vector (Fig. 5). We call this network a convolutional regression net (CRN). Unlike a classification network, a regression network outputs continuous numbers rather than binary numbers. Therefore, we used the mean square error as the loss function.

Figure 4. The modeling of plural scattering for Mn-containing compounds and its effect on the spectral shape of the Mn L2,3 edges.
Table 3. A summary of the augmentation operations and occurrences.

Type of augmentation | Probability of application | Parameters
Instrumentation broadening | 80% | Gaussian: FWHM uniformly distributed between 0.01 and 1.5 eV; Lorentzian: FWHM uniformly distributed between 0.1 and 0.

For feature extraction, we use three convolutional layers, each followed by leaky ReLU and max pooling. The final layer outputs 41 × 128 = 5248 filtered features. In the regression layers, we used three fully connected layers with 2048, 512, and then 3 neurons, with leaky ReLU in between. The final output is a softmax normalization of the final 3-neuron layer to ensure that the sum of the composition vector is equal to 1.
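To make the regression head concrete, the sketch below implements only the fully connected part (5248 → 2048 → 512 → 3) with a softmax output and a mean square error loss in plain NumPy. The convolutional feature extractor is omitted because its kernel sizes are not given here. The random weights, the leaky ReLU slope, and the function names are placeholders, not trained or published values.

```python
import numpy as np

def leaky_relu(x, slope=0.01):
    """Leaky ReLU; the negative slope of 0.01 is an assumed default."""
    return np.where(x > 0, x, slope * x)

def softmax(z):
    """Row-wise softmax so each composition vector sums to 1."""
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def regression_head(features, weights, biases):
    """Dense layers with leaky ReLU in between and softmax at the end."""
    h = features
    for w, b in zip(weights[:-1], biases[:-1]):
        h = leaky_relu(h @ w + b)
    return softmax(h @ weights[-1] + biases[-1])

def mse_loss(pred, target):
    """Mean square error between predicted and true compositions."""
    return float(np.mean((pred - target) ** 2))

rng = np.random.default_rng(0)
sizes = [5248, 2048, 512, 3]  # layer widths quoted in the text
weights = [0.01 * rng.standard_normal((a, b), dtype=np.float32)
           for a, b in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(b, dtype=np.float32) for b in sizes[1:]]
features = rng.standard_normal((4, 5248), dtype=np.float32)  # batch of 4
pred = regression_head(features, weights, biases)
```

The softmax guarantees non-negative outputs that sum to 1, which is what lets the three numbers be read directly as a composition.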

Training.
In Table 4, we summarize the technical information of the training process. All spectra are standardized by subtracting the mean and dividing by the standard deviation before entering the network. Dropout is added to each layer before max pooling, with a dropout rate of 0.1. Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions based on adaptive estimates of lower-order moments, was used for learning. The learning rate is set to 8E-5 and the batch size is 32. As shown in Fig. 6, the model converges quickly; therefore, only 4 epochs of training were used to avoid overfitting.
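The input standardization and a single Adam update can be sketched as follows. Only the learning rate of 8E-5 comes from the text. Standardizing each spectrum individually and the Adam defaults β1 = 0.9, β2 = 0.999, ε = 1e-8 are assumptions, and the function names are hypothetical.

```python
import numpy as np

def standardize(spectrum):
    """Zero-mean, unit-variance scaling (assumed to be applied per spectrum)."""
    return (spectrum - spectrum.mean()) / spectrum.std()

def adam_step(param, grad, state, lr=8e-5, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; state holds the step count and moment estimates."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad**2
    m_hat = state["m"] / (1 - beta1 ** state["t"])  # bias-corrected mean
    v_hat = state["v"] / (1 - beta2 ** state["t"])  # bias-corrected variance
    return param - lr * m_hat / (np.sqrt(v_hat) + eps)

x = standardize(np.array([1.0, 2.0, 3.0, 6.0]))
param = np.array([1.0])
state = {"t": 0, "m": np.zeros(1), "v": np.zeros(1)}
new_param = adam_step(param, np.array([2.0]), state)
```

On the first step the bias-corrected update has a magnitude close to the learning rate regardless of the gradient scale, which is one reason a small rate such as 8E-5 gives gentle, stable updates.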

Validation.
We performed an 80/20 split of the ground truth labeled library into 80% training and 20% validation data. The accuracy of the model is evaluated on the validation set. A prediction is counted as accurate if the predicted oxidation state falls within ±0.1 of the ground truth oxidation state.
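The split and the accuracy criterion can be sketched as follows, assuming the oxidation state of a spectrum is the composition-weighted mean of 2, 3, and 4 and that the split is a random shuffle. Function names are hypothetical.

```python
import numpy as np

VALENCES = np.array([2.0, 3.0, 4.0])

def train_val_split(n_samples, val_fraction, rng):
    """Randomly split sample indices into training and validation sets."""
    order = rng.permutation(n_samples)
    n_val = int(round(n_samples * val_fraction))
    return order[n_val:], order[:n_val]

def accuracy(pred_comp, true_comp, tol=0.1):
    """Fraction of spectra whose predicted oxidation state is within tol."""
    pred_state = np.asarray(pred_comp) @ VALENCES
    true_state = np.asarray(true_comp) @ VALENCES
    return float(np.mean(np.abs(pred_state - true_state) <= tol))

rng = np.random.default_rng(0)
train_idx, val_idx = train_val_split(10, 0.2, rng)

pred = np.array([[0.30, 0.70, 0.00], [0.0, 0.0, 1.0]])
true = np.array([[1 / 3, 2 / 3, 0.0], [0.0, 0.5, 0.5]])
acc = accuracy(pred, true)  # first prediction is within 0.1, second is not
```

Note that this criterion scores the average oxidation state, so two different compositions with the same weighted mean would both count as accurate.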
We also evaluated our model on reference data and testing data. Reference data are the data we digitized from the literature and used to build the library. The testing data are new experimental and literature data that were never used for the construction of the training data.

and MnO2, which means our model can be deployed and trusted in real experiments. This work showed that it is possible to accurately decompose the mixed valence states of Mn L2,3 edge spectra for both EELS and XAS without reference and calibration using deep learning algorithms. In the future, the method described in this work can also be generalized to other transition metals such as Fe, because of their chemical similarity to Mn. This work provides a new angle for studying the fine structure of L2,3 edges and for the development of AI-driven autonomous TEM.

Figure 1. The presentation of the EELS and XAS Mn L2,3 edges included in making the training library. The Mn 2.67+ spectrum presented is not included in the training library.

Figure 2. Schematics showing how to extract the d-hole area under the L3 edge.

Figure 5. The structure of the convolutional regression net for mixed oxidation state decomposition.

Figure 7. The scatter plot of the model's predicted average oxidation state versus the ground truth.

Figure 9. The Mn3O4 spectra from testing data as a function of noise.

Table 1. The compound information and references of the Mn L2,3 edges.

Table 2. The composition of the ground truth labeled library.

Table 4. Technical information for training.
Figure 6. The MSE loss as a function of epochs processed.

Table 5. CRN's decomposition performance on validation spectra.

Predicted [2+, 3+, 4+] decomposition (%) Predicted oxidation state Ground truth
Testing on Mn3O4. One of the compounds for which we have experimental data, but which was not used for training, is Mn3O4. It has a mixed oxidation state of Mn2+ and Mn3+ with a theoretical ratio of 1:2, which gives an average oxidation state of +2.67. Figure 8 shows the predicted oxidation state as a function of thickness, and the oxidation state decomposition is shown in Table 8. The model predicts the correct ratio between 2+/3+ with a

Table 6. Testing of CRN's decomposition robustness against plural scattering on validation data.

Table 9. Testing of CRN's decomposition robustness against noise on testing data (Mn3O4).