Introduction

Resonant inelastic x-ray scattering (RIXS) is a powerful spectroscopic tool that has revolutionized the field of materials science. It provides detailed information on the electronic, magnetic, and orbital structures of materials with high resolution, thereby offering new opportunities for the exploration of materials and their dynamic processes including electron correlations, charge excitations, spin excitations, and phonon density of states1,2. With the advent of x-ray free-electron laser (XFEL) light sources3, RIXS is also becoming an increasingly appealing technique for the exploration of extreme states of matter4, including materials undergoing laser-driven dynamic compression to achieve high pressures5. While x-ray diffraction techniques bear witness to a rich collection of exotic physical behaviour exhibited in materials dynamically compressed to high densities6, the detailed electronic structure and excitation behaviour of such systems remain largely unexplored.

The transient nature of experimentally-realized high energy density (HED) systems necessitates the use of bright x-ray sources with short pulse durations, making XFELs particularly attractive probes. Time-resolved RIXS is appealing in this context as it can provide insight into the time evolution of the electronic structure of such systems, and access a wide range of excitation mechanisms. Traditionally, however, RIXS requires a highly monochromatic x-ray source to yield spectra with satisfactory resolution, which is at odds with the current 0.4% bandwidth of a self-amplified spontaneous emission (SASE) XFEL7. SASE bandwidths in the x-ray regime are typically on the order of 20–30 eV, an order of magnitude too large to resolve even basic features in the density of states. Self-seeding techniques can improve the spectral purity of the beam, but the presence of a SASE pedestal remains problematic for high-resolution RIXS measurements8. For these reasons, monochromators are typically required, but they come with their own challenges. For one, the signal-to-noise ratio (SNR) is drastically reduced, as most incoming photons are discarded. In addition, the stochastic nature of the FEL generation process also leads to substantial fluctuations in the thermodynamic conditions of the sample, which complicates the data analysis. Importantly, in the study of transient HED systems such photon losses cannot always be recovered by longer integration times or by higher repetition rates, as the measurement is destructive, and a new system must be created for each new shot. In order to integrate multiple shots to improve overall SNR in such experiments, each shot must contain sufficient signal to be validated independently. This requirement places a fundamental constraint on the signal levels required to field a reliable RIXS diagnostic.

In this work, we present experimental results demonstrating how the electronic structure can be extracted from a RIXS measurement via a deconvolution approach that makes use of the full information contained in the stochastic SASE pulse structure of an XFEL. This idea of correlating the spectroscopic measurements with the photon source spectra has already been used in previous studies9,10. The resolution achieved with our approach is limited only by the resolution of the spectrometer measuring the emitted RIXS signature, and by the SNR, but not by the overall bandwidth or structure of the probe beam itself. Using a SASE pulse with a bandwidth of ~19 eV, we show that we can extract the density of states of Fe and Fe2O3 with resolutions of 6–9 eV. This is sufficient to observe pre-edge features in Fe2O3, to demonstrate material specificity, and to extract the temperature of the system heated by the x-ray pulse. Our results illustrate how this correlation approach can robustly deconvolve the polychromatic signal, making the SASE spectrum of an XFEL an exploitable feature rather than an inconvenience requiring mitigation by a monochromator. Importantly, this implies that developments in XFEL technology that increase the energy of the XFEL pulse, rather than its spectral brightness at the expense of photon number, can provide a promising alternative for accessing improved, higher resolution spectroscopic data.

Results

The RIXS cross-section for scattering into a solid angle dΩ can be written as11,12:

$$\frac{{{{{\rm{d}}}}}^{2}\sigma }{{{{\rm{d}}}}\Omega {{{\rm{d}}}}{\omega }_{2}}\propto \sum\limits_{f}{\left| \sum\limits_{n}\frac{\langle f| {{{{\mathcal{D}}}}}^{{\prime} {{\dagger}} }| n\rangle \langle n| {{{\mathcal{D}}}}| i\rangle }{{E}_{n}-\hslash {\omega }_{1}-{E}_{i}+i{\Gamma }_{n}}\right| }^{2}\\ \times \delta ({E}_{f}-{E}_{i}+\hslash {\omega }_{2}-\hslash {\omega }_{1}),$$
(1)

where Ei, En and Ef are the energies of the initial, intermediate and final states of the system, \({{{\mathcal{D}}}}\) is the transition operator11, and Γn is the lifetime of the intermediate state. The incident and outgoing photons have energies ω1 and ω2 respectively. The RIXS process is shown schematically in Fig. 1. Following the work of Humphries et al.4, we focus on the exploration of the broad structure of unoccupied valence states, rather than low-energy excitations, thus seeking to extract the density of states from the RIXS measurement. In this case, Eq. (1) can be simplified and written explicitly in terms of the density of states ρ, the matrix element for transitions between initial and intermediate states M, and the Kα line intensity Af as

$$\frac{{{{{\rm{d}}}}}^{2}\sigma }{{{{\rm{d}}}}\Omega {{{\rm{d}}}}{\omega }_{2}} \, = \, \hslash {\left(\frac{e}{mc}\right)}^{4}\frac{{\omega }_{2}}{{\omega }_{1}}\sum\limits_{f}{A}_{f}[1-{f}_{{{{\rm{FD}}}}}(\hslash {\omega }_{1}-\hslash {\omega }_{2}+{\epsilon }_{L,f};T)]\\ \times \rho (\hslash {\omega }_{1}-\hslash {\omega }_{2}+{\epsilon }_{L,f})\frac{{\left\vert M(\hslash {\omega }_{1}-\hslash {\omega }_{2}+{\epsilon }_{L,f})\right\vert }^{2}}{{(\hslash {\omega }_{2}-({\epsilon }_{L,f}-{\epsilon }_{K}))}^{2}+{\Gamma }_{f}^{2}},$$
(2)

where fFD denotes the Fermi-Dirac occupation function, precluding transitions to occupied states. The energy ϵL,f denotes the binding energy of the (L-shell) electron that decays to fill the (K-shell) core hole, whose binding energy is denoted by ϵK. The derivation of this result can be found in the supplementary materials of ref. 4. We will use this expression as a starting point to interpret the experimental results.

Fig. 1: Schematic of the RIXS process for our cases of interest.

This is composed of an energy non-conserving absorption, which excites a K-shell electron (left), followed by an energy non-conserving emission produced by the decay of an L-shell electron into the K-shell core hole (right). The overall process conserves energy. The accessible states are the vacant states in the valence band or, for finite temperature systems, in thermally ionized bound states.

RIXS as dynamic kernel deconvolution

The intensity of the RIXS spectrum as a function of the scattered photon energy ω2 can be found by integrating the differential cross-section over all incident photon energies ω1, and over the solid angle detected for each outgoing energy Ω(ω2):

$$I({\omega }_{2})=\Omega ({\omega }_{2})\int_{\!\!\!-\infty }^{+\infty }\hslash d{\omega }_{1}\,\Phi ({\omega }_{1}){\partial }_{{\omega }_{2}}\sigma ,$$
(3)

with \({\partial }_{{\omega }_{2}}\sigma\) given by Eq. (2), Φ(ω1) representing the incoming SASE spectrum and Ω(ω2) calculable from the geometry of the experimental setup. The expression above takes the form of a sum of convolutions with a dynamic kernel Φ(ω1). The kernel is dynamic because it represents the SASE pulse, which is formed of a series of narrow spikes in photon energy that change stochastically from shot to shot. If we assume a typical dataset will contain N single shots indexed by k, with associated RIXS spectra Ik(ω2), each will be given by Eq. (3) using the corresponding XFEL spectra Φk(ω1).

The cross-section contains information on the vacant part of the DOS via the sum in Eq. (2), including a modulation due to the energy-dependent transition matrix elements M(ε). We denote this experimentally accessible quantity as the effective density of states, ρeff, given by

$${\rho }_{{{{\rm{eff}}}}}(\varepsilon )=[1-{f}_{FD}(\varepsilon ;T)]\rho (\varepsilon )| M(\varepsilon ){| }^{2}.$$
(4)
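As a rough numerical illustration of Eq. (4), the sketch below evaluates ρeff on an energy grid. The function names, the chemical potential `mu`, the flat toy DOS and the constant matrix element are assumptions made purely for illustration; they are not taken from the analysis presented here.

```python
import math

def fermi_dirac(eps, temperature, mu=0.0):
    """Fermi-Dirac occupation f_FD(eps; T); energies and T in eV (k_B = 1)."""
    x = (eps - mu) / temperature
    # Guard against overflow of exp() far from the chemical potential.
    if x > 700.0:
        return 0.0
    if x < -700.0:
        return 1.0
    return 1.0 / (math.exp(x) + 1.0)

def effective_dos(eps_grid, rho, m_squared, temperature, mu=0.0):
    """rho_eff(eps) = [1 - f_FD(eps; T)] * rho(eps) * |M(eps)|^2, as in Eq. (4)."""
    return [
        (1.0 - fermi_dirac(e, temperature, mu)) * r * m2
        for e, r, m2 in zip(eps_grid, rho, m_squared)
    ]

# Toy inputs (hypothetical): flat DOS and constant matrix element.
eps_grid = [-60.0, -10.0, 0.0, 10.0]
rho = [1.0] * len(eps_grid)
m_squared = [1.0] * len(eps_grid)
rho_eff = effective_dos(eps_grid, rho, m_squared, temperature=6.0)
```

With these toy inputs, the deeply bound state at −60 eV is almost fully occupied and therefore contributes only a very small vacant fraction, while states above the chemical potential are mostly accessible.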

We note that this quantity is effectively the x-ray absorption spectrum of the material, but without the disadvantage of any blurring due to the x-ray source bandwidth13, and with the advantage of having been obtained in just a single shot, therefore avoiding the risk of being deformed by thermodynamic fluctuations in the sample due to edge-crossing. With the notation of Eq. (4) we can describe the RIXS measurement formally as an operator PRIXS that links the measured spectra Ik to the effective DOS and the spectral FEL kernel Φk:

$${I}_{k}({\omega }_{2})={P}_{{{{\rm{RIXS}}}}}[{\Phi }_{k},{\rho }_{{{{\rm{eff}}}}}]({\omega }_{2}).$$
(5)

The effective DOS can be found by inverting PRIXS with respect to ρeff:

$${\rho }_{{{{\rm{eff}}}}}(\varepsilon )={P}_{{{{\rm{RIXS}}}}}^{-1}[{I}_{k},{\Phi }_{k}](\varepsilon ),$$
(6)

which is valid for each k.
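To make the structure of the forward operator concrete, the following sketch discretizes Eqs. (2) and (3) for a single emission line. It is a simplified illustration under stated assumptions: the prefactors, the ω2/ω1 factor and the solid angle Ω(ω2) are dropped, a nearest-grid-point lookup replaces interpolation, and the numerical values (approximate Fe K and L3 energies, a hypothetical linewidth) are placeholders rather than the parameters used in this work.

```python
def rixs_forward(phi, omega1, omega2, rho_eff, eps_grid, eps_l, eps_k, gamma):
    """Single-line, prefactor-free discretisation of Eqs. (2)-(3).

    phi      : incident spectrum sampled on the uniform grid omega1 (eV)
    rho_eff  : effective DOS sampled on the uniform grid eps_grid (eV)
    eps_l/k  : L- and K-shell orbital energies (negative numbers, eV)
    gamma    : Lorentzian width of the emission line (eV)
    """
    d_omega1 = omega1[1] - omega1[0]
    d_eps = eps_grid[1] - eps_grid[0]
    line_energy = eps_l - eps_k              # centre of the emission line
    spectrum = []
    for w2 in omega2:
        lorentz = 1.0 / ((w2 - line_energy) ** 2 + gamma ** 2)
        total = 0.0
        for w1, p in zip(omega1, phi):
            eps = w1 - w2 + eps_l            # energy of the probed vacant state
            idx = round((eps - eps_grid[0]) / d_eps)   # nearest grid point
            if 0 <= idx < len(rho_eff):
                total += p * rho_eff[idx]
        spectrum.append(lorentz * total * d_omega1)
    return spectrum

# Illustrative numbers only: approximate Fe K and L3 energies, assumed linewidth.
EPS_K, EPS_L, GAMMA = -7112.0, -707.0, 1.5
omega1 = [7050.0 + i for i in range(21)]
omega2 = [6380.0 + j for j in range(41)]
eps_grid = [-80.0 + i for i in range(81)]
phi = [1.0 if i % 4 == 0 else 0.1 for i in range(21)]   # crude spiky spectrum

step = [1.0 if e >= -40.0 else 0.0 for e in eps_grid]
peak = [1.0 if e == -55.0 else 0.0 for e in eps_grid]
mixed = [2.0 * s + 3.0 * p for s, p in zip(step, peak)]

i_step = rixs_forward(phi, omega1, omega2, step, eps_grid, EPS_L, EPS_K, GAMMA)
i_peak = rixs_forward(phi, omega1, omega2, peak, eps_grid, EPS_L, EPS_K, GAMMA)
i_mixed = rixs_forward(phi, omega1, omega2, mixed, eps_grid, EPS_L, EPS_K, GAMMA)
```

The last three calls illustrate that, in this discretized form, the operator is linear in ρeff: the spectrum of `mixed` equals the same weighted sum of the spectra of `step` and `peak`.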

The RIXS cross-section is relatively small, and single-shot measurements typically have low SNR. This makes the use of standard inversion methods via deconvolution, such as the Richardson-Lucy method14, unsuitable for single-shot analysis. Integrating over many shots (\({I}_{k}\to \bar{I}\) and \({\Phi }_{k}\to \bar{\Phi }\)) can mitigate such limitations in SNR, but it also limits the resolution with which we can extract ρeff. Alternatively, a machine learning approach could be used to construct an estimator to approximate \({P}_{{{{\rm{RIXS}}}}}^{-1}\) from a large labelled dataset of known pairings (ρeff, Ik, Φk)15. However, given the stochastic nature of the FEL pulse profile and the complexity of the RIXS operator, collecting and validating a sufficiently large dataset of this kind can be a considerable challenge in its own right, in addition to the high complexity required for such an inversion estimator. The lack of a robust approach to process low SNR data represents a considerable bottleneck for x-ray spectroscopy in high energy density physics applications10.

Rather than searching for a general estimator to approximate the highly complex object \({P}_{{{{\rm{RIXS}}}}}^{-1}\), we instead look for a suitable approximation \({\tilde{\rho }}_{{{{\rm{eff}}}}}\) to the much simpler function ρeff. The adopted procedure is illustrated in Fig. 2. We use the known forward model PRIXS to calculate the predicted spectral intensity given a known Φk and the estimate \({\tilde{\rho }}_{{{{\rm{eff}}}}}\):

$${\tilde{I}}_{k}({\omega }_{2})={P}_{{{{\rm{RIXS}}}}}[{\Phi }_{k},{\tilde{\rho }}_{{{{\rm{eff}}}}}]({\omega }_{2}).$$
(7)

Considering that, for our cases with low photon rates, the uncertainties of the single shots are approximately constant over ω2, the spectral intensity \({\tilde{I}}_{k}\) is then used to compute an L2 loss function

$${{{{\mathcal{L}}}}}_{k}=| | {\tilde{I}}_{k}({\omega }_{2})-{I}_{k}({\omega }_{2})| {| }_{2},$$
(8)

which provides a measure of the quality of the approximate \({\tilde{\rho }}_{{{{\rm{eff}}}}}\) given the observed Ik(ω2). Improving the approximation for \({\tilde{\rho }}_{{{{\rm{eff}}}}}\) can now be viewed as a standard machine learning optimization problem. We represent \({\tilde{\rho }}_{{{{\rm{eff}}}}}\) using a trainable feed-forward neural network acting as a universal approximator, and implement PRIXS in automatically differentiable form16. This allows us to use backpropagation and gradient descent to systematically improve \({\tilde{\rho }}_{{{{\rm{eff}}}}}\) by minimizing the objective loss \({{{\mathcal{L}}}}\). We explicitly use the known physics, given by Eq. (3), to provide the inductive bias for extracting the desired electronic structure from the spectroscopic measurement.
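The optimization loop can be illustrated with a deliberately reduced sketch. It is not the implementation used here: the neural network is replaced by one softplus-transformed parameter per energy bin, automatic differentiation is replaced by a hand-written gradient, the squared L2 norm is used instead of the norm of Eq. (8), and each shot is represented by a small hypothetical matrix standing in for the discretized forward model.

```python
import math

def softplus(x):
    # Numerically stable log(1 + exp(x)); keeps the estimate positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def sigmoid(x):
    # Derivative of softplus, written to avoid overflow.
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def forward(matrix, rho):
    return [sum(a * r for a, r in zip(row, rho)) for row in matrix]

def loss_and_grad(theta, shots):
    """Squared L2 loss summed over (matrix_k, spectrum_k) pairs and its gradient."""
    rho = [softplus(t) for t in theta]
    loss = 0.0
    grad_rho = [0.0] * len(theta)
    for matrix, measured in shots:
        residual = [p - m for p, m in zip(forward(matrix, rho), measured)]
        loss += sum(r * r for r in residual)
        for row, r in zip(matrix, residual):
            for i, a in enumerate(row):
                grad_rho[i] += 2.0 * a * r
    # Chain rule through rho = softplus(theta).
    grad_theta = [g * sigmoid(t) for g, t in zip(grad_rho, theta)]
    return loss, grad_theta

def fit(shots, n_bins, steps=2000, lr=0.05):
    theta = [0.0] * n_bins
    history = []
    for _ in range(steps):
        loss, grad = loss_and_grad(theta, shots)
        history.append(loss)
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return [softplus(t) for t in theta], history

# Two hypothetical "shots", each with its own forward matrix (toy values).
rho_true = [0.5, 2.0, 1.0]
matrices = [
    [[1.0, 0.5, 0.0], [0.0, 1.0, 0.5]],
    [[0.5, 0.0, 1.0], [1.0, 0.0, 0.5]],
]
shots = [(m, forward(m, rho_true)) for m in matrices]
rho_fit, history = fit(shots, n_bins=3)
```

Each shot is underdetermined on its own (two measurements for three unknowns), but the two shots together pin down all three bins, which mirrors the role played by the shot-to-shot variation of the kernel.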

Fig. 2: Representation of the scheme used to extract physical information on the electronic structure of a material from a SASE-based RIXS measurement.

The estimate of ρeff, represented by a feedforward neural network, is fed to the RIXS differentiable model alongside the measured XFEL spectrum. The model outputs the calculated RIXS spectrum, which is then compared with the experimental spectrum. This comparison generates a loss function used to adjust the neural network parameters, thereby refining the ρeff estimate. The resolution of the resulting density of states depends on the resolutions of the spectrometers used to measure the RIXS signal and the profile of the SASE pulse, but is independent of the SASE pulse structure or bandwidth.

The energy resolution with which \({\tilde{\rho }}_{{{{\rm{eff}}}}}\) can be found is given by the SNR of the single RIXS spectra and the energy resolutions with which Ik and Φk are measured. While the method can, in principle, be used to optimize single shot data, we instead perform batching of the data to improve the SNR across multiple shots (see the subsection ‘Details of the machine learning approach’ in Methods for more information). In contrast to standard averaging approaches, the resolution is not degraded by such a merging of multiple shots, since each pairing (Ik, Φk) is still considered individually in the optimization process.

Synthetic data

As a first validation of the method, we attempt to reconstruct synthetic DOS data using our deconvolution approach. We choose a spiky DOS, which is both challenging for traditional deconvolution algorithms and indicative of narrow bound states, d-band features, and resonances. The DOS data is fed into the forward model of Eq. (5), alongside a series of realistic SASE spectra, to produce synthetic RIXS intensity profiles. We add three levels of Gaussian noise to these spectra, with standard deviations of 0%, 15% and 30% of the RIXS spectra maxima, respectively. Some typical resulting spectra are shown in Fig. 3a–c. Note that at the highest level of noise, the features in the spectrum are barely recognizable.
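The noise model described above can be sketched in a few lines. The toy spectrum, the seed and the function name are assumptions for illustration; only the definition of the noise level, a standard deviation expressed as a fraction of the spectrum maximum, follows the text.

```python
import math
import random

def add_gaussian_noise(spectrum, relative_sigma, rng):
    """Add zero-mean Gaussian noise with std = relative_sigma * max(spectrum)."""
    sigma = relative_sigma * max(spectrum)
    return [value + rng.gauss(0.0, sigma) for value in spectrum]

rng = random.Random(0)   # fixed seed so the example is reproducible

# Toy stand-in for a clean RIXS spectrum (hypothetical shape).
clean = [math.exp(-(((i - 50) / 10.0) ** 2)) for i in range(100)]

# The three noise levels used for the synthetic study: 0%, 15% and 30%.
noisy = {level: add_gaussian_noise(clean, level, rng) for level in (0.0, 0.15, 0.30)}
```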

Fig. 3: Reconstruction of a synthetic density of states from a RIXS spectrum at varying levels of noise, using a realistic SASE XFEL profile.

A total of 50 single shots were used. We show examples of the synthetic RIXS spectra in a–c, with the corresponding reconstructions in d–f. The technique is robust to noise, owing to the strong inductive bias given by the model, and the stochastic nature of the SASE pulse profile Φ that allows us to oversample the DOS. The zero of the DOS energies is centred at the resonance.

These synthetic spectra are provided to our deconvolution scheme, alongside the corresponding SASE spectra. We show the resulting extracted DOS in Fig. 3d–f. Our approach shows good convergence of the extracted DOS for all levels of noise up to 30%. However, the accuracy of reconstruction decreases as we raise the level of noise, with the extracted ρeff starting to miss small or narrow features, especially for large detuning energies. This behaviour is expected, since the area under a DOS feature and its detuning, i.e., its distance from resonance, are the two main parameters that determine the intensity of the feature in the RIXS spectrum. Bright features closer in energy to the resonance are thus more robust to low SNR. Another common factor that deteriorates the accuracy of the reconstructions is the presence of spurious oscillations in the extracted ρeff. These oscillations are a typical byproduct of deconvolution, and are present on energy-scales smaller than the kernel’s bandwidth. Regardless, the successful extraction of features using our approach even from spectra dominated by noise, as shown in Fig. 3c, remains significant.

A note on terminology is in order: following the notation adopted in ref. 4, we refer to scattering processes with detuning energies up to 150 eV as RIXS; however, other names (e.g., high energy resolution off-resonance spectroscopy) have also been used to describe such scattering processes in the literature17,18. Unlike in ref. 4, where the incident photon energies are varied, in this work we fix the XFEL central energy approximately 50 eV below the iron K edge to optimize the extraction of the valence density of states.

To demonstrate the efficacy of our method, in Fig. 4 we show the quality of the ρeff reconstruction as a function of the number of shots in the dataset for four different approaches and two noise levels. The blue and red curves employ the paradigm described in Fig. 2 to reconstruct ρeff: with blue circles we show the results using all individual experimental pairs (Ik, Φk), while the results using average values (\(\bar{I},\bar{\Phi }\)) are shown with red squares. As in refs. 9,10, we also directly invert Eq. (5) via standard numerical techniques for comparison (see subsection ‘Details of the numerical approach’ in Methods), both for the individual pairs (diamond markers) and for the averaged spectra (triangle markers). We find that while the direct inversion method achieves the best results in the absence of noise, its performance degrades rapidly as noise is added. This is due to the characteristics of PRIXS, which maps very different ρeff into similar RIXS spectra (contraction map), making its inversion an ill-conditioned problem. On the other hand, our approach shows little variation in the quality of the reconstructed DOS with noise, and is thus more robust. In particular, we observe that with our method the results obtained for a 15% noise level converge to those achieved without noise already for a relatively modest N ≈ 50. The comparison between single shot and averaged analysis indicates that averaged calculations suffice for broad features, but that shot-by-shot analysis is required to capture finer structure, and that the averaging process leads to a loss in resolution. This can be explained considering that deconvolution is more efficient with spiky XFEL spectra, characteristic of the single shot case, rather than with relatively broad Gaussian XFEL spectra, which occur in averaged calculations. Specifically, in the absence of noise, where no gain on the SNR is obtained through averaging, the use of the averaged spectra deteriorates the quality of the reconstructions as N grows.

Fig. 4: Reconstruction quality of the RIXS spectrum as a function of the number of shots N included in the inversion dataset.

The DOS used is that shown in Fig. 3a. While the direct inversion works best for noiseless data, the quality rapidly deteriorates for real data with noise. In contrast, our approach is fairly insensitive to noise. The loss function employed for this plot is a weighted L2 distance between the reconstructed and the original DOS, with larger weights for the energy regions where features are present. This is achieved by defining the weighted L2 norm as \(1/F{\sum }_{i}| d{\rho }_{{{{\rm{eff}}}}}/d\epsilon | ({\epsilon }_{i}){({\tilde{\rho }}_{{{{\rm{eff}}}}}({\epsilon }_{i})-{\rho }_{{{{\rm{eff}}}}}({\epsilon }_{i}))}^{2}\), where ρeff is the original DOS and F is a normalization factor.
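A minimal sketch of this weighted metric is given below. Two choices are assumptions rather than details stated in the text: the derivative is estimated with finite differences, and the normalization factor F is taken to be the sum of the weights (so the function is undefined for a perfectly flat reference DOS).

```python
def weighted_l2(eps_grid, rho_true, rho_est):
    """(1/F) * sum_i |d rho_true / d eps|(eps_i) * (rho_est_i - rho_true_i)^2."""
    n = len(eps_grid)
    weights = []
    for i in range(n):
        # Central differences in the interior, one-sided at the two ends.
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        slope = (rho_true[hi] - rho_true[lo]) / (eps_grid[hi] - eps_grid[lo])
        weights.append(abs(slope))
    norm = sum(weights)   # assumed choice for the normalization factor F
    return sum(
        w * (est - ref) ** 2 for w, est, ref in zip(weights, rho_est, rho_true)
    ) / norm

eps_grid = [0.0, 1.0, 2.0, 3.0, 4.0]
rho_true = [0.0, 0.0, 1.0, 0.0, 0.0]
perfect = weighted_l2(eps_grid, rho_true, rho_true)
flank_error = weighted_l2(eps_grid, rho_true, [0.0, 0.2, 1.0, 0.0, 0.0])
flat_error = weighted_l2(eps_grid, rho_true, [0.2, 0.0, 1.0, 0.0, 0.0])
```

In this toy case an error on the flank of the peak is penalized, whereas the same error in the flat region far from any feature carries zero weight.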

Experimental data

We now turn to the analysis of experimental RIXS spectra, which have been obtained at the HED instrument of the European XFEL19. The measurement was performed using x-rays at a photon energy of 7060 eV, focused to spot sizes of 7–10 μm onto samples of Fe and Fe2O3. The total pulse energy in the beam ranged between 500 and 1000 μJ, but it was constrained to be between 700 and 800 μJ for the analysed data in order to limit the sample thermodynamic variations. The focal spot size was optimized in-situ to maximize x-ray heating, diagnosed via the observed emission from the Fe M-shell. The targets consisted of 20 μm thick freestanding Fe foils, and 15 μm thick Fe2O3, deposited on 50 μm of plastic. These thicknesses correspond to a single absorption length or below at the photon energies used, and they were chosen to maintain a uniform temperature from x-ray heating. The FEL was operated in SASE mode20, with an average pulse duration of 40 fs and a spectral bandwidth of around 19 eV full-width-half-maximum (FWHM). The photon energy was tuned to lie just below the Fe K-edge to ensure that RIXS was the dominant scattering process. By ensuring that the bandwidth of the pulse is sufficiently close to the Kβ transition energy (1s-3p) we further ensure that the ionization of the 3p state can be measured. The 3p state is fully occupied in the ground state, and is only depopulated in the experiment due to the heating of the electrons via the intense x-ray irradiation. The RIXS signal was measured using a cylindrically bent Highly Annealed Pyrolytic Graphite (HAPG) spectrometer in the von Hamos configuration21, coupled to a Jungfrau detector. The experimentally determined resolution of the spectrometer and setup, including crystal resolution, pixel size effects and source size, was 5.5 eV. The spectrum of the SASE beam was determined on a shot-to-shot basis via a Si beamline spectrometer with resolution of 0.3 eV22.

Examples of single-shot experimental spectra are shown in Fig. 5, alongside the corresponding extracted electronic structure. We see that the typical level of noise for a single-shot RIXS spectrum is on the order of 15%, comparable to the cases examined for the synthetic data. The experimental reconstructions of ρeff are compared with density functional theory calculations (see section ‘Theoretical calculations’ in Methods), which allow us to identify various features in the experimental data, and to evaluate the resolution with which ρeff(ε) can be extracted in practice, given our experimental setup. The experimental ρeff have been reconstructed using our dynamic kernel deconvolution scheme over approximately 18,000 RIXS shots for each material. We conducted six independent fitting processes for each material, with different random seeds for the NN. The average and standard deviation of the resulting outputs have been taken, respectively, as the best estimate for ρeff and its error. The error on the experimental RIXS spectra due to noise, which is propagated through to ρeff, becomes negligible because of the large number of experimental shots. Hence, the error is due only to the stochasticity of the NN optimization.
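The ensemble estimate described above amounts to a per-energy-bin mean and standard deviation over the independently seeded fits. The sketch below shows one way to compute it; the use of the sample standard deviation and the toy numbers are assumptions, since the text does not specify the estimator.

```python
import statistics

def ensemble_estimate(runs):
    """Per-bin mean and sample standard deviation over independent fits.

    runs : list of reconstructions, one list of rho_eff values per random seed.
    """
    columns = list(zip(*runs))   # regroup the values bin by bin
    means = [statistics.mean(c) for c in columns]
    errors = [statistics.stdev(c) for c in columns]
    return means, errors

# Two hypothetical reconstructions on a two-bin energy grid.
runs = [[1.0, 2.0], [3.0, 2.0]]
best_estimate, error = ensemble_estimate(runs)
```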

Fig. 5: Measured accessible density of states of Fe and Fe2O3 a, b, reconstructed from the RIXS measurements, along with their respective experimental spectra c, d.

The grey bands indicate the standard deviation. The measured data are compared with theoretical DFT calculations, for which we used a smearing width of 7 eV for Fe2O3 and 9 eV for Fe (ref. 47). The insets show the reconstruction of the M-shell vacant states, located around −55 eV. This feature is only present if the M-shell is thermally ionized by the x-ray beam, thus allowing us to estimate the temperature of the sample. The red dashed vertical line indicates a reference M-shell binding energy, taken from ref. 27.

Estimation of the resolution

The experimental reconstructions in Fig. 5 agree well with the theoretical predictions, and we note that we are able to extract densities of states with a fairly complex structure from RIXS spectra with relatively low SNR. We estimated the resolution of the extracted ρeff with two different methods, shown in Fig. 6. In panel (a) we fit the reconstructed Fe2O3 ρeff with a superposition of multiple Gaussians. This fit is then used to estimate the width of the narrowest feature we reconstruct and, therefore, the resolution. This width is found to be 6.7 eV. Another method to evaluate the resolution is to consider the L2 distance between the experimental ρeff and the DFT predictions, as a function of the smearing width applied to the latter. This curve is plotted in Fig. 6b. Its minimum is found for a smearing width of around 8.2 eV, yielding another estimate for the resolution of our extracted ρeff. The small discrepancy between the two methods can be traced back to the different resolutions with which different parts of the DOS seem to be reconstructed (e.g., the continuum slope and the small valence band peak). This resolution is somewhat lower than the limit of 5.5 eV imposed by the spectrometer resolution in our experimental setup. We attribute this small difference primarily to the SNR of the experimental data, which was fairly low, as the data collection was done in single-shot mode (see Fig. 5). Nevertheless, the reconstructed electronic structures show resolutions up to three times higher than the spectral FWHM of the XFEL pulse. We show in Fig. 5a, b that this allows us to distinguish the environment in which an Fe atom is embedded, via differences in the electronic structure of Fe in the insulating Fe2O3 or in the metallic state. Such a distinction cannot be made via RIXS with the normal spectral resolution of the SASE beam of ~20 eV, as illustrated in Fig. 7.
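The second resolution estimate, a scan over the smearing width applied to the reference calculation, can be sketched as follows. This is an illustration under assumptions: the smearing is taken to be a Gaussian whose width is quoted as a FWHM, the kernel is renormalized at the grid edges, and the reference curve is a toy profile rather than a DFT result.

```python
import math

def gaussian_smear(eps_grid, values, fwhm):
    """Convolve values with a normalized Gaussian of the given FWHM."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    smeared = []
    for e0 in eps_grid:
        weights = [math.exp(-0.5 * ((e - e0) / sigma) ** 2) for e in eps_grid]
        total = sum(weights)   # renormalize so edges are not artificially damped
        smeared.append(sum(w * v for w, v in zip(weights, values)) / total)
    return smeared

def best_smearing_width(eps_grid, reference, measured, widths):
    """Width minimizing the L2 distance between smeared reference and measured."""
    def distance(width):
        smeared = gaussian_smear(eps_grid, reference, width)
        return math.sqrt(sum((s - m) ** 2 for s, m in zip(smeared, measured)))
    return min(widths, key=distance)

# Toy reference: a continuum edge at 0 eV plus a narrow pre-edge peak.
eps_grid = [float(e) for e in range(-40, 41)]
reference = [
    (1.0 if e >= 0.0 else 0.0) + (0.5 if abs(e + 10.0) < 1.5 else 0.0)
    for e in eps_grid
]
# Stand-in for a reconstruction with an 8 eV effective resolution.
measured = gaussian_smear(eps_grid, reference, 8.0)
best = best_smearing_width(eps_grid, reference, measured, [2.0, 4.0, 6.0, 8.0, 10.0, 12.0])
```

Because the stand-in "measurement" was generated with an 8 eV smearing, the scan returns 8 eV; with real data the minimum of the distance curve plays the same role.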

Fig. 6: Two different approaches for the estimate of the extracted DOS resolution.

In panel (a) this is carried out by fitting a superposition of multiple Gaussians to the reconstructed DOS, whereas in panel (b) we consider the L2 distance between the experimental reconstruction and the DFT simulations as a function of their smearing width. The error intervals on the reconstructed DOS in panel (a) are identical to those of Fig. 5.

Fig. 7: Comparison between the reconstructed ρeff for Fe2O3 (solid black lines) and the DFT simulations (dashed blue lines) with smearing widths of 0.2, 5, 10, and 20 eV respectively.

With a smearing of 20 eV, as would be imposed by the SASE bandwidth, the characteristic features of Fe2O3 are no longer visible.

Characteristic of the measured electronic structures

The comparison of the electronic structure with the orbital-projected DOS reveals that the pre-edge feature in Fe2O3, located around −10 eV, contains a mixture of the 2p oxygen states and the 3d iron states, which are indistinguishable at these resolutions. The band gap between these states and the conduction ones, reported by experimental measurements23,24 to have a width of approximately 2 eV, is visible in the reconstruction as a slightly convex plateau located just below 0 eV.

In addition to the modulation in the continuum and the pre-edge feature in the Fe2O3 case, we are able to reconstruct the M-shell for both materials, as shown in the insets of Fig. 5a, b. Despite the small area of these features due to the low thermal de-occupation, they are amplified in the RIXS spectra by their vicinity to the resonant energies. Thus, the reconstructed ρeff gives us a method to experimentally estimate the binding energy (BE) of the M-shell in extreme thermodynamic conditions, a quantity difficult to obtain via theoretical approaches25,26. The values found from the reconstructions are (−56 ± 3) eV for Fe and (−59 ± 5) eV for Fe2O3. As we observed that finite-temperature effects like pressure ionization have a negligible impact on BE for the temperatures reached in the experiment, we compared our estimates with the M-shell binding energies documented in the NIST database27 and we found our results to be compatible with these reference values.

As in ref. 4, we can exploit the M-shell feature to estimate the sample temperature averaged over the beam pulse duration. However, in contrast to ref. 4, where the sample temperature is found by fitting the theoretical RIXS spectra to the experimental ones, here we can directly fit the DFT-computed density of accessible states (\(\rho |M|^{2}\)) to the reconstructed \({\tilde{\rho }}_{{{{\rm{eff}}}}}\):

$$[1-{f}_{FD}(\epsilon ;T)]{\rho }_{{{{\rm{DFT}}}}}(\epsilon )| {M}_{{{{\rm{DFT}}}}}(\epsilon ){| }^{2}{=}^{!}{\tilde{\rho }}_{{{{\rm{eff}}}}}(\epsilon ),$$
(9)

where the chemical potential must be computed self-consistently. This new procedure enables us to reduce the error bars on the temperature estimates compared to the previous work, yielding temperatures of (6.1 ± 0.2) eV for Fe and of (5.2 ± 0.2) eV for Fe2O3. We note that these estimates justify the implicit assumption that depopulation of the M-shell is given mainly by thermal collisions (and not by photoionization), and therefore can be described by the factor (1 − fFD). As observed in ref. 28, these extracted temperatures provide a good estimate of the peak temperature reached in the target during the irradiation.
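The fit of Eq. (9) can be illustrated with a simple grid search over temperature. This sketch makes a significant simplification that should be stated plainly: the chemical potential `mu` is held fixed instead of being computed self-consistently. The toy density of accessible states, the temperature grid and all names are hypothetical.

```python
import math

def vacancy_factor(eps, temperature, mu):
    """1 - f_FD(eps; T), with energies and temperature in eV."""
    x = (eps - mu) / temperature
    if x > 700.0:
        return 1.0
    if x < -700.0:
        return 0.0
    return 1.0 / (1.0 + math.exp(-x))

def fit_temperature(eps_grid, bare_dos, rho_eff, temperatures, mu):
    """Grid search for the T that minimizes the squared misfit of Eq. (9)."""
    def misfit(temperature):
        return sum(
            (vacancy_factor(e, temperature, mu) * g - r) ** 2
            for e, g, r in zip(eps_grid, bare_dos, rho_eff)
        )
    return min(temperatures, key=misfit)

# Toy rho * |M|^2: a bound-state peak near -55 eV plus a continuum above 0 eV.
eps_grid = [-60.0 + i for i in range(71)]
bare_dos = [
    5.0 * math.exp(-0.5 * ((e + 55.0) / 2.0) ** 2) + (1.0 if e >= 0.0 else 0.0)
    for e in eps_grid
]
MU = 0.0   # assumed fixed chemical potential (not self-consistent)

# Stand-in for a reconstruction of a sample at T = 6 eV.
rho_eff = [vacancy_factor(e, 6.0, MU) * g for e, g in zip(eps_grid, bare_dos)]
temperatures = [4.0 + 0.5 * i for i in range(9)]
t_fit = fit_temperature(eps_grid, bare_dos, rho_eff, temperatures, MU)
```

A least-squares optimizer with a self-consistent chemical potential at each trial temperature would replace the grid search in a realistic analysis.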

Conclusions

We have used a neural surrogate and differentiable programming to establish a machine learning routine that can extract hyper-resolved DOS measurements from RIXS diagnostics at XFEL experiments through large amounts of data. This procedure could represent an alternative way forward in XFEL experiments which prioritizes the increase of the laser pulse energy over the development of monochromation techniques. Our method for the extraction has been compared to other approaches, demonstrating its robustness to noise present in the experimental measurements. We have furthermore shown that the analysis of RIXS spectra through this method can be used to gain a plethora of useful information, from distinguishing between different material spectra to inferring temperatures in HED regimes. A potential future step to further improve the resolution would envisage the inclusion of the setup instrument function in the forward model. This is particularly beneficial in instances where we can model the effects of specific setup components, such as the broadening due to a mosaic crystal29. Beyond the application to RIXS analysis for further materials and experiments, this machine learning approach can provide a promising avenue for other diagnostics, such as X-Ray Thomson Scattering (XRTS) or the extraction of structure factors from diffraction data. Even though each such application requires a differentiable programming implementation of the corresponding forward model, the combination of machine learning with known physical inductive biases under this scheme constitutes a powerful tool for the analysis of experimental data and the estimation of physical quantities that can only be measured indirectly.

Methods

The physical parameters of PRIXS, chosen for our synthetic and experimental data analysis, are specific to iron and have been taken from ref. 30. We report them in Table 1.

Table 1 Physical parameters used in this work

Details of the machine learning approach

The estimator used to reconstruct ρeff(ε) was a feed-forward neural network with a single input and output, 4 hidden layers with 40 nodes each, and the softplus activation function31. The neural network was trained by means of the ADAM32 optimizer with the initial learning rate set to 10−3. This optimization, together with the automatic differentiation of the forward model, was carried out using the library PyTorch33. To ensure a good performance of this scheme, we have identified the necessity of addressing the vanishing gradient problem34 by omitting exponentially small factors in the gradient backpropagation. The training was performed splitting the set of experimental spectra (Ik, Φk for k = 0, 1, …, N) into batches, whose size was increased dynamically during the training to shift from exploratory to exploitative training. The optimization routine considers the loss function, and the respective gradients, on an entire batch to update model parameters at each step. Furthermore, the training has been carried out over many epochs, i.e., rolling over all the N shots multiple times. Finally, during the training we constrained the output of the neural network \({\tilde{\rho }}_{{{{\rm{eff}}}}}(\varepsilon )\) to be positive, a fundamental physical requirement.
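For orientation, the sketch below writes out the forward pass of a network with the stated shape (one input, four hidden layers of 40 nodes, one output, softplus activations) in plain Python. It is not the PyTorch implementation used in this work: the initialization scheme is hypothetical, no training is performed, and applying softplus to the output is only one assumed way of enforcing positivity.

```python
import math
import random

def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def init_mlp(sizes, rng):
    """Random weights and zero biases for consecutive layer sizes."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)   # hypothetical initialization range
        weights = [
            [rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)
        ]
        layers.append((weights, [0.0] * n_out))
    return layers

def mlp_forward(layers, eps):
    """Map one energy value to one positive rho_eff estimate."""
    activations = [eps]
    for weights, biases in layers:
        activations = [
            softplus(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations[0]   # softplus on the output keeps the estimate positive

layers = init_mlp([1, 40, 40, 40, 40, 1], random.Random(0))
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
rho_at_half_ev = mlp_forward(layers, 0.5)
```

With this layout the estimator has 5041 trainable parameters, which is small compared with the number of measured spectra used in the fit.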

Details of the numerical approach

To extract ρeff from a set of RIXS and SASE spectra without making use of machine learning techniques, we employ the following routine:

  1. We first discretize the RIXS operator for each FEL spectrum Φk, computing the associated matrices Mk for k = 1, 2, …, N. Each Mk is constructed taking as its i-th column PRIXS[Φk, ρi], where \({({\rho }_{i})}_{j}={\delta }_{i,j}\), with i, j = 1, 2, …, L and L the length of ρeff. Note that this discretization is possible because PRIXS is a linear operator in ρeff.

  2. We then construct the macro-linear system of equations in ρeff by stacking the matrices Mk and the RIXS spectra Ik as:

    $$\left[\begin{array}{c}{M}_{1}\\ {M}_{2}\\ \vdots \\ {M}_{N}\end{array}\right]{\rho }_{{{{\rm{eff}}}}}:=M{\rho }_{{{{\rm{eff}}}}}=\left[\begin{array}{c}{I}_{1}\\ {I}_{2}\\ \vdots \\ {I}_{N}\end{array}\right]:=I$$
    (10)

    Notice that the matrix M is in general not square.

  3. Finally, we approximate the solution of this linear system, and hence ρeff, using the conjugate gradient descent method35.

This idea of discretizing the physical process operator and inverting the associated matrix has already been employed in the field of spectroscopic analysis10. When working with the averaged spectra, we just search for a solution of the linear system \(\bar{M}{\rho }_{{{{\rm{eff}}}}}=\bar{I}\), where \(\bar{M}\) is constructed using the averaged FEL spectrum \(\bar{\Phi }\).
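The three steps of this routine can be illustrated with a short NumPy sketch. It is a sketch under stated assumptions rather than the code used here: `toy_operator` is a hypothetical linear stand-in for PRIXS, the data are synthetic and noise-free, and the conjugate gradient iteration is applied to the normal equations (the CGLS variant), which is one common way to handle a non-square M; the source does not specify which variant was used.

```python
import numpy as np

rng = np.random.default_rng(0)
L, N = 32, 20                                   # grid length and number of shots
grid = np.linspace(-1.0, 1.0, L)
kernel = 1.0 / (1.0 + ((grid[:, None] - grid[None, :]) / 0.05) ** 2)

def toy_operator(phi, rho):
    """Hypothetical stand-in for PRIXS[phi, rho]; linear in rho."""
    return kernel @ (phi * rho)

def discretize(phi):
    """Step 1: matrix whose i-th column is the operator applied to the unit vector rho_i."""
    return np.column_stack([toy_operator(phi, unit) for unit in np.eye(L)])

def cgls(M, I, iterations=200, tol=1e-20):
    """Step 3: conjugate gradient on the normal equations M^T M x = M^T I."""
    x = np.zeros(M.shape[1])
    r = I - M @ x                               # residual of the stacked system
    s = M.T @ r                                 # residual of the normal equations
    p = s.copy()
    gamma = s @ s
    for _ in range(iterations):
        if gamma < tol:
            break
        q = M @ p
        alpha = gamma / (q @ q)
        x += alpha * p
        r -= alpha * q
        s = M.T @ r
        gamma_new = s @ s
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
    return x

# Synthetic data: one stand-in FEL spectrum and one RIXS spectrum per shot.
rho_true = np.exp(-(grid / 0.3) ** 2)
spectra_fel = 0.5 + rng.random((N, L))
spectra_rixs = [toy_operator(phi, rho_true) for phi in spectra_fel]

# Step 2: stack the per-shot matrices and spectra into M rho = I.
M = np.vstack([discretize(phi) for phi in spectra_fel])    # shape (N * L, L)
I = np.concatenate(spectra_rixs)                            # shape (N * L,)

rho_est = cgls(M, I)
```

For the averaged spectra, the same `discretize` and `cgls` calls would be applied once to the mean FEL spectrum and the mean RIXS spectrum instead of the stacked system.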

Theoretical calculations

We simulated the electronic structure of Fe and Fe2O3 using finite-temperature density functional theory (DFT) with a locally-modified version of the ABINIT v8.10.3 code36,37,38. The relevant modification is the inclusion of the hybrid Kohn-Sham plane-wave-approximation scheme39 as described by ref. 40 to enable accurate high temperature calculations. The ABINIT code is used to solve the Kohn-Sham (KS) system41, which provides the KS states and eigenvalues. The DOS is given by the latter, while the former can be used to calculate the dipole transition matrix elements from the Fe 1s core state to the calculated valence states42. The orbital-projected DOS was also calculated to aid in identifying features in the total DOS.

The ion cores were represented using the projector-augmented wave (PAW) scheme43, with Fe and O PAW potentials generated using the Atompaw code44. For efficiency, the Fe 1s, 2s, 2p orbitals and the O 1s orbital were treated with the frozen-core approximation. This approximation is suitable for the relatively low temperatures (< 10 eV) reached in this experiment, as these orbitals are not thermally ionized.
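A rough order-of-magnitude check of the last statement can be made with Fermi-Dirac statistics. The sketch below is illustrative only: it assumes an independent-particle picture, and the depth of roughly 700 eV below the chemical potential for the shallowest frozen Fe orbital (2p) is an approximate value introduced here for illustration, not a number taken from this work.

```python
import math

K_B_EV_PER_K = 8.617333262e-5   # Boltzmann constant in eV/K

def hole_fraction(depth_ev, temperature_ev):
    """Fermi-Dirac probability that a level depth_ev below the chemical potential is empty."""
    x = depth_ev / temperature_ev
    # 1 - f = 1 / (1 + exp(x)), written with exp(-x) to avoid overflow for deep levels.
    return math.exp(-x) / (1.0 + math.exp(-x))

# Electron temperatures considered in the calculations, expressed in eV.
temperatures_ev = {
    "300 K": 300.0 * K_B_EV_PER_K,
    "1 eV": 1.0,
    "5 eV": 5.0,
    "10 eV": 10.0,
}

ASSUMED_DEPTH_EV = 700.0        # assumption: approximate Fe 2p depth, for illustration only

fractions = {label: hole_fraction(ASSUMED_DEPTH_EV, t) for label, t in temperatures_ev.items()}
```

Under these assumptions the thermal hole fraction stays negligible even at the highest temperature considered, in line with treating these orbitals as frozen.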

Calculations were performed at electron temperatures of 300 K, 1 eV, 5 eV, and 10 eV, with the ions frozen in their ambient crystal lattice positions. This further approximation is justified by the femtosecond duration of the XFEL pulses, during which all the spectral emission of interest occurs. This timescale is substantially shorter than the electron-phonon coupling times of several picoseconds. The Fe calculations were therefore performed in a bcc primitive unit cell containing a single atom. Simulations were carried out with 120 bands, a 30 × 30 × 30 k-point grid, and cut-off energies of 50 Ha for the PAW pseudo-wavefunctions and 150 Ha for the all-electron wavefunctions. The Fe2O3 calculations were performed in its α-phase45, with a unit cell containing 30 atoms, 1440 bands, a 6 × 6 × 1 k-point grid (the last direction being the long direction of the unit cell), and cut-off energies of 20 Ha for the PAW pseudo-wavefunctions and 100 Ha for the all-electron wavefunctions.

For the exchange-correlation functional, the PBE form of the generalized gradient approximation (GGA) was used for both the Fe and O atoms. We used the corrective approach of a Hubbard potential (DFT+U) to recover the band gap in Fe2O3, with a value of U = 4.0 eV for the Fe l = 2 channel chosen to recover a band gap of ~2.0 eV, as given by ref. 46. As the higher temperatures we consider here are substantial compared with the value of U, we choose U = 4.0 eV for all Fe2O3 calculations. Small adjustments to the value of U at a temperature of 5 eV did not produce meaningful changes in the electronic structure, justifying this approach.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.