Capturing dynamical correlations using implicit neural representations

Understanding the nature and origin of collective excitations in materials is of fundamental importance for unraveling the underlying physics of a many-body system. Excitation spectra are usually obtained by measuring the dynamical structure factor, S(Q, ω), using inelastic neutron or x-ray scattering techniques and are analyzed by comparing the experimental results against calculated predictions. We introduce a data-driven analysis tool which leverages ‘neural implicit representations’ that are specifically tailored for handling spectrographic measurements and are able to efficiently obtain unknown parameters from experimental data via automatic differentiation. In this work, we employ linear spin wave theory simulations to train a machine learning platform, enabling precise exchange parameter extraction from inelastic neutron scattering data on the square-lattice spin-1 antiferromagnet La2NiO4, showcasing a viable pathway towards automatic refinement of advanced models for ordered magnetic systems.


INTRODUCTION
Quantum matter, characterized by macroscopic order arising from microscopic spin or charge arrangements or by phases with spontaneous symmetry breaking, represents an abundant and complex class of materials in condensed matter physics. For example, the magnetic configuration of a material and its dynamics are often a synergistic effect of multiple interactions as well as crystalline symmetries. The collective spin excitations in most magnetic materials, such as spin waves or magnons, act as probes of those interactions. This information is reflected in their dispersion relations and correlations, and it has a wide range of potential applications, which include forming spintronics devices as well as carrying, transferring, and storing information [1][2][3].
In the last few decades, a major aim has been to characterize these excitations, and this has been facilitated by advances in spectroscopic techniques such as neutron scattering [4][5][6][7]. These techniques use the energy and directions of scattered neutrons to measure the dispersion relations, lifetimes, and amplitudes of the spin excitations. While neutron scattering can provide valuable information about the structure and magnetic properties of materials, the limited availability of neutrons (i.e. low flux compared to other scattering techniques), the small partial differential cross section, and the complexity of interpretation have made the question of how to enhance the efficiency of these experiments a long-standing topic [8,9]. Moreover, the interpretation of neutron scattering data can be challenging and time-consuming due to the complex nature of this physical process, the diversity of samples, and limited knowledge from theoretical modeling. With these obstacles, there is a need for deep and synergistic collaboration among experiment, theory, and data analysis to accelerate and simplify the understanding of spin properties [10]. Moreover, as rates of data collection and the ability to collect hyper-dimensional datasets increase, it is important to allow for real-time fitting during experimentation. The ability to perform 'on-the-fly' fitting [11] can allow for efficient use of expensive beamtime by knowing when sufficient data have been collected, as well as by coupling to adaptive sampling methods to gain the most information about parameters of interest with the fewest measurements possible. Currently, for neutron scattering data, real-time fitting can require substantial preparation. For example, direct fitting with the SpinW package [12] requires the extraction of the eigenmodes of the system and therefore needs an accurate and preferably automatic peak extraction algorithm. When the chosen paths in reciprocal space are numerous or the dispersion relations change significantly along those paths, this can involve significant human guidance and monitoring. In addition, fitting directly through SpinW does not take into consideration the magnon peak intensities or their shapes. Approaches that fit peak intensities and shapes directly, such as Multi- or Tobyfit implemented in HORACE [13], are possible alternatives; however, these fitting procedures still require significant human guidance and are either extremely slow, and therefore incompatible with data acquisition rates, or else require an analytical, rapidly calculable spin wave model. Finding such a model is usually only feasible for simpler systems with minimal frustration or a low number of magnetically distinct sites.

arXiv:2304.03949v1 [cond-mat.str-el] 8 Apr 2023
In this work, we address this problem by developing a machine learning platform based on a data-driven differentiable neural representation (interchangeably referred to as the "surrogate model"), which could substantially change how these types of experiments are carried out. This method works with any algorithm, magnetic or non-magnetic, that calculates the dynamical structure factor S(Q, ω), where Q is the scattering vector and ω is the energy transfer. The platform is based on recent developments in implicit neural representation theory [14,15]. The dynamical structure factor is the general function measured in many inelastic x-ray and neutron experiments and is related to the partial differential cross section by

$$\frac{d^2\sigma}{d\Omega\, d\omega} \propto \frac{k_f}{k_i}\, S(\mathbf{Q}, \omega),$$

where k_i and k_f are the incident and final wave vectors. Here, the dynamical structure factor is approximated as

$$S(\mathbf{Q}, \omega) \propto \sum_{m,n} e^{i\mathbf{Q}\cdot(\mathbf{r}_m - \mathbf{r}_n)} \int dt\, e^{-i\omega t}\, \langle S_m(t) S_n(0) \rangle,$$

where ⟨S_m(t)S_n(0)⟩ represents spin-spin correlations at different atomic sites m, n with positions r_m, r_n. The neutron polarization factor as well as the magnetic form factor are neglected here. This work extends recent prior work which used machine learning methods to calculate the static structure factor S(Q) after training on simulated data [16,17].
Our focus in this work is the introduction of neural implicit representations to model the dynamical regime. Conventional feed-forward neural networks, trained on large linear spin wave theory (LSWT) simulations, have also been applied to inelastic neutron scattering data to predict the type of magnetic exchange from data [18]. More recently, a cycle-GAN approach, which is able to make experimental data look like simulated data, has been applied as a pre-processing step to further aid in predicting a certain class of spin model [19]. We note that our work is complementary to these approaches in that our focus is on predicting continuous parameters within an assumed Hamiltonian model rather than predicting the discrete choice of functional form. For a comprehensive survey of the application of machine learning methods to x-ray and neutron data, please refer to Ref. [10].
Our approach offers a straightforward method to both predict and quickly learn parameters of a spin or charge Hamiltonian directly from inelastic neutron or resonant x-ray scattering data. The approach should be especially useful when paired with expensive and advanced computational methods for simulating strongly coupled electrons, such as exact diagonalization (ED) [20], density matrix renormalization group (DMRG) [21,22], determinant quantum Monte-Carlo (DQMC) [23,24], and variational Monte Carlo (VMC) [25,26].
To further demonstrate the versatility of our method, we report the results using a less expensive series of calculations based on mean field theory, the linear spin wave theory (LSWT) framework [27]. We simulate a large set of LSWT spectra from a spin-1 square-lattice Heisenberg Hamiltonian model over a large phase space of Hamiltonian parameters. We use a GPU-based machine learning framework to learn how to recover these parameters from scattering data and, in particular, show that the method does not rely on peak fitting algorithms for experimental data. We test our approach on experimental time-of-flight neutron spectroscopy data [28] taken on the quasi-2D Néel antiferromagnet La2NiO4 at the Spallation Neutron Source at the Oak Ridge National Laboratory [29] and show that the approach can utilize raw data almost directly from neutron scattering files (containing Q, ω, and S(Q, ω)) to return Hamiltonian parameters that represent the system under study. In addition, we use a Monte-Carlo simulation of the experimental data collection to demonstrate the potential for continuously fitting data as it is collected, in order to provide guidance on when enough information has been collected to conclude the experiment.

MACHINE LEARNING APPROACH
Our machine learning framework is based on implicit neural representations [14,15]. These models are often described as coordinate networks as they take a coordinate as input and typically output a single scalar or a small set of scalars. For example, in the field of computational imaging, these networks learn mappings from pixel position (i, j) to RGB value. This representation allows the model to be queried in between pixel values and implicitly represents the image through the trained weights of a neural network. We use a SIREN coordinate network [14], a fully-connected neural network [30] with sinusoidal activation functions, which has been shown to accurately capture high-frequency features in images and scenes and has been particularly successful at tasks such as 3D-shape representation. Furthermore, gradients and higher-order derivatives of the mapping can be readily calculated and used for solving inverse problems [14,31-33].
Here, the SIREN model acts as a fast and differentiable implicit representation for the hyper-volume of the dynamical structure factor across different model Hamiltonian parameters. The Hamiltonian we investigate in this work corresponds to the further nearest-neighbor Heisenberg model and is generally given by Eq. 1 [34,35].

$$H = J \sum_{\langle i,j \rangle} \mathbf{S}_i \cdot \mathbf{S}_j + J_p \sum_{\langle\langle i,j \rangle\rangle} \mathbf{S}_i \cdot \mathbf{S}_j \qquad (1)$$
As depicted in Fig. 1a, J and J_p are the first- and second-nearest-neighbor Heisenberg interactions on a square lattice. Thus for Q_x and Q_y, a square-lattice notation is utilized with a and b corresponding to the vectors connecting the first nearest neighbors.
Specifically, our SIREN model is trained to represent the scalar function log(1 + S(Q, ω, J, J_p)), which is a logarithmic transformation of the dynamical structure factor evaluated at a specific Q ∈ R^3 (reciprocal lattice vectors in units of r.l.u.), ω ∈ R^1 (energy transfer in units of meV) and J, J_p ∈ R^1 (specific Hamiltonian coupling parameters). This functional mapping is described by Eq. 2 and the SIREN model Φ is visualized in Fig. 1c.

$$\Phi : (\mathbf{Q}, \omega, J, J_p) \mapsto \log\left(1 + S(\mathbf{Q}, \omega, J, J_p)\right) \qquad (2)$$
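To make the coordinate-network mapping concrete, the following is a minimal NumPy sketch of a SIREN-style forward pass from a normalized input (Q_x, Q_y, Q_z, ω, J, J_p) to one scalar. It is an illustration under stated assumptions, not the trained model: the function names `init_siren` and `siren_forward`, the frequency factor w0 = 30, the zero biases, and the reading of "5-layer" as four sine layers of 64 units plus a linear output are all assumptions made here.

```python
import numpy as np

def init_siren(rng, n_in=6, width=64, n_hidden=4, w0=30.0):
    """Random SIREN-style weights (hypothetical sizes; biases set to zero for brevity)."""
    sizes = [n_in] + [width] * n_hidden + [1]
    params = []
    for i, (fan_in, fan_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        # First layer: U(-1/n, 1/n); later layers: U(-sqrt(6/n)/w0, sqrt(6/n)/w0).
        bound = 1.0 / fan_in if i == 0 else np.sqrt(6.0 / fan_in) / w0
        weights = rng.uniform(-bound, bound, size=(fan_in, fan_out))
        params.append((weights, np.zeros(fan_out)))
    return params

def siren_forward(params, x, w0=30.0):
    """Map rows of x = (Qx, Qy, Qz, omega, J, Jp), each scaled to 0-1, to one scalar."""
    h = x
    for weights, bias in params[:-1]:
        h = np.sin(w0 * (h @ weights + bias))   # sinusoidal activation
    weights, bias = params[-1]
    return (h @ weights + bias)[:, 0]           # linear output layer

rng = np.random.default_rng(0)
params = init_siren(rng)
coords = rng.uniform(0.0, 1.0, size=(5, 6))     # five made-up query coordinates
prediction = siren_forward(params, coords)      # untrained, so values are arbitrary
```

Because every operation is smooth, the derivative of the output with respect to the J and J_p input columns is well defined, which is the property the gradient-based fitting relies on.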
We use a log(1 + x) transform in order to amplify weak signal and to prevent ill-conditioned behaviour around zero. Here, we note that in principle our model is written for three-dimensional Q; however, the neutron profiles in the subsequent sections do not include a Q_z component due to limited scattering. The model is trained on 1,200 LSWT simulations of S(Q_list, ω_list) over a large set of possible J, J_p and on two paths in reciprocal space, Q-path 1 and Q-path 2, which are the two lines drawn in Fig. 1b.

Once the differentiable neural implicit model is trained, it is possible to use gradient-based optimization to solve the inverse problem of determining the unknown J and J_p parameters from data. Our objective function for the optimization task measures the Pearson correlation coefficient (r) of the logarithm of the predicted and the 'true' S(Q, ω, J, J_p) values (Eq. 3). We use the logarithm to increase the weighting of weaker features in the data, and we use the correlation as the metric because the normalization factors between the experiment and simulation are unknown. Using the logarithm is favourable as it enhances the weighting of the coherent excitation at high ω and further helps evade contamination due to statistical noise in the elastic and incoherent-inelastic scattering, which arises primarily at low ω and cannot be removed by background subtraction. Using the correlation is important as we are not aiming to fully describe the spectral weights, since this would require the exact handling of all individual neutrons in the full three-dimensional Q-space instead of the averaged weight in the reduced two-dimensional Q-space. During optimization, any subset of (Q_list, ω_list) coordinates can be chosen as long as they fall along either of the paths defined in Fig. 1b. Here, we note that from an inference point of view any momentum or energy coordinates could be chosen; however, our training data only include two reciprocal space paths. To determine the Hamiltonian parameters, J and J_p are treated as free parameters in the optimization problem. The objective in Eq. 3 is optimized using the Adam optimizer [36], a commonly used gradient-based optimization algorithm, exploiting the automatic differentiation capabilities in TensorFlow [37] to calculate dL/dJ and dL/dJ_p. See Methods for further details.
In our method, it is not necessary to use all sets of Q_list, ω_list along both paths to perform the fitting. Instead, random batches of coordinates (Q_batch, ω_batch) can be queried at each optimization iteration in order to improve computational efficiency and converge to a better minimum, in a manner similar to the regularization effects of stochastic gradient descent [38]. Pseudo-code for the optimization procedure is provided in Algorithm 1.

Algorithm 1 Differentiable Surrogate Optimization
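The procedure described in the text (random coordinate batches, a 1 − r objective on log(1 + S), Adam updates of J and J_p, and several restarts) can be sketched as follows. This is a toy illustration under loud assumptions: `surrogate` is a made-up Gaussian ridge standing in for the trained SIREN (it is not LSWT), central finite differences replace automatic differentiation, Adam is written out by hand, and all names, ranges and step counts are hypothetical.

```python
import numpy as np

def surrogate(q, w, J, Jp, sigma=0.2):
    # Hypothetical stand-in for the trained network: a Gaussian ridge along a
    # made-up dispersion in normalized units (NOT linear spin wave theory).
    energy = J * np.sin(np.pi * q) + Jp * np.sin(2.0 * np.pi * q)
    return np.exp(-0.5 * ((w - energy) / sigma) ** 2)

def objective(params, q, w, s_meas):
    # 1 - Pearson r between log(1 + S) of the surrogate and of the data.
    a = np.log1p(surrogate(q, w, params[0], params[1]))
    b = np.log1p(s_meas)
    a = a - a.mean()
    b = b - b.mean()
    return 1.0 - (a @ b) / np.sqrt((a @ a) * (b @ b))

def gradient(params, q, w, s_meas, eps=1e-4):
    # Central finite differences (assumption: used here instead of autodiff).
    g = np.zeros_like(params)
    for k in range(params.size):
        d = np.zeros_like(params)
        d[k] = eps
        g[k] = (objective(params + d, q, w, s_meas)
                - objective(params - d, q, w, s_meas)) / (2.0 * eps)
    return g

def fit(q, w, s_meas, start, rng, n_steps=500, batch=512, lr=0.01,
        beta1=0.9, beta2=0.999):
    p = np.array(start, dtype=float)
    m = np.zeros_like(p)
    v = np.zeros_like(p)
    for t in range(1, n_steps + 1):
        idx = rng.choice(q.size, size=batch, replace=False)  # random (Q, w) batch
        g = gradient(p, q[idx], w[idx], s_meas[idx])
        m = beta1 * m + (1.0 - beta1) * g                    # Adam first moment
        v = beta2 * v + (1.0 - beta2) * g * g                # Adam second moment
        p -= lr * (m / (1.0 - beta1 ** t)) / (np.sqrt(v / (1.0 - beta2 ** t)) + 1e-8)
    return p, objective(p, q, w, s_meas)

rng = np.random.default_rng(0)
qq, ww = np.meshgrid(np.linspace(0.0, 1.0, 40), np.linspace(0.0, 1.2, 60))
q, w = qq.ravel(), ww.ravel()
true_params = np.array([0.6, 0.1])
s_meas = surrogate(q, w, *true_params)           # synthetic, noise-free "measurement"

# Several random restarts; keep the run with the lowest final objective.
starts = rng.uniform([0.3, -0.2], [0.9, 0.3], size=(4, 2))
runs = [fit(q, w, s_meas, start, rng) for start in starts]
best_params, best_loss = min(runs, key=lambda run: run[1])
```

With a real trained network, the finite-difference `gradient` would be replaced by the framework's automatic differentiation, and `s_meas` by background-subtracted experimental intensities.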
Although our approach uses a neural network forward model and differentiable optimization, we note that in many machine learning studies the default approach is to train an inverse model on simulated data, that is, a model which takes raw data and directly predicts the unknown parameters. In our case, this would be a model of the form S(Q_list, ω_list) → (J, J_p). Here, for example, S(Q_list, ω_list) could be represented as an image and the prediction task could use standard convolutional neural network architectures [39]. This is similar to the work in Refs. [18,19], which train inverse models to predict the spin model class from experimental data. While such approaches can often work well on simulated test data, inference is often much more challenging for experimental data and often requires detailed modelling and corrective dataset augmentations of experimental effects, accounting for attributes such as background noise, missing data, and matching instrumental profiles [40,41]. Specifically, from an inverse modeling standpoint, the experimental data are generally outside the distribution used to train the model. Here, generative approaches, such as the recently proposed cycle-GAN model in Ref. [19], offer an elegant alternative to minimize the deviation between experimental and simulated data. In cases where the predicted outcomes have continuous values (i.e. regression tasks), inverse prediction pipelines can be even less robust to data distribution differences as the model may have to learn more subtle features. For this reason, instead of choosing an inverse modelling approach, we opted for a SIREN neural forward model with differentiable optimization. For this approach, the machine learning model only makes predictions in-distribution and the experimental non-idealities are considered only in the optimization step. Here, we note that the generative approach in Ref. [19] could likely augment our method by using the "translated" experimental data directly in the loss function.

RESULTS AND DISCUSSION
We first characterized the performance of the machine learning framework on simulated SpinW data in order to demonstrate the viability of using a neural implicit model as a surrogate for the LSWT simulator. Fig. 2 shows the LSWT simulation and machine learning "simulation" with input parameters of J = 45.57 meV and J_p = 2.45 meV. Note that in this example, the machine learning framework was fed (J, J_p) directly (instead of obtaining these parameters using gradient descent through the neural representation). Evidently, the machine learning prediction and LSWT simulation are almost indistinguishable, highlighting the ability of the neural representation to mimic the theoretical calculation.
Although our model can clearly approximate simulated data well, the main motivation of this approach is to provide a tool that can reliably extract the spin Hamiltonian parameters of interest from real, experimental data. For this reason, we applied our method to measured inelastic neutron scattering data, after automatic background subtraction, on the quasi-2D Néel antiferromagnet La2NiO4 collected at the Spallation Neutron Source [29]. Though a full 3D dataset was collected, we chose two paths in Q-space to simulate spectra for the model training prior to any inclusion of the real data. After the model had been trained on both simulated paths, we used gradient-based optimization to solve the inverse problem of determining J and J_p from the data. Here, we note that the optimization for both experimental paths was performed jointly and therefore the fit parameters are the same for both. We found that our approach yields excellent predictions, both qualitatively and quantitatively, relative to the results of a detailed and expensive analytical fit, as shown in Fig. 3a and b. The analytical parameters in the LSWT limit adapted from Petsch et al. [28] are J = 29.00(8) meV and J_p = 1.67(5) meV. In addition, the experimental data show spin gaps at low energy, which are neglected here. The parameters obtained from the ML fitting are J = 29.68 meV and J_p = 1.70 meV. The small overestimation of J arises from a number of factors. Firstly, there exist small differences due to the 3-dimensionality of Q and hence to the resulting variations in the magnetic form factors and polarization factors. The 3-dimensional information is not included in the analyzed data, as they are averaged over Q_z ∈ [−10, 10] r.l.u., and thus these factors are neglected in the simulated images. In addition, the resolution function and finite lifetime are only approximations here and, further, any multi-magnon scattering is not described by LSWT. Finally, the experimentally observed energy shift by the spin gaps [28,42] is not considered, in order to minimize the number of parameters for clarity. Note that we also experimented with fitting each path independently and obtained similar predictions; for path two, this is a notable achievement since a significant section of the data is missing in the experiment (see SI2). In addition, in SI3, we provide fitting results from SpinW with algorithmic peak fitting, which yields similar results for this dataset.
In addition, since the neural implicit model is cheap to evaluate, we also constructed a loss landscape of the objective function with respect to J and J_p. We see that the objective function is well-behaved and that the gradient descent scheme finds a fit close to the analytical result (Fig. 3e).
Here, we emphasize that the only information provided to the algorithm is knowledge of a region of (Q, ω)-space from which to perform automatic background subtraction prior to fitting the data. Importantly, no peak finding or extraction is needed as the optimization objective uses the intensity of all provided voxels in the (Q, ω)-space, or pixels on the 2D intensity map, rather than magnon peak positions ω_Q. We expect that the ability to fit such complex data in real time could be readily coupled to autonomous experiment steering agents. For example, the neural implicit model can provide fast and scalable forward computations to provide sufficient sampling for accurate distribution estimations, which are essential in Bayesian experiment designs [43,44].
In real experimental settings, another critical aspect is the ability to make rapid decisions on whether sufficient statistics have been obtained for understanding the physics being measured. Because neutron scattering measurements typically have low detector count rates, this strongly influences how efficiently measurement time is used at facilities. Moreover, one would like to minimize the amount of time needed without sacrificing data quality or rendering information statistically insignificant, and ultimately to analyze magnetic excitations reliably in the most efficient way.
To probe the effectiveness of our framework for real-time fitting during an experiment, and to reduce data collection time, we used the current experimental data to generate plausible data for low-count situations. Specifically, we smooth the experimental data and use it as a probability distribution which is sampled using rejection sampling (see Methods). Here, we note that there is no "detector noise" and that the noise in the experiment comes purely from the background scattering of the data.
In Fig. 4a, we show the parameters obtained from the machine learning fitting as a function of the number of detected neutrons within the path region. Visualizations of path 1 at selected points in time are also shown in Fig. 4b. The machine learning prediction is taken as the run with the lowest objective value from 10 independent gradient descent optimizations starting from random locations in Hamiltonian parameter space. We further note that using the median prediction gives very similar results. This test demonstrates that the machine learning model quickly converges to the true solution.
The ability to continuously fit data as it is collected is very useful from an experimental point of view. Had other paths in reciprocal space been available, it would simply have required training with additional simulations, without any changes to the overall machine learning model or framework. This is because the model was built to generically accept reciprocal coordinates as input. In general, fitting this type of high-dimensional data is not compatible with having to run peak finding algorithms (which are often manually guided) for each path. Importantly, the algorithm described here should be a valuable tool for either carrying out these types of experiments faster or enabling multiple experiments to be performed. From this analysis, we demonstrate the ability of the machine learning model to be trained prior to an experiment to allow for real-time fitting and decision making.

CONCLUSIONS
We developed a tool for identifying the key parameters that describe the linear spin wave spectrum. Besides LSWT, our machine learning algorithm can be combined with more complex and time-consuming models derived from, e.g., exact diagonalization and density-matrix renormalization group methods. It enables real-time fitting of inelastic neutron and x-ray scattering data, bypassing the need for complex peak fitting algorithms or user-intensive post-processing, and allows for the incorporation of entirely three-dimensional data in reciprocal space. Furthermore, the ability to fit data continuously throughout an experiment will be useful for determining an optimal stopping point for data collection as well as for guiding experiments. The approach opens up new opportunities which can significantly improve the ability to analyze scattering from excitations in ordered quantum systems.

Sample Preparation and Data Collection
In the experiment, a 21 g single crystal of the quasi-2D Néel antiferromagnet La2NiO4+δ (P4_2/ncm with a = b = 5.50 Å, c = 12.55 Å), grown by the floating-zone technique, was utilized. The presented time-of-flight neutron spectroscopy data were collected on the SEQUOIA instrument at the Spallation Neutron Source at the Oak Ridge National Laboratory [29] with an incident neutron energy of 190 meV, the high-flux Fermi chopper spun at 300 Hz, and a sample temperature of 6 K. The data are integrated over the out-of-plane momentum Q_z ∈ [−10, 10] r.l.u. The lattice can be approximated by I4/mmm with a = b ≈ 3.89 Å. Q_x and Q_y for I4/mmm are equivalent to Q_x and Q_y in the square-lattice notation. For more details see Ref. [28].

SpinW Simulation and Fitting
The two momentum paths used for S(Q, ω) simulation are Q_list1 and Q_list2 = [−0.07 0.03 0] + Q_list1, respectively. The spin wave spectra were calculated using the SpinW software [12]. 600 simulations were performed for each path (1,200 total), corresponding to randomly sampling J and J_p in the ranges [20, 75] and [−30, 10] meV. The lower limit for J and upper limit for J_p are chosen such that the ground state remains the Néel state, which is satisfied in LSWT for J > 2J_p and J > 0. For each location in Q, the corresponding energies from 0 to 200 meV were obtained. The quantum fluctuation renormalization factor Z_c is set to 1.09 [28,45,46]. After simulation, the data were convolved with an energy-dependent kernel based on the beamline instrument profile. For this procedure, a built-in tool from SEQUOIA was used to give a polynomial fit for the dependence of the resolution (FWHM) in meV on the energy transfer (ω) in meV: FWHM = 1.4858 × 10^−7 ω^3 + 1.2873 × 10^−4 ω^2 − 0.084492 ω + 14.324 [29]. In addition, the data were broadened with a 1D Gaussian kernel (σ = 5 pixels) in Q to correct for the discrete sampling of the simulation and to partially account for the momentum resolution of the instrument.
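As a worked illustration of the resolution polynomial quoted above, the sketch below evaluates the FWHM at a given energy transfer and applies an energy-dependent Gaussian broadening to a single constant-Q cut. The function names, the choice of a Gaussian line shape, and the convention that each energy bin is spread with the width evaluated at its own energy are assumptions made for this example.

```python
import numpy as np

def fwhm_mev(energy_transfer):
    """Resolution FWHM (meV) from the polynomial in the text; input in meV."""
    e = np.asarray(energy_transfer, dtype=float)
    return 1.4858e-7 * e**3 + 1.2873e-4 * e**2 - 0.084492 * e + 14.324

def broaden_in_energy(spectrum, energies):
    """Spread each energy bin with a Gaussian whose width depends on that energy."""
    sigma = fwhm_mev(energies) / (2.0 * np.sqrt(2.0 * np.log(2.0)))  # FWHM -> sigma
    out = np.zeros_like(spectrum, dtype=float)
    for e0, s0, sig in zip(energies, spectrum, sigma):
        kernel = np.exp(-0.5 * ((energies - e0) / sig) ** 2)
        out += s0 * kernel / kernel.sum()   # normalized so intensity is conserved
    return out

energies = np.linspace(0.0, 200.0, 201)     # 1 meV bins
spectrum = np.zeros_like(energies)
spectrum[120] = 1.0                         # a single sharp mode at 120 meV
broadened = broaden_in_energy(spectrum, energies)
```

For example, the polynomial gives a FWHM of 14.324 meV at zero energy transfer and about 7.31 meV at 100 meV, so the resolution sharpens toward higher energy transfer over the simulated 0 to 200 meV range.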
The SpinW-based spin wave spectrum fitting was implemented using its built-in function. The inputs are peak information extracted from experimental spin-wave dispersion data. The R value is optimized using a particle swarm algorithm to find the global minimum, with R defined as

$$R = \sqrt{\frac{1}{n_E} \sum_{i,q} \frac{\left(E^{\mathrm{sim}}_{i,q} - E^{\mathrm{meas}}_{i,q}\right)^2}{\sigma_{i,q}^2}},$$

where (i, q) index the spin-wave mode and momentum, respectively. E^sim and E^meas are the simulated and measured spin wave energies, σ is the standard deviation of the measured spin-wave energy determined previously by fitting the inelastic peak, and n_E is the number of energies to fit.

Surrogate Model Training
A 5-layer SIREN neural network (Fig. 1c) was trained on 1,000 simulations of (S(Q, ω), J, J_p) tuples; 200 simulations were left aside for validation and testing. Here, ω ∈ [0, 200] meV, J ∈ [20, 75] meV and J_p ∈ [−30, 10] meV were normalized to 0-1 in order for all the parameters to be on approximately the same scale. The model was trained to predict log(1 + S(Q, ω, J, J_p)) by optimizing the mean-squared-error objective L between the prediction and the label with respect to the network parameters. During training, the following hyper-parameters and settings were used: Adaptive Moment Estimation (ADAM) algorithm for optimization (β_1 = 0.9, β_2 = 0.999) [36], batch size = 2,048, learning rate = 0.001. The learning rate was exponentially decayed by a factor of exp(−0.1) for every epoch after the first ten epochs. We used NVIDIA A100 GPU hardware with the Keras API [47] and the model was trained for 50 epochs.
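The input scaling and learning-rate schedule described above can be written down compactly. This is a sketch rather than the training script: the helper names are hypothetical, epochs are assumed to be counted from zero, and the decay is assumed to start at the eleventh epoch.

```python
import math

# Parameter ranges quoted in the text, in meV.
RANGES = {"omega": (0.0, 200.0), "J": (20.0, 75.0), "Jp": (-30.0, 10.0)}

def normalize(name, value):
    """Min-max scale a quantity to the 0-1 interval using its quoted range."""
    lo, hi = RANGES[name]
    return (value - lo) / (hi - lo)

def learning_rate(epoch, base_lr=1e-3, hold_epochs=10, decay=0.1):
    """Constant for the first `hold_epochs` epochs, then x exp(-decay) per epoch."""
    n_decays = max(0, epoch - hold_epochs + 1)   # epoch index starts at 0 (assumption)
    return base_lr * math.exp(-decay * n_decays)

schedule = [learning_rate(epoch) for epoch in range(50)]  # 50 training epochs
```

Under these assumptions the rate stays at 0.001 for epochs 0 to 9 and has fallen by a factor of exp(−4) by the final epoch.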

Machine Learning Parameter Extraction
Prior to differentiable optimization, the experimental data were automatically background subtracted using the following procedure. First, a region of (Q_list, ω_list) space was chosen for each slice (pixel locations 160-170 on the Q-axis) and averaged across Q_list to yield a one-dimensional energy profile. This procedure was chosen based on prior assumptions on the isotropic nature of the scattering and the Néel ground state. Next, the one-dimensional energy profiles were smoothed using a Savitzky-Golay filter (window size = 51, polynomial order = 3) and used for background subtraction.
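The background-subtraction steps above can be sketched with SciPy's `savgol_filter`. The array layout (energy along rows, Q along columns), the half-open pixel range, and the function name are assumptions of this example; the synthetic check uses a made-up cubic background with no signal.

```python
import numpy as np
from scipy.signal import savgol_filter

def subtract_background(data, q_lo=160, q_hi=170, window=51, polyorder=3):
    """data has shape (n_energy, n_q); returns the background-subtracted slice."""
    profile = data[:, q_lo:q_hi].mean(axis=1)          # 1D energy profile over chosen Q pixels
    background = savgol_filter(profile, window, polyorder)  # smoothed background estimate
    return data - background[:, None]                  # subtract it from every Q column

# Synthetic check: a smooth, Q-independent background and no magnetic signal.
energy = np.linspace(0.0, 200.0, 200)
smooth_bg = 5.0 - 0.02 * energy + 1.0e-6 * energy**3
data = np.tile(smooth_bg[:, None], (1, 200))
residual = subtract_background(data)
```

Because a third-order Savitzky-Golay filter reproduces a cubic profile, the residual in this synthetic case is zero to numerical precision; real data would leave the dispersive signal behind.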
The unknown J, J_p parameters were recovered from data using gradient-based optimization of the neural network implicit representation. For the experimental data presented in this work, the metric (1 − r) between the measured and simulated log(1 + S(Q, ω, J, J_p)) was used as the objective function (Eq. 4); here, r refers to the Pearson correlation coefficient. No normalization was performed for scaling the simulation data relative to the experimental data. The objective L was optimized using the ADAM algorithm with respect to J and J_p, and Q_list and ω_list were randomly sampled from the list of paths containing the experimental data. Here, a batch size of 4,096 was used for the (Q_list, ω_list) sampling, with 2,000 Adam optimization steps and a learning rate of 0.005.

Low count data generation and fitting
High-count data for each slice (without background subtraction) were smoothed using a 3x3 Gaussian convolutional kernel. The resultant images were each normalized to (0, 1) using the total intensity. Each slice was treated as a probability distribution which was sampled using Monte-Carlo rejection sampling. This process was used to create a series of datasets with neutron counts in the range 1×10^4 to 9×10^6. Each dataset was individually and automatically background subtracted by the previously described method and fit ten times from random starting locations in (J, J_p) using the machine learning optimization procedure. Note that the corresponding low-count data were used in order to perform the automated background subtraction.
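A minimal sketch of the low-count data generation is given below: a slice is smoothed, treated as an unnormalized probability map, and sampled by rejection until a requested number of counts is reached. The binomial 3x3 kernel weights, edge padding, scaling by the maximum for the acceptance test, and all function names are assumptions of this illustration.

```python
import numpy as np

def gaussian_smooth_3x3(image):
    """Smooth with a 3x3 binomial approximation to a Gaussian (edge-padded)."""
    k1 = np.array([1.0, 2.0, 1.0]) / 4.0
    kernel = np.outer(k1, k1)
    padded = np.pad(image, 1, mode="edge")
    rows, cols = image.shape
    out = np.zeros((rows, cols), dtype=float)
    for di in range(3):
        for dj in range(3):
            out += kernel[di, dj] * padded[di:di + rows, dj:dj + cols]
    return out

def sample_counts(image, n_counts, rng, chunk=100_000):
    """Rejection-sample n_counts detector events from the smoothed image."""
    prob = np.clip(gaussian_smooth_3x3(image), 0.0, None)
    flat = (prob / prob.max()).ravel()               # acceptance probability per pixel
    counts = np.zeros(flat.size, dtype=np.int64)
    remaining = n_counts
    while remaining > 0:
        pixels = rng.integers(0, flat.size, size=chunk)       # uniform proposals
        accepted = pixels[rng.random(chunk) < flat[pixels]]   # accept with prob. flat
        accepted = accepted[:remaining]
        counts += np.bincount(accepted, minlength=flat.size)
        remaining -= accepted.size
    return counts.reshape(image.shape)

rng = np.random.default_rng(0)
high_count = np.zeros((20, 20))
high_count[5:10, 5:10] = 1.0                          # made-up bright region
low_count = sample_counts(high_count, 5000, rng)
```

Repeating the call with increasing `n_counts` yields a sequence of progressively better-resolved slices that can each be background subtracted and fitted.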

SI2: Missing Region Interpolation
In the raw data for path 2, a region is missing. However, since the machine learning model is a forward surrogate for the LSWT simulation, it is able to make predictions even where there are no data. Note that this is a significant advantage over the inverse modelling approach.
For the displayed path, no data are available in the neutron dataset used here for the missing region of Q-space. However, data are available for (Q_x, Q_y) → (Q_y, Q_x), i.e. for the 2D momentum rotated by 45° of the missing path region. As this compound is assumed to be fully twinned, S(Q_x, Q_y, ω) = S(Q_y, Q_x, ω), and thus the missing region in Q can be substituted by the equivalent data with Q_x and Q_y exchanged. The data along the full path with the missing region substituted are depicted in comparison with the result predicted by our forward model. Note that in this case the prediction for path 2 only uses the data from path 2. No information from path 1 is utilized in the fitting.
This section shows the SpinW fitting results for the (J, J_p) parameters when both path 1 and path 2 are used for the fitting (Fig. 3a and b). SpinW fitting was carried out to provide a benchmark for the ML results: (J, J_p) = (29.15, 1.55) meV (the result is the median of five fits with 100 maximum iterations each; the resulting fit for path 1 is shown in Fig. 7). Relative to the ML method, SpinW fitting does not utilize all the available pixel information and instead requires additional peak finding and peak fitting steps. When the spin Hamiltonian has more parameters, or when there are multiple eigenmodes in the energy region of interest, our ML algorithm may show a greater advantage in both time and accuracy.

FIG. 1. Overview of machine learning pipeline, model Hamiltonian and reciprocal space paths. (a) Ni4O4 square-lattice plaquette in La2NiO4. J and J_p are the first- and second-nearest-neighbor interactions. (b) The Brillouin zone for the spin-1 square-lattice magnetic structure. Selected high-symmetry points are indicated. The two momentum paths are denoted by the purple and orange lines, respectively. (c) Visualization of the SIREN neural network for predicting the scalar dynamical structure factor intensity. All nodes exhibit a fully-connected architecture. The notations 64 × 3 and 64 × 1 represent three and one neural network layers with 64 neurons each. (d) Visualization of the distribution of training, test, and validation data in J-J_p space. (e) Synthetic S(Q, ω) predictions from the SIREN model along the corresponding trajectory drawn in (d). Grid lines correspond to [0, 50, 100, 150, 200] and [P, M, X, P, Γ, X] for the energy and wave vector, respectively.

FIG. 2. Visualization of ground-truth linear spin wave theory (LSWT) simulation and corresponding machine learning forward model. Example of ground-truth simulated S(Q, ω) calculated using the SpinW software program (a) and corresponding machine learning forward model prediction (b) given parameters J = 45.57 meV, J_p = 2.45 meV.

FIG. 3. Machine learning forward model prediction and gradient-based optimization. (a) and (b) show experimental data after automated background subtraction. The color bars reflect S(Q, ω) in units of mbarn sr^−1 meV^−1 f.u.^−1. (c) and (d) show corresponding machine learning predictions for both paths. The predicted profiles are in close visual agreement with the experimental data. Deviations at low ω are due to the neglect of the anisotropy spin gaps in our model. (e) Visualization of the loss landscape for objective fitting in Hamiltonian parameter space (J, J_p).

FIG. 4. (a) Machine learning prediction for J and J_p as a function of detector count, with square-root scaling of the 'Detector Counts' axis in regard to Poisson statistics. Note that the machine learning prediction converges much earlier than the count time recorded in the experiment. (b) Visualization of plausible low-count data (without algorithmic background subtraction) with total detector counts of 16,173, 57,237, and 326,952 (top to bottom).

FIG. 6. Machine learning forward model accurately predicts scattering profile for missing region. (a) Experimental data with missing region filled in by exchanging Q_x and Q_y. (b) ML prediction using only experimental data from path 2 with missing region. Evidently, the ML prediction closely models the true experimental data.