Capturing dynamical correlations using implicit neural representations

Chitturi, Sathya R.; Ji, Zhurun; Petsch, Alexander N.; Peng, Cheng; Chen, Zhantao; Plumley, Rajan; Dunne, Mike; Mardanya, Sougata; Chowdhury, Sugata; Chen, Hongwei; Bansil, Arun; Feiguin, Adrian; Kolesnikov, Alexander I.; Prabhakaran, Dharmalingam; Hayden, Stephen M.; Ratner, Daniel; Jia, Chunjing; Nashed, Youssef; Turner, Joshua J.

doi:10.1038/s41467-023-41378-4

Article
Open access
Published: 20 September 2023

Nature Communications volume 14, Article number: 5852 (2023)

Abstract

Understanding the nature and origin of collective excitations in materials is of fundamental importance for unraveling the underlying physics of a many-body system. Excitation spectra are usually obtained by measuring the dynamical structure factor, S(Q, ω), using inelastic neutron or x-ray scattering techniques and are analyzed by comparing the experimental results against calculated predictions. We introduce a data-driven analysis tool which leverages ‘neural implicit representations’ that are specifically tailored for handling spectrographic measurements and are able to efficiently obtain unknown parameters from experimental data via automatic differentiation. In this work, we employ linear spin wave theory simulations to train a machine learning platform, enabling precise exchange parameter extraction from inelastic neutron scattering data on the square-lattice spin-1 antiferromagnet La₂NiO₄, showcasing a viable pathway towards automatic refinement of advanced models for ordered magnetic systems.

Introduction

Quantum matter, as featured by the existence of macroscopic orders from microscopic spin and/or charge arrangements or other phases with spontaneous symmetry breaking, represents an abundant and complex class of materials in condensed matter physics. For example, the magnetic configuration of a material and its dynamics are often driven by competing effects of multiple interactions as well as crystalline symmetries. The collective spin excitations in most magnetic materials, such as spin waves or magnons, act as probes of those interactions. The associated dispersion relations and correlations are key for developing potential applications, which include next-generation spintronics devices, as well as new strategies for carrying, transferring, and storing information^1,2,3.

A primary aim of the last few decades has been to characterize wide classes of excitations, and this has been facilitated by advances in spectroscopic techniques, such as neutron scattering^4,5,6,7. These techniques use the kinematics of scattered neutrons to obtain dispersion relations, lifetimes, and amplitudes of spin excitations. Neutron scattering studies are, however, challenging due to the paucity of available neutron sources, low neutron flux compared to other sources, and small neutron scattering cross sections. As a result, the question of how the efficiency of neutron experiments could be enhanced is drawing considerable interest in the field^8,9. Notably, the interpretation of neutron scattering spectra can be challenging and time-consuming due to the complex nature of the physical processes involved, the diversity of samples, and the limited knowledge often provided by theoretical modeling. It is clear that there is an urgent need for collaboration among experiment, theory, and data science to accelerate the understanding of spin-related properties of materials¹⁰.

As the rates of data collection continue to increase rapidly, especially with the advent of next-generation X-ray free electron laser facilities and the ability to collect hyper-dimensional datasets, it is important to develop techniques for real-time modeling and analysis of experimental spectra. The ability to perform ‘on-the-fly’ fitting¹¹ would enable efficient use of expensive beamtime by ascertaining when sufficient data has been collected, as well as by coupling to adaptive sampling methods to gain the most information about parameters of interest with the least number of measurements. Currently, real-time fitting for neutron scattering data can require substantial preparation. For example, direct fitting with the software package SpinW¹² requires the extraction of the eigenmodes of the system and therefore, needs an accurate, and preferably automatic, peak extraction algorithm. When the chosen paths in reciprocal space are numerous or the dispersion relations change significantly along those paths, this can involve significant human guidance and monitoring. In addition, fitting directly with SpinW does not take into consideration the magnon peak intensities or their shapes. Approaches to fit peak intensities and shapes directly, such as Multi- or Tobyfit implemented in HORACE¹³, are possible alternatives – however, these fitting procedures still require significant human guidance and are either slow and therefore incompatible with data acquisition rates or else they require an analytical, rapidly calculable spin wave model. Finding such a model is usually only feasible for simpler systems with minimal magnetic frustration or a low number of magnetically distinct sites.

Machine learning methods have recently been utilized in the analysis of x-ray and neutron scattering measurements to improve the accuracy and speed of data interpretation^10,14. Convolutional neural networks, trained on linear spin wave simulations, have been applied to inelastic neutron scattering measurements to discriminate between two plausible magnetic exchange models^14,15. As these models are typically trained on simulated profiles, achieving robust prediction generally requires detailed modeling and corrective dataset augmentations of experimental effects accounting for attributes such as background noise, missing data, and matching instrumental profiles^15,16,17. Recently, a cycleGAN approach, which makes experimental data look like simulated data, has been proposed as a way to improve model robustness¹⁶. In cases where the desired observables have continuous values, such approaches are often highly sensitive to background noise and other effects¹⁸. To predict continuous Hamiltonian parameters from static and inelastic neutron scattering data, previous approaches have utilized a combination of an autoencoder neural network, used for data compression, and a generative model, used for forward prediction^14,19,20,21. This pipeline has been shown to return excellent results on fully collected data but has not previously been applied to the setting of on-the-fly parameter extraction.

Prior machine learning efforts in the neutron scattering community have relied on traditional image-based data representations. A promising direction in this field can be capitalized on with the introduction of a new paradigm of data modeling based on neural implicit representations^22,23. Such models are often described as coordinate networks as they take a coordinate as input and typically output a single scalar or a small set of scalars. In computational imaging applications, these networks learn mappings from pixel coordinates (i, j) to an RGB value representing the color of that pixel. The coordinate-based representation encodes the image implicitly through a set of trainable weights and can be used to make predictions at sub-pixel scales. These models have been shown to be able to accurately capture high-frequency features in images and scenes and have been particularly successful at tasks such as 3D-shape representation and reconstruction. Furthermore, gradients and higher-order derivatives of the implicit representation can be readily calculated and used for solving inverse-problems^22,24,25,26.

In this work, we develop a neural implicit representation for the dynamical structure factor, S(Q, ω), as a function of energy transfer (ℏω), momentum transfer (Q), and Hamiltonian parameter coordinates. The dynamical structure factor is a general function measured in many inelastic x-ray and neutron experiments and is related to different correlation functions of the probed order, see “Methods” section for further details. To demonstrate the versatility of our method, we report the results using a series of calculations based on mean field theory through the linear spin wave theory (LSWT) framework²⁷. We simulate LSWT spectra for a spin-1 square-lattice Heisenberg model Hamiltonian over a large phase space of Hamiltonian parameters and use it to train a neural implicit representation. The model is applied to experimental time-of-flight neutron spectroscopy data²⁸ taken on the quasi-2D Néel antiferromagnet La₂NiO₄, and leverages a GPU-based optimization procedure to return the Hamiltonian parameters that represent the system under study. In particular, the method does not rely on peak fitting algorithms and performs well under low signal-to-noise ratio scenarios. To gain further insight, we use a Monte-Carlo simulation of the experimental data collection process to demonstrate the potential of our approach for continuous in-situ analysis to provide guidance on when an adequate amount of data has been collected to conclude the experiment. Collectively, these findings pave the way for conducting scattering experiments in a streamlined and efficient manner, and open exciting new avenues to swiftly unravel the parameterization of underlying dynamical models.

Results

Neural implicit representation modeling

Our machine learning framework is based on the concept of implicit neural representations which are machine learning models that can be used to store images (or hypervolumes) via trainable network parameters. Accordingly, we develop a neural implicit representation for the dynamical structure factor across different model Hamiltonian parameters. Our Hamiltonian, which corresponds to an extended nearest-neighbor Heisenberg model, is given by Eq. (1)^7,29.

$$\mathcal{H} = J \sum_{\langle i,j\rangle} \hat{\mathbf{S}}_{i}\cdot \hat{\mathbf{S}}_{j} + J_{p} \sum_{\langle i,j'\rangle} \hat{\mathbf{S}}_{i}\cdot \hat{\mathbf{S}}_{j'},$$

(1)

As depicted in Fig. 1a, J and J_p are the first- and second-nearest-neighbor Heisenberg exchange coupling parameters on a square lattice. Thus, for Q_x and Q_y, a square-lattice notation is utilized with a and b corresponding to the vectors connecting the first nearest neighbors or opposing edges of the square, respectively.
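For orientation, the magnon dispersion implied by Eq. (1) can be sketched in closed form. The snippet below is an illustrative sketch and not the SpinW calculation used in this work: it assumes the standard LSWT result for a Néel state on a square lattice, ℏω_Q = 4 S Z_c √(A² − B²) with A = J − J_p(1 − cos(2πh) cos(2πk)) and B = J(cos(2πh) + cos(2πk))/2, and it assumes that Z_c enters as an overall multiplicative factor. The function name `lswt_dispersion` is hypothetical.

```python
import numpy as np

def lswt_dispersion(h, k, J, Jp, S=1.0, Zc=1.09):
    """Magnon energy (meV) at (h, k) in r.l.u. for a square-lattice Neel state.

    Assumed closed form: E = 4 * S * Zc * sqrt(A^2 - B^2).
    """
    nu_h = np.cos(2.0 * np.pi * np.asarray(h, dtype=float))
    nu_k = np.cos(2.0 * np.pi * np.asarray(k, dtype=float))
    A = J - Jp * (1.0 - nu_h * nu_k)   # same-sublattice (diagonal) term
    B = 0.5 * J * (nu_h + nu_k)        # opposite-sublattice (pairing) term
    # Clip tiny negative values produced by floating-point rounding.
    return 4.0 * S * Zc * np.sqrt(np.maximum(A**2 - B**2, 0.0))

# Parameters quoted later in the text for the analytical fit (meV).
J, Jp = 29.00, 1.67
E_gamma = lswt_dispersion(1.0, 0.0, J, Jp)    # Gamma = [1 0 0]
E_M = lswt_dispersion(0.5, 0.5, J, Jp)        # M = [1/2 1/2 0]
E_X = lswt_dispersion(0.5, 0.0, J, Jp)        # X = [1/2 0 0]
E_P = lswt_dispersion(0.75, 0.25, J, Jp)      # P = [3/4 1/4 0]
```

Under these assumptions the mode is gapless at Γ and M, and the zone-boundary energies at X and P reduce to 4 S Z_c (J − 2J_p) and 4 S Z_c (J − J_p), so the difference between the two is a direct measure of J_p.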

**Fig. 1: Overview of machine learning pipeline, model Hamiltonian and reciprocal space paths.**

The specific implicit neural representation presented in this work is a Sinusoidal Representation Network (SIREN)²², which is a fully-connected neural network³⁰ with sinusoidal activation functions that accepts coordinates as input. Our SIREN model is trained to approximate the scalar function $\log(1+S(\mathbf{Q},\, \omega,\, J,\, J_{p}))\in \mathbb{R}_{+}^{1}$ (a real positive number {x ∈ R∣x > 0}), which is a logarithmic transformation of the dynamical structure factor evaluated at a specific $\mathbf{Q}\in \mathbb{R}^{3}$ (3-dimensional reciprocal lattice vector in reciprocal lattice units (r.l.u.)), $\hbar\omega \in \mathbb{R}^{1}$ (energy transfer in units of meV) and $J,\, J_{p}\in \mathbb{R}^{1}$ (specific Hamiltonian coupling parameters in units of meV). We use the logarithm to increase the weighting of weaker features in the data and add one to prevent ill-conditioned behavior around zero.

The functional form of the SIREN neural network, denoted as Φ, involves applying a series of matrix multiplications, vector additions and sinusoidal operations to the coordinate vector $\left[\mathbf{Q},\, \omega,\, J,\, J_p\right]^{\intercal}\in \mathbb{R}^{6}$ (Eq. (2)).

$$\begin{aligned} h_0 &= \sin\left(W_{0} \left[\mathbf{Q},\, \omega,\, J,\, J_p\right]^{\intercal} + b_0\right) \\ h_{i} &= \sin\left(W_i h_{i-1} + b_i\right) \quad \text{with } i\in\{1,\, 2,\, 3\} \\ \Phi &= W_{4} h_{3} + b_{4} \end{aligned}$$

(2)

Here, $b_{0},\, b_{1},\, b_{2},\, b_{3}\in \mathbb{R}^{64}$, $b_{4}\in \mathbb{R}^{1}$, $W_{0}\in \mathbb{R}^{64\times 6}$, $W_{1},\, W_{2},\, W_{3}\in \mathbb{R}^{64\times 64}$ and $W_{4}\in \mathbb{R}^{1\times 64}$ are vectors and matrices, respectively, that are learned during the training process to ensure that Φ(Q, ω, J, J_p) mimics $\log(1+S(\mathbf{Q},\, \omega,\, J,\, J_{p}))$ as closely as possible. Graphically, W₀, W₁, W₂, and W₃ correspond to the weights between the first four layers of the network which are transformed by applying the sine function in an element-wise manner. W₄ represents the weights for the final layer for which only a linear function is applied. This specific neural architecture is also illustrated in Fig. 1c.
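The forward pass of Eq. (2) can be written in a few lines. The following is a minimal NumPy sketch with untrained random parameters, intended only to make the layer shapes concrete. The initialization scheme and the helper names `init_siren` and `siren_forward` are assumptions and not taken from this work, and b₀ is taken to have length 64 so that it matches the rows of W₀.

```python
import numpy as np

def init_siren(rng, n_in=6, width=64, n_sine_layers=4):
    """Random (W, b) pairs with the layer shapes of Eq. (2).

    The uniform initialization is illustrative only (an assumption).
    """
    sizes = [n_in] + [width] * n_sine_layers + [1]
    params = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        bound = np.sqrt(6.0 / fan_in)
        W = rng.uniform(-bound, bound, size=(fan_out, fan_in))
        b = np.zeros(fan_out)
        params.append((W, b))
    return params

def siren_forward(params, coords):
    """Evaluate Phi for coords of shape (n_points, 6) = [Qx, Qy, Qz, w, J, Jp]."""
    h = np.asarray(coords, dtype=float)
    for W, b in params[:-1]:
        h = np.sin(h @ W.T + b)            # sine layers h_0 ... h_3
    W_out, b_out = params[-1]
    return (h @ W_out.T + b_out)[:, 0]     # final linear layer, one scalar per point

rng = np.random.default_rng(0)
params = init_siren(rng)
coords = rng.uniform(0.0, 1.0, size=(5, 6))   # five normalized coordinate vectors
phi = siren_forward(params, coords)
```

Because every operation is a matrix product, an addition or a sine, the output is differentiable with respect to any input coordinate, including J and J_p.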

We note that although our model is written for three-dimensional Q, the neutron profiles used in the following sections do not include a Q_z component due to limited sample orientations. The model is trained on 1200 LSWT simulations of S(Q_list, ω_list) over a large set of possible J and J_p values and on two paths in reciprocal space (Fig. 1b). Q-path 1 and 2 are denoted as P → M → X → P → Γ → X and P1 → M1 → X1 → P1 → Γ1 → X1 which correspond to $\mathbf{Q}_{\rm path1}=\left\{\left[\frac{3}{4}\,\frac{1}{4}\,0\right],\left[\frac{1}{2}\,\frac{1}{2}\,0\right],\left[\frac{1}{2}\,0\,0\right],\left[\frac{3}{4}\,\frac{1}{4}\,0\right],\left[1\,0\,0\right],\left[\frac{1}{2}\,0\,0\right]\right\}$ and $\mathbf{Q}_{\rm path2}=\left[-0.07\,0.03\,0\right]+\mathbf{Q}_{\rm path1}$. Here, $\mathbf{Q}_{\rm list}\in \mathbb{R}^{N_{\mathbf{Q}}}$ and $\omega_{\rm list}\in \mathbb{R}^{N_{\omega}}$ are overloaded notation referring to a series of N_Q and N_ω points in the (Q, ω)-space, respectively.
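As a concrete illustration of how such a Q_list might be built, the sketch below linearly interpolates between the high-symmetry points of path 1 and applies the stated offset to obtain path 2. The number of points per segment and the helper name `sample_path` are assumptions made for illustration.

```python
import numpy as np

# High-symmetry points of Q-path 1 in r.l.u., as listed in the text.
PATH1_VERTICES = np.array([
    [0.75, 0.25, 0.0],  # P
    [0.50, 0.50, 0.0],  # M
    [0.50, 0.00, 0.0],  # X
    [0.75, 0.25, 0.0],  # P
    [1.00, 0.00, 0.0],  # Gamma
    [0.50, 0.00, 0.0],  # X
])
PATH2_OFFSET = np.array([-0.07, 0.03, 0.0])

def sample_path(vertices, points_per_segment=50):
    """Linearly interpolate Q-points along consecutive path segments."""
    segments = []
    for start, end in zip(vertices[:-1], vertices[1:]):
        t = np.linspace(0.0, 1.0, points_per_segment, endpoint=False)[:, None]
        segments.append(start + t * (end - start))
    segments.append(vertices[-1][None, :])   # close the path at the last vertex
    return np.vstack(segments)

q_path1 = sample_path(PATH1_VERTICES)
q_path2 = sample_path(PATH1_VERTICES + PATH2_OFFSET)
```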

Once the differentiable neural implicit model is trained, it is possible to use gradient-based optimization to solve the inverse problem of determining the unknown J and J_p parameters from data. Our objective function for the optimization task measures the Pearson correlation coefficient (r) between $\log(1+S(\mathbf{Q},\, \omega,\, J,\, J_{p}))$ and the machine learning prediction (Eq. (3)).

$$L = 1 - r\left(\log(1+S_{\rm measured}),\, \Phi(\mathbf{Q},\, \omega,\, J,\, J_{p})\right)$$

(3)

We use the correlation as the metric because the normalization factors between the experiment and simulation are here unknown. Using the logarithmic transformation is favorable as it enhances the weighting of the coherent excitation at high ℏω and further helps evade contamination due to statistical noise in the elastic and incoherent-inelastic scattering, which arises primarily at low ℏω and that cannot be removed by background subtraction. The normalization scheme is important since we are not aiming to fully describe the spectral weights, which would require the exact handling of all individual neutrons in the full three-dimensional Q-space, instead of the averaged weight in the reduced two-dimensional Q-space. During optimization, any subset of (Q_list, ω_list) coordinates can be chosen as long as they fall along either of the paths defined in Fig. 1b. Here, we note that from an inference point of view, any momentum or energy coordinates could be chosen; however, our training data only includes two reciprocal-space paths. To determine the Hamiltonian parameters, J and J_p are treated as free parameters in the optimization problem. The objective in Eq. (3) is optimized using the Adam optimizer³¹, a commonly used gradient-based optimization algorithm that exploits the automatic differentiation capabilities in TensorFlow³² to calculate $\frac{\mathrm{d}L}{\mathrm{d}J}$ and $\frac{\mathrm{d}L}{\mathrm{d}J_{p}}$, see “Methods” section for details.
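The objective of Eq. (3) is simple to state in code. The sketch below is a plain NumPy version with a hypothetical function name `pearson_loss`; it is not the TensorFlow implementation used in this work. It also illustrates the property that motivates the choice of metric: the loss is unchanged when the prediction is rescaled or shifted by constants.

```python
import numpy as np

def pearson_loss(s_measured, phi_pred):
    """L = 1 - r(log(1 + S_measured), Phi), with r the Pearson correlation."""
    x = np.log1p(np.asarray(s_measured, dtype=float)).ravel()
    y = np.asarray(phi_pred, dtype=float).ravel()
    xc = x - x.mean()
    yc = y - y.mean()
    r = np.sum(xc * yc) / np.sqrt(np.sum(xc**2) * np.sum(yc**2))
    return 1.0 - r

rng = np.random.default_rng(0)
s_measured = rng.gamma(shape=2.0, scale=1.0, size=200)   # toy non-negative intensities
phi_exact = np.log1p(s_measured)                          # a perfect prediction
loss_exact = pearson_loss(s_measured, phi_exact)
loss_rescaled = pearson_loss(s_measured, 3.0 * phi_exact + 2.0)
loss_noisy = pearson_loss(s_measured, phi_exact + rng.normal(0.0, 0.5, size=200))
```

Note that the invariance holds for affine changes of the log-transformed quantity that enters r, which is why no overall normalization constant needs to be fitted alongside J and J_p.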

In our technique, it is not necessary to use all sets of Q_list, ω_list along both paths to perform the fitting. Instead, random batches of coordinates (Q_batch, ω_batch) can be queried at each optimization iteration in order to improve computational efficiency and converge to a better minimum, in a manner similar to the regularization effects of stochastic gradient descent³³. Pseudo-code for the optimization procedure is provided in Algorithm 1.

Algorithm 1

Differentiable Neural Optimization

N ← 0

while N < MaxIter do

Q_batch, ω_batch, S_batch ~ [Q_list, ω_list, S_list]

$\log(1+S_{\rm pred})=\Phi(\mathbf{Q}_{\rm batch},\, \omega_{\rm batch},\, J,\, J_{p})$

J, J_p ← ADAM(L(S_batch, S_pred))

N ← N + 1

end while
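The loop in Algorithm 1 can be sketched end to end on synthetic data. The example below is a toy illustration under several loudly stated assumptions: the trained SIREN is replaced by a hypothetical analytic stand-in `toy_phi` (a Gaussian ridge centered on an assumed closed-form LSWT dispersion), gradients are taken by central finite differences rather than automatic differentiation, and the Adam update is written out by hand. All hyperparameters are illustrative.

```python
import numpy as np

def toy_phi(q, w, J, Jp, sigma=15.0):
    """Hypothetical stand-in for the trained network: log(1 + S) of a Gaussian ridge."""
    nu_h = np.cos(2.0 * np.pi * q[:, 0])
    nu_k = np.cos(2.0 * np.pi * q[:, 1])
    A = J - Jp * (1.0 - nu_h * nu_k)
    B = 0.5 * J * (nu_h + nu_k)
    energy = 4.0 * 1.09 * np.sqrt(np.maximum(A**2 - B**2, 0.0))
    return np.log1p(np.exp(-0.5 * ((w - energy) / sigma) ** 2))

def correlation_loss(target, pred):
    """1 - Pearson correlation between two already log-transformed arrays."""
    tc, pc = target - target.mean(), pred - pred.mean()
    return 1.0 - np.sum(tc * pc) / np.sqrt(np.sum(tc**2) * np.sum(pc**2))

def fit(q, w, target, theta0, max_iter=400, batch=512, lr=0.1, seed=0):
    """Algorithm 1 with random coordinate batches and a hand-written Adam step."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)          # theta = [J, Jp]
    m, v = np.zeros(2), np.zeros(2)
    beta1, beta2, eps, step = 0.9, 0.999, 1e-8, 1e-3
    for n in range(1, max_iter + 1):
        idx = rng.choice(w.size, size=batch, replace=False)

        def batch_loss(th):
            return correlation_loss(target[idx], toy_phi(q[idx], w[idx], th[0], th[1]))

        # Central finite differences stand in for automatic differentiation.
        grad = np.array([
            (batch_loss(theta + step * e) - batch_loss(theta - step * e)) / (2.0 * step)
            for e in np.eye(2)
        ])
        m = beta1 * m + (1.0 - beta1) * grad
        v = beta2 * v + (1.0 - beta2) * grad**2
        theta = theta - lr * (m / (1.0 - beta1**n)) / (np.sqrt(v / (1.0 - beta2**n)) + eps)
    return theta

# Synthetic "measurement" along M -> X -> P, generated with J = 29.0, Jp = 1.7 meV.
t = np.linspace(0.0, 1.0, 40)
q_line = np.concatenate([
    np.column_stack([np.full_like(t, 0.5), 0.5 * (1.0 - t)]),   # M -> X
    np.column_stack([0.5 + 0.25 * t, 0.25 * t]),                # X -> P
])
w_grid = np.linspace(0.0, 200.0, 101)
q_all = np.repeat(q_line, w_grid.size, axis=0)
w_all = np.tile(w_grid, len(q_line))
target = toy_phi(q_all, w_all, 29.0, 1.7)

theta_start = np.array([35.0, 0.0])
theta_fit = fit(q_all, w_all, target, theta_start)
loss_start = correlation_loss(target, toy_phi(q_all, w_all, *theta_start))
loss_end = correlation_loss(target, toy_phi(q_all, w_all, *theta_fit))
```

In the actual pipeline the stand-in would be replaced by the trained network and the finite differences by automatic differentiation; the batching and update structure stay the same.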

Application to La₂NiO₄

We first characterize the performance of our machine learning framework on simulated SpinW data in order to demonstrate the viability of using a neural implicit representation for the LSWT simulator. Figure 1e demonstrates the ability of our implicit model to generate new predictions for S(Q, ω) under Hamiltonian parameter ranges that lie outside the training data. Figure 2 provides a comparison between the LSWT and machine learning simulation with specific values of the input parameters (J = 45.57 meV and J_p = 2.45 meV). In this example, the machine learning framework was fed (J, J_p) directly (instead of obtaining these parameters using gradient descent through the neural representation). The machine learning prediction and the LSWT simulation are seen to be almost indistinguishable. A quantitative analysis of the difference between simulation and prediction is provided in Supplementary Fig. 2.

**Fig. 2: Comparison between linear spin wave theory simulation and machine learning prediction for a given set of parameter values (J = 45.57 meV and J_p = 2.45 meV).**

Although our model can clearly approximate simulated data well, our main motivation is to provide a tool that can reliably extract the spin Hamiltonian parameters of interest from real, experimental data. For this reason, we applied our method to the measured inelastic neutron scattering data (after an automatic background-subtraction) taken from the quasi-2D Néel antiferromagnet La₂NiO₄. Experimental data prior to background subtraction are shown in Supplementary Fig. 1. Though a full 3D dataset was collected, we chose two paths in Q-space to simulate many spectra for a range of J and J_p for the model training prior to any inclusion of real data. After the model was trained on the two simulated paths, we applied Algorithm 1 to determine J and J_p from the data. The optimization for both experimental paths was performed jointly, and therefore, the fit parameters are the same for both cases. Our approach was found to yield excellent predictions, both qualitatively and quantitatively, relative to the results of a detailed and expensive analytical fit, as shown in Fig. 3a, b. The analytical parameters in the LSWT limit, adapted from Petsch et al.²⁸, are J = 29.00(8) meV and J_p = 1.67(5) meV. The parameters obtained from our machine learning fitting are J = 29.68 meV and J_p = 1.70 meV. We also experimented with fitting each path independently and obtained similar predictions; for path 2, this is especially notable since a significant portion of the experimental data is missing in this case, see Supplementary Fig. 3. Supplementary Figure 4 provides fitting results from SpinW with algorithmic peak-fitting, which yields similar results for this dataset.

**Fig. 3: Hamiltonian parameter extraction via auto-differentiation of the neural implicit representation.**

Since our neural implicit model is computationally inexpensive to evaluate, we also constructed a loss landscape of the objective function with respect to J and J_p. The objective function is found to be well-behaved and the gradient descent scheme finds a fit close to the analytical result (Fig. 3e). We emphasize that the only information provided to the algorithm is the knowledge of a region of the (Q, ℏω)-space on which to carry out an automatic background subtraction prior to fitting the data. Importantly, no peak finding or extraction is needed as the optimization objective uses the intensity of all provided voxels in the (Q, ℏω)-space or pixels on the 2D intensity map rather than the magnon peak positions ℏω_Q.

Real-time fitting

In real experimental settings, another critical issue is the ability to make rapid decisions on whether or not sufficient data have been collected at any one time to allow for a good understanding of the physics being explored.

To probe the effectiveness of our framework for real-time fitting during an experiment, and to reduce data collection time, we used the experimental data to generate plausible data for low counting situations. Specifically, we smoothed the experimental data and used it as a probability distribution which is sampled using rejection sampling, see Methods. In a real experiment, a sample is normally measured using a series of different orientations on the spectrometer, often with varying time scales. Here, the rejection sampling simulates the La₂NiO₄ neutron scattering experiment performed in the same sample orientations but with shorter data collection times. This exercise gives insight into the viability of the approach for handling low statistics and noisy data. We note that any “detector noise” and scattering from the sample environment is negligible compared to statistical noise in the scattering from the sample. In Fig. 4a, we show the obtained parameters from the machine learning fitting as a function of the number of detected neutrons within the two path regions. Visualizations of path 1 at selected points in time are also shown in Fig. 4b. The machine learning prediction is obtained as the lowest objective value from 10 independent gradient descent optimizations starting from random locations in Hamiltonian parameter space. Using the median prediction gives very similar results. This test demonstrates that our machine learning model quickly converges to the true solution and is effective under low signal-to-noise conditions.
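The low-count simulation described above can be illustrated with a short rejection-sampling routine. This is a sketch under assumptions: the smoothed experimental map is replaced by a toy intensity array, proposals are uniform over pixels, and the function name `sample_counts` and the chunk size are hypothetical.

```python
import numpy as np

def sample_counts(intensity, n_neutrons, rng, chunk=10000):
    """Draw n_neutrons detection events from a non-negative intensity map.

    A pixel is proposed uniformly and accepted with probability
    intensity / max(intensity); rejected proposals are discarded.
    """
    intensity = np.asarray(intensity, dtype=float)
    flat = intensity.ravel()
    peak = flat.max()
    if peak <= 0.0:
        raise ValueError("intensity map must contain positive values")
    counts = np.zeros(flat.size, dtype=int)
    remaining = n_neutrons
    while remaining > 0:
        pixel = rng.integers(0, flat.size, size=chunk)    # uniform pixel proposal
        u = rng.uniform(0.0, peak, size=chunk)            # uniform height proposal
        accepted = pixel[u < flat[pixel]][:remaining]     # keep points under the map
        np.add.at(counts, accepted, 1)                    # handles repeated pixels
        remaining -= accepted.size
    return counts.reshape(intensity.shape)

# Toy smoothed map: a Gaussian ridge over (energy, Q) pixels.
rng = np.random.default_rng(0)
w = np.linspace(0.0, 200.0, 100)[:, None]
q = np.linspace(0.0, 1.0, 60)[None, :]
intensity = np.exp(-0.5 * ((w - 120.0 * np.sin(np.pi * q)) / 10.0) ** 2)
intensity[:, :5] = 0.0                                    # pretend these pixels are unmeasured
counts = sample_counts(intensity, n_neutrons=500, rng=rng)
```

Repeating the call with increasing `n_neutrons` gives a sequence of progressively better-resolved maps, which is the kind of input the fitting procedure would see as counting time accumulates.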

**Fig. 4: Real-time Hamiltonian parameter estimation using a differentiable implicit neural representation.**

Discussion

In this work, we develop a neural implicit representation customized for inelastic neutron scattering analysis and show that this model can enable precise extraction of Hamiltonian parameters and has the potential to be deployed in real-time settings to minimize required counting time.

We emphasize that our implicit modeling scheme considers data as coordinates (Q, ω, J, J_p) which is fundamentally different from the traditional image-based representations. One benefit of this approach is that the model continuously represents energy, momentum, and Hamiltonian parameters, and can therefore be used to make predictions at displaced coordinates (Q + δQ, ω + δω, J + δJ, J_p + δJ_p). This enables prediction at finer resolutions of Q and ω than those recorded on pixelized detectors or at Hamiltonian parameters not present in the training set. Additionally, since the model is a SIREN neural network, it is composed of a series of differentiable operations and is therefore amenable to automatic differentiation techniques. This is highly advantageous and allows the entire analysis pipeline to be compactly expressed by a single model that is end-to-end differentiable relative to the parameters of interest. This approach also allows for an elegant treatment of missing data. Here, missing coordinates can simply be dropped from the parameter estimation step without the need for additional model retraining or data masking.

To validate our approach, we use inelastic scattering data from La₂NiO₄ and find that our method accurately recovers unknown parameters corresponding to the assumed spin-1 Heisenberg Hamiltonian model on a square lattice. The small overestimation of J arises from several factors. First, small differences in the value of J arise from the three-dimensionality of Q and the associated variations in the magnetic form factors and polarization factors; such three-dimensional information is not included in our analysis since we only consider quantities averaged over Q_z ∈ [−10, 10] r.l.u. Second, the resolution function and finite lifetime are only approximated here, and any multi-magnon scattering is not described by LSWT. Finally, we do not include effects of the experimentally observed energy shifts resulting from the spin gaps^28,34. These issues could, however, be addressed through more comprehensive simulations within the overall modeling framework presented here.

Another area for improvement concerns the challenging task of background subtraction. For the analysis of La₂NiO₄, we were able to develop an automatic background subtraction scheme, based on human insight, to successfully suppress non-magnetic contributions which include non-magnetic coherent excitations (phonons here). However, the suppression of other contributions by this method may not always be feasible. In future work, phonon dispersion calculations, nuclear structure factors, and the Q-dependence of spectral weights could be implemented in our framework to distinguish additional coherent excitations.

Our ability to continuously fit and refine data as it is collected is important for enabling more efficient and informative experimental design. Since neutron scattering measurements typically involve low detector count rates, this is a major factor that will influence the efficiency of measurement time at facilities. Moreover, one would like to minimize the amount of time needed to complete an experiment without sacrificing data quality. We have shown our model to perform well under low signal conditions and to yield accurate Hamiltonian parameter predictions, thereby providing guidance on when best to conclude data collection. Here, stochastic gradient descent of the neural implicit model is an effective strategy to filter noise and achieve robust optimization. Note that, if other paths in reciprocal space were available, leveraging the information obtained in the additional data would have simply required training with additional simulations, without any necessary changes to the overall machine-learning model. This is an important point for real-time applications, as the flexibility of the coordinate-based representation to ingest additional data is a significant advantage over conventional analysis pipelines, which rely on manually guided peak-fitting algorithms that are not suited to this type of high-dimensional data. We note that the characterization of the framework’s effectiveness for real-time fitting only considers the case of shorter counting times across all measured sample orientations, highlighting the framework’s capability to handle sparsely distributed detection. Since such measurements usually have to be repeated, this analysis approach could be applied between repetitions to determine whether more data collection is necessary. Furthermore, additional work could involve simulating the training data with respect to sample orientations, which would be preferred when considering experimental guiding for a real, live experiment.
In general, we anticipate that our method will be readily compatible with autonomous experimental steering agents by exploiting the model’s fast and scalable forward computations which are essential in Bayesian experimental design^35,36.

Although the present contribution focused on linear spin wave simulations, the approach presented here is not restricted to a particular choice of theoretical scheme. We expect that our framework will be particularly impactful when combined with using expensive and advanced computational methods for simulating strongly correlated systems, such as exact diagonalization (ED)³⁷, density matrix renormalization group (DMRG)^38,39, determinant quantum Monte-Carlo (DQMC)^40,41, and variational Monte Carlo (VMC)^42,43 simulations.

The methodology presented here breaks the barrier of real-time fitting of inelastic neutron and x-ray scattering data, bypassing the need for complex peak-fitting algorithms or user-intensive post-processing. Our study thus opens new opportunities for significantly improved analysis of excitations in classical and quantum systems.

Methods

Sample preparation and data collection

In the experiment, a 21 g single crystal of the quasi-2D Néel antiferromagnet La₂NiO₄₊δ (P4₂/ncm with a = b = 5.50 Å and c = 12.55 Å), grown by the floating-zone technique, was utilized. The presented time-of-flight neutron spectroscopy data were collected on the SEQUOIA instrument at the Spallation Neutron Source at the Oak Ridge National Laboratory⁴⁴ with an incident neutron energy of 190 meV, the high-flux Fermi chopper spun at 300 Hz, and a sample temperature of 6 K. The data are integrated over the out-of-plane momentum Q_z ∈ [−10, 10] r.l.u. The lattice can be approximated by I4/mmm with a = b ≈ 3.89 Å. Q_x and Q_y for I4/mmm are equivalent to Q_x and Q_y in the square-lattice notation. For more details see ref. ²⁸.

SpinW simulation and fitting

In an inelastic scattering experiment, the measured quantity is the partial differential cross section which is related to the dynamical structure factor S(Q, ω) by $\frac{\mathrm{d}^{2}\sigma}{\mathrm{d}\Omega\,\mathrm{d}E_{f}}=\frac{k_{f}}{k_{i}}\,S(\mathbf{Q},\, \omega)$, where k_i and k_f are the incident and final neutron or photon wave vectors. In our simulations, the dynamical structure factor is approximated as $S(\mathbf{Q},\, \omega)\propto \sum_{m,n}\int \mathrm{d}t\, e^{-i\mathbf{Q}\cdot(\mathbf{r}_{m}-\mathbf{r}_{n})}\, e^{-i\omega t}\,\langle S_{m}(t)S_{n}(0)\rangle$, where 〈S_m(t)S_n(0)〉 represents spin-spin correlations at different atomic sites m, n. The neutron polarization factor as well as the magnetic form factor are neglected here.

The two momentum paths used for S(Q, ω) simulation are $\mathbf{Q}_{\rm list1}=\left\{\left[\frac{3}{4}\,\frac{1}{4}\,0\right],\left[\frac{1}{2}\,\frac{1}{2}\,0\right],\left[\frac{1}{2}\,0\,0\right],\left[\frac{3}{4}\,\frac{1}{4}\,0\right],\left[1\,0\,0\right],\left[\frac{1}{2}\,0\,0\right]\right\}$ and $\mathbf{Q}_{\rm list2}=\left[-0.07\,0.03\,0\right]+\mathbf{Q}_{\rm list1}$, respectively, in reciprocal lattice units. The SpinW software¹² was used to perform 600 simulations for each path (1200 total) corresponding to randomly sampling J and J_p in ranges of [20, 75] meV and [−30, 10] meV. The lower limit for J and upper limit for J_p are chosen such that the ground state remains the Néel state which is satisfied in LSWT for J > 2J_p and J > 0. For each location in Q, the corresponding energies from 0 to 200 meV were obtained. The quantum fluctuation renormalization factor Z_c is set to 1.09^28,45,46. After simulation, the data was convolved with an energy-dependent kernel based on the beamline instrument profile. For this procedure, an in-built tool from SEQUOIA was used to give a polynomial fit for the dependence of the resolution (FWHM) in meV on the energy transfer (ℏω) in meV: FWHM = 1.4858 × 10⁻⁷(ℏω)³ + 1.2873 × 10⁻⁴(ℏω)² − 0.084492ℏω + 14.324⁴⁴. In addition, the data was broadened with a 1D Gaussian kernel (σ = 5 pixels) in Q to correct for the discrete sampling of the simulation and to partially consider the momentum resolution of the instrument.
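The energy-dependent broadening step can be sketched as follows. The polynomial coefficients are those quoted above; everything else is an assumption made for illustration, in particular that the kernel is Gaussian, that its width is set by the energy of the source bin, and that each kernel is normalized on the discrete grid so that total intensity is conserved. The function names are hypothetical.

```python
import numpy as np

# Polynomial coefficients from the text, highest power first (FWHM and energy in meV).
FWHM_COEFFS = [1.4858e-7, 1.2873e-4, -0.084492, 14.324]

def fwhm_mev(energy):
    """Instrument resolution (FWHM, meV) as a function of energy transfer (meV)."""
    return np.polyval(FWHM_COEFFS, energy)

def broaden(energies, spectrum):
    """Convolve a spectrum with a Gaussian whose width varies with energy."""
    energies = np.asarray(energies, dtype=float)
    sigma = fwhm_mev(energies) / (2.0 * np.sqrt(2.0 * np.log(2.0)))  # FWHM -> sigma
    # kernel[i, j]: weight moved from source energy j to output energy i.
    delta = energies[:, None] - energies[None, :]
    kernel = np.exp(-0.5 * (delta / sigma[None, :]) ** 2)
    kernel /= kernel.sum(axis=0, keepdims=True)   # each source bin conserves its weight
    return kernel @ np.asarray(spectrum, dtype=float)

energies = np.linspace(0.0, 200.0, 201)
spectrum = np.zeros_like(energies)
spectrum[120] = 1.0                                # a sharp magnon peak at 120 meV
broadened = broaden(energies, spectrum)
```

With these coefficients the resolution narrows from about 14.3 meV at the elastic line to about 7.3 meV at ℏω = 100 meV, so a fixed-width kernel would over-broaden the high-energy part of the spectrum.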

The SpinW-based spin wave spectrum fitting was implemented using the software's built-in fitting function. The inputs are peak information extracted from the experimental spin wave dispersion data. A particle swarm algorithm is used to search for the global minimum of the R value, defined as $R=\sqrt{\frac{1}{n_{E}}\sum_{i,\mathbf{Q}}\frac{1}{\sigma_{i,\mathbf{Q}}^{2}}\left(\hbar\omega_{i,\mathbf{Q}}^{\mathrm{sim}}-\hbar\omega_{i,\mathbf{Q}}^{\mathrm{meas}}\right)^{2}}$, where (i, Q) index the spin wave mode and momentum, respectively. Here, ℏω^sim and ℏω^meas are the simulated and measured spin wave energies, σ is the standard deviation of the measured spin wave energy determined previously by fitting the inelastic peak, and n_E is the number of energies to fit.
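
As a worked illustration of the R value, the short function below evaluates the expression for flat lists of energies. The function name and the flattened (mode, momentum) indexing are assumptions made for this sketch; it does not reproduce SpinW's own implementation.

```python
import math


def r_value(omega_sim, omega_meas, sigma):
    """Weighted root-mean-square deviation between simulated and measured energies.

    Each list holds one entry per fitted (mode, momentum) point, in meV.
    """
    n_e = len(omega_meas)
    chi2 = sum(((s - m) / sg) ** 2 for s, m, sg in zip(omega_sim, omega_meas, sigma))
    return math.sqrt(chi2 / n_e)


# Two fitted points, each one standard deviation away from the measurement.
example = r_value([10.0, 20.0], [11.0, 18.0], [1.0, 2.0])
```

In this example both points deviate by exactly one σ, so the value evaluates to 1.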

SIREN model training

A 5-layer SIREN neural network (Fig. 1c) was trained on 1000 simulations of (S(Q, ω), J, J_p) tuples; the remaining 200 simulations were set aside for validation and testing. Here, ℏω ∈ [0, 200] meV, J ∈ [20, 75] meV and J_p ∈ [−30, 10] meV were each normalized to the range [0, 1] so that all parameters are on approximately the same scale. The model was trained to predict $\log(1+S(\mathbf{Q},\,\omega,\,J,\,J_{p}))$ by optimizing the mean-squared-error objective L between the prediction and the label with respect to the network parameters. During training, the following hyperparameters and settings were used: the Adaptive Moment Estimation (ADAM) algorithm for optimization (β₁ = 0.9, β₂ = 0.999)³¹, batch size = 2048, learning rate = 0.001. Here, β₁ and β₂ influence the degree to which past gradients affect the current step. The batch size controls the number of images used to compute the mean-squared-error objective and the learning rate controls the gradient descent step size. After the first ten epochs, the learning rate was decayed exponentially by a factor of $\exp(-0.1)$ per epoch (one epoch being a full pass through the entire dataset). We used NVIDIA A100 GPU hardware with the Keras API⁴⁷ and the model was trained for 50 epochs.
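
The input scaling, training target and learning-rate schedule described above can be written out as a short sketch. The helper names are hypothetical, and the schedule encodes one reading of the text: with zero-indexed epochs, epochs 0 to 9 use the base rate and each later epoch multiplies it by exp(−0.1).

```python
import math

# Parameter ranges quoted in the text, in meV.
RANGES = {"omega": (0.0, 200.0), "J": (20.0, 75.0), "Jp": (-30.0, 10.0)}


def scale_to_unit(value, name):
    """Map a physical value onto [0, 1] using the range of the named parameter."""
    low, high = RANGES[name]
    return (value - low) / (high - low)


def training_target(intensity):
    """Label regressed by the network: log(1 + S)."""
    return math.log1p(intensity)


def learning_rate(epoch, base_lr=1e-3, hold_epochs=10, decay=0.1):
    """Base rate for the first `hold_epochs` epochs, then exponential decay per epoch."""
    return base_lr * math.exp(-decay * max(0, epoch - hold_epochs + 1))
```

Under this reading the rate in the final epoch (index 49) has been decayed 40 times, giving 0.001 × exp(−4).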

Machine learning parameter extraction

Prior to differentiable optimization, the experimental data were automatically background subtracted using the following procedure. First, a region of (Q_list, ω_list) space was chosen for each slice (pixel locations 160–170 along the Q-axis) and averaged across Q_list to yield a one-dimensional energy profile. This procedure was chosen based on prior assumptions on the isotropic nature of the scattering and the Néel ground state. Next, the one-dimensional energy profiles were smoothed using a Savitzky-Golay filter (window size = 51, polynomial order = 3) and the smoothed profiles were used for background subtraction.
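
A minimal sketch of this background subtraction is given below, under several assumptions that are not stated in the text: the slice is stored as `image[energy][q]`, the pixel range is treated as the half-open interval [160, 170), and the profile edges are handled by mirroring. The Savitzky-Golay weights use the closed form for quadratic or cubic smoothing; a library routine such as `scipy.signal.savgol_filter` would normally be used instead and may treat the edges differently.

```python
def savgol_weights(window):
    """Centre-point Savitzky-Golay weights for polynomial order 2 or 3."""
    m = window // 2
    denom = (2 * m + 3) * (2 * m + 1) * (2 * m - 1)
    return [3.0 * (3 * m * m + 3 * m - 1 - 5 * i * i) / denom for i in range(-m, m + 1)]


def smooth(profile, window=51):
    """Smooth a 1D profile, mirroring the data at both ends."""
    n = len(profile)
    m = window // 2
    if window % 2 == 0 or n <= m:
        raise ValueError("window must be odd and shorter than twice the profile length")
    weights = savgol_weights(window)
    smoothed = []
    for centre in range(n):
        total = 0.0
        for offset, weight in zip(range(-m, m + 1), weights):
            idx = centre + offset
            if idx < 0:
                idx = -idx                   # reflect about the first sample
            elif idx >= n:
                idx = 2 * (n - 1) - idx      # reflect about the last sample
            total += weight * profile[idx]
        smoothed.append(total)
    return smoothed


def subtract_background(image, q_lo=160, q_hi=170, window=51):
    """Average the chosen Q columns, smooth the energy profile, subtract it from every column."""
    profile = [sum(row[q_lo:q_hi]) / (q_hi - q_lo) for row in image]
    background = smooth(profile, window)
    return [[value - bg for value in row] for row, bg in zip(image, background)]
```

Because the same energy-dependent background is removed from every Q column, the sketch relies on the background being independent of Q within a slice, in line with the isotropy assumption mentioned above.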

The unknown J and J_p parameters were recovered from data using gradient-based optimization of the neural network implicit representation. For the experimental data presented in this work, the metric (1 − r) between the measured and simulated (1 + S(Q, ω, J, J_p)) was used as the objective function L introduced in Equation (3); here, r refers to the Pearson correlation coefficient. No normalization was performed for scaling the simulation data relative to the experimental data.
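
The (1 − r) metric can be illustrated with a few lines of plain Python. The function names are hypothetical. The example shows that r is unchanged when one input is multiplied by a positive constant or shifted by an offset, which may be part of why the objective can be used without matching the overall scale of simulation and experiment.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_loss(measured, simulated):
    """Objective 1 - r: 0 for perfectly correlated inputs, 2 for anti-correlated ones."""
    return 1.0 - pearson_r(measured, simulated)


measured = [0.0, 1.0, 4.0, 2.0]
rescaled = [3.0 * value + 7.0 for value in measured]   # same shape, different scale and offset
loss_rescaled = correlation_loss(measured, rescaled)    # numerically zero
```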

The objective L was optimized with respect to J and J_p using the ADAM algorithm, with Q_list and ω_list randomly sampled from the list of paths containing the experimental data. A batch size of 4096 was used for the (Q_list, ω_list) sampling, with 2000 ADAM optimization steps and a learning rate of 0.005. Here, the batch size refers to the number of pixels in the experimental image that are randomly selected in each step of the optimization procedure.
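
The structure of this optimization loop can be sketched on a toy problem. Everything below other than the ADAM update rule, the (1 − r) objective and the learning rate of 0.005 is an assumption made for illustration: `toy_model` is a hypothetical analytic stand-in for the trained network, gradients are taken by central finite differences rather than automatic differentiation, and the grid, batch size and step count are reduced so the example runs quickly in plain Python.

```python
import math
import random


def toy_model(q, w, j, jp, width=0.15):
    """Hypothetical stand-in for the trained network; every input lies in [0, 1]."""
    centre = j * abs(math.sin(math.pi * q)) + 0.3 * jp * math.sin(2 * math.pi * q) ** 2
    return math.exp(-0.5 * ((w - centre) / width) ** 2)


def loss(theta, points, data):
    """1 - Pearson r between the model at theta and the data on the given pixels."""
    sim = [toy_model(q, w, theta[0], theta[1]) for q, w in points]
    n = len(sim)
    mean_s, mean_d = sum(sim) / n, sum(data) / n
    cov = sum((s - mean_s) * (d - mean_d) for s, d in zip(sim, data))
    var_s = sum((s - mean_s) ** 2 for s in sim)
    var_d = sum((d - mean_d) ** 2 for d in data)
    return 1.0 - cov / math.sqrt(var_s * var_d)


def gradient(theta, points, data, h=1e-4):
    """Central finite differences, used here in place of automatic differentiation."""
    grad = []
    for k in range(len(theta)):
        up = [x + (h if i == k else 0.0) for i, x in enumerate(theta)]
        down = [x - (h if i == k else 0.0) for i, x in enumerate(theta)]
        grad.append((loss(up, points, data) - loss(down, points, data)) / (2 * h))
    return grad


def fit(points, data, theta0, steps=400, batch=128, lr=0.005,
        beta1=0.9, beta2=0.999, eps=1e-8, seed=0):
    """ADAM on (J, J_p), drawing a random batch of pixels at every step."""
    rng = random.Random(seed)
    theta = list(theta0)
    m = [0.0] * len(theta)
    v = [0.0] * len(theta)
    for t in range(1, steps + 1):
        idx = [rng.randrange(len(points)) for _ in range(batch)]
        g = gradient(theta, [points[i] for i in idx], [data[i] for i in idx])
        for k in range(len(theta)):
            m[k] = beta1 * m[k] + (1 - beta1) * g[k]
            v[k] = beta2 * v[k] + (1 - beta2) * g[k] ** 2
            m_hat = m[k] / (1 - beta1 ** t)      # bias-corrected first moment
            v_hat = v[k] / (1 - beta2 ** t)      # bias-corrected second moment
            theta[k] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta


# Synthetic "experiment" generated from known normalised parameters.
grid = [(iq / 31, iw / 31) for iq in range(32) for iw in range(32)]
data = [toy_model(q, w, 0.6, 0.3) for q, w in grid]
start = (0.4, 0.5)
fitted = fit(grid, data, start)
initial_loss = loss(start, grid, data)
final_loss = loss(fitted, grid, data)
```

Only the two physical parameters are updated; the model itself is held fixed, which mirrors the role of the frozen network during parameter extraction.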

Low count data generation and fitting

High-count data for each slice (without background subtraction) were smoothed using a 3 × 3 Gaussian convolutional kernel. The resultant images were each normalized to (0, 1) using the total intensity. Each slice was then treated as a probability distribution and sampled using Monte Carlo rejection sampling. This process was used to create a series of datasets with neutron counts ranging from 1 × 10⁴ to 9 × 10⁶. Each dataset was individually and automatically background subtracted by the previously described method and fit ten times from random starting locations in (J, J_p) using the machine learning optimization procedure. Note that the corresponding low-count data were used to perform the automated background subtraction.
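
The rejection-sampling step can be sketched as follows. The function name, the uniform pixel proposal and the fixed seed are assumptions for illustration, and the preceding Gaussian smoothing is omitted. Only relative intensities matter for the acceptance test, so the sketch works on normalized or unnormalized images alike.

```python
import random


def sample_counts(image, n_counts, seed=0):
    """Draw `n_counts` events from a 2D intensity map by rejection sampling.

    A pixel is proposed uniformly at random and accepted with probability
    proportional to its intensity relative to the brightest pixel.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(image), len(image[0])
    peak = max(max(row) for row in image)
    if peak <= 0:
        raise ValueError("image must contain at least one positive intensity")
    counts = [[0] * n_cols for _ in range(n_rows)]
    accepted = 0
    while accepted < n_counts:
        i = rng.randrange(n_rows)
        j = rng.randrange(n_cols)
        if rng.random() * peak < image[i][j]:
            counts[i][j] += 1
            accepted += 1
    return counts


low_count = sample_counts([[0.0, 1.0], [2.0, 1.0]], n_counts=1000)
```

Pixels with zero intensity are never accepted, and the total number of events in the returned map equals `n_counts` by construction.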

Data availability

All data generated in this study as well as a minimal dataset have been deposited in the Zenodo database under accession code https://doi.org/10.5281/zenodo.8267499⁴⁸.

Code availability

The code developed in this study has been deposited in the Zenodo database under accession code https://doi.org/10.5281/zenodo.8267474⁴⁹ and is also available at https://github.com/slaclab/neural-representation-sqw.git.

References

1. Chumak, A. V., Vasyuchka, V. I., Serga, A. A. & Hillebrands, B. Magnon spintronics. Nat. Phys. 11, 453–461 (2015).
2. Neusser, S. & Grundler, D. Magnonics: spin waves on the nanoscale. Adv. Mater. 21, 2927–2932 (2009).
3. Gutfleisch, O. et al. Magnetic materials and devices for the 21st century: stronger, lighter, and more energy efficient. Adv. Mater. 23, 821–842 (2011).
4. Rossat-Mignod, J. et al. Neutron scattering study of the YBa₂Cu₃O₆₊ₓ system. Phys. C: Supercond. 185, 86–92 (1991).
5. Chatterji, T. Neutron Scattering from Magnetic Materials (Elsevier, 2005).
6. Braden, M. et al. Inelastic neutron scattering study of magnetic excitations in Sr₂RuO₄. Phys. Rev. B 66, 064522 (2002).
7. Coldea, R. et al. Spin waves and electronic interactions in La₂CuO₄. Phys. Rev. Lett. 86, 5377–5380 (2001).
8. Weinfurther, K., Mattingly, J., Brubaker, E. & Steele, J. Model-based design evaluation of a compact, high-efficiency neutron scatter camera. Nucl. Instrum. Methods Phys. Res. Sect. A: Accel. Spectrom. Detect. Assoc. Equip. 883, 115–135 (2018).
9. Peterson, P. F., Olds, D., Savici, A. T. & Zhou, W. Advances in utilizing event based data structures for neutron scattering experiments. Rev. Sci. Instrum. 89, 093001 (2018).
10. Chen, Z. et al. Machine learning on neutron and x-ray scattering and spectroscopies. Chem. Phys. Rev. 2, 031301 (2021).
11. Li, Z., Kermode, J. R. & De Vita, A. Molecular dynamics with on-the-fly machine learning of quantum-mechanical forces. Phys. Rev. Lett. 114, 096405 (2015).
12. Toth, S. & Lake, B. Linear spin wave theory for single-Q incommensurate magnetic structures. J. Phys.: Condens. Matter 27, 166002 (2015).
13. Ewings, R. et al. HORACE: Software for the analysis of data from single crystal spectroscopy experiments at time-of-flight neutron instruments. Nucl. Instrum. Methods Phys. Res. Sect. A: Accel. Spectrom. Detect. Assoc. Equip. 834, 132–142 (2016).
14. Doucet, M. et al. Machine learning for neutron scattering at ORNL. Mach. Learn.: Sci. Technol. 2, 023001 (2020).
15. Butler, K. T., Le, M. D., Thiyagalingam, J. & Perring, T. G. Interpretable, calibrated neural networks for analysis and understanding of inelastic neutron scattering data. J. Phys.: Condens. Matter 33, 194006 (2021).
16. Anker, A. S., Butler, K. T., Le, M. D., Perring, T. G. & Thiyagalingam, J. Using generative adversarial networks to match experimental and simulated inelastic neutron scattering data. Digital Discov. 2, 578–590 (2023).
17. Wang, H. et al. Rapid identification of X-ray diffraction patterns based on very limited data by interpretable convolutional neural networks. J. Chem. Inf. Model. 60, 2004–2011 (2020).
18. Chitturi, S. R. et al. Automated prediction of lattice parameters from X-ray powder diffraction patterns. J. Appl. Crystallogr. 54, 1799–1810 (2021).
19. Samarakoon, A. M. et al. Machine-learning-assisted insight into spin ice Dy₂Ti₂O₇. Nat. Commun. 11, 1–9 (2020).
20. Samarakoon, A., Tennant, D. A., Ye, F., Zhang, Q. & Grigera, S. A. Integration of machine learning with neutron scattering for the Hamiltonian tuning of spin ice under pressure. Commun. Mater. 3, 1–11 (2022).
21. Samarakoon, A. M. et al. Extraction of interaction parameters for α-RuCl₃ from neutron data using machine learning. Phys. Rev. Res. 4, L022061 (2022).
22. Sitzmann, V., Martel, J., Bergman, A., Lindell, D. & Wetzstein, G. Implicit neural representations with periodic activation functions. Adv. Neural Inf. Process. Syst. 33, 7462–7473 (2020).
23. Xie, Y. et al. Neural fields in visual computing and beyond. Comput. Graphics Forum 41, 641–676 (2022).
24. Cheong, S. et al. Novel light field imaging device with enhanced light collection for cold atom clouds. J. Instrum. 17, P08021 (2022).
25. Vlašić, T., Nguyen, H. & Dokmanić, I. Implicit Neural Representation for Mesh-Free Inverse Obstacle Scattering. In 56th Asilomar Conference on Signals, Systems, and Computers. 947–952 (2022).
26. Levy, A., Wetzstein, G., Martel, J. N., Poitevin, F. & Zhong, E. Amortized Inference for Heterogeneous Reconstruction in Cryo-EM. Adv. Neural Inf. Process. Syst. 35, 13038–13049 (2022).
27. Kubo, R. The spin-wave theory of antiferromagnetics. Phys. Rev. 87, 568 (1952).
28. Petsch, A. N. et al. High-energy spin waves in the spin-1 square-lattice antiferromagnet La₂NiO₄. Phys. Rev. Res. 5, 033113 (2023).
29. Marshall, W. & Lovesey, S. W. Theory of Thermal Neutron Scattering (Oxford University Press, 1971).
30. Hastie, T., Tibshirani, R. & Friedman, J. H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Vol. 2 (Springer, 2009).
31. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
32. Abadi, M. et al. Tensorflow: large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467 (2016).
33. Bottou, L. Neural Networks: Tricks of the Trade: Second Edition 421–436 (Springer, Heidelberg, 2012).
34. Nakajima, K., Yamada, K., Hosoya, S., Omata, T. & Endoh, Y. Spin-wave excitations in two dimensional antiferromagnet of stoichiometric La₂NiO₄. J. Phys. Soc. Jpn 62, 4438–4448 (1993).
35. Granade, C. E., Ferrie, C., Wiebe, N. & Cory, D. G. Robust online Hamiltonian learning. New J. Phys. 14, 103013 (2012).
36. McMichael, R. D. & Blakley, S. M. Simplified algorithms for adaptive experiment design in parameter estimation. Phys. Rev. Appl. 18, 054001 (2022).
37. Dagotto, E. Correlated electrons in high-temperature superconductors. Rev. Mod. Phys. 66, 763 (1994).
38. White, S. R. Density matrix formulation for quantum renormalization groups. Phys. Rev. Lett. 69, 2863–2866 (1992).
39. White, S. R. Density-matrix algorithms for quantum renormalization groups. Phys. Rev. B 48, 10345 (1993).
40. Blankenbecler, R., Scalapino, D. J. & Sugar, R. L. Monte Carlo calculations of coupled boson-fermion systems. I. Phys. Rev. D 24, 2278–2286 (1981).
41. White, S. R. et al. Numerical study of the two-dimensional Hubbard model. Phys. Rev. B 40, 506–516 (1989).
42. Ferrari, F., Parola, A., Sorella, S. & Becca, F. Dynamical structure factor of the J₁ − J₂ Heisenberg model in one dimension: the variational Monte Carlo approach. Phys. Rev. B 97, 235103 (2018).
43. Hendry, D., Chen, H., Weinberg, P. & Feiguin, A. E. Chebyshev expansion of spectral functions using restricted Boltzmann machines. Phys. Rev. B 104, 205130 (2021).
44. Granroth, G. E. et al. SEQUOIA: a newly operating chopper spectrometer at the SNS. J. Phys.: Conf. Ser. 251, 012058 (2010).
45. Igarashi, J.-i. 1/S expansion for thermodynamic quantities in a two-dimensional Heisenberg antiferromagnet at zero temperature. Phys. Rev. B 46, 10763 (1992).
46. Singh, R. R. Thermodynamic parameters of the T=0, spin-1/2 square-lattice Heisenberg antiferromagnet. Phys. Rev. B 39, 9760 (1989).
47. Chollet, F. et al. Keras. https://keras.io (2015).
48. Chitturi, S. R. et al. Capturing dynamical correlations using implicit neural representations dataset. Zenodo https://doi.org/10.5281/zenodo.8267499 (2023).
49. Chitturi, S. R. et al. Capturing dynamical correlations using implicit neural representations accompanying code. Zenodo https://doi.org/10.5281/zenodo.8267474 (2023).

Acknowledgements

This work is supported by the U.S. Department of Energy, Office of Science, Basic Energy Sciences under Award No. DE-SC0022216, as well as under Contract DE-AC02-76SF00515, both for the Materials Sciences and Engineering Division, as well as for the Linac Coherent Light Source (LCLS), part of the Scientific User Facilities Division. A portion of this research used resources at the Spallation Neutron Source, a DOE Office of Science User Facility operated by the Oak Ridge National Laboratory. J.J. Turner acknowledges support from the U.S. DOE, Office of Science, Basic Energy Sciences through the Early Career Research Program. Z.J. is supported by the Stanford Science fellowship and the Urbanek-Chodorow postdoctoral fellowship awards. A.N. Petsch and S.M. Hayden acknowledge funding and support from the Engineering and Physical Sciences Research Council (EPSRC) under Grant Nos. EP/L015544/1 and EP/R011141/1.

Author information

Authors and Affiliations

SLAC National Accelerator Laboratory, Menlo Park, CA, 94025, USA
Sathya R. Chitturi, Alexander N. Petsch, Rajan Plumley, Mike Dunne, Daniel Ratner, Chunjing Jia, Youssef Nashed & Joshua J. Turner
Department of Materials Science and Engineering, Stanford University, Stanford, CA, 94305, USA
Sathya R. Chitturi
Department of Physics and Applied Physics, Stanford University, Stanford, CA, 94305, USA
Zhurun Ji
Geballe Laboratory for Advanced Materials, Stanford University, Stanford, CA, 94305, USA
Zhurun Ji
Stanford Institute for Materials and Energy Sciences, Stanford University, Stanford, CA, 94305, USA
Alexander N. Petsch, Cheng Peng, Zhantao Chen, Rajan Plumley, Chunjing Jia & Joshua J. Turner
H.H. Wills Physics Laboratory, University of Bristol, Bristol, BS8 1TL, UK
Alexander N. Petsch & Stephen M. Hayden
Department of Physics, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
Rajan Plumley
Department of Physics and Astrophysics, Howard University, Washington, DC, USA
Sougata Mardanya & Sugata Chowdhury
Department of Physics, Northeastern University, Boston, USA
Hongwei Chen, Arun Bansil & Adrian Feiguin
Neutron Scattering Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
Alexander I. Kolesnikov
Department of Physics, University of Oxford, Clarendon Laboratory, Oxford, OX1 3PU, UK
Dharmalingam Prabhakaran
Department of Physics, University of Florida, Gainesville, FL, 32611, USA
Chunjing Jia

Contributions

S.R.C., Z.J., and A.N.P. contributed equally to this work and focused on the machine learning, simulation, and experimental portions respectively. A.I.K. and S.M.H. assisted with data collection. C.P., Z.C., R.P., H.C., S.M., M.D., S.C., A.B., A.F., D.P., and D.R. assisted with data analysis and manuscript writing. C.J., Y.N., and J.T. supervised the work.

Corresponding authors

Correspondence to Sathya R. Chitturi, Zhurun Ji, Alexander N. Petsch or Joshua J. Turner.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Astrid Schneidewind, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Chitturi, S.R., Ji, Z., Petsch, A.N. et al. Capturing dynamical correlations using implicit neural representations. Nat Commun 14, 5852 (2023). https://doi.org/10.1038/s41467-023-41378-4

Received: 08 April 2023
Accepted: 01 September 2023
Published: 20 September 2023
DOI: https://doi.org/10.1038/s41467-023-41378-4
