Introduction

Magnetic contributions are essential for modeling magnetic materials as they critically affect phase stability1,2,3, vibrational properties4,5,6, interstitial energies7, local8,9 and extended defects10,11, and kinetics12,13. Taking the magnetic degrees of freedom properly into account is a prerequisite for computationally-aided design and development of a large number of technologically relevant materials, ranging from various steels for construction and safety applications1,2,3,4,5,6,8,9,10,11,12,13 to hard magnets for applications in electrical transportation and renewable energy technologies14,15.

Among the most popular computational methods capable of capturing magnetism are first-principles calculations based on density functional theory (DFT). DFT calculations are, however, computationally expensive and limited to small system sizes and to a small number of magnetic configurations. DFT calculations that sample the magnetic degree of freedom explicitly, as needed for, e.g., lattice vibrations or vacancy formation energies in magnetically excited states, are therefore available only for very few selected cases.

Recent progress in machine-learning potentials has significantly accelerated accurate simulations of materials and molecules16,17,18,19,20,21,22,23,24,25,26. Such potentials express the interatomic energy as a function of atomic positions alone. Ignoring the electronic degrees of freedom, yet assuming a very flexible functional form for the interatomic energies, machine-learning potentials feature near-quantum mechanical accuracy at a computational efficiency of the order of classical interatomic potentials27. However, because they ignore the electronic degrees of freedom, such potentials cannot distinguish different magnetic states: different magnetic states of the same atomic configuration have different energies, whereas a function of atomic positions alone assigns a single energy to each configuration. In this paper, we introduce a strategy to overcome this fundamental shortcoming.

Results

Magnetic moment tensor potential

The starting point is a given set of energies, EDFT(R, S), which include the magnetic degree of freedom, e.g., computed via DFT, and where R and S denote a set of atomic coordinates and corresponding atomic spins. There are various ways to compute these energies from DFT, e.g., via fully relaxing the spin degree of freedom or, if one is interested in a broader sampling of EDFT(R, S), via constrained spin calculations28,29,30,31,32. We deliberately do not discuss in this work the different approaches available for computing EDFT(R, S) and their corresponding challenges, since the main focus here is on an efficient parametrization for a given EDFT(R, S). We utilize standard spin-polarized DFT calculations in which the local atomic moments are initialized differently while their longitudinal component is fully relaxed. The different magnetic configurations sampled are discussed further below. We note, however, that the proposed machine-learning potential can be straightforwardly applied with, e.g., constrained spin calculations.

The heart of the proposed approach is to approximate the energy EDFT(R, S) with Moment Tensor Potentials (MTPs)33,34, the idea of which is to expand the energy locally as a polynomial of its degrees of freedom, corrected in order to allow for a finite cutoff of the potential. We note that there are other functional forms allowing for approximation of EDFT as a function of enriched degrees of freedom35,36,37. A functional form similar to that of MTPs has recently been employed within the atomic cluster expansion (ACE)38. Both approaches feature a complete basis of invariant polynomials that differ only in the representation of the angular terms; MTP uses tensors while ACE uses spherical harmonics.

In our approach the total interaction energy is partitioned into contributions of individual local atomic environments:

$${E}^{{{{\rm{mMTP}}}}}=\mathop{\sum }\limits_{i=1}^{N}V({{\mathfrak{n}}}_{i}),$$
(1)

where \({{\mathfrak{n}}}_{i}\) is the neighborhood of the i’th atom and N is the number of atoms in the atomic configuration. In the present paper the degrees of freedom are atomic positions R = {ri, i = 1, …, N} and spins S = {si, i = 1, …, N} as opposed to the originally developed MTPs33,34 in which the potential energy depends only on atomic positions. The atomic neighborhood of the i’th atom, \({{\mathfrak{n}}}_{i}\), is hence described by the relative interatomic positions rij = rj − ri, the spin of the central atom, si, and the spins of the neighboring atoms sj, formally

$${{\mathfrak{n}}}_{i}=\{({{{{\boldsymbol{r}}}}}_{ij},{s}_{i},{s}_{j}):j=1,\ldots ,{N}_{{{{\rm{nb}}}}}^{i}\},$$

where \({N}_{{{{\rm{nb}}}}}^{i}\) is the number of neighbors of the i’th atom.
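The partitioning of Eq. (1) into local neighborhoods can be illustrated with a minimal sketch. It assumes an isolated cluster of atoms (no periodic images) and a user-supplied site energy; the function names `neighborhoods`, `total_energy`, and `site_energy` are hypothetical and are not part of any published mMTP code.

```python
import math

def neighborhoods(positions, spins, r_cut):
    """For every atom i, collect the triples (r_ij, s_i, s_j) with |r_ij| < r_cut.

    Simplification for illustration: no periodic boundary conditions.
    """
    result = []
    for i, (ri, si) in enumerate(zip(positions, spins)):
        env = []
        for j, (rj, sj) in enumerate(zip(positions, spins)):
            if i == j:
                continue
            rij = tuple(b - a for a, b in zip(ri, rj))  # r_ij = r_j - r_i
            if math.sqrt(sum(c * c for c in rij)) < r_cut:
                env.append((rij, si, sj))
        result.append(env)
    return result

def total_energy(positions, spins, r_cut, site_energy):
    """Eq. (1): sum the site energies V(n_i) over all atoms."""
    return sum(site_energy(env) for env in neighborhoods(positions, spins, r_cut))
```

Passing `len` as the site energy simply counts neighbor pairs, which is a quick way to check that the cutoff is applied as intended.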

The expansion of the function V is:

$$V({{\mathfrak{n}}}_{i})=\mathop{\sum}\limits_{\alpha }{\xi }_{\alpha }{B}_{\alpha }({{\mathfrak{n}}}_{i}),$$

where ξ = {ξα} are the “linear” parameters to be optimized. The function V is assumed to be an arbitrary polynomial of the corresponding degrees of freedom, modified so that instead of the polynomial growth the potential V vanishes beyond some cutoff distance. The potential is expanded via basis functions Bα defined through the so-called moment tensor descriptors

$${M}_{\mu ,\nu }({{\mathfrak{n}}}_{i})=\mathop{\sum }\limits_{j=1}^{{N}_{{{{\rm{nb}}}}}^{i}}{f}_{\mu }(| {{{{\boldsymbol{r}}}}}_{ij}| ,{s}_{i},{s}_{j})\underbrace{{{{{\boldsymbol{r}}}}_{ij}\otimes ...\otimes {{{{\boldsymbol{r}}}}}_{ij}}}_{\begin{array}{c}\nu\,{{\mbox{times}}}\end{array}},$$
(2)

where “⊗” is the outer product of vectors, and, thus, the angular part rij ⊗ … ⊗ rij is a tensor of ν’th rank. The function fμ(|rij|, si, sj) is a polynomial of |rij|, si and sj, modified for a finite cutoff radius. It has the form:

$${f}_{\mu }(| {{{{\boldsymbol{r}}}}}_{ij}| ,{s}_{i},{s}_{j})=\mathop{\sum }\limits_{\zeta =1}^{{N}_{\varphi }}\mathop{\sum }\limits_{\gamma =1}^{{N}_{\psi }}\mathop{\sum }\limits_{\beta =1}^{{N}_{\psi }}{c}_{\mu }^{\beta ,\gamma ,\zeta }{\psi }_{\beta }({s}_{i}){\psi }_{\gamma }({s}_{j}){\varphi }_{\zeta }(| {{{{\boldsymbol{r}}}}}_{ij}| ){({r}_{{{{\rm{cut}}}}}-| {{{{\boldsymbol{r}}}}}_{ij}| )}^{2},$$
(3)

where \({{{\boldsymbol{c}}}}=\{{c}_{\mu }^{\beta ,\gamma ,\zeta }\}\) are the “radial” parameters to be optimized, Nφ is the number of polynomial basis functions φζ(rij) on the interval \([{r}_{\min },{r}_{{{{\rm{cut}}}}}]\), where \({r}_{\min }\) is the minimal distance between atoms and rcut is the cutoff radius beyond which atoms do not interact. The term \({({r}_{{{{\rm{cut}}}}}-| {{{{\boldsymbol{r}}}}}_{ij}| )}^{2}\) ensures a smooth vanishing of the potential for rij > rcut. The other functions, ψβ(si) and ψγ(sj), are the polynomial basis functions of the local spins of the central and neighboring atoms, respectively. The number of these spin basis functions is Nψ. They are defined on the interval \([{s}_{\min },{s}_{\max }]\), where the values \({s}_{\min }\) and \({s}_{\max }\) are the minimal and maximal local magnetic moments in the system being investigated.
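A minimal sketch of how Eq. (3) could be evaluated for a single μ is given here. It assumes Chebyshev polynomials of the first kind for both the spin basis ψ and the radial basis φ, each defined on its interval rescaled to [-1, 1], and it uses zero-based indices; the choice of radial basis, the rescaling, and the function names are illustrative assumptions. The default interval bounds are the values quoted for the convergence tests in this work.

```python
def chebyshev(n, x):
    """Chebyshev polynomial of the first kind, T_n(x), via the three-term recurrence."""
    t_prev, t = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t = t, 2.0 * x * t - t_prev
    return t

def to_unit(x, lo, hi):
    """Map the interval [lo, hi] linearly onto [-1, 1]."""
    return (2.0 * x - (lo + hi)) / (hi - lo)

def f_mu(r, s_i, s_j, c, r_min=2.0, r_cut=5.5, s_min=-3.5, s_max=3.5):
    """Eq. (3) for one mu; c[beta][gamma][zeta] holds the radial parameters."""
    if r >= r_cut:
        return 0.0  # atoms beyond the cutoff do not interact
    x_r = to_unit(r, r_min, r_cut)
    x_i = to_unit(s_i, s_min, s_max)
    x_j = to_unit(s_j, s_min, s_max)
    total = 0.0
    for beta, c_beta in enumerate(c):
        for gamma, c_bg in enumerate(c_beta):
            for zeta, coeff in enumerate(c_bg):
                total += (coeff * chebyshev(beta, x_i) * chebyshev(gamma, x_j)
                          * chebyshev(zeta, x_r))
    return total * (r_cut - r) ** 2  # smooth decay to zero at r_cut
```

With a symmetric spin interval, a term that is odd in both spins, such as T1(si) T1(sj), is unchanged when both spins are reversed.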

The mMTP basis functions Bα are defined as all possible contractions of \({M}_{\mu ,\nu }({{\mathfrak{n}}}_{i})\) to a scalar, e.g.,

$${M}_{1,0}({{\mathfrak{n}}}_{i}),\,{M}_{0,1}({{\mathfrak{n}}}_{i})\cdot {M}_{1,1}({{\mathfrak{n}}}_{i}),\,{M}_{3,2}({{\mathfrak{n}}}_{i}):{M}_{1,2}({{\mathfrak{n}}}_{i}),\,\ldots \ ,$$

where “⋅” is the dot product of two vectors, and “:” is the Frobenius product of two matrices. In principle, an infinite number of such mMTP basis functions could be constructed. In order to choose which basis functions to include in practice in the mMTP, we introduce the so-called level of each descriptor, lev Mμ,ν = 2 + 4μ + ν, define the level of a basis function as the sum of the levels of the descriptors it contracts, choose a certain \({{{{\rm{lev}}}}}_{\max }\), and include in the mMTP each basis function with \({{{\rm{lev}}}}{B}_{\alpha }\le {{{{\rm{lev}}}}}_{\max }\) (see Novikov et al.39 for details). Thus, the number of the “linear” parameters ξ depends on \({{{{\rm{lev}}}}}_{\max }\), which also determines the number of radial functions, Nμ. The number of the “radial” parameters c is equal to \({N}_{\mu }{N}_{\varphi }{N}_{\psi }^{2}\). We denote all free parameters of an mMTP collectively by θ = {ξ, c}, and the total interaction energy by EmMTP = EmMTP(θ; R, S).
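The contractions listed above can be illustrated with a small sketch that builds moment tensors of rank 0, 1, and 2 and checks that a dot-product and a Frobenius contraction are unchanged by a rotation of the neighborhood. The helper names and the idea of passing an arbitrary callable in place of fμ are illustrative assumptions.

```python
import math

def norm(r):
    return math.sqrt(sum(c * c for c in r))

def moment(env, f, nu):
    """M_{mu,nu} of Eq. (2) for nu = 0, 1, 2.

    env: list of (r_ij, s_i, s_j); f(|r_ij|, s_i, s_j) plays the role of f_mu.
    """
    w = [f(norm(r), si, sj) for r, si, sj in env]
    if nu == 0:
        return sum(w)
    if nu == 1:
        return [sum(wk * r[a] for wk, (r, _, _) in zip(w, env)) for a in range(3)]
    if nu == 2:
        return [[sum(wk * r[a] * r[b] for wk, (r, _, _) in zip(w, env))
                 for b in range(3)] for a in range(3)]
    raise ValueError("only nu = 0, 1, 2 are implemented in this sketch")

def dot(u, v):
    """Contraction of two rank-1 tensors."""
    return sum(a * b for a, b in zip(u, v))

def frobenius(a, b):
    """Contraction of two rank-2 tensors (Frobenius product)."""
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))

def rotate_z90(env):
    """Rotate every r_ij by 90 degrees about the z axis; spins are scalars and stay fixed."""
    return [((-r[1], r[0], r[2]), si, sj) for r, si, sj in env]
```

Because every index is contracted, the resulting scalars depend only on the relative geometry and the spins, not on the orientation of the coordinate system.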

We note that the mMTP formalism contains the Heisenberg model as a special, limiting case. In particular, first-degree polynomials have to be utilized for ψβ(s) = ψγ(s) = s in Eq. (3), and φζ needs to “encompass” (i.e., be nonzero at) the nearest neighbors only. Such a choice of terms in the expansion Eq. (3) also leads to a model similar to the one proposed in Nikolov et al.37, except that in the latter case the full vectorial spins were considered. Moreover, the biquadratic terms, \({({s}_{i}{s}_{j})}^{2}\)40,41, adopted by Nikolov et al.37, arise naturally when Mμ,0 is constructed with such choices of ψβ and φζ and gets multiplied by itself. Then the radial parameters \({c}_{\mu }^{\beta ,\gamma ,\zeta }\) correspond to the coupling constants as obtained from DFT data.
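The reduction to a Heisenberg-type form can be written out as a short worked step. The shorthand gμ and J is introduced here for illustration only, and the sign convention of J is left open. Keeping only the first-degree spin polynomial ψ(s) = s in Eq. (3), so that the radial parameters carry only the index ζ, gives

$${f}_{\mu }(| {{\boldsymbol{r}}}_{ij}| ,{s}_{i},{s}_{j})={s}_{i}{s}_{j}\,{g}_{\mu }(| {{\boldsymbol{r}}}_{ij}| ),\qquad {g}_{\mu }(r)=\mathop{\sum }\limits_{\zeta }{c}_{\mu }^{\zeta }{\varphi }_{\zeta }(r){({r}_{{\rm{cut}}}-r)}^{2},$$

so that the rank-0 descriptor and the corresponding energy contribution become

$${M}_{\mu ,0}({{\mathfrak{n}}}_{i})={s}_{i}\mathop{\sum }\limits_{j}{g}_{\mu }(| {{\boldsymbol{r}}}_{ij}| )\,{s}_{j},\qquad \mathop{\sum }\limits_{i}{\xi }_{\alpha }{M}_{\mu ,0}({{\mathfrak{n}}}_{i})=\mathop{\sum }\limits_{i}\mathop{\sum }\limits_{j}J(| {{\boldsymbol{r}}}_{ij}| )\,{s}_{i}{s}_{j},\qquad J(r)={\xi }_{\alpha }{g}_{\mu }(r).$$

Squaring the descriptor yields

$${M}_{\mu ,0}^{2}={s}_{i}^{2}\mathop{\sum }\limits_{j,k}{g}_{\mu }(| {{\boldsymbol{r}}}_{ij}| )\,{g}_{\mu }(| {{\boldsymbol{r}}}_{ik}| )\,{s}_{j}{s}_{k},$$

whose j = k terms are proportional to \({({s}_{i}{s}_{j})}^{2}\), i.e., they have the biquadratic form.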

The free parameters θ in our approach are found by fitting EmMTP to DFT data. We consider a training set including K magnetic configurations (R(k), S(k)) with known DFT energies EDFT, DFT forces \({{{{\boldsymbol{f}}}}}_{i}^{{{{\rm{DFT}}}}}\) on every atom i, and a 3 × 3 tensor of DFT stresses σDFT and minimize the objective function:

$$\begin{array}{l}\mathop{\sum }\limits_{k=1}^{K}\left[{w}_{{{{\rm{e}}}}}{| {E}^{{{{\rm{mMTP}}}}}\left({{{\boldsymbol{\theta }}}};{{{{\boldsymbol{R}}}}}^{(k)},{S}^{(k)}\right)-{E}^{{{{\rm{DFT}}}}}\left({{{{\boldsymbol{R}}}}}^{(k)},{S}^{(k)}\right)| }^{2}\right.\\ \,+{w}_{{{{\rm{f}}}}}\mathop{\sum}\limits_{i}{| {{{{\boldsymbol{f}}}}}_{i}^{{{{\rm{mMTP}}}}}\left({{{\boldsymbol{\theta }}}};{{{{\boldsymbol{R}}}}}^{(k)},{S}^{(k)}\right)-{{{{\boldsymbol{f}}}}}_{i}^{{{{\rm{DFT}}}}}\left({{{{\boldsymbol{R}}}}}^{(k)},{S}^{(k)}\right)| }^{2}\\ \,\left.+{w}_{{{{\rm{s}}}}}{| {\sigma }^{{{{\rm{mMTP}}}}}\left({{{\boldsymbol{\theta }}}};{{{{\boldsymbol{R}}}}}^{(k)},{S}^{(k)}\right)-{\sigma }^{{{{\rm{DFT}}}}}\left({{{{\boldsymbol{R}}}}}^{(k)},{S}^{(k)}\right)| }^{2}\right]\end{array}$$

where | · | is the length of a vector or the Frobenius norm of a matrix. The optimization of the parameters is carried out using an iterative quasi-Newton optimization method, specifically, the Broyden-Fletcher-Goldfarb-Shanno algorithm (BFGS) starting with a random initial guess. As opposed to mMTP, the energy of the non-magnetic MTP, proposed in our earlier works, does not depend on spins, i.e. EMTP = EMTP(θ; R), and the functions fμ(rij) do not include spins.
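As a hedged illustration, the objective function can be evaluated for already computed predictions as follows. The dictionary layout with the keys `energy`, `forces`, and `stress` is a hypothetical data format chosen for this sketch, and the default weights are the values reported for the tests in this work; the BFGS minimization itself is not shown.

```python
def sq_norm(vec):
    """Squared Euclidean length of a vector."""
    return sum(x * x for x in vec)

def objective(predicted, reference, w_e=1.0, w_f=0.01, w_s=0.001):
    """Weighted sum of squared energy, force, and stress deviations over all configurations."""
    total = 0.0
    for p, d in zip(predicted, reference):
        # energy term
        total += w_e * (p["energy"] - d["energy"]) ** 2
        # force term: sum over atoms of the squared force difference
        total += w_f * sum(
            sq_norm([a - b for a, b in zip(fp, fd)])
            for fp, fd in zip(p["forces"], d["forces"])
        )
        # stress term: squared Frobenius norm of the 3 x 3 difference
        total += w_s * sum(
            (p["stress"][i][j] - d["stress"][i][j]) ** 2
            for i in range(3) for j in range(3)
        )
    return total
```

In a fit, the predicted quantities would be recomputed from the current parameters θ at every optimizer iteration and this scalar would be minimized.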

Convergence of magnetic MTP

We first analyze the convergence behavior of the magnetic and non-magnetic MTP toward DFT energies as the number of parameters is increased. The convergence was measured on a hold-out set of about 1000 configurations not participating in the fitting of the potentials. Figure 1 shows that the mMTP exhibits a steady convergence, while the non-magnetic MTP does not. This reiterates our original motivation: the space of atomic positions (R) is not the right one for approximating the quantum-mechanical energy, but enriched with spins, (R, S), this becomes a suitable space for that purpose.

Fig. 1: Convergence of the magnetic potential, mMTP, with respect to DFT energies and lack of convergence for the non-magnetic MTP.

The graph indicates that most variation of the energy on the training set is in the magnetic degrees of freedom that can be captured only by the magnetic potential.

Based on the convergence tests, we have chosen a well converged \({{{{\rm{lev}}}}}_{\max }=24\) for the subsequent tests. For both MTP and mMTP we took Nφ = 12 polynomial functions of the atomic positions with \({r}_{\min }=2\) Å, rcut = 5.5 Å. For the mMTP we took Nψ = 2 polynomial functions of the local magnetic moments with \({s}_{\min }=-3.5\,{\mu }_{B}\) and \({s}_{\max }=3.5\,{\mu }_{B}\). The total number of MTP parameters was 937 while that of mMTP was 1153. The weights in the objective function were we = 1, wf = 0.01, and ws = 0.001.
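The reported parameter counts can be cross-checked against the radial-parameter formula with a few lines of arithmetic. The sketch assumes, which the text does not state, that MTP and mMTP share the same linear parameters and the same Nμ, and that the non-magnetic case corresponds to Nψ = 1; the function names are hypothetical.

```python
def n_radial(n_mu, n_phi, n_psi):
    """Number of radial parameters, N_mu * N_phi * N_psi^2."""
    return n_mu * n_phi * n_psi ** 2

def infer_n_mu(n_total_magnetic, n_total_nonmagnetic, n_phi, n_psi):
    """Infer N_mu from the difference in total parameter counts.

    Assumption: the difference stems only from the extra spin basis functions,
    i.e. it equals N_mu * N_phi * (N_psi^2 - 1).
    """
    extra = n_total_magnetic - n_total_nonmagnetic
    per_mu = n_phi * (n_psi ** 2 - 1)
    if extra % per_mu != 0:
        raise ValueError("counts are inconsistent with the assumed formula")
    return extra // per_mu
```

Under these assumptions, the difference 1153 − 937 = 216 with Nφ = 12 and Nψ = 2 would correspond to Nμ = 6, i.e., 288 radial parameters for the mMTP versus 72 for the MTP.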

For each model, we fitted five potentials and selected the best (with the least training error). The validation root-mean-square errors are shown in Table 1. We can see that adding local magnetic moments to the potential as additional degrees of freedom does not significantly increase the number of parameters, but greatly improves the accuracy of training.

Table 1 The best non-magnetic and magnetic MTPs.

Phonon spectra prediction

We next evaluate how well the best optimized MTP and mMTP potentials predict phonon spectra of different magnetic states. We consider two extreme scenarios representing the limits of magnetic configurations, namely the ferromagnetic state, in which all spins are aligned parallel, and a paramagnetic state, treated in the adiabatic limit of fast fluctuating spins. Since the phonon energies were derived from small perturbations (utilizing the small displacement method), this test is a very sensitive measure of how well even very small variations in interatomic forces are captured. The results for the ferromagnetic case for both potentials are shown in Fig. 2a in comparison with the data directly obtained from DFT. The agreement between the mMTP and the DFT data is excellent whereas the non-magnetic MTP shows significant deviations, in particular around the N-point. The deviations for the non-magnetic MTP are a direct consequence of the training database which also includes magnetically disordered configurations responsible for pronounced phonon softening as discussed in the following.

Fig. 2: Phonon spectra computed for different magnetic states.

Magnetic MTP reproduces spectra for both (a) the ferromagnetic and (b) the paramagnetic state (modeled using the SSA approach) based on the DFT calculations, whereas the non-magnetic MTP cannot distinguish the two states and produces results between the ferromagnetic and paramagnetic ones.

To compute the phonon spectra in the paramagnetic regime we utilize the spin-space averaging (SSA) method6. In this approach, effective interatomic forces can be defined by averaging over various disordered magnetic configurations weighted by a Boltzmann distribution. For the actual averaging we utilized the crystal symmetries as proposed in refs. 5,6 and performed the SSA using a single random magnetic configuration for which each atom is displaced in each Cartesian direction. This provides a large number of locally inequivalent magnetic configurations (i.e., 54 × 3 = 162 configurations for the employed supercell). This procedure was shown to be robust with respect to the actually chosen random magnetic configuration as discussed in Körmann et al.6.
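The averaging idea behind the SSA can be sketched generically as a Boltzmann-weighted mean of the force on one atom over several magnetic configurations. This is only an illustration of the weighting, not the symmetry-based single-configuration procedure used in this work; the function name and the input format are assumptions.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def ssa_average(forces, energies, temperature):
    """Boltzmann-weighted average of a force vector over magnetic configurations.

    forces: list of 3-component force vectors, one per magnetic configuration.
    energies: matching list of configuration energies in eV.
    """
    e0 = min(energies)  # shift energies for numerical stability
    weights = [math.exp(-(e - e0) / (K_B * temperature)) for e in energies]
    z = sum(weights)
    return [sum(w * f[a] for w, f in zip(weights, forces)) / z for a in range(3)]
```

When all configurations have the same energy, or in the high-temperature limit, the weights become equal and the result reduces to a plain arithmetic mean.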

The resulting DFT-based phonon spectrum shown in Fig. 2b features a pronounced softening at the N-point6. This softening is related to the decrease of the elastic constants and constitutes an important precursor of the structural transformation in iron. The non-magnetic MTP cannot distinguish the underlying atomic forces in these different magnetic states from the ferromagnetic forces. This is the reason why the MTP phonon spectrum for the paramagnetic state shown in Fig. 2b is exactly the same as the one in Fig. 2a for the ferromagnetic state. The non-magnetic MTP spectra fall in-between the ferromagnetic and paramagnetic solutions and hence do not quantitatively reproduce the DFT data in either regime. In contrast, applying the SSA approach with the mMTP reveals an excellent agreement with the DFT data, reproducing quantitatively important characteristics such as, e.g., the decrease of the phonon energies near the N-point and along the H-P path.

Disordered-local-moment molecular-dynamics simulations

To evaluate the performance of the mMTP at finite temperatures and larger atomic displacements, we have performed molecular dynamics (MD) simulations. The temperature was set to 800 K and the lattice constant to 2.9 Å. To sample not only the vibrational degrees of freedom but the spin space and in particular the coupling between vibrations and spins, we have performed disordered-local-moment MD (DLM-MD) simulations42. Further, in order to explicitly validate the mMTP against DFT, we have utilized the concept of thermodynamic integration, similarly as used in the TU-TILD+MTP method previously43. Specifically, we have introduced a linear coupling between DFT and mMTP forces,

$${F}_{\lambda }=\lambda {F}^{{{{\rm{DFT}}}}}+(1-\lambda ){F}^{{{{\rm{mMTP}}}}},$$
(4)

with the coupling constant λ and DFT and mMTP forces FDFT and FmMTP. The coupled forces Fλ were used for evolving the DLM-MD trajectories. The mMTP in this test was fitted to ’pure’ DFT calculations (i.e., nominally corresponding to λ = 1) and tested independently for a new set of calculations at λ = 0, 0.5, 1. To render the DFT calculations feasible we employed a 16-atom supercell for the DLM-TI calculations; cross-checks for a 54-atom supercell showed similar results. Further details are given in the Methods section.
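Eq. (4) amounts to a linear interpolation between the two force fields, which a minimal sketch makes explicit. The function name and the nested-list force format are illustrative assumptions.

```python
def coupled_forces(f_dft, f_mmtp, lam):
    """Eq. (4): F_lambda = lam * F_DFT + (1 - lam) * F_mMTP, atom by atom."""
    return [
        [lam * a + (1.0 - lam) * b for a, b in zip(fa, fb)]
        for fa, fb in zip(f_dft, f_mmtp)
    ]
```

Setting `lam = 1.0` recovers the pure DFT forces and `lam = 0.0` the pure mMTP forces, matching the limits described in the text.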

Figure 3 highlights the excellent performance of the mMTP. In the left panel, we observe that the mMTP energies fall almost on top of the DFT energies; the root-mean-square error (RMSE) is only 2.0 meV/atom, of the same order as obtained previously for non-magnetic systems43. The middle panel clarifies that the best possible non-magnetic MTP is almost an order of magnitude away in terms of energy accuracy, with an RMSE of 16 meV/atom. The right panel of Fig. 3 shows the spin correlation between mMTP and DFT, which of course only the magnetic version of the MTP is capable of reproducing. We observe an RMSE of 0.12 μB, which is ~5% of the magnitude of the absolute spin.
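For reference, the RMSE quoted here is the usual root-mean-square deviation between predicted and reference values; a minimal sketch with a hypothetical function name is:

```python
import math

def rmse(predicted, reference):
    """Root-mean-square error between two equally long sequences of numbers."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(
        sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(predicted)
    )
```

Read this way, a spin RMSE of 0.12 μB being about 5% of the absolute spin corresponds to a typical moment magnitude of roughly 0.12 / 0.05 = 2.4 μB.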

Fig. 3: Energy and spin correlations between different models obtained from DLM-TI calculations at 800 K.

The figure demonstrates that the magnetic MTP predicts both (a) DFT energies and (c) spins with high accuracy, whereas (b) the non-magnetic MTP is almost an order of magnitude worse in accuracy.

We stress that Fig. 3 includes values for all the investigated coupling constants λ = 0.0, 0.5, 1.0. Looking at each λ value separately, the correlations are in fact very similar. This means that there is no difference in the correlation, if we use pure DFT forces (cf. Eq. (4)), pure mMTP forces, or DFT-mMTP coupled forces to evolve the MD. This hence allows one to perform a full thermodynamic integration from the mMTP to DFT and compute the respective free energy difference, which is however beyond the scope of the present work.

Discussion

We have developed the mMTPs, a class of magnetic machine-learning interatomic potentials capable of simultaneously and accurately approximating spin and atomic degrees of freedom. This has been achieved by utilizing a two-step minimization scheme for the spin and atomic configurational space. Applying the mMTP to DFT-derived data for the prototypical bcc iron system reveals that the mMTPs are capable of quantitatively approximating local magnetic moments, energies, and forces for various magnetic states (see the Supplementary Material for further tests). A number of applications such as the computation of phonon spectra in ferro- and paramagnetic states, as well as MD simulations including spin-flips, demonstrate that mMTPs provide near DFT accuracy without significantly losing the computational efficiency of classical interatomic potentials.

Methods

Derivation of mMTP

Here we derive the form of the MTP as a function of relative atomic positions rij and vectorial magnetic moments, si and sj. Following the logic of the original paper introducing the MTP33, an arbitrary polynomial of the positions and magnetic moments can be represented as all possible contractions of the following Moment Tensors,

$${M}_{\zeta ,\nu ,\beta ,\gamma ,\xi ,\eta }=\mathop{\sum }\limits_{j=1}^{{N}_{{{{\rm{nb}}}}}^{i}}{Q}_{\zeta }(| {{{{\boldsymbol{r}}}}}_{ij}| )| {s}_{i}{| }^{\beta }| {s}_{j}{| }^{\gamma }\ ({{{{\boldsymbol{r}}}}}_{ij}^{\otimes \nu })\,\otimes \,({{{{\boldsymbol{s}}}}}_{i}^{\otimes \xi })\,\otimes \,({{{{\boldsymbol{s}}}}}_{j}^{\otimes \eta }),$$
(5)

where Qζ is the ζ-th radial basis function and by definition

$${{{{\boldsymbol{v}}}}}^{\otimes n}=\underbrace{{{{\boldsymbol{v}}}\otimes ...\otimes {{{\boldsymbol{v}}}}}}_{\begin{array}{c}n\,{{\mbox{times}}}\end{array}}$$

for an arbitrary vector v. We note that Eq. (5) directly corresponds to ref. 36, Eq. 26 with spherical harmonics \({{{{\boldsymbol{Y}}}}}_{l}^{m}({{{\boldsymbol{v}}}}/| {{{\boldsymbol{v}}}}| )\) instead of tensors \({{{{\boldsymbol{v}}}}}^{\otimes n}\) (lmn).

We next assume scalar-valued spins, i.e., that it is sufficient to consider ξ = η = 0 (and absorb the sign of the spin into the radial part):

$${M}_{\zeta ,\nu ,\beta ,\gamma }=\mathop{\sum }\limits_{j=1}^{{N}_{{{{\rm{nb}}}}}^{i}}{Q}_{\zeta }(| {{{{\boldsymbol{r}}}}}_{ij}| ){s}_{i}^{\beta }{s}_{j}^{\gamma }\ ({{{{\boldsymbol{r}}}}}_{ij}^{\otimes \nu }).$$
(6)

We could choose to directly expand the energy over different contractions of tensors Mζ,ν,β,γ, but instead we combine different products of radial and spin basis functions, \({Q}_{\zeta }(| {{{{\boldsymbol{r}}}}}_{ij}| ){s}_{i}^{\beta }{s}_{j}^{\gamma }\), into the functions fμ(rij, si, sj) with coefficients \({c}_{\mu }^{\beta ,\gamma ,\zeta }\) that are found from data, as explained in the main text of the manuscript. Note that in this work we use Chebyshev polynomials ψβ(s) instead of monomials sβ.

DFT details

All DFT calculations were performed with VASP44,45,46,47 utilizing the projector augmented wave (PAW) method48 and the generalized gradient approximation49. For the training set of the 54-atom supercell, we considered 70 atomic configurations generated from an initial ferromagnetic MD at 1000 K. For each of these atomic configurations, 200 different arrangements of magnetic spins were initialized, of which 67% converged under the high cutoff energy of 500 eV and a k-point density of 11,664 k-points × atoms (6 × 6 × 6 grid), chosen in combination with a convergence criterion of \({10}^{-7}\) eV per supercell to ensure highly accurate DFT data. This resulted in a total of 9351 calculations. To impose spin-inversion symmetry, we added the same number of configurations with reversed spin directions to the training set. The DFT calculations were performed at a lattice parameter of 2.9 Å corresponding to the experimental value near the Curie temperature. The phonon calculations have been performed utilizing the finite-displacement method with a displacement of 0.02 Å and utilizing the same set of technical parameters.
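The spin-inversion augmentation of the training set can be sketched in a few lines. The sketch assumes that each training entry is a dictionary with a `spins` list and that all other labels (energy, forces, stresses) are carried over unchanged to the spin-reversed copy; the data layout and the function name are hypothetical.

```python
def add_spin_inverted(configs):
    """Return the training set extended by a spin-reversed copy of every configuration.

    Assumption: energies, forces, and stresses are invariant under a global
    reversal of all collinear spins, so those labels are reused as they are.
    """
    augmented = []
    for cfg in configs:
        augmented.append(cfg)
        flipped = dict(cfg)  # shallow copy; only the spins entry is replaced
        flipped["spins"] = [-s for s in cfg["spins"]]
        augmented.append(flipped)
    return augmented
```

The augmented set has twice as many entries as the original one, consistent with adding the same number of reversed-spin configurations.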

Disordered-local moment thermodynamic integration from mMTP to DFT

To sample the paramagnetic state at finite temperatures within the framework of thermodynamic integration (TI), we have employed the disordered-local-moment (DLM) MD42. The local magnetic moments were flipped randomly every 10 fs (= 10 MD steps) such that half of the moments were pointing up and the other half down. The timestep for the MD was set to 1 fs, small enough to sample well the time development of the magnetic moments also within the 10 fs time intervals. The temperature was controlled by the Nosé thermostat50. Usage of the Nosé thermostat was critical; tests with the Langevin thermostat showed that it cannot stabilize the temperature well due to the additional impact of the spin flips on the energy of the system.
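The spin-flip schedule described here can be sketched as a generator that redraws a random half-up, half-down sign pattern every 10 MD steps. Only the bookkeeping of the sign pattern is shown; moment magnitudes, their relaxation, and the MD integration are omitted, and all names are hypothetical.

```python
import random

def random_dlm_signs(n_atoms, rng):
    """Random pattern with half of the moments up (+1) and the other half down (-1)."""
    signs = [1] * (n_atoms // 2) + [-1] * (n_atoms - n_atoms // 2)
    rng.shuffle(signs)
    return signs

def dlm_schedule(n_steps, n_atoms, flip_every=10, seed=0):
    """Yield (step, signs); the sign pattern is redrawn every `flip_every` MD steps."""
    rng = random.Random(seed)
    signs = None
    for step in range(n_steps):
        if step % flip_every == 0:
            signs = random_dlm_signs(n_atoms, rng)
        yield step, signs
```

For the 16-atom supercell and a 1 fs timestep, `dlm_schedule(n_steps, 16)` keeps eight moments up and eight down at all times and changes the pattern every 10 fs.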

Spin-polarized DFT calculations in general and DLM calculations for Fe in particular are very prone to convergence problems, due to a flat energy landscape with many local minima as a function of spin state. Therefore, for the calculation of the DFT energy and forces during the MD, a very tight convergence criterion of \({10}^{-7}\) eV per supercell was set, in order to enforce sampling of the original DFT energy landscape that served as the input to the mMTP fitting. To nevertheless allow for an efficient DFT MD, we have restricted the number of electronic iteration steps (typically to 40). DFT calculations that did not fully converge were omitted from the comparison to the mMTP. Likewise, DFT calculations featuring local moment flips with respect to the mMTP data were not considered in the comparison.

To increase the efficiency of the DFT DLM-MD simulations we found it necessary to turn off the wave function extrapolation (both linear and quadratic), because the spins are randomized along the MD trajectory. A further efficiency increase was achieved by equilibrating the MD at the temperature of interest by utilizing the efficient mMTP. In this way, the part of the MD involving the expensive DFT calculations started directly on a well-equilibrated trajectory.

The DLM-TI was performed at a temperature of 800 K and at a lattice constant of 2.9 Å. A supercell of 2 × 2 × 2 (in terms of the conventional bcc unit cell) with 16 atoms was utilized. A dense k-point sampling of 6 × 6 × 6 corresponding to 3456 k-points × atom, a plane wave cutoff of 500 eV, and Fermi-Dirac smearing were used for the DFT calculations. For the mMTP calculations, initial magnetic moments were set according to the DFT moments. Then, for every mMTP energy and force calculation, the magnetic moments were fully relaxed based on the mMTP energetics. Coupling constants of λ = 0.0, 0.5, 1.0 were utilized. At each coupling constant two different random seeds were used to generate distinct trajectories. In total, more than 22,000 MD steps (22 ps) were conducted to generate statistically highly reliable correlation plots as shown in Fig. 3 of the main text. Test calculations for a larger 3 × 3 × 3 supercell with 54 atoms turned out to be computationally highly demanding due to the strict DFT convergence parameters. Corresponding results indicate however a similar performance of the mMTP also for the larger supercell.