Introduction

From a fundamental point of view and in the perspective of various applications, it is of utmost importance to be able to accurately calculate magnetic exchange interactions. These interactions are essential in a theoretical description for everything from the magnetic ground-state to dynamical and thermodynamical properties for all kinds of materials in different geometries. This is especially important in the field of spintronics, which involves the study of active control and manipulation of spin degrees of freedom in solid-state systems1. In particular, when modeling magnon spintronics that utilizes spin waves to carry spin current2,3. The present study focuses on developing a method that can properly describe the magnetic interactions in a wide range of systems, with collinear or non-collinear magnetic structures.

Calculations of magnetic exchange interaction are frequently based on multi-scale modelling described by a phenomenological, classical or quantum, Heisenberg model fitted to total energy calculations by e.g. density functional theory. Building on the formulation of Andersen’s force theorem4,5,6, the works by Liechtenstein et al.7 and Oswald et al.8 introduced the magnetic force theorem (MFT). The general formulation by Liechtenstein et al.7,9 is one of the most widely used methods for the determination of inter-atomic exchange interactions since it may be applied to any collinear magnetic configuration. The great utility of the theorem for the electronic structure community is due to the fact that the exchange interactions are determined from non-self consistent calculations that are orders of magnitude faster than most self-consistent approaches. However, further improvements to the method for calculating inter-atomic magnetic exchange interactions are still widely discussed10,11,12,13,14,15.

Developments in time-dependent density functional theory (TD-DFT) and many body perturbation theory have made it possible to access the full dynamical magnetic susceptibility16,17,18,19, which gives detailed information on both magnon and Stoner excitations, the latter not being accessible by the different versions of MFT calculations that operate in the adiabatic approximation. However, by a multi-code and multi-scale approach using ab initio calculations, Monte Carlo and atomistic spin-dynamics simulations, magnon excitations and magnetic phase transitions may be accurately simulated for larger systems20,21,22.

The MFT relies on the assumption of small changes in the magnetisation and charge density. In addition, usually the internal magnetic fields are varied in the calculations rather than the directions of the moments. Short wavelength excitations are, therefore, determined less accurately and the calculations are said to be performed in the long wavelength approximation (LWA) (for a more extensive discussion see the work of Antropov et al.23,24). Various strategies to avoid the LWA have been suggested over the years23,24,25,26. In particular, developments by Patrick Bruno25 using “constrained” density functional theory introduced by Dederichs et al.27, included constraining magnetic fields acting on each atomic site. These fields reinforce the directions of the small transverse spin displacements used to probe the exchange interactions. It was suggested that by neglecting the constraining fields, one obtains a set of “bare” exchange parameters, which differ from the real exchange parameters. The inclusion of the constraining fields leads to a “renormalisation” of the spin-wave spectrum and the Curie temperature. The work by Katsnelson et al.26 showed that while the renormalisation suggested by Bruno25 gives better thermodynamic properties, the improvement in the magnon dispersion is questionable since the renormalisation should be small in every case where the adiabatic approximation is valid. Moreover, it is also interesting to compare the results with those obtained using the disordered local moment (DLM) formalism28,29 that also operates beyond the LWA.

The MFT is usually applied in the frozen magnon method30, where exchange interactions are determined from energy differences or evaluation of the spin-torque for different spin spiral configurations. It has been noted that in the absence of constraining fields (converged in self-consistent spin spiral calculations), a mismatch is introduced between the desired and the resulting directions of the magnetic moments, thus leading to a significant unphysical spin-torque31,32. Analogous to Bruno’s correction, this can be solved by adding an additional term to the potential, that includes the constraining fields. This procedure will be referred to as the corrected frozen magnon method in the continuation of the text. It should be noted that the constraining fields are equal to the Heisenberg exchange fields that act on the magnetic moments. It is, therefore, quite possible to derive the exchange parameters directly from the self-consistently converged constraining fields31,32. This approach is used in the present work to formulate the transverse-field method that is found to be preferable to the frozen magnon procedures due to better scaling and the avoidance of a frozen potential. A possible exception could be for systems with strong relativistic effects, where spin-orbit coupling can be introduced through perturbation theory33,34. Furthermore, it is well-suited to parameterise finite deviations from the ground state since no assumption of small changes in magnetisation and charge density is necessary.

After setting the grounds of the transverse-field method, the next step is to extend the Hamiltonian by including multi-spin interactions and see to what extent the increased flexibility improves the predictive power of the model for finite deviations. Interactions beyond the Heisenberg model have made the focus of many studies in the field of materials science10,35,36,37,38. However, it is important to avoid overfitting when constructing effective models. A simple and effective strategy is to use cross validation39, where the calculated data can be used both to parameterise an effective model and to estimate the predictive power of the model.

When one aims at an automatic scan of the magnetic behaviour of a large number of compounds, following the lines of high-throughput computational materials design40, it is advantageous to have a general Hamiltonian and a method of parameterisation that can accurately capture the magnetic phase space for many different compounds at a reasonable computational cost. The transverse field method presented in this work fulfills those requirements. Given a computationally designed, still non-existing material, the developed method allows for a fast calculation of exchange parameters and, importantly, the cross-validation offers a programmable decision maker that will automatically determine which interactions in the extended Heisenberg model are important.

The article is structured in the following way: In “Introduction”, we present different methods for obtaining the exchange interactions. We also introduce an extended Heisenberg models and discuss the predictive power of the cross-validation method for these models. The computational details are described in “Theory”. In “Computational details” we present calculated energies and exchange interactions for three different systems: bcc Fe, fcc Ni and the B2 FeCo alloy.

Theory

The frozen magnon methods

As mentioned in the Introduction, one way of obtaining the exchange interactions is by means of the frozen magnon method. Within this technique, one needs to construct different spin-spiral configurations and compare their energies in order to evaluate the exchange parameters. At a closer look, one can observe a discrepancy between the input-direction of the magnetic moments and their actual orientation. This leads to an undesired effect, namely an unphysical spin-torque acting on the magnetic moments. In order to avoid this, the constraining fields are introduced.

In the augmented plane-waves (APW) based methods, the crystals are divided into spherical muffin-tin regions around the atomic sites and interstitial regions between the sites. The constraining fields (\({\mathbf {B}}^\mathrm {C} _i\)) are introduced in the Kohn–Sham equation,

$$\begin{aligned} \left\{ -\frac{\hbar ^2}{2m}\nabla ^2+V^\mathrm {eff}({\mathbf {r}})+\varvec{\sigma } \cdot \mathbf { B}^\mathrm {eff}({\mathbf {r}})-\epsilon _{\nu }\right\} \psi _{\nu }({\mathbf {r}})=0 \end{aligned}$$
(1)

as uniform vector fields in the muffin-tins in addition to the magnetic field \({\mathbf {B}}^\mathrm {XC}({\mathbf {r}})\), generated by the exchange-correlation potential,

$$\begin{aligned} {\mathbf {B}}^\mathrm {eff}({\mathbf {r}})={\mathbf {B}}^\mathrm {C}_i+{\mathbf {B}}^\mathrm {XC}(\mathbf { r}). \end{aligned}$$
(2)

The constraining fields \({\mathbf {B}}^\mathrm {C}_i\), are the Lagrange multipliers necessary to perform the energy minimisation under the constraint of the specific magnetisation density41. In the self-consistency cycle, they are converged together with the rest of the potential such that the components of the total integrated magnetic moments perpendicular to the chosen direction of the moments are zero

$$\begin{aligned} {\mathbf {B}}^\mathrm {C}_{i,p+1} = {\mathbf {B}}^{\mathrm {C}}_{i,p} + \lambda ( {\mathbf {m}}_{i,p} - {\mathbf {m}}_{i,chosen}), \end{aligned}$$
(3)

where i is the site index, \({\mathbf {m}}_i\) is the magnetic moment at site i, p is the iteration number and \(\lambda\) is a scaling factor. The procedure still allows for intra-atomic non-collinearity.

When calculating the total energy of a cone-spin-spiral, it is usually necessary to add constraining magnetic fields to each magnetic sub-lattice in order to preserve the cone-angle during the self consistency cycles. Exceptions are systems with the cone-spin-spiral ground state42 since the constraining fields are equal to the transverse part of the Heisenberg exchange fields. Non-zero transverse exchange fields imply that the system is not in equilibrium.

In the frozen magnon methods, a number of reference \(\Gamma\)-point spin configurations are converged self-consistently. Configurations where there is a non-zero cone angle of the magnetic moment on either a single magnetic sub-lattice or a pair of magnetic sub-lattices are considered here. These are only collinear if all magnetic sub-lattices are tilted with the same angle. In some previous publications22,43, these reference states were obtained by rotations of the potential of a collinear state. Since only the potentials in the muffin-tins were rotated, this left a discontinuity in the magnetisation density at the border between the muffin-tins and the interstitial region. It was therefore found that better agreement with self-consistent calculations was achieved by setting the interstitial magnetisation to zero22.

The procedure followed in this work makes it possible to keep a non-zero interstitial magnetisation since we don’t perform any rotations of the potential. On the other hand, it leads to a moderate increase in computational cost for systems with more than one magnetic sub-lattice since several self-consistent non-collinear calculations have to be carried out. Once the potentials for the reference configurations are obtained, non-self consistent total energy calculations are done for a set of spin-spiral wave vectors using the converged \(V^\mathrm {eff}({\mathbf {r}})\) and \({\mathbf {B}}^\mathrm {XC}({\mathbf {r}})\) from the reference configurations and \({\mathbf {B}}^\mathrm {C}({\mathbf {r}})\) set to zero. The total energy differences \(\Delta E ({\mathbf {q}})\) are approximated as the difference in the sums of eigenvalues through the MFT9

$$\begin{aligned} \Delta E ({\mathbf {q}})\approx \sum _{\nu } \epsilon _{\nu }({\mathbf {q}})-\sum _{\nu } \epsilon _{\nu }({\mathbf {0}}). \end{aligned}$$
(4)

In the last step, exchange parameters \(J_{ij}\) of a Heisenberg Hamiltonian are obtained through a least square fitting procedure43

$$\begin{aligned} \Delta E ({\mathbf {q}})=\frac{1}{2} \sum _{i\ne j}J_{ij}({\mathbf {e}}_{i}({\mathbf {q}})\cdot {\mathbf {e}}_{j}({\mathbf {q}})-{\mathbf {e}}_{i}({\mathbf {0}})\cdot {\mathbf {e}}_{j}({\mathbf {0}})). \end{aligned}$$
(5)

Here, the direction of the magnetic moments of a sub-lattice with a cone-angle \(\theta\) is given by the expression,

$$\begin{aligned} {\mathbf {e}}_{i} ({\mathbf {q}}) = \sin (\theta _{i}) \cos ({\mathbf {q}}\cdot {\mathbf {R}}_{i} + \phi _{i}){\mathbf {x}} + \sin (\theta _{i}) \sin ({\mathbf {q}}\cdot {\mathbf {R}}_{i} + \phi _{i}){\mathbf {y}} + \cos (\theta _{i}){\mathbf {z}}, \end{aligned}$$
(6)

where \(\phi _i\) are the phase factors, \({\mathbf {q}}\) is the spin-spiral wave-vector and \({\mathbf {R}}_i\) are the lattice vectors.

In the corrected frozen magnon method an extra calculation step is done compared to the conventional method. First, \(V^\mathrm {eff}({\mathbf {r}})\) is kept fixed and \({\mathbf {B}}^\mathrm {C}({\mathbf {r}})\) is converged separately for each spin-spiral wave vector. In the final total energy calculations, \(V^\mathrm {eff}({\mathbf {r}})\) and \({\mathbf {B}}^\mathrm {XC}({\mathbf {r}})\) are kept fixed, as in the conventional method, but now the pre-converged \({\mathbf {B}}^\mathrm {C}_{i}\) are added for each one-shot calculation of the sums of eigenvalues. There are slight numerical differences between the constraining fields obtained in this way compared to fields obtained fully self-consistently. However, this procedure is much faster than a fully self-consistent calculation and no significant difference was obtained with respect to the total energy differences for the different compounds studied in this work. A noteworthy difference between the two frozen magnon methods is that, while \({\mathbf {e}}_{i}\) is roughly parallel to \({\mathbf {B}}^\mathrm {XC}({\mathbf {r}})\) in both cases, only in the corrected method is \({\mathbf {e}}_{i}\) also guaranteed to be the direction of the integrated moment after a single iteration of the electronic structure code. For materials and magnetic states, where the self-consistently converged magnetic fields fulfil \({\mathbf {B}}^{\mathrm {C}}_{i}~\gtrsim ~{\mathbf {B}}^\mathrm {XC}({\mathbf {r}})\), the correction to the conventional frozen magnon method can be expected to be important. The constraining field \({\mathbf {B}}^\mathrm {C}({\mathbf {r}})\) compensates for the spin-torques the system experiences when it is forced to assume magnetic configurations different from the ground state structure. The correction will thus tend to be more relevant for itinerant magnetic systems with small moments, materials with strong exchange interactions or magnetic configurations far from the ground state.

Exchange parameters from a transverse field

In order to obtain exchange parameters from self-consistent quantities one could perform fully self consistent spin-spiral total energy calculations and extract the exchange parameters from Eq. (5). This is however computationally expensive even for systems that can be described with a small unit-cell if we want to use an accurate all-electron code. A faster procedure, that is also fully self-consistent, was worked out by Grotheer et al.31,32. Their approach exploits the fact that the constraining fields are equal and opposite to the transverse part of the Heisenberg exchange fields. We use this as a starting point and develop a method which we coin the transverse field method (TF).

The exchange field \({\mathbf {H}}_{0i}\) acting on a magnetic moment \({\mathbf {m}}_{i}\) is given by:

$$\begin{aligned} {\mathbf {H}}_{0i}=-\frac{1}{ m_{i}} \sum _{j}J_{ij}{\mathbf {e}}_{j}. \end{aligned}$$
(7)

Once the constraining fields and magnetic moments are converged for a number of magnetic configurations, we can solve the following system of equations for the exchange parameters, \(J_{ij}\):

$$\begin{aligned} {\mathbf {B}}^\mathrm {C}_{i} \cdot {\mathbf {B}}^\mathrm {C}_{i} = \frac{1}{ m_{i}} \sum _{j}J_{ij} {\mathbf {B}}^\mathrm {C}_{i} \cdot {\mathbf {e}}_{j}. \end{aligned}$$
(8)

Relative to previous studies, besides the difference in starting point (i.e. real space vs. reciprocal space), a further difference is the choice of the free variable, which is the constraining field \({\mathbf {B}}^\mathrm {C}_{i}\) in our method, while in Grotheer’s et al. approach31,32 it is the cone-angle. The transverse field method has a better scaling with respect to the size of the system than the frozen magnon method has, since the number of determined variables is equal to the number of magnetic atoms for each calculation, while in the latter method only a single variable is determined in each calculation. While the constraining fields in the transverse field method are converged self-consistently, which is computationally more demanding compared to non-self-consistent methods, this is compensated well by the fact that the constraining fields are robust quantities that are usually much faster to converge with respect to the computational parameters than small total energy differences44.

In Fig. 1 we present a diagram that summarises the two methods that we developed, namely the corrected frozen magnon and the transverse-field method in comparison to the canonical frozen magnon method. A detailed comparison and analysis of exchange parameters obtained within these three methods is given in “Computational details”.

Figure 1
figure 1

Three different methods for obtaining exchange parameters: the frozen magnon, the corrected frozen magnon and the transverse-field method. Due to the fact that the constraining fields in the corrected frozen magnon method are converged self-consistently, while the potential is kept fixed, we call these fields pseudo-self-consistent.

Beyond the Heisenberg model

The regular Heisenberg model performs poorly when large deviations from the ground state are considered for itinerant magnetic systems in the sense that the convergence of the residuals with respect to the number of parameters is slow compared to more complex models45. In the final section (“Computational details”), the good scaling of the constraining field method is therefore applied to parameterise models that—besides the bilinear terms—also contain higher order interactions such as bi-quadratic, three- and four-spin interactions. The exchange fields of two models are considered here in addition to the regular Heisenberg exchange field \({\mathbf {H}}_{0}\) (Eq. (7)). First, \({\mathbf {H}}_{0}\) together with bi-quadratic, three- and four-spin interactions, called \({\mathbf {H}}_{2}\):

$$\begin{aligned} {\mathbf {H}}_{2i}=-\frac{1}{m_{i}}\left( \sum _{j}J_{ij}{\mathbf {e}}_{j} +\frac{1}{2} \sum _{j,k \ne l}B_{ijkl}({\mathbf {e}}_{j})({\mathbf {e}}_{k}\cdot {\mathbf {e}}_{l})\right) \end{aligned}$$
(9)

and then \(\mathbf{H}_{2}\) restricted to bi-quadratic interactions only, called \(\mathbf{H}_{1}\):

$$\begin{aligned} \mathbf{H}_{1i}=-\frac{1}{m_{i}}\left( \sum _{j}J_{ij}{} \mathbf{e}_{j} +\sum _{j}B_{ijji}(\mathbf{e}_{j}) ( \mathbf{e}_{i} \cdot \mathbf{e}_{j})\right) \end{aligned}$$
(10)

where \(J_{ij}\) and \(B_{ijkl}\) are the exchange parameters. Similarly to Eq. (8), the parameters can be related to the constraining fields. For example, applied to the most general model \({\mathbf {H}}_{2}\) (Eq. (9)), this gives:

$$\begin{aligned} {\mathbf {B}}^\mathrm {C}_{i}\cdot {\mathbf {B}}^\mathrm {C}_{i} = \frac{1}{m_{i}}\sum _{j}J_{ij}{\mathbf {B}}^{C}_{i} \cdot {\mathbf {e}}_{j} + \frac{1}{2m_{i}} \sum _{j,k\ne l}B_{ijkl}({\mathbf {B}}^\mathrm {C}_{i} \cdot {\mathbf {e}}_{j} )( {\mathbf {e}}_{k}\cdot {\mathbf {e}}_{l}). \end{aligned}$$
(11)

In order to address the problem of overfitting and access the predictive power of the models, a leave-one-out cross-validation analysis is performed (“Computational details”). Each single data point is left out in turn from the set, and the parameters are then extracted from the remaining set of data points and used to predict the value of the left-out data point.

In the regular Heisenberg model, the terms are usually ordered and included in the model according to the relative distances of the atomic pairs. For more complicated models, additional conventions have to be made. In \({\mathbf {H}}_{1}\) and \({\mathbf {H}}_{2}\) the higher order terms are ordered according to the sum of all the relative distances between the members of the moment-quadruples. Hence, the nearest neighbour bi-quadratic interaction is always the first higher order term included in the model. In addition, rules are introduced that order the \(J_{ij}\) with respect to the \(B_{ijkl}\) coefficients. In the \({\mathbf {H}}_1\) model, \(B_{ijji}\) is counted before \(J_{ij}\) if the sum of the relative distances between the members of the moment-quadruple that define \(B_{ijji}\) is less than two times the distances between the pair that defines \(J_{ij}\). For \({\mathbf {H}}_2\), the factor two is increased to a factor four. Essentially the two dimensional problem with possible different length cutoffs for bilinear and higher order interactions is reduced to one dimension by the use of a fixed ratio. The drawback of this simplification is of course that the possible predictive power of the more complex models might be underestimated. These factors may be optimised and made specific to the material and the chosen sampling of non-collinear states using the cross validation procedure. However an optimisation using our data on bcc-Fe resulted in similar ratios between the cutoff lengths (four and two), so these rules are used in this work in order to simplify the comparisons to the standard Heisenberg model. In general it is recommended to do the full model optimisation since it may identify the cases where the ground state is stabilised by higher order interactions and will further increase the predictive power of the model.

Computational details

The main results presented in this paper are obtained with the new formalism described above as implemented in the full potential APW+lo Elk code46. The computational details for the three systems bcc Fe, fcc Ni and FeCo are the same, except for the number of \({\mathbf {k}}\)-points. The local spin density approximation (LSDA) functional by Perdew and Wang47 is used throughout this work. For the band energy calculations a \(31\times 31\times 31\) \({\mathbf {k}}\)-point mesh is used for fcc Ni and FeCo while a \(41\times 41\times 41\) \({\mathbf {k}}\)-point mesh is used for bcc Fe. The constraining fields are obtained using a \(15\times 15\times 15\) \({\mathbf {k}}\)-point mesh for fcc Ni and FeCo, while a \(21\times 21\times 21\) \({\mathbf {k}}\)-point mesh is used for bcc Fe. The muffin-tin radius was set to 1.23 Å for all atoms. The angular cutoff for the APW functions, for the muffin-tin potential and the density was set to 9. The maximum length of the \(\{{\mathbf {G}}+{\mathbf {k}}\}\)-vectors was regulated by fixing its product with the average muffin-tin radius to 9. The maximum length of the \(\{{\mathbf {G}}\}\)-vectors describing the interstitial potential and the density was set to 9.5 Å\(^{-1}\). We utilise unrestricted intra-atomic non-collinearity in the calculations. The experimental lattice constants of 2.87 Å, 3.50 Å and 2.83 Å were considered for bcc Fe, fcc Ni and B2 FeCo, respectively.

For the calculations of exchange parameters, 100 spin-spirals were considered with randomly generated wave-vectors for bcc Fe, while 90 spin-spirals were considered for fcc Ni and FeCo. Cone-angles of 0.25 rad on one or two magnetic sub-lattices were used for all the results in this paper, while angles up to 0.5 rad were considered to ensure that our results did not depend sensitively on the chosen angles. The constraining fields were considered converged when the change of the constraining fields between two successive iterations was less than 0.5%. For FeCo we also calculate the exchange parameters using the Korringa–Kohn–Rostocker (KKR) method and the MFT method as developed by Lichtenstein et al.9 and implemented in the Münich SPR-KKR band structure program package48,49. The disordered local moment (DLM) model has been employed for describing the paramagnetic state of Fe28,29. This model treats spin-up and spin-down components in equal concentration, assuming a completely random distribution of magnetic moments for each magnetic element, in the sense that the correlation between site occupancies of spin-up and spin-down moments is absent. The electronic structure of the DLM-state has been obtained using the coherent potential approximation (CPA)50,51, which accurately describes disordered systems in the single-site approximation.

In order to sample a large variety of non-collinear states for the different beyond-Heisenberg models and to avoid issues of linear dependency between higher order parameters, the cubic 16-atom unit cell was considered for bcc Fe. Here 125 \({\mathbf {k}}\)-points were evaluated and otherwise the same set of computational parameters used previously, was employed. The constraining fields and magnetic moments were converged for five sets of magnetic configurations. The cone angles \(\theta _{i}\) for the 16 atoms were randomised in the intervals \((n-1)\pi /10< \theta _{i} < n\pi /10\) for the five sets defined by \(1 \le n \le 5\). Furthermore, the phase factors \(\phi _{i}\) were randomised between 0 and \(2\pi\) and a spin-spiral with a random wave-vector applied for each configuration. With increasing n increasingly disordered states are considered from the almost collinear states of \(n=1\) to the almost paramagnetic \(n=5\). In total, 100 different magnetic configurations were considered.

Results

In order to evaluate the accuracy of the spin-spiral total energy differences using the MFT with and without the use of constraining fields, spin-spiral total energies calculated fully self-consistently were used as a benchmark. Three systems were considered, bcc Fe, fcc Ni and B2 FeCo, with results presented here. For each system, three different methods were used for calculation of the total energy: (i) non-self-consistent calculations without any constraining fields (referred in the following as NSC and marked with blue circles); (ii) non-self-consistent calculation with constraining fields (referred to as NSC-F, green triangles) and (iii) self-consistent calculations (referred to as SC, red squares). It should be noted that, only for bcc Fe and fcc Ni are the spin-spiral dispersions directly related to the adiabatic magnon dispersion through a scaling factor since they—unlike FeCo—have a single magnetic sub-lattice. The exchange parameters are obtained from the regular Heisenberg exchange field, \({\mathbf {H}}_{0}\) (Eq. (7)). The NSC calculations employ the conventional frozen magnon method, while the NSC-F employ the corrected frozen magnon method and the SC calculations employ the transverse field method (see “Introduction”).

Bcc Fe

We calculated the spin-spiral energy dispersion in bcc Fe along the H–\(\Gamma\)–P direction for all the three cases mentioned before: NSC, NSC-F and SC (Fig. 2a). The magnetic field \({\mathbf {B}}^\mathrm {eff}({\mathbf {r}})\) is aligned closer to the z-axis when the constraining fields are neglected and thus, it corresponds to a situation of smaller cone-angles. This results in lower total energy differences with respect to \({\mathbf {q}}=0\) and it is why the NSC-F energies lie above the NSC curve.

Figure 2
figure 2

(a) Spin-spiral energy dispersion in bcc Fe. The blue circles and green triangles represent non-self-consistent total energies calculated without- (blue) and with (green) constraining fields. The red squares represent self-consistent total energies. Dashed lines represent fitted polynomials. (b) Exchange parameters for bcc Fe. The blue circles and green triangles represent exchange parameters calculated non-self-consistently without (blue) and with (green) constraining fields. The red squares represent exchange parameters calculated with the transverse field method.

The behaviour is consistent with the results of previous studies where quantities such as exchange parameters, magnon energies and the Curie temperatures \(T_{C}\) are increasing when the MFT calculations are corrected for the LWA23,24,25,26. Indeed, we also obtained a higher \(T_{C}\) for bcc Fe with the corrected frozen magnon method than with the conventional one as can be seen in Table 1.

Table 1 The Curie temperatures (\(T_{C}\) in K) for bcc Fe, fcc Ni and B2 FeCo.

It is worth noting that we have intra-atomic collinearity in our non-self consistent calculations for bcc Fe (and for fcc Ni) since we have only a single magnetic sub-lattice in this case and therefore, have a collinear \(\Gamma\)-point reference state. In a previous study,24 it was shown that suppressing intra-atomic non-collinearity in bcc Fe gives rise to an increase of the calculated \(T_{C}\) when derived from self-consistent spin spiral total energies. This observation is consistent with our results since it suggests that the non-self consistent calculations—where we have intra-atomic collinearity—should get higher total energy differences compared to the self consistent calculations. This may explain why the agreement with self-consistent spin-spiral total energies decreases somewhat with the introduction of the fields in the non-self-consistent calculations. This is likely due to the fact that the error cancellation is lost between the neglect of the fields (leading to smaller effective cone-angles and thus energies) and the imposition of intra-atomic collinearity (that increases the energies of the spin-spirals).

The most pronounced difference in the exchange parameters (Fig. 2b) is the larger nearest neighbour interaction of the NSC-F case compared to the NSC case. This difference is of similar size to the one obtained in previous studies between methods that worked in the LWA and those that went beyond, using linear response theory23 or self-consistent spin-spirals24. Similarly, the error cancellation can explain the excellent match obtained between self-consistently converged exchange parameters and the exchange parameters from previous studies using the MFT53,54,55. In fact, an almost perfect match is obtained with the results of Bergqvist53.

It should be noted, however, that bcc Fe is not very well described in a strict Heisenberg model. The exchange parameters are configuration-dependent and specifically the ratio between the dominating nearest- and second nearest neighbour interactions is known to depend sensitively on the cone-angle of the spin-spirals56. Some differences between the results of various calculations available in the literature can be expected for that reason. To avoid the configuration-dependence, a more complex Hamiltonian has to be considered45. This issue will be discussed in further detail later in the article.

Fcc Ni

As in the case of bcc Fe, we have intra-atomic collinearity in the non-self-consistent calculations for fcc Ni as well. Introducing the pre-converged constraining fields into the non-self-consistent calculations gives results that match excellently with self-consistent calculations. This is exemplified by the green and red curves in Fig. 3a. In the meantime, not including the constraining fields produces a dispersion curve that is significantly lower.

Figure 3
figure 3

(a) Spin-spiral energy dispersion in fcc Ni. The blue circles and green triangles represent non-self-consistent total energies calculated without- (blue) and with (green) constraining fields. The red squares represent self-consistent total energies. Dashed lines represent fitted polynomials. (b) Exchange parameters in fcc Ni. The green triangles and blue circles represent exchange parameters calculated non-self-consistently with- (green) and without (blue) constraining fields. The red squares represent exchange parameters calculated with the transverse field method.

This means that correcting for the LWA removes almost entirely the large differences between non-self consistent and self-consistent results. As a consequence, the exchange parameters of the corrected frozen magnon method correspond more closely to the exchange parameters obtained by the transverse field method, shown in Fig. 3b. There is a substantial difference between the size of the nearest neighbour interaction obtained in our study when derived from the transverse field or the corrected frozen magnon method and previous self-consistent results24. We obtain also a substantially weaker nearest neighbour interaction compared to results of linear response theory23. However, it is clear from Fig. 3a that the NSC-F energy differences are close to the self-consistent results, which indicates that the mapping of the self-consistent total energy landscape is correct (proven for the high symmetry lines probed here). This is further reinforced by the close agreement between the exchange parameters calculated self-consistently from the transverse field and the ones obtained from the corrected frozen magnon method. Note that our results for the \(T_{C}\) give a decent match to previous time-dependent density functional simulations57. It is well-known that the \(T_{C}\) of fcc Ni is underestimated by MFT calculations that operate in the LWA and employ the LSDA functional23,24,25,30,52. The transverse field method and corrected frozen magnon method give a much improved prediction of the \(T_{C}\) (Table 1), but still fall short of the experimental value.

Constraining fields were also calculated for configurations with larger angles and it was found that the dominating nearest neighbour interaction tends to weaken with the onset of magnetic disorder, while no other interactions grew substantially in size. This indicates that a more thorough sampling of non-collinear states will rather decrease (than increase) the predicted transition temperature. The discrepancy between theory and experiments should be sought elsewhere. This further strengthens the case that the LSDA functionals underestimate the exchange interactions or that non-adiabatic processes and longitudinal fluctuations not captured by the Heisenberg model also play an important role in determining the \(T_{C}\) in fcc Ni57,58.

B2 structured FeCo

In FeCo, we have two different magnetic sub-lattices present. When analysing the energy dispersions for this compound, two different magnetic configurations were considered: (i) when the moments are tilted simultaneously on both magnetic sub-lattices (Fig. 4a) and (ii) when the moments are only titled on one magnetic sub lattice either Fe (Fig. 4b) or Co (Fig. 4c), while the orientations of the moments on the other sub-lattice are kept fixed.

Figure 4
figure 4

Spin-spiral energy dispersion curves in B2 FeCo, when: (a) both the magnetic moments on the Co and Fe sub-lattices are tilted, (b) with only the magnetic moments on the Fe magnetic sub-lattice tilted and (c) with only the magnetic moments on the Co magnetic sub-lattice are tilted. The blue circles and green triangles represent non-self-consistent total energies calculated without- (blue) and with (green) constraining fields. The red squares represent self-consistent total energies. Dashed lines represent fitted polynomials. Also shown are exchange interactions between: (d) the two magnetic sub-lattices, Fe and Co, (e) within the Fe magnetic sub-lattice and (f) within the Co sub-lattice. The green triangles and blue circles represent exchange parameters calculated non-self-consistently with (green) and without (blue) constraining fields. The red squares and black diamonds represent exchange parameters calculated with the transverse field method and the method of Lichtenstein et al.9, respectively.

The NSC-F dispersion curves correspond considerably more accurately to the SC case than the NSC for FeCo. The difference becomes especially pronounced qualitatively when only the Fe (Fig. 4b) or the Co (Fig. 4c) sub-lattice is tilted. It needs to be pointed out that the energy scale is much smaller in this case compared to the case where both magnetic sub-lattices are tilted (Fig. 4a). The negative values of the energies arise due to the fact that the \(\Gamma\)-point configuration is a non-collinear state. The energy differences of the spin spirals in Fig. 4b,c do not depend on the inter-lattice exchange parameters since the angles between atomic moments belonging to different sub-lattices are constant. Hence, we can expect significant differences in the intra-lattice exchange parameters. Indeed the NSC-F intra-lattice parameters, obtained by the corrected frozen magnon method, in Fig. 4e,f, correspond much closer to the exchange parameters obtained by the transverse field method than the NSC– exchange parameters obtained by the conventional frozen magnon method. Also the nearest neighbour Fe-Co interaction is significantly stronger when obtained by methods that go beyond the LWA. It can be noted that the energy scales in Fig. 4d–f are similar to the corresponding graphs in Fig. 4a–c. We remind the reader that in the case of B2 FeCo, the spin-spiral energy dispersion cannot be directly related to the adiabatic magnon dispersions, due to the fact that we have two magnetic sub-lattices present in the system. Exchange parameters were also obtained by KKR calculations using the Liechtenstein-Katsnelson-Antropov-Gubanov (LKAG) method9 and found to be similar to the ones obtained by the conventional frozen magnon method (the black vs the blue symbols in Fig. 4d–f). Also previous KKR calculations by MacLaren et al.59 match our KKR results well. This shows that the conventional MFT calculations fail to describe the self-consistent energy landscape of FeCo due to the LWA, regardless of the specifics of the implementation. In the case of FeCo, there is no experimental value to compare to, since this compound undergoes a structural phase transition before losing the magnetic ordering.

Beyond the Heisenberg model

To assess the predictive power of the different models presented above (“Introduction”), a leave-one-out cross validation analysis is employed and it is shown that the extended models give a significantly increased accuracy for bcc Fe.

The predicted mean squared error (P-MSE) obtained from the differences between the predicted and actual data points provides a better measure of the predictive power of the models than the regular MSE. While the MSE decreases more or less monotonously with respect to the number of parameters regardless of model and material under study, the P-MSE has a global minimum that marks the onset of overfitting.

Figure 5
figure 5

(a) The mean squared differences between \({\mathbf {B}}^{\mathrm {C}}_{i} \cdot {\mathbf {B}}^{\mathrm {C}}_{i}\) as obtained from direct ab initio calculations and \(\mathbf {B'}^{\mathrm {C}}_{i} \cdot \mathbf {B'}^{\mathrm {C}}_{i}\) as obtained from the parameterised models derived from the data-set excluding \({\mathbf {B}}^{\mathrm {C}}_{i}\). The regular Heisenberg model is labeled \(H_{0}\); \(H_{0}\) with bi-quadratic interactions added is labeled \(H_{1}\) and \(H_{1}\) extended by three- and four-spin interactions is labeled \(H_{2}\). (b) The ratio \({\mathbf {B}}^{\mathrm {C}}_{i}/\mathbf {B'}^{\mathrm {C}}_{i}\), where \(\mathbf {B'}^{\mathrm {C}}_{i}\) is obtained from the parameterised models derived from the data-set excluding \({\mathbf {B}}^{\mathrm {C}}_{i}\). The color scheme in (b) is the same as in (a) for the different models.

As seen in Fig. 5a, it is clear that the extended Heisenberg models are significantly superior to the regular Heisenberg model and that the accuracy of the models follow the expected complexity order, with \({\mathbf {H}}_{2}\) more accurate than \({\mathbf {H}}_{1}\) and \({\mathbf {H}}_{1}\) more accurate than \({\mathbf {H}}_{0}\). In Fig. 5a, the global minimum of the P-MSE of the \(\mathbf{H}_{2}\)-model is not clear, contrary to the cases of \({\mathbf {H}}_{0}\) and \({\mathbf {H}}_{1}\). However, if the number of parameters is increased beyond 200 it is found that 198 parameters is the global minimum using the fixed ratios between the cutoff lengths for the bilinear and higher order terms. The result of Singer et al.45, that a single bilinear and bi-quadratic parameter give a better fit than any number of bilinear parameters, is reproduced as seen by the comparison between \({\mathbf {H}}_{1}\) and \({\mathbf {H}}_{0}\). Although it should be noted that their analysis is based on MSE rather than the P-MSE and a different sampling of non-collinear states. It is interesting to note that the predictive power of the multi-spin model \({\mathbf {H}}_{2}\) is significantly improved compared to \({\mathbf {H}}_{1}\) by the inclusion of hundreds of small three- and four-spin interactions. When the ratios are optimised it is found that the P-MSE cannot be significantly improved within cutoff-limits that give a parameterisation from the available data. It is possible that the optimisation for a more extensive collection of data would make a larger difference between optimised and fixed ratios.

The ratio between the predicted (\(\mathbf {B'}^\mathrm {C}_{i}\)) and directly calculated (\({\mathbf {B}}^\mathrm {C}_{i}\)) size of the transverse exchange fields is shown in Fig. 5b, in order to easily evaluate the accuracy of the different models. Here small fields might of course have a ratio significantly different from one and still not contribute with a large absolute error. But an overall sense of the accuracy for the bulk of the data points can nevertheless be obtained. For the extended models, the fields are predicted within \(10\%\) of the directly calculated fields and for the regular Heisenberg model the fields are predicted within \(20\%\). The ratios have been evaluated with a number of parameters determined by the minimum of the P-MSE, i.e. 74, 38 and 198 parameters for \({\mathbf {H}}_{0}\), \({\mathbf {H}}_{1}\) and \({\mathbf {H}}_{2}\) respectively.

The variations in the sizes of the magnetic moments were less than \(15\%\) for the data-set and insignificant for the smaller angles. It was investigated whether the Hamiltonians could be improved by including a bilinear scaling with respect to the amplitudes of the moments. A version of every Hamiltonian where the parameters where multiplied by the moments \({\mathbf {m}}_{j}\) rather than the directional vectors \({\mathbf {e}}_{j}\) were tested and found to give significantly worse accuracy for all cases, perhaps contrary to expectations. A more thorough investigation of this matter is postponed for a future work.

The parameterisation for the \({\mathbf {H}}_{0}\) model is shown in Fig. 6a and for the \({\mathbf {H}}_{1}\) and \({\mathbf {H}}_{2}\) models in Fig. 6b–c. In general, the interactions are short ranged with the nearest neighbour being considerably larger than the rest. It is found that the DLM results match the self-consistent results for bcc Fe perfectly when employing the regular Heisenberg model as seen in Fig. 6a. The regular Heisenberg model does not capture features in the local spin density approximation (LSDA) total energy landscape that depends upon this particular use of the coherent potential approximation (CPA). Those features may be described by the higher order interactions and most importantly the strong bi-quadratic nearest neighbour interaction (see Fig. 6b–c).

In the work by Szilva et al.60, collinear exchange parameters for the bilinear and bi-quadratic spin Hamiltonian were derived. The obtained values in the case of bcc Fe compare very well qualitatively to our calculated exchange interactions for the same system, when using \({\mathbf {H}}_1\) model. It is expected that the two approaches give some quantitative differences since our approach is self-consistent while theirs is non-self consistent and since we also sample states far away from the collinear state when parameterising the model.

Figure 6
figure 6

The parameterisation of three different models are shown for bcc Fe. (a) shows the result for the \({\mathbf {H}}_{0}\) model together with exchange parameters calculated using Lichtenstein et al.9 method, with and without the DLM formalism. (b) and (c) shows the result for the \({\mathbf {H}}_{2}\) and \({\mathbf {H}}_{1}\) models respectively. For the \({\mathbf {H}}_{2}\) and \({\mathbf {H}}_{1}\) models, the sums of the distances between the moment-quadrupoles are divided by four in order to make the distances of the exchange parameters \(J_{ij}\) comparable with the \(B_{ijkl}\). (d) Total energy for a set of different non-collinear states with increasing cone-angle towards the right. For each cone-angle, there are five different configurations with random azimuthal angle.

The Hamiltonian can be extended in the transverse-field method. To this end, we aim to model the magnetic phase space for itinerant magnetic systems by a single model. For fcc Ni, four different models are considered. Applying any of the exchange fields, \({\mathbf {H}}_{ni}\), defined in “Introduction” and sum over all moments \({\mathbf {m}}_i\) yields

$$\begin{aligned} E_n=\sum _i{\mathbf {H}}_{ni}\cdot {\mathbf {m}}_i. \end{aligned}$$
(12)

Adding to two of these models, \(E_0\) and \(E_2\), a correction inspired by anisotropic and Stoner interaction

$$\begin{aligned} \sum _i \Big ( \frac{I_i}{2} + \frac{K_i m_i^2}{4}\Big ) m_i^2, \end{aligned}$$

we obtain two additional models \(E_0'\) and \(E_2'\). The results of these models are presented in Fig. 6d which shows 15 different configurations for three different cone-angles (\(\theta\)) with random azimuthal angles (\(\phi\)). It is seen that including the ansiotropic and Stoner parameters in the model considerably improve the result, with the most complex model being more or less identical to the self-consistent results. The difference, of course, becomes larger with increasing non-collinearity.

Discussion and outlook

Much of the discussions regarding the accuracy of the exchange parameters determined with the MFT has revolved around the results for bcc Fe. This is a material where conventional MFT calculations in the LWA and self-consistent total energy calculations agree very well for small deviations in the magnetic structure and where the introduction of any correction scheme therefore has little room for improving the results. This excellent agreement can be understood to be partly due to the error cancellation between (i) the neglect of the constraining fields and (ii) the assumption of intra-atomic collinearity in the calculations. Therefore, the bcc Fe exchange parameters of the transverse field method agree very well with the results of calculations using the MFT in the LWA, while the corrected frozen magnon method gives slightly different quantitative predictions for the exchange parameters. But for any material and state where \({\mathbf {B}}^\mathrm {C}({\mathbf {r}}) \gtrsim {\mathbf {B}}^\mathrm {XC}({\mathbf {r}})\), significant differences can be expected between non-self-consistent and self-consistent results. This is the case for the spin-spiral total energy calculations for fcc Ni and B2 FeCo, where the conventional use of the MFT does not result in a correct description of the total energy as a function of the directions of the moments. The neglect of the contributions of the constraining fields to the potential is crucial for this discrepancy as we can see from the comparison between our two different frozen magnon methods.

We can see that going beyond the LWA through the use of the corrected frozen magnon- or transverse field method does result in a critical temperature that is significantly closer to experiments for fcc Ni using LSDA. When larger angles were applied, the exchange tended to weaken which indicates that either the LSDA itself fails to give an accurate description of fcc Ni or a more complex treatment beyond the Heisenberg model is needed. This finding is consistent with previous studies that go beyond the LWA. Besides, turning to more accurate electronic structure methods, it would be interesting to see whether the theoretical predictions could be made more accurate by introducing longitudinal fluctuations of the magnetic moments into the effective Hamiltonian.

The corrected frozen magnon method gives an overall accurate description of the self-consistent energy landscape of all the three materials covered in this study, while the conventional method only works well for bcc Fe. This indicates that non-self consistent spin-spiral calculations are, in general, not the most efficient approach for the purpose of parameterising the total energy of non-collinear states in itinerant magnetic systems. This is due to the fact that the constraining fields that remove the differences to self-consistent total energies need to contain information of the parameterisation itself. However, the procedure can still make sense if the constraining fields are not fully self-consistently converged. In our corrected frozen magnon method the potential is kept fixed while the constraining fields are converged. This results in computational costs for the corrected frozen magnon method comparable with the transverse field method for the small magnetic systems considered in this study, while the transverse field method scales better and therefore is clearly favourable to use for larger magnetic systems.

The formalism of the transverse field method is simple to implement in any code that handles non-collinear magnetism. It can be used with or without the generalised Bloch theorem. The advantage of using the spin-spiral formalism is the possibility of calculating interactions between pairs of atoms that are not both contained in the unit cell. The method has several clear advantages over the conventional frozen magnon method besides the improved scaling and avoidance of the LWA. It is reasonable to expect the method to produce more accurate results than non-self-consistent approaches for systems where inter-atomic non-collinearity results in sizeable intra-atomic non-collinearity. Furthermore, more accurate results can be expected for systems with large induced moments, that depend sensitively on the directions of the surrounding moments. Simple rotations of the potentials at the sites of the magnetic sub-lattices are in these cases not likely to capture the intricate changes of the self-consistent potentials and corresponding total energy differences.

When the Heisenberg model is expanded and applied to supercell calculations, it is clear that higher order interactions play a crucial role in an accurate description of the non-collinear total energy landscape of bcc Fe. A significant increase in accuracy is obtained already when only bi-quadratic interactions are added to the regular Heisenberg model. The promising results obtained by the full multi-spin model indicates that the ambition to model the magnetic phase space for itinerant magnetic systems in a single model is reasonable. The advantages of the transverse field method is evident in the scaling and the possibility of evaluating and optimising the predictive power of the derived model against self-consistently derived quantities using the cross-validation analysis. The present study supplies a fully automatic, affordable and self-consistent method to systematically investigate exchange interactions beyond the standard Heisenberg description. This approach lends itself naturally to massive high-throughput calculations to find new useful and exotic magnetic materials.

The perfect match between the self-consistent results for the regular Heisenberg Hamiltonian and the DLM results for bcc Fe is a subject for further studies. It would be interesting to see how general this phenomenon is. For materials where the longitudinal fluctuations are strong, this equality probably breaks down. A natural continuation of the present study is the formulation of a general framework that makes use of the transverse-field method and includes longitudinal fluctuations as well. In this way, one can properly describe the magnetic properties of real materials including temperature effects.

The present work is important for the field of magnon-spintronics since it provides an efficient way of accurately describing the exchange interactions. These interactions are crucial for designing realistic and efficient magnon-spintronic devices.