Magnetism plays a crucial role in many physical and technological phenomena, ranging from magnetic storage devices to superconductivity. Determining the presence of long-range magnetic ordering in materials is therefore essential for designing new functional materials with tailored magnetic properties. Neutron scattering is a powerful tool for directly determining magnetic order and is functional across a wide range of temperatures and pressures. However, neutron scattering experiments typically require access to specialized facilities, such as nuclear reactors or spallation sources, which can be costly. Additionally, it mandates a relatively large size and high-quality sample. The elements with high neutron absorption cross-sections also hinder clear scattering signals.

Despite the availability of direct measurement methods, the limitations mentioned make it challenging to identify magnetic order. Therefore, it would be beneficial to have a method for determining magnetic order that is more accessible and less expensive, even if it is not as direct as neutron scattering. For instance, specifying magnetic order based on the density of states (DOS), which can be accessed by various experimental methods, can be a compelling alternative. In principle, magnetic orders is closely connected with the particle–hole excitation spectrum and the DOS displays distinct features of the corresponding order. The challenge is how to extract and quantify the correlation effectively.

The recent advancement of machine learning has had a significant impact in uncovering hidden correlations in the field of condensed matter physics1,2,3,4,5,6,7,8,9. This technology has also been applied to the study of magnetism, enabling for the prediction of physical quantities without the need for direct measurement or calculations10,11,12,13,14,15,16,17,18,19,20,21,22,23 or probing orders from the data24,25,26,27,28,29,30.

Motivated by the capability of machine learning to uncover complex relationships within numerical data, we explore the use of decision tree algorithms for identifying magnetic order from the density of states. We also examine an alternative probe through momentum-resolved spectra, as the integration over the momentum space in the local density of states may mask crucial differences between various forms of magnetic order.

For the classification of magnetic order, a dataset comprising inputs and their respective magnetic order is necessary. We selected BaOsO\(_3\) as a target system for our machine learning study. The polycrystalline BaOsO\(_3\) samples show metallic behavior31,32, but its high symmetry allows us to induce various distinct magnetic orders by lowering the symmetry. We employ Hartree–Fock (HF) mean-field theory to generate the data sets with multiple magnetic order candidates for machine learning. The system is ordered by local Coulomb interaction depending on antiferromagnetic order parameters we set. The resulting DOS, momentum-resolved spectra, and antiferromagnetic orders are used to construct the data sets.

This paper is organized as follows. In “Model Hamiltonian”, we describe the model Hamiltonian and Hartree–Fock approximation we employed. The data preparation for machine learning and the performance of the trained model are discussed in “Machine learning”. “Conclusion” is devoted to conclusion and outlook.

Model Hamiltonian

Figure 1 shows the unit cell and electronic structure of BaOsO\(_3\). The first-principles calculations were performed using density functional theory using projector augmented wave potentials within PBE exchange–correlation functional as implemented in Vienna Ab initio Simulation Package33,34. The crystal field lifts the degeneracy of the d-orbitals in Os atoms, placing the Fermi level in the \(t_{2g}\) levels, which are well separated from other bands based on first-principles calculations. We used Wannier9035 to construct Maximally-Localized Wannier Functions (MLWF) based tight-binding Hamiltonians for the \(t_{2g}\) bands.

Figure 1
figure 1

(a) Atomic structure of BaOsO\(_3\). (b) First Brillouin zone with the path for band energies and projected density of states (PDOS) for t\(_{2g}\) subshells. In (b), the three t\(_{2g}\) orbitals have identical density of states due to the cubic symmetry of BaOsO\(_3\). The dotted line denotes Fermi level at \(N=4\).

The model Hamiltoinan is represented as \(H = H_\mathrm{0} + H_\mathrm{int}\), where the bilinear part is given by

$$\begin{aligned} H_\mathrm{0} = \sum _{ij}\sum _{ll^{'}}\sum _{\sigma }t_{ij}^{ll^{'}}c^{\dagger }_{il\sigma }c_{jl^{'}\sigma }, \end{aligned}$$

where \(c_{il\sigma }^{\dagger }\)(\(c_{il\sigma }\)) creates (annihilates) an electron with spin \(\sigma\) in orbital l at site i. The hopping amplitude \(t_{ij}^{ll^{'}}\) is adapted from the Wannier Hamiltonian. The hopping parameters have ideal cubic symmetry, allowing us to induce various magnetic orders for this machine learning study. In a \(2\times 2\times 2\) supercell, three types of antiferromagnetic (AF) orders were considered: A-, C-, and G-type, as illustrated in Fig. 2. In Fig. 2, we display the three magnetic structures and their associated noninteracting band energies in the first Brillouin zone of the original lattice, which has been modified to reflect the periodicity when the respective antiferromagnetic order is stabilized.

Figure 2
figure 2

Schematic view and band energies with periodicity of (a) A-, (b) C- and (c) G-type antiferromagnetic order in a \(2 \times 2 \times 2\) supercell. The corresponding reduced Brillouin zones are represented as cyan-colored polyhedra. The number of bands is double that shown in Fig. 1 due to the display in the first Brillouin zone of the original lattice, ignoring the unit cell doubling for AF ordering. The Fermi level at N = 4 is indicated by the dotted line.

The two-body interaction part is represented as,

$$\begin{aligned} H_\mathrm{int}&= \frac{1}{2}U\sum _{i}\sum _{l}\sum _{\sigma } n_{il\sigma }n_{il{\bar{\sigma }}}\nonumber \\&\quad + \frac{1}{2}U^{'}\sum _{i}\sum _{l\ne l^{'}}\sum _{\sigma } n_{il\sigma }n_{il^{'}{\bar{\sigma }}}\nonumber \\&\quad + \frac{1}{2}U^{''}\sum _{i}\sum _{l\ne l^{'}}\sum _{\sigma } n_{il\sigma }n_{il^{'}\sigma }\nonumber \\&\quad + \frac{1}{2}J\sum _{i}\sum _{l\ne l^{'}}\sum _{\sigma } c^{\dagger }_{il\sigma }c^{\dagger }_{il^{'}{\bar{\sigma }}}c_{i l{\bar{\sigma }}}c_{il^{'}\sigma }\nonumber \\&\quad + \frac{1}{2}J\sum _{i}\sum _{l\ne l^{'}}\sum _{\sigma }c^{\dagger }_{il\sigma }c^{\dagger }_{il{\bar{\sigma }}} c_{il^{'}{\bar{\sigma }}}c_{il^{'}\sigma }, \end{aligned}$$

where \(n_{il\sigma }\equiv c_{il\sigma }^{\dagger }c_{il\sigma }\) is a number operator, U is the on-site intra-orbital Coulomb interaction and J is the Hund’s coupling. We assume the rotational invariance, setting inter-orbital interaction terms, \(U^\prime =U-2J\) for two different spins and \(U^{\prime \prime }=U-3J\) for the same spins. In order to stabilize an AF order, we introduce the Hartree–Fock approximation, which allows us to handle the many-body problem in Hint by using the following mean-field ansatz. The precise wave vectors are defined as

$$\begin{aligned} \langle c^{\dagger }_{jl\sigma }c_{j^{'}l^{'}\sigma ^{'}} \rangle = \frac{1}{2} (n_{l} + \sigma m_{l}e^{i\textbf{q}_\alpha \cdot \textbf{r}_{j}})\delta _{jj^{'}}\delta _{ll^{'}}\delta _{\sigma \sigma ^{'}}, \end{aligned}$$

where \(n_l\)(\(m_l\)) is an electron occupancy (staggered magnetization) of orbital l, \(\textbf{r}_{j}\) is the position vector of site j, \(\textbf{q}_\alpha\) is the wave vector corresponding an AF type \(\alpha\)=A, C, and G. To be precise, \(\textbf{q}^{}_\mathrm{A}=(\pi , 0, 0)\), \(\textbf{q}^{}_\mathrm{C}=(\pi , \pi , 0)\) and \(\textbf{q}^{}_\mathrm{G}=(\pi , \pi , \pi )\). Note that the Kronecker deltas force any nonlocal, interorbital, and interspin terms to be zero in this calculation, but introducing additional off-diagonal order parameters does not alter the applicability of this machine learning study.

Figure 3
figure 3

Hartree–Fock phase diagrams with A-, C- and G-type antiferromagnetic ordering are shown for (a) \(J/U=0.0\) and (b) \(J/U=0.15\). The horizontal axis is the number of electrons per site and the vertical axis is the local Coulomb interaction U. The black bars indicate an insulating phase, while the color represents the staggered magnetization.

We perform the self-consistent calculation with the HF ansatz Eq. (3) for the three AF types and obtain the phase diagrams as shown in Fig. 3. In general, the stronger interaction leads to larger staggered magnetization, but metal-insulator transition exhibits different behavior depending on the number of electrons per site N and the AF ordering. The G-type order is metallic in a broad range of parameter space except the half-filled (\(N=3\)) case. This is due to the symmetry constraint of the AF order (\(\textbf{q}_\mathrm{G}=(\pi ,\pi ,\pi )\)), which requires the orbital occupancies of the three t\(_{2g}\) orbitals to be the same. To open a gap, the number of electrons per unit cell must be an integer, and this symmetry constraint requires it to be a multiple of three. Therefore, the only possible case for an insulator is half-filling, as confirmed by HF calculations.

The A- and C-type antiferromagnetic orders have fewer restrictions compared to the G-type, requiring only two out of three orbital occupancies to be equal. The different combinations of the \(2+1\) occupancy splitting can result in different final solutions for a Hartree–Fock calculation. To account for the possibility of converging to metastable states instead of the ground state, multiple independent Hartree–Fock iterations are performed with various initial conditions and the solution with the lowest energy is selected to construct the phase diagrams.

To increase diversity in the machine learning samples, we have included data with nonzero values of J/U. The Hund’s coupling has two opposing effects on the metal-insulator transition in transition metal oxides36. At half-filling, it reduces the critical value of U, whereas for electron fillings other than half-filling, it increases the critical U. This is because the maximum total spin induced by the Hund’s coupling is much larger at half-filling, making gap formation easier. This unique behavior makes the half-filled case special, indicating that it may be challenging to obtain accurate predictions for this case unless a sufficient number of samples for the half-filling are included in the training set.

Machine learning

Data preparation

Figure 4
figure 4

(a) Unfolded band structure to restore the original periodicity for G-type order with \(N=3\) and \(U=2\). Unlike the nonmagnetic bands whose weights are identical over all momenta, the unfolded bands are weighted ranging from 0 to 1. The color and the size of circle indicate the weights. (b) Corresponding local density of states \(\rho _\mathrm{LDOS}(\omega )\) and (c) \(\textbf{k}\)-projected density of states \(\rho _{\textbf{k}}(\omega )\) at high symmetry points with a broadening factor \(\eta =0.1\).

We created the dataset for our machine learning study from the Hartree–Fock results discussed in the previous section. We obtained three phase diagrams for each of the three antiferromagnetic types, with J/U values of 0.0, 0.1, and 0.2. For each phase diagram we collected \(9\times 59\) data points, where U is varied from 0 to 8 with increments of 1 and N ranges from 0.1 to 5.9 with a step of 0.1. This resulted in a total of \(3 \times 3\times 9\times 59 = 4779\) samples. The data generation took approximately 3 h using a single CPU (Intel Xeon Platinum 8360Y with 2.40 GHz). Still, this time could be reduced to a fraction using parallel production with multiple CPUs without sacrificing performance.

The antiferromagnetic order in HF calculations is determined by the selected HF ansatz which assumes that symmetry breaking occurs and is quantified by non-zero staggered magnetizations (\(m_l\)). When \(U = 0\), the \(m_l\) does not contribute to the Hamiltonian and the self-consistent solution would result in \(m_l\) = 0, which is equivalent to the original Hamiltonian without magnetic order. Labeling such cases as antiferromagnetic data could negatively impact the training process. Additionally, even if \(m_l\) is not zero, extremely small values of \(m_l\) only result in minimal changes to the Hamiltonian, which can confuse the machine learning model. To improve the efficiency of the model, we included only those samples in the dataset where \(m_l\) is greater than 0.1.

As this study aims to identify magnetic orders based on spin-insensitive measurements, the input data for the machine learning should not reflect the modified periodicity in the magnetic orders, even though the spin-integrated data contains crucial information to distinguish the AF orders. We restore the original periodicity by applying band unfolding transformations37 as illustrated in Fig. 4a.

In Hartree–Fock, a single particle approach, the energy axis is represented by bands as delta functions, and broadening is required to calculate the non-divergent density of states. Despite the broadening, each peak retains the same width, as the weights of the delta functions are unity, unless two band energies are in close proximity. This is not the case post the band unfolding process. The local density of states (LDOS) in Fig. 4b remains unaffected by the unfolding as the weights are integrated over the Brillouin zone. However, the momentum-resolved density of states (\(\rho _\textbf{k}\)) in Fig. 4c shows a noticeable variation from its folded counterpart. The energy-dependent variations in the weights of the peaks are related to the antiferromagnetic order.

The selection of optimal features is crucial because redundant features can introduce noise or bias during the learning process, potentially leading to poor generalization or overfitting. We test three different features in this work as described below. We first evaluate two sets of features, one is the LDOS and the other is \(\rho _\textbf{k}\) on the high-symmetry points (\(\textbf{k}=\Gamma\), X, M, and R). The LDOS is calculated by integrating the \(\rho _\textbf{k}\) over the entire Brillouin zone, as follows.

$$\begin{aligned} \rho ^{}_\mathrm{LDOS} (\omega )&= \sum _{\textbf{k}\in \textrm{BZ}} \rho _\textbf{k}(\omega ) \nonumber \\&= \sum _{\textbf{k}\in \textrm{BZ}} \sum _n -\frac{1}{\pi } \textrm{Im} \Big [ \frac{1}{\omega + i \eta - \varepsilon _n(\textbf{k})} \Big ], \end{aligned}$$

where \(\eta\) is a Lorentzian broadening factor and n is the band index. The calculated DOS is a continuous function of energy. However, in order to utilize it as input features for machine learning, the DOS should be expressed as a set of numerical values. We discretize the DOS into \(N_\mathrm{bin}\) points,

$$\begin{aligned} \rho ^{}_\mathrm{LDOS}(\omega ^{}_I) = \int _{\omega ^{}_I - \delta \omega }^{\omega ^{}_I + \delta \omega } \rho ^{}_\mathrm{LDOS}(\omega ) d\omega , \end{aligned}$$

where \(I=1,2,\ldots ,N_\mathrm{bin}\) are frequency indices, \(\omega _I = -8 + I \delta \omega\), and \(\delta \omega = 16/N_\mathrm{bin}\). The original data is generated using 1024 grid points over the energy range [− 8:8], and the features for the given \(N_\mathrm{bin}\) are extracted by integrating the cubic-interpolated LDOS. The \(\rho _\textbf{k}\) at each high-symmetry point is also extracted in a similar manner, but the number of bins is reduced by 1/4 to ensure a fair comparison between the two sets of features, LDOS and \(\rho _\textbf{k}\).

We also design the third features based on the peak structure of the \(\rho _\textbf{k}\) feature, which will be introduced later. An additional advantage of the third feature is its reduced dependence on the broadening scheme. Broadening can arise from various sources, including instrumental resolution, thermal fluctuations, and excitations with finite lifetimes. Given the inability to control all sources of broadening in experiments, it becomes necessary to incorporate broadening effects accurately through theoretical approaches. The third feature is motivated by various test evaluation in “Results of decision tree algorithms”. The features we utilized for the machine learning and testing procedure, performed in “Results of decision tree algorithms”, are summarized in Fig. 5.

Figure 5
figure 5

Machine learning and data preparation roadmap. The combination of the features we used for training and testing is summarized with the resulting figures shown in this paper.

Results of decision tree algorithms

Figure 6
figure 6

Confusion matrices and precision–recall curves of the machine learning models (Random Forest, XGBoost, LightGBM and CatBoost) trained on the (ad) \(\rho ^{}_\mathrm{LDOS}\) features and (eh) \(\rho _{\textbf{k}}\) at high symmetry points features with a broadening factor \(\eta =0.2\). The color scale of the matrices represents accuracy, calculated by dividing each element of the matrix by the sum of its respective row, with the value written in small text below each element. In the precision–recall curves, the values for the three AF orders A, C, and G are displayed as a red solid line, a green dashed line, and a blue dash-dotted line, respectively, while their micro-average is denoted by a bold black line. The dotted line indicates the iso-\(\textrm{F}_1\) curve, which contains all points in the precision–recall space with the same \(\textrm{F}_1\) scores.

We used decision tree ensemble algorithms, including Random Forest, a bagging method available in scikit-learn38, as well as boosting methods such as XGBoost39, LightGBM40 and CatBoost41. We divided the samples into a training and a test set in a 7:3 ratio and trained the model using Random Forest, XGBoost, LightGBM, and CatBoost algorithms. Figure 6 shows the confusion matrices and the precision–recall curves. An element of the confusion matrix is defined as \(C_{\alpha \beta }\) = (number of samples predicted as \(\beta\) order while the true label is \(\alpha\) in the test set). The diagonal parts of the matrices represent cases that the trained model correctly predicts the AF labels for the test set. The off-diagonal parts indicate the number of incorrect answers, where the column represents which labels were incorrectly assigned.

The precision–recall curve illustrates how the balance between precision and recall changes with varying thresholds. Precision (P) represents the ratio of true positives (\(T_p\)) to the total number of cases predicted as positive; \(P = \frac{T_p}{T_p + F_p}\), and recall (R) denotes the proportion of \(T_p\) to the total number of actual positive samples; \(R = \frac{T_p}{T_p + F_n}\). \(F_p\) and \(F_n\) stand for false positives and false negatives, respectively. \(F_1\) score is the harmonic mean of precision and recall; \(F_1 = 2\frac{P\times R}{P+R}\). High precision and recall scores indicate good classification results, and the curve tends to be closer to the upper-right corner.

The performance of the models is generally good, however, errors are more frequent in cases where the filling is half or the value of \(m_l\) is small. The half-filled case is different from the others with non-zero Hund’s coupling, making it difficult to accurately predict. Small values of \(m_l\) result in small mean-field corrections, not providing sufficient information to distinguish different AF orders. Because this method captures the pattern of features, detecting weak magnetism using this method is a fundamental challenge. Adding more half-filled samples to the training set, however, can effectively mitigate errors for gapped cases.

The accuracy of the LDOS model is relatively lower compared to the \(\rho _\textbf{k}\) model as expected, because the integration process during the calculation of the LDOS results in the loss of momentum-resolved information. Despite both models having the same number of features, the performance difference suggests that feature selection is key to the success of this machine learning problem.

Figure 7
figure 7

Spectral weights of G-type order at \(N=4\) and \(U=6\) with (a) a constant broadening factor \(\eta =0.2\) and (b) linearly increasing broadening as the energy lowers, \(\eta (|E-E_{F}|) = 0.1 + 0.4 |E-E_{F}| / 8\).

Figure 8
figure 8

Confusion matrix of machine learning classifier trained on \(\rho _\textbf{k}\) at the high-symmetry points with \(\eta _\mathrm{train}=0.2\). The results of Random Forest, XGBoost, LightGBM, and CatBoost classifiers are shown when the test set is broadened using (ad) constant broadening (\(\eta _\mathrm{test} =0.2\)) and (eh) energy-dependent broadening (\(\eta _\mathrm{test} = 0.1 + 0.4 |E-E_{F}| / 8\)). The color scale of the matrices represents accuracy, calculated by dividing each element of the matrix by the sum of its respective row, with the value written in small text below each element.

The method has an advantage in that it enables feature design optimization for specific applications. For example, angle-resolved photoemission spectroscopy (ARPES) measures hole excitation spectra, so a model that focuses on energy ranges with \(E < E_F\) is necessary. Testing the model trained within the restricted energy range would provide valuable information for analyzing the spectra. Additionally, actual experiments can be influenced by various environmental noises, such as thermal broadening, which cannot be replicated precisely in theoretical calculations. Thus, validating the model with different types of noise is crucial for practical applications.

Figure 7 illustrates two spectra produced from the same HF solution but with different broadening methods. A constant broadening is applied to Fig. 7a, whereas the broadening in Fig. 7b increases as the energy decreases below the Fermi level as \(\eta (|E-E_{F}|) = 0.1 + 0.4 |E-E_{F}| / 8\). For validation purpose, the training set consists of DOS generated using constant broadening, while the test set is constructed using the linearly increasing broadening as the energy lowers below the Fermi level. The machine learning faces a more difficult situation as the test set, which is distinct from the training set, is now broadened using a different pattern from the training set. Note that we only use the \(\rho _\textbf{k}\) at high-symmetry points for machine learning, even though the spectra are visualized over a path connecting high-symmetry points.

We present the resulting confusion matrices in Fig. 8. Figures  6e–h and 8a–d are similar in terms of accuracy, indicating that limiting the energy range does not negatively impact the performance of the machine learning models. However, we observe a substantial drop in accuracy in Fig. 8e–h. This decline suggests that the broadening strength significantly affects the decision trees’ performance, as the trees make decisions based on numerical values that become smaller when the broadening strength \(\eta\) increases.

Suppose a trained decision tree checks whether a given sample has an LDOS value above a certain threshold in a specific energy range at the root node. If the LDOS is larger than the threshold, it returns True; otherwise, it returns False. When the test set is broadened by a larger broadening factor, it reduces the height of the peak in LDOS. Consequently, the root node sends the sample to the False branch, and the decision tree misclassifies the sample.

Figure 9
figure 9

Accuracies of random forest model trained by data with (a,b) \(\eta _\mathrm{train}=0.1\), (c,d) \(\eta _\mathrm{train}=0.2\), and (e,f) \(\eta _\mathrm{train}=0.3\), as a function the broadening factor applied to the test set. The features used for the model are either \(\rho _\textbf{k}\) (displayed in (a,c,e)) or the lowest excitation energies (displayed in (b,d,f)).

Figure 10
figure 10

An illustration of the feature section for the lowest excitations of the \(\rho _\textbf{k}\) at high-symmetry points, for G-type order with \(N=3\) and \(U=2\). The red and blue arrows indicate the selected values as the lowest electron and hole excitation features, respectively. The length of bars represents the \(\rho _\textbf{k}\) feature at each bin, extracted the HF solution shown in Fig. 4. The vertical axes are \(\rho _\textbf{k}\) feature indices ranged from \(\textbf{k}\)01 to \(\textbf{k}\)64 for each \(\textbf{k}\). From the original \(\rho _\textbf{k}\) features, 256 values (4 \(\times\) 64 numbers per \(\textbf{k}\)), we choose eight features (two per \(\textbf{k}\), one positive and one negative) located above and below the Fermi level, marked by dotted gray line.

To investigate the impact of different broadening applied to the training and test set, we perform a systematic cross-test of \(\eta _\mathrm{train}\) and \(\eta _\mathrm{test}\). We divide the HF solutions into training and test sets, and apply \(\eta _\mathrm{train}\) and \(\eta _\mathrm{test}\) to each set respectively. The resulting accuracy is presented in Fig. 9a,c,e. As expected, the best performance is achieved when \(\eta _\mathrm{train} = \eta _\mathrm{test}\). When \(\eta _\mathrm{train}\) and \(\eta _\mathrm{test}\) are not equal, the incorrect predictions increase, especially when \(\eta _\mathrm{train}\) is smaller than \(\eta _\mathrm{test}\). This is because the decision trees choose a branch to follow at every nodes, based on the comparison between a certain feature and a threshold value. A larger \(\eta _\mathrm{train}\) reduces these threshold values, making samples with smaller \(\eta _\mathrm{test}\) more likely to be classified accurately than vice versa.

Based on the results of the cross-broadening tests, which suggest that the key aspect of the model is whether a feature (i.e. an averaged DOS within a bin) exceeds a threshold value, we devised a compact set of features. Despite the variation of the height and width of a peak on the energy axis as \(\eta\) changes with Lorentzian broadening in Eq. (4), the energy at which the weight reaches the local maximum remains unchanged. This energy, originating from the corresponding band energy (\(\varepsilon _n(\textbf{k})\) in Eq. (4)), represents the lowest excitation from the ground state and is measured by the distance from the Fermi level to the peak position.

Figure 10 shows the extraction of features from the original \(\rho _\textbf{k}\) features. For each high-symmetry point \(\textbf{k}\), the closest peak energy to the Fermi level for both electron and hole excitations. We use the energy difference as features, where a positive value represents an electron excitation and a negative value represents a hole excitation. If a peak of \(\rho _\textbf{k}\) is located on the Fermi level, the corresponding features are set to zeroes. This feature selection significantly reduces the number of features, from 256 to 8, which is 32 times smaller than the original size, enabling more efficient training.

The utilization of the lowest excitation energy as a feature contributes to not only efficient training but also accurate results, as demonstrated in Fig. 9b,d,f. The problematic \(\eta\)-dependence of the cross-\(\eta\) test vanishes, highlighting the significance of feature selection for classification based on spectral information. The test set encompasses both metallic and insulating solutions, with the model tending to produce more incorrect predictions for metallic samples. However, tests performed solely on insulating cases show over 95% accuracy, encouraging potential application for classifying antiferromagnetic insulators by this method.


In this study, we examined the application of a machine learning model for classifying antiferromagnetic (AF) orders in a model Hamiltonian targeting BaOsO\(_3\). The dataset was created using Hartree–Fock calculations with selected AF orders in a \(2\times 2\times 2\) supercell. Converged solutions were used to generate the local density of states (LDOS) and the momentum-resolved density of states (\(\rho _\textbf{k}\)). We trained the model using various features, including LDOS, \(\rho _\textbf{k}\), and the lowest excitations, and evaluated its ability to identify AF orders in test samples. While both LDOS and \(\rho _\textbf{k}\) were designed to have the same number of features, the latter demonstrated superior performance, highlighting the importance of feature selection. The \(\rho _\textbf{k}\) features performed well when test samples had comparable broadening levels, but different broadening methods weakened the model’s performance. In contrast, the lowest excitations, with only 8 features per sample, surpassed these limitations and exhibited excellent performance across most samples.

We considered only three types of orders in this paper, but in principle, the approach can be extended to include more diverse types of orders. The Hartree–Fock calculation is computationally very cheap and enables us to have a flexible design of candidate orders followed by prompt testing for desired samples. This method will be useful for materials where conventional methods to identify magnetic orders are not applicable, such as two-dimensional materials, non-collinear orders.

The training based on mean-field level calculations may encounter challenge in application to real materials, in case the correlation effects lead the electronic structure to beyond the mean-field correction. Therefore, to ensure a successful application, it would be beneficial to validate the performance for test samples generated by methods incorporate beyond mean-field fluctuations. For instance, identifying the AF order in dynamical mean-field calculations would be a reasonable test as a bridge toward applications to real materials.