Permanent magnets are of great interest in today’s economy. Green energy applications in particular, such as wind turbines and hybrid/electric vehicles, demand high-performance permanent magnets. The performance of permanent magnets is mainly determined by intrinsic magnetic properties and microstructural features. The intrinsic properties are adjusted by including rare earth (RE) elements, which mainly increase the magnetocrystalline anisotropy. Due to economic and environmental considerations, there is great interest in reducing the use of RE elements1. MnAl-C contains no RE or other critical raw materials, and its intrinsic magnetic properties make it an attractive alternative to certain types of RE-based permanent magnets2. The microstructure of MnAl-C magnets is known to contain a range of defects, such as grain boundaries, twins, and antiphase boundaries3,4,5,6,7. Kronmüller and Goll discussed microstructural properties of advanced hard magnetic materials and their relation to the quality of permanent magnets8. By investigating the various boundary types of MnAl-C, their distribution, and their effect on the coercivity with micromagnetic simulations and machine learning, we hope to optimize the overall performance of MnAl-C magnets.

Micromagnetic simulations can be a vital tool for improving the understanding of permanent magnets and their macroscopic properties. Widely used measures of the quality of permanent magnets are the nucleation field Hn, the coercive field Hc, and the energy density product \(B{H}_{\max }\)9. Here we denote the critical value of the external field at which the magnetization starts to switch irreversibly as the nucleation field Hn10. The first requirement is knowledge of the microstructure of the magnetic material. In a second step an accurate theory is needed to determine the quality measures from this microstructure. Micromagnetic simulations can calculate demagnetization processes, and from them Hn, Hc, and \(B{H}_{\max }\), using the present microstructure and the intrinsic magnetic properties of the material. Intrinsic magnetic properties can be obtained either from ab initio simulations or from measurements of the anisotropy constant, the magnetization, and the exchange constant. In addition to the intrinsic magnetic properties, the microstructure of permanent magnets influences magnetization reversal. A useful technique for obtaining local crystallographic orientations from a microstructure is Electron Backscatter Diffraction (EBSD) microscopy11. It allows large areas to be analyzed in comparison to transmission electron microscopy (TEM), at the cost of a lower resolution. The smallest spatial resolution of standard EBSD is about 30–50 nm. Creating an EBSD image with about half a million pixels takes a few hours on modern systems.

In order to create a micromagnetic simulation from an EBSD image, several steps are necessary. The EBSD raw data need to be scanned, the microstructure extracted, and the underlying crystallographic orientations stored. The microstructure is then meshed with very small finite elements (tetrahedra). Only then can a micromagnetic calculation be performed. The accuracy of micromagnetic simulations with a known microstructure of a magnet is well described, for example in ref. 12. The simulations nicely reproduce trends, yet the absolute values of the simulation results always show a certain offset from the experiments. The simulations cannot incorporate all effects, for example inaccurate or unknown phase distributions. The size limitation of the simulations affects coercivity as well, because with larger samples (larger grains) the coercivity decreases13. The same offset between experiments and simulations has been observed for MnAl-C6,14. In this work we are interested in trends of the nucleation field distribution and in finding weak spots in the microstructure, which can still be done in spite of this offset.

The size of a micromagnetic simulation model is limited. The intrinsic length for representing magnetization reversal, the so-called characteristic length scale, must be chosen such that nucleation processes or domain wall movements can be resolved within the finite element mesh. The largest allowed edge length of a finite element is typically in the range of only a few nm. There are two ways to calculate the demagnetization curve numerically with micromagnetic solvers:

  1. With state-of-the-art micromagnetic energy minimization codes it is possible to calculate the demagnetization curve of a permanent magnet with a size of 250 × 250 × 250 nm3 within 1–2 days15. Using a finite differences code, optimized for massively parallel computing on a supercomputer, a cube with a size of about 1 × 1 × 1 μm3 can be solved within a reasonable time. For example, using 256 CPUs the authors report a simulation time of 6.68 days for computing a demagnetization curve16.

  2. Solving the time-dependent Landau–Lifshitz–Gilbert (LLG) equation reduces the practical model size even further. With the software applied in this work17, the computation of the demagnetization curve of a cube with a size of 200 × 200 × 200 nm3 on a single CPU lasts about 3–6 days depending on the magnetic configuration.

As an alternative to the simulation of single large systems, several small parts, each covering a specific aspect of the microstructure, can be simulated. However, computing many small samples can again be challenging in terms of available computational resources. Predicting the coercive field directly from a microstructural image would speed up the computation tremendously. A promising approach to reduce the computational costs is the use of machine learning to predict the coercivity of permanent magnets18. Even if it only provides a rough approximation, one can gain qualitative information on the influence of local microstructural properties on the coercivity. In materials science, machine learning has helped determine microstructure–property relationships, automate and accelerate materials characterization, and foster materials discovery19. Size limitations can be overcome by, e.g., bridging the gap between molecular dynamics and macroscopic materials science20. Machine learning has been used for the characterization of, e.g., steel microstructures21. In Mg-alloys, Orme et al. analyzed the formation and propagation of twinning boundaries using decision trees22.

Machine learning typically requires a large amount of data. After acquisition, the data need to be prepared to fit a chosen machine learning model suitable for the specific problem. The model is fed with the data to incrementally improve its predictive ability. Evaluation is performed on parts of the data that have not been used for training. Finally, a completely new set of input data is tested. The errors are computed for this test set, which is data the model has not seen before. In this work we obtain the training data directly from EBSD data of MnAl-C and micromagnetic simulations23. Using automated geometry construction and discretization of the EBSD map, we can compute local nucleation fields for many small image extracts (selections) using micromagnetic simulations. Thus a large amount of input data can be acquired. We tried random forest and gradient boosting regressors24 for learning local nucleation fields Hn. We also show that, when fitting Hn to a set of selected properties derived from the EBSD image, a simple linear regression model gives good results. The models were evaluated and their hyperparameters optimized by averaging the error of a 5-fold cross-validation. Afterwards we test the prediction performance of the initial training data on a completely new EBSD map. Generating the necessary data for a machine learning model takes a long time (weeks or months), yet training and predicting then takes only a few minutes. Even though the prediction contains errors, the quality of a magnet can be analyzed and good or bad spots in the magnet’s microstructure can be found.

In other fields of materials science it is common practice to use machine learning for characterizing the influence of microstructure on materials properties19. This work is a first attempt to bring machine learning to the magnetism community. We think our method is useful from several points of view:

  1. Machine learning is a building block for multiscale simulations.

  2. Machine learning helps to identify weak spots with low local nucleation fields in the microstructure.

  3. Machine learning can act as an additional tool that complements magnetic measurements and micromagnetic simulations.


In the Results section we show the feature importance of our machine learning model using different approaches:

  1. We predict nucleation fields for different feature vectors (single features or feature combinations).

  2. Partial dependence plots show the relation between features and predicted results.

  3. A linear regression model highlights the strength of important derived features.

We further discuss problems with complex structures such as grain boundary junctions, and we test the trained model on a new EBSD dataset (EBSD-B) to demonstrate its predictive ability. Finally, we suggest a way to use the predicted nucleation fields for obtaining bulk magnetic properties.

Feature importance by feature vector

In section Feature engineering we define five features from our microstructural EBSD dataset. Not all features are equally important for accurately predicting the nucleation field. To assess the feature importance, we predicted the local nucleation field via a single feature and via combinations of features. Table 1 shows the mean absolute error (MAE) and root mean squared error (RMSE) for the various models trained with different feature vectors. Note that the hyperparameters of the regressors were optimized only once, using all features, and were then applied to all feature combinations. A separate optimization for each feature combination gave no significant improvement of the prediction performance.
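The two scores can be sketched in a few lines; the nucleation-field values below are invented for illustration and are not the ones reported in Table 1:

```python
# Sketch: MAE and RMSE between computed and predicted nucleation fields.
# The field values (in T/mu0) are toy numbers, not data from the paper.
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large errors more strongly."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

hn_computed  = [1.20, 0.85, 1.05, 0.60, 1.30]   # hypothetical simulation labels
hn_predicted = [1.10, 0.90, 1.00, 0.75, 1.25]   # hypothetical model output

print(f"MAE  = {mae(hn_computed, hn_predicted):.3f} T/mu0")
print(f"RMSE = {rmse(hn_computed, hn_predicted):.3f} T/mu0")
```

Both scores are computed per feature combination on the held-out folds, which yields the entries compared in Table 1.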

Table 1 Machine learning scores.

We compare predictions for the same map (EBSD-A) as well as for a new map (EBSD-B), as outlined in Fig. 1a, b, respectively. The lowest MAE is reached with the feature GRAIN, in which the location information of the grains is not included. The PIXEL feature, which contains the additional position information, performs slightly worse. Probably the redundant data and the high number of features (a large feature vector F) make the training more difficult25. The feature TWIN still produces an acceptable score, whereas SW and SIZE should only be used in addition to other features.

Fig. 1: Training and testing.
figure 1

a A 5-fold cross-validation is performed for the dataset EBSD-A (Fig. 3). b The full dataset EBSD-A is used for predictions in a new map EBSD-B (Fig. 5).

The combination of several features introduces more possibilities for accurately predicting the nucleation field. The best feature combinations are shown in Table 1. Although a combination of many features should increase the prediction accuracy, the best performance is obtained with the features GRAIN, SIZE, SW, and TWIN. Due to the dependency between PIXEL and GRAIN, which introduces a lot of redundant information into the feature vector F25, the prediction accuracy does not improve when the PIXEL feature is added.

In Fig. 2a we show predictions for the local nucleation fields in EBSD-A calculated by our voting regressor using all five features (PIXEL, GRAIN, SIZE, SW, and TWIN). Local nucleation fields are nicely reconstructed, with lower values at grain boundaries and higher values in the bulk, as shown in the simulations (Fig. 3c). For visual comparison with the previous image we use only GRAIN for predicting the local nucleation fields of the same EBSD map (Fig. 2b). Although the differences in MAE and RMSE in Table 1 are quite large, the visual comparison shows a very similar trend. This trend is also found in the simplified hysteresis curves constructed from the predicted nucleation fields in section Bulk magnetic properties.

Fig. 2: EBSD-A prediction maps.
figure 2

Nucleation fields Hn predicted for EBSD-A (Fig. 3) by using a PIXEL, GRAIN, SIZE, SW, TWIN, and b GRAIN as training features.

Fig. 3: Training map EBSD-A.
figure 3

a, b EBSD map of 600 × 400 pixels with a pixel edge length of 0.3 μm divided into two halves. a The left half of the figure shows an inverse pole figure (IPF) map converted with Dream3D (, last visited on 17/12/2019). b The right half of the figure shows crystallographic twin boundaries: true-twins (red), order-fault-twins (green), pseudo-twins (blue), and others (gray). c The same map is used to depict the boundaries and the micromagnetically computed nucleation fields Hn. Each square corresponds to a unique selection (10 × 10 pixels) color-coded with Hn. Red squares show weak spots in the microstructure.

Feature importance by partial dependence plots

A partial dependence plot highlights the effect of a given feature while averaging out the contributions of the other features26. In Fig. 4a we show one-way partial dependence plots of the features SW and TWIN, where the model was trained with a feature vector containing all features: PIXEL, GRAIN, SIZE, SW, and TWIN. We know from the literature that SW is important, and indeed the SW dependence plot shows an almost linear relation to the predicted values. For the TWIN feature we observe a similar trend, yet with a much stronger influence on the predictions. Large misalignment between the orientations of adjacent grains causes a drastic reduction in Hn. The two-way partial dependence plot of SW and TWIN in Fig. 4b shows that the highest predicted nucleation fields can be expected for large feature values of both SW and TWIN. Here again the strong influence of the TWIN feature is visible: strong misalignment drastically reduces Hn even for a large SW value. Partial dependence plots for PIXEL, GRAIN, and SIZE are not meaningful, as each of them consists of multiple features (see section Feature engineering).

Fig. 4: Partial dependence plots.
figure 4

a One-way partial dependence plots of the features SW and TWIN showing the effect on the predictions. b The interaction of SW and TWIN is shown in a two-way partial dependence plot (contour lines with predicted Hn). The model was trained with a feature vector containing all features (PIXEL, GRAIN, SIZE, SW, and TWIN).

Feature importance by linear regression

Based on a qualitative understanding of magnetization reversal, a simple model for nucleation fields is a linear regressor with SW and TWIN as features. The importance of these two features is also shown in Fig. 4. We therefore trained a linear regressor with an ordinary least squares fit. We applied a 5-fold cross-validation to assess the performance of the model on EBSD-A (Fig. 1a), and we tested the model on the new map EBSD-B (Fig. 1b). In Table 2 we show the MAE and RMSE scores of the linear regression model with the features SW and TWIN as input feature vector. The mean cross-validation scores are slightly worse than those obtained for the optimal feature combination with the voting regressor (see Table 1). However, when tested with previously unseen data, the linear regressor provides results comparable to the voting regressor. The MAE and RMSE on the test set (EBSD-B in Fig. 5) were 0.166 T/μ0 and 0.245 T/μ0 with the feature combination SW and TWIN. Values for predicting the new EBSD map are close to those of the voting regressor, which uses GRAIN, SIZE, SW, and TWIN as features.
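An ordinary least squares fit of Hn against SW and TWIN can be sketched directly with NumPy; the coefficients and data below are invented for illustration:

```python
# Sketch: OLS fit of Hn against the two derived features SW and TWIN.
# All values are toy numbers, not data from the paper.
import numpy as np

rng = np.random.default_rng(1)
sw   = rng.uniform(0.5, 2.0, 300)    # minimum Stoner-Wohlfarth switching field
twin = rng.uniform(0.0, 1.0, 300)    # misalignment-based TWIN feature
hn   = 0.8 * sw + 0.9 * twin + 0.1 + 0.02 * rng.standard_normal(300)

# Design matrix with an intercept column
A = np.column_stack([sw, twin, np.ones_like(sw)])
coef, *_ = np.linalg.lstsq(A, hn, rcond=None)

hn_pred = A @ coef
mae = np.mean(np.abs(hn - hn_pred))
print("coefficients:", coef, " MAE:", mae)
```

The fitted coefficients directly expose how strongly each derived feature drives the predicted nucleation field, which is the appeal of the linear baseline.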

Table 2 Linear regression scores.
Fig. 5: Comparison of computation and prediction for EBSD-B.
figure 5

EBSD map of 600 × 270 pixels with 0.3 μm pixel edge length. a Nucleation fields Hn computed and b predicted by using the data from EBSD-A (Fig. 3) with the features GRAIN, SIZE, SW, and TWIN. c Absolute deviation between calculated and predicted nucleation fields.

The comparison of the results for the optimized voting regressor and the linear model shows that the voting regressor fits the training data better. Nevertheless, both models achieve similar performance measures for previously unseen data. The very good results from the linear regressor also suggest that the features SW and TWIN are very powerful in the proposed machine learning model.

Microstructural complexity

So far we have used all the simulation data of EBSD-A (Fig. 3) for training and testing. For a better understanding of the MAE and RMSE values, we can split the data according to the number of grains in the selections. Each subset of EBSD-A with the same number of grains is used individually for training and testing (80% training data, 20% test data). As Table 3 shows, the MAE of the prediction increases with the number of grains. A large number of grains in a selection increases the system’s complexity, with many possibilities for magnetization reversal. Here, most likely, the amount of input data has to be extended in order to improve the prediction accuracy. The RMSE is rather high for more than one grain, which again indicates the lack of data.
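The grouping by grain count can be sketched as follows; the records (grain count, computed Hn, predicted Hn) are invented, chosen only to reproduce the qualitative trend of growing error with complexity:

```python
# Sketch: per-group MAE for selections grouped by their number of grains.
# The records are toy values (n_grains, computed Hn, predicted Hn) in T/mu0.
from collections import defaultdict

records = [
    (1, 1.30, 1.28), (1, 1.25, 1.30),
    (2, 1.00, 0.90), (2, 0.95, 1.10),
    (4, 0.70, 0.40), (4, 0.65, 0.95),
]

abs_errors = defaultdict(list)
for n_grains, hn_comp, hn_pred in records:
    abs_errors[n_grains].append(abs(hn_comp - hn_pred))

mae_by_grains = {n: sum(errs) / len(errs) for n, errs in sorted(abs_errors.items())}
print(mae_by_grains)   # MAE grows with the number of grains in this toy set
```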

Table 3 Prediction accuracy according to the number of grains in the selections.

Prediction accuracy of new EBSD maps

With the trained and validated machine learning model from EBSD-A (Fig. 3), any new EBSD map of the same material can be predicted in a matter of seconds rather than the weeks needed for micromagnetic simulations. To determine the prediction accuracy for new maps, we test the model trained on EBSD-A with selections of a new map, EBSD-B (Fig. 5). First, the machine learning model is trained with the data from EBSD-A. Afterwards, it is tested with an entirely new dataset from EBSD-B, which is obtained from a different sector of the same MnAl-C sample as EBSD-A and has a size of 180 × 81 μm2 (600 × 270 pixels). For this new data we compare micromagnetic simulations with the predictions of the machine learning model. We use the same sampling method as for EBSD-A: on a regular grid, every second selection with a size of 10 × 10 pixels is picked and a finite element mesh is created for the simulations. Duplicate selections are removed from the dataset. Figure 5a shows the micromagnetically computed nucleation fields Hn of EBSD-B (about 400 simulations).

We can now predict the nucleation fields Hn for the very same selections by using the regression trees trained on the original data of EBSD-A. We use the feature vector with the highest precision (see Table 1): GRAIN, SIZE, SW, and TWIN. Figure 5b illustrates the predicted nucleation fields Hn, which show less variation than the simulated map. Consequently the MAE, with a value of 0.157 T/μ0, is larger than in the previous results for EBSD-A (see Table 1). In Fig. 5c we show the absolute deviation ΔHn between the computed and predicted nucleation fields. There are regions where the machine learning model nicely predicts the computed results, whereas in other regions relevant training data are missing. More data need to be produced to better model these underrepresented areas. A large number of grains in the selections probably causes some of the deviations as well, as discussed in section Microstructural complexity.

We created a residual plot according to ref. 27, showing the difference Hn,comp − Hn,pred versus the predicted nucleation fields Hn,pred (Fig. 6). Most residuals lie in a range of ±1 T/μ0 and show no trend in the distribution, which is an important indicator of the quality of the machine learning model. In this work we are interested in weak regions of the microstructure, hence absolute values are less important. With the help of the machine learning model, trends in the distribution of the nucleation fields can be obtained very fast.

Fig. 6: Residual plot.
figure 6

Residuals are given for predictions of the nucleation fields in EBSD-A (Fig. 3) and EBSD-B (Fig. 5). For the values of EBSD-A we use the validation set of each fold, whereas for EBSD-B we use the full dataset (see Fig. 1a, b, respectively). The model for predicting the nucleation fields used GRAIN, SIZE, SW, and TWIN as features.

Bulk magnetic properties

A trained machine learning model can predict the local nucleation fields of a full EBSD map within seconds (see EBSD-A in Fig. 2). From this prediction we suggest a method to construct the hysteresis curve of the whole EBSD sample. In Fig. 7 we compare two demagnetization curves computed from predictions for EBSD-A with two different feature sets (shown in Fig. 2). We construct the curve with a decreasing external field from 0 to −3 T/μ0. At each field value we count the selections with a predicted nucleation field greater and lower than this particular field value. As soon as the field magnitude exceeds a locally predicted nucleation field, we assume that this selection is magnetically reversed (switched). The ratio of unswitched to all selections, rescaled to run from 1 to −1, yields the full curve. Due to this simplification we see a rather high coercive field for both curves (Hc ≈ 2.35 T/μ0); in reality the coercivity is much lower6. In the future we plan to include the stray field and the exchange coupling between the selections to improve the predicted hysteresis curves.
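The construction described above can be sketched in a few lines; the local nucleation fields are invented values, and stray field and exchange coupling between selections are neglected as in the text:

```python
# Sketch of the hysteresis construction: sweep the external field and flip
# each selection once the field magnitude exceeds its predicted nucleation
# field. The Hn values below are toy numbers in T/mu0.
import numpy as np

hn_pred = np.array([1.8, 2.1, 2.3, 2.4, 2.5, 2.6])   # local nucleation fields

fields = np.linspace(0.0, -3.0, 301)                 # decreasing external field
m = []                                               # reduced magnetization
for h in fields:
    switched = np.count_nonzero(hn_pred < abs(h))
    unswitched = hn_pred.size - switched
    m.append((unswitched - switched) / hn_pred.size) # rescales 1 -> -1

# Coercive field: first field value at which m drops to zero or below
hc = fields[np.argmax(np.array(m) <= 0.0)]
print(f"Hc ~ {abs(hc):.2f} T/mu0")
```

Adding stray-field and exchange interactions between selections would couple the switching events instead of treating each selection independently.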

Fig. 7: Full EBSD-A hysteresis curves.
figure 7

Constructed hysteresis curves neglecting stray fields and exchange coupling between selections. The local nucleation fields are taken from Fig. 2a (model with PIXEL, GRAIN, SIZE, SW, and TWIN) and Fig. 2b (model with GRAIN).


We have demonstrated a fast way to predict local nucleation fields for MnAl-C. An automated script scans a large EBSD dataset, selects hundreds of unique samples, and creates quasi-3D finite element meshes accordingly. For each selection the local nucleation field Hn is computed by a fast Landau–Lifshitz–Gilbert micromagnetic solver. The microstructural features of the EBSD selections and their nucleation fields are used to train random forest and gradient boosting regressors. A voting regressor combines these two methods for improved prediction accuracy. We use a 5-fold cross-validation to test the performance of single features and feature combinations on an EBSD map. Some features, like PIXEL, are themselves sets of multiple features, namely the data points of the EBSD map. On the one hand, training the model with PIXEL includes the orientation of the grains, the geometry, and the location information. On the other hand, it introduces a lot of redundant information into the dataset. The GRAIN feature includes only the orientation of the single grains and carries no geometrical or location information. This might be the reason that none of the single or combined features reach a mean absolute error below 0.1 T/μ0.

The machine learning model trained on the initial map EBSD-A is used to predict local nucleation fields of a new map, EBSD-B. Comparison between predictions and computations shows that nucleation fields in large areas of the new EBSD map can be nicely reconstructed. Some regions show larger deviations, which might be caused by missing relevant training data and the complexity of the microstructure. Partial dependence plots and a linear regression model showed the strength of the features SW (minimum Stoner–Wohlfarth switching field) and TWIN (measuring the maximum misalignment between adjacent grains in a selection). Both are derived features based on the physics of magnetization reversal. The feature SW is the classical minimum switching field according to the Stoner–Wohlfarth theory. Exchange interactions between neighboring grains reduce the switching field with increasing misalignment angle28.

Predicting the nucleation fields of a new map with the linear regressor (active features: SW and TWIN) showed very good results, coming close to the values predicted with the best feature combinations of the regression decision trees. The machine learning model could even be used for EBSD datasets with differing resolutions, because the geometrical information is not absolutely necessary; in such a case, the error caused by the different pixel size needs to be quantified. Using the trained machine learning model, a large EBSD dataset can be fully scanned and a map of local nucleation fields created within seconds. This information can serve as input for constructing a hysteresis curve of the whole EBSD sample.

Machine learning for magnetic materials faces the following challenges:

  1. Usually machine learning is related to big data, whereas in materials science the amount of data is typically low. Since it is difficult to obtain many experimental datasets, micromagnetic simulation is a way to generate training data. For the future, we hope that data assimilation methods that combine computed and experimental data will form the basis for datasets that relate coercivity and microstructure. The term data assimilation refers to any systematic procedure that uses experimental measurements to improve model simulation29. For example, Matsumoto et al. merge ab initio simulations and experimental data to develop light-rare-earth-based permanent magnets30.

  2. Micromagnetic simulations with the currently available input and computational resources are not accurate enough to give quantitative predictions of nucleation fields and coercivity. Nevertheless, they can be used to see trends in the nucleation field distribution or to find weak spots in the microstructure.

In the following we give prospective applications of machine learning in magnetism:

  1. Machine learning can be used to build better models. It can be seen as a reduced-order model that spans various length scales.

  2. The proposed method can be used to identify weak spots in a magnet’s microstructure.

  3. The most significant features can give a handle to improve the development process of permanent magnets.

Machine learning in materials science is commonly used to characterize the materials’ microstructure. With this work we hope to bridge the gap between machine learning and the magnetism community and to foster research in magnetism.


In the following section we explain the basics of machine learning and the regression decision trees we use. We give insights into the selected microstructural features and our optimized labels for training the machine learning model. The input data are generated by an automated micromagnetic meshing and simulation routine and consist of many small selections of a large Electron Backscatter Diffraction (EBSD) image. For each selection we compute the local nucleation field with an external field applied in the y-direction (see the coordinate system in the figures). To improve the prediction accuracy we optimize the hyperparameters of the decision trees.

Machine learning

Machine learning is a statistical approach to automatically analyze large datasets (explained in detail, for instance, in ref. 24). The dataset is structured into features and labels. Features are the input for the machine learning model in the form of a vector. Labels are the quantities we want to teach the model to predict. A model is fitted to the data (training), while it is tested with a subset of the data to assess the predictive performance (validation). The trained and validated model is then used to predict the results for new datasets (test). We apply supervised learning, where the input data are already given with the true solutions (labels). In this work we use microstructural features of MnAl-C obtained by Electron Backscatter Diffraction (EBSD) microscopy. We generate many small quasi-3D sub-models (selections) of the EBSD dataset and compute the magnetic nucleation fields Hn, the corresponding labels.

In a first step we create input data from EBSD-A (Fig. 3), to which we apply a 5-fold cross-validation (Fig. 1a). Training and testing are performed five times with different subdivisions of the dataset, and the prediction accuracy is averaged to obtain an overall training performance. In a second step the machine learning model is trained again with the full data of EBSD-A (Fig. 1b). For testing, we apply this model to EBSD-B (Fig. 5), which is obtained from a different sector of the same MnAl-C sample. The data associated with this new EBSD map were not seen before by the machine learning model. For quantitative analysis of the predicted values we use the mean absolute error (MAE) and the root mean squared error (RMSE). The RMSE penalizes large errors considerably, while the MAE makes the results easier to interpret.
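The two-step protocol can be sketched with Scikit-Learn; the feature matrices standing in for EBSD-A and EBSD-B are toy data, and a plain linear model replaces the regression trees for brevity:

```python
# Sketch: 5-fold cross-validation on "EBSD-A" data, then one final test on
# held-out "EBSD-B" data. All arrays are toy stand-ins for the real features.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(2)
X_a = rng.uniform(size=(100, 2))
y_a = X_a @ [1.0, 2.0] + 0.01 * rng.standard_normal(100)
X_b = rng.uniform(size=(40, 2))
y_b = X_b @ [1.0, 2.0] + 0.01 * rng.standard_normal(40)

# Step 1: 5-fold cross-validation on EBSD-A, averaged over the folds
fold_mae = []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X_a):
    model = LinearRegression().fit(X_a[train_idx], y_a[train_idx])
    fold_mae.append(mean_absolute_error(y_a[val_idx], model.predict(X_a[val_idx])))
print("mean CV MAE:", np.mean(fold_mae))

# Step 2: retrain on all of EBSD-A, test once on the unseen EBSD-B
final = LinearRegression().fit(X_a, y_a)
print("EBSD-B MAE:", mean_absolute_error(y_b, final.predict(X_b)))
```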

Classification and regression are the most common tasks in supervised learning. In classification the labels are split into classes, whereas regression models train on and predict real values. We use regression decision trees to predict the nucleation field Hn. With linear regression the labels are assumed to depend linearly on the features, yet in real scenarios, which are usually much more complex, the dependence is nonlinear. Similar to Exl et al.18, we use random forest (RF) and gradient boosting (GB) decision trees to account for the nonlinearity we observed in our results. RF trees combine predictions of individual decision trees trained on randomly generated sub-training samples (often called ensemble learning). GB trees also combine predictions from several sub-training samples, but here each tree learns from the residual errors of the previous one. A voting regressor (VR) is then used to average the individual predictions on the whole dataset and to form a final prediction.
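The RF + GB ensemble averaged by a voting regressor can be sketched as follows; the data and the nonlinear target are invented, and the parameters are generic defaults rather than the optimized values of Supplementary Note 1:

```python
# Sketch: random forest and gradient boosting regressors combined by a
# voting regressor, trained on toy nonlinear data.
import numpy as np
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              VotingRegressor)

rng = np.random.default_rng(3)
X = rng.uniform(size=(200, 4))                 # toy feature vectors
y = X[:, 0] + np.sin(3 * X[:, 1]) + 0.05 * rng.standard_normal(200)  # nonlinear

rf = RandomForestRegressor(n_estimators=100, random_state=0)
gb = GradientBoostingRegressor(n_estimators=100, random_state=0)
vr = VotingRegressor([("rf", rf), ("gb", gb)]).fit(X, y)

hn_pred = vr.predict(X)                        # averaged ensemble prediction
```

The voting regressor simply averages the two ensemble predictions, which tends to smooth out the individual models' errors.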

We use Python with the Scikit-Learn framework24,31 to model the decision trees and to have the respective scores, hyperparameter optimization, and visual representations available. The hyperparameters of our RF and GB regressors are optimized with the GridSearchCV method using the full feature vector (see details in section Feature engineering) and a 5-fold cross-validation (see Fig. 1a). As a score we use the negative mean squared error. Details on the optimized hyperparameters can be found in Supplementary Note 1. We also tested a Bayesian optimizer32 with 5-fold cross-validation to tune the hyperparameters for each possible feature combination (see section Feature importance by feature vector), yet it showed no significant effect on the results. In other words, our regressors are very robust against changes of the hyperparameters. Therefore we keep the hyperparameters from the GridSearchCV approach for the rest of this work.
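The grid search described above can be sketched as follows; the parameter grid and toy data are illustrative, not the grid used in the paper (see Supplementary Note 1 for the actual hyperparameters):

```python
# Sketch: GridSearchCV with negative mean squared error and 5-fold CV.
# Grid values and data are illustrative only.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(4)
X = rng.uniform(size=(150, 3))
y = 2.0 * X[:, 0] - X[:, 2] + 0.05 * rng.standard_normal(150)

param_grid = {"n_estimators": [50, 100], "max_depth": [2, 3]}  # hypothetical grid
search = GridSearchCV(GradientBoostingRegressor(random_state=0),
                      param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X, y)
print(search.best_params_, "best MSE:", -search.best_score_)
```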

Dataset generation

Machine learning usually requires a large amount of data to accurately predict new datasets. Using micromagnetic simulations, a reasonable amount of data can be generated for training our regression decision trees. We compute the nucleation fields Hn of MnAl-C selections obtained from EBSD data. The original data are represented as crystallographic orientations on a regular grid, split into two files: one contains the grain index of each pixel, and the other stores the corresponding crystallographic orientation matrices. We translate each orientation matrix into a unit vector, since we perform micromagnetic simulations on uniaxial grains. EBSD-A (Fig. 3) has a size of 180 × 120 μm2 (600 × 400 pixels). Since this sample is far too large to simulate as a whole, we simulate about 600 selections of 10 × 10 pixels from the entire image. These selections are picked on a regular grid. To limit the number of simulations, only every second selection is taken. We extrude and triangulate each such selection into a finite element mesh with elements much smaller than a pixel and determine the nucleation fields by micromagnetic simulations.
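The sampling of selections can be sketched as below. The map dimensions follow the paper; the checkerboard pattern for "every second selection" is an assumption about the exact layout, and the subsequent duplicate removal (discussed next) reduces the candidates to the roughly 600 simulated selections:

```python
# Sketch: tile a 600 x 400 pixel EBSD map into 10 x 10 pixel selections on a
# regular grid and keep every second selection (assumed checkerboard pattern).
width, height, sel = 600, 400, 10

selections = []
for i, y0 in enumerate(range(0, height, sel)):
    for j, x0 in enumerate(range(0, width, sel)):
        if (i + j) % 2 == 0:            # keep every second selection
            selections.append((x0, y0))

print(len(selections))   # half of the 60 * 40 = 2400 possible selections
```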

All selections taken from inside the same grain, i.e., selections that include no boundaries and have exactly the same magnetocrystalline orientation, result in the same nucleation field. This is obvious because all such selections share the same features. Thus taking more than one selection from the interior of one grain just duplicates the datasets. Since we want to understand how the local structure influences the local nucleation field, we can drop the duplicates. Indeed, tests confirmed this: picking 1500 selections from EBSD-A (Fig. 3) at random without removing duplicates gave poor prediction accuracy, whereas sampling on a regular grid and removing duplicates gave the best prediction accuracy. This is consistent with a Monte Carlo study of the influence of duplicate records on regression estimates33, in which duplicates were found to increase the RMSE depending on their number and position. Figure 3c shows the grain boundaries of EBSD-A and the simulation selections (about 600) colored by their computed nucleation field Hn. Due to the uniqueness constraint, the selections are preferentially located at the grain boundaries. Since nucleation usually starts near defects, which for MnAl-C are most often grain or twin boundaries (Fig. 3b), those locations are the most important determining factors for the bulk coercivity.
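
Dropping the duplicates amounts to keeping one representative per unique feature row, a minimal sketch of which is:

```python
import numpy as np

# Toy feature rows: selections from inside the same grain produce
# identical rows, so only one of them should survive.
rows = np.array([
    [0.1, 0.9],
    [0.1, 0.9],   # duplicate of the first row (same grain interior)
    [0.3, 0.7],
])

# Keep one representative per unique feature row before training.
unique_rows = np.unique(rows, axis=0)
```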

Feature engineering

We are using supervised learning, which means that the input data consist of features and labels (see section Machine Learning). As stated in ref. 34, feature engineering is an essential step in establishing a machine learning model. It is the process of transforming the raw data into good quality features, which improves the overall performance of the machine learning model. Features depend strongly on the underlying problem and on the raw data. In our case the features are extracted from the microstructure of MnAl-C, whereas the labels are the results of micromagnetic simulations (nucleation field Hn).

For each selection of the large EBSD image we know the crystallographic orientation. We can use each individual pixel of an EBSD selection, with its orientation, as a feature and/or we can use each grain, with its orientation. While pixels also carry location coordinates, this information is not given in the list of grains. In our micromagnetic simulations we assume grains with uniaxial magnetocrystalline anisotropy. Hence, it is sufficient to define the magnetocrystalline orientation of each grain as a unit vector pointing into the hemisphere of the positive y-direction (Fig. 3). The size of the grains can affect the nucleation field and is therefore also considered as a microstructural feature.
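
The mapping from an orientation matrix to a unit vector in the positive-y hemisphere might look as follows; which column of the matrix holds the easy axis depends on the EBSD data convention, so the choice below is an assumption for illustration:

```python
import numpy as np

def easy_axis(orientation_matrix, column=2):
    """Return the easy-axis unit vector, flipped into the y >= 0 hemisphere.

    Which column of the EBSD orientation matrix holds the c-axis depends
    on the data convention; column=2 is an illustrative assumption.
    """
    v = orientation_matrix[:, column]
    v = v / np.linalg.norm(v)
    # Flip the sign so the vector points into the positive-y hemisphere.
    return -v if v[1] < 0 else v

# Example: a rotation whose c-axis initially points into negative y.
R = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
u = easy_axis(R)
```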

The coercivity of permanent magnets is often described by

$${H}_{{\rm{c}}}=\alpha {H}_{{\rm{N}},\min }-{N}_{{\rm{eff}}}{M}_{{\rm{s}}},$$

where \({H}_{{\rm{N}},\min }\) is the minimum switching field of misoriented grains8. The coefficient α expresses the reduction in coercivity due to defects and intergrain exchange interactions. The microstructural parameter Neff is related to the effect of the local demagnetization field near sharp edges and corners of the microstructure. Using the Stoner–Wohlfarth (SW) model for single-domain ferromagnets35 we analytically calculate the minimum switching field over all grains i:

$${H}_{{\rm{N}},\min }=\mathop{\min }\limits_{i}\left\{\frac{2K}{{J}_{{\rm{s}}}}{\left({\cos }^{2/3}{\beta }_{i}+{\sin }^{2/3}{\beta }_{i}\right)}^{-3/2}\right\},$$

with the misorientation angle βi between the easy axis of a grain i and the external field. We assume that the external field is applied parallel to the y-axis of a Cartesian coordinate system (Fig. 3). The lowest field value of all grains gives the irreversible switching event for this selection, if other effects like stray field and grain boundary exchange coupling are neglected. The minimum switching field \({H}_{{\rm{N}},\min }\) is used as another feature called SW.
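
The SW feature can be evaluated directly from the formula above; the material constants below are illustrative placeholders, not the values of Table 4:

```python
import numpy as np

def sw_switching_field(beta, K, Js):
    """Stoner-Wohlfarth switching field for misorientation angle beta (rad).

    K in J/m^3 and Js in T give the field in A/m.
    """
    c, s = np.abs(np.cos(beta)), np.abs(np.sin(beta))
    return 2 * K / Js * (c ** (2 / 3) + s ** (2 / 3)) ** -1.5

# Illustrative material constants (placeholders, not Table 4 values).
K = 1.5e6   # J/m^3
Js = 0.8    # T

betas = np.deg2rad([5.0, 30.0, 60.0])   # misorientation of each grain
H_N_min = sw_switching_field(betas, K, Js).min()
```

At 45° misorientation the switching field reaches its minimum of K/Js, half the value of a perfectly aligned grain, which is the classical SW result.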

Twin boundaries are known to have a strong influence on the quality of permanent magnets. In MnAl-C especially, true-twins, order-fault-twins and pseudo-twins are frequently found6, all of which have a clearly defined misorientation angle. In previous work we computed hysteresis curves for selections of MnAl-C and showed a significant effect of twin boundaries on the coercive field14,23. From manual observation of the simulation results and from partial dependence plots in section Feature importance by partial dependence plots we observe a negative influence on the coercivity if the adjacent grains’ orientations enclose a large angle. Therefore we incorporate the twin angles into a feature for machine learning by calculating the minimum dot product of adjacent grains’ orientation unit vectors. A large angle implicitly means that both adjacent grains are disadvantageously oriented with respect to the applied external field direction. The SW model, in contrast, incorporates only the misorientation angle of a single grain’s orientation with respect to the external field.
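
A sketch of the TWIN feature, assuming the adjacency pairs (grains sharing a boundary in the selection) have been extracted beforehand; orientations and pairs here are toy values:

```python
import numpy as np

# Orientation unit vectors per grain (toy values, positive-y hemisphere)
# and the pairs of grains sharing a boundary (assumed precomputed).
orientations = {
    1: np.array([0.0, 1.0, 0.0]),
    2: np.array([0.0, 0.2, 0.98]) / np.linalg.norm([0.0, 0.2, 0.98]),
    3: np.array([0.0, 0.9, 0.44]) / np.linalg.norm([0.0, 0.9, 0.44]),
}
adjacent_pairs = [(1, 2), (1, 3), (2, 3)]

# TWIN feature: minimum dot product over all adjacent grain pairs.
# A small value means a large enclosed angle at some boundary.
twin = min(orientations[a] @ orientations[b] for a, b in adjacent_pairs)
```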

From the microstructural attributes and analytical equations we acquire the following features used for machine learning. In brackets the short feature name is given as well as the number of features as seen by the machine.

  • Crystallographic orientation for each data point as unit vector (PIXEL, 300)

  • Crystallographic orientation for each grain as unit vector (GRAIN, 60)

  • Size of grains (SIZE, 20)

  • Minimum Stoner–Wohlfarth switching field \({H}_{{\rm{N}},\min }\) (SW, 1)

  • Minimum dot product of adjacent grains’ orientation (TWIN, 1)

All the given features form the feature vector F:

$${\bf{F}} = [\underbrace {{{\rm{P}}_{1xyz}}\, \ldots \,{{\rm{P}}_{100xyz}}}_{{\rm{PIXEL}}}\,\underbrace {{{\rm{G}}_{1xyz}}\, \ldots \,{{\rm{G}}_{20xyz}}}_{{\rm{GRAIN}}}\,\underbrace {{\rm{S}}{{\rm{I}}_1}\, \ldots \,{\rm{S}}{{\rm{I}}_{20}}}_{{\rm{SIZE}}}{\rm{SW}}\,{\rm{TWIN}}]$$

PIXEL’s features are the three components of the orientation unit vector for each of the 10 × 10 pixels. The Scikit-Learn software requires a fixed feature vector length, so we fix the maximum number of grains at 20, which is higher than the largest number of grains occurring in our selections. Therefore SIZE has 20 entries in the feature vector, and GRAIN has 20 × 3 entries, which are the components of the orientation unit vectors. We pad the remaining GRAIN and SIZE entries with zeros to fill up the required feature vector length.
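
The padding scheme can be sketched as follows (function and variable names are ours, for illustration):

```python
import numpy as np

MAX_GRAINS = 20  # fixed so every selection has the same feature vector length

def feature_vector(pixel_vecs, grain_vecs, sizes, sw, twin):
    """Assemble F = [PIXEL | GRAIN | SIZE | SW | TWIN], zero-padded to 20 grains."""
    grain_part = np.zeros((MAX_GRAINS, 3))
    grain_part[:len(grain_vecs)] = grain_vecs
    size_part = np.zeros(MAX_GRAINS)
    size_part[:len(sizes)] = sizes
    return np.concatenate([
        np.asarray(pixel_vecs).ravel(),   # 100 pixels x 3 = 300 entries
        grain_part.ravel(),               # 20 grains x 3 = 60 entries
        size_part,                        # 20 entries
        [sw, twin],                       # 2 entries
    ])

# A selection with 3 grains; the remaining 17 slots stay zero.
F = feature_vector(np.zeros((100, 3)), np.zeros((3, 3)), [40, 35, 25], 2.1, 0.4)
```

The total length is 300 + 60 + 20 + 2 = 382, matching the feature vector F above.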

Each small selection of the large EBSD image uses the feature vector F as input for learning. It is possible to use only a single feature or a combination of features, but the feature vector length must be kept equal within a training set. In other words, our machine learning model is always trained with a single training set with a well defined set of features. Typical feature correlation or feature elimination methods cannot be used in such a configuration, because each component of the unit vectors in PIXEL and GRAIN is already a feature and hence cannot be reasonably compared to other features like SW or TWIN. In models based on decision trees feature scaling is less important. However, we found that some features contribute more to a change in the nucleation field Hn than others. Feature importance will be discussed in the Results section.

Labeling

Training the machine using supervised learning requires a well defined set of features as well as a label. We defined the microstructural features in the previous section. For the label we chose the nucleation field Hn, defined as the value of the external field at which the first irreversible switching event occurs along the demagnetization curve. Irreversible switching can be recognized by a kink in the demagnetization curve, where a discontinuity of magnetization and susceptibility occurs10. Often, the switching field is sufficiently well defined by the coercivity, i.e., the field needed to reverse the sign of the magnetization projected onto the field direction, as for the green line in Fig. 8. For such a curve the magnet is clearly in a reversed state after the step. The blue line is much flatter and shows three small steps.

Fig. 8: Sample hysteresis curves.
figure 8

Hysteresis curves may contain single (green) or multiple (blue) switching events. a First step, b last step, and c largest step of the curve.

There might be more than one irreversible switching event along the demagnetization curve of a selection. Therefore we have different possibilities for defining a label: the value of the external field at (a) the first step, (b) the last step, or (c) the largest step in the curve. In order to find the label with the best prediction performance we created a simple classification problem, which tries to predict the 20th percentile of the lowest switching fields. We obtained 90, 85, and 86% correct predictions when using (a), (b), or (c) as label, respectively. Therefore we decided to stick with the first step of the curve (Fig. 8a), the nucleation field Hn, as the definition of our label.
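
Extracting the three candidate labels from a demagnetization curve amounts to locating the jumps in the magnetization; the step-detection threshold below is an illustrative choice, not a value from this work:

```python
import numpy as np

def switching_fields(H, M, threshold=0.1):
    """Fields at which |dM| between successive field steps exceeds threshold.

    Returns the (first, last, largest) step fields, matching the label
    choices (a)-(c). The threshold separating an irreversible step from
    reversible rotation is an illustrative assumption.
    """
    dM = np.abs(np.diff(M))
    idx = np.where(dM > threshold)[0]
    largest = idx[np.argmax(dM[idx])]
    return H[idx[0]], H[idx[-1]], H[largest]

# Toy demagnetization curve with three steps (cf. the blue curve in Fig. 8).
H = np.linspace(4, -4, 9)
M = np.array([1.0, 1.0, 0.95, 0.5, 0.45, 0.0, -0.05, -1.0, -1.0])
first, last, largest = switching_fields(H, M)
```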

Micromagnetic simulations

We generate the input data for our machine learning model with micromagnetic simulations (see section Dataset generation). The resolution of the EBSD dataset is not high enough to be used directly for micromagnetic simulations: grain boundaries that are small compared to the grid resolution are not accurately resolved spatially, and sharp angles need to be smoothed in order to reduce numerical instabilities in the simulations. In previous work we presented an automated meshing procedure for the fast generation of simulation models23. We combine multiple steps of the toolchain into a single meshing tool using the pre- and postprocessing software Salome (last visited on 29/07/2019). We load the pixelated EBSD data into the software, smooth the grain boundaries and create a high quality finite element mesh (Supplementary Algorithm 1). The tool finds all intersection points of the grain boundaries and reconnects them with Bézier curves, replacing the stepped lines of the original pixelated EBSD image. Figure 9 shows an example of an original EBSD selection, the creation of a smoothed 2-dimensional surface and the extruded 3-dimensional mesh. The smoothing operation creates many more data points (pixels) than the original selection contains. A higher number of data points increases the number of components in the feature vector F (see section Feature engineering), which slows down the learning process but does not significantly improve the machine learning accuracy. Therefore we stick to the discretization points of the original EBSD data for training the machine learning model.

Fig. 9: Automated mesh generation.
figure 9

a Original EBSD data selection, b smoothed grain boundaries, and c tetrahedral finite element mesh with arbitrary colors according to the grain id. Simulation parameters are given in Table 4.

The finite element mesh obtained from the automated meshing routine is used by a hybrid finite element/boundary element method for magnetostatics17. We compute magnetization reversal curves by solving the Landau–Lifshitz–Gilbert (LLG) equation. The intrinsic magnetic properties listed in Table 4 are those of bulk MnAl-C36 at 300 K. The largest edge length of the tetrahedral finite elements, i.e., the mesh size, is set to be smaller than the smallest characteristic length of the material23. Bance et al. downscaled the original EBSD dataset by ratios of 1:500 and 1:50 to reduce computational costs14. Here we use a scaling ratio of 1:15, so that 1 μm in the original EBSD dataset corresponds to 67 nm in the simulations. Please note that scaling might only change the magnetostatic interactions, whereas the influence of misalignment and exchange interactions between different grains on the nucleation is fully taken into account. The final mesh of a single selection has a size of 200 × 200 × 40 nm3 with about 370,000 tetrahedrons, 30,000 surface triangles, and 70,000 nodes. A simulation on a single CPU takes several hours depending on the magnetic configuration.

Table 4 Micromagnetic parameters for the simulations.

The nucleation fields Hn obtained by the micromagnetic simulations are the labels for the machine learning model. We compute the hysteresis curves with an external field aligned in the y-direction (Fig. 9), swept from 4 to −4 T/μ0 with a sweep rate of 40 mT/ns and a Gilbert damping constant of 1. For each selection we apply a free boundary, which creates demagnetizing fields and may also cause edge nucleation. However, we found that misalignment between neighboring grains (twins) leads to a much larger reduction of coercivity than demagnetizing fields at the edges: the reversed domains nucleate at the interface between strongly misoriented regions (twin boundaries). In other words, the effect of the demagnetizing field is smaller than that of misoriented grains (see section Feature importance by partial dependence plots). Nevertheless, we emphasize that this gives only an approximation of the local nucleation field. Each square selection in Fig. 3c is color-coded according to its nucleation field Hn from red (1 T/μ0) to green (3 T/μ0). Please be aware of our definition of Hn as the first step in the demagnetization curve, as explained in section Labeling.