Machine Learning Guided Discovery of Gigantic Magnetocaloric Effect in HoB$_{2}$ Near Hydrogen Liquefaction Temperature

Magnetic refrigeration exploits the magnetocaloric effect, the entropy change upon application and removal of magnetic fields in materials, providing an alternative path for refrigeration to the conventional gas cycles. While intensive research has uncovered a vast number of magnetic materials which exhibit a large magnetocaloric effect, these properties remain unknown for a large number of compounds. To explore new functional materials in this unknown space, machine learning is used as a guide for selecting materials which could exhibit a large magnetocaloric effect. By this approach, HoB$_{2}$ is singled out and synthesized, and its magnetocaloric properties are evaluated, leading to the experimental discovery of a gigantic magnetic entropy change of 40.1 J kg$^{-1}$ K$^{-1}$ (0.35 J cm$^{-3}$ K$^{-1}$) for a field change of 5 T in the vicinity of a ferromagnetic second-order phase transition with a Curie temperature of 15 K. This is the highest value reported so far, to our knowledge, near the hydrogen liquefaction temperature; HoB$_{2}$ is thus a highly suitable material for hydrogen liquefaction and low-temperature magnetic cooling applications.


Introduction
The magnetocaloric effect (MCE) is becoming a promising route to environmentally friendly cooling, as it does not depend on the use of hazardous or greenhouse gases [1-4] while being, in principle, able to attain a higher thermodynamic cycle efficiency [1,5,6]. The refrigeration cycle makes use of the magnetic entropy change ($\Delta S_M$) and the adiabatic temperature change ($\Delta T_{ad}$) produced by the application and removal of a magnetic field in a material.
Since large values of SM are usually achieved near the magnetic transition temperature (Tmag), the working temperature range is confined around the Tmag of the material. MCE was first used to achieve ultra-low cryogenic temperatures (below 1 K) 7 and has been widely used for liquefying He 8 where the main component is the gadolinium gallium garnet Gd3Ga5O12(GGG) 9 . The remarkable discovery of giant MCE near room temperature in materials such as Gd5Si2Ge2 10 , La(Fe,Si)13 11 and MnFeP1-xAsx 3 families, shifted the main focus of research into finding and tuning new materials, such as NiMnIn Heusler alloys 12 , working around this temperature range due to its potential economic and environmental impact. 2 On the other hand, there is an increasing demand for cooling systems around hydrogen liquefaction temperature (T = 20.3 K), since liquid hydrogen is one of the candidates as green fuel for substitution of petroleum-based fuels 13 and also widely needed as a rocket propellant and space exploration fuel 14,15 . It has been shown that MCE based refrigerators prototypes have shown to be highly appropriate for this task. 5 In this context, the discovery of materials exhibiting remarkable MCE response near the liquefaction temperature of hydrogen is highly anticipated.
One way of tackling such a problem is by taking advantage of data-driven approaches, such as machine learning (ML), which has been successfully applied to tasks ranging from the modeling of new superconductors [16,17] and thermoelectrics [18,19] to the prediction of the synthesizability of inorganic materials [20]. However, in the case of MCE, this kind of approach has not been extensively tried, being limited to first-principles calculations restricted to non-rare-earth systems [21]. As a result of extensive research on magnetocaloric materials, accumulated data on the MCE properties of diverse types of materials can be accessed in recent reviews [2,22]. In addition, recent efforts to extract the $T_{mag}$ of materials from research reports led to the creation of MagneticMaterials [23], an auto-generated database of magnetic materials built by natural language processing, which contains a vast number of materials whose magnetic properties are known. Among these materials, there are still many whose MCE properties have not been evaluated. Therefore, by combining the known and unknown data, a data-driven trial can be used as a guide for finding materials with a high MCE response.
In this work, we attempt a novel approach by using ML to screen and select compounds which might exhibit a high MCE performance, focusing on ferromagnetic materials with a Curie temperature ($T_C$) around 20 K. For this purpose, we collected data from the literature [2,23] to train an ML algorithm to predict $\Delta S_M$ for a given material composition. By this method, we singled out HoB$_{2}$ ($T_C$ = 15 K [24]) as a possible candidate, leading us to the experimental discovery of a giant MCE of |$\Delta S_M$| = 40.1 J kg$^{-1}$ K$^{-1}$ (0.35 J cm$^{-3}$ K$^{-1}$) for a field change of 5 T in this material.
Figure 1 shows the schematic flow of the construction of the machine-learned model for MCE materials. We started by screening magnetocaloric-relevant papers from the MagneticMaterials [23] database and gathered the MCE properties reported therein, focusing mostly on the reported values of |$\Delta S_M$| for a given field change ($\mu_0\Delta H$) and material composition. Combining the obtained data with the data reported in a recent review [2], 1644 data points were obtained. To predict the MCE property of a material, namely |$\Delta S_M$|, we chose compositional-based features (such as the atomic mass of the constituent elements, their specific heat at 295 K, their number of valence electrons, etc.), which were extracted directly from the material compositions by using the XenonPy package [25]. Combining the extracted features with the reported values of |$\Delta S_M$| and $\mu_0\Delta H$, a gradient boosted tree algorithm implemented in the XGBoost [26] package was trained on 80% of the total data. To further improve the predictive power, model selection and hyperparameter tuning were done by using a Bayesian optimization technique implemented in the HyperOpt [27] package, minimizing the mean absolute error (MAE) of a 10-fold cross-validation score. The resulting model achieved an MAE of 1.8 J kg$^{-1}$ K$^{-1}$ when tested on the remaining 20% of the data.
For more details about the data acquisition, data processing and model building, see Supplementary Information sections 1-3. After the model construction, we examined 818 text-mined compositions with unknown $\Delta S_M$ and $T_C$ ≤ 150 K contained in MagneticMaterials using the following criteria: a predicted value of |$\Delta S_M$| higher than 15 J kg$^{-1}$ K$^{-1}$, alloys only, and a chemical composition containing heavier rare-earth elements (Gd-Er) and free of toxic elements, such as arsenic. As a result, HoB$_{2}$ (AlB$_{2}$ type, space group P6/mmm) was selected for having the highest predicted |$\Delta S_M$| (16.3 J kg$^{-1}$ K$^{-1}$) for $\mu_0\Delta H$ = 5 T among the binary candidates, followed by its synthesis and the characterization of its MCE properties.
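The screening criteria listed above can be expressed as a simple filter. The following is a minimal sketch, not the authors' code: the record layout, the function name `passes_screening` and the two toy entries are hypothetical, and only the HoB$_{2}$ values (predicted 16.3 J kg$^{-1}$ K$^{-1}$, $T_C$ = 15 K) come from the text. The "alloys only" criterion is not modeled here.

```python
# Heavier rare-earth elements named in the text (Gd-Er).
HEAVY_RARE_EARTHS = {"Gd", "Tb", "Dy", "Ho", "Er"}
# The text gives arsenic as an example of a toxic element; extend as needed.
TOXIC_ELEMENTS = {"As"}


def passes_screening(candidate, min_entropy=15.0, max_tc=150.0):
    """Return True if a candidate record satisfies the screening criteria."""
    elements = set(candidate["composition"])
    return (
        candidate["predicted_dS"] > min_entropy      # |dS_M| in J kg^-1 K^-1
        and candidate["Tc"] <= max_tc                # Curie temperature in K
        and bool(elements & HEAVY_RARE_EARTHS)       # contains Gd-Er
        and not (elements & TOXIC_ELEMENTS)          # free of toxic elements
    )


candidates = [
    # Values for HoB2 are taken from the text.
    {"name": "HoB2", "composition": {"Ho": 1.0, "B": 2.0},
     "Tc": 15.0, "predicted_dS": 16.3},
    # Toy entries with made-up numbers, for illustration only.
    {"name": "toy-toxic", "composition": {"Gd": 1.0, "As": 1.0},
     "Tc": 20.0, "predicted_dS": 18.0},
    {"name": "toy-low", "composition": {"Er": 1.0, "Al": 2.0},
     "Tc": 14.0, "predicted_dS": 12.0},
]

selected = [c["name"] for c in candidates if passes_screening(c)]
```

With these inputs only the HoB$_{2}$ record survives: the first toy entry is rejected for containing arsenic and the second for a predicted value below the threshold.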

Sample Synthesis
Polycrystalline samples of HoB$_{2}$ were synthesized by an arc-melting process in a water-cooled copper-hearth arc furnace. Stoichiometric amounts of Ho (99.9% purity) and B (99.5% purity) were arc melted under an Ar atmosphere. To ensure homogeneity, the sample was flipped and melted several times, followed by annealing in an evacuated quartz tube at 1000 °C for 24 hours, and was finally water quenched. X-ray diffraction was carried out and HoB$_{2}$ was confirmed as the main phase of the obtained sample (Fig. S2a).

Magnetic Measurements
Magnetic measurements were carried out using a superconducting quantum interference device (SQUID) magnetometer contained in the MPMS XL (Magnetic Property Measurement System, Quantum Design).

Specific Heat Measurement
Specific heat measurement was carried out in a PPMS (Physical Property Measurement System, Quantum Design) equipped with a heat capacity option. For further evaluation of the MCE performance of HoB$_{2}$, the specific heat data (Fig. S1b) were used to calculate the entropy curves (see Supplementary Information for the details), from which the adiabatic temperature change ($\Delta T_{ad}$) was obtained (Fig. 3b), revealing a maximum $\Delta T_{ad}$ of 12 K for a field change of 5 T.
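As an illustration of how entropy curves and $\Delta T_{ad}$ can be derived from such data, the sketch below integrates $C/T'$ numerically and reads $\Delta T_{ad}$ as the horizontal distance between the zero-field and in-field entropy curves at constant entropy. It assumes the usual relation $S(T) = \int_{T_{min}}^{T} C/T'\, dT'$ with the lowest tabulated temperature as $T_{min}$; the function names and the trapezoidal and linear-interpolation choices are assumptions of this sketch, not a description of the authors' analysis.

```python
def entropy_from_specific_heat(temps, heat_caps):
    """Cumulative trapezoidal integral of C/T' from temps[0] up to each T."""
    entropy = [0.0]
    for i in range(1, len(temps)):
        y0 = heat_caps[i - 1] / temps[i - 1]
        y1 = heat_caps[i] / temps[i]
        entropy.append(entropy[-1] + 0.5 * (y0 + y1) * (temps[i] - temps[i - 1]))
    return entropy


def temperature_at_entropy(entropy, temps, target):
    """Temperature at which a strictly increasing S(T) curve reaches target."""
    for i in range(1, len(entropy)):
        if entropy[i - 1] <= target <= entropy[i]:
            frac = (target - entropy[i - 1]) / (entropy[i] - entropy[i - 1])
            return temps[i - 1] + frac * (temps[i] - temps[i - 1])
    raise ValueError("target outside the tabulated entropy range")


def adiabatic_temperature_change(temps, s_zero, s_field):
    """Delta T_ad(T) = T(S, H) - T(S, 0), evaluated at S = S(T, 0).

    Returns None where the in-field curve does not cover that entropy.
    """
    result = []
    for t, s in zip(temps, s_zero):
        try:
            result.append(temperature_at_entropy(s_field, temps, s) - t)
        except ValueError:
            result.append(None)
    return result
```

Both entropy curves are assumed to be tabulated on the same temperature grid and to increase strictly with temperature.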

Discussion
In order to compare the performance of HoB$_{2}$ with other candidates for refrigeration applications near the hydrogen liquefaction temperature, such as ErAl$_{2}$ [5], representative materials with large $\Delta S_M$ (for $\mu_0\Delta H$ = 5 T) around T = 20 K are displayed in Table 1. We also show the values of |$\Delta S_M$| in units of J cm$^{-3}$ K$^{-1}$, which is the ideal unit from the application point of view [6]. Except for single-crystalline ErCo$_{2}$, which exhibits a first-order phase transition (FOPT), HoB$_{2}$ manifests the largest |$\Delta S_M$| (in both J kg$^{-1}$ K$^{-1}$ and J cm$^{-3}$ K$^{-1}$) and $\Delta T_{ad}$ for a field change of 5 T around the hydrogen liquefaction temperature. Among second-order phase transition (SOPT) materials, it also exhibits the largest volumetric entropy change ($\Delta S_M$ in J cm$^{-3}$ K$^{-1}$) in the temperature range from liquid helium (4.2 K) to liquid nitrogen (77 K) (see Fig. S5). It is important to emphasize that this gigantic magnetocaloric effect is observed in the vicinity of a SOPT. SOPT materials have the advantage of being free of magnetic and thermal hysteresis while having broader $\Delta S_M$ peaks. Thus, they are likely to be more suitable for refrigeration purposes than FOPT materials, which tend to be plagued by these problems [1,28]. In other words, HoB$_{2}$ is a high-performance candidate material for low-temperature magnetic refrigeration applications such as hydrogen liquefaction.

Table 1. Comparison of MCE-related properties in HoB$_{2}$ and other materials exhibiting a large magnetocaloric response around the liquefaction temperature of hydrogen for a field change of 5 T. The data were taken from references [31-39] in J kg$^{-1}$ K$^{-1}$ and also converted into J cm$^{-3}$ K$^{-1}$ by using the ideal density according to the AtomWork [40] database. An asterisk (*) indicates an unreported value.
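The unit conversion mentioned in the caption is a multiplication by the density. A minimal sketch follows; the density of 8.73 g cm$^{-3}$ used for HoB$_{2}$ is an assumption, chosen because it is consistent with the two values reported in the text (40.1 J kg$^{-1}$ K$^{-1}$ and 0.35 J cm$^{-3}$ K$^{-1}$), and is not quoted from the AtomWork database.

```python
def to_volumetric(entropy_change_j_per_kg_k, density_g_per_cm3):
    """Convert an entropy change from J kg^-1 K^-1 to J cm^-3 K^-1.

    (J kg^-1 K^-1) * (g cm^-3) / (1000 g kg^-1) = J cm^-3 K^-1
    """
    return entropy_change_j_per_kg_k * density_g_per_cm3 / 1000.0


# Assumed ideal density of HoB2 in g cm^-3 (illustrative, see lead-in).
ASSUMED_HOB2_DENSITY = 8.73

volumetric = to_volumetric(40.1, ASSUMED_HOB2_DENSITY)
```

Rounded to two decimals, the result reproduces the volumetric value quoted in the text.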

Conclusions
In summary, by using a machine-learning-aided approach, we have successfully unveiled a ferromagnet that manifests a high magnetocaloric performance with a transition temperature around the hydrogen liquefaction temperature. By synthesizing HoB$_{2}$ and evaluating its MCE properties, we discovered a gigantic magnetocaloric effect in the vicinity of a SOPT at $T_C$ = 15 K, where the maximum obtained magnetic entropy change was 40.1 J kg$^{-1}$ K$^{-1}$ (0.35 J cm$^{-3}$ K$^{-1}$) with an adiabatic temperature change of 12 K for a field change of 5 T, the highest value reported until now, to our knowledge, for materials working near the liquefaction temperature of hydrogen.

Author contributions
Y T conceived the project idea. P B C did the data acquisition, preprocessing, machine learning model building and material prediction with the assistance of H Z. P B C and K T did the sample synthesis with the assistance of H T. K T did the magnetization and heat capacity measurements with T D Y. K T and P B C analyzed the experimental data. P B C and K T wrote the manuscript. All authors discussed the manuscript together.

This supplementary information consists of 8 sections. The first three sections (1-3) focus on describing the details of the data acquisition and model building procedure. Here we do not give a mathematical description of the machine learning model, but cite the corresponding references.
The following four sections (4-7) focus on discussing experimental data. In the fourth section, we show the M-T curves and the specific heat data used for obtaining |$\Delta S_M$| and the entropy curves of the main-text sample (hereafter labeled Sample #1). In the fifth, we briefly discuss the reproducibility of the |$\Delta S_M$| value as well as of the XRD pattern from sample to sample. For this purpose, a second sample was synthesized (hereafter labeled Sample #2). The data of Sample #2 are also used to check the validity of the entropy curve deduced from the magnetization measurements in the main text, and the result is shown in the sixth section, where we compare the magnetocaloric effect calculated from both magnetization and specific heat data. In the seventh, we discuss the nature of the magnetic transition by examining the universal scaling suggested by Refs. 1-4 using the |$\Delta S_M$| data of Sample #1. In the last section (8), we compare the obtained value of $\Delta S_M$ (J cm$^{-3}$ K$^{-1}$) with those of representative materials with $T_{mag}$ in the range from liquid helium (4.2 K) to liquid nitrogen (77 K) and whose $\Delta S_M$ is greater than 0.15 J cm$^{-3}$ K$^{-1}$.

Section 1 -Data Acquisition and Processing
In order to obtain the data needed for training the machine learning model, two different sources were used: the text-mined auto-generated database of magnetic materials, MagneticMaterials [5], and the recent report [6] by Franco et al., which contains a comprehensive table of magnetocaloric materials wherein the values of $\Delta S_M$ and $T_{mag}$ are readily available.
For the case of the MagneticMaterials database, we selected magnetocaloric papers by filtering for transition temperatures lower than 100 K and by filtering the journal titles using the following keywords: magnetocaloric, magnetic refrigeration, cryogenic, caloric, refrigeration. This resulted in a total of 219 different journal titles, from which the magnetocaloric properties were manually extracted.
To remove any possible duplicates in the final dataset, the values for materials with more than one reported $\Delta S_M$ for a given $\mu_0\Delta H$ were averaged, and this average was used as the final value. Lastly, we selected the data within the range $\mu_0\Delta H$ ≤ 5 T, for compatibility with our experimental setup. This procedure gave us the final 1644 data points used for the model construction.
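The duplicate handling and field cut described above can be sketched as follows. The record layout (composition, field change in T, entropy change) and the placeholder compositions and values are hypothetical; this is an illustration of the averaging rule, not the script used to build the dataset.

```python
from collections import defaultdict


def average_duplicates(records, max_field=5.0):
    """Average entropy changes sharing the same (composition, field) key.

    records: iterable of (composition, field_change_T, entropy_change) tuples.
    Entries with a field change above max_field are discarded.
    """
    grouped = defaultdict(list)
    for composition, field, entropy in records:
        if field <= max_field:
            grouped[(composition, field)].append(entropy)
    return {key: sum(values) / len(values) for key, values in grouped.items()}


# Placeholder records with made-up values, for illustration only.
records = [
    ("A2B", 5.0, 10.0),
    ("A2B", 5.0, 12.0),   # duplicate of the entry above: averaged to 11.0
    ("A2B", 2.0, 5.0),
    ("AB", 7.0, 30.0),    # field change above 5 T: dropped
]
cleaned = average_duplicates(records)
```

Because the grouping key includes the field change, applying the 5 T cut before or after averaging gives the same result.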

Section 2 -Feature Construction
To obtain the physical properties for the modeling of $\Delta S_M$, the XenonPy [7] package was used. For this, the material compositions were first transformed into the format accepted by XenonPy with the aid of the pymatgen [8] Python package. Namely, each material composition was converted into a Python dictionary where the keys are the constituent elements and the values are their amounts. For example, the suitable input in the case of HoB$_{2}$ would be the Python dictionary: {"Ho": 1.0, "B": 2.0}. Given this input, the compositional-based features can then be automatically calculated by XenonPy from 58 element-level properties. Below, we list a few examples of the element-level properties:
• lattice constant: lattice constant of the bulk elemental material (e.g. bulk Ho lattice constant)
• specific heat at 295 K: specific heat of the element (e.g. bulk Ho specific heat at 295 K)
• number of valence p electrons
• atomic number
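The composition dictionary described above was produced with pymatgen in the original workflow. As a dependency-free stand-in, the sketch below parses simple formulas with a regular expression; the function name `formula_to_dict` is hypothetical, and the parser deliberately handles only flat formulas without parentheses.

```python
import re


def formula_to_dict(formula):
    """Convert a flat chemical formula such as 'HoB2' into {element: amount}."""
    composition = {}
    # An element symbol is one capital letter plus an optional lowercase
    # letter, optionally followed by an integer or decimal amount.
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula):
        value = float(amount) if amount else 1.0
        composition[symbol] = composition.get(symbol, 0.0) + value
    return composition
```

For example, `formula_to_dict("HoB2")` gives the dictionary quoted in the text, `{"Ho": 1.0, "B": 2.0}`.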
For the generation of the compositional features for a candidate compound, the element-level properties were combined using 7 different featurizers (included in the XenonPy package), such as the weighted average:
$$f_{ave,i} = w^*_A f_{A,i} + w^*_B f_{B,i}$$
where $w^*_A$ and $w^*_B$ are the normalized compositions ($w^*_A + w^*_B = 1$) and $f_{A,i}$ and $f_{B,i}$ are the ith atomic features of elements A and B. In the case of HoB$_{2}$, the weighted average of the atomic number would be calculated as:
$$f_{ave,Z} = \tfrac{1}{3} \times 67 + \tfrac{2}{3} \times 5 \approx 25.67$$
The full list of available element-level properties and featurizers can be found on the package website at: https://xenonpy.readthedocs.io/en/latest/features.html#data-access.
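The weighted-average feature for HoB$_{2}$ can be checked with a few lines of plain Python. This sketch does not call XenonPy; the helper names are hypothetical and the atomic numbers of Ho (67) and B (5) are the only data used.

```python
ATOMIC_NUMBER = {"Ho": 67, "B": 5}


def normalized_fractions(composition):
    """Scale the amounts so that they sum to 1."""
    total = sum(composition.values())
    return {element: amount / total for element, amount in composition.items()}


def weighted_average(composition, element_property):
    """Composition-weighted average of an element-level property."""
    fractions = normalized_fractions(composition)
    return sum(w * element_property[element] for element, w in fractions.items())


hob2_average_z = weighted_average({"Ho": 1.0, "B": 2.0}, ATOMIC_NUMBER)
```

The result equals (67 + 2 × 5) / 3, i.e. about 25.67.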
Another featurizer used was the counting of the 94 atomic elements (from H to Pu, implemented in XenonPy as the Counting featurizer). For example, in the case of HoB$_{2}$, the final features would be given as Ho: 1.0, B: 2.0, while all other elements would be zero.
Features whose values were all zero, or which contained infinite or not-a-number (NaN) values, were removed. In the end, 408 features were used, of which 407 were generated by XenonPy and the last was the field change ($\mu_0\Delta H$). This workflow is summarized below:

Section 3 -Model Selection and Building
As a first attempt at building the machine learning model, two different models were tried: a LASSO [9] (least absolute shrinkage and selection operator) linear model implemented in the scikit-learn [10] Python package and the gradient boosted tree implemented in the XGBoost [11] package. In order to compare their performance, the following three metrics were used: the coefficient of determination (R²), the root mean squared error (RMSE) and the mean absolute error (MAE), which are defined as:
$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}, \quad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_i (y_i - \hat{y}_i)^2}, \quad \mathrm{MAE} = \frac{1}{n}\sum_i |y_i - \hat{y}_i|$$
where $y_i$ is the measured value, $\hat{y}_i$ is the predicted value, $\bar{y}$ is the mean of the measured values and $n$ is the number of data points. The metrics of each model are shown in Table S1. The training and testing of these models followed standard machine learning practice: the data set was split into training and testing sets at an 80%/20% ratio using the scikit-learn package, the model was trained on the training data and the results were evaluated on the testing data.
Table S1: Metric values of the two machine learning models for the testing set.
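For reference, the three metrics named above can be computed with the standard library alone. This is a minimal sketch using the standard textbook definitions; scikit-learn provides equivalent functions, and the function names here are chosen for this sketch.

```python
import math


def mean_absolute_error(measured, predicted):
    """MAE: mean of |y_i - y_hat_i|."""
    return sum(abs(y - p) for y, p in zip(measured, predicted)) / len(measured)


def root_mean_squared_error(measured, predicted):
    """RMSE: square root of the mean squared residual."""
    mse = sum((y - p) ** 2 for y, p in zip(measured, predicted)) / len(measured)
    return math.sqrt(mse)


def r2_score(measured, predicted):
    """R^2: one minus the residual sum of squares over the total sum of squares."""
    mean = sum(measured) / len(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot
```

Note that `r2_score` is undefined when all measured values are identical, since the total sum of squares is then zero.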
As we can see from Table S1, LASSO does not perform well, indicating that this problem is a highly non-linear one. In the present work, we decided to use XGBoost, as it has been shown to perform well in diverse machine learning problems while being robust and widely used in production. The complete mathematical description of these algorithms can be found in Ref. 9 for LASSO and Ref. 11 for XGBoost. In the XGBoost model, there are several important hyperparameters that control how the tree model is constructed and trained. These include the number of trees (denoted as n_estimators), the subsample ratio of the training instances used for bagging (denoted as subsample), the subsample ratio of features (denoted as colsample_bytree), the maximum depth of a tree (denoted as max_depth) and the minimum sum of instance weights needed in each node, which for squared-error regression corresponds to the minimum number of instances (denoted as min_child_weight). The complete list of parameters and their definitions can be found in the XGBoost documentation at: https://xgboost.readthedocs.io/en/latest/parameter.html.
For the selection of suitable hyperparameters, we tried random search and exhaustive grid search, both implemented in the scikit-learn package, and the Python package HyperOpt [12]. HyperOpt implements a Bayesian optimization approach which can usually find local/global minima faster than the exhaustive approaches. We found that, in our case, the hyperparameters found by using the HyperOpt package to optimize the MAE in a 10-fold cross-validation procedure on the training set gave the best results compared to random and grid search. Cross-validation is one of the standard techniques used to test the effectiveness of a machine learning model, and an in-depth explanation can be found in the scikit-learn documentation and in freely available machine learning books [9].
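The quantity minimized during the hyperparameter search, the MAE averaged over 10 cross-validation folds, can be sketched in plain Python. The `fit` and `predict` callables are placeholders for any model; the mean predictor used below is a stand-in assumption and not the XGBoost model of the text. Folds are contiguous here for simplicity, whereas in practice the data would usually be shuffled first.

```python
def k_fold_indices(n_samples, n_folds=10):
    """Split range(n_samples) into n_folds contiguous, nearly equal folds."""
    base, extra = divmod(n_samples, n_folds)
    folds, start = [], 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_mae(features, targets, fit, predict, n_folds=10):
    """Mean over folds of the MAE on each held-out fold.

    Requires len(targets) >= n_folds so that no fold is empty.
    """
    scores = []
    for held_out in k_fold_indices(len(targets), n_folds):
        held = set(held_out)
        train = [i for i in range(len(targets)) if i not in held]
        model = fit([features[i] for i in train], [targets[i] for i in train])
        errors = [abs(targets[i] - predict(model, features[i])) for i in held_out]
        scores.append(sum(errors) / len(errors))
    return sum(scores) / len(scores)


# Stand-in model (assumption): always predicts the mean of the training targets.
def fit_mean(train_features, train_targets):
    return sum(train_targets) / len(train_targets)


def predict_mean(model, feature_row):
    return model
```

A hyperparameter optimizer would call `cross_validated_mae` once per candidate parameter set and keep the set with the lowest score.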
The hyperparameters found by HyperOpt and used for the final model are shown in Table S2. By using these parameters, the final R² score was 0.85 while the MAE was 1.78 J kg$^{-1}$ K$^{-1}$ for the testing set, and this model was used for the predictions of $\Delta S_M$.
Figure S1a shows the magnetization data for various applied fields in the range from 0 to 5 T for Sample #1. The magnetic entropy change ($\Delta S_M$) shown in Figure 2d of the main text was calculated from these data through the numerical integration of $\partial M/\partial T\,(T, H)$ over the field. Figure S1b shows the specific heat data of Sample #1 under zero field. The entropy curve under zero field shown in Figure 3a of the main text was calculated by using these specific heat data. Note that in addition to a peak at $T_C$ ~ 15 K, a second peak appears around 11 K in the specific heat data. This is probably due to a spin reorientation transition, as discussed for the related compound DyB$_{2}$ [13]. At this stage, we leave its physical origin as an open question to be clarified in future work, since it is not the main focus of the present work.
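The numerical integration of $\partial M/\partial T$ mentioned above follows from the Maxwell relation $\Delta S_M(T, \mu_0\Delta H) = \mu_0 \int_0^{H} (\partial M/\partial T)_{H'}\, dH'$. A minimal sketch on a regular grid of M-T curves is given below; the finite-difference and trapezoidal schemes and the function name are assumptions of this sketch, not necessarily the scheme used by the authors. With M in A m$^{2}$ kg$^{-1}$ and fields in T, the result is in J kg$^{-1}$ K$^{-1}$.

```python
def magnetic_entropy_change(temps, fields, magnetization):
    """Entropy change for a field sweep from fields[0] to fields[-1].

    temps: temperatures in K (increasing).
    fields: applied fields mu0*H in T (increasing, typically starting at 0).
    magnetization[j][i]: M measured at fields[j] and temps[i].
    Returns one Delta S_M value per temperature.
    """
    n_t = len(temps)
    # dM/dT at each field: central differences, one-sided at the edges.
    dm_dt = []
    for row in magnetization:
        deriv = []
        for i in range(n_t):
            lo, hi = max(i - 1, 0), min(i + 1, n_t - 1)
            deriv.append((row[hi] - row[lo]) / (temps[hi] - temps[lo]))
        dm_dt.append(deriv)
    # Trapezoidal integration of dM/dT over the field at each temperature.
    result = []
    for i in range(n_t):
        total = 0.0
        for j in range(1, len(fields)):
            total += 0.5 * (dm_dt[j][i] + dm_dt[j - 1][i]) * (fields[j] - fields[j - 1])
        result.append(total)
    return result
```

For a ferromagnet near $T_C$ the magnetization decreases with temperature, so the returned values are negative; the text reports their magnitude |$\Delta S_M$|.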

Section 5 -Reproducibility
In order to evaluate the reproducibility of our findings, Sample #2 was synthesized by using the same arc-melting procedure as in the main text. The XRD patterns of both samples are shown in Figure S2a. For both samples, HoB$_{2}$ is the majority phase (~95%), while there is a slight difference in the observed impurity peaks. This difference might be associated with the presence of a small minority phase of unreacted Ho in Sample #1, while Sample #2 exhibits an impurity peak associated with HoB$_{4}$. For comparison, we have also evaluated |$\Delta S_M$| in Sample #2, as shown in Figure S2b. While Sample #1 peaks at |$\Delta S_M$($\mu_0\Delta H$ = 5 T, T = 15 K)| = 40.1 J kg$^{-1}$ K$^{-1}$, Sample #2 peaks at 39.1 J kg$^{-1}$ K$^{-1}$, showing the reproducibility of the gigantic magnetocaloric effect in this material. The small difference in the |$\Delta S_M$| value between Samples #1 and #2 is probably due to the difference in the impurity phase contained in each sample.

Section 6 -Entropy from specific heat measurements and simulated curve under field from magnetization measurements
In this section, we compare the entropy curves under magnetic field deduced from two experimental methods. One is the entropy obtained from specific heat data taken at $\mu_0 H$ = 0, 2, 5 T for Sample #2 using the following equation:
$$S(T, H) = \int_{T_{min}}^{T} \frac{C(T', H)}{T'}\, dT'$$
where $C$ is the specific heat and $T_{min}$ is 1.8 K; the obtained entropy curves are shown as red circles in Figure S3. The other is the simulated entropy under field (blue circles in Figure S3), which is obtained by simply adding $\Delta S_M$ evaluated from the M-T curves to the 0 T entropy. As we can see in Figure S3, the two show good agreement with each other, confirming the gigantic magnetocaloric effect in HoB$_{2}$. We have found that this simulation to obtain the entropy curve under field is efficient and fast. Since we were working under a limited machine operation time, we used the same simulation approach to obtain the entropy under field shown in Figure 3(a) of the main text.
Figure S3. Entropy as a function of temperature obtained from the specific heat (red circles) and by combining the specific heat at 0 T with |$\Delta S_M$| from magnetization data (blue circles) for Sample #2.

Section 7 -Universal scaling of $\Delta S_M$ curve and order of magnetic transition
To further understand the nature of the magnetic transition at $T_C$ = 15 K, we have examined the so-called universal scaling curve proposed in Refs. 1-4, which depicts the normalized entropy change |$\Delta S_M$|/|$\Delta S_M^{peak}$| as a function of the normalized temperature $\theta$, defined as in Refs. 1-4. For the case of a second-order transition, these curves are expected to collapse onto each other, exhibiting a universal behavior, while for a first-order transition there would be a dispersion between the curves (see, e.g., Ref. 4). As can be seen in Figure S4, all the curves for different fields fall onto the same universal curve. The divergence around $\theta = -1$ is due to the presence of a second magnetic transition at a lower temperature, as similarly observed for other materials exhibiting more than one transition [14,15]. This universal curve suggests that the nature of the magnetic transition in HoB$_{2}$ is indeed of second order, although further experiments would be required to determine the nature and origin of the second peak in the lower temperature regime.
Figure S4. Universal scaling curve of |$\Delta S_M$| in HoB$_{2}$, obtained from the data in Figure 2(d) of the main text.

Section 8 -Comparison of volumetric entropy change at low temperatures range for representative materials
Figure S5 shows diverse magnetocaloric materials enumerated by $T_{mag}$ in the temperature range from liquid helium (4.2 K) to liquid nitrogen (77 K). These materials were selected from our database for exhibiting |$\Delta S_M$| ≥ 0.15 J cm$^{-3}$ K$^{-1}$ for $\mu_0\Delta H$ = 5 T. Here we emphasize again the importance of comparison in the volumetric unit, as it is the most appropriate unit when considering the material as a candidate for applications such as magnetic refrigeration [16,17].
HoB$_{2}$ exhibits the highest volumetric |$\Delta S_M$| of all SOPT materials in the temperature range of 4.2 K to 77 K, being comparable to FOPT materials such as ErCo$_{2}$ (#27 in the figure) and Gd$_{5}$(Si$_{0.33}$Ge$_{3.67}$) (#35,37 in the figure). Furthermore, its closeness to the hydrogen liquefaction temperature (indicated by the dashed line) combined with this gigantic volumetric $\Delta S_M$ establishes it as a remarkable candidate magnetic coolant, especially for the hydrogen liquefaction stage and low-temperature applications.