Introduction

Machine learning and artificial intelligence (AI)-related techniques have been rapidly adopted in daily life and in industries such as electronic commerce, manufacturing, and automated driving. Scientific fields are no exception; in particular, materials science combined with information science is referred to as ‘materials informatics’1,2,3,4,5,6,7. Materials informatics has been successful in predicting and finding new functional materials8,9,10,11,12,13 and in optimising devices and material microstructures14,15.

In general, machine learning techniques, such as deep learning, require substantial data to learn. Therefore, in materials informatics, theoretical calculations, such as first-principles calculations or molecular dynamics simulations, are useful methods for accumulating materials property data to construct a materials database16,17,18. However, many important problems in materials science, e.g., the coercivity in permanent magnets, catalytic reactions, and the charge and discharge of batteries, are difficult to describe completely with theoretical calculations. These physical and chemical phenomena have complexities that span multiple temporal and spatial scales; they are difficult to model and require a tremendous computational cost. Therefore, an experimental approach is essential for understanding these phenomena. However, data acquisition by experiments incurs various costs related to real specimen preparation, human-operated measurement equipment, trial-and-error analyses, etc. There is a strong demand to improve the efficiency of experimental methods to accelerate materials science beyond traditional processes. In addition to materials informatics, ‘measurement informatics’, a research area that integrates measurements and machine learning, has been explored. With this approach, the efficiency of measurement19,20,21 and data analysis22,23,24,25,26 can be enhanced.

Active learning (AL) is a machine learning scheme used to obtain predictive models with high precision at a limited cost through the sophisticated selection of samples for labelling27. Similar to other machine learning methods, AL is utilised in materials informatics28,29,30,31,32,33,34 and measurement informatics19,20. Ueno et al. developed a method for X-ray magnetic circular dichroism (XMCD) spectral measurement with AL19. XMCD spectroscopy is a variation of X-ray absorption spectroscopy (XAS) using circularly polarised X-rays to investigate the element-specific magnetic properties of materials35. In particular, magneto-optical sum rules relate the XMCD spectrum to the spin and orbital magnetic moments, which are the fundamental physical parameters of elements36,37,38.

Measurements of XAS and XMCD spectra are usually performed in a step-by-step manner, i.e., the measurement of the X-ray absorption intensity and the tuning of the X-ray energy are repeated over the energy range of the spectrum (Fig. 1b). In conventional XAS and XMCD measurements, the numbers of energy points and energy steps are predetermined by the experimenter, and usually, several hundred points are measured. These measurement conditions are set based on the experimenter’s experience and intuition. This type of conventional design of experiments (DoE) can potentially be optimised by a machine learning technique. The XMCD spectral measurement with AL uses Gaussian process regression (GPR)39 to predict the whole spectral shape from the measured data (Fig. 1a). The standard deviation of the prediction is used to determine the optimal energy point to measure next. The measurement stops when the convergence criterion is satisfied. Compared to conventional measurements, this method succeeded in reducing the number of measurement energy points.

Fig. 1: Concept of the spectral measurement with active learning.

a Schematic of the spectral measurement with active learning. b Schematic of the spectral measurement with conventional design of experiments. Red markers, blue curves, and grey bars indicate the sampled data, predicted mean of Gaussian process regression, and energy points, respectively. Green arrows represent the sampling order that is determined with active learning in a or sequential sampling in b. c Flowchart of the spectral measurement with active learning. (1) The initial data are sampled. (2) Gaussian process regression (GPR) is used to predict the spectrum and evaluate the standard deviation. The measurement is finished when the stopping criterion is satisfied. (3) If the stopping criterion is not satisfied, then the next sampling point is determined based on the acquisition function. (4) New data are sampled, and the procedure returns to process (2).

However, the XMCD spectral measurement with AL depends on the magnetic moment evaluated from the predicted spectrum as the stopping criterion. This condition means that the method is only applicable when the relation between the spectrum and the physical parameter is known. In addition, the physical parameter must be evaluated quickly from the spectrum to fit in the AL cycle of the measurement and GPR. In the common situation of a spectral measurement, the evaluation of physical parameters from the spectra is not straightforward; e.g., a comparison between the experimental and theoretical spectra is often needed. Therefore, a stopping criterion without the physical parameter is required to apply the spectral measurement with AL to general types of spectra.

The optimal stopping problem is a long-standing problem in AL27. Without an appropriate stopping criterion, the active learner would either require many unnecessary samples to be labelled or stop with too few samples, resulting in poor predictive performance. Despite its importance, there have been few studies on when to stop AL40,41,42, mainly because the answer is problem dependent. From the viewpoint of materials science, an optimal stopping rule is highly desired to avoid useless and costly experiments. In the typical problem setting of AL in materials and measurement informatics, the experiment is stopped when the budget is exhausted or the experimenter is satisfied with the results28. The former assumes an experimenter who needs the best result in a limited time. In such a situation, the experimenter sets the maximum number of iterations that fits within the limited time43. The latter assumes that the experiment is iterated until the experimenter is satisfied with the target property, e.g., a search for the material with the highest melting point44 or convergence of the magnetic moments within a predetermined threshold19.

In this paper, we applied a universal stopping criterion for XAS spectral measurement with AL. The stopping criterion is based on the stability of the expected generalisation errors45. This stopping criterion is completely evaluated on a mathematical basis; therefore, it is applicable to general spectral measurements whose relation to the physical parameter is unknown. We applied the method to the spectral measurement with AL of several types of simulated and experimental XAS spectra. It is revealed that the automated stopping criterion of the spectral measurement gives an approximated XAS spectrum with sufficient accuracy. The proposed method can be applied not only to spectral measurements but also to other types of measurements and improves the efficiency of high-throughput experiments in the era of materials informatics.

Results

Spectral measurements with AL

Figure 1c shows the flowchart of a spectral measurement with AL. Spectral measurement with AL includes the following steps: (1) First, initial spectral data Y0(X0) are sampled. Here, Y0 = (yi, …, yj) represents the spectral intensity at energy points X0 = (xi, …, xj). (2) Subsequently, GPR is applied to the initial data, and we obtain a mean μn and standard deviation σn for each energy point. n = 0, 1, … represents the number of samplings (n = 0 for the initial sampling). The mean μn is regarded as a predicted spectrum. The stopping criterion is evaluated with the result of the GPR fitting. (3) If the stopping criterion is not satisfied, then the next sampling point xnext is automatically determined based on an acquisition function. (4) Subsequently, a new energy point is sampled and added to the measured data Yn = (yi, …, ynext, …, yj). Returning to process (2), GPR is applied to Yn(Xn) again. Processes (2) to (4) are repeated until the stopping criterion is satisfied.

In our previous work19, a physical parameter of interest was evaluated by comparing the fitted spectrum and the reference spectrum of a known material. When the physical parameter converged to within a predetermined accuracy range, the measurement was terminated. However, this method has some limitations for general use: it is only applicable to materials whose physical parameter can be evaluated with the similarity measure and for which a reference value of the physical parameter is available. Moreover, different similarity measures give different results, and the threshold must be properly determined. Therefore, we adopt a criterion that depends only on the fitted spectra to promote the application of spectral measurement with AL.

The choice of a covariance function (kernel) \(K({{{x}}},{{{x}}}^{\prime} )\) is essential in GPR. We adopt the standard power exponential correlation function defined by

$$K({{{x}}},{{{x}}}^{\prime} )={\theta }_{1}\exp \left(-{\left(\frac{| {{{x}}}-{{{x}}}^{\prime} | }{{\theta }_{2}}\right)}^{2}\right)$$
(1)

with the amplitude parameter θ1 and bandwidth parameter θ2. The power exponential or Gaussian correlation function performs better than the Matérn correlation function for the fitting of the XAS spectra19.
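As an illustration, steps (1) to (4) together with the kernel of Eq. (1) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in this work: the hyperparameters θ1 and θ2 are fixed rather than optimised, the prior mean is zero, a small noise variance is added for numerical stability, the next point is simply the unmeasured grid energy with the largest predictive standard deviation (a stand-in for the acquisition function), and the loop ends at a fixed budget `max_points` unless a user-supplied `should_stop` callback fires. The names `measure`, `gpr_predict`, `measure_with_al` and `should_stop` are hypothetical.

```python
import numpy as np


def kernel(xa, xb, theta1=1.0, theta2=1.0):
    """Eq. (1): theta1 * exp(-(|x - x'| / theta2)**2) for all pairs."""
    d = np.abs(np.asarray(xa)[:, None] - np.asarray(xb)[None, :])
    return theta1 * np.exp(-((d / theta2) ** 2))


def gpr_predict(x_obs, y_obs, x_grid, theta1=1.0, theta2=1.0, noise_var=1e-6):
    """Zero-mean GP posterior mean and standard deviation on x_grid."""
    k_oo = kernel(x_obs, x_obs, theta1, theta2) + noise_var * np.eye(len(x_obs))
    k_go = kernel(x_grid, x_obs, theta1, theta2)
    mean = k_go @ np.linalg.solve(k_oo, y_obs)
    # Posterior variance: k(x, x) - k_go K^{-1} k_go^T (diagonal only).
    var = theta1 - np.sum(k_go * np.linalg.solve(k_oo, k_go.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def measure_with_al(measure, x_grid, n_init=10, max_points=60,
                    theta1=1.0, theta2=1.0, should_stop=None, seed=0):
    """Run the AL cycle; returns sampled grid indices, mean and std."""
    rng = np.random.default_rng(seed)
    # (1) Random initial sampling of n_init energy points.
    idx = [int(i) for i in rng.choice(len(x_grid), size=n_init, replace=False)]
    y = [measure(x_grid[i]) for i in idx]
    while True:
        # (2) Fit the GPR to all data measured so far.
        mean, std = gpr_predict(x_grid[idx], np.array(y), x_grid, theta1, theta2)
        out_of_budget = len(idx) >= min(max_points, len(x_grid))
        if out_of_budget or (should_stop is not None and should_stop(mean, std)):
            return np.array(idx), mean, std
        # (3) Assumed acquisition: largest predictive std among unmeasured points.
        score = std.copy()
        score[idx] = -np.inf
        nxt = int(np.argmax(score))
        # (4) Measure the new point and go back to (2).
        idx.append(nxt)
        y.append(measure(x_grid[nxt]))


# Toy, noise-free "spectrum" (hypothetical): one peak on an arbitrary axis.
x_grid = np.linspace(0.0, 10.0, 201)
truth = lambda x: np.exp(-((x - 4.0) ** 2))
idx, mean, std = measure_with_al(truth, x_grid, n_init=5, max_points=40)
```

In practice, `should_stop` would be replaced by a criterion such as the error ratio described in the next section, and θ1 and θ2 would be re-estimated at every cycle.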

Stopping criterion

Approximating the spectrum by the GPR is considered a problem of supervised learning. In this setting, the goodness of approximation is evaluated by the average prediction error of the intensity at an unseen energy point, which is called the generalisation error defined by

$${{{\mathcal{L}}}}(f)=\int {\mathrm{d}}y\int {\mathrm{d}}x{(y-f(x))}^{2}p(x,y),$$
(2)

where p(x, y) is a joint density function of (x, y). Note that f(x) is a stochastic predictor sampled from the fitted Gaussian process; hence, \({{{\mathcal{L}}}}(f)\) is a random variable. In a problem setting of the spectral measurement, x, y and f(x) are the energy, the ground truth (unknown) spectrum and the fitted curve by the GPR model, respectively.

Let pt(f) be the posterior distribution of the predictive model f(x) obtained by fitting a Gaussian process to the observation up to time t. Then, the posterior average of the generalisation error is defined by

$${{{{\mathcal{L}}}}}_{t}=\int {\mathrm{d}}f\,{p}_{t}(f){{{\mathcal{L}}}}(f).$$
(3)

If the gap \(| {{{{\mathcal{L}}}}}_{t}-{{{{\mathcal{L}}}}}_{t+1}|\) of the posterior average of generalisation errors at t and t + 1 is small enough, then there is only a small gain by performing an additional observation, and the experiment should be stopped. It is not possible to directly calculate the generalisation error because we do not have access to the distribution p(x, y). A method to estimate the upper bound of \({{{{\mathcal{L}}}}}_{t}-{{{{\mathcal{L}}}}}_{t+1}\) is proposed in ref. 46, and a stopping criterion of AL based on the convergence of the gap \(| {{{{\mathcal{L}}}}}_{t}-{{{{\mathcal{L}}}}}_{t+1}|\) is developed in ref. 45. Suppose \({{{\mathcal{L}}}}(f)\in [a,b]\), then

$$| {{{{\mathcal{L}}}}}_{t}-{{{{\mathcal{L}}}}}_{t+1}| \le (b-a)\{r({p}_{t},{p}_{t+1})+r({p}_{t+1},{p}_{t})\}$$
(4)

holds where

$$r(p,q)=\exp \left\{{W}_{0}\left(\frac{{{{\rm{KL}}}}(p| | q)-1}{{{{\rm{e}}}}}\right)+1\right\}-1.$$
(5)

In the above equation, W0 is the principal branch of the Lambert W-function47, \({{{\rm{KL}}}}(p| | q)=\int p(x){{\mathrm{log}}}\,\frac{p(x)}{q(x)}{\mathrm{d}}x\) is the Kullback–Leibler divergence48, and e is the base of the natural logarithm. The Kullback–Leibler divergence between the posterior distributions pt and pt+1 can be calculated exactly from the observed points \({\{({x}_{i},{y}_{i})\}}_{i = 1}^{t}\); hence, the upper bound of \(| {{{{\mathcal{L}}}}}_{t}-{{{{\mathcal{L}}}}}_{t+1}|\) is computable from the data at hand. To align the scale of r(pt, pt+1) + r(pt+1, pt) and remove the constant term b − a, we consider the ratio \({\lambda }_{t}=\frac{r({p}_{t},{p}_{t+1})+r({p}_{t+1},{p}_{t})}{r({p}_{1},{p}_{2})+r({p}_{2},{p}_{1})}\). When this ratio is smaller than a certain threshold λ ∈ [0, 1], the experiments are stopped. Figure 2 shows the schematic of the stopping criterion with the error ratio. Intuitively, the threshold λ is regarded as the expected rate of the improvement of fitting by one additional observation. The stopping time is not very sensitive to the value of λ when it is set to be small enough, but it is possible to determine this parameter by a simulation study in advance. We take this strategy and report the experimental results in the following section.
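To make Eqs. (4) and (5) and the ratio λt concrete, the following Python sketch evaluates r(p, q) from a given Kullback–Leibler divergence and applies the threshold rule. It is an illustration under stated assumptions: the KL values between successive posteriors are taken as inputs (their computation from the GPR posteriors is not shown), the function names `lambert_w0`, `r_bound`, `error_ratios` and `stopping_time` are hypothetical, and W0 is evaluated with a simple Halley iteration rather than a library routine. The numerical KL values are made up for the example.

```python
import math


def lambert_w0(z, tol=1e-12, max_iter=100):
    """Principal branch W0 of the Lambert W-function for real z >= -1/e."""
    branch = -1.0 / math.e
    if z < branch:
        raise ValueError("W0 is real only for z >= -1/e")
    if z == branch:
        return -1.0
    if z < 0.0:
        # Series around the branch point z = -1/e.
        p = math.sqrt(max(0.0, 2.0 * (math.e * z + 1.0)))
        w = -1.0 + p - p * p / 3.0
        if p < 1e-5:
            return w  # the series is already accurate this close to -1/e
    elif z < math.e:
        w = math.log1p(z)
    else:
        w = math.log(z) - math.log(math.log(z))
    for _ in range(max_iter):
        # Halley's iteration on g(w) = w * exp(w) - z.
        ew = math.exp(w)
        g = w * ew - z
        step = g / (ew * (w + 1.0) - (w + 2.0) * g / (2.0 * w + 2.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def r_bound(kl):
    """Eq. (5): r = exp(W0((KL - 1) / e) + 1) - 1 for a divergence KL >= 0."""
    if kl < 0.0:
        raise ValueError("KL divergence must be non-negative")
    return math.exp(lambert_w0((kl - 1.0) / math.e) + 1.0) - 1.0


def error_ratios(kl_forward, kl_backward):
    """lambda_t from KL(p_t || p_{t+1}) and KL(p_{t+1} || p_t), t = 1, 2, ..."""
    gaps = [r_bound(f) + r_bound(b) for f, b in zip(kl_forward, kl_backward)]
    return [g / gaps[0] for g in gaps]


def stopping_time(ratios, threshold):
    """First t (1-based) at which lambda_t falls below the threshold, else None."""
    for t, lam in enumerate(ratios, start=1):
        if lam < threshold:
            return t
    return None


# Made-up KL divergences for four successive AL cycles (illustration only).
kl_values = [1.0, 0.5, 0.001, 0.0001]
ratios = error_ratios(kl_values, kl_values)
t_stop = stopping_time(ratios, threshold=0.1)
```

Because r(p, q) vanishes when the two posteriors coincide and grows with their divergence, λt shrinks as successive GPR fits stop changing.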

Fig. 2: Schematic of the stopping criterion with the error ratio.

The data size increases with measurement time t. The error ratio λt is evaluated after every Gaussian process regression. The measurement is stopped when λt falls below the threshold λ. The original conceptual diagram appears in ref. 45.

Application to the simulated spectrum

First, we applied the method to the simulated noise-free Ni L2,3 XAS of divalent nickel ions (Ni2+) to verify the effectiveness of the overall strategy. Details of the simulation are described in the Methods section. All spectra used in the study are presented in Supplementary Fig. 1. Figure 3a–f shows snapshots of the GPR fitting of the simulated L2,3 XAS of Ni2+. In Fig. 3a, randomly selected initial data points (data size = 10) and the GPR fitting are shown. The L3 peak is sampled by chance in the initial sampling. The GPR prediction exhibits a large standard deviation between the sampled data points, and the next sampling point is chosen from these x values with a large value of the acquisition function (Eq. (11)). The GPR fitting after several samplings (data size = 20) is shown in Fig. 3b. The whole spectral shape appears, but the standard deviation between the sampled data points is still large, approximately one-third of the intensity at the L3 peak. Figure 3c shows the GPR fitting after 50 samplings (data size = 60). Satellite peaks around the L3 main peak and the multiplet structure around the L2 peak appear at this degree of sampling density. Figure 3d–f shows the results of the GPR fitting for the different stopping timings, i.e., different thresholds. The standard deviation is relatively small compared to the intensity of the L3 and L2 peaks in the whole energy range. The intensity relations of the multiplet structure around the L2 peak are correctly approximated. In Fig. 3d, e, the difference between the GPR fittings for different thresholds only appears in the standard deviations around non-peak regions. In Fig. 3f, the GPR fitting for a data size = 178 is almost the same as the GPR fitting for a data size = 116, but an increase in sampling points around the peak region is visible.

Fig. 3: Spectral measurement of simulated Ni L2,3 XAS for Ni2+ with active learning.

Gaussian process regression (GPR) a for data size = 10 (initial sampling), b data size = 20, c data size = 60, d data size = 94 (stopping criterion with λ = 0.1), e data size = 116 (stopping criterion with λ = 0.05) and f data size = 178 (stopping criterion with λ = 0.025). Black full circles, blue solid curves and blue shades indicate the measured data points, mean function of the GPR, and covariance function of the GPR, respectively. g Error ratio versus data size. Black and red curves indicate the error ratio and its minimum value, respectively. h Test error versus data size. Black and blue curves indicate the test error and the difference of the test error between the n-th and (n − 1)-th sampling, respectively. In g and h, red, green and blue vertical dashed lines indicate stopping timings for λ of 0.1, 0.05 and 0.025, respectively. i Heat maps of the predicted mean and j standard deviation versus data size. Red, green and blue horizontal dashed lines indicate stopping timings for λ of 0.1, 0.05 and 0.025, respectively. Orange markers indicate sampled energy points in each AL cycle.

To visualise the progress of the spectral measurement and the stopping timing, the error ratio and the test error versus data size are shown in Fig. 3g, h, respectively. In Fig. 3g, the error ratio λt is plotted as a function of the data size, i.e., the number of measured energy points x. The error ratio decreases with increasing data size and converges to an almost constant value. Figure 3h shows the test error, i.e., the posterior average of the generalisation errors (Eq. (3)), versus the data size. In the initial stage of the measurement, the test error decreases more steeply with increasing data size than the error ratio does, and it also converges to a constant value. The spikes in the error ratio in Fig. 3g arise because the error ratio bounds the difference between the test errors at successive steps shown in Fig. 3h. Accordingly, the spikes in the error ratio coincide with the spikes in the difference of the test error between the n-th and (n − 1)-th sampling. The spikes are inevitable because discontinuous improvements of the test error occasionally arise in AL. For this reason, the minimum value of the error ratio is also plotted in Fig. 3g. The minimum value of the error ratio decreases gradually with increasing data size. Therefore, premature stopping of a measurement can be avoided. The vertical dashed lines in Fig. 3g, h indicate the stopping timings for different thresholds that correspond to the GPR fittings shown in Fig. 3d–f. These results indicate that the automated stopping of the XAS measurement based on the generalisation error gives a GPR fitting with small errors from the ground truth spectrum at a reduced data size.

To give an overall picture of the spectral measurement with AL, the predicted mean μ and standard deviation σ versus data size are visualised as heat maps in Fig. 3i, j, respectively. The stopping timings for different thresholds are represented as horizontal dashed lines. In Fig. 3i, the peak structures around the L3 and L2 regions are shown as thick coloured vertical lines. The peak structures become more distinct with increasing data size. On the other hand, the standard deviation decreases with increasing data size, as shown in Fig. 3j. Sampled energy points, indicated by orange markers in Fig. 3i, j, are nearly uniformly distributed in the whole spectral energy range. We confirmed the nearly uniform distribution by plotting a histogram of sampled energy points, as shown in Supplementary Fig. 6a. Thus, it seems that biased sampling, e.g., intensive sampling around the peak region, does not occur in the spectral measurement with AL when the acquisition function has the form of Eq. (11).

Application to the experimental spectrum

Next, we applied the method to the experimental data to demonstrate its applicability to actual spectral measurements. Experimental data inherently include measurement noise; therefore, it is essential to ascertain the noise tolerance of the present method. Moreover, the experimental XAS spectrum of Ni metal has a finite background that comes from a continuum state approximated by the double step-like function. Figure 4a–f shows snapshots of the GPR fitting of the experimental Ni L2,3 XAS of nickel metal. In Fig. 4a, randomly selected initial data points and the GPR fitting are shown. The standard deviation σ is large in sparsely sampled energy regions, as is the usual behaviour of the GPR. The overall spectral shape appears after 50 samplings, as shown in Fig. 4c. In addition to the L3 and L2 main peaks, the so-called 6 eV satellite49 can be seen at ~ 858 eV at this sampling density. Figure 4d–f shows the results of the GPR fitting for the different stopping timings with a specific threshold. Similar to the results for the simulated Ni2+ XAS spectrum, the L3 main peak is properly approximated by the GPR fitting with a stopping timing of λ = 0.1. The standard deviation around the L2 peak region and other non-peak energy regions decreases with increasing measurements, as shown in Fig. 4e, f.

Fig. 4: Spectral measurement of experimental Ni L2,3 XAS with active learning.

Gaussian process regression (GPR) a for data size = 10 (initial sampling), b data size = 20, c data size = 60, d data size = 69 (stopping criterion with λ = 0.1), e data size = 75 (stopping criterion with λ = 0.05) and f data size = 84 (stopping criterion with λ = 0.025). Black full circles, blue solid curves and blue shades indicate the measured data points, mean function of the GPR, and covariance function of the GPR, respectively. g Error ratio versus data size. Black and red curves indicate the error ratio and its minimum value, respectively. h Test error versus data size. Black and blue curves indicate the test error and the difference of the test error between the n-th and (n − 1)-th sampling, respectively. In g and h, red, green and blue vertical dashed lines indicate stopping timings for λ of 0.1, 0.05 and 0.025, respectively. i Heat maps of the predicted mean and j standard deviation versus data size. Red, green and blue horizontal dashed lines indicate stopping timings for λ of 0.1, 0.05 and 0.025, respectively. Orange markers indicate sampled energy points in each AL cycle.

Figure 4g, h shows the error ratio and the test error versus the data size. As shown in Fig. 4g, the error ratio decreases with increasing data size and converges to a constant value. The occasional spikes have the same origin as those in the results for Ni2+ shown in Fig. 3g. Figure 4h shows the data size dependence of the test error. The test error decreases with increasing data size and converges to a constant value, as in the case of Ni2+. The stopping timings (vertical dashed lines in Fig. 4g, h) fall at data sizes from 69 to 84. The GPR fittings shown in Fig. 4d–f indicate that an appropriate stopping timing gives accurate spectral shapes, including main peaks and multiplet structures. These results indicate that the present method also works for the experimental XAS spectrum with noise. The animation of GPR fittings and the evolution of the error ratio and the test error presented above are shown in the Supplementary Movie.

Heat maps of the predicted mean μ and standard deviation σ versus the data size are shown in Fig. 4i, j. In Fig. 4i, peak structures around the L3 and L2 regions appear as thick coloured vertical lines and become more distinct with increasing data size, as in the case of the simulated Ni2+ XAS. The sampling tendency is also similar to the case of the simulated Ni2+ XAS measurement, and sampled energy points are uniformly distributed in the whole spectral energy range (see also the histogram in Supplementary Fig. 6d).

Time-cost evaluation

For realistic application to measurements, it is essential to estimate the time cost, i.e. the total measurement time, of the spectral measurement with AL. The time cost for the spectral measurement with AL, tAL, is defined as

$${t}_{{{{\rm{AL}}}}}=\mathop{\sum}\limits_{n}\left({t}_{{{{\rm{ene}}}}}\times {{\Delta }}{E}_{n}^{{{{\rm{Init}}}}}+{t}_{{{{\rm{meas}}}}}+{t}_{{{{\rm{GP}}}}}\right)+\mathop{\sum}\limits_{n}\left({t}_{{{{\rm{ene}}}}}\times {{\Delta }}{E}_{n}^{{{{\rm{AL}}}}}+{t}_{{{{\rm{meas}}}}}+{t}_{{{{\rm{GP}}}}}+{t}_{{{{\rm{SC}}}}}\right),$$
(6)

where tene, tmeas, tGP and tSC are the time to change the energy by a unit amount, e.g. 1 eV, the time to measure a single spectral intensity, the time to compute the GPR and the time to evaluate the stopping criterion, respectively. \({{\Delta }}{E}_{n}^{{{{\rm{Init}}}}}\) is the distance between energies in the initial sampling and \({{\Delta }}{E}_{n}^{{{{\rm{AL}}}}}\) is the distance between the energies of the n-th and the (n − 1)-th sampling. Thus, tAL consists of the time costs for the initial sampling and for the sampling in AL. On the other hand, the time cost for the conventional DoE, tCDoE, is defined as

$${t}_{{{{\rm{CDoE}}}}}=\mathop{\sum}\limits_{n}\left({t}_{{{{\rm{ene}}}}}\times {{\Delta }}{E}^{{{{\rm{CDoE}}}}}+{t}_{{{{\rm{meas}}}}}\right).$$
(7)

For simplicity, we let ΔECDoE be a constant by assuming that a constant energy step is used over the whole spectral energy range.
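Equations (6) and (7) can be evaluated directly from the sequence of visited energies. The Python sketch below is an illustration with hypothetical function names and made-up numbers; it assumes that the scan starts at the first initial energy (so the first ΔE is zero), that the initial points are visited in the order given, and that the first AL step starts from the last initial energy.

```python
def time_cost_al(e_init, e_al, t_ene, t_meas, t_gp, t_sc=0.0):
    """Eq. (6): initial-sampling cost plus active-learning cost."""
    total = 0.0
    previous = e_init[0]  # assumed: the scan starts at the first initial energy
    for energy in e_init:
        total += t_ene * abs(energy - previous) + t_meas + t_gp
        previous = energy
    for energy in e_al:
        total += t_ene * abs(energy - previous) + t_meas + t_gp + t_sc
        previous = energy
    return total


def time_cost_cdoe(n_points, delta_e, t_ene, t_meas):
    """Eq. (7) with a constant energy step delta_e."""
    return n_points * (t_ene * delta_e + t_meas)


# Made-up example: energies in eV, times in arbitrary units.
cost_al = time_cost_al([850.0, 860.0, 870.0], [855.0, 865.0],
                       t_ene=1.0, t_meas=1.0, t_gp=0.1)
cost_cdoe = time_cost_cdoe(n_points=5, delta_e=0.1, t_ene=1.0, t_meas=1.0)
```

For an equal number of points, the AL scan pays more for energy changes because it jumps across the spectrum; the saving comes from needing fewer points, and it grows as tmeas/tene increases.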

Figure 5 shows the time cost evaluated for the Ni2+ L2,3 XAS spectral measurement with various tmeas/tene ratios, which vary with the experimental conditions. In this evaluation, we assume that the signal-to-noise ratio (SNR) of the XAS spectrum does not depend on tmeas, i.e. we ignore the effect of the SNR on the spectral measurement with AL, which will be explored in a further study. Meanwhile, we set tGP/tene = 0.1 for all evaluations because the computational time for GPR, tGP, is much shorter than tene and tmeas when the number of measurements is of the order of <1000. We note that the computational cost for evaluating the stopping criterion is negligible. The time cost in Fig. 5 is given in arbitrary units; the vertical axis can be read in seconds if, for example, tmeas = tene = 1 s and tGP = 0.1 s in the case of Fig. 5a. In both the spectral measurement with AL and that with the conventional DoE, the time cost increases monotonically with data size. The slope of the time cost for the conventional DoE is constant because ΔECDoE is constant, as mentioned above. On the other hand, the slope of the time cost for the spectral measurement with AL changes because \({{\Delta }}{E}_{n}^{{{{\rm{AL}}}}}\) changes in each sampling. At the stopping timing of the spectral measurement with AL, an experimenter can observe the whole spectral shape, as shown in Fig. 3d–f. In the conventional DoE, the whole spectral shape appears only after the measurement is finished. The spectral measurement with AL outperforms the conventional DoE, i.e. a lower time cost is achieved, in most cases, except for the case of tmeas/tene = 1 with the threshold λ = 0.025. A similar tendency is observed for the measurement of Co2+ L2,3 XAS; however, in other cases, the spectral measurement with AL always achieves a lower time cost than that of the conventional DoE, as shown in Supplementary Figs. 7–11.

Fig. 5: Time cost estimation of the Ni L2,3 XAS spectral measurement for simulated Ni2+.

Time cost versus data size is plotted for various ratios between the time to measure a single spectral intensity tmeas and the time to change unit energy tene: a tmeas/tene = 1, b tmeas/tene = 10, c tmeas/tene = 100 and d tmeas/tene = 1000. Red and grey curves indicate the time cost for the spectral measurement with AL and that with the conventional DoE, respectively. Red, green, blue and grey circles indicate the time cost at stopping timings for λ of 0.1, 0.05 and 0.025, and for the conventional DoE, respectively.

Discussion

In this paper, we proposed the application of an automated stopping criterion for spectral measurement with AL to enhance the efficiency of spectral measurements. The method was applied to the simulated and experimental Ni L2,3 XAS spectra. Predicted spectra with GPR fitting demonstrate satisfactory accuracy at different stopping timings with several thresholds. The method was also applied to XAS spectra of elements other than Ni. The GPR fittings and data size dependency of the error ratio and the test error for simulated and experimental Mn and Co L2,3 XAS are shown in Supplementary Figs. 2–5. These results show a similar tendency to the simulated and experimental Ni L2,3 XAS results. Time costs for various XAS spectral measurements with tmeas/tene = 1 are summarised in Fig. 6. The GPR fittings give a reasonable approximation of the XAS spectra at the automated stopping timing, at which the number of measurements is dramatically reduced compared to the conventional experimental design. These results show that the automated stopping criterion works well for the spectral measurement with AL in general.

Fig. 6: Time costs at stopping timings for various XAS spectral measurements.

Comparison of time costs for the spectral measurement with AL at automated stopping timing with several thresholds and the conventional DoE for a simulated L2,3 XAS spectra of Ni2+, Mn2+ and Co2+ and b experimental L2,3 XAS spectra of Ni, MnO2, and Co. The ratio between time to measure single spectral intensity tmeas and time to change unit energy tene is set to tmeas/tene = 1.

Based on the estimation of time cost, the spectral measurement with AL outperforms that with the conventional DoE in many cases, as shown in Figs. 5 and 6. In particular, the advantage of the spectral measurement with AL is emphasised in the cases of a large tmeas/tene ratio. Therefore, the spectral measurement with AL is especially effective in measurements with a long measurement time per energy or other scanning parameter, such as XAS measurement of a very dilute system, e.g. a single molecule on a surface, or inelastic neutron scattering, generally known as a neutron-hungry experiment. It is also effective for experiments such as scanning transmission X-ray microscopy (STXM). In the STXM experiment, a two-dimensional spatial scan is performed at each X-ray energy, so the measurement time per energy becomes much longer than the time for changing the X-ray energy. Thus, the advantage of reducing the number of measurement points by the spectral measurement with AL becomes prominent for experiments with a long measurement time per scanning parameter. Note that the so-called ‘on-the-fly’ scan is used as a very quick measurement technique, in particular for a single XAS spectral measurement. To improve the efficiency of the measurement, it is important to choose appropriately between such techniques and the spectral measurement with AL depending on the experimenter’s purpose.

Here, we discuss why the automated stopping criterion for the spectral measurement with AL works. In Fig. 1a, the measurement energy points in the spectral measurement with AL jump backward and forward over the whole spectral energy range. Thus, GPR fitting can approximate the rough shape of the spectrum in the early stage of the measurement, and the generalisation error becomes small. Fine spectral features, such as satellite peaks, appear as the measurement progresses; however, these measurements are not very relevant to the improvement in the test error between different thresholds. In other words, the method works with a balance between ‘exploration’ and ‘exploitation’. Exploratory sampling is more important than exploitative sampling to reduce the generalisation error and stop the experiment with a minimum number of measurements in the spectral measurement with AL. In contrast, exploitative sampling becomes effective when one wants to measure detailed spectral features. This type of sampling becomes possible by using an acquisition function proposed in the literature20. Prior knowledge regarding spectra could be used to design an acquisition function; however, this subject is a topic for future research.

In conclusion, we applied the stopping criterion based on the stability of the expected generalisation errors to the XAS spectral measurement with AL. This stopping criterion can be evaluated from the self-contained information of the GPR fitting. The automated stopping criterion of the spectral measurement gives an approximated XAS spectrum with sufficient accuracy. This work applies the state-of-the-art theory of the optimal stopping problem in AL to actual measurements. The proposed method can be applied not only to spectral measurements but also to other types of measurements. An enhancement of the spectral measurement efficiency enables the high-throughput characterisation of materials for the construction of an experimental materials database in the era of materials informatics.

Methods

Simulation and measurement of X-ray absorption spectra

The simulation of XAS spectra was performed using CTM4XAS software50.

Ni, Mn and Co L2,3 XAS spectra were calculated for Ni2+, Mn2+, and Co2+ ions. Crystal field parameters were set to Oh symmetry with 10Dq = 1.0. The calculated multiplets were broadened with Lorentzian and Gaussian functions of 0.2 eV half-width at half-maximum each.
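
As a rough illustration of this broadening step (not the CTM4XAS implementation), the following Python sketch broadens a stick spectrum with a Lorentzian and then a Gaussian, each specified by its half-width at half-maximum. The function name `broaden`, the uniform energy grid, and the truncation of the Gaussian kernel at five standard deviations are assumptions made for the example.

```python
import numpy as np

def broaden(stick_e, stick_i, grid, hwhm_l=0.2, hwhm_g=0.2):
    """Broaden sticks (energies stick_e, intensities stick_i) on a uniform grid.

    Hypothetical helper: Lorentzian broadening followed by a Gaussian
    convolution, both parameterised by their HWHM in the units of `grid`.
    """
    # Sum of unit-area Lorentzians centred on each stick energy.
    diff = grid[:, None] - stick_e[None, :]
    lorentz = (hwhm_l / np.pi) / (diff ** 2 + hwhm_l ** 2)
    spec = lorentz @ stick_i

    # Gaussian kernel: HWHM = sigma * sqrt(2 ln 2).
    sigma = hwhm_g / np.sqrt(2.0 * np.log(2.0))
    step = grid[1] - grid[0]
    half = int(np.ceil(5.0 * sigma / step))  # truncate kernel at 5 sigma
    kx = np.arange(-half, half + 1) * step
    kernel = np.exp(-0.5 * (kx / sigma) ** 2)
    kernel /= kernel.sum()  # discrete normalisation preserves the area

    # mode="same" keeps the grid length (grid must be longer than the kernel).
    return np.convolve(spec, kernel, mode="same")
```

For example, `broaden(np.array([0.0]), np.array([1.0]), np.linspace(-20.0, 20.0, 4001))` returns a single Voigt-like peak centred at zero energy.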

The XAS experiment was performed at BL-19B at the Photon Factory, Institute of Materials Structure Science, High Energy Accelerator Research Organization, Japan51. Three types of samples, manganese dioxide (MnO2) powder and pieces of bulk cobalt (Co) and nickel (Ni), were mounted on the sample manipulator in the vacuum chamber. Mn, Co and Ni L2,3 XAS spectra were obtained at room temperature by the total electron yield method, which measures the sample drain current. The spectra were obtained by dividing the sample current I by the mirror current I0 to cancel the intensity variation of the incident X-rays. All simulated and experimental spectra used in this study are shown in Supplementary Fig. 1.

Gaussian process regression

The fundamental idea of spectral measurement with AL is to identify the spectral measurement problem with a supervised curve-fitting, or regression, problem. In terms of regression analysis, the energy point x is the explanatory variable and the corresponding intensity y is the response variable. In this section, we briefly explain GPR, which was adopted in the present study to realise AL. The details of GPR are thoroughly described in ref. 39.

For energy points xi, we assume that the intensity yi at xi is modelled as yi = f(xi) + εi, where \({\varepsilon }_{i} \sim {{{\mathcal{N}}}}(0,{\xi }^{2})\) is the observation noise. In Gaussian process modelling, function \({f}{{(x)}}\) itself is assumed to be a random variable under a Gaussian distribution with mean \({{\mu_0}{(x)}}\) and covariance function \({{{\bf{K}}}}({{{x}}},{{{x}}}^{\prime} )\); hence, y is also a realisation of the Gaussian random variable with mean \({{\mu_0}{(x)}}\) and variance \({{\sigma}^2(x)}={\bf{K}}{(x,x)}+\xi^2\). Given a collection of observations y corresponding to the collection of energy points Xn = (x1, …, xn), the mean function and the variance function of the Bayesian posterior distribution are denoted by \(\hat{{{{\mu}}}}({{{x}}})\) and \({\hat{{{{\sigma}}}}}^{2}({{{x}}})\), respectively. To obtain the mean and variance values at a newly observed point x*, we consider the joint distribution of y and f(x*), which is expressed as

$$\left(\begin{array}{l}{{{\bf{y}}}}\\ f({x}^{* })\end{array}\right) \sim {{{\mathcal{N}}}}\left(\left(\begin{array}{l}{{{{\boldsymbol{\mu }}}}}_{0}({{{{\bf{X}}}}}_{n})\\ {\mu }_{0}({x}^{* })\end{array}\right),\left(\begin{array}{ll}{{{{\bf{K}}}}}_{n,n}+{\xi }^{2}{{{{\bf{I}}}}}_{n}&{{{{\bf{K}}}}}_{n,* }\\ {{{{\bf{K}}}}}_{n,* }^{\top }&{{{\bf{K}}}}({x}^{* },{x}^{* })\end{array}\right)\right)$$
(8)

where \({{{{\bf{K}}}}}_{n,n}={{{\bf{K}}}}({{{{\bf{X}}}}}_{n},{{{{\bf{X}}}}}_{n})\in {{\mathbb{R}}}^{n\times n}\) and \({{{{\bf{K}}}}}_{n,* }={{{\bf{K}}}}({{{{\bf{X}}}}}_{n},{x}^{* })\in {{\mathbb{R}}}^{n}\). The mean function value of the posterior distribution of f(x*) is obtained as

$$\hat{\mu }({x}^{* })={\mu }_{0}({x}^{* })+{{{{\bf{k}}}}}_{n}{({x}^{* })}^{\top }{({{{{\bf{K}}}}}_{n,n}+{\xi }^{2}{{{{\bf{I}}}}}_{n})}^{-1}({{{\bf{y}}}}-{{{{\boldsymbol{\mu }}}}}_{0}({{{{\bf{X}}}}}_{n}))$$
(9)

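where \({{{{\bf{k}}}}}_{n}({x}^{* })={({{{\bf{K}}}}({x}_{1},{x}^{* }),\ldots ,{{{\bf{K}}}}({x}_{n},{x}^{* }))}^{\top }\), which is identical to \({{{{\bf{K}}}}}_{n,* }\) in Eq. (8). Moreover, the posterior variance at the new energy point x* is obtained as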

$${\hat{\sigma }}^{2}({x}^{* })={{{\bf{K}}}}({x}^{* },{x}^{* })-{{{{\bf{k}}}}}_{n}{({x}^{* })}^{\top }{({{{{\bf{K}}}}}_{n,n}+{\xi }^{2}{{{{\bf{I}}}}}_{n})}^{-1}{{{{\bf{k}}}}}_{n}({x}^{* }).$$
(10)
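
Equations (9) and (10) can be sketched numerically as follows. This is a minimal NumPy example, not the implementation used in the study: the squared-exponential form of the covariance function, the constant prior mean `mu0`, and the names `rbf_kernel` and `gp_posterior` are assumptions made for illustration, with θ1 as the amplitude and θ2 as the bandwidth.

```python
import numpy as np

def rbf_kernel(xa, xb, theta1, theta2):
    """Assumed covariance: theta1 * exp(-(x - x')^2 / (2 * theta2^2))."""
    d = xa[:, None] - xb[None, :]
    return theta1 * np.exp(-0.5 * (d / theta2) ** 2)

def gp_posterior(x_train, y_train, x_star, theta1, theta2, noise_var, mu0=0.0):
    """Posterior mean (Eq. 9) and variance (Eq. 10) of f at the points x_star."""
    n = len(x_train)
    # K_{n,n} + xi^2 I_n
    K = rbf_kernel(x_train, x_train, theta1, theta2) + noise_var * np.eye(n)
    # Columns are k_n(x*) for each candidate point x*; shape (n, m).
    k_star = rbf_kernel(x_train, x_star, theta1, theta2)

    # Cholesky factorisation instead of an explicit matrix inverse.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - mu0))
    mean = mu0 + k_star.T @ alpha

    v = np.linalg.solve(L, k_star)
    # K(x*, x*) equals theta1 for the assumed kernel.
    var = theta1 - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative round-off
```

With a very small noise variance, the posterior mean reproduces the observed intensities at the measured energy points and the posterior variance there is close to zero, whereas far from any measured point the variance returns to the prior value θ1.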

The acquisition function is defined as follows

$$a\left(\hat{\sigma },\hat{\mu },t\right)=\frac{\hat{\sigma }({x}^{* })}{{\sigma }_{\max }}+\frac{1}{t}\sqrt{\frac{\hat{\mu }({x}^{* })}{{\mu }_{\max }}},$$
(11)

where \({\sigma }_{\max }\) and \({\mu }_{\max }\) are the maximum standard deviation and the maximum mean, respectively, among the consecutive measurements at times 1, …, t.
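
The behaviour of Eq. (11) can be illustrated with a short Python sketch. It assumes that the next energy point is the candidate that maximises the acquisition function, and that negative posterior means are clipped to zero before the square root; both choices, as well as the function names, are assumptions of this example rather than details confirmed by the text.

```python
import numpy as np

def acquisition(sigma_hat, mu_hat, t, sigma_max, mu_max):
    """Eq. (11): exploration term plus an exploitation term that decays as 1/t."""
    # Assumption: clip negative ratios so that the square root is defined.
    ratio = np.clip(mu_hat / mu_max, 0.0, None)
    return sigma_hat / sigma_max + np.sqrt(ratio) / t

def next_point(candidates, sigma_hat, mu_hat, t, sigma_max, mu_max):
    """Return the candidate energy point with the largest acquisition value."""
    scores = acquisition(sigma_hat, mu_hat, t, sigma_max, mu_max)
    return candidates[int(np.argmax(scores))]
```

Because the second term is weighted by 1/t, points with a high predicted intensity are favoured early in the measurement, while the posterior standard deviation dominates the selection as t grows.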

The amplitude θ1 and the bandwidth θ2 of the covariance function \({{{\bf{K}}}}({{{x}}},{{{x}}}^{\prime} )\) (Eq. (1)) and noise variance are predetermined by maximising the marginal likelihood of the Gaussian process model for a similar dataset measured in the past by the same device.
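
For reference, the marginal likelihood of a Gaussian process model with Gaussian observation noise has a standard closed form. Written in the notation of Eqs. (8)–(10), and taking \({{{{\bf{X}}}}}_{n}\) and y here to denote the previously measured dataset (an assumption of this note), its logarithm is

$$\log p({{{\bf{y}}}}| {{{{\bf{X}}}}}_{n},{\theta }_{1},{\theta }_{2},{\xi }^{2})=-\frac{1}{2}{({{{\bf{y}}}}-{{{{\boldsymbol{\mu }}}}}_{0}({{{{\bf{X}}}}}_{n}))}^{\top }{({{{{\bf{K}}}}}_{n,n}+{\xi }^{2}{{{{\bf{I}}}}}_{n})}^{-1}({{{\bf{y}}}}-{{{{\boldsymbol{\mu }}}}}_{0}({{{{\bf{X}}}}}_{n}))-\frac{1}{2}\log \det ({{{{\bf{K}}}}}_{n,n}+{\xi }^{2}{{{{\bf{I}}}}}_{n})-\frac{n}{2}\log 2\pi$$

where \({{{{\bf{K}}}}}_{n,n}\) depends on θ1 and θ2 through the covariance function. The first term rewards fitting the data, and the second term penalises overly flexible models.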

Both the simulated and experimental XAS spectra used in this study have 2000 data points in total. In the present implementation of AL, the data points were divided into three parts: the initial sampling data, the pool data, and the test data for evaluating the generalisation error, with sizes of 10, 900 and 1090, respectively. Therefore, we assumed that the total number of energy points measured in the conventional DoE is the same as the size of the pool data (N = 900).
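
The three-way partition can be sketched as below. The text does not state how the points were assigned to each part, so the random permutation, the fixed seed, and the name `split_indices` are assumptions of this example.

```python
import numpy as np

def split_indices(n_total=2000, n_init=10, n_pool=900, n_test=1090, seed=0):
    """Split point indices into initial, pool and test sets (assumed random)."""
    if n_init + n_pool + n_test != n_total:
        raise ValueError("part sizes must sum to n_total")
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_total)
    init = np.sort(perm[:n_init])                  # initial sampling points
    pool = np.sort(perm[n_init:n_init + n_pool])   # candidates for AL
    test = np.sort(perm[n_init + n_pool:])         # held out for the test error
    return init, pool, test
```

The returned index arrays are disjoint and together cover all 2000 energy points, so the test error is always evaluated on points that AL can never select.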