Automated stopping criterion for spectral measurements with active learning

The automated stopping of a spectral measurement with active learning is proposed. The optimal stopping of the measurement is realised with a stopping criterion based on the upper bound of the posterior average of the generalisation error of the Gaussian process regression. It is revealed that the automated stopping criterion of the spectral measurement gives an approximated X-ray absorption spectrum with sufficient accuracy and reduced data size. The proposed method is not only a proof-of-concept of the optimal stopping problem in active learning but also the key to enhancing the efficiency of spectral measurements for high-throughput experiments in the era of materials informatics.


INTRODUCTION
Machine learning and artificial intelligence (AI)-related techniques have been rapidly implemented in daily lives and industries, such as electronic commerce, manufacturing, and automated driving. This situation is not an exception for various scientific fields; in particular, materials science combined with information science is referred to as 'materials informatics' [1][2][3][4][5][6][7] . Materials informatics has been successful in predicting and finding new functional materials [8][9][10][11][12][13] or optimising devices and material microstructures 14,15 .
In general, machine learning techniques, such as deep learning, require substantial data to learn. Therefore, in materials informatics, theoretical calculations, such as first-principles calculations or molecular dynamics simulations, are useful methods for accumulating materials property data to construct a materials database [16][17][18] . However, many problems in materials science, e.g., the coercivity in permanent magnets, catalytic reactions, and the charge and discharge of batteries, are difficult to describe completely with theoretical calculations. These physical and chemical phenomena have complexities that span multiple temporal and spatial scales; they are difficult to model and require a tremendous computational cost. Therefore, an experimental approach is essential for understanding these phenomena. However, data acquisition by experiments incurs various costs related to real specimen preparation, human-operated measurement equipment, trial-and-error analyses, etc. There is a strong demand to promote the efficiency of experimental methods to accelerate materials science beyond traditional processes. In addition to materials informatics, 'measurement informatics', a research area that integrates measurements and machine learning, has been explored. With this approach, the efficiency of both measurement [19][20][21] and data analysis [22][23][24][25][26] can be enhanced.
Active learning (AL) is a machine learning scheme used to obtain predictive models with high precision at a limited cost through the sophisticated selection of samples for labelling 27 . Similar to other machine learning methods, AL is utilised in materials informatics [28][29][30][31][32][33][34] and measurement informatics 19,20 . Ueno et al. developed a method for X-ray magnetic circular dichroism (XMCD) spectral measurement with AL 19 . XMCD spectroscopy is a variation of X-ray absorption spectroscopy (XAS) using circularly polarised X-rays to investigate the element-specific magnetic properties of materials 35 . In particular, magneto-optical sum rules relate the XMCD spectrum to the spin and orbital magnetic moments, which are the fundamental physical parameters of elements [36][37][38] .
Measurements of XAS and XMCD spectra are usually performed in a step-by-step manner, i.e., the measurement of the X-ray absorption intensity and the tuning of the X-ray energy are repeated over the energy range of the spectrum (Fig. 1b). In conventional XAS and XMCD measurements, the numbers of energy points and energy steps are predetermined by the experimenter, and usually, several hundred points are measured. These measurement conditions are set based on the experimenter's experience and intuition. This type of conventional design of experiments (DoE) can potentially be optimised by a machine learning technique. The XMCD spectral measurement with AL uses Gaussian process regression (GPR) 39 to predict the whole spectral shape from the measured data (Fig. 1a). The standard deviation of the prediction is used to determine the optimal energy point to measure next. The measurement stops when the convergence criterion is satisfied. Compared to conventional measurements, this method succeeded in reducing the number of measurement energy points.
However, the XMCD spectral measurement with AL depends on the magnetic moment evaluated from the predicted spectrum as the stopping criterion. This condition means that the method is only applicable when the relation between the spectrum and the physical parameter is known. In addition, the physical parameter must be evaluated quickly from the spectrum to fit in the AL cycle of the measurement and GPR. In the common situation of a spectral measurement, the evaluation of physical parameters from the spectra is not straightforward; e.g., a comparison between the experimental and theoretical spectra is often needed. Therefore, a stopping criterion without the physical parameter is required to apply the spectral measurement with AL to general types of spectra.
The optimal stopping problem is a long-standing problem in AL 27 . Without an appropriate stopping criterion, the active learner would either require many unnecessary samples to be labelled or use too few samples, resulting in poor predictive performance. Despite its importance, there have been few studies on timing considerations for AL [40][41][42] , mainly because of its problem dependency. From the viewpoint of materials science, an optimal stopping rule is highly desired to avoid useless and costly experiments. In the typical problem setting of AL in materials and measurement informatics, the experiment is stopped when the budget is exhausted or the experimenter is satisfied with the results 28 . The former assumes an experimenter who needs the best result in a limited time. In such a situation, the experimenter will set the maximum number of iterations of the experiment within the limited time 43 . The latter assumes that the experiment is iterated until the experimenter is satisfied with the target property, e.g. the search for a material with the highest melting point 44 or the convergence of the magnetic moments within a predetermined threshold 19 .
In this paper, we applied a universal stopping criterion for XAS spectral measurement with AL. The stopping criterion is based on the stability of the expected generalisation errors 45 . This stopping criterion is completely evaluated on a mathematical basis; therefore, it is applicable to general spectral measurements whose relation to the physical parameter is unknown. We applied the method to the spectral measurement with AL of several types of simulated and experimental XAS spectra. It is revealed that the automated stopping criterion of the spectral measurement gives an approximated XAS spectrum with sufficient accuracy. The proposed method can be applied not only to spectral measurements but also to other types of measurements and improves the efficiency of high-throughput experiments in the era of materials informatics.

RESULTS
Spectral measurements with AL
Figure 1c shows the flowchart of a spectral measurement with AL. Spectral measurement with AL includes the following steps: (1) First, initial spectral data $Y_0(X_0)$ are sampled. Here, $Y_0 = (y_i, \ldots, y_j)$ represents the spectral intensities at energy points $X_0 = (x_i, \ldots, x_j)$.
(2) Subsequently, GPR is applied to the initial data, and we obtain a mean $\mu_n$ and standard deviation $\sigma_n$ for each energy point. $n = 0, 1, \ldots$ represents the number of samplings ($n = 0$ for the initial sampling). The mean $\mu_n$ is regarded as a predicted spectrum. The stopping criterion is evaluated with the result of the GPR fitting.
(3) If the stopping criterion is not satisfied, then the next sampling point $x_{\mathrm{next}}$ is automatically determined based on an acquisition function. (4) Subsequently, the new energy point is sampled and added to the measured data, $Y_n = (y_i, \ldots, y_{\mathrm{next}}, \ldots, y_j)$. Returning to step (2), GPR is applied to $Y_n(X_n)$ again. Steps (2) to (4) are repeated until the stopping criterion is satisfied.
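The four steps above can be sketched as a generic loop. This is a minimal illustration, not the authors' implementation: the function names (`active_measurement`, `toy_fit_predict`), the toy spectrum and the stopping rule are hypothetical, and the GPR model is replaced by a simple stand-in (linear interpolation for the mean, distance to the nearest measured point for the uncertainty) so that the block stays short and runnable.

```python
import numpy as np

def toy_fit_predict(x_obs, y_obs, x_all):
    """Stand-in for GPR (assumption): interpolated mean, distance-based uncertainty."""
    order = np.argsort(x_obs)
    mu = np.interp(x_all, x_obs[order], y_obs[order])
    sigma = np.min(np.abs(x_all[:, None] - x_obs[None, :]), axis=1)
    return mu, sigma

def active_measurement(measure, pool, init_idx, fit_predict, should_stop):
    """Steps (1)-(4): initial sampling, fit, choose next point, measure, repeat."""
    sampled = list(init_idx)
    y = [measure(pool[i]) for i in sampled]               # step (1)
    while True:
        mu, sigma = fit_predict(pool[sampled], np.array(y), pool)  # step (2)
        if should_stop(mu, sigma) or len(sampled) == len(pool):
            break
        score = sigma.copy()                              # step (3): uncertainty as a
        score[sampled] = -np.inf                          # stand-in acquisition function
        nxt = int(np.argmax(score))
        sampled.append(nxt)
        y.append(measure(pool[nxt]))                      # step (4)
    return np.array(sampled), np.array(y), mu, sigma

# Toy usage: a single Gaussian peak on a 0.5 eV grid (hypothetical numbers).
pool = np.linspace(845.0, 880.0, 71)
truth = lambda e: np.exp(-0.5 * (e - 853.0) ** 2)
idx, y, mu, sigma = active_measurement(
    truth, pool, [0, 35, 70], toy_fit_predict,
    should_stop=lambda mu, sigma: sigma.max() <= 1.0,
)
```

In an actual measurement, `fit_predict` would be the GPR fit and `should_stop` the error-ratio criterion described in the next section; the loop structure itself does not depend on either choice.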
In our previous work 19 , a physical parameter of interest was evaluated by comparing the fitted spectrum and the reference spectrum of a known material. When the physical parameter converged to within a predetermined accuracy range, the measurement was terminated. However, this method has some limitations for general use: it is only applicable to materials whose physical parameter can be evaluated with the similarity measure and for which a reference value of the physical parameter is available. Moreover, different results are generated for different similarity measures, and the threshold must be properly determined. Therefore, we adopt the criterion that depends only on the fitted spectra to promote the application of spectral measurement with AL.

Fig. 1 Concept of the spectral measurement with active learning. a Schematic of the spectral measurement with active learning. b Schematic of the spectral measurement with conventional design of experiments. Red markers, blue curves, and grey bars indicate the sampled data, predicted mean of Gaussian process regression, and energy points, respectively. Green arrows represent the sampling order that is determined with active learning in a or sequential sampling in b. c Flowchart of the spectral measurement with active learning. (1) A sampling of the initial data. (2) Gaussian process regression (GPR) is used to predict the spectrum and evaluate the standard deviation. The measurement is finished when the stopping criterion is satisfied. (3) If the stopping criterion is not satisfied, then the next sampling point is determined based on the acquisition function. (4) A sampling of new data and back to the process (2).
The choice of a covariance function (kernel) $K(x, x')$ is essential in GPR. We adopt the standard power exponential correlation function (Eq. (1)), with the amplitude parameter $\theta_1$ and bandwidth parameter $\theta_2$.
The power exponential or Gaussian correlation function performs better than the Matérn correlation function for the fitting of the XAS spectra 19 .
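As an illustration of such a kernel, the sketch below evaluates a power exponential covariance matrix. The explicit form θ₁·exp(−|x − x′|^p / θ₂) with p = 2 (the Gaussian case) is an assumption made here for concreteness; the exact parameterisation of Eq. (1) should be taken from the original equation. The function name and default values are hypothetical.

```python
import numpy as np

def power_exp_kernel(x1, x2, theta1=1.0, theta2=1.0, power=2.0):
    """Assumed form: K(x, x') = theta1 * exp(-|x - x'|**power / theta2)."""
    a = np.asarray(x1, dtype=float)[:, None]
    b = np.asarray(x2, dtype=float)[None, :]
    return theta1 * np.exp(-(np.abs(a - b) ** power) / theta2)

# Covariance between three energy points (arbitrary example values).
K = power_exp_kernel([0.0, 1.0, 3.0], [0.0, 1.0, 3.0], theta1=2.0, theta2=1.5)
```

With this form, θ₁ sets the prior variance on the diagonal and θ₂ controls how quickly the correlation between two energy points decays with their separation.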

Stopping criterion
Approximating the spectrum by the GPR is considered a problem of supervised learning. In this setting, the goodness of approximation is evaluated by the average prediction error of the intensity at an unseen energy point, which is called the generalisation error $L(f)$ (Eq. (2)); the average is taken with respect to $p(x, y)$, a joint density function of $(x, y)$. Note that $f(x)$ is a stochastic predictor sampled from the fitted Gaussian process; hence, $L(f)$ is a random variable. In a problem setting of the spectral measurement, $x$, $y$ and $f(x)$ are the energy, the ground truth (unknown) spectrum and the fitted curve by the GPR model, respectively.
Let $p_t(f)$ be the posterior distribution of the predictive model $f(x)$ obtained by fitting a Gaussian process to the observations up to time $t$. Then, the posterior average of the generalisation error is defined by $L_t = \mathbb{E}_{f \sim p_t}[L(f)]$ (Eq. (3)). If the gap $|L_t - L_{t+1}|$ between the posterior averages of the generalisation error at $t$ and $t + 1$ is small enough, then there is only a small gain from performing an additional observation, and the experiment should be stopped. It is not possible to calculate the generalisation error directly because we do not have access to the distribution $p(x, y)$. A method to estimate the upper bound of $L_t - L_{t+1}$ is proposed in ref. 46 , and a stopping criterion of AL based on the convergence of the gap $|L_t - L_{t+1}|$ is developed in ref. 45 . Suppose $L(f) \in [a, b]$; then $|L_t - L_{t+1}|$ is bounded from above by an expression that involves $r(p_t, p_{t+1}) + r(p_{t+1}, p_t)$ and the constant $b - a$. In that expression, $W_0$ is the main branch of the Lambert W-function 47 , $\mathrm{KL}(p \| q) = \int p(x) \log \frac{p(x)}{q(x)} \, \mathrm{d}x$ is the Kullback-Leibler divergence 48 , and $e$ is the base of the natural logarithm. The Kullback-Leibler divergence between the posterior distributions $p_t$ and $p_{t+1}$ is shown to be exactly calculable by using the observed points $\{(x_i, y_i)\}_{i=1}^{t}$; hence, the upper bound of $|L_t - L_{t+1}|$ is computable from the data at hand. To align the scale of $r(p_t, p_{t+1}) + r(p_{t+1}, p_t)$ and remove the constant term $b - a$, we consider the ratio
$$\lambda_t = \frac{r(p_t, p_{t+1}) + r(p_{t+1}, p_t)}{r(p_1, p_2) + r(p_2, p_1)}.$$
When this ratio is smaller than a certain threshold $\lambda \in [0, 1]$, the experiments are stopped. Figure 2 shows the schematic of the stopping criterion with the error ratio. Intuitively, the threshold $\lambda$ is regarded as the expected rate of the improvement of fitting by one additional observation. The stopping time is not very sensitive to the value of $\lambda$ when it is set to be small enough, but it is possible to determine this parameter by a simulation study in advance. We take this strategy and report the experimental results in the following section.
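Two ingredients of this criterion can be sketched in code under stated assumptions. The first function evaluates the Kullback-Leibler divergence between two finite-dimensional Gaussian distributions in closed form, which is the standard result for Gaussians and is used here only as a stand-in for the divergence between successive GP posteriors; the second normalises a sequence of bound values by its first element to obtain the ratio λ_t. The function r itself is not reproduced here, so `bound_values` is a hypothetical input that would hold r(p_t, p_{t+1}) + r(p_{t+1}, p_t) for each t.

```python
import numpy as np

def gaussian_kl(mean_p, cov_p, mean_q, cov_q):
    """KL(p || q) for two multivariate Gaussians with full-rank covariances."""
    mean_p = np.atleast_1d(mean_p).astype(float)
    mean_q = np.atleast_1d(mean_q).astype(float)
    cov_p = np.atleast_2d(cov_p).astype(float)
    cov_q = np.atleast_2d(cov_q).astype(float)
    k = mean_p.size
    diff = mean_q - mean_p
    trace_term = np.trace(np.linalg.solve(cov_q, cov_p))
    maha_term = diff @ np.linalg.solve(cov_q, diff)
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (trace_term + maha_term - k + logdet_q - logdet_p)

def error_ratio(bound_values):
    """lambda_t: each bound value divided by the first one in the sequence."""
    b = np.asarray(bound_values, dtype=float)
    return b / b[0]

kl_example = gaussian_kl(0.0, 1.0, 1.0, 1.0)   # two unit-variance 1D Gaussians, means 0 and 1
ratios = error_ratio([4.0, 2.0, 1.0])           # hypothetical bound values
```

Because λ_t is a ratio of two bound values, any constant prefactor such as b − a cancels, which is why the threshold can be chosen on a common scale across spectra.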

Application to the simulated spectrum
First, we applied the method to the simulated noise-free Ni L₂,₃ XAS of divalent nickel ions (Ni²⁺) to verify the effectiveness of the overall strategy. Details of the simulation are described in the Methods section. All spectra used in the study are presented in Supplementary Fig. 1. Figure 3a-f shows snapshots of the GPR fitting of the simulated L₂,₃ XAS of Ni²⁺. In Fig. 3a, randomly selected initial data points (data size = 10) and the GPR fitting are shown. The L₃ peak happens to be measured in the initial sampling. The GPR fitting exhibits a large standard deviation between the sampled data points, and the next sampling point is chosen from these x values with a large value of the acquisition function (Eq. (11)). The GPR fitting after several samplings (data size = 20) is shown in Fig. 3b. The whole spectral shape appears, but the standard deviation between the sampled data points is still large, approximately one-third of the intensity at the L₃ peak. Figure 3c shows the GPR fitting after 50 samplings (data size = 60). Satellite peaks around the L₃ main peak and the multiplet structure around the L₂ peak appear at this degree of sampling density. Figure 3d-f shows the results of the GPR fitting for the different stopping timings, i.e., different thresholds. The standard deviation is relatively small compared to the intensity of the L₃ and L₂ peaks in the whole energy range. The intensity relations of the multiplet structure around the L₂ peak are correctly approximated. In Fig. 3d, e, the difference between the GPR fittings for different thresholds only appears in the standard deviations around non-peak regions. In Fig. 3f, the GPR fitting for a data size of 178 is almost the same as the GPR fitting for a data size of 116, but an increase in sampling points around the peak region is visible.
To visualise the progress of the spectral measurement and the stopping timing, the error ratio and the test error versus data size are shown in Fig. 3g, h, respectively. In Fig. 3g, the error ratio $\lambda_t$ is plotted as a function of the data size, i.e., the number of measurement points x. The error ratio decreases with increasing data size and converges to an almost constant value. Figure 3h shows the test error, i.e., the posterior average of the generalisation errors (Eq. (3)), versus the data size. The test error decreases more steeply than the error ratio in the initial stage of the measurement and also converges to a constant value. The spikes in the error ratio in Fig. 3g come from the subtraction of the test errors at successive steps to evaluate the upper bound of the test error in Fig. 3h. The spikes in the error ratio coincide with the spikes in the difference of the test error between the n-th and (n − 1)-th sampling. The spikes are inevitable because a discontinuous improvement of the test error occasionally arises in AL. Alternatively, the minimum value of the error ratio is plotted in Fig. 3g. It is revealed that the minimum value of the error ratio gradually decreases with increasing data size. Therefore, excessively early stopping of a measurement can be avoided. The vertical dashed lines in Fig. 3g, h indicate the stopping timings for different thresholds that correspond to the GPR fittings shown in Fig. 3d-f. These results indicate that the automated stopping of the XAS measurement based on the generalisation error gives a GPR fitting with small errors from the ground truth spectrum with a reduced data size.

Fig. 2 Schematic of the stopping criterion with the error ratio. The data size increases with measurement time t. The error ratio $\lambda_t$ is evaluated after every Gaussian process regression. The measurement is stopped when $\lambda_t$ falls below the threshold $\lambda$. The original conceptual diagram appears in ref. 45 .
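The relation between the spiky error ratio and its minimum can be illustrated with a short sketch. The ratio values below are invented for illustration, and the helper name `first_stop_index` is hypothetical; the sketch only shows that the running minimum is a monotone envelope of the raw ratio and that both cross a given threshold at the same data size.

```python
import numpy as np

def first_stop_index(error_ratios, threshold):
    """Index of the first error ratio below the threshold, or None if there is none."""
    below = np.flatnonzero(np.asarray(error_ratios, dtype=float) < threshold)
    return int(below[0]) if below.size else None

# Hypothetical error ratios with occasional spikes (not measured values).
ratios = np.array([1.0, 0.6, 0.9, 0.3, 0.5, 0.08, 0.2])
running_min = np.minimum.accumulate(ratios)   # monotone envelope of the raw ratio

stop_raw = first_stop_index(ratios, threshold=0.1)
stop_env = first_stop_index(running_min, threshold=0.1)
```

The first crossing is identical for the raw sequence and its running minimum by construction, so the envelope is mainly useful for reading the overall downward trend without the spikes.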
To give an overall picture of the spectral measurement with AL, the predicted mean μ and standard deviation σ versus data size are visualised as heat maps in Fig. 3i, j, respectively. The stopping timings for different thresholds are represented as horizontal dashed lines. In Fig. 3i, the peak structures around the L₃ and L₂ regions are shown as thick coloured vertical lines. The peak structures become more distinct with increasing data size. On the other hand, the standard deviation decreases with increasing data size, as shown in Fig. 3j. Sampled energy points, indicated by orange markers in Fig. 3i, j, are nearly uniformly distributed in the whole spectral energy range. We confirmed a nearly uniform distribution by plotting a histogram of sampled energy points, as shown in Supplementary Fig. 6a. Thus, it seems that biased sampling, e.g., intensive sampling around the peak region, does not occur in the spectral measurement with AL when the acquisition function has the form of Eq. (11).

Application to the experimental spectrum
Next, we applied the method to the experimental data to demonstrate its applicability to actual spectral measurements. Experimental data inherently include measurement noise; therefore, it is essential to ascertain the noise tolerance of the present method. Moreover, the experimental XAS spectrum of Ni metal has a finite background that comes from a continuum state approximated by a double step-like function. Figure 4a-f shows snapshots of the GPR fitting of the experimental Ni L₂,₃ XAS of nickel metal. In Fig. 4a, randomly selected initial data points and the GPR fitting are shown. The standard deviation σ is large in sparsely sampled energy regions, as is the usual behaviour of the GPR. The overall spectral shape appears after 50 samplings, as shown in Fig. 4c. In addition to the L₃ and L₂ main peaks, the so-called 6 eV satellite 49 can be seen at ~858 eV at this sampling density. Figure 4d-f shows the results of the GPR fitting for the different stopping timings with a specific threshold. Similar to the results for the simulated Ni²⁺ XAS spectrum, the L₃ main peak is properly approximated by the GPR fitting with a stopping timing of λ = 0.1. The standard deviation around the L₂ peak region and other non-peak energy regions decreases with increasing measurements, as shown in Fig. 4e, f. Figure 4g, h shows the error ratio and the test error versus the data size. As shown in Fig. 4g, the error ratio decreases with increasing data size and converges to a constant value. The occasional spikes have the same origin as those in the results for Ni²⁺ shown in Fig. 3g. Figure 4h shows the data size dependence of the test error. The test error decreases with increasing data size and converges to a constant value, as in the case of Ni²⁺. The stopping timings (vertical dashed lines in Fig. 4g, h) occur at data sizes from 69 to 84. The GPR fittings shown in Fig. 4d-f indicate that an appropriate stopping timing gives accurate spectral shapes, including main peaks and multiplet structures. These results indicate that the present method also works for the experimental XAS spectrum with noise. The animation of the GPR fittings and the evolution of the error ratio and the test error presented above are shown in the Supplementary Movie.
Heat maps of the predicted mean μ and standard deviation σ versus the data size are shown in Fig. 4i, j. In Fig. 4i, peak structures around the L₃ and L₂ regions appear as thick coloured vertical lines and become more distinct with increasing data size, as in the case of the simulated Ni²⁺ XAS. The sampling tendency is also similar to the case of the simulated Ni²⁺ XAS measurement, and sampled energy points are uniformly distributed in the whole spectral energy range (see also the histogram in Supplementary Fig. 6d).

Time-cost evaluation
It is essential to estimate the time cost, i.e. the total measurement time, of the spectral measurement with AL for realistic application to measurements. The time cost for the spectral measurement with AL, $t_{\mathrm{AL}}$, is defined in terms of $t_{\mathrm{ene}}$, $t_{\mathrm{meas}}$, $t_{\mathrm{GP}}$ and $t_{\mathrm{SC}}$, which are the time to change the energy by a unit amount, e.g. 1 eV, the time to measure a single spectral intensity, the time to compute the GPR and the time to evaluate the stopping criterion, respectively. $\Delta E_n^{\mathrm{Init}}$ is the distance between energies in the initial sampling, and $\Delta E_n^{\mathrm{AL}}$ is the distance between the energies of the n-th and the (n − 1)-th sampling. Thus, $t_{\mathrm{AL}}$ consists of the time costs for the initial sampling and the sampling in AL. On the other hand, the time cost for the conventional DoE, $t_{\mathrm{CDoE}}$, is defined using the energy step $\Delta E^{\mathrm{CDoE}}$ between successive measurements. For simplicity, let $\Delta E^{\mathrm{CDoE}}$ be a constant, assuming a measurement in which a constant energy step is used over the whole spectral energy range. Figure 5 shows the time cost evaluated for the Ni²⁺ L₂,₃ XAS spectral measurement with various $t_{\mathrm{meas}}/t_{\mathrm{ene}}$ ratios, which vary with the experimental conditions. In this evaluation, we assume that the signal-to-noise ratio (SNR) of the XAS spectrum does not depend on $t_{\mathrm{meas}}$, i.e. we ignore the effect of the SNR on the spectral measurement with AL, which will be explored in a further study. Meanwhile, we suppose that $t_{\mathrm{GP}}/t_{\mathrm{ene}} = 0.1$ for all evaluations because the computational time for GPR, $t_{\mathrm{GP}}$, is much shorter than $t_{\mathrm{ene}}$ and $t_{\mathrm{meas}}$ when the number of measurements is of the order of <1000. We note that the computational cost of evaluating the stopping criterion is negligible. The time cost in Fig. 5 is given in arbitrary units; the vertical axis can be read in seconds if, for example, $t_{\mathrm{meas}} = t_{\mathrm{ene}} = 1$ s and $t_{\mathrm{GP}} = 0.1$ s in the case of Fig. 5a. In both the spectral measurement with AL and that with the conventional DoE, the time cost monotonically increases with data size. The slope of the time cost dependence of the conventional DoE is constant because $\Delta E^{\mathrm{CDoE}}$ is constant, as mentioned above. On the other hand, the slope of the time cost dependence of the spectral measurement with AL changes because $\Delta E_n^{\mathrm{AL}}$ changes in each sampling. At the stopping timing of the spectral measurement with AL, an experimenter can observe the whole spectral shape, as shown in Fig. 3d-f. In the conventional DoE, the whole spectral shape appears only after the measurement is finished. The spectral measurement with AL outperforms the conventional DoE, i.e. a lower time cost is achieved, in most cases, except for the case of $t_{\mathrm{meas}}/t_{\mathrm{ene}} = 1$ with the threshold λ = 0.025. A similar tendency is observed for the measurement of Co²⁺ L₂,₃ XAS; however, in other cases, the spectral measurement with AL always realises a lower time cost than that of the conventional DoE, as shown in Supplementary Figs. 7-11.
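The time-cost comparison can be sketched numerically. The explicit sums below are an assumption reconstructed from the verbal definitions, not the paper's equations: each measurement is taken to cost the travel time t_ene·ΔE plus t_meas, and each AL sampling additionally costs t_GP + t_SC. The function names and the example energies are hypothetical.

```python
import numpy as np

def time_cost_al(e_init, e_al, t_ene, t_meas, t_gp, t_sc=0.0):
    """Assumed t_AL: energy travel plus measurement for the initial points,
    then travel, measurement, GPR and stopping-criterion time per AL sampling."""
    e_init = np.asarray(e_init, dtype=float)    # energies in measurement order
    e_al = np.asarray(e_al, dtype=float)
    travel_init = np.abs(np.diff(e_init)).sum()
    path_al = np.concatenate([e_init[-1:], e_al])   # AL starts from the last initial point
    travel_al = np.abs(np.diff(path_al)).sum()
    return (t_ene * travel_init + t_meas * e_init.size
            + t_ene * travel_al + (t_meas + t_gp + t_sc) * e_al.size)

def time_cost_cdoe(n_points, step, t_ene, t_meas):
    """Assumed t_CDoE: a constant energy step and one measurement per point."""
    return n_points * (t_ene * step + t_meas)

# Toy numbers: two initial points, two AL samplings, t_GP / t_ene = 0.1.
cost_al = time_cost_al([0.0, 10.0], [5.0, 2.0], t_ene=1.0, t_meas=1.0, t_gp=0.1)
cost_cdoe = time_cost_cdoe(n_points=4, step=1.0, t_ene=1.0, t_meas=1.0)
```

Under these assumptions the AL cost grows with the back-and-forth energy travel, so its advantage comes from measuring fewer points and becomes larger as t_meas dominates t_ene.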

DISCUSSION
In this paper, we proposed the application of an automated stopping criterion for spectral measurement with AL to enhance the efficiency of spectral measurements. The method was applied to the simulated and experimental Ni L₂,₃ XAS spectra. Predicted spectra with GPR fitting demonstrate satisfactory accuracy at different stopping timings with several thresholds. The method was also applied to XAS spectra other than Ni. The GPR fittings and the data size dependency of the error ratio and the test error for simulated and experimental Mn and Co L₂,₃ XAS are shown in Supplementary Figs. 2-5. These results show a similar tendency to the simulated and experimental Ni L₂,₃ XAS results. Time costs for various XAS spectral measurements with $t_{\mathrm{meas}}/t_{\mathrm{ene}} = 1$ are summarised in Fig. 6. The GPR fittings give a reasonable approximation of the XAS spectra at the automated stopping timing, at which the number of measurements is dramatically reduced compared to the conventional experimental design. It is revealed that the automated stopping criterion works well for the spectral measurement with AL in general.
Based on the estimation of the time cost, the spectral measurement with AL outperforms that with the conventional DoE in many cases, as shown in Figs. 5 and 6. In particular, the advantage of the spectral measurement with AL is emphasised in the cases of a large $t_{\mathrm{meas}}/t_{\mathrm{ene}}$ ratio. Therefore, the spectral measurement with AL is especially effective in measurements with a long measurement time per energy or other scanning parameter, such as the XAS measurement of a very dilute system, e.g. a single molecule on a surface, or inelastic neutron scattering, generally known as a neutron-hungry experiment. It is also effective for experiments such as scanning transmission X-ray microscopy (STXM). In the STXM experiment, a two-dimensional spatial scan is performed at each X-ray energy, so the measurement time per energy becomes much longer than the time for changing the X-ray energy. Thus, the advantage of reducing the number of measurement points by the spectral measurement with AL becomes prominent for experiments with a long measurement time per scanning parameter. Note that the so-called 'on-the-fly' scan is used as a very quick measurement technique, in particular for a single XAS spectral measurement. It is important to choose properly between such techniques and the spectral measurement with AL, depending on the experimenter's purpose, to improve the efficiency of the measurement.
Here, we discuss why the automated stopping criterion for the spectral measurement with AL works. In Fig. 1a, the measurement energy points in the spectral measurement with AL jump backward and forward in the whole spectral energy range. Thus, GPR fitting can approximate the rough shape of the spectrum in the early stage of the measurement, and the generalisation error becomes small. Fine spectral features, such as satellite peaks, appear as the measurement progresses; however, these measurements are not very relevant to the improvement in the test error between different thresholds. In other words, the method works with a balance between 'exploration' and 'exploitation'. Exploratory sampling is more important than exploitative sampling to reduce the generalisation error and stop the experiment with a minimum number of measurements in the spectral measurement with AL. Alternatively, exploitative sampling becomes effective when one wants to measure detailed spectral features. This type of sampling becomes possible by using an acquisition function proposed in the literature 20 . Prior knowledge regarding spectra could potentially be used to design an acquisition function; however, this subject is a topic for future research.
In conclusion, we applied the stopping criterion based on the stability of the expected generalisation errors to the XAS spectral measurement with AL. This stopping criterion can be evaluated from the self-contained information of the GPR fitting. It is revealed that the automated stopping criterion of the spectral measurement gives an approximated XAS spectrum with sufficient accuracy. The implementation applies the state-of-the-art theory of the optimal stopping problem in AL to actual measurements. The proposed method can be applied not only to spectral measurements but also to other types of measurements. An enhancement of the spectral measurement efficiency enables the high-throughput characterisation of materials for the construction of an experimental materials database in the era of materials informatics.

METHODS
Simulation and measurement of X-ray absorption spectra
The simulation of XAS spectra was performed using CTM4XAS software 50 .
Ni, Mn and Co L₂,₃ XAS spectra were calculated for Ni²⁺, Mn²⁺, and Co²⁺ ions. Crystal field parameters were set to $O_h$ symmetry with 10Dq = 1.0. The calculated multiplets were broadened with Lorentzian and Gaussian functions of 0.2 eV half-width at half-maximum each.
The XAS experiment was performed at the BL-19B at the Photon Factory, Institute of Materials Structure Science, High Energy Accelerator Research Organization, Japan 51 . Three types of samples, manganese dioxide (MnO₂) powder and pieces of bulk cobalt (Co) and nickel (Ni), were mounted on the sample manipulator in the vacuum chamber. Mn, Co and Ni L₂,₃ XAS were obtained at room temperature by the total electron yield method, which measures the sample drain current. XAS spectra were obtained by dividing the sample current $I$ by the mirror current $I_0$ to negate the intensity variation in the incident X-ray. All simulated and experimental spectra used in this study are shown in Supplementary Fig. 1.

Fig. 5 Time cost estimation of the Ni L₂,₃ XAS spectral measurement for simulated Ni²⁺. Time cost versus data size is plotted for various ratios between the time to measure a single spectral intensity $t_{\mathrm{meas}}$ and the time to change unit energy $t_{\mathrm{ene}}$: a $t_{\mathrm{meas}}/t_{\mathrm{ene}} = 1$, b $t_{\mathrm{meas}}/t_{\mathrm{ene}} = 10$, c $t_{\mathrm{meas}}/t_{\mathrm{ene}} = 100$ and d $t_{\mathrm{meas}}/t_{\mathrm{ene}} = 1000$. Red and grey curves indicate the time cost for the spectral measurement with AL and that with the conventional DoE, respectively. Red, green, blue and grey circles indicate the time cost at stopping timings of λ of 0.1, 0.05 and 0.025, and the conventional DoE, respectively.

Gaussian process regression
The fundamental idea of spectral measurement with AL is to identify the spectral measurement problem with a supervised curve fitting, i.e. regression, problem. The energy point x is considered as the explanatory variable in the regression, and the corresponding intensity y is the response variable in terms of the regression analysis. In this section, we briefly explain GPR, which was adopted in the present study to realise AL. The details of GPR are thoroughly described in ref. 39 .
For energy points $x_i$, we assume that the intensity $y_i$ at $x_i$ is modelled as $y_i = f(x_i) + \varepsilon_i$, where $\varepsilon_i \sim \mathcal{N}(0, \xi^2)$ is the observation noise. In Gaussian process modelling, the function $f(x)$ itself is assumed to be a random variable under a Gaussian distribution with mean $\mu_0(x)$ and covariance function $K(x, x')$; hence, $y$ is also a realisation of the Gaussian random variable with mean $\mu_0(x)$ and variance $\sigma^2(x) = K(x, x) + \xi^2$. Given a collection of observations $\mathbf{y}$ corresponding to the collection of energy points $X_n = (x_1, \ldots, x_n)$, the mean function and the variance function of the Bayesian posterior distribution are denoted by $\hat{\mu}(x)$ and $\hat{\sigma}^2(x)$, respectively. To obtain the mean and variance values at a newly observed point $x_*$, we consider the joint distribution of $\mathbf{y}$ and $f(x_*)$, which is expressed as
$$\begin{pmatrix} \mathbf{y} \\ f(x_*) \end{pmatrix} \sim \mathcal{N}\left( \begin{pmatrix} \mu_0(X_n) \\ \mu_0(x_*) \end{pmatrix}, \begin{pmatrix} K_{n,n} + \xi^2 I_n & K_{n,*} \\ K_{n,*}^{\top} & K(x_*, x_*) \end{pmatrix} \right),$$
where $K_{n,n} = K(X_n, X_n) \in \mathbb{R}^{n \times n}$ and $K_{n,*} = K(X_n, x_*) \in \mathbb{R}^{n}$. The mean function value of the posterior distribution of $f(x_*)$ is obtained as
$$\hat{\mu}(x_*) = \mu_0(x_*) + k_n(x_*)^{\top} (K_{n,n} + \xi^2 I_n)^{-1} (\mathbf{y} - \mu_0(X_n)),$$
where $k_n(x_*) = (K(x_1, x_*), \ldots, K(x_n, x_*))$. Moreover, the posterior variance at the new energy point $x_*$ is obtained as
$$\hat{\sigma}^2(x_*) = K(x_*, x_*) - k_n(x_*)^{\top} (K_{n,n} + \xi^2 I_n)^{-1} k_n(x_*).$$
The acquisition function (Eq. (11)) is defined using $\sigma_{\max}$ and $\mu_{\max}$, the maximum standard deviation and mean among the successive measurements at times $1, \ldots, t$. The amplitude $\theta_1$ and the bandwidth $\theta_2$ of the covariance function $K(x, x')$ (Eq. (1)) and the noise variance are predetermined by maximising the marginal likelihood of the Gaussian process model for a similar dataset measured in the past by the same device.
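The posterior mean and variance formulas can be sketched with a Cholesky factorisation, which avoids forming the matrix inverse explicitly. This is a minimal illustration under stated assumptions rather than the authors' code: the kernel is taken to be a Gaussian of the form θ₁·exp(−(x − x′)²/θ₂), the prior mean μ₀ is a constant, and the function names and hyperparameter values are hypothetical.

```python
import numpy as np

def sq_exp(a, b, theta1, theta2):
    """Assumed Gaussian kernel: theta1 * exp(-(x - x')**2 / theta2)."""
    return theta1 * np.exp(-((a[:, None] - b[None, :]) ** 2) / theta2)

def gp_posterior(x_obs, y_obs, x_new, theta1=1.0, theta2=1.0, noise_var=1e-6, mu0=0.0):
    """Posterior mean and variance of f at x_new given noisy observations."""
    K = sq_exp(x_obs, x_obs, theta1, theta2) + noise_var * np.eye(x_obs.size)
    k_star = sq_exp(x_obs, x_new, theta1, theta2)          # shape (n, m)
    L = np.linalg.cholesky(K)                              # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs - mu0))
    mean = mu0 + k_star.T @ alpha
    v = np.linalg.solve(L, k_star)
    var = theta1 - np.sum(v ** 2, axis=0)                  # K(x*, x*) = theta1 here
    return mean, np.maximum(var, 0.0)                      # clip round-off negatives

x_obs = np.array([0.0, 1.0, 2.0, 3.0])
y_obs = np.sin(x_obs)
x_new = np.array([0.0, 1.0, 2.0, 3.0, 1.5, 20.0])
mean, var = gp_posterior(x_obs, y_obs, x_new)
```

The variance is close to the noise level at the measured points and returns to the prior value θ₁ far from the data, which is the behaviour that an uncertainty-driven acquisition function relies on.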
Both the simulated and experimental XAS spectra used in the study have 2000 data points in total. In the present implementation of AL, the data points were divided into three parts: the initial sampling, the pool data and the test data for evaluating the generalisation error, whose sizes were set to 10, 900 and 1090, respectively. Therefore, we assumed that the total number of energy points measured in the conventional DoE is the same as the size of the pool data (N = 900).
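The division of the 2000 points into the three parts can be sketched as follows. The sizes come from the text; the use of a random permutation with a fixed seed is an assumption made here for illustration, since the text does not state how the pool and test points were chosen, and the function name is hypothetical.

```python
import numpy as np

def split_indices(n_total=2000, n_init=10, n_pool=900, seed=0):
    """Split point indices into initial, pool and test sets (random split assumed)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_total)
    init = np.sort(perm[:n_init])
    pool = np.sort(perm[n_init:n_init + n_pool])
    test = np.sort(perm[n_init + n_pool:])     # remaining points form the test set
    return init, pool, test

init_idx, pool_idx, test_idx = split_indices()
```

Keeping the test indices disjoint from the initial and pool indices ensures that the test error is evaluated only on energy points that the active learner can never measure.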

DATA AVAILABILITY
The data obtained in this study are available from the authors upon reasonable request.

Fig. 6 Time costs at stopping timings for various XAS spectral measurements. Comparison of time costs for the spectral measurement with AL at automated stopping timing with several thresholds and the conventional DoE for a simulated L₂,₃ XAS spectra of Ni²⁺, Mn²⁺ and Co²⁺ and b experimental L₂,₃ XAS spectra of Ni, MnO₂, and Co. The ratio between the time to measure a single spectral intensity $t_{\mathrm{meas}}$ and the time to change unit energy $t_{\mathrm{ene}}$ is set to $t_{\mathrm{meas}}/t_{\mathrm{ene}} = 1$.
T. Ueno et al.