Introduction

Efficient materials characterization is critical to the design of improved technologies. Microscopic and spectroscopic techniques produce large amounts of data that traditionally require time-consuming analysis by an expert, which limits the rate of materials development and precludes their use in automated workflows1,2,3,4. Recently, machine learning (ML) has been applied to interpret characterization data more rapidly5,6. For example, autoencoders have been developed to segment images from electron microscopy and identify distinct atoms7,8, defects9,10, and microstructures11,12. Deep learning has also found use in spectroscopy, where convolutional neural networks can be trained to identify crystalline phases from X-ray diffraction (XRD) patterns13,14,15 or chemical species from Raman spectra16. While such methods effectively automate the analysis step of materials characterization, an opportunity exists to fundamentally rethink the measurement step by leveraging in-line ML to interpret experimental output as it becomes available and using this information to modify measurements within a closed-loop process that we call adaptive characterization. As will be demonstrated in this work, adaptive characterization can be applied to steer an experiment along the most efficient path toward precise decision making, circumventing the need for iterative experimentation.

There are several notable examples of adaptive characterization techniques developed in recent years. Bayesian optimization has been applied to raster objects in scattering17,18 and electron/probe microscopy19,20,21, leading to reduced measurement time relative to grid-based sampling22. Such methods have also been used to guide measurements toward the verification of scientific hypotheses by designing surrogate models with built-in physical constraints23,24,25. Alternatively, decisions can be made with reinforcement learning, e.g., to regulate the time spent scanning samples at a beamline depending on their scattering strength26. While much of the past work has applied adaptive characterization to fixed samples with unchanging properties, we show in this paper that the use of in-line ML analysis can also enable improved monitoring of dynamic processes where rapid measurements are required to capture transient states.

For structural characterization, XRD is a prime example of a technique that requires fast and precise measurements when applied in-line with experiments. In situ XRD is widely used to monitor reactions and detect the formation of short-lived intermediate phases that often influence the final reaction products27,28,29. Similarly, operando XRD is used to track phase transformations in battery materials during cycling, thus providing mechanistic insight into their performance30,31,32. For either application, XRD scans must be performed quickly enough to capture short-lived intermediate states while also producing high-resolution data that can be analyzed reliably post hoc. These two requirements compete with one another as short scans typically lead to noisy XRD patterns, complicating phase identification. High-brilliance radiation from a synchrotron light source may be used to generate high-resolution patterns with short measurement times28, though access to such facilities is limited to a select number of users and experiments each year. Alternatively, we propose that ML can be used to adaptively develop high resolution around features that matter most for phase identification, even on standard in-house diffractometers. Such highly efficient data collection can be accomplished only by closing the loop between experiments and ML-enabled analysis, such that rapid and mathematically optimized decisions are made autonomously and on-the-fly to acquire signal in areas that provide maximal information to confirm the presence (or absence) of certain phases.

Here we formulate an adaptively steered XRD technique for autonomous phase identification, driven by an ML algorithm based on a convolutional neural network. Uncertainty quantification is used to decide when additional measurements are needed, while class activation map analysis dictates where those measurements are performed. This approach is validated it on three test cases with increasing complexity based on materials from the Li-La-Zr-O and Li-Ti-P-O chemical spaces. These tests reveal that adaptive XRD consistently outperforms conventional methods on both simulated and experimentally acquired patterns, providing more precise detection of impurity phases while requiring shorter measurement times. We further demonstrate that our ML approach can effectively guide XRD measurements for improved in situ characterization of solid-state reactions, with the synthesis of Li7La3Zr2O12 (LLZO) considered as an example. The use of adaptive scans to monitor LLZO synthesis led to the successful identification of a short-lived intermediate phase that would otherwise be missed by conventional measurements. These findings provide a clear proof of concept for adaptive characterization of dynamic processes, highlighting the opportunity for autonomous experiments driven by ML.

Results

Adaptive XRD approach

Figure 1 shows the coupling between XRD and the ML algorithm that performs phase identification and controls the diffractometer. Each adaptive measurement begins with a rapid scan over a narrow range of 2θ = [10°, 60°], which was optimized to conserve scan time while still including enough peaks to make a preliminary prediction regarding which phases are most likely present in the sample. Supplementary Fig. 1 shows that starting from lower angles (10–50°) leads to notably reduced accuracy, while starting from higher angles (10–70°) requires longer scans but does not lead to more accurate phase identification. After performing an initial scan over 10–60°, the pattern is fed to our previously developed deep learning algorithm, XRD-AutoAnalyzer13. This algorithm not only predicts a set of phases for a given sample, but also assesses its own level of certainty such that each phase has an associated confidence ranging from 0 to 100%. Because higher confidence is correlated with more reliable predictions, we use it as a metric to decide when a pattern has sufficient resolution to accurately identify all phases in a sample. A confidence cutoff of 50% is found to provide a good balance between measurement speed and prediction accuracy (Supplementary Note 1 and Supplementary Fig. 2). In cases where the prediction confidence is less than 50%, the ML algorithm can request additional data from the diffractometer in one of two ways:

  1. 1.

    Resampling a subset of 2θ [10°, 60°] with increased resolution (slower scan rate) to clarify specific peaks that lead to maximal confidence improvement,

  2. 2.

    Expandingmax > 60° with a fast scan rate to detect additional peaks.

Fig. 1: A schematic for adaptively driven XRD with autonomous phase identification.
figure 1

After performing a fast initial, the resulting XRD patten is fed to a pre-trained ML model which proposes likely phases. If the confidence associated with any of these phases low (< 50%), the diffractometer is instructed to perform selective rescans around peaks that distinguish the suspected phases. If necessary, the scan range is also expanded to detect additional peaks and boost the prediction confidence of the ML model.

To select which 2θ should be scanned with increased resolution, we make use of Class Activation Maps (CAMs) designed to highlight features that contribute most to the classification(s) made by a deep learning model33. The CAM for a given XRD pattern is calculated as a function of 2θ and is expected to be large in regions containing important peaks for phase identification15. As a result, CAMs tend to be maximal around the most intense peaks in each pattern (Supplementary Fig. 3). However, sampling such features with increased resolution usually reveals little new information (Supplementary Note 2) as the most prominent peaks can already be detected with low-resolution measurements. Therefore, we instead prioritize resampling in areas of 2θ where the difference between the CAMs of the two most probable phases (proposed by XRD-AutoAnalyzer) exceeds a user-defined threshold. This approach ensures that high-resolution scans are used to clarify peaks that distinguish phases with similar XRD patterns.

In cases where there is significant peak overlap between different phases at low 2θ, the scan range can be expanded to reveal additional peaks that assist in disentangling them. However, because measurements carried out at higher 2θ often produce increasingly broad peaks with lower signal-to-noise ratios, they may not always lead to more accurate phase identification. To understand which parts of an XRD pattern provide the most useful information, we use the prediction confidence associated with each phase proposed by XRD-AutoAnalyzer based on 2θ = [10°, 60° + n10°] for n between zero (2θmax = 60°) and eight (2θmax = 140°). The predicted phases from each subset of 2θ are aggregated into an ensemble (Pens), where the prediction confidence is used to form a weighted average as follows:

$$P_{{{{\mathrm{ens}}}}} = \frac{{\mathop {\sum }\nolimits_{10}^{2{\uptheta}_i} c_iP_i}}{{n + 1}}$$
(1)

In this equation, Pi represents each prediction over [10, 2θi], ci is the confidence of that prediction, and n + 1 gives the total number of 2θ-ranges included in the ensemble. In contrast to the typical analysis technique whereby an individual prediction is made based on a given XRD pattern, the ensemble approach decomposes the pattern into several distinct but overlapping regions (Supplementary Fig. 4) from which separate predictions are made and subsequently aggregated using the confidence-weighted sum described in Eq. 1.

The adaptive XRD approach presented here integrates resampling and expansion of 2θ into one single workflow (Fig. 1). Based on early data obtained from a rapid initial scan over 2θ = [10°, 60°], XRD-AutoAnalyzer makes a preliminary prediction regarding which phases are most likely in the corresponding sample. If the confidence associated with this prediction is less than 50%, a selective rescan is performed over regions of 2θ where the difference between the CAMs of the two most probable phases exceeds a threshold of 25%. An updated prediction is made based on the resampled pattern and the confidence is assessed. If it remains less than 50%, higher angles are scanned (+10° at each step) to detect additional peaks. This iterative process of phase identification, resampling, and expansion is repeated until the prediction confidence exceeds 50% or until a maximum angle of 140° is reached. The requirement of 50% confidence is applied to all suspected phases in the mixture, not only the two most probable. In cases where multiple phases have high uncertainty, more than one round of resampling may be performed at each iteration, thus ensuring that the algorithm remains robust on multi-phase samples.

Evaluation of adaptive XRD on simulated patterns

We first evaluated the performance of the adaptive XRD approach in a simulated environment. XRD-AutoAnalyzer was separately trained in two chemical spaces, Li-La-Zr-O and Li-Ti-P-O, which each contain a rich variety of compositions and structures with applications in solid-state batteries34. The algorithm requires a list of previously reported phases to be trained on, and as such, all unique materials occupying these chemical spaces were extracted from the ICSD. This included 28 and 45 stoichiometric phases in the Li-La-Zr-O and Li-Ti-P-O spaces, respectively (Supplementary Table 1), from which a total of 8000 patterns were simulated, including 1400 single-phase, 2400 two-phase, and 4200 three-phase samples. While XRD-AutoAnalyzer is trained only on single-phase patterns, it readily interprets multi-phase samples by iterating between phase identification and peak subtraction following the procedure described in previous work13. In the samples containing more than one phase, weight fractions were randomly sampled between 20 and 80%. To mimic the limitations of data acquired experimentally, all simulated patterns were stochastically modified based on artifacts that include background noise, strain, texture, and small particle size (Methods section). These are commonly observed in real samples and can alter the positions, intensities, and widths of the corresponding diffraction peaks13,35. The signal-to-noise ratio (s/n) is related to the scan time (t) as follows:36

$$s/n = C\sqrt t$$
(2)

Where C is a scaling constant, which for this work was fit to experimental data obtained from XRD scans on a sample of Li2CO3 (Sigma Aldrich) using a Bruker D8 Advance diffractometer (Supplementary Fig. 6). The signal-to-noise ratio used in our simulated tests, in addition to the sampling density of 2θ, dictates the total effective scan time of each pattern. A shorter scan time is more efficient but will generally lead to less accurate phase analysis. To probe this relation, we duplicated all 8000 simulated patterns into 10 distinct datasets with varied sampling density (0.02°–0.04°) and signal-to-noise ratio (20–60), corresponding to an effective scan time ranging from 5 to 30 min.

The effectiveness of adaptive XRD in the simulated environment was first tested by limiting the algorithm to only perform resampling in a fixed range of 2θ = [10°, 60°]. Starting from patterns with minimal resolution (effective scan time of 5 min), XRD-AutoAnalyzer made initial predictions regarding which phases were present in each sample. When the prediction confidence was less than 50%, high-resolution data (effective scan time of 30 min) was added in regions of 2θ where the CAM differences between suspected phases exceeded 25% (Supplementary Fig. 7), and the total effective scan time was proportionally increased. Figure 2a shows (in blue) the F1-score as a function of scan time for phase identification performed by the adaptive algorithm in the Li-La-Zr-O (top panel) and Li-Ti-P-O (bottom panel) spaces. For comparison, we also show the F1-score associated with phase identification based on conventional scans with uniform resolution and iteratively longer measurement time (in black). With either sampling technique, more phases are accurately identified from patterns with higher resolution (longer effective scan time); however, adaptive XRD reaches convergence more rapidly than its conventional counterpart. The F1-score achieved with adaptive sampling exceeds 0.88 in 10–15 min of effective scan time for each pattern, whereas conventional sampling requires 25–30 min per pattern to reach the same level of accuracy. The rapid convergence of adaptive XRD demonstrates that it leverages low-resolution measurement data to effectively build a probability density for the likely phases in each sample, from which it identifies the optimal regions of 2θ that should be prioritized to distinguish these phases. The upper limit of the F1-score observed for both adaptive and conventional sampling can be attributed to the presence of simulated artifacts (e.g, strain and texture) as well as peak overlap between different phases over the current range of 2θ [10°, 60°]. Since these issues are not resolved by reducing background noise, the use of longer scan times (>30 min) leads to only marginal improvement in the F1-scores of conventional and adaptive analyses (Supplementary Fig. 8).

Fig. 2: F1-scores achieved by XRD-AutoAnalyzer when applied to simulated patterns in the (top) Li-La-Zr-O and (bottom) Li-Ti-P-O spaces.
figure 2

a Conventional results (black squares) were obtained on patterns with incrementally improved resolution over 2θ = [10°, 60°], whereas adaptive results (blue circles) were found by resampling a subset of 2θ [10°, 60°] with high resolution. b Individual results (black squares) were calculated by analyzing patterns with distinct maxima (2θmax), which were aggregated in a confidence-weighted sum to form the ensemble predictions (green circles). Adaptive results (blue star) were obtained by halting expansion of 2θmax when the prediction confidence exceeded 50%.

We used the simulated XRD dataset to quantify the extent to which the F1-score can be improved by including information from higher 2θ in the analysis. Only high-resolution patterns (effective scan time of 30 min) were considered to isolate the effects of the scan range. XRD-AutoAnalyzer was applied to each individual range of 2θ = [10°, 60°+n10°] for n between zero (2θmax = 60°) and eight (2θmax = 140°). These are treated separately and referred to as individual predictions hereafter. In comparison, ensemble predictions were formulated by aggregating the phases identified from all patterns available up to 2θmax, as described in the previous section (Eq. 1).

Figure 2b illustrates how the F1-score varies with increasing scan range for patterns in the Li-La-Zr-O (top panel) and Li-Ti-P-O (bottom panel) spaces. Black (green) datapoints represent individual (ensemble) predictions on patterns with 2θmax denoted by the x-axis. The ensemble predictions show a monotonic increase in the F1-score as higher angles are included, consistently outperforming the individual predictions. By aggregating all phases identified up to 2θmax = 140°, exceptionally high F1-scores of 0.98 and 0.95 are achieved. In contrast, scanning higher 2θ does not necessarily lead to better performance for individual predictions, which show a maximal F1-score of 0.91 at 2θmax = 120°, followed by a decreasing score from 120° to 140°. This trend arises from two effects: (1) signal-to-noise ratios decrease at higher 2θ as peaks become less intense, and (2) artifacts related to strain and small particle size cause larger changes to the positions and widths of peaks at higher 2θ. Because the ensemble approach weights each prediction by its associated confidence, it effectively ignores regions where background noise and/or artifacts mask the diffraction peaks, instead giving greater weight to regions where such peaks are more clearly distinguishable.

To keep the total measurement time minimal, higher 2θ should be scanned only when the prediction confidence from XRD-AutoAnalyzer is low. We demonstrate this policy by iteratively expanding 2θ = [10°, 60° + n10°] until the prediction confidence exceeds 50% or until 2θmax reaches 140°. The corresponding results are shown as blue stars in Fig. 2b. With adaptive expansion of 2θ, high F1-scores of 0.98 and 0.95 were obtained on patterns from the Li-La-Zr-O and Li-Ti-P-O datasets, respectively. These match the best F1-scores obtained on a full scan range (2θmax = 140°), while also conserving measurement time as only angles up to 106° were sampled on average, therefore showing that the adaptive algorithm can effectively decide when higher angles are needed distinguish suspected phases.

Performance of adaptive XRD on experimental mixtures

As a more challenging test, the effectiveness of adaptive XRD as applied to impurity detection was evaluated on 240 two-phase mixtures prepared using different physical combinations of eight compounds from the Li-Ti-P-O and Li-La-Zr-O chemical spaces. All compounds were purchased in the form of solid powders and manually mixed such that the weight fraction of the minority phase was varied between 2 and 20% (“Methods”). For each compound, a reference phase from the ICSD was included during the training of XRD-AutoAnalyzer. We compare the effectiveness of determining these minority phases by using: (1) conventional measurements that sampled 2θ = [10°, 80°] in 10 min, followed by automated phase identification with XRD-AutoAnalyzer applied post hoc; or (2) adaptive measurements with in-line phase identification and guided sampling, following the workflow outlined in Fig. 1. Both scan techniques were applied with an Aeris X-ray diffractometer from Panalytical. Their relative performance is assessed using the impurity detection rate, defined as the percentage of phases correctly identified at a given weight fraction.

Figure 3 displays the detection rates for minority phases in the Li-La-Zr-O (Fig. 3a) and Li-Ti-P-O (Fig. 3b) spaces. When XRD-AutoAnalyzer is applied in-line with adaptive measurements, it successfully identifies ≥75% of the minority phases at weight fractions ≥6%, even at short scan rates. In contrast, a much greater weight fraction of 15% is required to reach a detection rate of 75% using a conventional approach. The increased sensitivity of adaptive XRD holds true for all mixtures tested here (Supplementary Fig. 9), as it consistently detects smaller amounts of the minority phases when compared to conventional scans. Furthermore, it does so while using less scan time. As shown by the distributions of scan times in Fig. 3c, adaptive measurements are completed more rapidly than conventional (10 min) ones, requiring an average scan time of only 6 min per pattern. The improved speed and accuracy of in-line, adaptive XRD is derived from two key advantages: (1) It automatically decides whether additional measurements are needed after a rapid initial scan, and if so, focuses high-resolution scans on regions of 2θ that are most likely to contain peaks associated with the suspected minority phases; and (2) it determines when higher 2θ should be scanned to detect additional peaks that help distinguish phases with similar patterns at 2θ < 60°. Two examples demonstrating these capabilities are displayed in Supplementary Fig. 10, which further confirm the benefits of in-line analysis and decision making for optimizing the acquisition of experimental data.

Fig. 3: Detection rates and measurement times required for impurity detection.
figure 3

a, b The percentage of minority phases that were correctly identified by XRD-AutoAnalyzer when applied to mixtures from the Li-La-Zr-O and Li-Ti-P-O chemical spaces, respectively. Results are plotted separately for predictions based on conventional and adaptive measurements. c The distributions of scan times required by adaptive measurements in each space. Each violin plot illustrates the density of scan times and spans the range of the data. In contrast, the embedded boxes extend only from the lower to upper quartiles. Black dots represent the average scan time required in each case. For comparison, the conventional scan time (10 min) is represented by the black dashed line.

Adaptive XRD for in situ characterization

We demonstrate below that the optimized effectiveness by which adaptive XRD collects data can lead to new experimental capabilities. To this end, a solid-state synthesis procedure targeting Li7La3Zr2O12 (LLZO) was designed and carried out37. During the corresponding synthesis experiments, in situ measurements on a Bruker D8 Advance diffractometer were integrated with XRD-AutoAnalyzer to characterize the reaction pathway via the identification of precursors, intermediate phases, and final products. Such in situ measurements are particularly demanding with respect to the tradeoff between acquisition time and data resolution, as fast reactions and transient intermediate phases can easily be missed when a long acquisition time is used. A precursor powder mixture of La(OH)3, Li2CO3, and ZrO2 was placed in an Anton Paar HTK 1200 N oven chamber and heated to 1100 °C at a rate of 20 °C/min, followed by a 1 h hold at this final temperature. During heating, XRD scans were performed at the onset of a 10 min hold every 100 °C. Three syntheses were separately carried out using distinct measurement techniques (“Methods”):

  1. 1.

    Fast, non-adaptive scans that sampled 10–80° in 1 min.

  2. 2.

    Slow, non-adaptive scans that sampled 10–80° in 10 min.

  3. 3.

    Adaptive scans with varied 2θmax and measurement times.

Whereas ten patterns can be obtained at each hold when using fast scans (case 1), only one slow scan is performed (case 2). The number of patterns measured with adaptive scans (case 3) varied with temperature, as longer scan times were automatically allocated to samples where phase identification was complicated by a poor signal-to-noise ratio.

Fig 4 shows the weight fractions for all phases identified by XRD-AutoAnalyzer during the synthesis of LLZO when using different scan techniques. These results show several limitations of conventional XRD. With fast scans, Li2CO3 is not detected as its peaks are difficult to resolve from the background noise (Supplementary Fig. 11). The low resolution from fast scanning also precludes the identification of LaOOH, which appears as an intermediate phase between La(OH)3 and La2O338. As shown in Fig. 4d, the poor signal-to-noise ratio resulting from a short scan obscures several peaks associated with LaOOH, making it difficult to resolve this phase from others (e.g., La2O3). While a longer scan time of 10 min enables the detection of Li2CO3 by reducing noise in the corresponding pattern (Fig. 4b), LaOOH is still missed. Interestingly, the 10 min scans do clarify several low-2θ peaks associated with LaOOH but fail to detect many of its peaks at higher 2θ (Fig. 4e), suggesting that LaOOH transformed before the full range of 2θ was sampled. These findings highlight two competing factors that dictate the accuracy of in situ XRD measurements: (1) fast scans yield patterns with low signal-to-noise ratios, complicating the identification of small peaks that blend in with the background noise; (2) slows scans suffer from pattern changes as the measurements are performed, making it difficult to identify short-lived intermediate phases that transform before all their peaks can be detected.

Fig. 4: In situ identification of phases formed during LLZO synthesis.
figure 4

The weight fractions are plotted separately for all phases detected from a fast 1 min scans, b slow 10 min scans, and c adaptive scans. A short-lived intermediate phase, LaOOH, is detected only with adaptive measurements. Panels df illustrate XRD patterns obtained from the blue highlighted regions in panels ac, which each should contain LaOOH. d A fast scan misses low-2θ peaks from this phase owing to a poor signal-to-noise ratio. e A slow scan misses high-2θ peaks as LaOOH transforms before the measurement is complete. f An adaptive scan successfully detects all peaks from this phase by performing a rapid scan over [10°, 60°], followed by resampling of [14°, 27°] to clarify the smaller peaks.

Adaptive XRD overcomes the challenges described in the previous paragraph by achieving an optimal balance between speed and accuracy. As shown in Fig. 4c, Li2CO3 is successfully identified with a short scan time of ~3 min as the adaptive algorithm leverages early data to focus high-resolution measurements on a subset of 2θ = [18°, 32°] that contains the major peaks associated with this phase (Supplementary Fig. 11). Note that the algorithm is given no prior information regarding the presence of Li2CO3, but it quickly detects some signal above the noise in the relevant area and accordingly requests additional scanning to better resolve that signal. Adaptive XRD also leads to the successful detection of LaOOH, appearing briefly as an intermediate phase at 400 °C. In Fig. 4f, we show how the diffractometer was steered toward 2θ = [14°, 27°] at this temperature, revealing several LaOOH peaks that would otherwise be difficult to resolve from the background noise. Furthermore, because the total measurement time was kept short (~4 min), the full range of 2θ = [10°, 60°] was sampled before LaOOH transformed into La2O3. We stress that no human intervention was needed to redirect the diffractometer as the ML algorithm autonomously decides which parts of a pattern are most important for phase identification and, accordingly steers measurements toward those regions. In doing so, adaptive XRD enables the identification of short-lived intermediate phases that otherwise would be missed by conventional XRD scans. Knowledge regarding such intermediate phases is often key to understanding and tailoring reaction pathways for inorganic materials synthesis29.

Discussion

We believe that the integration of ML-assisted analysis tools can rapidly transform how experimental research is done. In contrast to traditional experimentation, where data is only analyzed after the fact, adaptive methods leverage all available data in real time to make optimal decisions regarding where the experimental measurements should be steered, and as such minimize the time required to obtain all relevant information. These autonomous and adaptive methods require (a) the development of rapid analysis tools that can make predictions, quantify uncertainty, and identify high-value measurement regimes, all on the timescale of the experiment; and (b) the ability to bring this analysis in-line with experiments and control the instruments needed for characterization. Modern ML techniques, while often requiring significant time for training performed off-line, can usually be evaluated within seconds and are therefore ideal decision-making agents to be integrated with experimental hardware.

We demonstrate in this work specifically, that ML-driven adaptive control over XRD measurements enables rapid and autonomous identification of crystalline materials in multi-phase samples, consistently detecting and categorizing phases more quickly and with higher accuracy than conventional XRD scans. By reducing the measurement time while maintaining high precision, adaptive XRD provides an effective method to monitor solid-state reactions in situ and identify short-lived intermediate phases using an in-house diffractometer. Although such instrumentation provides reduced intensity relative to a synchrotron light source, adaptive measurements make efficient use of the available radiation by rationally allocating scan time to resolve peaks with the highest leverage for phase identification. This approach is generalizable and may be extended to alternative diffraction techniques based on neutron or electron scattering. With future developments, we envision that many spectroscopic and microscopic techniques are likely to benefit from ML guidance and interpretation, enabling optimized selection and refinement of critical features in spectra and images. Increased automation of experimental measurements will not only reduce time and labor spent by human researchers3, but also give unprecedented access to the characterization of short-lived processes in materials science and chemistry.

Methods

Automated phase identification

To automatically identify phases from XRD patterns, we used the XRD-AutoAnalyzer algorithm developed in previous work13 and made available online (https://github.com/njszym/XRD-AutoAnalyzer). This approach is based on a convolutional neural network (CNN) with six convolutional layers, six pooling layers, and three fully connected layers. A dropout rate of 60% was used between the fully connected layers. As input to the CNN, each XRD pattern is treated as a one-dimensional vector with 4501 intensities distributed uniformly over 2θ. The output layer contains N neurons, where N is equal to the number of phases in the training set. Here we trained two separate models to analyze data from the Li-La-Zr-O and Li-Ti-P-O chemical spaces, which included 28 and 45 unique phases, respectively (Supplementary Table 1). For each phase, 150 XRD patterns with stochastically varied peak positions, widths, and intensities were simulated and used to train the CNN. Training was carried out for 50 epochs. At inference, we divided the trained CNN into an ensemble of 1000 individual models whereby each utilized different connections between its fully connected layers (i.e., with 60% dropout). The result is a probabilistic distribution of predicted phases for a given pattern, where the confidence of each phase is defined as the fraction (%) of models in the ensemble that identify it as the most likely phase. Additional details on the phase identification algorithm are given in our previous work13.

Class activation maps

Class Activation Maps (CAMs) were originally designed to highlight areas in an image that have the greatest influence on a CNN’s output33. This can be accomplished by mapping the trained weights of a global average pooling layer, placed after all the convolutional layers in a CNN, onto the final image such that important convolution features have high values in the CAM. Here we use a generalized version known as Grad-CAM39, which has capabilities similar to the traditional CAM approach but does not require a global average pooling layer in the CNN. Using this technique, we calculated the CAM associated with the classification of an ideal (simulated) XRD pattern for each reference phase in the training sets. All CAMs were normalized between 0 and 100 to ensure consistent comparison between different phases. In cases where XRD-AutoAnalyzer failed to identify a phase with a confidence greater than 50%, the absolute difference between the CAMs of the two most probable phases was calculated and resampling was proposed in areas where the difference exceeds some threshold defined by the user. A threshold of 25% was used during all experimental tests described in the main text, where the % is calculated relative to the maximum value of the CAM (i.e., 100).

Simulated test patterns

A total of 1400 single-phase, 2400 two-phase, and 4200 three-phase patterns were simulated to test the adaptive XRD approach in high-throughput. These patterns were based on 28 and 45 unique crystalline phases from the Li-La-Zr-O and Li-Ti-P-O spaces, respectively (Supplementary Table 1). All structures were extracted from the Inorganic Crystal Structure Database (ICSD). Multi-phase patterns were constructed via linear combinations of single-phase peaks, where the weight fraction of each individual phase was randomly sampled between 20 and 80%. To mimic data acquired experimentally, all patterns were stochastically augmented with three different artifacts. Peak shifts caused by strain were implemented with up to ±3% changes in the lattice parameters of each phase. Peak intensities were varied by as much as ±50% according to preferred crystallographic orientation (texture) along randomly sampled Miller indices ([hkl] where 0 ≤ h k l ≤ 2). Different peak widths were sampled using the Scherrer equation based on grain sizes ranging from 5 nm (broad) to 50 nm (narrow). A gaussian shape was assumed for all peaks.

All 8000 simulated patterns were duplicated to form 10 different datasets with varied sampling density and signal-to-noise ratio. The former was set by the number of datapoints contained in each pattern while the former was treated by adding Gaussian noise with a standard deviation reflecting the effective measurement time (see Eq. 2). The patterns with minimal resolution contained 3250 datapoints spanning 10°–140° (∆2θ = 0.04°), and Gaussian noise was added with a standard deviation of 5% (in terms of the maximum peak intensity). In contrast, the highest resolution patterns contained 6500 datapoints (∆2θ = 0.02°) spanning the same range, in addition to Gaussian noise with a standard deviation of only 2%. To mimic experimental resampling with increased resolution, which would be performed by adaptive XRD, we start from the low-resolution pattern and splice in data from the corresponding high-resolution pattern in regions of 2θ where artificial resampling is performed.

XRD-AutoAnalyzer was used to perform phase identification on the simulated XRD patterns described in the previous paragraph. To quantify the performance of this algorithm when applied autonomously to each dataset, we used the F1-score:

$${{{\mathrm{F}}}}_{1} = \frac{{{{{\mathrm{TP}}}}}}{{{{{\mathrm{TP}}}} + \frac{1}{2}\left( {{{{\mathrm{FP}}}} + {{{\mathrm{FN}}}}} \right)}}$$
(3)

Where TP is the number of true positives (correctly identified phases), FP is the number of false positives (phases incorrectly identified), and FN is the number of false negatives (missed phases). A high F1-score (close to 1) is desired to successfully identify all phases in a sample without incorrectly identifying phases that are not present.

Two-phase mixture preparation

Mixtures were prepared based on materials in two chemical spaces: Li-La-Zr-O and Li-Ti-P-O. These included Li2CO3 (Sigma-Aldrich, 99.9%), LiOH (Sigma-Aldrich, 98%), La(OH)3 (Sigma-Aldrich, 99.9%), ZrO2 (Sigma-Aldrich, 99.6%), TiO2 (Alfa Aesar, 99.9%), Li2TiO3 (Sigma-Aldrich, 99.9%), and Li3PO4 (Sigma-Aldrich, 99.9%). There are 12 possible two-phase majority|minority permutations of the materials in each chemical space (e.g., TiO2|Li2CO3 and Li2CO3|TiO2), where the first phase to appear is the majority phase and the second is the minority phase. For each of these two-phase pairs, 10 mixtures were prepared with iteratively larger amounts of the minority phase. This included weight fractions of 2, 4, 6, 8, 10, 12, 14, 16, 18, and 20% for the minority phase in each permutation. All mixtures were shaker-milled for 10 min with a SPEX 800 mixer, followed by characterization with an Aeris X-ray diffractometer from Panalytical.

In situ characterization of Li7La3Zr2O12 synthesis

To synthesize Li7La3Zr2O12 (LLZO), we used a precursor powder mixture containing Li2CO3 (Sigma-Aldrich, 99.9%), La(OH)3 (Sigma-Aldrich, 99.9%), ZrO2 (Sigma-Aldrich, 99.6%). In addition to the stoichiometric amounts of these precursors needed to make LLZO, 10% excess weight of Li2CO3 was included to compensate for suspected volatility at high temperature. These precursors were mixed with ethanol and milled for 10 min using a SPEX 800 mixer, followed by drying at 70 °C in an oven for one hour. The dried sample was loaded into the Anton Paar HTK 1200 N oven chamber of a Bruker D8 Advance X-ray diffractometer, which was heated to 1000 °C at a rate of 20 °C/min in air. A hold time of one hour was used at 1000 °C, followed by a natural cool to room temperature. During the heating ramp, a 10 min temperature hold was imposed every 100 °C such that XRD scans could be performed on the sample.

Three different syntheses were carried out, each with a distinct measurement type. First, slow and non-adaptive measurements were employed whereby a single 10 min scan was performed at each 100 °C hold. Second, fast and non-adaptive measurement were applied such that 10 one min scans were performed at each 100 °C hold. Third, adaptive measurements were used with varied scan time and number of scans applied to each 100 °C hold. On average, adaptive scans required ~3 min per pattern. For both types of non-adaptive measurements, the scan range was kept fixed at 2θ = [10°, 80°]. In contrast, adaptive scans kept a fixed minimum angle of 10°, but varied the maximum angle between 60° and 140° following the workflow described in the main text and illustrated in Fig. 1. All patterns were analyzed in an automated fashion using XRD-AutoAnalyzer. The corresponding model was trained on 28 phases in the Li-La-Zr-O chemical space (Supplementary Table 1).