Automatic identification of crystal structures and interfaces via artificial-intelligence-based electron microscopy

Leitherer, Andreas; Yeo, Byung Chul; Liebscher, Christian H.; Ghiringhelli, Luca M.

doi:10.1038/s41524-023-01133-1

Article
Open access
Published: 02 October 2023

npj Computational Materials volume 9, Article number: 179 (2023)

Abstract

Characterizing crystal structures and interfaces down to the atomic level is an important step for designing advanced materials. Modern electron microscopy routinely achieves atomic resolution and is capable of resolving complex arrangements of atoms with picometer precision. Here, we present AI-STEM, an automatic, artificial-intelligence-based method for accurately identifying key characteristics from atomic-resolution scanning transmission electron microscopy (STEM) images of polycrystalline materials. The method is based on a Bayesian convolutional neural network (BNN) that is trained only on simulated images. AI-STEM automatically and accurately identifies crystal structure, lattice orientation, and location of interface regions in synthetic and experimental images. The model is trained on cubic and hexagonal crystal structures, yielding classifications and uncertainty estimates, while no explicit information on structural patterns at the interfaces is included during training. This work combines principles from probabilistic modeling, deep learning, and information theory, enabling automatic analysis of experimental, atomic-resolution images.

Introduction

Distinct crystal structures, surfaces, and interfaces in bulk as well as nanomaterials play a key role in tailoring desirable properties in many applications, e.g., catalysis or energy conversion and storage^1,2,3. In particular, exposed surface structures in catalysts determine catalytic performances comprising activity and selectivity⁴. Furthermore, interfaces such as grain boundaries or stacking faults can largely affect the transport properties in energy storage or conversion devices^5,6,7,8,9,10. For example, grain boundaries serve as ion migration paths in batteries^6,7, act as scattering sites for phonons in thermoelectric devices^5,8, and could degrade electronic conductivity in solar cells^9,10. To engineer advanced materials for such applications, it is necessary to characterize their crystalline structure down to the atomic level, including defects or interfaces, local lattice orientations, and distortions^11,12,13. Currently, the ultimate tool to probe imperfections in crystalline materials is electron microscopy.

To date, electron microscopy techniques with aberration correction have been developed for investigating microstructures of materials with atomic spatial resolution. In particular, scanning transmission electron microscopy (STEM) images are more readily interpretable than images obtained via high resolution transmission electron microscopy (HR-TEM), due to the direct correlation between image contrast and the atomic number Z of the observed species¹⁴. In STEM, a focused, high-energy electron beam passes through a thin, electron-transparent sample. The electrons interact with the atoms in the sample and are scattered both elastically and inelastically, which makes it possible to image the sample through various detector geometries (e.g., bright field (BF), dark field (DF), annular dark field (ADF), as well as high-angle annular dark field (HAADF)) and to probe it through spectroscopic techniques (e.g., electron energy-loss spectroscopy (EELS) and energy-dispersive X-ray spectroscopy (EDS))^15,16,17,18. The most commonly employed technique to image atomic structures and crystalline defects is HAADF-STEM, where electrons scattered to large angles are collected by an annular detector, forming an incoherent image. Moreover, a variety of data channels can be collected simultaneously with high-speed detectors, but as of today the wealth of information available in STEM is not fully exploited, due to the lack of versatile, automatic analysis tools^19,20.

Big-data analytics and artificial intelligence (AI) have great potential for analyzing large electron-microscopy data, with several applications to various datasets being reported^{20,21,22,23,24,25,26}. Such methods are introduced to uncover overlooked characteristics and this way drive a paradigm shift in image analysis and design of descriptors of atomic-resolution data. To provide a few examples, space-group classification was proposed based on electron imaging and diffraction datasets²¹. Also, multivariate statistical techniques were employed to extract structural information such as the crystal structure and orientation of a small sample region from complex four-dimensional STEM datasets²⁴. Detection and assignment of microstructural characteristics that differ from the vast majority of crystalline regions and phases in STEM datasets has been performed, e.g., the identification of the local dopant distribution in graphene^22,25, or monitoring of electron-beam induced phase transformations²³. One can also train AI methods to assign two-dimensional (2D) Bravais lattices to STEM or scanning tunneling microscopy (STM) images^23,27. A further approach is to reconstruct the real-space lattice from atomic-resolution images^25,28,29, providing real-space information that can be analyzed with structure-identification methods that are based, for instance, on graphs²⁵ or structural descriptors³⁰. Unsupervised learning for defect detection or chemical-species classification is reported, for instance, in^31,32. The above approaches rely heavily on recent developments in deep learning³³. Properly trained neural networks (NNs) such as convolutional neural networks (CNNs) have been shown to solve image classification problems more accurately than other machine-learning methods and in particular, more efficiently than humans, especially in high-throughput tasks.

Here, we propose AI-STEM, which stands for Artificial-Intelligence Scanning Transmission Electron Microscopy. AI-STEM automatically identifies projected crystal symmetry and lattice orientation as well as the location of defects such as grain boundaries in STEM images. Both synthetic and experimental images can be processed directly and automatically; no reconstruction of real-space lattices is required. We employ a Fourier-space descriptor, termed FFT-HAADF (FFT: Fast Fourier Transform), as input for a CNN. The deep-learning model classifies a given image into a selection of crystalline regions that differ not only by crystal symmetry but also orientation. This provides additional information compared to, for instance, the classification of a given image into the five Bravais lattices that exist in two dimensions. In particular, we propose an efficient training scheme that enables fast retraining and extension of the method. The model is trained on simulated images only, achieving near-perfect accuracy on both training and test data (in total 31,470 data points, see “Methods”). The training data contains typical noise sources that are encountered in experiment. Notably, we adopt the Bayesian neural-network (BNN) approach, employing the Monte Carlo dropout framework that was originally developed by Gal and Ghahramani³⁴. BNNs not only classify a given input but also provide uncertainty estimates. We exploit this additional information, which is absent in standard deep-learning models, to locate bulk regions as areas of low and interfaces as areas of high model uncertainty. This way, AI-STEM can identify defects without being explicitly informed about them during training. The identification of bulk and interface regions is related to semantic segmentation, a popular computer-vision task in which each image pixel is classified in order to locate individual objects³⁵. Based on AI-STEM’s bulk-versus-interface segmentation, further analysis can be conducted where it is meaningful according to the model. For instance, we demonstrate how the local lattice rotation can be calculated in the detected bulk regions. Finally, we employ unsupervised learning to visualize the high-dimensional NN representations in an interpretable, two-dimensional map. This reveals that the model separates not only crystalline grains with different symmetry but also different types of interfaces, despite never being explicitly instructed to do so. All code and data are made publicly available.

Results

Development of an automated classification procedure

Our goal is to develop an automatic framework for analyzing experimental HAADF-STEM images of bulk materials such as shown in Fig. 1a: in this image, the bulk crystalline regions are separated by a grain boundary (the interface region). The final prediction as shown in Fig. 1f should classify the image into bulk and interface regions, while also obtaining information about the bulk symmetry and lattice orientation. Here, the bulk region should be labeled as “fcc 111”, i.e., face-centered cubic symmetry in [111] orientation, since both grains are viewed along their common [111] zone axis corresponding to the tilt axis of the grain boundary. Finally, AI-STEM’s predictions can be used to automatically identify where to calculate additional properties that provide further characterization, for instance, of the bulk regions and their local lattice rotation (cf. Fig. 1g). In the following, we explain the intermediate steps that are required to map image input (Fig. 1a) to a characterization such as shown in Fig. 1f.

**Fig. 1: Schematic overview of the AI-STEM procedure for analyzing experimental STEM images.**

Fourier-space representation of atomic-resolution images

To achieve sensitivity to the substructure in an image such as shown in Fig. 1a, we divide it into local fragments (cf. Fig. 1b). Specifically, a sliding window of predefined size is scanned over the whole image and a local patch is extracted at each stride. This makes it possible to investigate structural transitions, e.g., between bulk and interface regions, in a smooth fashion. The selection of stride and window size is discussed in the Methods section. Each of the local patches is then transformed into reciprocal space by computing a Fourier-space descriptor (cf. Fig. 1c). Essentially, the fast Fourier transform (FFT) is calculated with additional pre- and post-processing steps (see Methods). We term this descriptor FFT-HAADF and use it as input for the machine-learning classification model. By calculating the Fourier transform, information on the lattice periodicity is enhanced, thus providing a starting point for a machine-learning model that can be generalized to other imaging modalities that provide atomic-resolution information, such as HR-TEM or STM. In addition, translational invariance is introduced already at the level of the representation. The descriptor is not rotationally invariant, which is why we employ data augmentation, as we will explain in the section “Training data generation”.
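The sliding-window extraction and Fourier-space representation described above can be sketched as follows. This is a minimal illustration, not the FFT-HAADF implementation: the function names, the Hann window, the mean subtraction, and the min-max scaling are assumptions standing in for the pre- and post-processing steps that are specified in Methods.

```python
import numpy as np

def extract_patches(image, window, stride):
    """Slide a square window over a 2D image; return patches and their top-left corners."""
    patches, corners = [], []
    rows, cols = image.shape
    for r in range(0, rows - window + 1, stride):
        for c in range(0, cols - window + 1, stride):
            patches.append(image[r:r + window, c:c + window])
            corners.append((r, c))
    return np.array(patches), corners

def fft_descriptor(patch):
    """Hypothetical stand-in for FFT-HAADF: centred FFT amplitude scaled to [0, 1]."""
    # Assumed pre-processing: subtract the mean and taper the edges with a Hann window.
    taper = np.outer(np.hanning(patch.shape[0]), np.hanning(patch.shape[1]))
    spectrum = np.fft.fftshift(np.fft.fft2((patch - patch.mean()) * taper))
    amplitude = np.abs(spectrum)
    # Assumed post-processing: min-max normalization.
    span = amplitude.max() - amplitude.min()
    if span == 0:
        return np.zeros_like(amplitude)
    return (amplitude - amplitude.min()) / span

# Toy image: a periodic cosine pattern stands in for a lattice of atomic columns.
y, x = np.mgrid[0:64, 0:64]
image = np.cos(2 * np.pi * x / 8) + np.cos(2 * np.pi * y / 8)

patches, corners = extract_patches(image, window=32, stride=16)
descriptors = np.array([fft_descriptor(p) for p in patches])
print(descriptors.shape)
```

In this toy case the stride is a multiple of the lattice period, so all patches are identical and so are their descriptors. For general shifts, the amplitude spectrum changes only slightly because the taper breaks exact periodicity.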

The Bayesian classification model

To define the classification task, we need to specify the target labels as well as the model that maps the FFT-HAADF descriptor to the corresponding target labels. As classification model, we employ a CNN. This machine-learning method is well known for its record-breaking performance in image classification^36,37 and is thus well suited to our problem setting. The model receives the FFT-HAADF image descriptor as input and assigns the symmetry (e.g., face-centered cubic) and lattice orientation (e.g., [111], cf. Fig. 1d, e). We select in total 10 different crystalline surface structures into which a given image is classified (cf. Fig. 2a). This includes the most common crystal structures appearing in metals, comprising face-centered cubic (fcc), body-centered cubic (bcc), and hexagonal close-packed (hcp) structures. We focus on low-index crystallographic orientations, which can be resolved at atomic resolution, as the projected interatomic distances are well within the resolution limit. The selected orientations are also based on mono-species metal systems for each of the crystal structures considered here: copper (Cu) for fcc, iron (Fe) for bcc, and titanium (Ti) for the hcp structure. The CNN consists of a sequence of convolutional, pooling, and fully connected layers (cf. Fig. 2b). The last layer is composed of 10 neurons, each corresponding to one of the surface classes. In particular, the output neurons are normalized such that each represents the classification probability for one of the 10 surface structures. For a given image, the most likely class corresponds to the predicted label. In the complete AI-STEM workflow, the CNN is applied to each local window, providing a classification for each local segment (Fig. 1e).

**Fig. 2: Image descriptor and convolutional neural network (CNN) model for classification of STEM HAADF images.**

In general, beyond classification, it is desirable to estimate the model uncertainty. This makes it possible to assess how much one can trust a specific prediction, especially in situations that differ from the training set. This can be useful in various scenarios, e.g., for autonomous driving³⁸ or medical diagnosis³⁹. In our case, we train the model only on perfect crystal structures with periodic arrangements of atomic columns and use the uncertainty in the classification to identify the presence of structural defects. Given the large number of degrees of freedom for any defect, creating a library of potentially interesting defects for training is challenging, which is why we take a different approach: we use a Bayesian neural network^34,40, which not only classifies a given (local) HAADF-STEM image but also provides uncertainty estimates of the classification. If the uncertainty is high (low), the image is likely (unlikely) to deviate from the perfect crystal structure (on which the model is trained) and could contain a crystal defect, a secondary phase with different crystal symmetry, or even amorphous regions. This way, we can identify the host crystal structure and orientation and, at the same time, locate regions in the image that differ from any of the training classes; in this work, we consider grain boundaries as an example. One may be tempted to interpret the classification probabilities from the last CNN layer as being informative about model uncertainty. However, high classification probability does not always correlate with low uncertainty. In particular, standard NNs are known for overconfident extrapolations, even for points that are far outside the training set^34,40. Modeling of predictive uncertainty can be improved by constructing a probabilistic model that provides a distribution of predictions rather than a single, deterministic one.

In order to estimate uncertainty in deep learning models, distributions are placed over the NN weights, resulting in probabilistic outputs, instead of considering a single set of NN parameters as done in the standard approach, which results in deterministic predictions. More formally, a standard NN is a non-linear function $f_{\boldsymbol{\omega}}:\mathcal{X}\to\mathcal{Y}$, i.e., a mapping from input to output space that is parametrized by parameters ω (a set of weights and biases $\boldsymbol{\omega}:=\{\mathbf{W}_{l},\mathbf{b}_{l}\}_{l=1}^{L}$, where L is the number of layers). After training a model on data D_train, inference of a target y (here: a class label) for a new point x (here: the FFT-HAADF image descriptor) is calculated via

$$p(y\mid \mathbf{x},D_{\text{train}})=\int p(y\mid \mathbf{x},\boldsymbol{\omega})\,p(\boldsymbol{\omega}\mid D_{\text{train}})\,d\boldsymbol{\omega}.$$

(1)

In this expression, p(ω∣D_train) denotes the posterior that indicates how likely a set of parameters is given training data D_train. Moreover, the likelihood p(y∣x, ω) corresponds to the softmax activation function, a standard approach to normalize the outputs of the final layer such that they can be interpreted as classification probabilities:

$$p(y=c\mid \mathbf{x},\boldsymbol{\omega})=\frac{\exp\left([f_{\boldsymbol{\omega}}(\mathbf{x})]_{c}\right)}{\sum_{c'}\exp\left([f_{\boldsymbol{\omega}}(\mathbf{x})]_{c'}\right)},$$

(2)

where $[f_{\boldsymbol{\omega}}(\mathbf{x})]_{c}$ is the output value of the NN for the class c. We see in Eq. (1) that instead of a single hypothesis, all parameter settings weighted by their posterior probabilities are included during inference. The standard approach would correspond to choosing the posterior as a delta distribution over a specific parameter setting, resulting in the above-mentioned overconfident predictions in out-of-distribution scenarios. Evaluating integrals over the whole parameter space, as appearing in Eq. (1), is practically impossible, especially for large deep learning models. Fortunately, approximation tools for evaluating Eq. (1) are available.
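As a small numerical illustration of Eq. (2), the softmax normalization can be written in a few lines. The ten logit values below are made up for the example, and subtracting the maximum before exponentiating is a common numerical-stability choice rather than something prescribed by the text.

```python
import numpy as np

def softmax(logits):
    """Map raw network outputs to classification probabilities, as in Eq. (2)."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp.
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp_values = np.exp(shifted)
    return exp_values / exp_values.sum(axis=-1, keepdims=True)

# Hypothetical raw outputs of the last layer for the 10 surface classes.
logits = np.array([2.0, 1.0, 0.1, -1.0, 0.0, 0.5, -0.5, 1.5, 0.3, -2.0])
probs = softmax(logits)
predicted_class = int(np.argmax(probs))  # the most likely class is the predicted label
print(predicted_class, probs.sum())
```

A single forward pass of this kind corresponds to the delta-distribution posterior mentioned above: it yields one probability vector and carries no information about how that vector would change under other plausible parameter settings.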

One way to approximate Bayesian inference in deep learning models (i.e., Eq. (1)) is Monte Carlo (MC) dropout^34,40. This approach is principled in the sense that the uncertainty estimates from MC dropout approximate those of a Gaussian process⁴⁰. In more detail, the method builds on dropout^41,42, a regularization technique that is usually used to avoid overfitting by dropping individual neurons during training. This way, the model has to compensate for the loss of individual neurons, which prevents the neural activation from concentrating in local parts of the network. It has been shown that powerful uncertainty estimates can be obtained by using dropout not only during training but also at test time³⁴. Specifically, for a given input, the output layer is sampled for a certain number of iterations T, where each sample is calculated from a different network that is perturbed according to the dropout algorithm. To obtain a Bayesian CNN, dropout is applied after each convolutional and fully connected layer (see the yellow blocks in Fig. 2b). Classification can then be performed by calculating a simple average, i.e., the probability of class c given input x and training data D_train (whose general expression is shown in Eq. (1)) can be approximated as

$$p(y=c\mid \mathbf{x},D_{\text{train}})\approx \frac{1}{T}\sum_{t=1}^{T}p(y=c\mid \mathbf{x},\boldsymbol{\omega}_{t}).$$

(3)

Here, p(y = c∣x, ω_t) (defined in Eq. (2)) denotes the classification probability of class c given input x and parameter configuration ω_t that is obtained by random removal of neurons (defined according to the dropout algorithm). A modest number of samples typically suffices⁴⁰; in this work, we employ T = 100 samples. Notably, this process is in principle trivial to parallelize. We discuss more details on computation time and choice of T in the Supplementary Information (cf. Supplementary Fig. 4). Beyond the simple average in Eq. (3), additional information about the model confidence is contained in the collection of samples p(y = c∣x, ω_t). To extract it, we invoke information theory, specifically mutual information. This (scalar) quantity provides a means to quantify the uncertainty and has been employed in different settings including self-driving cars⁴³ as well as crystal-structure identification³⁰. The mutual information is defined between predictive and posterior distribution and is denoted as I(ω, y∣D_train, x) (see “Methods” for the exact definition). Intuitively, it can be understood as the information gained about the model parameters ω if one were to receive the label y for a new point x. Thus, if the mutual information is high for a given data point, one would gain information once the label is specified, which corresponds to high predictive uncertainty. Similar to Eq. (1), integrals over the whole parameter space appear, which are computationally intractable. However, using MC dropout, one can find a tractable expression that only involves summations over all classes and samples³⁴ (“Methods”).
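The Monte Carlo average of Eq. (3) and a sample-based uncertainty estimate can be sketched with a toy network. Everything specific here is an assumption for illustration: the one-hidden-layer network with random weights, the layer sizes, and the dropout rate are hypothetical, and the mutual-information estimator used (entropy of the mean prediction minus the mean entropy of the individual samples) is the form commonly used with MC dropout, whereas the exact expression employed in this work is given in Methods.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mc_dropout_samples(x, W1, W2, drop_rate, T, rng):
    """Return T stochastic softmax outputs, each computed with a fresh dropout mask."""
    samples = []
    for _ in range(T):
        hidden = np.maximum(x @ W1, 0.0)               # ReLU hidden layer
        keep = rng.random(hidden.shape) >= drop_rate   # keep a neuron with prob. 1 - drop_rate
        hidden = hidden * keep / (1.0 - drop_rate)     # inverted-dropout rescaling
        samples.append(softmax(hidden @ W2))
    return np.array(samples)                           # shape (T, n_classes)

def entropy(p, eps=1e-12):
    return -np.sum(p * np.log(p + eps), axis=-1)

def mutual_information(samples):
    """Entropy of the averaged prediction minus the average entropy of the samples."""
    return entropy(samples.mean(axis=0)) - entropy(samples).mean()

# Hypothetical toy sizes: 16 input features, 32 hidden neurons, 10 classes.
W1 = rng.normal(size=(16, 32))
W2 = rng.normal(size=(32, 10))
x = rng.normal(size=16)

samples = mc_dropout_samples(x, W1, W2, drop_rate=0.2, T=100, rng=rng)
mean_prediction = samples.mean(axis=0)   # Monte Carlo average, Eq. (3)
mi = mutual_information(samples)
print(int(np.argmax(mean_prediction)), float(mi))
```

The estimator is zero when all T samples agree and grows as the stochastic forward passes disagree with one another, which is the behavior exploited to flag regions that differ from the training classes.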

Training data generation

To train the classification model for crystal-structure identification in atomic-resolution images, a suitable training dataset has to be generated. Notably, we refrain from training on experimental images, which may contain unknown artefacts such as noise, distortions or defects. Furthermore, acquiring and curating an experimental database of images of pristine crystal structures imaged at different orientations with atomic resolution is an elaborate task. Instead, we train only on simulated images, where we have exact control over imaging conditions and noise sources, allowing us to create a dataset with known labels. Obtaining such reliable training data is essential to achieve trustworthy labeling output of the CNN. One may criticize simulations for potentially missing crucial features that are present in experiment. However, with the advent of aberration correction in STEM¹⁴, the direct comparison of experimental and simulated images at atomic resolution became accessible also on a quantitative basis⁴⁴. It has even been shown that it is possible to determine the number of atoms in an atomic column or to retrieve the 3D atomic structure of nano-objects by combining experimental and simulated images^45,46. Recently developed efficient implementations of the multislice algorithm make it possible to simulate STEM images under conditions similar to experiment^47,48. Using high-performance computing, realistic simulations of images can be conducted, achieving computation times of a few hours to days for 10–100 images. In this work, we provide additional speed-up by using a convolution-based approach, reducing the computation time from days to minutes for an entire training dataset (see “Methods”). Using this efficient simulation scheme, we obtain images for each of the 10 classes for different lattice constants. Additionally, we include data augmentation steps to consider a range of lattice rotations and noise sources that resemble typical experimental conditions (“Methods”). In this work, we include lattice shear, blurring, as well as Gaussian and Poisson noise, resulting in 31,470 data points. We want to emphasize that even though the model is trained on synthetic data, we apply it to classify experimental atomic-resolution STEM images as shown in the Results section (see Fig. 4).
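To give a feel for what a convolution-based simulation with noise augmentation can look like, the sketch below places unit peaks on a square lattice, blurs them with a Gaussian probe via an FFT-based circular convolution, and then adds Poisson and Gaussian noise. This is only an assumed, simplified stand-in: the actual simulation scheme, probe model, and augmentation parameters are described in Methods, and all names and numerical values here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def square_lattice_image(size, spacing):
    """Unit peaks at the sites of a square lattice (stand-in for projected atomic columns)."""
    image = np.zeros((size, size))
    image[spacing // 2::spacing, spacing // 2::spacing] = 1.0
    return image

def gaussian_kernel(size, sigma):
    """Normalized 2D Gaussian centred at index size // 2 (assumed probe shape)."""
    ax = np.arange(size) - size // 2
    yy, xx = np.meshgrid(ax, ax, indexing="ij")
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return kernel / kernel.sum()

def convolve_fft(image, kernel):
    """Circular convolution via FFT; the centred kernel is first shifted to the origin."""
    product = np.fft.fft2(image) * np.fft.fft2(np.fft.ifftshift(kernel))
    return np.real(np.fft.ifft2(product))

def add_noise(image, dose, gauss_sigma, rng):
    """Poisson (counting) noise at a given dose, followed by additive Gaussian noise."""
    counts = rng.poisson(np.clip(image, 0.0, None) * dose) / dose
    return counts + rng.normal(0.0, gauss_sigma, size=image.shape)

clean = convolve_fft(square_lattice_image(64, 8), gaussian_kernel(64, 1.5))
noisy = add_noise(clean, dose=500.0, gauss_sigma=0.01, rng=rng)
print(clean.shape, noisy.shape)
```

Because the convolution is a single pair of FFTs per image, generating many variants with different lattice constants or noise levels is cheap, which is the practical appeal of this kind of approach compared with full multislice simulations.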

Neural-network training procedure

For training, the 31,470 data points are split, where 80% is used for training and 20% for validation. Based on the performance on the validation set, we optimize hyperparameters such as the filter size in the convolutional layers and the dropout ratio (the fraction of neurons dropped). Specifically, we employ Bayesian optimization, which is a general approach for global optimization of black-box functions that are computationally expensive to evaluate⁴⁹. This makes Bayesian optimization well suited for optimizing NNs, where exploring different architectures and optimization parameters is typically accompanied by high computational cost. Here, the black-box function to be optimized is the validation loss, and the optimization protocol we invoke⁵⁰ provides us with a list of candidate models, all with near-perfect accuracy (see “Methods”). Their uncertainty estimates, however, are different, as we will highlight via the following model selection procedure.

To find the model that shows the strongest performance in both classification and detection of out-of-training-distribution regions, we analyze the simulated test image in Fig. 3a. It contains both crystalline and amorphous regions, providing a test bed for identifying models with high uncertainty at the transition between grains and in the amorphous region, both of which are never shown to the models during training. The four regions in the image are simulated separately (using full multislice simulations) and then stitched together. Three of these regions are crystalline, each representing one of the three symmetries in the training set: Fe (bcc, [100]), Cu (fcc, [100]), and Ti (hcp, [0001]). Here, we expect low uncertainty and correct assignment of the respective symmetry. The amorphous region is simulated based on a three-dimensional structure obtained via realistic molecular-dynamics simulations of amorphous silicon⁵¹. All models obtained via Bayesian optimization are applied to this image. Given their near-perfect accuracy during training, they all can recognize the crystalline parts of the image, while their assignments in the amorphous region differ. We can now also analyze the corresponding uncertainties, which provide an estimate of the reliability of the classifications (cf. Fig. 3b). We select the model with the highest uncertainty, as quantified by the mutual information (cf. Eq. (6)), in the amorphous region. For this model, the classification results are shown in Fig. 3c, where one can see that the correct crystal symmetries are assigned in the expected regions, while in the amorphous part, several different phases are assigned. The mutual information shown in Fig. 3d increases at the interfaces between the four different regions, as well as in the amorphous part. The detailed architecture is specified in Table 1.

**Fig. 3: Application of AI-STEM to synthetic, polycrystalline data.**

Table 1 Convolutional neural network architecture employed in this work.

Application to experimental STEM data

Now we turn to applying AI-STEM to experimental data. In the following, we challenge the model with several HAADF-STEM images, demonstrating the practical applicability of AI-STEM. In particular, we show that the model can classify crystalline regions in experimental images and how the bulk-versus-interface segmentation can be inferred and employed for further analysis—here, for determining the local lattice orientation in the bulk regions.

First, a HAADF image of elemental Cu shown in Fig. 4a is analyzed. The image contains a horizontally aligned grain boundary separating two misoriented single crystals, both with a [111] orientation, in the upper and lower part of the image. As shown in Fig. 4b, the model classifies the grain regions correctly as fcc [111]. At the interface, the same label is assigned, but with increased uncertainty (as quantified by mutual information), which allows the interface region to be detected (cf. Fig. 4c).

**Fig. 4: Application of AI-STEM to experimental STEM images with grain boundaries.**

The segmentation obtained via AI-STEM’s predictions can now be used to conduct further analysis of the local lattice structure. Practically, to separate the image into bulk and interface regions, we fix a mutual-information threshold of 0.1, interpreting all local windows above this value as interface and the remaining ones as bulk. Depending on the type of region, i.e., interface or bulk, different quantities are suitable. As an example, we calculate here the local lattice orientation, a quantity that is only reasonable to compute in the bulk regions. Specifically, for each local window, we reconstruct⁵² the real-space lattice from the atomic columns and determine^53,54 the angle of misalignment with respect to a reference training image or rather its reconstructed atomic columns (cf. Supplementary Methods for more details). Note that this way, information from the training data enters this analysis. Also note that the reference lattice is not required as input but determined based on the NN assignments, making this procedure fully automatic and extendable (in case of retraining and new classes being added to the training set). The calculated angle is termed lattice mismatch and the results for the Cu grain boundary are shown in Fig. 4d. The reference images are shown below the heatmaps. For the interface region, depicted in gray in Fig. 4d, no calculation is performed. Examples of the expected misorientations are indicated in Fig. 4a; they closely match the calculated values of Fig. 4d.
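The thresholding step described above amounts to a simple mask over the grid of local windows. In the sketch below, the mutual-information map and the angle values are invented for illustration; only the threshold of 0.1 is taken from the text.

```python
import numpy as np

def segment_by_uncertainty(mi_map, threshold=0.1):
    """Boolean mask that is True where a local window is treated as interface."""
    return mi_map > threshold

# Hypothetical mutual-information map on a 6 x 6 grid of local windows, with an
# elevated horizontal band mimicking a grain boundary.
mi_map = np.full((6, 6), 0.02)
mi_map[2:4, :] = 0.35

interface = segment_by_uncertainty(mi_map)
bulk = ~interface

# Bulk-only quantities (e.g., a lattice mismatch angle) are left undefined at the interface.
hypothetical_angles = np.full(mi_map.shape, 12.0)   # placeholder values in degrees
angles_bulk_only = np.where(bulk, hypothetical_angles, np.nan)
print(int(interface.sum()), int(bulk.sum()))
```

Masking interface windows with NaN mirrors the gray region in Fig. 4d, where no lattice mismatch is computed.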

Next, we consider a HAADF image of Fe⁵⁵ containing a grain boundary that is horizontally aligned and separates two crystalline grains with [100] orientation (cf. Fig. 4e). Compared to the previous example, this image contains intensity variations of both the background and the atomic columns, which are more pronounced, for instance, in the upper left part than in the lower part of the image. Such variations are common in experimental images and may stem from surface damage induced during sample preparation or surface oxide formation. Nevertheless, AI-STEM correctly classifies the bulk regions as bcc [100] (cf. Fig. 4f), while the assignment changes at the grain boundary, but with increased uncertainty (cf. Fig. 4g). In the noisier parts in the upper left of the image, the uncertainty also increases. The obtained mismatch angles are shown in Fig. 4h; here, too, the calculated and expected angles (again indicated in the original image in Fig. 4e) are in agreement.

Finally, we investigate a low-angle [0001] tilt grain boundary in Ti (cf. Fig. 4i), which consists of a periodic array of dislocations with a line direction parallel to [0001]. Hence, the interface structure is qualitatively different compared to the previously shown high-angle grain boundaries for Cu and Fe. In particular, the smaller misorientation angle between both grains in the Ti image leads to regions within the interface where the atomic lattices of the two grains are still connected with each other. AI-STEM correctly assigns hcp [0001] (cf. Fig. 4j), with only a few outliers in the classification at the grain boundary, which is again revealed via the mutual information (cf. Fig. 4k). One can observe that the mutual information decreases in the regions in between the grain boundary dislocations, where the lattice resembles that of undisturbed Ti [0001], and increases at the locations of the dislocation cores at the interface. This shows that the uncertainty estimate of the predictions can even be used to locate more confined lattice defects such as individual dislocations. Similar to the previous two examples, we obtain the local lattice mismatch (cf. Fig. 4l), which matches the expectations (cf. Fig. 4i) within a margin of a few degrees.

Analyzing AI-STEM’s internal representations via unsupervised learning

So far, we have demonstrated how AI-STEM can be used to classify lattice symmetry and orientation and is capable of detecting interfaces and even individual dislocations within an interface. To understand how the model interprets crystalline grains and interface regions, we apply unsupervised learning to the internal NN representations. Specifically, we employ manifold learning to embed the high-dimensional NN representations into two-dimensional, readily interpretable maps. We employ Uniform Manifold Approximation and Projection (UMAP)⁵⁶, which approximates the manifold that underlies a given dataset and makes it possible to construct low-dimensional embeddings that can capture both global and local relationships among the original, high-dimensional data points. We consider the experimental images shown in Fig. 4a, e, i, and compute the NN representations for each of the local windows, as determined within the AI-STEM workflow (cf. Fig. 1b). In particular, we inspect the last fully connected layer before the output layer, i.e., before the classification is conducted (cf. Fig. 2b). The two-dimensional UMAP embedding is shown in Fig. 5a, where the color scale corresponds to the NN assignments. Despite the high level of compression, from 128 to 2 dimensions, all three images are well separated. For each image, two sub-clusters can be observed that correspond to the two bulk grains (cf. Fig. 5a). These are joined by contiguous strings that correspond to the respective interface regions. This is also visualized by using the mutual information as a color scale (cf. Fig. 5b), where along the strings, increased uncertainty can be observed (which indicates the presence of the defects). Notably, the different grain boundary types (e.g., high angle vs. low angle) are also mapped to different regions in the map. This demonstrates the capability of AI-STEM to not only recognize bulk symmetry and orientation but also to distinguish different interface types, even though it has never been provided with explicit examples for such a task during training.

**Fig. 5: Visualizing neural-network representations of local crystalline and defective atomic structure in experimental images.**

Discussion

In this work, we propose AI-STEM, which automatically characterizes crystal structure and interfaces in simulated and experimental atomic-resolution STEM datasets. This is enabled by adapting several techniques: we employ signal-processing tools to represent imaging data, deep learning to identify crystal symmetry and orientation, and Bayesian modeling in combination with information theory to estimate model uncertainty as well as to optimize NN hyperparameters. At the core of AI-STEM is a Bayesian convolutional neural network, which goes beyond standard NN models by providing both classifications and principled uncertainty estimates. The former allow identification of lattice symmetry and crystal orientation, while the latter are used to segment an image into bulk and interface regions. Despite being trained only on simulated STEM images of perfect lattice structures, AI-STEM generalizes to experimental images, as demonstrated by several challenging examples. The training data can be obtained from multislice image simulations that account for dynamical scattering effects; in this work, we show that a fast convolution approach can be employed instead. In order to verify the applicability of the labeling procedure, a diverse set of simulated images of typical monocrystalline structures is generated, serving as reliable ground truth. Based on the segmentation provided by AI-STEM’s prediction, one can conduct further analysis that reveals additional characteristics of the identified regions. Here, we determine the local lattice rotation in the crystalline grains. Using unsupervised learning, we demonstrate that different types of interfaces appear separated in the internal NN space, although no explicit information on any interface pattern is provided during training. This analysis also shows how unsupervised learning can be used to explain a black-box model in post-hoc fashion^57,58,59. Moreover, on-line data processing is feasible with the proposed approach, since the method is easy to parallelize and, even on a single GPU, processing times are within the range of typical acquisition times (cf. Supplementary Fig. 4).

Furthermore, note that the experimental images in Fig. 4 are in near-perfect zone-axis orientation within the experimental limit, since the aim is to resolve the atomic structure of the interfaces with the highest possible precision. While this might be considered an idealized scenario, the results of Fig. 4 should at least constitute a test for small deviations in crystal tilt, which are typically present in experimental images. Since no information on such tilts is included in the training set, the model already proves robust at this level. We have conducted further tests for larger tilt deviations of the adjoining crystals in Supplementary Fig. 5. The model provides the expected assignments even if only lattice fringes are resolved, although the accompanying high uncertainty values call for careful interpretation of the prediction. The model robustness may be further improved by including crystal tilt variations as additional parameters in the training set.

Since various types of noise, such as scan and detector read-out noise, are typically present in STEM images, we further tested the applicability of AI-STEM at different noise levels, as shown in Supplementary Fig. 6. Here, we applied AI-STEM to images containing different degrees of, primarily, fast scan noise. Specifically, we take the experimental image in Fig. 4a, in which fast scan noise is reduced by frame averaging, as a reference and show that the number of averaged frames does not influence AI-STEM’s performance. Even a single-shot frame with high noise contributions is correctly classified, with low uncertainty values in the bulk crystal regions (see Supplementary Fig. 6).

In the future, it would be interesting to provide not only a bulk-versus-interface segmentation but also to predict additional details automatically, e.g., how the crystalline grains differ. Currently, this requires additional analysis, e.g., based on reconstructing the (projected) real-space lattice. However, one can see from the latent-space visualization in Fig. 5 that grains with different orientations are separated. In principle, a clustering algorithm may be employed to separate the grains, although this can be challenging to automate, as clustering typically involves several parameter choices that are not guaranteed to generalize well. Alternatively, one may consider a multi-label classification problem or construct a separate machine-learning model to predict the (local) lattice rotation automatically.

In conclusion, our method shows great potential to automatically analyze and classify crystallographic attributes in STEM datasets without human intervention. In electron-microscopy research, the development of a “self-driving” microscope appears on the horizon due to rapid advances in artificial intelligence^60,61. While we focus on mono-species systems as a proof of concept, this work paves the way to autonomous investigations of complex nanostructures at the atomic level.

Methods

AI-STEM parameters

Besides the classification model, the two most important parameters are the stride and the box size. For the box size, we recommend a value of 12 Å, on which the model is trained. If significantly larger window sizes are necessary for a desired application, the practical approach is to augment the dataset using our efficient training procedure and retrain the model. Also note that the model is trained for a specific resolution, in which 1 pixel corresponds to 0.12 Å. For different resolutions, one may simply rescale the image or, as done here, adjust the window size. For instance, the Cu image in Fig. 4a is recorded at a resolution of about 0.0880 Å per pixel, while the other images in Fig. 4 are recorded at 0.1245 Å per pixel. To match both resolutions to the training range, we increase the box size to 136 pixels for Cu (it is recorded at higher resolution, so a larger window is needed to obtain a number of atomic columns comparable to the training set) and decrease it to 96 pixels for the other two images (they are recorded at lower resolution, so a smaller window is needed to obtain a comparable number of atomic columns). In principle, our data-generation method also allows the resolution to be varied, such that retraining with various resolutions could be done as well. For the stride, we use values on the order of 1 Å to demonstrate the high-resolution capabilities of the approach. Larger strides can suffice to reveal the main characteristics, cf. Supplementary Fig. 3 (in particular, it is possible to separate an image into bulk and interface regions). For the synthetic image in Fig. 3, we employ a stride of 12 pixels, corresponding to ~1.4 Å. The same settings were used for the experimental images of Ti and Cu (Fig. 4a, i). For Fe (Fig. 4e), the stride was halved as this image is smaller (about half the size of the Ti image and two thirds of the Cu image), yielding a comparable number of local fragments.
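As an illustration of the window-size adjustment described above, the following minimal Python sketch converts the recommended 12 Å box into pixels for a given pixel size and enumerates sliding-window origins for a given stride. The function names (`box_size_px`, `window_origins`), the rounding rule, and the 512 × 512 image size are assumptions made for this example and are not taken from the AI-STEM code base.

```python
def box_size_px(pixel_size_angstrom, box_size_angstrom=12.0):
    """Pixels spanning the physical box size (nearest-integer rounding is an assumption)."""
    return round(box_size_angstrom / pixel_size_angstrom)


def window_origins(image_shape, box_px, stride_px):
    """Top-left corners (row, col) of all windows that fit entirely inside the image."""
    n_rows, n_cols = image_shape
    return [
        (row, col)
        for row in range(0, n_rows - box_px + 1, stride_px)
        for col in range(0, n_cols - box_px + 1, stride_px)
    ]


# Pixel sizes (Å per pixel) quoted in the text.
for label, pixel_size in [("training", 0.12), ("Cu", 0.0880), ("Fe, Ti", 0.1245)]:
    print(label, box_size_px(pixel_size), "pixels")

# Hypothetical 512 x 512 image scanned with the 12-pixel stride used for Fig. 3.
origins = window_origins((512, 512), box_size_px(0.12), stride_px=12)
print(len(origins), "local windows")
```

With the pixel sizes quoted above, the helper yields the 136-pixel and 96-pixel boxes used for the Cu and Fe/Ti images, respectively.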

FFT-HAADF descriptor

We start from the periodic arrangement of atomic columns in HAADF-STEM images. These are acquired in low-index crystallographic orientations, which directly represent the underlying projected crystal symmetry. In the AI-STEM workflow, an input image corresponds to a local fragment or window, extracted from a larger image. The cutting procedure may lead to boundary effects, e.g., truncated atomic columns. This can cause spurious patterns in the FFT, which is why we apply a window function to the STEM HAADF image before calculating the FFT, a standard practice in signal processing⁶². Here, we use the Hann window, which provides a smooth decay at the image boundaries. Then, the FFT is calculated, resulting in spectra with a dominant central peak that suppresses possibly valuable information at higher frequencies. Thus, we apply a thresholding scheme: the FFTs are normalized to the range [0, 1] and then all values above 0.1 are set to 1.0. This visibly enhances the peak patterns around the central peak, as visualized for all classes considered in this work in Supplementary Fig. 1.
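The steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the reference implementation: taking the FFT magnitude, centering the spectrum with `fftshift`, and using min-max normalization are assumptions, and any resizing to the 64 × 64 network input is omitted. The function name `fft_haadf_descriptor` and the toy lattice are hypothetical.

```python
import numpy as np


def fft_haadf_descriptor(window, threshold=0.1):
    """Hann-windowed FFT magnitude, normalized to [0, 1], with values above `threshold` set to 1."""
    window = np.asarray(window, dtype=float)
    n_rows, n_cols = window.shape
    # Separable 2D Hann window that decays smoothly to zero at the image boundaries.
    hann_2d = np.outer(np.hanning(n_rows), np.hanning(n_cols))
    # Magnitude spectrum with the zero-frequency peak shifted to the center (assumption).
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(window * hann_2d)))
    span = spectrum.max() - spectrum.min()
    if span == 0:
        return np.zeros_like(spectrum)
    normalized = (spectrum - spectrum.min()) / span
    # Thresholding: saturate everything above the threshold to enhance weaker peaks.
    normalized[normalized > threshold] = 1.0
    return normalized


# Toy "lattice": a positive, periodic intensity pattern standing in for a HAADF window.
rows, cols = np.mgrid[0:100, 0:100]
toy_window = 1.0 + 0.5 * np.cos(2 * np.pi * cols / 10.0) * np.cos(2 * np.pi * rows / 10.0)
descriptor = fft_haadf_descriptor(toy_window)
print(descriptor.shape, descriptor.min(), descriptor.max())
```

After thresholding, every pixel is either at most 0.1 or exactly 1.0, so peaks that were much weaker than the central peak become as visible as the central peak itself.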

Neural network training

The CNN is trained on 31,470 images of 64 × 64 pixels (the FFT-HAADF descriptors of the STEM HAADF images). This dataset is split into training and test sets in stratified fashion (via scikit-learn, using a random state of 42; see Data availability for the dataset link). Adam optimization is employed for training⁶³. The CNN is implemented using TensorFlow⁶⁴. Hyperparameters are optimized using Bayesian optimization, specifically the Tree-structured Parzen estimator (TPE) algorithm as provided by the Python library hyperopt⁵⁰. We experimented with using either the validation loss or the accuracy as the optimization objective and found no significant difference in classification accuracy. We chose the validation loss as the objective function to be minimized. We tested different configuration spaces for the network architecture and optimization parameters, including number of layers, number of filters, filter size, dropout ratio, and batch size (example notebooks are provided, cf. “Data availability”). The models typically converge to near-perfect accuracy within a few epochs, and we find that we can restrict the search to smaller configuration spaces, reducing the computational cost. We fix the architecture to 6 layers (number of filters: 32, 32, 16, 16, 8, 8) and focus on the search for the right kernel size (3 × 3, 5 × 5, 7 × 7) as well as the dropout ratio (values between 2 and 10 percent, step size 1 percent). In particular, the choice of dropout ratio is known to be important for the quality of the uncertainty estimates⁴⁰. We run the TPE algorithm for 25 iterations. Each model is optimized for 25 epochs, saving only the model with the best validation accuracy. All of these models achieve near-perfect accuracy (99.9% classification accuracy on both training and validation sets), but their uncertainty estimates differ. We thus select the model that has the highest median uncertainty in the amorphous region of the synthetic polycrystal example (Fig. 3), where we expect a low degree of crystallinity. The model chosen in this fashion is reported in Table 1.
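The final selection step, choosing among equally accurate models by their uncertainty in the amorphous region, can be illustrated with a short NumPy sketch. The function name `select_model`, the dictionary layout, and the toy mutual-information maps are hypothetical; only the criterion (highest median uncertainty inside the amorphous region) follows the text.

```python
import numpy as np


def select_model(mi_maps, amorphous_mask):
    """Return the model name with the highest median mutual information inside the mask.

    mi_maps: dict mapping a model name to a 2D mutual-information map.
    amorphous_mask: boolean 2D array marking the amorphous region.
    """
    medians = {
        name: float(np.median(np.asarray(mi_map)[amorphous_mask]))
        for name, mi_map in mi_maps.items()
    }
    best = max(medians, key=medians.get)
    return best, medians


# Toy example: two candidate models on a 4 x 4 grid of local windows.
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True  # hypothetical amorphous region
candidates = {
    "model_a": np.full((4, 4), 0.1),
    "model_b": np.where(mask, 0.5, 0.05),
}
best_name, median_values = select_model(candidates, mask)
print(best_name, median_values)
```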

Uncertainty quantification

Given the test point x, the mutual information between the predictions and the model posterior p(ω∣D_train) is defined as^34,40,65

$${\mathbb{I}}\left[y,{{{\boldsymbol{\omega}}}}| {{{\bf{x}}}},{D}_{{{\text{train}}}}\right]:= {\mathbb{H}}[y| {{{\bf{x}}}},{D}_{{{\text{train}}}}]-{{\mathbb{E}}}_{p({{{\boldsymbol{\omega}}}}| {D}_{{{\text{train}}}})}\left[{\mathbb{H}}[y| {{{\bf{x}}}},{{{\boldsymbol{\omega}}}}]\right].$$

(4)

The first term on the r.h.s. is termed predictive entropy⁴⁰. It quantifies the (average) information in the distribution of predictions and is defined by

$${\mathbb{H}}[y| {{{\bf{x}}}},{D}_{{{\mbox{train}}}}]:= -\mathop{\sum}\limits_{c}p(y=c| {{{\bf{x}}}},{D}_{{{\mbox{train}}}})\log p(y=c| {{{\bf{x}}}},{D}_{{{\mbox{train}}}}).$$

(5)

The second term on the r.h.s. of Eq. (4) is defined as

$$\begin{array}{l}{{\mathbb{E}}}_{p({{{\boldsymbol{\omega}}}}| {D}_{{{\text{train}}}})}\left[{\mathbb{H}}[y| {{{\bf{x}}}},{{{\boldsymbol{\omega}}}}]\right]:= {{\mathbb{E}}}_{p({{{\boldsymbol{\omega }}}}| {D}_{{{\text{train}}}})}\left[-\mathop{\sum}\limits_{c}p(y=c| {{{\bf{x}}}},{{{\boldsymbol{\omega }}}})\log p(y=c| {{{\bf{x}}}},{{{\boldsymbol{\omega }}}})\right].\end{array}$$

One may refer to this as expected entropy as it averages the entropy of the predictions given the parameters ω that are distributed according to the posterior distribution⁶⁶. Using Monte Carlo dropout, one can approximate the mutual information as³⁴

$$\begin{array}{l}{\mathbb{I}}\left[y,{{{\boldsymbol{\omega }}}}| {{{\bf{x}}}},{D}_{{{\mbox{train}}}}\right]\approx \\ -\mathop{\sum}\limits_{c}\left(\frac{1}{T}\mathop{\sum}\limits_{t}p\left(y=c| {{{\bf{x}}}},{{{{\boldsymbol{\omega }}}}}_{t}\right)\right)\log \left(\frac{1}{T}\mathop{\sum}\limits_{t}p\left(y=c| {{{\bf{x}}}},{{{{\boldsymbol{\omega }}}}}_{t}\right)\right)\\ +\frac{1}{T}\mathop{\sum}\limits_{c}\mathop{\sum}\limits_{t}p\left(y=c| {{{\bf{x}}}},{{{{\boldsymbol{\omega }}}}}_{t}\right)\log p\left(y=c| {{{\bf{x}}}},{{{{\boldsymbol{\omega }}}}}_{t}\right).\end{array}$$

(6)
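A direct NumPy transcription of Eq. (6) may help to make the estimator concrete. This is a minimal sketch: the function name `mutual_information` and the small constant `eps` (added inside the logarithm to avoid log(0)) are assumptions, and the T stochastic forward passes are represented by a plain array of class probabilities rather than by an actual dropout network.

```python
import numpy as np


def mutual_information(probs, eps=1e-12):
    """Monte Carlo estimate of the mutual information, Eq. (6).

    probs: array of shape (T, C) holding the class probabilities p(y=c | x, w_t)
           from T stochastic forward passes (e.g., Monte Carlo dropout samples).
    """
    probs = np.asarray(probs, dtype=float)
    mean_probs = probs.mean(axis=0)
    # First term: entropy of the averaged prediction (predictive entropy, Eq. (5)).
    predictive_entropy = -np.sum(mean_probs * np.log(mean_probs + eps))
    # Second term: average entropy of the individual predictions (expected entropy).
    expected_entropy = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))
    return predictive_entropy - expected_entropy


# All passes agree: the model is consistent, mutual information is close to zero.
consistent = np.tile([0.9, 0.1], (5, 1))
# Passes disagree confidently: mutual information approaches log(2) for two classes.
disagreeing = np.array([[1.0, 0.0], [0.0, 1.0]])
print(mutual_information(consistent), mutual_information(disagreeing))
```

The two toy inputs illustrate why this quantity is useful for locating interfaces: it stays low when the sampled predictions agree, even if each is individually uncertain, and grows when the sampled models contradict each other.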

Details on training data generation

For each of the 10 surface classes, we consider a small interval of ±0.1 Å around the respective experimental lattice parameters. This is because some of the classes can be similar (a consequence of the 2D projection provided by STEM images), for instance fcc 100 and bcc 100 (cf. Fig. 2a). The lattice parameters are the following: for all Cu fcc single crystals, the lattice constant a is 3.63 Å; for all Fe bcc single crystals, the lattice constant a is 2.87 Å; for all Ti hcp single crystals, the lattice constants a and c are 2.95 Å and 4.68 Å, respectively (c/a ~ 1.587). For each of these classes, we include a range of rotations (0–90 degrees, step size 5 degrees, using the Python package scipy⁶⁷). Then, different noise sources are applied, as implemented in the Python package scikit-image⁶⁸. First, shear is applied to all images for all rotations (an affine transformation using only shear, without scaling or translation). We then apply additional noise sources to a subselection of data points (only every second rotated and sheared image, keeping the dataset size below 100k): Gaussian blurring (scanning a Gaussian filter of a certain width over the image) and, finally, addition of random noise (Gaussian or Poisson). Visual examples are provided in Supplementary Fig. 2.
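The parameter grid and the final random-noise step described above can be sketched as follows. This NumPy-only illustration does not reproduce the scipy and scikit-image pipeline: rotation, shear, and blurring are left out, and the number of lattice-parameter samples, the inclusion of the 90° end point, and the noise strengths (`sigma`, `counts`) are hypothetical choices, as are all function names.

```python
import numpy as np

# Lattice parameters in Å as quoted in the text.
LATTICE_PARAMETERS = {
    "Cu fcc": {"a": 3.63},
    "Fe bcc": {"a": 2.87},
    "Ti hcp": {"a": 2.95, "c": 4.68},
}


def rotation_angles(start=0, stop=90, step=5):
    """In-plane rotation angles in degrees (including the end point is an assumption)."""
    return list(range(start, stop + 1, step))


def lattice_interval(value, half_width=0.1, n_points=3):
    """Evenly spaced values in [value - half_width, value + half_width].

    The number of points is hypothetical; the text only specifies the interval."""
    return np.linspace(value - half_width, value + half_width, n_points)


def add_random_noise(image, rng, kind="gaussian", sigma=0.05, counts=100.0):
    """Add Gaussian or Poisson noise to an image with values in [0, 1].

    `sigma` and `counts` are illustrative values, not the published settings."""
    image = np.asarray(image, dtype=float)
    if kind == "gaussian":
        noisy = image + rng.normal(0.0, sigma, size=image.shape)
    elif kind == "poisson":
        noisy = rng.poisson(image * counts) / counts
    else:
        raise ValueError(f"unknown noise kind: {kind}")
    return np.clip(noisy, 0.0, 1.0)


rng = np.random.default_rng(0)
clean = np.full((64, 64), 0.5)
noisy_gaussian = add_random_noise(clean, rng, kind="gaussian")
noisy_poisson = add_random_noise(clean, rng, kind="poisson")
print(len(rotation_angles()), lattice_interval(LATTICE_PARAMETERS["Cu fcc"]["a"]))
```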

Simulation of STEM dataset

To generate the ground truth STEM datasets consisting of HAADF images, STEM image simulations were performed with the abTEM software package⁴⁷. The crystal orientations shown in Fig. 2 for the ten classes are generated with the atomic simulation environment (ASE) Python module⁶⁹. The thickness of all simulation cells was set to 8 nm (z-direction) with a slice thickness of 0.2 nm. The x- and y-dimensions of the simulation cells were each chosen to be ~8 nm. An electron energy of 300 keV, a probe semi-convergence angle of 24 mrad and semi-collection angles of the HAADF detector ranging from 78 to 200 mrad were used for the simulations. The pixel size was fixed to 12 pm resulting in images with ~64 × 64 pixels. Thermal diffuse scattering was considered by using 12 frozen phonon configurations with a root-mean-squared thermal displacement according to the Debye-Waller factors obtained from Peng et al.⁷⁰ for Cu, Fe and Ti at 280 K. The image simulations were performed on a Windows 11 Pro based workstation with an Intel Xeon CPU, 32 GB of RAM and an NVIDIA Quadro K1200 GPU. The total simulation time was ~11 h for the Cu-fcc class, ~26 h for the Fe-bcc class and ~13 h for the Ti-hcp class.

To speed up the training dataset generation, we also employed a simple convolution approach where the probe wave function generated in abTEM⁴⁷ was convolved with the summed projected potentials for each cell. This reduces the total calculation time for each class to several minutes. We employ this approach for training the CNN model, demonstrating that this computationally efficient approach can yield strong performance on experimental images (cf. Fig. 4).
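The idea behind the convolution approach can be illustrated with a small NumPy sketch. It is a simplified stand-in, not the procedure used with abTEM: a normalized Gaussian replaces the simulated probe, a few delta-like peaks replace the summed projected potential, and the convolution is circular (periodic boundaries) because it is evaluated via FFT. The function names and all numerical values are hypothetical.

```python
import numpy as np


def gaussian_probe(shape, sigma_px):
    """Normalized Gaussian centered in the array, used here as a stand-in for the probe."""
    n_rows, n_cols = shape
    rows = np.arange(n_rows) - n_rows // 2
    cols = np.arange(n_cols) - n_cols // 2
    rr, cc = np.meshgrid(rows, cols, indexing="ij")
    probe = np.exp(-(rr**2 + cc**2) / (2.0 * sigma_px**2))
    return probe / probe.sum()


def convolve_periodic(potential, probe):
    """Circular convolution of the projected potential with the probe via FFT."""
    # Move the probe center to index (0, 0) so that the result is not shifted.
    kernel = np.fft.ifftshift(probe)
    return np.real(np.fft.ifft2(np.fft.fft2(potential) * np.fft.fft2(kernel)))


# Toy projected potential: point-like atomic columns on a square grid.
potential = np.zeros((64, 64))
potential[8::16, 8::16] = 1.0
image = convolve_periodic(potential, gaussian_probe(potential.shape, sigma_px=2.0))
print(image.shape, image.max())
```

Because a single convolution per cell replaces the slice-by-slice propagation of a multislice calculation, the cost per image drops substantially, at the price of neglecting dynamical scattering.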

Scanning transmission electron microscopy experiment

All experimental STEM data were acquired using a probe-corrected Titan Themis 60-300 (Thermo Fisher Scientific). The TEM is equipped with a high-brightness field emission gun and a gun monochromator. The electrons were accelerated to 300 kV and images were recorded at a probe current of 80 pA with a high-angle annular dark field (HAADF) detector (Fischione Instruments Model 3000). The collection angles for the HAADF images were set to 73–200 mrad, using semi-convergence angles of 17 mrad and 23.8 mrad. Image series with 20–40 images and a dwell time of 1–2 μs were acquired, registered and averaged in order to minimize the effect of instrumental instabilities and noise in the images.

Experimental HAADF-STEM images of a Σ19b (178)[111] tilt grain boundary in Cu with a misorientation angle of ~48°, a Σ5 (013)[001] tilt boundary in Fe with a misorientation angle of ~38° and a low-angle [0001] tilt GB in Ti with a misorientation angle of ~13° are used to test the AI-STEM approach. Details on the sample fabrication and preparation for the Cu, Fe and Ti grain boundary images can be found in^13,55,71, respectively.

Data availability

All relevant data (training and test set, experimental and synthetic images, as well as neural-network models) are available at https://doi.org/10.5281/zenodo.7756516.

Code availability

Code and several examples for applying AI-STEM and reproducing the results of this article are available at https://github.com/AndreasLeitherer/ai4stem.

References

1. Harmer, M. P. The phase behavior of interfaces. Science 332, 182–183 (2011).
2. Zhao, M. & Xia, Y. Crystal-phase and surface-structure engineering of ruthenium nanocrystals. Nat. Rev. Mater. 5, 440–459 (2020).
3. Luo, J. et al. A critical review on energy conversion and environmental remediation of photocatalysts with remodeling crystal lattice, surface, and interface. ACS Nano 13, 9811–9840 (2019).
4. Barroo, C., Wang, Z.-J., Schlögl, R. & Willinger, M.-G. Imaging the dynamics of catalysed surface reactions by in situ scanning electron microscopy. Nat. Catal. 3, 30–39 (2020).
5. Gordiz, K. & Henry, A. Phonon transport at crystalline Si/Ge interfaces: the role of interfacial modes of vibration. Sci. Rep. 6, 23139 (2016).
6. He, X., Sun, H., Ding, X. & Zhao, K. Grain boundaries and their impact on Li kinetics in layered-oxide cathodes for Li-ion batteries. J. Phys. Chem. C 125, 10284–10294 (2021).
7. Sun, Y., Cong, H., Zan, L. & Zhang, Y. Oxygen vacancies and stacking faults introduced by low-temperature reduction improve the electrochemical properties of Li2MnO3 nanobelts as lithium-ion battery cathodes. ACS Appl. Mater. Interfaces 9, 38545–38555 (2017).
8. Hu, C., Xia, K., Fu, C., Zhao, X. & Zhu, T. Carrier grain boundary scattering in thermoelectric materials. Energy Environ. Sci. 15, 1406–1422 (2022).
9. Lee, J. W. et al. The role of grain boundaries in perovskite solar cells. Mater. Today Energy 7, 149–160 (2018).
10. Naumann, V. et al. Explanation of potential-induced degradation of the shunting type by Na decoration of stacking faults in Si solar cells. Sol. Energy Mater. Sol. Cells 120, 383–389 (2014).
11. Lu, W., Liebscher, C. H., Dehm, G., Raabe, D. & Li, Z. Bidirectional transformation enables hierarchical nanolaminate dual-phase high-entropy alloys. Adv. Mater. 30, 1804727 (2018).
12. Liebscher, C. H., Stoffers, A., Alam, M. & Lymperakis, L. Strain-induced asymmetric line segregation at faceted Si grain boundaries. Phys. Rev. Lett. 121, 015702 (2018).
13. Meiners, T., Frolov, T., Rudd, R. E., Dehm, G. & Liebscher, C. H. Observations of grain-boundary phase transformations in an elemental metal. Nature 579, 375–378 (2020).
14. Pennycook, S. J. & Nellist, P. D. Scanning Transmission Electron Microscopy: Imaging and Analysis (Springer, 2011).
15. Thomas, J. M., Leary, R. W., Eggeman, A. S. & Midgley, P. A. The rapidly changing face of electron microscopy. Chem. Phys. Lett. 631, 103–113 (2015).
16. Collins, S. M. & Midgley, P. A. Progress and opportunities in EELS and EDS tomography. Ultramicroscopy 180, 133–141 (2017).
17. Pan, J. et al. Enhanced superconductivity in restacked TaS2 nanosheets. J. Am. Chem. Soc. 139, 4623–4626 (2017).
18. Ophus, C. Four-dimensional scanning transmission electron microscopy (4D-STEM): From scanning nanodiffraction to ptychography and beyond. Microsc. Microanalysis 25, 563–582 (2019).
19. Kalinin, S. V., Sumpter, B. G. & Archibald, R. K. Big–deep–smart data in imaging for guiding materials design. Nat. Mater. 14, 973–980 (2015).
20. Spurgeon, S. R. et al. Towards data-driven next-generation transmission electron microscopy. Nat. Mater. 20, 274–279 (2021).
21. Aguiar, J. A., Gong, M. L., Unocic, R. R., Tasdizen, T. & Miller, B. D. Decoding crystallography from high-resolution electron imaging and diffraction datasets with deep learning. Sci. Adv. 5, 1949 (2019).
22. Ziatdinov, M. et al. Building and exploring libraries of atomic defects in graphene: Scanning transmission electron and scanning tunneling microscopy study. Sci. Adv. 5, 8989 (2019).
23. Vasudevan, R. K. et al. Mapping mesoscopic phase evolution during E-beam induced transformations via deep learning of atomically resolved images. npj Comput. Mater. 4, 30 (2018).
24. Jesse, S. et al. Big data analytics for scanning transmission electron microscopy ptychography. Sci. Rep. 6, 1 (2016).
25. Ziatdinov, M. et al. Deep learning of atomically resolved scanning transmission electron microscopy images: chemical identification and tracking local transformations. ACS Nano 11, 12742–12752 (2017).
26. Kalinin, S. V. et al. Machine learning in scanning transmission electron microscopy. Nat. Rev. Methods Prim. 2, 11 (2022).
27. Choudhary, K., Gurunathan, R., DeCost, B. & Biacchi, A. Atomvision: A machine vision library for atomistic images. J. Chem. Inf. Modeling 63, 1708–1722 (2023).
28. Wei, J., Blaiszik, B., Scourtas, A., Morgan, D. & Voyles, P. M. Benchmark tests of atom segmentation deep learning models with a consistent dataset. Microsc. Microanalysis 29, 552–562 (2023).
29. Corrias, M. et al. Automated real-space lattice extraction for atomic force microscopy images. Mach. Learn.: Sci. Technol. 4, 015015 (2023).
30. Leitherer, A., Ziletti, A. & Ghiringhelli, L. M. Robust recognition and exploratory analysis of crystal structures via Bayesian deep learning. Nat. Commun. 12, 6234 (2021).
31. Guo, Y. et al. Defect detection in atomic-resolution images via unsupervised learning with translational invariance. npj Comput. Mater. 7, 1–9 (2021).
32. Kalinin, S. V. et al. Deep Bayesian local crystallography. npj Comput. Mater. 7, 1–12 (2021).
33. Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning (MIT Press, 2016). http://www.deeplearningbook.org.
34. Gal, Y. & Ghahramani, Z. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In International Conference on Machine Learning, 1050–1059 (2016).
35. Long, J., Shelhamer, E. & Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3431–3440 (2015).
36. Krizhevsky, A., Sutskever, I. & Hinton, G. E. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, 1097–1105 (2012).
37. Lecun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
38. Kendall, A. & Cipolla, R. Modelling uncertainty in deep learning for camera relocalization. In 2016 IEEE International Conference on Robotics and Automation (ICRA), 4762–4769 (IEEE, 2016).
39. Yang, X., Kwitt, R. & Niethammer, M. Fast predictive image registration. In Deep Learning and Data Labeling for Medical Applications, 48–57 (Springer, 2016).
40. Gal, Y. Uncertainty in deep learning. Ph.D. thesis, University of Cambridge (2016).
41. Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. R. Improving neural networks by preventing co-adaptation of feature detectors. Preprint at https://arxiv.org/abs/1207.0580 (2012).
42. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014).
43. Michelmore, R., Kwiatkowska, M. & Gal, Y. Evaluating uncertainty quantification in end-to-end autonomous driving control. Preprint at https://arxiv.org/abs/1811.06817 (2018).
44. LeBeau, J. M., Findlay, S. D., Allen, L. J. & Stemmer, S. Quantitative atomic resolution scanning transmission electron microscopy. Phys. Rev. Lett. 100, 206101 (2008).
45. LeBeau, J. M., Findlay, S. D., Allen, L. J. & Stemmer, S. Standardless atom counting in scanning transmission electron microscopy. Nano Lett. 10, 4405–4408 (2010).
46. Yu, M., Yankovich, A. B., Kaczmarowski, A., Morgan, D. & Voyles, P. M. Integrated computational and experimental structure refinement for nanoparticles. ACS Nano 10, 4031–4038 (2016).
47. Madsen, J. & Susi, T. The abTEM code: transmission electron microscopy from first principles. Open Res. Eur. 1, 24 (2021).
48. Ophus, C. A fast image simulation algorithm for scanning transmission electron microscopy. Adv. Struct. Chem. Imaging 3, 1–11 (2017).
49. Ghahramani, Z. Probabilistic machine learning and artificial intelligence. Nature 521, 452–459 (2015).
50. Bergstra, J., Yamins, D. & Cox, D. Making a science of model search: hyperparameter optimization in hundreds of dimensions for vision architectures. In Proceedings of the 30th International Conference on Machine Learning, 115–123 (2013).
51. Deringer, V. L. et al. Realistic atomistic structure of amorphous silicon from machine-learning-driven molecular dynamics. J. Phys. Chem. Lett. 9, 2879–2885 (2018).
52. Nord, M., Vullum, P. E., MacLaren, I., Tybell, T. & Holmestad, R. Atomap: a new software tool for the automated analysis of atomic resolution images using two-dimensional Gaussian fitting. Adv. Struct. Chem. Imaging 3, 1–12 (2017).
53. Myronenko, A. & Song, X. Point set registration: Coherent point drift. IEEE Trans. Pattern Anal. Mach. Intell. 32, 2262–2275 (2010).
54. Gatti, A. A. & Khallaghi, S. Pycpd: Pure numpy implementation of the coherent point drift algorithm. J. Open Source Softw. 7, 4681 (2022).
55. Ahmadian, A. et al. Aluminum depletion induced by co-segregation of carbon and boron in a bcc-iron grain boundary. Nat. Commun. 12, 6008 (2021).
56. McInnes, L., Healy, J. & Melville, J. UMAP: Uniform manifold approximation and projection for dimension reduction. Preprint at https://arxiv.org/abs/1802.03426 (2018).
57. Lipton, Z. C. The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery. Queue 16, 31–57 (2018).
58. Murdoch, W. J., Singh, C., Kumbier, K., Abbasi-Asl, R. & Yu, B. Definitions, methods, and applications in interpretable machine learning. Proc. Natl. Acad. Sci. 116, 22071–22080 (2019).
59. Roscher, R., Bohn, B., Duarte, M. F. & Garcke, J. Explainable machine learning for scientific insights and discoveries. IEEE Access 8, 42200–42216 (2020).
60. Dyck, O., Jesse, S. & Kalinin, S. V. A self-driving microscope and the Atomic Forge. MRS Bull. 44, 669–670 (2019).
61. Kalinin, S. V., Borisevich, A. & Jesse, S. Fire up the atom forge. Nature 539, 485–487 (2016).
62. Harris, F. J. On the use of windows for harmonic analysis with the discrete Fourier transform. Proc. IEEE 66, 51–83 (1978).
63. Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).
64. Abadi, M. et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems https://www.tensorflow.org/ (2015).
65. Houlsby, N., Huszár, F., Ghahramani, Z. & Lengyel, M. Bayesian active learning for classification and preference learning. Preprint at https://arxiv.org/abs/1112.5745 (2011).
66. Smith, L. & Gal, Y. Understanding measures of uncertainty for adversarial example detection. Preprint at https://arxiv.org/abs/1803.08533 (2018).
67. Virtanen, P. et al. SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020).
68. Van der Walt, S. et al. scikit-image: image processing in Python. PeerJ 2, e453 (2014).
69. Larsen, A. H. et al. The atomic simulation environment-a Python library for working with atoms. J. Phys. Condens. Matter 29, 273002 (2017).
70. Peng, L.-M., Ren, G., Dudarev, S. & Whelan, M. Debye–Waller factors and absorptive scattering factors of elemental crystals. Acta Crystallogr. Sect. A: Found. Crystallogr. 52, 456–470 (1996).
71. Devulapalli, V., Bishara, H., Ghidelli, M., Dehm, G. & Liebscher, C. Influence of substrates and e-beam evaporation parameters on the microstructure of nanocrystalline and epitaxially grown Ti thin films. Appl. Surf. Sci. 562, 150194 (2021).

Acknowledgements

We acknowledge and thank Matthias Scheffler, Angelo Ziletti, Niels Cautaerts, Donghun Kim, and Christoph Freysoldt for inspiring discussions and suggestions. We acknowledge funding from BiGmax, the Max Planck Society’s Research Network on Big-Data-Driven Materials Science. L.M.G. acknowledges funding from the European Union’s Horizon 2020 research and innovation program, under grant agreements No. 951786 (NOMAD CoE) and No. 740233 (TEC1p). Furthermore, the authors acknowledge the Max Planck Computing and Data facility (MPCDF) for computational resources and support, which enabled neural-network training on 1 GPU (Tesla Volta V100 32GB) on the Talos machine learning cluster. B.C.Y. acknowledges funding from the National Research Foundation (NRF) of Korea under Project Number 2021M3A7C2090586.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Luca M. Ghiringhelli
Present address: Department of Materials Science and Engineering, Friedrich-Alexander Universität Erlangen-Nürnberg, Erlangen, Germany
These authors contributed equally: Andreas Leitherer, Byung Chul Yeo.

Authors and Affiliations

The NOMAD Laboratory at the Fritz-Haber-Institut of the Max-Planck-Gesellschaft and IRIS-Adlershof of the Humboldt-Universität zu Berlin, Berlin, Germany
Andreas Leitherer & Luca M. Ghiringhelli
Department of Energy Resources Engineering, Pukyong National University, Busan, 48513, Republic of Korea
Byung Chul Yeo
Max-Planck-Institut für Eisenforschung, 40237, Düsseldorf, Germany
Christian H. Liebscher
Physics Department and IRIS Adlershof, Humboldt-Universität zu Berlin, Berlin, Germany
Luca M. Ghiringhelli

Authors

Andreas Leitherer
Byung Chul Yeo
Christian H. Liebscher
Luca M. Ghiringhelli

Contributions

A.L. and B.C.Y. contributed equally to this work. A.L., B.C.Y., C.H.L., and L.M.G. conceived the idea and designed the project. A.L. and B.C.Y. performed the neural-network calculations. C.H.L. conducted the multi-slice image simulations. C.H.L. and L.M.G. supervised the project. All authors contributed to the manuscript.

Corresponding authors

Correspondence to Andreas Leitherer, Christian H. Liebscher or Luca M. Ghiringhelli.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Material

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Leitherer, A., Yeo, B.C., Liebscher, C.H. et al. Automatic identification of crystal structures and interfaces via artificial-intelligence-based electron microscopy. npj Comput Mater 9, 179 (2023). https://doi.org/10.1038/s41524-023-01133-1

Received: 22 March 2023
Accepted: 18 September 2023
Published: 02 October 2023
DOI: https://doi.org/10.1038/s41524-023-01133-1