Deep Bayesian Local Crystallography

The advent of high-resolution electron and scanning probe microscopy imaging has opened the floodgates for acquiring atomically resolved images of bulk materials, 2D materials, and surfaces. This plethora of data contains an immense volume of information on materials structures, structural distortions, and physical functionalities. Harnessing this knowledge regarding local physical phenomena necessitates the development of the mathematical frameworks for extraction of relevant information. However, the analysis of atomically resolved images is often based on the adaptation of concepts from macroscopic physics, notably translational and point group symmetries and symmetry lowering phenomena. Here, we explore the bottom-up definition of structural units and symmetry in atomically resolved data using a Bayesian framework. We demonstrate the need for a Bayesian definition of symmetry using a simple toy model and demonstrate how this definition can be extended to the experimental data using deep learning networks in a Bayesian setting, namely rotationally invariant variational autoencoders.

Macroscopic symmetry is one of the central concepts in modern condensed matter physics and materials science. Formalized via point and space group theory, symmetry underpins areas such as structural analysis and serves as the basis for the descriptive formalism of quasiparticles and elementary excitations, phase transitions, and mesoscopic order-parameter-based descriptions, especially of crystalline solids. In macroscopic physics, symmetry concepts arrived with the advent of X-ray methods developed by Bragg, and for almost a century remained the primary and natural language of physics. Notably, the rapid propagation of laboratory X-ray diffractometers and large-scale X-ray scattering facilities provided ample experimental data across multiple material classes and served as a necessary counterpart for theoretical developments. Correspondingly, symmetry-based descriptors have emerged as a foundational element of condensed matter physics and materials science alike.
The natural counterpart of symmetry-based descriptors is the concept of physical building blocks. Thus, crystalline solids can be generally described via a combination of the unit cells with discrete translational lattice symmetries. At the same time, systems such as Penrose structures possess well-defined building blocks but undefined translation symmetry. Finally, a broad range of materials lack translational symmetries, with examples ranging from structural glasses and polymers to ferroelectric and magnetic morphotropic systems. [1][2][3][4][5][6][7][8][9] Remarkably, the amenability of symmetry-based descriptors has led to much deeper insights into the structure and functionalities of materials with translational symmetries compared to (partially) disordered systems. [10][11][12]
The beginning of the 21st century has seen the emergence of real space imaging methods including scanning probe microscopy (SPM) [13][14][15] and especially (scanning) transmission electron microscopy ((S)TEM). [16][17][18] Following the introduction of the aberration corrector in the late '90s [19] and the advent of commercial aberration-corrected microscopes, atomically resolved imaging is now mainstream. Notably, modern STEMs allow atomic columns to be imaged with ~pm-level precision. [20] This level of structural information allows insight into the chemical and physical functionalities of materials, including chemical reactivity, magnetic, and dielectric properties, utilizing structure-property correlations developed by condensed matter physicists from macroscopic scattering data. [21][22][23][24][25][26][27] Over the last decade, several groups have extended these analyses to derive mesoscopic order parameter fields such as polarization, [28][29][30][31] strains and chemical strains, [32] and octahedral tilts [33][34][35] directly from STEM and SPM data.
In several cases, these data can be matched to mesoscopic Ginzburg-Landau models, providing insight into the generative mesoscopic physics of the material. [36,37] Recently, a similar approach was proposed and implemented for theory-experiment matching via microscopic degrees of freedom. [38][39][40]
Yet, despite the wealth of information contained in atomically resolved imaging data, analyses to date were almost invariably based on the mathematical apparatus developed for macroscopic scattering data. However, the nature of microscopic measurements is fundamentally different. For the case of an ideal single crystal containing a macroscopic number of structural units, the symmetry of the diffraction pattern represents that of the lattice, and the width of the peaks in Fourier space is determined by intrinsic factors such as the angular resolution of the measurement system, rather than disorder in the material. The presence of symmetry breaking distortions, such as the transition from a cubic to a tetragonal state, is instantly detectable from peak splitting. For microscopic observations, only a small part of the object is visible and the positions of the atoms are known only within an uncertainty interval; this uncertainty can be comparable to the magnitude of the symmetry breaking feature of interest, such as tetragonality or polarization. Thus, two questions arise: at what image size is it justified to define the symmetry from atomically resolved data, and what level of confidence can be assigned to it? Ideally, such an approach should be applicable not only to structural data, but also to more complex multi-dimensional data sets such as those available in scanning tunneling spectroscopy (STS) [41] in scanning tunneling microscopy (STM), force-distance curve imaging [42] in atomic force microscopy, or electron energy loss spectroscopy (EELS) [43,44] and ptychographic imaging [45][46][47] in scanning transmission electron microscopy (STEM).
Here we propose an approach for the analysis of spatially resolved data based on deep learning in a Bayesian setting. This analysis utilizes the synergy of three fundamental concepts: the (postulated) parsimony of the atomic-level descriptors corresponding to stable atomic configurations, the presence of distortions in the idealized descriptors (e.g., due to local strains or other forms of symmetry breaking), and the presence of possible discrete or continuous rotational symmetries. These concepts are implemented in a workflow combining feature selection (atom finding), a rotationally invariant variational autoencoder to determine symmetry invariant building blocks, and a conditional autoencoder to explore intra-class variability via relevant disentangled representations. This approach is demonstrated for 2D imaging data but can also be generalized to more complex multi-dimensional data sets.

Why local symmetry is Bayesian
Here, we illustrate why a consistent definition of local symmetry properties necessitates the Bayesian framework. As an elementary, but easy to generalize, example, we consider the 1D diatomic chain formed by alternating atoms (1) and (2) [...] does not consider any potential prior knowledge of the system, it implicitly relies on the relevant distributions being Gaussian, and it is sensitive to the choice of an ideal system. A detailed analysis of the relevant drawbacks is given by Kruschke. [48] An alternative approach to these problems is via the Bayesian framework, based on the concept of prior and posterior probabilities linked as: [49,50]
$$p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)} \qquad (1)$$
where D represents the data obtained during the experiment, $p(D \mid \theta)$ represents the likelihood that [...]
As an example, a set of diatomic chains is generated with bond lengths derived from two normal distributions, N(µ = 0.5, σ² = 0.01) and N(µ = 1.5, σ² = 0.01), where µ is the mean and σ the standard deviation of the distribution. These two sets of bond lengths are treated independently and are referred to as odd and even bond lengths, respectively. The likelihood distributions for this case are also assumed to be normal distributions, N(µ = µ1, σ = σ1) and N(µ = µ2, σ = σ2). A total of four parameters, µ1 and σ1 for the odd bond lengths and µ2 and σ2 for the even bond lengths, exhaustively determine the parameter space. We refer to this analysis as case-1.
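The inference for the diatomic chain can be illustrated with a minimal sketch. It assumes a flat prior over a finite grid of (µ, σ) values and evaluates the posterior by brute force; the number of bonds, the grid limits, and the 0.5 threshold used to call the chain "dimerized" are illustrative choices, not values taken from the analysis in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic diatomic chain: odd and even bond lengths drawn from the two
# normal distributions quoted in the text (variance 0.01, i.e. sigma = 0.1).
n_bonds = 50  # assumed sample size
odd = rng.normal(0.5, 0.1, n_bonds)
even = rng.normal(1.5, 0.1, n_bonds)

def grid_posterior(data, mu_grid, sigma_grid):
    """Posterior p(mu, sigma | data) on a grid, assuming a flat prior."""
    mu = mu_grid[:, None, None]
    sigma = sigma_grid[None, :, None]
    # Gaussian log-likelihood summed over the observed bond lengths.
    log_like = np.sum(
        -0.5 * ((data[None, None, :] - mu) / sigma) ** 2
        - np.log(sigma) - 0.5 * np.log(2 * np.pi),
        axis=-1,
    )
    post = np.exp(log_like - log_like.max())  # subtract max for stability
    return post / post.sum()

mu_grid = np.linspace(0.0, 2.0, 401)
sigma_grid = np.linspace(0.02, 0.5, 97)

post_odd = grid_posterior(odd, mu_grid, sigma_grid)
post_even = grid_posterior(even, mu_grid, sigma_grid)

# Marginal posteriors over the means mu1 (odd bonds) and mu2 (even bonds).
p1 = post_odd.sum(axis=1)
p2 = post_even.sum(axis=1)
mu1_mean = np.sum(p1 * mu_grid)
mu2_mean = np.sum(p2 * mu_grid)

# Posterior probability that mu2 - mu1 exceeds an (arbitrary) threshold.
diff = mu_grid[None, :] - mu_grid[:, None]  # entry [i, j] is mu2_j - mu1_i
p_dimerized = np.sum(np.outer(p1, p2)[diff > 0.5])
```

Shortening the chain (smaller `n_bonds`) or bringing the two means closer together broadens the posteriors and lowers `p_dimerized`, which is the sense in which the confidence in a symmetry assignment depends on the amount of data observed.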
The key element of Bayesian inference is the concept of the prior, summarizing the known information on the system. [49][50][51] In experiments, the priors are typically formed semi-quantitatively based on general physical knowledge of the material (e.g., SrTiO3 is known to be cubic with lattice [...] Bayesian in nature, updating the prior knowledge of the system with the experimental data.

Local crystallographic analysis
As a second concept, we discuss established approaches for the systematic analysis of atomic structures from experimental observations and the deep fundamental connections between the intrinsic symmetries present (or postulated) in the data and the neural network architectures.
For example, the classical fully connected multilayer perceptron intrinsically assumes the presence of potential strong correlations between arbitrarily separated pixels of the input image, resulting in a well-understood limitation of these networks to only relatively low-dimensional features.
Convolutional neural networks (CNNs) were introduced as a universal approach for equivariant data analysis where the features of interest can be present anywhere within the image plane. This network architecture implicitly assumes the presence of continuous translational symmetry, similar to the sliding window/transform approach, Figure 2 (a). [52][53][54] While allowing derivation of mesoscopic information, even for atomically resolved data, this approach suffers from inevitable spatial averaging and ignores the existence of well-defined atomic units.
Figure 2. [...] known discrete translational symmetry, and (b) discrete system without translational symmetry and unknown local symmetries.
If the positions of the atomic species can be determined, the analysis can be performed based on the local atomic neighborhoods (local crystallography) [55,56] or the full atomic connectivity graph. In these approaches, the full image is reduced to atomic coordinates and the subsequent analysis is based on the latter. It is important to note that in this case all remaining information in the image plane is ignored, i.e., the full data set is approximated by the point estimates of the atomic positions. Finally, the combined approach can be based on the analysis of sub-images centered on defined atomic positions. [57,58] In this case, the known atomic positions provide the reference points and the sub-images contain information on the structure and functionality around them.
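The sub-image description can be sketched as a simple cropping routine. This is an illustration only: the function name `extract_subimages`, the (row, column) coordinate convention, and the policy of skipping atoms too close to the image edge are assumptions made here, not details given in the text.

```python
import numpy as np

def extract_subimages(image, coords, window):
    """Crop square patches of side `window` centered on atomic positions.

    Atoms whose patch would extend past the image edge are skipped.
    Returns the patch stack and the (rounded) coordinates that were kept.
    """
    half = window // 2
    patches, kept = [], []
    for row, col in np.round(coords).astype(int):
        r0, c0 = row - half, col - half
        r1, c1 = r0 + window, c0 + window
        if r0 < 0 or c0 < 0 or r1 > image.shape[0] or c1 > image.shape[1]:
            continue
        patches.append(image[r0:r1, c0:c1])
        kept.append((row, col))
    if not patches:
        return np.empty((0, window, window)), np.empty((0, 2), dtype=int)
    return np.stack(patches), np.array(kept)

# Toy usage: a random "image" and three hypothetical atom positions.
image = np.random.default_rng(0).random((128, 128))
coords = np.array([[64.2, 64.7], [5.0, 5.0], [100.0, 30.0]])
stack, kept = extract_subimages(image, coords, window=36)
```

The same routine can be applied to the raw image, a smoothed image, or a segmentation output; only the `image` argument changes, while the atomic coordinates continue to serve as reference points.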
For atomic and sub-image-based descriptors, the behavior referenced to the ideal behavior is of interest and is defined by high-symmetry positions or ideal lattice sites. If these are known, then behaviors such as symmetry-breaking distortions can be immediately quantified and explored, Figure 2 (b). However, the very nature of experimental observations is such that this ground truth information is not available directly, necessitating suitable approximations. For example, an ideal lattice can be postulated and average parameters can be found using a suitable filtering method.
However, this approach is sensitive to minute distortions of the image (e.g., due to drift) and image distortion correction is required. Similarly, variability in the observed images due to microscope configurations (mistilt, etc.) can provide observational biases.
These examples illustrate that deep analysis of the structure and symmetry from atomically resolved data sets necessitates simultaneous (a) identification of ideal building blocks and symmetry breaking distortions, while (b) allowing for general rotational invariance in the image plane and (c) accounting for discrete translational symmetry as implemented in the Bayesian setting. Ideally, such descriptors will be referenced to local features, as shown in Figure 2 (c).

Bayesian local crystallography
Here, we aim to combine the local crystallography and Bayesian approaches. The general workflow for deep Bayesian local crystallographic analysis is shown in Figure 3 (a). For the first step, the STEM image or a stack of images is fed into the deep fully convolutional neural network (DCNN) for semantic segmentation and atom finding. [59,60] Semantic segmentation refers to a process where each pixel in the raw experimental data is categorized as belonging to an atom (or to a particular type of atom) or to a "background" (vacuum). The atom finding procedure is then [...]
Note that this sub-image description is chosen since both the original STEM data and DCNN reconstructions contain information beyond atomic coordinates, such as column shapes and unresolved features, and this needs to be taken into account during analysis. It is important to note that the choice of sub-image stack (original image, smoothed image, or DCNN output) defines the type of information that will be explored. For example, DCNN outputs define the probability density that a certain image pixel belongs to a given atom class, which is optimal for exploration of chemical transformation pathways. At the same time, original image contrast may be optimal for exploration of physical phenomena. Finally, we note that an extremely important issue in this analysis is the correction of distortions for effects such as fly-back delays or general image instabilities, which can alleviate unwanted artifacts but can also introduce new ones. Several examples of these will be discussed below. If necessary, these sub-images can be used to further refine the classes using standard methods such as principal component analysis (PCA) or Gaussian mixture modelling (GMM). However, as mentioned above, these clustering methods will tend to separate the atoms into symmetry equivalent positions, leading to over-classification and poorly separable classes.
To avoid this problem, the subsequent step in the analysis is the rotationally invariant variational autoencoder (rVAE). In general, VAE is a directed latent-variable probabilistic [...]
Here, we aim to learn a rotationally invariant code for our data. Unfortunately, standard neural network layers (fully connected and convolutional) do not respect rotational symmetry or invariance. One potential way to circumvent this problem is to use convolutional layers with modified, steerable filters. [63] Another approach, which is specific to the VAE setup, is to disentangle rotations and translations from image content by making the generative model (decoder) explicitly dependent on the coordinates (Figure 3b). [64] In this case, the ELBO is computed as [...] (2) where γ is a latent angle (see Figure 3b) and s_γ is a "rotational prior" set by a user before the optimization. The second and third terms in Eq. (2) [...]
The rVAE analysis of the multiphase system is illustrated in Figure 4. Here, Fig. 4 (a) shows the atomically resolved STEM image of the multiphase (LaxSr1-x)MnO3 (LSMO) - NiO system. The dense NiO inclusions with a rock salt structure in the perovskite LSMO matrix are clearly observed, as visualized in Fig. 4 (c). DCNN allows one to locate virtually all the atomic units in the LSMO matrix and a majority of the atoms in the NiO. The sub-image stack formed from the DCNN output was analyzed by rVAE; Fig. 4 (b) is a representation of the atomic configurations in the latent space of the system. For the window size of 36 pixels used here, there is little variation in either the latent 1 or latent 2 directions.
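The coordinate-dependent decoder idea can be sketched in a few lines. The exact ELBO is not recoverable from the text above, so the form used here is an assumption consistent with its description: a reconstruction term, a Kullback-Leibler term for the content latent variables against a unit normal, and a Kullback-Leibler term for the latent angle against a zero-mean normal whose width is the rotational prior. The function names, the [-1, 1] grid convention, and the Gaussian form of the angle posterior are likewise illustrative.

```python
import numpy as np

def transformed_grid(size, angle, shift):
    """Coordinate grid on [-1, 1]^2, rotated by `angle` and shifted by `shift`.

    In a coordinate-based decoder, these transformed coordinates (together
    with the content latent vector) would be the decoder inputs.
    """
    axis = np.linspace(-1.0, 1.0, size)
    xx, yy = np.meshgrid(axis, axis)
    grid = np.stack([xx.ravel(), yy.ravel()], axis=1)  # (size * size, 2)
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    return grid @ rot.T + np.asarray(shift, dtype=float)

def kl_normal(mean, std, prior_std):
    """KL( N(mean, std^2) || N(0, prior_std^2) ), summed over dimensions."""
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    return np.sum(
        np.log(prior_std / std)
        + (std**2 + mean**2) / (2.0 * prior_std**2)
        - 0.5
    )

def elbo(recon_error, z_mean, z_std, ang_mean, ang_std, rot_prior):
    """Assumed ELBO: -reconstruction error - KL(content) - KL(angle)."""
    return (
        -recon_error
        - kl_normal(z_mean, z_std, 1.0)
        - kl_normal(ang_mean, ang_std, rot_prior)
    )

# Toy usage with placeholder encoder outputs.
grid = transformed_grid(size=36, angle=np.pi / 2, shift=(0.1, -0.05))
value = elbo(1.0, z_mean=[0.0, 0.0], z_std=[1.0, 1.0],
             ang_mean=0.0, ang_std=0.1, rot_prior=0.1)
```

In this form, a small `rot_prior` penalizes large encoded angles, whereas a broad one lets the network absorb rotations into the angle variable rather than into the content latent variables.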
The encoded angle, Fig. 4 (d), shows a clear checkerboard pattern in the LSMO phase and is uniform in the NiO phase. The offset maps shown in Fig. 4 (e,f) are relatively featureless but contain horizontal lines that are attributed to minute scanning non-idealities. The spatial maps of the latent components are shown in Fig. 4 (d-h). An interesting contrast behavior is observed in the latent space: latent parameter 1 shows a clear variation between the NiO and LSMO phases but little variation within each phase. Latent parameter 2, Fig. 4 (h), exhibits the checkerboard pattern of the LSMO perovskite lattice but is relatively featureless within the NiO phases. These results can be understood by examining the variable histograms (Supplementary Fig. S1). The encoded angle is clearly split between two peaks and the latent space histograms also indicate a separation of features. The offsets, however, form single peaks that account for the lack of strong features observed in Fig. 4 (e,f).
The effect of varying the window size is shown in Supplementary Fig. S2 [...] notable difference compared to the raw analysis is a distinct transition from the perovskite to the rock-salt structure in the sub-image representation in the 2D latent parameter space in the latent variable 1 direction, as shown in Fig. S3 (b). The perovskite phase of the first latent variable is almost featureless but is strongly differentiated from the NiO phase. While the histogram of the encoded angle in Fig. S4 is split into two peaks, as shown in Fig. S5, increasing or decreasing the window size leads to the encoded angle collapsing into small variations about a single value. To further extend this analysis, we note that the rVAE often tends to disentangle dissimilar types of distortions within a system. For example, experiments with a large number of different STEM images (beyond those shown in this paper) illustrate that scan distortions often tend to be described by one (group of) latent variables, whereas systematic changes in the local structure are described by the remaining latent variables. This property of VAEs is generally well known in computer science applications such as style networks; however, here we see that it applies to physical systems as well.
We further explore this separation of atomic units based on neighborhood behavior using disentangled representations. As observed in Fig. 4, the angle and latent variable 2 seem to offer the optimal 2D basis to separate the atomic units, with clear contrast and a lack of distortion behaviors. The corresponding distribution and KDE plots are shown in Fig. 5 [...]
To get further insight into the materials structure, we explore the disentangled representations of the structural building blocks using the conditional variational autoencoder (cVAE) approach. A schematic of the cVAE is shown in Figure 3 (c). Here, the autoencoder approach is applied to the concatenated image stack (or its reduced representation) and the class labels. In this manner, the latent space encodes both the sub-images and labels. At the decoding stage, the reconstructed object is drawn from the combination of the latent variables and the label.
A typical example of a cVAE application is the disentanglement of styles in the MNIST data set. Whereas a simple VAE will draw all the numbers and distribute them in the latent space, the cVAE will draw the selected number, and the latent space representations will reflect writing styles, e.g., tilt, line width, etc. The key aspect of using the cVAE approach, as opposed to VAE analysis of individual classes, is that the disentangled styles will be common across the data set, reminiscent of hierarchical Bayesian models.
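The conditioning step can be sketched as plain array concatenation. The array sizes, the one-hot label encoding, and the random stand-ins for the sub-images and encoder outputs are assumptions for illustration; the encoder and decoder networks themselves are omitted.

```python
import numpy as np

def one_hot(labels, n_classes):
    """Integer class labels -> one-hot rows."""
    out = np.zeros((len(labels), n_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out

rng = np.random.default_rng(0)
n, window, n_classes, latent_dim = 8, 16, 3, 2   # illustrative sizes

subimages = rng.random((n, window, window))      # stand-in sub-image stack
labels = rng.integers(0, n_classes, size=n)      # stand-in class labels
y = one_hot(labels, n_classes)

# Encoder input: flattened sub-image concatenated with its label.
encoder_in = np.concatenate([subimages.reshape(n, -1), y], axis=1)

# Decoder input: latent vector concatenated with the same label.
z = rng.normal(size=(n, latent_dim))             # placeholder for encoder output
decoder_in = np.concatenate([z, y], axis=1)
```

Because the label is supplied explicitly at both stages, the latent vector does not need to encode class identity and is free to capture the variability shared across classes.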
As an example of cVAE analysis, shown in Figure 6 [...]
We can extend this analysis to a system with a significantly more complex lattice such as the Sr3Fe2O7 (SFO) layered perovskite. Sr3Fe2O7 is a mixed valence Ruddlesden-Popper series compound with double perovskite structure that nominally features tetravalent iron. Charge disproportionation to Fe(III) and Fe(V) was observed by Mössbauer spectroscopy. [65,66] Spiral magnetic order was observed by neutron diffraction [67] and provides a rare example of a magnetic cycloid arising from ferromagnetic nearest-neighbor exchange competing with antiferromagnetic next-nearest-neighbor exchange. [68] Further interest in this material arises from high oxygen mobility. [69] The preparation of a near stoichiometric compound requires high oxygen partial pressure. [70]
The rVAE analysis of SFO is shown in Figure 7. The original STEM image, Fig. 7 (a), clearly illustrates the layered structure of SFO. Of most interest is the encoded angle, Fig. 7 (d), which shows three separate values, one down the center of the layers and a different value on either edge of the layers. The histogram of the encoded angle is shown in Fig. S7, where three peaks are clearly present. The outer peaks differ by approximately π radians, which is consistent with the edges of the layers experiencing a 180° rotation in their local configuration. The x-offset in Fig. 7 (e) is relatively featureless, which is consistent with the histogram shown in Fig. S7. The y-offset in Fig. 7 (f), however, clearly identifies the boundary between layers, with its histogram displaying two peaks. The two latent spaces in Fig. 7 (g-h) exhibit a gradual change in intensity from left to right corresponding to the sample thickness variation, which is also observed in the raw STEM image (Fig. 7 (a)). The corresponding histograms all have flattened peaks corresponding to this gradual change. Similar to observations for a 2-phase system, these behaviors are now disentangled and can be explored separately.
We observed a similar separation for other STEM images where scan distortions, e.g., due to fly-back delays, were clearly concentrated in a single latent variable.
The choice of window size is crucial for extracting some of these features. The effect of using a smaller and larger window size on the rVAE process is shown in Fig. S8. For a smaller window size, all the histograms exhibit single peaks with a restricted angular range. The latent spaces, however, still reflect thickness variations in the sample. For a larger window size, the encoded angle histogram exhibits only two peaks and only a single value is observed at the boundary of the layer. For completeness, this analysis was also performed on the DCNN segmented images and the results are shown in Fig. S8-S9. The behavior of the encoded angles shown in Fig. S8 is similar to that shown in Fig. 7. However, the latent variables do not reflect the thickness variation: the first latent variable, Fig. S8 (g), is in fact quite featureless, whereas the second latent variable strongly highlights the boundary between layers. Unlike the raw STEM image results, when a smaller window size is used on the segmented data, the histogram of the encoded angle retains three peaks, as observed in Fig. S10. The histogram for the second latent variable is clearly split between two values and this is reflected in the corresponding spatial representation. For the larger window size, the encoded angle collapses to two peaks as before, but both latent variables strongly show the layered structure.
Clustering analyses of the latent representations for SFO are shown in Figure 8. Fig. 8 (a) shows the label map for the GMM clusters shown in Fig. 8 (b), where the clustering is performed over angles and the second latent variable. Note the presence of three well-separated atomic groups that are consistent with the three peaks in the encoded angle histograms previously discussed.
Examination of components 1 and 3, shown in Fig. 8 (c) and (e), respectively, shows that they are near mirror images of each other. This is consistent with a 180° difference in the corresponding peaks in the histograms of the encoded angles. The second component, shown in Fig. 8 (d), is approximately centrosymmetric and corresponds to the center of the layered structure.
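Clustering over the encoded angle and a latent variable can be illustrated on synthetic data. The sketch below is not the analysis code used for Figure 8: it uses a small hand-written expectation-maximization fit of a diagonal-covariance Gaussian mixture, and the cluster centers, spreads, and sample sizes are hypothetical values chosen so that the outer angle peaks differ by π.

```python
import numpy as np

def fit_gmm(x, k, n_iter=100):
    """EM for a diagonal-covariance Gaussian mixture; returns labels and means."""
    n, d = x.shape
    # Deterministic initialization: means at evenly spaced quantiles of
    # the points sorted along the first feature.
    order = np.argsort(x[:, 0])
    means = x[order[np.linspace(0, n - 1, 2 * k + 1).astype(int)[1::2]]]
    var = np.tile(x.var(axis=0), (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities, normalized in log space for stability.
        diff = x[:, None, :] - means[None, :, :]
        log_p = (np.log(weights)
                 - 0.5 * np.sum(np.log(2 * np.pi * var), axis=1)
                 - 0.5 * np.sum(diff**2 / var, axis=2))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ x) / nk[:, None]
        diff = x[:, None, :] - means[None, :, :]
        var = np.einsum("nk,nkd->kd", resp, diff**2) / nk[:, None] + 1e-6
    return resp.argmax(axis=1), means

# Synthetic (angle, latent 2) pairs for three hypothetical atomic groups.
rng = np.random.default_rng(0)
centers = np.array([[-np.pi / 2, 0.5], [0.0, -1.0], [np.pi / 2, 0.5]])
features = np.concatenate(
    [c + rng.normal(scale=[0.1, 0.2], size=(100, 2)) for c in centers]
)

gmm_labels, gmm_means = fit_gmm(features, k=3)
outer_gap = gmm_means[:, 0].max() - gmm_means[:, 0].min()
```

Mapping `gmm_labels` back onto the atomic coordinates would give a label map of the kind described for Fig. 8 (a), and averaging the sub-images within each label would give the corresponding class-averaged components.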
We compare the analysis of the raw images with the DCNN outputs. As mentioned above, the raw data represents the true variability of the STEM image contrast but also includes a high noise level. In comparison, the DCNN output is the semantically segmented image, i.e., the probability density that a specific image pixel belongs to the atom. [...]
[...] Dr. Karren More for careful reading and correcting the manuscript.

Materials and methods:
Thin film growth. The LSMO-NiO VAN and the single-phase LSMO and NiO films were grown on STO(001) single-crystal substrates by PLD using a KrF excimer laser (λ = 248 nm) with a fluence of 2 J/cm² and a repetition rate of 5 Hz. All films were grown at 200 mTorr O2 and 700 ℃. The films were post-annealed in 200 Torr of O2 at 700 ℃ to ensure full oxidation, and cooled down to room temperature at a cooling rate of 20 ℃/min. For out-of-plane transport measurements, the films were grown on 0.5% Nb-doped STO(001) single-crystal substrates. The film composition was varied by using composite laser ablation targets with different compositions.

Sample preparation
A polycrystalline rod of Sr3Fe2O7-x, 6 mm in diameter and 50 mm in length, was prepared using powders synthesized by solid state reaction of stoichiometric SrCO3 and Fe2O3 at 1100 °C. The single crystalline material utilized here was grown using a high pressure floating zone furnace with an O2 partial pressure of 148 bar. Refinement of neutron diffraction data obtained at the NOMAD instrument of the Spallation Neutron Source using GSAS-II [71] revealed a single-phase material with an oxygen content of 6.8, see Supplementary Fig. S12 and Table S1.

STEM:
The plan-view STEM samples of LSMO-NiO were prepared using ion milling after mechanical thinning and precision polishing. In brief, a thin film sample was first ground, and then dimpled and polished to a thickness of less than 20 micrometers from the substrate side. The sample was then transferred to an ion milling chamber for further substrate-side thinning. The ion beam energy and milling angle were adjusted towards lower values during the thinning process, which was stopped when an open hole appeared for STEM characterization. The Sr3Fe2O7 sample(s) were prepared by FIB lift out followed by local low energy Ar ion milling, down to 0.5 eV, in a Fischione NanoMill.
The STEM used for the characterization of both samples was a Nion UltraSTEM200 operated at 200 kV. The beam illumination half-angle was 30 mrad and the inner detector half-angle was 65 mrad. Electron energy-loss spectra were obtained with a collection half-angle of 48 mrad.

Figure S7: Histograms of encoded angle, offsets, and latent spaces for the Sr3Fe2O7 system shown in Fig. 7 using a window size of 32 pixels.