Causal analysis of competing atomistic mechanisms in ferroelectric materials from high-resolution Scanning Transmission Electron Microscopy data

Machine learning has emerged as a powerful tool for the analysis of mesoscopic and atomically resolved images and spectroscopy in electron and scanning probe microscopy, with the applications ranging from feature extraction to information compression and elucidation of relevant order parameters to inversion of imaging data to reconstruct structural models. However, the fundamental limitation of machine learning methods is their correlative nature, leading to extreme susceptibility to confounding factors. Here, we implement the workflow for causal analysis of structural scanning transmission electron microscopy (STEM) data and explore the interplay between physical and chemical effects in ferroelectric perovskite across the ferroelectric-antiferroelectric phase transitions. The combinatorial library of the Sm-doped BiFeO3 is grown to cover the composition range from pure ferroelectric BFO to orthorhombic 20% Sm-doped BFO. Atomically resolved STEM images are acquired for selected compositions and are used to create a set of local compositional, structural, and polarization field descriptors. The information-geometric causal inference (IGCI) and additive noise model (ANM) analysis are used to establish the pairwise causal directions between the descriptors, ordering the data set in the causal direction. The causal chain for IGCI and ANM across the composition is compared and suggests the presence of common causal mechanisms across the composition series. Ultimately, we believe that the causal analysis of the multimodal data will allow exploring the causal links between multiple competing mechanisms that control the emergence of unique functionalities of morphotropic materials and ferroelectric relaxors.

to cover the composition range from pure ferroelectric BFO to orthorhombic 20% Sm-doped BFO.
Atomically resolved STEM images are acquired for selected compositions and are used to create a set of local compositional, structural, and polarization field descriptors. The informationgeometric causal inference (IGCI) and additive noise model (ANM) analysis are used to establish the pairwise causal directions between the descriptors, ordering the data set in the causal direction.
The causal chain for IGCI and ANM across the composition is compared and suggests the presence of common causal mechanisms across the composition series. Ultimately, we believe that the causal analysis of the multimodal data will allow exploring the causal links between multiple competing mechanisms that control the emergence of unique functionalities of morphotropic materials and ferroelectric relaxors.
Functionality of material systems such as morphotropic phase boundary systems, [1][2][3][4] ferroelectric relaxors, [5][6][7][8] spin and cluster glasses, [9][10][11][12] charge ordered manganites, [13][14][15][16][17] are determined by the complex interplay between structural, orbital, chemical, spin and other degrees of freedom. 18,19 Traditionally, these materials system has been explored via the combination of macroscopic physical property measurements and scattering techniques, with the theoretical counterpart being provided via combination of analytical and numerical methods. Given that the physics of these materials is ultimately linked to the emergence of frustrated degenerate ground states driven by competing interactions, analyses based on the macroscopically averaged descriptors such as concentrations, order parameter fields, etc. provide only limited insight into the generative and especially causal physics of these materials systems.
Progress in the high-resolution imaging techniques have allowed visualization of these materials systems to the atomic level. Techniques such as Scanning Tunneling Microscopy (STM) have provided insight into electronic structure of surfaces and superconductive and magnetic order parameters. 20,21 Scanning Transmission Electron Microscopy (STEM) enabled studies of chemical composition down to the single atom level [22][23][24] and, via quantitative mapping of structural distortions, enabled visualization of order parameter fields such as polarization [25][26][27][28] , tilts [29][30][31] , and mechanical 25,[32][33][34][35] and chemical [35][36][37] strains. However, this emergence of data brings the challenge of analysis of systems with multiple spatially distributed degrees of freedom, including determination of both the functional laws connecting the functionalities and structure and the causal links that define the cause and effect relationship in the non-stationary and non-ergodic systems.
Recently, machine learning (ML) has emerged as a powerful tool for the analysis of mesoscopic and atomically resolved images and spectroscopy in electron and scanning probe microscopy. 38-41 The applications ranging from feature extraction 42 to information compression and elucidation of relevant order parameters 43 to inversion of imaging data to reconstruct structural models have been demonstrated. However, the fundamental limitation of the vast majority of machine learning methods is their correlative nature, leading to extreme susceptibility to confounding factors and observational biases. 44,45 While in classical statistical methods methodologies to address confounder-or selective bias induced phenomena such as Simpson paradox are established, 46 the complex and often non-transparent nature of modern machine learning tools such as deep neural networks renders them extremely prone to misinterpretation.
We pose that correlative machine learning provides a reliable and powerful tool in cases when the causal links are well established, as is atom finding in SPM and STEM and analysis of 4D STEM data when this condition is satisfied. Notably, ML applications in theory generally fall under this category since the causal links are postulated. Alternatively, ML methods work well when the confounding factors are effectively frozen via the narrowness of experimental conditions or experimental system. However, both these conditions are violated for experimental studies, when causal relationships are known only partially (and are in fact often the target of study) and confounding and observational bias factors (composition uncertainty, microscope tuning, contaminations) are abundant.
One approach to explore the generative physical models from the microscopic data is based on the fit to the relevant mesoscopic or atomistic models, i.e. discovering the generative physical models. On the mesoscopic level, direct match between the solution between Ginzburg-Landau equations and order parameter fields determined from the atomically resolved data can be used to determine the interface terms via corresponding boundary conditions, 47 as well as the nature of coupling and gradient terms. 39 Statistical distance minimization can be used to directly match the discrete data to lattice model, e.g. to reconstruct the interaction parameters. [48][49][50][51] However, even when the functional laws describing the system are known, this level of the description is not sufficient to establish the causal mechanisms active in the system. As a simple example, the knowledge of the ideal gas law does not establish whether pressure is cause or an effect of the volume change unless the character of the process is established.
In many cases, it can be argued that the causal effects can be estimated based on the energy scales of corresponding phenomena, e.g. magnetic properties driven by relatively weak energy scales are unlikely to affect atomic structure. However, this is not the case when the energy scales are comparable, or when depolarization and global effects become significant. For example, while the magnetization energy density per volume can be small, concentration of magnetically induced mechanical stresses can lead to stress corrosion at the domain walls. Specifically, for ferroelectric materials it is generally assumed that cationic order is frozen at the state of material formation, and then polarization field evolves to accommodate average polarization instability and local pinning.
However, it is known that ions can redistribute to compensate polarization, with examples including segregation at the domain walls, memory effects, etc. 2,52 Hence, for non-equilibrium and non-ergodic materials the question of cause and effect become paramount. For example, does polarization align to the cationic disorder or does polarization instability at the morphotropic phase boundaries drive the cationic disorder?
More generally, being able to answer causal questions is required both for meaningful applications of machine learning techniques and inferring the materials physics, since causal knowledge allows avoiding correlative, but incorrect conclusions, explore counterfactuals and interventions, 53 i.e. realistic strategies for materials development. Correspondingly, we argue that analyzing causal relationships from the observations of atomically resolved degrees of freedom is key for understanding the physics of non-ergodic systems.
Here, we implement the workflow for causal analysis of STEM data and explore the interplay between physical and chemical effects in ferroelectric perovskite across the ferroelectricantiferroelectric phase transitions. The combinatorial library of the Sm -doped BiFeO3 is grown to cover the composition range from pure ferroelectric BFO to orthorhombic 20% Sm-doped BFO. [54][55][56][57] Atomically resolved STEM images are acquired for selected compositions and are used to create a set of local compositional, structural, and polarization field descriptors. The  The TEM samples were prepared for three sites along the gradient composition sample with nominal compositions of 0%, 7% and 20% Sm doping (see methods). STEM data was collected from the [100] pseudocubic zone axis using High-Angle Annular Dark Field (HAADF) detector imaging as shown in Fig. 2, thereby providing the projected atomic structure as well as compositional information by the atomic column intensity (which scales by ~Z 2 ). As the concentration of Sm increases through the sample series there is an observed phase transition from the prototypical rhombohedral ferroelectric phase 58 of BiFeO3 ( Fig. 2A) to an antiferrodistortive orthorhombic phase 59 at 20% Sm (Fig. 2C). The transition is readily observable in maps of the polar atomic displacement between the A-site and B-site cation sublattices, P, which is shown for the three compositions in Figure 2.
For the pure BiFeO3 phase P is a proxy for the electrical dipole moment and the distribution in Fig  To describe the local materials behaviors, we parametrize the data choosing the perovskite unit cell as the basis. We introduce a set of descriptors for the local material behavior based on the properties and distribution of the 5 cation atomic columns in each unit cell. These are outlined in Table 1, along with the corresponding calculations. Unit-cell parameters are defined from local neighborhood atomic columns of HAADF STEM data corresponding to a five cation perovskitetype cell: corner A-sites A1, A2, A3, A4 and central B-site B1 (labels in Fig 2A). Parameters include structural descriptors from positional data regarding unit cell size and shape (a, b, a/b, θ, Vol), compositional information from atomic HAADF intensity & distribution (I1, I2, I3, I4, I5), and electrical polarization information from non-centrosymmetric displacement of the A-and B-site sublattices (P). Examples for several of these descriptors for a HAADF STEM unit cell are illustrated in Figure 2A. We here define a and b as the two lattice vectors of the unit cell connecting A-site corner positions, their internal angle θ, magnitude ratio a/b, and total cell volume Vol. I1 corresponds to the mean atomic column intensity and scales with the sample mass-thickness. I2-I5 correspond to internal asymmetries. Notably, I5 corresponds to the intensity ratio between cation sublattices, thus readily distinguishes the SmxBi1-xFeO3 film and SrTiO3 substrate. Our choice of basis also includes several internal gradient terms including A-site intensity asymmetries in I2-I4 and the gradient of the a and b vector between opposed edges of the unit cell (denoted as abΔ).
Moreover, additional gradient terms can be derived using a larger basis or calculated across multiple neighbor cells.  Here, we note that under some general conditions including macroscopic uniformity, this relationship can be simplified to yield the local non-linear relationship between state variables, with the non-local effects being represented via the unknown mean local fields. These nonlinear and non-local partial differential equations can be linearized around the specific ground state to give the linear relationship between the observed and non-observable parameters. Secondly, we note that in the presence of the strong composition fluctuations and nanodomains, the local fields will be the superposition of slowly varying (on the atomic level) depolarization fields and disorder related fields. Here, we aim to explore the causal relationships from the observed descriptor fields. The initial insight into the statistical properties of this material system can be deduced from the analysis of joint pairwise distributions as shown in Figure 4. Figure  Here we explore two step approach for analysis of the possible causal relationships between the STEM observables. First, the causal directions are analyzed for all pairs of variables to yield pairwise causal relationships and represented as a "causal sieve" matrix. By construction, the matrix is antisymmetric with 1 and -1 elements. Secondly, the properties of graph which adjacency matrix is given by the "causal sieve" are explored.
To describe causal direction for two variables, we use and compare the informationgeometric causal inference (IGCI) 61 and the additive noise model (ANM). 62 The IGCI method is based on the assumption of the independence of the 'cause' distribution and the conditional distribution of the 'effect' given the cause. 61,63,64 It can be shown, using an empirical slope-based estimator, 65 and vice versa, where the (xj, yj) pairs are ordered ascendingly according to x in the first term and according to y in the second term. IGCI was first assumed to be applicable only to noise-free observations where Y = f(X) and X = f −1 (Y ) but was later shown (empirically) to work on noisy data as well. 64 The equation (2) was used for all the IGCI-based analysis of the cause-effect pairs in the current paper.
As a second pairwise causality check, we use the additive noise model (ANM) estimator for finding a causal direction from the observed data. The simple idea behind the ANM method is that the effect is a function of its cause plus a noise term independent of the cause. 66 In the ANM one performs the non-linear regression fitting, first for X on Y and then for Y on X, and calculates the difference between test scores for the independence of residuals in both cases. The negative difference value implies that X causes Y, while the positive value implies that Y causes X.
For our analysis, we used a Gaussian process (GP) regressor with the squared exponential kernel. Because the exact inference of the GP regressor parameters is intractable for the datasets with ~10 4 points, we used the inducing points-based sparse GP approximation 67 with variational free energy (VFE) inference method, as implemented in Pyro's probabilistic programming language. 68 The inducing points were selected uniformly from the observation data points with a step of ~20. In addition to the GP regressor, we also added an option for choosing a 2-layer neural network as a regressor (see the accompanying Jupyter notebook). The independence of residuals test was done by calculating the Hilbert-Schmidt Independence Criterion (HSIC) 69 with the Gaussian kernel whose width was set to the median distance between points in input space.
The IGCI typically takes advantage of some specific features of the dataset, whereas the ANM tend to yield good results as long as the additive assumption holds. 70 Before applying to the experimental observations, both IGCI and ANM methods were first tested on the publicly available database of labeled cause-effect pairs 71 and the resultant accuracy in predictions (~64% and ~66%, respectively) was comparable to the results reported for the same database in the machine learning literature. 53 We then calculate a matrix of pairwise dependencies for the list of descriptors derived from the experimental descriptions. The causal sieve matrices for three explored compositions are shown in Figure 5 for a selected subset of variables to enable ease of visualization. The analysis of the full set of structural, chemical, and polarization parameters is available in the accompanying notebook. Here, the positive 1 value means that the column value is identified as the cause, whereas the row variable is the effect, Col -> Row. Interestingly, the structures of the IGCI causal matrices in the studied cases are always such that for N matrix one row contains N positive entries, another row contains importance" than I1. In all three cases, the (I1, Vol) variables are followed by polarization components (Px, Py), differential chemical composition I5, and finally by tetragonality a/b. The more direct way to visualize and compare the causal relationship is by arranging the observables as the causal chains as shown in Figure 6 for all three compositions. As seen in Figure   6, the dependence chain has significant overlaps between different compositions and analysis via IGCI and ANM. For IGCI, the chemical variables such as local composition and molar volume are clearly higher in the causal chain. In addition, the polarization components are arranged as (Py, The polarization effects are secondary, and differential chemical contrast and tetragonality are the weakest. The ANM analysis results are more difficult to interpret; here we argue that functional relationship between the variables are fundamentally different in dissimilar phases and therefore GP interpolation approach produces fundamentally different responses. This behavior will be explored in the future.
Overall, we note that optimization and discovery of new materials as well as understanding of fundamental physical mechanisms can be significantly accelerated if the causality of corresponding mechanisms can be understood, allowing to explore counterfactuals and interventions and avoiding correlative but incorrect conclusions. The fundamental physics offers a large set of knowledge on functional relationship between the materials parameters; however, real material systems often are characterized by only partially known physics or presence of nonequilibrated non-ergodic processes. We expect that in these cases causal analysis can provide the knowledge of cause and effect relationships necessary for materials optimization, design, and especially discovery.

Acknowledgements:
This effort (electron microscopy, feature extraction) is based upon work supported by the U.S.

Materials:
The combinatorial library of Sm -doped BiFeO3 and the SrRuO3 layer were both fabricated through pulsed laser deposition (PLD). Specifically, after reaching the base pressure (~ 2.0  10 -8 Torr) of the deposition chamber, the SrTiO3 (001) substrate was heated up to 600 C, and an oxygen flow was introduced to the chamber to maintain a desired deposition pressure (~ 100 mTorr). A laser energy density of ~ 0.8 J/cm 2 and an ablation frequency of 20 Hz were adopted for the deposition of the films. During the deposition of the Bi1-xSmxFeO3 layer, a BiFeO3 target and a SmFeO3 target were alternatively ablated, and a shadow mask was controlled to move accordingly to obtain a uniform composition gradient across the substrate.

Methods:
TEM samples were prepared by FIB liftout and local low energy Ar ion milling, down to 0.5 keV, in a Fischione NanoMill. STEM was performed at 200kV on a NION UltraSTEM. The three compositions were imaged consecutively to help maintain consistent imaging/microscope conditions. A correction algorithm was applied to correct for slow-scan axis scanning aberrations by reconstruction from two orthogonal source images according to 72  The causal data analysis is available in a form of executable Google Colab notebook at https://colab.research.google.com/github/ziatdinovmax/Notebooks-for-papers/blob/master/ferroicscausal-analysis.ipynb