Imaging Atomic-Scale Chemistry from Fused Multi-Modal Electron Microscopy

Efforts to map atomic-scale chemistry at low doses with minimal noise using electron microscopes are fundamentally limited by inelastic interactions. Here, fused multi-modal electron microscopy offers high signal-to-noise ratio (SNR) recovery of material chemistry at nanometer and atomic resolution by coupling correlated information encoded within both elastic scattering (high-angle annular dark field (HAADF)) and inelastic spectroscopic signals (electron energy loss (EELS) or energy-dispersive X-ray (EDX)). By linking these simultaneously acquired signals, or modalities, the chemical distribution within nanomaterials can be imaged at significantly lower doses with existing detector hardware. In many cases, the dose requirements can be reduced by over one order of magnitude. This high-SNR recovery of chemistry is tested against simulated and experimental atomic-resolution data of heterogeneous nanomaterials.


Introduction
Modern scanning transmission electron microscopes (STEM) can focus sub-angstrom electron beams on and between atoms to quantify structure and chemistry in real space from elastic and inelastic scattering processes. The chemical composition of specimens is revealed by spectroscopic techniques produced from inelastic interactions in the form of energy dispersive X-rays (EDX) [1,2] or electron energy loss (EELS) [3,4]. Unfortunately, high-resolution chemical imaging requires high doses (e.g., $>10^6$ e/Å$^2$) that often exceed the specimen limits, resulting in chemical maps that are noisy or missing entirely [5,6]. Substantial effort and cost to improve detector hardware has brought the field closer to the measurement limits set by inelastic processes [7,8]. Direct interpretation of atomic structure at higher SNR is provided by elastically scattered electrons collected in a high-angle annular dark field (HAADF) detector; however, this signal underdescribes the chemistry [9]. Reaching the lowest doses at the highest SNR ultimately requires fusing both elastic and inelastic scattering modalities.
Currently, detector signals (such as HAADF and EDX/EELS) are analyzed separately for insight into structural, chemical, or electronic properties [10]. Correlative imaging disregards shared information between structure and chemistry and misses opportunities to recover useful information. Data fusion, popularized in satellite imaging, goes further than correlation by linking the separate signals to reconstruct new information and improve measurement accuracy [11][12][13]. Successful data fusion designs an analytical model that faithfully represents the relationship between modalities and yields a meaningful combination without imposing any artificial connections [14].
Here we introduce fused multi-modal electron microscopy, a technique offering high-SNR recovery of nanomaterial chemistry by linking correlated information encoded within both HAADF and EDX/EELS. We recover chemical maps by reformulating the inverse problem as a nonlinear optimization which seeks solutions that accurately match the actual chemical distribution in a material. Our approach substantially improves SNRs for chemical maps, often by around 300-500%, and can reduce doses by over one order of magnitude while remaining consistent with the original measurements. We demonstrate the approach on EDX/EELS datasets at sub-nanometer and atomic resolution. Moreover, fused multi-modal electron microscopy recovers a specimen's relative concentration, allowing researchers to measure local stoichiometry with less than 15% error without any knowledge of the inelastic cross sections. Convergence and uncertainty estimates are identified along with simulations that provide ground-truth assessment of when and how this approach can fail.

Principles of Multi-Modal Electron Microscopy
Fused multi-modal electron microscopy recovers chemical maps by solving an optimization problem that seeks a solution which (1) strongly correlates with the high-SNR HAADF modality, (2) strongly correlates with the chemically sensitive spectroscopic modality (EELS and/or EDX), and (3) is sparse in the gradient domain, producing solutions with reduced spatial variation. The overall optimization problem is:

$$\underset{x_i \ge 0}{\arg\min} \;\; \frac{1}{2} \Big\| b_H - \sum_i (Z_i x_i)^{\gamma} \Big\|_2^2 \; + \; \lambda_1 \sum_i \left( \mathbf{1}^T x_i - b_i^T \log(x_i + \varepsilon) \right) \; + \; \lambda_2 \sum_i \| x_i \|_{\mathrm{TV}} \qquad (1)$$

where $\lambda$ are regularization parameters, $b_H$ is the measured HAADF, $b_i$ and $x_i$ are the measured and reconstructed chemical maps for element $i$, $Z_i$ is the atomic number of element $i$, $\varepsilon$ herein prevents $\log(0)$ issues but can also account for background, the log is applied element-wise to its arguments, superscript $T$ denotes vector transpose, and $\mathbf{1}$ denotes the vector of $n_x n_y$ ones, where $n_x \times n_y$ is the image size.
The three terms in (1) define our multi-modal approach to surpass traditional dose limits for chemical imaging. First, we assume a forward model where the simultaneous HAADF is a linear combination of elemental distributions ($x_i^{\gamma}$, where $\gamma \in [1.4, 2]$). The incoherent linear imaging approximation for elastic scattering scales with atomic number as $Z_i^{\gamma}$, where $\gamma$ is typically around 1.7 [15][16][17]. This $\gamma$ is bounded between 2 for Rutherford scattering from bare nuclear potentials and 4/3 as described by Lenz-Wentzel expressions for electrons experiencing a screened Coulombic potential [18,19]. Second, we ensure the recovered signals maintain a high degree of data fidelity with the initial measurements by minimizing the negative log-likelihood for spectroscopic measurements dominated by low-count Poisson statistics [20,21]. In a higher-count regime, this term can be substituted with a simple least-squares error. Lastly, we utilize channel-wise total variation (TV) regularization to enforce a sparse gradient magnitude, which reduces noise by promoting image smoothness while preserving sharp features [22]. This sparsity constraint, popularized by the field of compressed sensing (CS), is a powerful yet minimal prior toward recovering structured data [23,24]. When implementing, each of these three terms can and should be weighted by an appropriately selected coefficient that balances their contributions. All three terms are necessary for accurate recovery (Supplementary Figure 1).
Figure 1 demonstrates high-SNR recovery for EDX signals of commercial cobalt sulfide (CoS) nano-catalysts for oxygen-reduction applications, a unique class with the highest activity among non-precious metals [25]. Figure 1a illustrates the model that links the two modalities (EDX and HAADF) simultaneously collected in the electron microscope.
The low detection rate for characteristic X-rays is due to minimal emission (e.g., over 50% for Z > 32 and below 2% for Z < 11) and collection yield (< 9%) [26]. For high-resolution EDX, the low count rate yields a sparse chemical image dominated by shot noise (Fig. 1b). However, the fused multi-modal chemical map virtually eliminates noise (Fig. 1d) and recovers chemical structure without a loss of resolution, including the nanoparticle core and oxide shell interface. The chemical maps produced by fused multi-modal EM quantitatively agree with the expected stoichiometry: the specimen core contains a relative concentration of 39±1.6%, 42±2.5%, and 13±2.4%, and the exterior shell a composition of 26±2.8%, 11±2.0%, and 54±1.3%, for Co, S, and O respectively. The dose for this dataset was approximately $10^5$ e Å$^{-2}$ and a 0.7 sr EDX detector was used; however, these quantitative estimates remained consistent when the dose was reduced to ∼$10^4$ e Å$^{-2}$.
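To make the three terms of Eq. 1 concrete, the following is a minimal NumPy sketch of how the cost could be evaluated for a stack of elemental maps. It is an illustration under stated assumptions, not the authors' implementation: the function names (`tv_isotropic`, `fused_cost`), the forward-difference discretization of total variation, the default weights, and the array layout `(n_elements, ny, nx)` are all hypothetical choices, and the HAADF model uses the $\sum_i (x_i Z_i)^{\gamma}$ form quoted later in the text.

```python
import numpy as np

def tv_isotropic(img):
    """Isotropic total variation of a 2D image using forward differences
    (an assumed discretization; the last row/column difference is zero)."""
    dx = np.diff(img, axis=1, append=img[:, -1:])
    dy = np.diff(img, axis=0, append=img[-1:, :])
    return np.sum(np.sqrt(dx ** 2 + dy ** 2))

def fused_cost(x, b, b_H, Z, gamma=1.7, lam1=1.0, lam2=1.0, eps=1e-8):
    """Evaluate the three cost terms for chemical maps x.

    x, b : arrays of shape (n_elements, ny, nx) (reconstructed / measured maps)
    b_H  : array of shape (ny, nx) (measured HAADF)
    Z    : atomic numbers, one per element
    Returns the weighted total and the three unweighted terms.
    """
    Z = np.asarray(Z, dtype=float).reshape(-1, 1, 1)
    # Model term: HAADF as a sum of (Z_i x_i)^gamma over elements.
    model = 0.5 * np.sum((b_H - np.sum((Z * x) ** gamma, axis=0)) ** 2)
    # Data fidelity: Poisson negative log-likelihood (up to a constant).
    fidelity = np.sum(x - b * np.log(x + eps))
    # Regularization: channel-wise total variation.
    tv = sum(tv_isotropic(xi) for xi in x)
    return model + lam1 * fidelity + lam2 * tv, (model, fidelity, tv)
```

In practice the measured maps and the HAADF image would need to be normalized to comparable scales before the weights `lam1` and `lam2` become meaningful; that normalization is left out of this sketch.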

High-SNR Recovery of Nanomaterial Chemistry
Fused multi-modal electron microscopy accurately recovers chemical structure down to atomic length scales, demonstrated here for EELS spectroscopic signals. EELS-derived chemical maps for Co$_{3-x}$Mn$_x$O$_4$ ($x = 1.49$) high-performing supercapacitor nanoparticles [27] are substantially improved by fused multi-modal electron microscopy in Figure 2. This composite Co-Mn oxide was designed to achieve a synergy between cobalt oxide's high specific capacitance and manganese oxide's long life cycle [27,28]. While the Co$_{3-x}$Mn$_x$O$_4$ nanoparticle appears chemically homogeneous in the HAADF projection image along the [100] direction (Fig. 2c), core-shell distinctions are hinted at in the raw EELS maps (Fig. 2b). Specifically, these nanoparticles contain a Mn-rich center with a Co shell and a homogeneous distribution of O. However, the raw EELS maps are excessively degraded by noise, preventing analysis beyond a rough assessment of specimen morphology. The multi-modal reconstructions (Fig. 2d) confirm the crystalline Co-rich shell and map the Co/Mn interface in greater detail (Fig. 2e). In the presence of cobalt and manganese, the HAADF image lacks noticeable contrast from oxygen; the resulting oxygen map lacks detail and benefits mostly from regularization.
Figure 3 exhibits fused multi-modal electron microscopy at atomic resolution on copper sulfide heterostructured nanocrystals with zinc sulfide caps, which have potential applications in photovoltaic devices or battery electrodes [29]. The copper sulfide properties are sensitive to the Cu-S stoichiometry and crystal structure at the interface between ZnS and Cu$_{0.64}$S$_{0.36}$. Figure 3 shows high-resolution HAADF and EELS characterization of a heterostructure Cu$_{0.64}$S$_{0.36}$-ZnS interface. Fused multi-modal electron microscopy maps out the atomically sharp Cu$_{0.64}$S$_{0.36}$-ZnS interface and reveals step edges between the two layers. The labeled points on the RGB chemical overlay (Fig. 3d) show the chemical ratios produced by multi-modal EM for the Cu$_{0.64}$S$_{0.36}$ and ZnS regions, values which are consistent with the reported growth conditions. Figure 3e shows the algorithm convergence for each of the three terms in the optimization function (Eq. 1); smooth and asymptotic decay is an indicator of reliable reconstruction. Refer to Supplementary Figure 2 for an additional demonstration at the atomic scale on an ordered manganite system.
Fused multi-modal imaging of Fe and Pt distributions from inelastic multislice simulations (Fig. 4) provides ground truth solutions to validate recovery at atomic resolution under multiple scattering conditions of an on-axis ∼8 nm nanoparticle. Here, we applied Poisson noise (Fig. 4b) corresponding to an electron dose of ∼$10^9$ e Å$^{-2}$ to produce chemical maps with noise levels resembling experimental atomic-resolution EELS datasets (SNR 5). We estimated SNR improvements by measuring peak-SNR for the noisy and recovered chemical maps [30]. Qualitatively, the recovered chemical distributions (Fig. 4c) match the original images. Fig. 4d illustrates agreement of the line profiles: the atom column positions and relative peak intensities of the ground truth and the multi-modal reconstruction are almost identical.
Simulating EELS chemical maps is computationally demanding, as every inelastic scattering event requires propagation of an additional wavefunction [31,32], scaling faster than the cube of the number of beams, $O(N^3 \log N)$. Inelastic transition potentials of interest (in this case the $L_{2,3}$ Fe and $M_{4,5}$ Pt edges) were calculated from density functional theory (see Methods). Long computation times (nearly 4,000 core-hours) result from a large number of outgoing scattering channels corresponding to the many possible excitations in a sample. For this reason, there is little precedent for inelastic image simulations. We reduced the runtime by utilizing the PRISM STEM-EELS approximation, achieving over a ten-fold speedup (see Methods) [33]. Future work may explore the effects of smaller ADF collection angles with increased coherence lengths and crystallographic contrast [15,34], or thicker specimens where electron channeling becomes more concerning [35,36].
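For readers who want to reproduce a peak-SNR comparison of this kind on their own maps, a minimal sketch follows. The exact definition used in the paper is the one given in its reference [30]; the version below is the common textbook form (peak of the reference squared over the mean squared error, in decibels) and should be treated as an assumption, as should the function name `psnr`.

```python
import numpy as np

def psnr(reference, image):
    """Peak signal-to-noise ratio in dB, taking the reference maximum as the peak.
    (Assumed textbook definition; the paper's exact convention may differ.)"""
    reference = np.asarray(reference, dtype=float)
    image = np.asarray(image, dtype=float)
    mse = np.mean((reference - image) ** 2)
    if mse == 0:
        return np.inf  # identical images
    return 10.0 * np.log10(reference.max() ** 2 / mse)
```

An SNR improvement can then be reported as the difference between `psnr(truth, recovered)` and `psnr(truth, noisy)`, which requires a ground truth and is therefore only available for simulated data.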

Quantifying Chemical Concentration
Fused multi-modal electron microscopy can produce stoichiometrically meaningful chemical maps without specific knowledge of inelastic cross sections. Here, the ratio of pixel values in the reconstructed maps quantifies elemental concentration. We demonstrate quantifiable chemistry on experimental metal oxide thin films with known stoichiometry: NiO [37] and ZrO$_2$. A histogram of intensities from the recovered chemical maps is fit with Gaussian distributions to determine the average concentration. The recovered pixel values highlighted in Figure 5 followed a single Gaussian distribution where the Zr and Ni concentrations are centered about 35±5.8% and 50±2.9%. In both cases, the average Zr and Ni relative concentration is approximately equivalent to the expected ratio from the crystal stoichiometry: 33% and 50%. The CoS nanoparticle in Fig. 1 follows a bi-modal distribution for the core and shell phases (Supplementary Figure 5). We found that measuring stoichiometry is robust across a range of $\gamma$ values close to 1.7. In cases where $\gamma$ is far off (e.g., $\gamma = 1.0$), the quantification is systematically incorrect (Supplementary Figure 6).
We further validate stoichiometric recovery on a synthetic gallium oxide crystal (Fig. 5) where two overlapping Ga and O thin films of equal thickness have a stoichiometry of Ga$_2$O$_3$. The simulated HAADF signal is proportional to $\sum_i (x_i Z_i)^{\gamma}$, where $x_i$ is the concentration for element $i$ and $Z_i$ is the atomic number. As shown by the histogram, the simulated results agree strongly with the prior knowledge and successfully recover the relative Ga concentration. The Gaussian distribution is centered about 40±0.4% when the ground truth is 40%. The inset shows convergence plots.
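The quantification step described above, taking per-pixel ratios of the recovered maps and summarizing them with a Gaussian, can be sketched as follows. This is an illustration only: the helper names are hypothetical, the synthetic Ga and O maps are made-up stand-ins with a 2:3 ratio, and the Gaussian "fit" here is the maximum-likelihood estimate (sample mean and standard deviation) rather than a curve fit to a binned histogram as the text describes.

```python
import numpy as np

def relative_concentration(maps, eps=1e-12):
    """maps: (n_elements, ny, nx) -> per-pixel fractions that sum to ~1 over elements."""
    maps = np.asarray(maps, dtype=float)
    return maps / (maps.sum(axis=0, keepdims=True) + eps)

def gaussian_summary(values):
    """Maximum-likelihood Gaussian parameters (mean, standard deviation)."""
    values = np.asarray(values, dtype=float).ravel()
    return values.mean(), values.std()

# Synthetic stand-in for recovered maps of a Ga2O3 film (2:3 ratio plus small noise).
rng = np.random.default_rng(0)
ga = 2.0 + 0.02 * rng.standard_normal((32, 32))
o = 3.0 + 0.02 * rng.standard_normal((32, 32))

conc = relative_concentration(np.stack([ga, o]))
mu, sigma = gaussian_summary(conc[0])
print(f"Ga relative concentration: {100 * mu:.1f} +/- {100 * sigma:.1f} %")
```

On real data one would restrict the statistics to pixels inside the specimen, since ratios in vacuum regions are dominated by noise.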
Figure 5. Pixel intensity histograms for experimental Zr (green) and Ni (blue) and synthetic Ga (red) concentration maps. The standard deviation (σ) for each element is reported. The raw and reconstructed EDX maps are illustrated inside of the plot. Ground truth concentrations are highlighted by the respective colored triangles above the top axis. Stable convergence for the three components in the cost function, model term (orange), data fidelity (magenta), and regularization (turquoise), is illustrated in the inset. Qualitatively the convergence is identical for all three example datasets. Zr and Ni scale bars: 5, 10 nm, respectively.
We estimate a stoichiometric error of less than 15% for most materials based on the relative concentration's standard deviation (±7%) added in quadrature with the variation of solutions (±6%). Although the algorithm shows stable convergence, the overall quantitative conclusions are slightly sensitive to the selection of hyperparameters. We estimate that incorrect selection of hyperparameters could result in a variation of roughly ±6% from the correct prediction in stoichiometry even when the algorithm converges (convergence shown in Supplementary Figures 8-9). This error is comparable to estimating chemical concentrations directly from EELS/EDX spectral maps from the ratio of scattering cross section against core-loss intensity [38]. However, traditional approaches require accurate knowledge of all experimental parameters (e.g., beam energy, specimen thickness, collection angles) and accurate calculation of the inelastic cross-section to typically provide errors of roughly 5-10% [39].
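As a quick check of the error budget quoted above, the quadrature sum of the two contributions can be written out explicitly. The symbols $\sigma_{\mathrm{conc}}$ and $\sigma_{\mathrm{sol}}$ are labels introduced here for the two quoted values and do not appear in the original text.

```latex
\sigma_{\mathrm{total}}
  = \sqrt{\sigma_{\mathrm{conc}}^{2} + \sigma_{\mathrm{sol}}^{2}}
  = \sqrt{(7\%)^{2} + (6\%)^{2}}
  = \sqrt{85}\,\%
  \approx 9.2\%
```

The combined value sits below the 15% figure stated in the text, which therefore reads as a conservative bound rather than the quadrature sum itself.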

Influence of Electron Dose
To better understand the accuracy of fused multi-modal electron microscopy at low doses, we performed a quantitative study of the normalized root-mean-square error (RMSE) in concentration for a simulated 3D core-shell nanoparticle (CoS core, CoO shell). Figure 6 shows the fused multi-modal reconstruction accuracy across a wide range of HAADF and chemical SNR. The simulated projection images were generated by a simple linear incoherent imaging model of the 3D chemical compositions highlighted in Fig. 6d; here the probe's depth of focus is much larger than the object. Random Poisson noise corresponding to different electron dose levels was applied to vary the SNR across each pixel.
Overall, the RMSE simulation map (Fig. 6a) shows the core-shell nanoparticle chemical maps are accurately recovered at low doses (HAADF SNR 4 and chemical SNR 2); however, they become less accurate at extremely low doses. The RMSE map for multi-modal reconstruction shows a predictably continuous degradation in recovery as signals diminish. The degraded and reconstructed chemical maps for various noise levels are highlighted in Figure 6b. The Co map closely mirrors the Z-contrast observed in HAADF (not shown) simply because it is the heaviest element present. Usually researchers will perform spectroscopic experiments in the top right corner of Fig. 6a (e.g., HAADF SNR > 20, chemical SNR > 3), which, for this simulation, provides accurate recovery.
In actual experiments, the ground truth is unknown and RMSE cannot be calculated to assess fused multi-modal electron microscopy. However, we can estimate accuracy by calculating an average standard error of our recovered image from the Hessian of our model (see Methods). The standard error reflects uncertainty at each pixel in a recovered chemical map by quantifying the neighborhood size for similar solutions (Supplementary Figure 10). The average standard error across all pixels in a fused multi-modal image provides a single-value metric of the reconstruction accuracy (see Methods). Figure 6c shows that RMSE and average standard error correlate, especially at higher doses (SNR > 10).
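A dose study of this kind can be mimicked on any noise-free test image by drawing Poisson counts at different mean count levels and scoring the result against the ground truth. The sketch below is a simplified stand-in, not the simulation used in the paper: the disk phantom, the helper names, the use of mean counts per pixel as a proxy for dose, and the choice to normalize RMSE by the ground-truth dynamic range are all assumptions.

```python
import numpy as np

def add_poisson_noise(image, mean_counts, rng):
    """Scale a non-negative image so its mean equals `mean_counts`, draw Poisson
    counts, and scale back to the original intensity units."""
    image = np.asarray(image, dtype=float)
    scale = mean_counts / image.mean()
    return rng.poisson(image * scale) / scale

def normalized_rmse(truth, estimate):
    """RMSE divided by the dynamic range of the ground truth (assumed convention)."""
    truth = np.asarray(truth, dtype=float)
    rmse = np.sqrt(np.mean((truth - np.asarray(estimate, dtype=float)) ** 2))
    return rmse / (truth.max() - truth.min())

# Hypothetical phantom: a bright disk on a weak background.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[-16:16, -16:16]
truth = np.where(xx ** 2 + yy ** 2 < 100, 1.0, 0.2)

low_dose = add_poisson_noise(truth, mean_counts=2, rng=rng)
high_dose = add_poisson_noise(truth, mean_counts=200, rng=rng)
print(normalized_rmse(truth, low_dose), normalized_rmse(truth, high_dose))
```

Sweeping `mean_counts` separately for the HAADF and chemical images, and reconstructing at each pair, would produce a two-dimensional error map analogous in spirit to the one described for Fig. 6a.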

Discussion
While this paper highlights the advantages of multi-modal electron microscopy, the technique is not a black-box solution. Step sizes for convergence and weights on the terms in the cost function (Eq. 1) must be reasonably selected. This manuscript illustrates approaches to assess the validity of concentration measurements using confidence estimation demonstrated across several simulated and experimental material classes. Standard spectroscopic pre-processing methods become ever more critical in combination with multi-modal fusion. Improper background subtraction of EELS spectra or overlapping characteristic X-ray peaks, which normally cause inaccurate stoichiometric quantification, also reduce the accuracy of fused multi-modal imaging.
Fused multi-modal electron microscopy offers little advantage in recovering chemical maps for elements with insignificant contrast in the HAADF modality. This property is limiting for analyzing specimens with low-Z elements in the presence of heavy elements (e.g., oxygen and lutetium). Future efforts could resolve this challenge by incorporating an additional complementary elastic imaging mode where light elements are visible, such as annular bright field (ABF) [40]. However, in some instances, fused multi-modal electron microscopy may recover useful information for under-determined chemical signals. For example, in a Bi$_{0.35}$Sr$_{0.18}$Ca$_{0.47}$MnO$_3$ (BSCMO) system [41], only the Ca, Mn, and O EELS maps were obtained, yet multi-modality remarkably improves the SNR of the measured maps despite missing two elements (Supplementary Figure 2).
Although fused multi-modal chemical mapping appears quite robust at nanometer or sub-nanometer resolution, we found atomic-resolution reconstructions can be challenged by spurious atom artifacts which require attention. However, this is easily remedied by down-sampling to frequencies below the first Bragg peaks and analysing a lower resolution chemical map. Alternatively, recovery with minimal spurious atom artifacts is achieved when lower resolution reconstructions are used as an initial guess (Supplementary Figure 11).
In summary, we present a model-driven data fusion algorithm that substantially improves the quality of electron microscopy spectroscopic maps at nanometer to atomic resolutions by using both elastic and inelastic signals. From these signals, or modalities, each atom's chemical identity and coordination provides essential information about the performance of nanomaterials across a wide range of applications, including clean energy, batteries, and opto-electronics, among many others. In both synthetic and experimental datasets, multi-modal electron microscopy shows quantitatively accurate chemical maps with values that reflect stoichiometry. This approach not only improves SNR but also opens a pathway for low-dose chemical imaging of radiation-sensitive materials. Although demonstrated herein for common STEM detectors (HAADF, EDX, and EELS), this approach can be extended to many other modalities, including pixel array detectors, annular bright field, ptychography, and low-loss EELS. One can imagine a future where all scattered and emitted signals in an electron microscope are collected and fused for maximally efficient atomic characterization of matter.

Electron Microscopy
Simultaneously acquired EELS and HAADF datasets were collected on a fifth-order aberration-corrected Nion UltraSTEM microscope operated at 100 keV with a probe semi-angle of roughly 30 mrad and collection semi-angles of 80-240 mrad and 0-60 mrad for HAADF and EELS, respectively. Both specimens were imaged at 30 pA, for a dwell time of 10 ms (Fig. 3) and 15 ms (Fig. 2), receiving a total dose of $3.25 \times 10^4$ and $7.39 \times 10^4$ e/Å$^2$, respectively. The EELS signals were obtained by integration over the core-loss edges, all of which was done after background subtraction. The background EELS spectra were modeled using a linear combination of power laws implemented using the open-source Cornell Spectrum Imager software [6].
Simultaneously acquired EDX and HAADF datasets were collected on a Thermo Fisher Scientific Titan Themis G2 at 200 keV with a probe semi-angle of roughly 25 mrad, HAADF collection semi-angle of 73-200 mrad, and 0.7 sr EDX solid angle. The CoS specimen was imaged at 100 pA and 40 µs dwell time for 50 frames, receiving a total dose of approximately $2 \times 10^5$ e/Å$^2$. The initial chemical distributions were generated from EDX maps using commercial Velox software that produced initial net count estimates (however, atomic percent estimates are also suitable).

Fused Multi-Modal Recovery
Here, fused multi-modal electron microscopy is framed as an inverse problem expressed in the following form:

$$\hat{x} = \underset{x \ge 0}{\arg\min} \;\; \Psi_1(x) + \lambda_1 \Psi_2(x) + \lambda_2 \mathrm{TV}(x)$$

where $\hat{x}$ is the final reconstruction, and the three terms are described in the main manuscript (Eq. 1). When implementing an algorithm to solve this problem, we concatenate the multi-element spectral variables ($x_i$, $b_i$) as a single vector: $x, b \in \mathbb{R}^{n_x n_y n_i}$, where $n_i$ denotes the total number of reconstructed elements.
The optimization problem is solved by a combination of gradient descent with total variation regularization. We solve this cost function by descending along the negative gradient directions for the first two terms and subsequently evaluating the isotropic TV proximal operator to denoise the chemical maps [42]. The gradients of the first two terms are:

$$\nabla_x \Psi_1(x) = -\gamma \, \mathrm{diag}\!\left(x^{\gamma - 1}\right) A^T \left(b_H - A x^{\gamma}\right), \qquad \nabla_x \Psi_2(x) = \mathbf{1} - b \oslash (x + \varepsilon)$$

where $\oslash$ denotes point-wise division. Here, the first term in the cost function, relating the elastic and inelastic modalities, has been equivalently re-written as $\Psi_1 = \frac{1}{2} \| b_H - A x^{\gamma} \|_2^2$, where $A \in \mathbb{R}^{n_x n_y \times n_x n_y n_i}$ expresses the summation of all elements as matrix-vector multiplication. Evaluation of the TV proximal operator is in itself another iterative algorithm. In addition, we impose a non-negativity constraint since negative concentrations are unrealistic. We initialize the first iterate with the measured data ($x_i^0 = b_i$), an ideal starting point as it is a local minimum for $\Psi_2$.
The inverse of the Lipschitz constant ($1/L$) is an upper bound of the step size that can theoretically guarantee convergence. From Lipschitz continuity, we estimated the step size for the model term's gradient ($\nabla \Psi_1$) as: $1/L_{\nabla \Psi_1} \le 1/\left(\|A\|_1 \|A\|_\infty\right) = 1/n_i$. The gradient of the Poisson negative log-likelihood ($\Psi_2$) is not Lipschitz continuous, so its descent parameter cannot be pre-computed [43]. We heuristically determined the regularization parameters by starting with values of a similar order of magnitude to $1/L_{\nabla \Psi_1}$, then iteratively reducing them until the cost function exhibited stable convergence. The regularization parameters were manually selected; however, future work may allow automated optimization by the L-curve method or cross-validation [44].
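The step-size bound $1/n_i$ follows directly from the structure of the summation matrix $A$, and can be checked numerically on a small example. In the sketch below, the helper `summation_matrix` and the toy sizes are hypothetical; it assumes the elemental maps are concatenated element after element, so that $A$ is a row of $n_i$ identity blocks.

```python
import numpy as np

def summation_matrix(n_pixels, n_elements):
    """A of shape (n_pixels, n_pixels * n_elements): sums the element maps pixel
    by pixel when the maps are concatenated element after element."""
    return np.kron(np.ones((1, n_elements)), np.eye(n_pixels))

A = summation_matrix(n_pixels=6, n_elements=3)
norm_1 = np.linalg.norm(A, 1)          # maximum absolute column sum
norm_inf = np.linalg.norm(A, np.inf)   # maximum absolute row sum
step = 1.0 / (norm_1 * norm_inf)
print(norm_1, norm_inf, step)
```

Each column of this $A$ contains a single one and each row contains $n_i$ ones, which is where the product $\|A\|_1 \|A\|_\infty = n_i$ comes from. For the special case $\gamma = 1$, the Lipschitz constant of $\nabla \Psi_1$ is the largest eigenvalue of $A^T A$, which also equals $n_i$ for this matrix, so the bound is tight there.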

Estimating Standard Error of Recovered Chemical Maps
Using estimation theory, we can approximate the uncertainty in a recovered chemical image for unbiased estimators with the model's (Eq. 1) Hessian, expressed as $H(x) = \nabla_x^2 \Psi_1(x) + \nabla_x^2 \Psi_2(x)$, where the two terms are the Hessians of the model and data-fidelity terms, respectively. Calculation of the standard error follows the Cramér-Rao inequality, which provides a lower bound given by $\mathrm{Var}(x_j) \ge \left[H^{-1}(x)\right]_{jj}$ [45], where $\mathrm{Var}(x)$ are variance maps for the recovered chemical distributions ($x$) and subscript $jj$ denotes indices along the diagonal elements. We determined this lower bound from an empirical derivation of the Fisher Information Matrix. From the variance we thus extract standard error maps, $\text{Standard Error} = \sqrt{\mathrm{Var}(x)}$, as demonstrated in Supplementary Figure 10. The average standard error denotes the mean value of all pixels in the standard error map. Note that the TV regularizer reduces noise and may introduce bias due to smoothing, so the standard error measurements could potentially be lower; our Fisher information derivation provides an upper bound on uncertainty.
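For reference, the two Hessian terms can be derived by differentiating $\Psi_1 = \frac{1}{2}\|b_H - A x^{\gamma}\|_2^2$ and $\Psi_2 = \mathbf{1}^T x - b^T \log(x + \varepsilon)$ twice with respect to $x$. The expressions below are a derivation from those definitions, offered as a sketch; they are not necessarily written in the same form as the authors' original equations, and the shorthand $D$ and the symbol $\odot$ (point-wise product) are introduced here for compactness.

```latex
\nabla_x^2 \Psi_1(x)
  = \gamma^{2}\, D\, A^{T} A\, D
  \;-\; \gamma(\gamma - 1)\,
    \mathrm{diag}\!\left( x^{\gamma - 2} \odot A^{T}\!\left(b_H - A x^{\gamma}\right) \right),
  \qquad D = \mathrm{diag}\!\left(x^{\gamma - 1}\right)

\nabla_x^2 \Psi_2(x)
  = \mathrm{diag}\!\left( b \oslash (x + \varepsilon)^{2} \right)
```

The second term of $\nabla_x^2 \Psi_1$ is proportional to the HAADF residual, so it becomes small when the reconstruction reproduces the measured HAADF well, leaving the first (positive semi-definite) term to dominate.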

Inelastic Scattering Simulations for Atomic Imaging
The inelastic scattering simulations for the FePt nanoparticle structure (Fig. 4) were performed using the abTEM simulation code [46], using the algorithm described in [33]. In this algorithm the initial STEM probe is propagated and transmitted to some depth into the specimen using the scattering matrix method described in the PRISM algorithm [47]. Next, the inelastic transition potentials of interest (in this case the $L_{2,3}$ Fe and $M_{4,5}$ Pt edges) were calculated and applied using the methods given in [48,49], using the GPAW density functional theory code [50]. Finally, a second scattering matrix is used to propagate the inelastically scattered electrons through the sample and to the plane of the EELS entrance aperture. The elastic signal channels were calculated with the conventional PRISM method using the same parameters.
The atomic structure used in the simulations was a portion of the FePt nanoparticle structure determined from atomic electron tomography [51]. After cropping out 1/4 of nanoparticle coordinates, the boundaries were padded by 5 Å total vacuum. The STEM probe's convergence semiangle was set to 20 mrad and the voltage to 200 kV. The multislice steps used a slice thicknesses of