We report the development and experimental implementation of automated experiment workflows for identifying the best predictive channel for a phenomenon of interest in spectroscopic measurements. The approach combines ensembled deep kernel learning for probabilistic predictions with a basic reinforcement learning policy for channel selection. It identifies which of the available observational channels, sampled sequentially, are most predictive of selected behaviors and hence have the strongest correlations with them. We implement this approach for multimodal imaging in piezoresponse force microscopy (PFM), with the behaviors of interest manifesting in piezoresponse spectroscopy. We show that, in the model samples, the best predictive channel for both the polarization-voltage and frequency-voltage hysteresis loop areas is amplitude. The same workflow and code are applicable to any multimodal imaging and local characterization method.
Multimodal imaging methods underpin multiple areas of fundamental and applied sciences. Conventional intermittent contact mode atomic force microscopy yields topographic, phase, and error signals that highlight different aspects of surface structure1,2,3. In combination with detection modes such as electrostatic4,5,6, magnetic7,8,9, and Kelvin probe force microscopy10,11,12,13,14, these techniques offer multiple channels carrying information on dissimilar aspects of materials functionality. In optical imaging in biology, specific dyes are used to highlight different elements of cell structure and are visualized with different color filters or, in hyperspectral methods, different spectral ranges. In energy-dispersive electron microscopy and electron energy loss spectroscopy (EELS)15,16, different energy ranges highlight concentrations of individual elements17.
In many cases, imaging is used to define objects of interest for more detailed studies18,19,20,21. In scanning probe microscopy (SPM), the structural or functional images can be used to select locations for force-distance or current-voltage measurements22, or locations for local sampling for chemical studies. In optical and scanning electron microscopy, the imaging data can be used to select locations for, e.g., nanoindentation23. In mass spectrometry, the sampling points are often selected based on optical or SPM imaging24,25. This paradigm of imaging followed by selection of specific location(s) for detailed studies is common across physical, chemical, and biological imaging. Currently, these studies are often guided by human operator intuition via a classical point-and-click approach. However, in this case the process is slow and heavily biased by operator experience and expectations. An alternative approach is that of dense grid-based measurements, such as force-volume26, piezoresponse spectroscopy, piezoresponse nonlinearity measurements in SPM27,28,29,30,31, hyperspectral EELS measurements in scanning transmission electron microscopy (STEM)32, photoluminescence lifetime measurement in optical microscopy19, or electron diffraction measurement in electron microscopy29. However, grid measurements tend to be time-consuming and are often limited or impossible in circumstances where the probe or the sample degrades rapidly with measurements.
An alternative in multimodal imaging is thus naturally of interest, enabling a spectroscopy workflow within an automated experiment framework. In this framework, the locations for spectroscopic studies are selected based on the features of interest in the multimodal image. Here, the direct problem (performing measurements at a known location of interest) can be handled via (by now) standard computer vision algorithms. For example, we can choose specific objects, such as domain walls or molecules, to identify locations for detailed spectroscopic measurements18,21,33,34,35,36.
However, the inverse problem, namely discovering the features of interest in the right channel (e.g., the topography, piezoresponse, or conductivity image channel) that are best predictive of behaviors of interest, is poorly amenable to human operation. For example, we aim to discover which microstructural element has the best predictive capacity for a functional property encoded in the polarization hysteresis loop or resonance frequency hysteresis loop, such as maximal loop area, imprint bias, or more complex functionals of the loop shape. For unimodal imaging, this approach has recently been demonstrated for STEM-EELS, 4D STEM, and band excitation piezoresponse spectroscopy (BEPS)27,32,37,38. In these studies, we discovered which features in image space are most predictive of the specific functionalities determined via spectral measurements, for example localization of the hysteresis loops with the maximal area at specific domain walls or emergence of low energy plasmons at the edges of 2D material flakes.
Here, we develop an experimental framework for the automated discovery of the channel in multimodal imaging that best predicts the behavior of interest within a spectroscopic data set. Traditionally, such analysis is based on physical intuition using a priori expected physical relationships. However, this approach often leads to significant operator biases and precludes the discovery of the phenomena of interest. We illustrate the framework using piezoresponse force microscopy (PFM), a method that allows multichannel imaging and an extensive set of spectroscopies39. However, the approach is universal and applies to other forms of multimodal imaging.
As model systems, we explored three thin film samples: lead titanate (PTO)40, lead zirconate titanate (PZT), and bismuth ferrite (BFO), all grown on SrRuO3 layers. Band excitation piezoresponse force microscopy (BEPFM) measurements were performed on these films to investigate their domain structure; the results are shown in Fig. 1. The PTO thin film shows both 180° ferroelectric domain structures, visible as dark and bright domains in the phase image (Fig. 1b), and non-180° ferroelastic domain structures, visible as dark and bright stripe domains in the amplitude image (Fig. 1a). The ferroelastic domains exhibit different strain and elastic properties due to the variation in crystallographic orientation, resulting in visible domain contrast in the resonance frequency image (Fig. 1c). The topography image (Fig. 1d) also shows the ferroelastic domain features. In contrast, the PZT thin film exhibits only non-180° ferroelastic domain structures, which appear in the BEPFM amplitude (Fig. 1e), phase (Fig. 1f), resonance frequency (Fig. 1g), and topography (Fig. 1h) images. The BFO film predominantly shows 180° ferroelectric domain structure (Fig. 1i, j), and the domain wall contrast is also visible in the resonance frequency image (Fig. 1k). Notably, a few ferroelastic domains with weak contrast also appear in the amplitude image (Fig. 1i).
Multiple-channel deep kernel learning
Next, we perform a multiple-channel deep kernel learning (DKL) measurement utilizing ensembles of DKL models and a basic reinforcement learning policy. Earlier, we showed how combining a structured Gaussian process with the epsilon-greedy policy allows one to learn a correct model of the system’s behavior and use it to drive the exploration of the configuration space41,42. However, that approach is limited to low-dimensional spaces and is not suitable for structure-property relationship problems in multimodal imaging. Here we use DKL43, a hybrid of a neural network and a Gaussian process, to circumvent the dimensionality problem. As the fully Bayesian implementation of DKL is computationally too slow for real-time feedback and control, we approximate it with ensembles of DKL models44. In this setup, each neural network in the ensemble is initialized independently, resulting in different embeddings connected to separate Gaussian processes, and the final prediction for each channel is an ensemble average.
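The ensemble averaging described above can be illustrated with a minimal sketch for a single prediction point. The function name `ensemble_predict` and its inputs are hypothetical, and the variance rule (average model variance plus the spread of the model means) is an assumed, commonly used way of combining ensemble members; the source states only that the final prediction is an ensemble average.

```python
from statistics import fmean, pvariance


def ensemble_predict(means, variances):
    """Combine per-model predictions for one point.

    means, variances: one entry per ensemble member (hypothetical inputs).
    Returns (ensemble mean, combined variance).
    """
    mu = fmean(means)
    # Assumed mixture rule: average model variance plus spread of the means.
    var = fmean(variances) + pvariance(means, mu)
    return mu, var


# Three hypothetical ensemble members predicting the same patch.
mu, var = ensemble_predict([1.0, 2.0, 3.0], [0.1, 0.2, 0.3])
```

Under this assumed rule, members that disagree with each other raise the combined uncertainty even when each member is individually confident.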
The process of channel learning with ensemble-DKL is shown in Fig. 2a, b. The BEPFM images including amplitude, phase, frequency, and topography are used as four possible input channels. Each image is featurized by splitting it into patches that are used as inputs. The behavior of interest is encoded in polarization or resonance frequency hysteresis loops for each patch as a scalar target. Here we use the hysteresis loop area, but any functional of the spectroscopic signal can be selected. At the beginning of the channel learning experiment, a small, custom-defined number of warm-up steps is taken, at which a separate ensemble of DKL models is trained for each channel. In this process, the channel that produces the lowest mean predictive uncertainty on the unmeasured points is given a positive reward. This rewarded model is also used to derive the next measurement point corresponding to the largest uncertainty value in the prediction. After the warm-up steps, an epsilon-greedy policy45 is used to sample a single channel at each exploration step and derive the next measurement point.
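The selection logic described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the helper names, the toy uncertainty values, and the choice of a +1 reward increment are all hypothetical.

```python
import random

CHANNELS = ["amplitude", "phase", "frequency", "topography"]


def warmup_winner(uncertainty):
    """Channel with the lowest mean predictive uncertainty on unmeasured points."""
    return min(uncertainty, key=lambda ch: sum(uncertainty[ch]) / len(uncertainty[ch]))


def next_point(sigma):
    """Index of the unmeasured point with the largest predictive uncertainty."""
    return max(range(len(sigma)), key=sigma.__getitem__)


def epsilon_greedy(rewards, epsilon, rng=random):
    """With probability epsilon pick a random channel, else the best-rewarded one."""
    if rng.random() < epsilon:
        return rng.choice(list(rewards))
    return max(rewards, key=rewards.get)


# Toy per-channel uncertainties over three unmeasured points (made-up numbers).
uncertainty = {
    "amplitude": [0.2, 0.5, 0.1],
    "phase": [0.4, 0.6, 0.3],
    "frequency": [0.9, 0.8, 0.7],
    "topography": [1.0, 0.9, 1.1],
}
rewards = dict.fromkeys(CHANNELS, 0)

best = warmup_winner(uncertainty)   # one warm-up step
rewards[best] += 1                  # assumed reward increment of +1
idx = next_point(uncertainty[best]) # where to measure next
choice = epsilon_greedy(rewards, epsilon=0.0)  # purely greedy pick
```

During warm-up every channel would be retrained before `warmup_winner` is called; in the exploration phase only the channel returned by `epsilon_greedy` would be retrained and used to derive the next point.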
We implement this ensemble-DKL workflow on an Oxford Instruments Asylum Research Cypher microscope. As shown in Fig. 2c, to accelerate the DKL training and prediction we send the real-time measurement data to an Nvidia DGX-2 GPU server for analysis. Specifically, the custom DKL code written in JAX46 is run in a Docker container residing on the GPU server. Via a combination of port forwarding and socket programming, data are sent directly from the instrument computer to the DGX-2 device without file I/O and then processed within the container, taking advantage of the high processing capabilities of the server. For the data transfer, we utilize the mlsocket package (https://pypi.org/project/mlsocket/), which is a wrapper around the low-level Python socket interface and enables sending and receiving of NumPy arrays. The server houses 16 Nvidia Tesla V-100 GPUs, each with 32 GB of memory, enabling the different ensemble models to run in parallel. In practice, we select between the multi-GPU “parallel” and single-GPU “vectorized” approaches for the ensemble-DKL training based on the size of the image patch and the complexity/depth of the neural network. For an image patch size of 20 × 20, a 3-layer fully-connected neural network, and 20 ensemble models, each iteration takes ~30 s on a single GPU, whereas, for comparison, the same iteration takes ~300 s on the CPU. As such, the connection to edge computing is critical for the efficiency and viability of the proposed workflow.
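The idea of moving measurement arrays between machines without file I/O can be sketched with the Python standard library alone. The example below does not use or reproduce the mlsocket API; it is a stand-in showing length-prefixed framing of a flat float array over a socket, with a local socket pair taking the place of the instrument-to-server link. All function names are hypothetical.

```python
import socket
import struct


def send_array(sock, values):
    """Send a flat sequence of floats as a length-prefixed binary message."""
    payload = struct.pack(f"<{len(values)}d", *values)
    sock.sendall(struct.pack("<Q", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, since a single recv call may return fewer."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed before message was complete")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_array(sock):
    """Receive one message produced by send_array and decode it to floats."""
    (size,) = struct.unpack("<Q", recv_exact(sock, 8))
    payload = recv_exact(sock, size)
    return list(struct.unpack(f"<{size // 8}d", payload))


# Local demonstration: a connected socket pair stands in for the network link.
instrument, server = socket.socketpair()
send_array(instrument, [0.5, 1.25, -3.0])
received = recv_array(server)
instrument.close()
server.close()
```

The length prefix lets the receiver know where one array ends and the next begins, which matters when many measurement results are streamed over one connection.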
Here, we performed two sets of measurements on the three model samples, using the polarization-voltage loop area and the frequency-voltage loop area as target property descriptors; these measure the energy loss during switching and voltage-induced irreversible dynamics, respectively. First, a small number of randomly sampled points are measured as seed points for training. In these measurements, we start with 0.25% of the total measurement points as the seed data for DKL training, then perform 20 warm-up steps and 200 exploration steps. In the warm-up steps, each channel is trained in parallel and the one with the lowest mean uncertainty is used to derive the next measurement point. After the warm-up, a single channel is sampled at each step according to the epsilon-greedy policy, with epsilon decreased uniformly (“annealed”) from 0.4 to 0.1 over the 200 exploration steps.
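The experiment budget described above can be written down as a short sketch. The function names are hypothetical, the schedule assumes that "decreased uniformly" means linear interpolation that reaches both end values, and the 10,000-point grid in the usage line is an illustrative number, not one taken from the measurements.

```python
def linear_epsilon(step, n_steps, eps_start, eps_end):
    """Epsilon at a 0-based exploration step, annealed linearly between end values."""
    if n_steps < 2:
        return eps_end
    frac = step / (n_steps - 1)
    return eps_start + frac * (eps_end - eps_start)


def seed_count(n_points, fraction=0.0025):
    """Number of random seed points for a given fraction of all grid points."""
    return max(1, round(fraction * n_points))


# Schedule used in the text: 200 exploration steps, epsilon from 0.4 down to 0.1.
schedule = [linear_epsilon(t, 200, 0.4, 0.1) for t in range(200)]

# Hypothetical grid of 10,000 candidate locations (illustrative size only).
n_seed = seed_count(10_000)
```

Early steps therefore sample a random channel about 40% of the time, and the final steps only about 10% of the time.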
Figure 3 shows the evolution of channel reward, mean predictive uncertainty, and channel selection during the ensemble-DKL driven measurement for the three samples. For the PTO sample, when the target property is the polarization-voltage loop area (Fig. 3a), the amplitude channel shows the highest reward and the phase channel is the second-best. Although the resonance frequency channel shows a very low reward (Fig. 3a), the evolution of uncertainty (Fig. 3b) indicates that the predictive uncertainty based on the resonance frequency channel gradually decreases during the experiment, which implies that the elastic variation displayed in the frequency image has an effect on polarization dynamics. However, the topography channel shows both low reward (Fig. 3a) and no decrease of prediction uncertainty (Fig. 3b). When the frequency-voltage loop area is used as the target property, we observe an increase of reward for the resonance frequency and phase channels at the end of the experiment (Fig. 3c), accompanied by a faster decrease of predictive uncertainty for the resonance frequency and phase channels (Fig. 3d). The behavior of the resonance frequency channel arises because the target derived from the loops and the image data reflect the same directly correlated property. The behavior of the phase channel can be understood as the electrostatic effect on the detected cantilever resonance frequency, where the up and down polarized domains (shown as dark and bright contrast in the phase image) may be associated with different surface charge states that induce different electrostatic effects.
The PZT results (Fig. 3e–h) are very similar to those of PTO. We ascribe this similarity to the fact that most of the variability of the phenomena on epitaxial film surfaces is related to the ferroelastic domain structure. However, note that the predictive uncertainty from the topography channel (Fig. 3f, h) slightly decreases during the experiment in the PZT sample.
For the BFO results (Fig. 3i–l), when the polarization-voltage loop area is used as the target property, the reward for the amplitude channel (Fig. 3i) quickly stabilizes around 0.5–0.6 after ~50 exploration steps, while the other channel rewards drop quickly. Interestingly, the predictive uncertainties of the four channels are distinct (Fig. 3j): the uncertainty corresponding to the amplitude channel keeps decreasing, the phase channel uncertainty is also very low but shows a slight increase in the middle of the measurement, and the uncertainties corresponding to the frequency and topography channels are very high. When the resonance frequency-voltage loop area is used as the target property, the evolution of channel reward and uncertainty (Fig. 3k, l) is similar to that obtained with the polarization-voltage loop area as the target property. This is most likely because both phenomena are related to ferroelectric domains.
After the ensemble-DKL exploration measurement, we can use the ensemble-DKL model to predict the target property at unmeasured points. Notably, the prediction can be made from each channel. Figures 4 and 5 show the predictions of the polarization-voltage loop area and the frequency-voltage loop area, respectively, for the three samples from each channel. For the PTO and PZT samples, predictions from topography (Figs. 4d, h and 5d, h) display some features that also show up in the predictions from other channels, presumably because the ferroelastic domains are also visible in topography.
The model selection during exploration steps is based on both the current channel reward (partially from warm-up steps) and the exploration/exploitation balance of the epsilon-greedy policy. For the PTO and PZT results using the frequency-voltage loop area as the target property, even though ensemble-DKL initially favored the amplitude channel, the frequency channel reward starts increasing at the end of the measurements (Fig. 3c, g) and the frequency channel uncertainty decreases faster than that of the amplitude channel in some cases (Fig. 3d, h). Therefore, to investigate the channel behaviors in more detail when using the frequency-voltage loop area as the target property, we performed an additional experiment with more exploration steps and a different exploration/exploitation rate in the epsilon-greedy policy. In this measurement, we perform 20 warm-up steps and 480 exploration steps. In the exploration steps, a single channel is sampled at each step with epsilon decreased uniformly from 0.9 to 0.01. Compared to the previous measurements, epsilon is larger at the beginning of the measurement and smaller at the end, corresponding to a larger exploration rate at the beginning and a smaller exploration rate at the end. The results are shown in Fig. 6. In this measurement, other channels are used more frequently (Fig. 6f) because of the higher exploration rate at the beginning of the measurement, so we can observe the evolution of the other channels in more detail. We observe a clear increase of reward for the phase channel (Fig. 6e) and the fastest decrease of uncertainty for the phase channel prediction (Fig. 6f), probably because of the electrostatic effect mentioned before.
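As a rough worked comparison of the two annealing schedules, the expected number of exploration steps at which the channel is drawn at random equals the sum of the epsilon values. The calculation below assumes a linear schedule that includes both end values, so the sum is an arithmetic series; the symbols are introduced here for illustration only.

```latex
\mathbb{E}[N_{\mathrm{random}}]
  = \sum_{t=1}^{N} \epsilon_t
  = N \, \frac{\epsilon_{\mathrm{start}} + \epsilon_{\mathrm{end}}}{2}

% First experiments: N = 200, epsilon from 0.4 to 0.1
\mathbb{E}[N_{\mathrm{random}}] = 200 \times \frac{0.4 + 0.1}{2} = 50

% Extended experiment: N = 480, epsilon from 0.9 to 0.01
\mathbb{E}[N_{\mathrm{random}}] = 480 \times \frac{0.9 + 0.01}{2} = 218.4
```

Under these assumptions, roughly 46% of the exploration steps in the extended experiment use a randomly drawn channel, compared with 25% in the first experiments. A random draw can still land on the currently best-rewarded channel, so these figures are upper bounds on how often a different channel is actually used.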
To summarize, we have implemented an ensemble-DKL driven automated PFM for the identification of the channel with best predictive capacity, i.e., the channel for the most accurate reconstruction of target property encoded in spectroscopic data. This approach identifies the BEPFM image channel with the most predictive power for a target property of interest during measurement, which is also an indication of the strongest correlation between this BEPFM image channel and the target property.
Here, we implement this approach in BEPFM and piezoresponse spectroscopy measurements and illustrate its application in exploring the structure-property relationships in three thin film materials with various ferroelectric and ferroelastic properties. To accelerate the ensemble-DKL training and prediction, we also develop an approach enabling real-time data transfer between the microscope PC and the GPU server, which allows the GPU server to analyze results from the microscope on the fly. This workflow and approach are universal and can also be applied to other imaging and spectroscopic characterization methods, e.g., electron microscopy, optical microscopy, and mass spectrometry imaging.
The band excitation piezoresponse force microscopy measurements were performed on an Oxford Instruments Asylum Research Cypher AFM system using an ElectriMulti75-G Budget Sensors tip (Pt/Ir coated), with a band of frequencies near the resonance frequency used to track the resonance frequency shift.
Machine learning code is available at https://github.com/yongtaoliu/Ensemble-DKL. Real-time machine learning analysis during automated experiments was performed in a Docker container residing on a server with 16 Nvidia Tesla V-100 GPUs; data were sent directly from the instrument computer to the GPU server via a combination of port forwarding and socket programming.
The methods that support the findings of this study are available at https://github.com/yongtaoliu/Ensemble-DKL.
The code related to this study is available at https://github.com/yongtaoliu/Ensemble-DKL.
Gerber, C. & Lang, H. P. How the doors to the nanoworld were opened. Nat. Nanotechnol. 1, 3–5 (2006).
Garcia, R. & Perez, R. Dynamic atomic force microscopy methods. Surf. Sci. Rep. 47, 197–301 (2002).
Hong, J. W., Park, S. I. & Khim, Z. G. Measurement of hardness, surface potential, and charge distribution with dynamic contact mode electrostatic force microscope. Rev. Sci. Instrum. 70, 1735–1739 (1999).
Coffey, D. C. & Ginger, D. S. Time-resolved electrostatic force microscopy of polymer solar cells. Nat. Mater. 5, 735–740 (2006).
Iwata, M. et al. Domain wall observation and dielectric anisotropy in PZN-PT by SPM. Mater. Sci. Eng. B 120, 88–90 (2005).
Ziegler, D., Rychen, J., Naujoks, N. & Stemmer, A. Compensating electrostatic forces by single-scan Kelvin probe force microscopy. Nanotechnology 18, 225505 (2007).
Martin, Y. & Wickramasinghe, H. K. Magnetic imaging by force microscopy with 1000-A resolution. Appl. Phys. Lett. 50, 1455–1457 (1987).
Grutter, P., Liu, Y., LeBlanc, P. & Durig, U. Magnetic dissipation force microscopy. Appl. Phys. Lett. 71, 279–281 (1997).
Popov, G. et al. Micromagnetic and magnetoresistance studies of ferromagnetic La0.83Sr0.13MnO2.98 crystals. Phys. Rev. B 65, 064426 (2002).
Nonnenmacher, M., Oboyle, M. P. & Wickramasinghe, H. K. Kelvin probe force microscopy. Appl. Phys. Lett. 58, 2921–2923 (1991).
Tanimoto, M. & Vatel, O. Kelvin probe force microscopy for characterization of semiconductor devices and processes. J. Vac. Sci. Technol. B 14, 1547–1551 (1996).
Baumgart, C., Helm, M. & Schmidt, H. Quantitative dopant profiling in semiconductors: a Kelvin probe force microscopy model. Phys. Rev. B 80, 085305 (2009).
Sadewasser, S. et al. New insights on atomic-resolution frequency-modulation Kelvin-probe force-microscopy imaging of semiconductors. Phys. Rev. Lett. 103, 266103 (2009).
Melitz, W., Shen, J., Kummel, A. C. & Lee, S. Kelvin probe force microscopy and its application. Surf. Sci. Rep. 66, 1–27 (2011).
Bosman, M., Watanabe, M., Alexander, D. T. L. & Keast, V. J. Mapping chemical and bonding information using multivariate analysis of electron energy-loss spectrum images. Ultramicroscopy 106, 1024–1032 (2006).
Browning, N. D. et al. The atomic origins of reduced critical currents at 001 tilt grain boundaries in YBa2Cu3O7-delta thin films. Phys. C. 294, 183–193 (1998).
Kapetanakis, M. D. et al. Low-loss electron energy loss spectroscopy: an atomic-resolution complement to optical spectroscopies and application to graphene. Phys. Rev. B 92, 125147 (2015).
Liu, Y. et al. Exploring leakage in dielectric films via automated experiments in scanning probe microscopy. Appl. Phys. Lett. 120, 182903 (2022).
Liu, Y. et al. Twin domains modulate light-matter interactions in metal halide perovskites. APL Mater. 8, 011106 (2020).
Vasudevan, R. K. et al. Autonomous experiments in scanning probe microscopy and spectroscopy: choosing where to explore polarization dynamics in ferroelectrics. ACS Nano 15, 11253–11262 (2021).
Kelley, K. P. et al. Probing metastable domain dynamics via automated experimentation in piezoresponse force microscopy. ACS Nano 15, 15096–15103 (2021).
Cappella, B. & Dietler, G. Force-distance curves by atomic force microscopy. Surf. Sci. Rep. 34, 1–104 (1999).
Oliver, W. C. & Pharr, G. M. An improved technique for determining hardness and elastic-modulus using load and displacement sensing indentation experiments. J. Mater. Res. 7, 1564–1583 (1992).
Liu, Y. et al. Role of decomposition product ions in hysteretic behavior of metal halide perovskite. ACS Nano 15, 9017–9026 (2021).
Liu, Y. et al. Direct observation of photoinduced ion migration in lead halide perovskites. Adv. Funct. Mater. 31, 2008777 (2021).
Gad, M., Itoh, A. & Ikai, A. Mapping cell wall polysaccharides of living microbial cells using atomic force microscopy. Cell Biol. Int. 21, 697–706 (1997).
Liu, Y. et al. Experimental discovery of structure–property relationships in ferroelectric materials via active learning. Nat. Mach. Intell. 4, 341–350 (2022).
Liu, Y. et al. Decoding the shift-invariant data: applications for band-excitation scanning probe microscopy. Mach. Learn. Sci. Tech. 2, 045028 (2021).
Liu, Y. et al. Correlating crystallographic orientation and ferroic properties of twin domains in metal halide perovskites. ACS Nano 15, 7139–7148 (2021).
Vasudevan, R. K. et al. Nanoscale origins of nonlinear behavior in ferroic thin films. Adv. Funct. Mater. 23, 81–90 (2013).
Vasudevan, R. K. et al. Polarization dynamics in ferroelectric capacitors: local perspective on emergent collective behavior and memory effects. Adv. Funct. Mater. 23, 2490–2508 (2013).
Roccapriore, K. M., Kalinin, S. V. & Ziatdinov, M. Physics discovery in nanoplasmonic systems via autonomous experiments in scanning transmission electron microscopy. Adv. Sci. 9, 2203422 (2022).
Kelley, K. P. et al. Dynamic manipulation in piezoresponse force microscopy: creating nonequilibrium phases with large electromechanical response. ACS Nano 14, 10569–10577 (2020).
Sotres, J., Boyd, H. & Gonzalez-Martinez, J. F. Enabling autonomous scanning probe microscopy imaging of single molecules with deep learning. Nanoscale 13, 9193–9203 (2021).
Huang, B., Li, Z. & Li, J. An artificial intelligence atomic force microscope enabled by machine learning. Nanoscale 10, 21320–21326 (2018).
Liu, Y. et al. Disentangling electronic transport and hysteresis at individual grain boundaries in hybrid perovskites via automated scanning probe microscopy. Preprint at https://arxiv.org/abs/2210.14138 (2022).
Roccapriore, K. M., Dyck, O., Oxley, M. P., Ziatdinov, M. & Kalinin, S. V. Automated experiment in 4D-STEM: exploring emergent physics and structural behaviors. ACS Nano 16, 7605–7614 (2022).
Liu, Y. et al. Automated experiments of local non‐linear behavior in ferroelectric materials. Small 18, 2204130 (2022).
Vasudevan, R. K., Jesse, S., Kim, Y., Kumar, A. & Kalinin, S. V. Spectroscopic imaging in piezoresponse force microscopy: new opportunities for studying polarization dynamics in ferroelectrics and multiferroics. MRS Commun. 2, 61–73 (2012).
Morioka, H. et al. Suppressed polar distortion with enhanced Curie temperature in in-plane 90°-domain structure of a-axis oriented PbTiO3 Film. Appl. Phys. Lett. 106, 042905 (2015).
Liu, Y. et al. Hypothesis-driven automated experiment in scanning probe microscopy: exploring the domain growth laws in ferroelectric materials. Preprint at https://arxiv.org/abs/2202.01089 (2022).
Ziatdinov, M. A. et al. Hypothesis learning in automated experiment: application to combinatorial materials libraries. Adv. Mater. 34, 2201345 (2022).
Wilson, A. G., Hu, Z., Salakhutdinov, R. & Xing, E. P. in Artificial Intelligence and Statistics. 370–378 (PMLR, 2016).
Ziatdinov, M., Liu, Y. & Kalinin, S. V. Active learning in open experimental environments: selecting the right information channel(s) based on predictability in deep kernel learning. Preprint at https://doi.org/10.48550/arXiv.2203.10181 (2022).
Zai, A. & Brown, B. Deep Reinforcement Learning in Action (Manning Publications, 2020).
Bradbury, J. et al. JAX: composable transformations of Python+ NumPy programs. Version 0.2 5, 14–24 (2018).
This effort (implementation in SPM, measurement, data analysis) was primarily supported by the center for 3D Ferroelectric Microelectronics (3DFeM), an Energy Frontier Research Center funded by the U.S. Department of Energy (DOE), Office of Science, Basic Energy Sciences under Award Number DE-SC0021118. This research (ensemble-DKL) was supported by the Center for Nanophase Materials Sciences (CNMS), which is a US Department of Energy, Office of Science User Facility at Oak Ridge National Laboratory. This work was also supported by MEXT Program: Data Creation and Utilization Type Material Research and Development Project Grant Number JPMXP1122683430.
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Liu, Y., Vasudevan, R.K., Kelley, K.P. et al. Learning the right channel in multimodal imaging: automated experiment in piezoresponse force microscopy. npj Comput Mater 9, 34 (2023). https://doi.org/10.1038/s41524-023-00985-x