Experimental search for high-temperature ferroelectric perovskites guided by two-step machine learning

Experimental search for high-temperature ferroelectric perovskites is a challenging task due to the vast chemical space and lack of predictive guidelines. Here, we demonstrate a two-step machine learning approach to guide experiments in search of xBi[Me′yMe″(1−y)]O3–(1 − x)PbTiO3-based perovskites with high ferroelectric Curie temperature. The two steps are classification learning to screen for compositions that form in the perovskite structure, and regression coupled to active learning to identify promising perovskites for synthesis and feedback. The problem is challenging because the search space is vast, spanning ~61,500 compositions, of which only 167 have been experimentally studied. Furthermore, not every composition can be synthesized in the perovskite phase. In this work, we predict x, y, Me′, and Me″ such that the resulting compositions have both high Curie temperature and form in the perovskite structure. Outcomes from both successful and failed experiments then iteratively refine the machine learning models via an active learning loop. Our approach finds six perovskites out of ten compositions synthesized, including three previously unexplored {Me′Me″} pairs, with 0.2Bi(Fe0.12Co0.88)O3–0.8PbTiO3 showing the highest measured Curie temperature of 898 K among them.
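The two-step workflow described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a classifier has already produced a perovskite-formation probability for each candidate and that a regressor has produced a Gaussian predictive mean and standard deviation for the Curie temperature. The field names (`p_perovskite`, `tc_mean`, `tc_std`), the 0.5 screening threshold, the candidate values, and the choice of expected improvement as the selection criterion are all assumptions made for illustration.

```python
import math


def expected_improvement(mean, std, best_so_far):
    """Expected improvement for maximization, assuming a Gaussian prediction."""
    if std <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best_so_far) * cdf + std * pdf


def recommend(candidates, best_so_far, min_perovskite_prob=0.5):
    """Step 1: screen by predicted perovskite formability (assumed threshold).
    Step 2: rank the survivors by expected improvement in Curie temperature."""
    screened = [c for c in candidates if c["p_perovskite"] >= min_perovskite_prob]
    return sorted(
        screened,
        key=lambda c: expected_improvement(c["tc_mean"], c["tc_std"], best_so_far),
        reverse=True,
    )


# Hypothetical candidates; the numbers are made up for illustration only.
candidates = [
    {"name": "A", "p_perovskite": 0.9, "tc_mean": 850.0, "tc_std": 10.0},
    {"name": "B", "p_perovskite": 0.8, "tc_mean": 800.0, "tc_std": 120.0},
    {"name": "C", "p_perovskite": 0.2, "tc_mean": 1200.0, "tc_std": 50.0},
]
ranked = recommend(candidates, best_so_far=900.0)
print([c["name"] for c in ranked])
```

In this toy example, candidate C is removed by the screening step despite its high predicted Curie temperature, and candidate B outranks A because its larger predictive uncertainty leaves more room for improvement over the current best. That trade-off between exploitation and exploration is what an active learning loop is meant to capture.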


Comment:
The authors make heavy use of abbreviations and acronyms throughout the article. I would suggest that they double-check to make sure all are well defined. For example, the acronym "MPB", though widely used in the field, is not defined in the article.
Response: We appreciate the comment. We have carefully reviewed our manuscript and made revisions to address this point. We have tried our best to minimize the usage of abbreviations and, where appropriate, we have attempted to define them clearly. We have added a definition for the morphotropic phase boundary (MPB) on Page 4, which now reads as follows: "We built another dataset of 117 compositions for which the TC data are known from published experiments. This dataset contains compositions that are both at and away from the Morphotropic Phase Boundary (MPB) composition, but we do not distinguish between them. In the ferroelectrics literature, the term MPB refers to structural phase transitions arising due to changes in chemical composition at a given temperature and, especially in PbTiO3-based materials, MPB encompasses a region in the phase diagram where two ferroelectric phases (typically in tetragonal and rhombohedral symmetries) coexist." We first refer to Efficient Global Optimization (EGO) on Page 2 and have now added the following text on Page 2 for clarification: "We also use our recently developed active learning (or adaptive design) approach based on efficient global optimization (EGO) 13,17 to recommend promising compositions for experimental synthesis and characterization (details of EGO algorithm are discussed in the Regression and Active Learning section)."
Comment: It would be helpful for the authors to describe how their results compare to the state-of-the-art and similar compounds. Are Curie temperatures for the materials they have discovered significantly better?
Response: The highest Curie temperature (TC) in our training dataset belongs to the BiFeO3-PbTiO3 solid solutions (TC = 1101 K) and the second highest belongs to the Bi(Zn1/2Ti1/2)O3-PbTiO3 solid solutions (TC = 990 K). These chemical spaces are well studied in the literature. On the other hand, our two new solid solutions, namely Bi(Fe0.12Co0.88)O3-PbTiO3 and Bi(Al0.10Co0.90)O3-PbTiO3, rank third (TC = 898 K) and fourth (TC = 883 K), respectively, relative to the known materials. More importantly, the merit of these compositions is their ease of processing in the perovskite phase (in addition to having a reasonably high TC), which is critical for actuator performance. Thus, we qualify these materials as competitive (at best) relative to the state-of-the-art. Furthermore, our work is one of the first to identify Bi(FeCo)O3 and Bi(AlCo)O3 end members as promising high-TC materials that can also be stabilized in the desired perovskite phase. We anticipate that these results will spur new activity on this interesting class of materials. We have added a sentence on Page 6 (first paragraph in Section Discussion) in our manuscript, which now reads as follows:
Comment: The authors state that they have ignored the tetragonality ratio and domain mobility. Why? It would seem that it would be straightforward to calculate the tetragonality ratio unless I am missing something. Even if the values aren't ideal, it would be valuable to the community to know what they are.
Response: This is a good question. There are two ways to formulate the problem:
• Use tetragonality or domain mobility as an independent variable (similar to how we considered tolerance factor in this paper). One of our key requirements for building machine learning models is that we should be able to quantify our input data such that they not only "fingerprint" known compositions but also represent the yet-to-be-explored composition space. Since descriptors such as tetragonality and domain mobility are not known to us a priori, before performing experiments, for the ∼61,500 compositions, we cannot incorporate them as inputs to our machine learning models. Therefore, we ignored them in our machine learning study. On the other hand, we can represent each of the ∼61,500 compositions using the tolerance factor (as it is not an experimental outcome) and hence, the tolerance factor is a good independent variable for our problem.
• Use tetragonality or domain mobility as a dependent variable or unknown (similar to TC). We can, in principle, formulate a multiobjective optimization problem in which we search for a new composition with a high TC AND a tetragonality ratio within a desired range (e.g., 1-1.07) AND a desired domain mobility. However, we do not take this route in this work. This is mainly because we do not (yet) have the capability to handle multiobjective optimization design problems (optimizing two or more continuous variables such as TC, tetragonality and domain mobility). What we have shown in this paper is the potential of using machine learning methods for predicting both phase stability and optimization of a physical property (TC) based on learning from experimental data. It should also be noted that this paper is one of the first in the literature to demonstrate these ideas using experiments. Clearly, constrained optimization is the path forward in this research (which we also state in the manuscript; see the Discussion section). We are in the process of developing machine learning methods for multiobjective optimization problems, such as those desired by this referee.
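To illustrate why the tolerance factor can serve as an input for every candidate composition, the sketch below evaluates the standard Goldschmidt expression t = (r_A + r_O) / (√2 (r_B + r_O)) from ionic radii alone, with no experimental outcome required. The site-averaging rule for the solid solution, the function names, and the radii used (commonly tabulated Shannon values for Pb2+, Ti4+ and O2−) are assumptions made for illustration; the source does not state how the tolerance factor was computed in the manuscript.

```python
import math

# Ionic radii in angstroms, assumed from commonly tabulated Shannon values.
R_PB = 1.49   # Pb2+, 12-coordinate (assumption)
R_TI = 0.605  # Ti4+, 6-coordinate (assumption)
R_O = 1.40    # O2-, 6-coordinate (assumption)


def tolerance_factor(r_a, r_b, r_o=R_O):
    """Goldschmidt tolerance factor for an ABO3 perovskite."""
    return (r_a + r_o) / (math.sqrt(2.0) * (r_b + r_o))


def solid_solution_tolerance_factor(x, y, r_bi, r_me1, r_me2):
    """Tolerance factor for xBi[Me'_y Me''_(1-y)]O3-(1-x)PbTiO3.

    Assumes simple composition-weighted average radii on the A and B sites;
    r_bi, r_me1 and r_me2 must be supplied by the caller.
    """
    r_a = x * r_bi + (1.0 - x) * R_PB
    r_b = x * (y * r_me1 + (1.0 - y) * r_me2) + (1.0 - x) * R_TI
    return tolerance_factor(r_a, r_b)


# Pure PbTiO3 end member, using only the assumed radii above.
t_pbtio3 = tolerance_factor(R_PB, R_TI)
print(round(t_pbtio3, 3))
```

Because the function needs only x, y and tabulated radii, it can be evaluated for any composition in the search space before a single sample is synthesized, which is the property the response relies on. Tetragonality and domain mobility have no comparable closed-form expression in terms of composition.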

Response to Referee #2
Comment: The authors report an approach for theory-guided materials synthesis as applied to ferroelectric perovskites. They utilize machine learning on a known body of materials to establish relationships between structural and compositional descriptors of the materials and their Curie temperature, and further utilize this knowledge (and associated uncertainties) to guide experimental synthesis. This data-rich, experiment-limited approach is ideally suited for classical ceramics synthesis routines, which generally do not allow fast sampling of large parameter spaces, and will be of interest to a broad community of solid state chemists. Furthermore, I believe the same approach can be extended towards oxide film growth, where substrate temperature, laser fluence, and gas pressure offer additional parameters. Overall, I am very impressed by this work, and recommend it for publication.
Response: We thank the referee for his/her interest in our work and for recommending it for publication.
The authors have adequately addressed nearly all of my concerns. Regarding the omission of the tetragonality ratio and domain mobility, it is understandable why they would leave these out of their machine learning model. However, it would be helpful to include these values for the new compositions listed in Table 1, so that the reader can better assess the merits of the new compositions. Of course, it would be appropriate to explain that these values have not been optimized by the machine learning model and are only provided to give readers more information about the new compositions.