Identifying superionic conductors by materials informatics and high-throughput synthesis

Combinatorial chemistry has been proven effective in the search for novel functional materials, especially in the field of organic chemistry, and is being used to identify functional inorganic compounds. However, there is a growing need for approaches that predict and experimentally realize new materials, beyond composition optimization of known systems. Application of combinatorial chemistry to materials discovery is typically hindered by a limited ability to search a wide chemical composition space, and by our ability to experimentally screen promising compounds. Here, a combinatorial scheme is proposed that combines a materials informatics technique to define a chemical search space with high-throughput synthesis and evaluation. We identify high-performance superionic conductors in the Ca-(Nb,Ta)-Bi-O system, demonstrating the effectiveness of this approach for accelerated materials discovery. High-throughput prediction and synthesis are vital for obtaining new materials that deviate from existing compositions. Here, machine learning is combined with high-throughput synthesis to identify superionic conductors based on Ca-(Nb,Ta)-Bi-O.

In conventional materials development, researchers propose candidate materials based on their physical and chemical knowledge. In other words, development is guided by the intuition of experienced scientists. Optimal materials are identified through a trial-and-error process of synthesis and evaluation, so obtaining the desired material can be costly and time-consuming. Combinatorial chemistry has long been applied effectively to identify novel functional materials, primarily organic compounds 1,2. Since the ground-breaking study by Xiang et al. 3, many inorganic chemists have also employed high-throughput screening technology, which involves combinatorial chemistry and high-throughput characterization 4, to search for functional materials by making comprehensive compositional libraries. However, the chemical space is arbitrary, and the regions in which new materials may be found are not known in advance. Thus, in practice, high-throughput technology has been used for compositional optimization of known materials, which limits the improvement of desirable properties [5][6][7]. Another obstacle encountered during experimental searches arises from their rather limited total throughput. This manifests itself as a bottleneck, such as the need for phase determination for the entire library or evaluation of transport properties at various temperatures.
Materials informatics has attracted a great deal of interest over the past decade in efforts to accelerate materials discovery [8][9][10] . Virtual screening is a frequently employed materials informatics procedure, in which a machine learning model is constructed from training data that contains descriptors of a material and a target property. The model is then applied to predict the target property in materials that have been added to a database. A large amount of training data is required to build a good model; however, the available materials data are usually quite limited and localized within a small chemical space.
Here we propose an effective way of pairing combinatorial chemistry with materials informatics. In this scheme, a high-throughput experimental system for materials discovery is enhanced by a machine learning model that identifies a finite chemical search space to explore. Figure 1 summarizes our scheme for discovering new ion conductors. A virtual screening based on our machine learning model ranks materials in the Inorganic Crystal Structure Database (ICSD) that are expected to be superior ionic conductors. The model also predicts elemental combinations with a similarity metric to narrow the practical search space for subsequent combinatorial synthesis. High-throughput evaluation is introduced to increase the total throughput of experimental screening.
As an application of our method, we focus on oxide ion conductors. Oxide ion conductors are key materials in solid oxide fuel cells (SOFCs) used for various sources of power, such as mobile generators and large-scale energy cogeneration. Thermally stable oxide ion conductors with high ionic conductivity are desired for this purpose, and active searches for them are underway worldwide. However, practical materials that are superior to the conventional high-temperature conductor, i.e., yttria-stabilized zirconia (YSZ), are yet to be developed [11][12][13][14][15].
In this paper, we apply the proposed scheme to a search for new oxide ion conductors and identify materials in the Ca-(Nb,Ta)-Bi-O system that exhibit high conductivity and durability. This demonstrates the effectiveness of our scheme, in which the pairing of prediction by materials informatics with high-throughput combinatorial chemistry plays a vital role.

Results
Fig. 1 Overview of the high-throughput screening approach. Materials informatics proposes elemental combinations, for which the high-throughput experimental system identifies new and good superionic conductors. The film libraries are prepared with an automatic pipetting system. Their crystal phases are identified via high-throughput XRD analysis at the SPring-8 synchrotron radiation facility. The conductivity and thermal stability of the compounds are evaluated with the high-throughput conductivity measurement system.
Machine learning model to define chemical search space. We first designed 92 hand-crafted, general crystal structure descriptors. These included the distortion of the polyhedron, the radial distribution function, the void dimensions, and specific descriptors based on the bond valence sum (BVS), which were presumed to be related to ionic conduction. To construct a machine learning model, we extensively collected experimental oxide ion conductivities from the literature and compiled them into 29 effective training data. Since as many as 92 descriptors for 29 training data could lessen the prediction accuracy, partial least squares (PLS) regression was used to reduce the number of descriptors 16. The magnitude of correlation between each descriptor and conductivity was evaluated with the variable importance in projection (VIP) metric. To achieve the lowest root mean square error (RMSE), 26 important descriptors with VIP scores >0.8 were selected. The cumulative contribution ratio, which reflected the information provided by each descriptor, was set above 80%. The RMSE of the predicted conductivity was confirmed to be 1.45 (in log10 σ [S cm^−1]) using the leave-one-out method. Investigating the relationships between the descriptors and conductivity in the training data revealed that a highly distorted oxide polyhedron and a small vacancy radius correlated positively with ionic conductivity. 
Deriving this type of information with the machine learning model may yield physical insights and suggest design rules that can be applied to an extrapolated chemical space. Details of the descriptors and the training data are provided in Supplementary Note 1 (see Supplementary Tables 1 and 2). The conductivity of 13,384 oxides was then predicted using the machine learning model. Of the 13,384 materials in the ICSD, 48% were predicted to be superior to YSZ in terms of conductivity.
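The descriptor selection described above can be illustrated with a minimal sketch. It assumes a single-component PLS1 model written in plain Python, six synthetic descriptors, and synthetic conductivities; none of these are the authors' data or code, and the actual model used more descriptors and was tuned on a cumulative contribution ratio. With one latent component and a unit-norm weight vector w, the VIP score of descriptor j reduces to sqrt(p)·|w_j|, where p is the number of descriptors.

```python
import math
import random

def pls1_one_component(X, y):
    """Fit a single-component PLS1 model on mean-centred data.

    Returns (x_mean, y_mean, w, q): w is the unit-norm weight vector and
    q regresses y on the latent score t = (x - x_mean) . w.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weights are proportional to the covariance of each descriptor with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    return x_mean, y_mean, w, q

def predict(model, x):
    x_mean, y_mean, w, q = model
    return y_mean + q * sum((x[j] - x_mean[j]) * w[j] for j in range(len(x)))

def vip_scores(w):
    """VIP for a one-component model: sqrt(p) * |w_j| (w has unit norm)."""
    return [math.sqrt(len(w)) * abs(v) for v in w]

def loo_rmse(X, y):
    """Leave-one-out RMSE: refit without sample i, then predict sample i."""
    errors = []
    for i in range(len(X)):
        model = pls1_one_component(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        errors.append(predict(model, X[i]) - y[i])
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Synthetic stand-in data (assumption): 29 samples, 6 standardised descriptors.
random.seed(0)
n_samples, n_desc = 29, 6
X = [[random.gauss(0, 1) for _ in range(n_desc)] for _ in range(n_samples)]
y = [2.0 * r[0] - 1.5 * r[1] + random.gauss(0, 0.3) for r in X]

vip = vip_scores(pls1_one_component(X, y)[2])
selected = [j for j, v in enumerate(vip) if v > 0.8]  # same threshold as the text
X_sel = [[r[j] for j in selected] for r in X]
rmse_all = loo_rmse(X, y)
rmse_sel = loo_rmse(X_sel, y)
print(selected, round(rmse_all, 3), round(rmse_sel, 3))
```

Because the mean of the squared VIP scores equals 1 by construction, at least one descriptor always passes a 0.8 threshold. Comparing `rmse_all` with `rmse_sel` shows whether pruning low-VIP descriptors helps on a given data set.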
The prediction accuracy at this stage was insufficient due to the small amount of training data. This situation is typical in materials informatics, so it is often necessary to increase the amount of training data. Instead, we devised a new strategy that could be readily adapted for combinatorial chemistry. Because we used mainly local structural descriptors around an oxygen atom in the crystal for virtual screening, we could assume that similarity between the local structure and the training data was a measure of prediction accuracy. The local structural similarity between each of the training data and candidate materials was evaluated by a distance (d) derived from k(A, B), where A and B represent two different local structures around an oxygen atom, and k(A, B) represents the smooth overlap of atomic positions (SOAP) kernel 17. We focused on V7CuBi16O42, which has a high conductivity of 1.35 × 10^−1 S cm^−1 at 973 K 18 but, like most bismuth-based oxides, exhibits low thermal stability [19][20][21]. Figure 2a shows the predicted conductivities of the materials in the ICSD whose local structures were evaluated by the SOAP metric with respect to V7CuBi16O42. The materials with high conductivity and local structures similar to that of V7CuBi16O42 were considered candidates. While the candidate materials were just pinpoint selections, the candidate local structures suggested they had great potential for combinatorial synthesis. To select the local structure, we listed the cations that appeared in the candidate materials (Fig. 2b). Considering the practical aspects and processing feasibility, we eliminated toxic elements, noble metals, and 3d-transition metals with a tendency to form electrical conductors. As a result, Bi, Nb, Ta, and alkaline earth metals (Ca, Sr, and Ba) were selected for the combinatorial chemistry experiments. 
Note that selecting a good reference material other than V7CuBi16O42 for the SOAP distance may reveal other elements that offer further opportunities for combinatorial chemistry experiments (see Supplementary Fig. 1 and Supplementary Table 3 in Supplementary Note 2).
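The similarity-based screening can be sketched as follows. A widely used distance induced by a normalised kernel is d = sqrt(2 − 2k); the sketch assumes this form, and it replaces true SOAP descriptors with short made-up vectors. The candidate names, vectors, and predicted conductivities are hypothetical. The two thresholds (d < 0.6 and predicted conductivity above 10^−3 S cm^−1) follow the selection criteria stated for Fig. 2a.

```python
import math

def normalized_kernel(p, q, zeta=1):
    """Cosine-type kernel between two descriptor vectors, raised to zeta."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p) * sum(b * b for b in q))
    return (dot / norm) ** zeta

def kernel_distance(p, q, zeta=1):
    """Kernel-induced distance d = sqrt(2 - 2k) for a normalised kernel."""
    k = normalized_kernel(p, q, zeta)
    return math.sqrt(max(0.0, 2.0 - 2.0 * k))  # clamp tiny negative round-off

# Hypothetical stand-ins for local-environment descriptors (assumption).
reference = [0.9, 0.1, 0.3]  # plays the role of the reference compound
candidates = {
    # name: (descriptor vector, predicted log10 conductivity in S cm^-1)
    "oxide_A": ([0.8, 0.2, 0.3], -2.0),   # similar and conductive
    "oxide_B": ([0.1, 0.9, 0.2], -1.5),   # conductive but dissimilar
    "oxide_C": ([0.9, 0.1, 0.25], -4.2),  # similar but poorly conductive
}

D_MAX = 0.6           # SOAP-distance threshold from the text
LOG_SIGMA_MIN = -3.0  # predicted conductivity above 10^-3 S cm^-1

selected = [
    name
    for name, (vec, log_sigma) in candidates.items()
    if kernel_distance(reference, vec) < D_MAX and log_sigma > LOG_SIGMA_MIN
]
print(selected)
```

Only candidates that satisfy both criteria survive, which mirrors the idea that a prediction is trusted only where the local structure resembles a well-characterised reference.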
High-throughput experimental screening. The characterization of a fabricated materials library is often a bottleneck in high-throughput experiments, as it may require structural identification or measurement of physical properties. To address this problem, high-throughput methods for X-ray diffraction (XRD) and conductivity measurements have been developed to characterize combinatorial libraries 22. To identify the phases in each library quickly, we employed synchrotron radiation with 2D detection at the SPring-8 facility 23. To evaluate conductivity, a probe-contact impedance measurement system with automated 2D scanning and temperature profiling was installed. This measurement system also facilitated high-throughput evaluation of thermal stability. Additional experimental details are provided in the Methods section. Six-by-six combinatorial libraries were fabricated on alumina substrates by chemical solution deposition (CSD). In each library, each composition was determined by the ratio of the metal-organic deposition solutions. The materials were sintered at 873-1073 K, and the process was repeated until each composition reached a thickness of approximately 1 μm (see Methods and Supplementary Fig. 2). The deposited film of each library was uniformly prepared within a 3 mm × 3 mm segment. Such a finite spatial resolution of each composition was required for accurate determination of its conductivity via probe-contact measurements. The chemical space in this study was expressed as [Ca_x(Nb_{1−y}Ta_y)_{1−x}]_{1−z}Bi_zO_δ, where the bismuth content (z) in each library was fixed, and the Ca (x) and Ta (y) contents within each library were varied. The conductivity of each library film was measured with the high-throughput system at 873 and 973 K. The conductivity maps of three typical libraries at 973 K are shown in Fig. 3a. The first chemical trend observed in the libraries was that the conductivity generally increased with the Bi content. 
This is not all that surprising, as many bismuth-based oxides exhibit high ionic conductivity. The second trend we observed was an increase in conductivity with Ca content, which was significant at x = 0.7 and 0.8. Finally, conductivity could be further optimized by adjusting the ratio of Ta to Nb. We noted that the conductivity when x = 1 was lower than it was when x = 0.8. The highest conductivities are plotted as a function of z (Bi) for two values of x (Ca) in Fig. 3b. Here, the ratio of Ta to Nb (y) yielding the highest conductivity for given values of x and z was selected. The compositions in which x = 0.7-0.8 exhibited much higher conductivity than those in which x = 0.3, and their conductivities exceeded that of YSZ 24 when z ≥ 0.5. The maximum conductivity of 2.2 × 10^−1 S cm^−1 was observed at 973 K when z = 0.8. Through high-throughput XRD analysis, we found that highly conductive compositions correlated well with the monoclinic CaO-doped bismuth oxide phase 25,26, hereafter referred to as m-BC, particularly when x = 0.7-0.8 (see Supplementary Fig. 3b and corresponding text in Supplementary Note 3). This demonstrated another feature of our high-throughput system, by which the contribution of each crystalline phase to the target property or multiphase effects could be determined from the systematically constructed libraries.
Fig. 2 The results of virtual screening based on our machine learning model. a Selection of the candidate materials having a SOAP distance less than 0.6 with respect to V7CuBi16O42 and a predicted conductivity higher than 10^−3 S cm^−1. The marks (∇) locate training data, and the red colour indicates higher conductivity in the map. b Selected elements based on the frequency with which each element appeared in the candidate materials.
We further examined the thermal stability of the library films with z ≥ 0.5, which exhibited relatively high conductivity. Figure 3c shows the initial conductivities of the films in which z = 0.5 and 0.8 at 873 K. Retention of conductivity after annealing the films for 5 h at 873 K is shown in Fig. 3d. Clearly, the conductivity of the films with z = 0.8 degraded upon annealing. On the other hand, compositions with z = 0.5 and x = 0.7-0.8 exhibited a conductivity as high as that of YSZ, along with good thermal stability.

Discussion
In the high-throughput experiments, the maximum conductivity in each library at a given value of z was almost always observed when x = 0.7 or 0.8 and y = 0.4. To confirm this finding, sintered compact (bulk) samples were prepared by a conventional solid-state reaction. The details of bulk sample preparation and characterization are summarized in Methods and Supplementary Note 4. The conductivities of the bulk samples were slightly higher, but the chemical trend was similar to that observed in the high-throughput libraries, as shown in Fig. 4a. The thermal stability was evaluated for the bulk samples held at 873 K, as shown in Fig. 4b. The conductivity of the samples with z ≥ 0.7 degraded, whereas the samples with z ≤ 0.6 clearly exhibited high thermal stability. When the crystalline phases in the bulk samples were examined (Supplementary Fig. 6 and Supplementary […]), we found that the m-BC phase correlated well with high conductivity (Fig. 4c). In particular, the conductivity increased abruptly when the m-BC phase fraction exceeded 92%, which corresponds to z = 0.5 and 0.6. Interestingly, Bi0.74Ca0.26O1.37 with a single m-BC phase is not stable at high temperatures, whereas the coexistence of Ca and Nb/Ta stabilises the m-BC phase even at 973 K (see Supplementary Figs. 6 and 7 and the corresponding Supplementary Note 4). These bulk results were consistent with those obtained with the high-throughput libraries, which validates the high-throughput experiment. Finally, we confirmed that these compounds were oxide ion conductors with transport numbers of around 70−80% at 973 K by evaluating their behaviour in oxygen concentration cells.
Fig. 3 […] 24. The highest conductivities in the third column from the left and those in the second column from the right in (a) are plotted as a function of z. c Conductivities of samples in which z = 0.8 or 0.5 at 873 K. d Retention of conductivity (%) after annealing at 873 K for 5 h for the libraries with the initial conductivities shown in (c).
In this study, we proposed pairing combinatorial chemistry with materials informatics and demonstrated its effectiveness. We developed a new materials informatics algorithm with virtual screening and the similarity metric of the local structure, which identified the chemical space to be examined through combinatorial chemistry. In addition, implementing high-throughput conductivity measurements and high-throughput XRD allowed us to increase the total experimental throughput. The application of our method to oxide ion conductors led to the discovery of materials in the Ca-(Nb,Ta)-Bi-O system that exhibited high conductivity and durability. The pairing of materials informatics and combinatorial chemistry as demonstrated in this study could minimize problems often associated with each, such as small amounts of training data for the machine learning and the vast search spaces of combinatorial chemistry, thereby opening a new avenue for efficient materials discovery.

Methods
Library fabrication. Combinatorial chemistry libraries with the chemical formula [Ca_x(Nb_{1−y}Ta_y)_{1−x}]_{1−z}Bi_zO_δ were prepared on alumina substrates by chemical solution deposition (CSD). Each substrate contained 36 compositional films arranged in six rows and six columns, and a total of 288 compositional library films on eight substrates were investigated comprehensively. Metal-organic solutions of bismuth oxide, calcium oxide, niobium oxide, and tantalum oxide were purchased from Kojundo Chemical Laboratory Co., Ltd. (Japan) and used as raw materials. The solutions were dispensed in the desired ratios onto a microplate with an SM300DSZ automatic pipetting device (Musashi Engineering Co., Ltd., Japan). They were then transferred onto an alumina substrate through a steel mask containing 36 square holes and dried at 393 K for 5 min. The mask was removed, and the materials were sintered at 873 K for 30 min. The same process was repeated about six times to achieve a final film thickness of >1 μm. The final sintering temperature was optimized by annealing the films at temperatures ranging from 873 to 1073 K for 30 min to densify them.
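The library layout can be made concrete with a short sketch that converts (x, y, z) into cation mole fractions for the formula above and enumerates eight 6 × 6 substrates. The grid values for x, y, and z are hypothetical placeholders, as is the assumption of one z value per substrate; the source does not list the exact composition steps.

```python
def cation_fractions(x, y, z):
    """Cation mole fractions for [Ca_x(Nb_{1-y}Ta_y)_{1-x}]_{1-z}Bi_z O_delta."""
    return {
        "Ca": x * (1 - z),
        "Nb": (1 - y) * (1 - x) * (1 - z),
        "Ta": y * (1 - x) * (1 - z),
        "Bi": z,
    }

# Hypothetical grids (assumption): six Ca steps, six Ta steps, eight Bi levels.
x_values = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
y_values = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
z_values = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]  # one substrate per z

# Each substrate is a 6 x 6 grid: rows vary y (Ta), columns vary x (Ca).
libraries = {
    z: [[cation_fractions(x, y, z) for x in x_values] for y in y_values]
    for z in z_values
}

n_films = sum(len(row) for grid in libraries.values() for row in grid)
print(n_films)
print(libraries[0.5][2][4])  # film with x = 0.8, y = 0.4, z = 0.5
```

The cation fractions are the quantities that would set the mixing ratios of the four metal-organic solutions for each film, assuming the solutions have equal metal concentrations.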
High-throughput evaluations. The surface and cross-section microstructures of the library films were evaluated by field-emission scanning electron microscopy (FE-SEM, JSM-7000F, JEOL Ltd.). Crystal structures in the library were analysed on a high-throughput XRD system using synchrotron radiation with a wavelength of 0.8 Å at the SPring-8 facility 23 along with a 2D detector (PILATUS) 22. Each theta scan in the XRD measurement was acquired within a few seconds. After measuring the film thicknesses, an interdigital Pt electrode was fabricated by sputtering over a metal mask placed on the library. The library was then annealed at 873 K for 30 min in air. The high-throughput measurement system comprises a pair of tungsten probes, a stage with a cartridge heater, a measurement chamber, an LCR meter (3522-50; Hioki Co., Ltd., Japan), and a control device 22. A Cole-Cole plot of each library film was then constructed using the LCR meter along with spatial scanning of the library. During measurement, the oxygen concentration in the chamber was kept below 20 ppm with flowing nitrogen gas. Each sample was heated to 873 and 973 K, and the impedance data in the frequency range 1-100 kHz with 100 mV applied voltage were automatically recorded and stored at each temperature. To calculate conductivity, we used the form factor of the interdigital Pt electrode calibrated with the conductivity of a YSZ-sputtered film 22. Measurements were performed over the course of 5 h at 873 K to evaluate durability.
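The conversion from measured resistance to conductivity can be sketched under a simple assumption: the calibrated form factor acts as a cell constant K, so that σ = K / R, with K obtained from a reference film of known conductivity. The sketch further assumes the reference and sample films share the same electrode geometry and thickness. All numerical values are hypothetical, and the function names are illustrative only.

```python
def cell_constant(sigma_ref, resistance_ref):
    """Form factor K (cm^-1) from a reference film: K = sigma_ref * R_ref."""
    return sigma_ref * resistance_ref

def conductivity(resistance, k_cell):
    """Conductivity in S cm^-1 from resistance in ohm: sigma = K / R."""
    return k_cell / resistance

def retention_percent(sigma_initial, sigma_final):
    """Conductivity retained after annealing, in percent."""
    return 100.0 * sigma_final / sigma_initial

# Hypothetical calibration: reference film of 1e-2 S cm^-1 reads 5 kOhm.
k_cell = cell_constant(sigma_ref=1.0e-2, resistance_ref=5.0e3)

# Hypothetical sample resistances taken from the Cole-Cole plot intercepts.
sigma_start = conductivity(1.0e3, k_cell)    # at the start of the 873 K hold
sigma_end = conductivity(1.25e3, k_cell)     # after 5 h at 873 K

print(k_cell, sigma_start, sigma_end, retention_percent(sigma_start, sigma_end))
```

Expressing durability as a retention percentage makes films with very different absolute conductivities directly comparable, which is how the annealing results are presented in Fig. 3d.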
Preparation and characterization of bulk samples. Sintered bodies with the selected compositions were prepared by solid-state synthesis to confirm the accuracy of the high-throughput measurements. CaCO3 (99.9%), Bi2O3 (99.9%), Nb2O5 (99.9%), and Ta2O5 (99.9%) were purchased from Kojundo Chemical Laboratory Co., Ltd. (Japan) and used as raw materials. The compounds were weighed to obtain target materials with the composition [Ca_x(Nb_{0.6}Ta_{0.4})_{1−x}]_{1−z}Bi_zO_δ (x = 0.7 or 0.8, and z = 0.2-0.8) and placed in an 80 ml zirconia cup together with zirconia balls and 20 ml of ethanol as solvent. The zirconia cup was placed in a Planet M2-3F planetary ball milling system (Nagao System Inc., Japan), and the mixture of raw materials was pulverized and mixed at 300 rpm for 1 h. After removing the solvent, each moulded mixture of raw materials was calcined at temperatures ranging from 973 to 1173 K for 5 h to obtain the target materials. After the calcined powder was ground and sieved, each sample underwent cold isostatic pressing (CIP) at 200 MPa to yield a compact green body. The green bodies were heat-treated for 2 h in an oxygen atmosphere at temperatures ranging from 1023 to 1173 K to form sintered bodies.
The sintered compacts were then characterized by XRD analysis using CuKα radiation (Rigaku, Ultima IV). Both surfaces of each sintered sample were polished, and a Pt electrode was added by sputtering. The conductivity of each sample was measured with the LCR meter in air from 773 to 973 K. Continuous measurements were performed at 873 K over 9 h to evaluate thermal stability.
Disk-shaped specimens 18 mm in diameter and 1.0 mm in thickness were prepared to estimate their oxygen transport numbers from electromotive force (EMF) measurements in oxygen concentration cells. A sintered compact body was set between two quartz tubes and fixed with a spring. Nitrogen and oxygen flowed from the ends of the tubes, and EMF was measured in an electric furnace at 973 K. The oxygen partial pressure was determined to be 1 atm on the oxygen supply side and 46 ppm on the nitrogen supply side with a YSZ standard sample.
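The transport-number estimate can be illustrated with the Nernst relation for an oxygen concentration cell, E = (RT / 4F) ln(p_high / p_low), with the ionic transport number taken as the ratio of measured to theoretical EMF. The temperature and the two oxygen partial pressures are those stated above; the measured EMF of 0.157 V is a hypothetical reading chosen only for illustration.

```python
import math

R_GAS = 8.314462618    # molar gas constant, J mol^-1 K^-1
FARADAY = 96485.33212  # Faraday constant, C mol^-1

def nernst_emf(temperature_k, p_high, p_low):
    """Theoretical EMF (V) of an oxygen concentration cell (4 electrons per O2)."""
    return R_GAS * temperature_k / (4 * FARADAY) * math.log(p_high / p_low)

def transport_number(emf_measured, emf_theoretical):
    """Ionic transport number as the ratio of measured to theoretical EMF."""
    return emf_measured / emf_theoretical

# Conditions from the text: 973 K, 1 atm O2 on one side, 46 ppm on the other.
emf_theory = nernst_emf(973.0, 1.0, 46e-6)

# Hypothetical measured EMF (assumption), used only to show the ratio.
t_ion = transport_number(0.157, emf_theory)

print(round(emf_theory, 4), round(t_ion, 2))
```

Under these conditions the theoretical EMF is roughly 0.21 V, so a measured value near 0.15-0.17 V would correspond to the transport numbers of about 70-80% reported in the Discussion.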