Accelerated search for materials with targeted properties by adaptive design

Finding new materials with targeted properties has traditionally been guided by intuition, and trial and error. With increasing chemical complexity, the combinatorial possibilities are too large for an Edisonian approach to be practical. Here we show how an adaptive design strategy, tightly coupled with experiments, can accelerate the discovery process by sequentially identifying the next experiments or calculations, to effectively navigate the complex search space. Our strategy uses inference and global optimization to balance the trade-off between exploitation and exploration of the search space. We demonstrate this by finding very low thermal hysteresis (ΔT) NiTi-based shape memory alloys, with Ti50.0Ni46.7Cu0.8Fe2.3Pd0.2 possessing the smallest ΔT (1.84 K). We synthesize and characterize 36 predicted compositions (9 feedback loops) from a potential space of ∼800,000 compositions. Of these, 14 had smaller ΔT than any of the 22 in the original data set.

where the minimum of martensite disappears in the free energy; (g) T > T1, parent phase is the only stable phase. During heating martensite can exist up to T1, whereas during cooling the parent phase can exist down to Tc. The thermal hysteresis (∆T) during heating and cooling can be considered as the temperature difference between T1 and Tc. Such metastability is not considered by the λ2 = 1 criterion, thus it is not a sufficient condition for discovering low thermal hysteresis alloys.
Supplementary Figure 4 | For all samples we synthesized during our design iteration loops (Table S3), ∆T = P heating − P cooling is linearly correlated with ∆T = 1 2 (As + A f − Ms − M f ). The uncertainties in the latter are much greater than the former, as shown in Supplementary    [1,2]. Similarly, in the Ti50Ni46Fe4 only R-phase has been experimentally identified [3]. ∆T values can be found in Supplementary Supplementary Figure 8 | Activation barrier calculation along a reaction coordinate pathway for B2-B19 (black) and B2-R (red) phase transformation in Ti50Ni48Fe2 alloy. Green dashed line represents the reference line that corresponds to the total energy of the B2 cubic phase. and δ indicate collective lattice strains and atomic displacements, respectively. Irreducible representations Γ + 1 , Γ + 3 and Γ + 5 denote the change in volume of the crystal, tetragonal strain and shear strain, respectively. In the B2-B19 and B2-R phase transformations, the order parameters are Γ + 3 and Γ + 5 , respectively. Subscript GS stands for ground state. Reaction coordinate 0 is the high-symmetry cubic austenite phase (B2). In reaction coordinates 1 -6, the atomic displacements (δ) were frozen and constrained to be in the unrelaxed high-symmetry positions of the B19 and R-phase. We denote this as δ=0. On the other hand, the lattice strains (order parameters) were incrementally increased from 1(Γ + 1 , Γ + 3 ,Γ + 5 ) to GS(Γ + 1 , Γ + 3 , Γ + 5 ). Activation barrier for B2-B19 and B2-R transformation is estimated as 24.48 and 5.5 meV/atom, respectively. The exact values for Γ + 1 , Γ + 3 and Γ + 5 strains in the ground state ( GS) for B19 and R structures are given in Supplementary Table 3. Reaction coordinate 7 represents the fully relaxed ground state structure.  Ni-Ti based alloys can undergo a cubic to rhombhohedral (B2 → R) or cubic to orthorhombic, monoclinic (B2 → B19, B2 → B19 ) transformation as shown in Supplementary Figure 1. Although λ 2 = 1 is widely used to explain the low thermal hysteresis in shape memory alloys, it is an aspect of elastic compatibility, i.e.,λ 2 = 1 only ensures that strain compatibility is exactly satisfied between austenite and martensite. [1,[5][6][7][8][9] However, two or more alloys with λ 2 close to 1 can possess quite different values of thermal hysteresis, ∆T. As shown in Supplementary Figure 2, the thermal hysteresis (∆T) varies from 13 K down to close to 0 K in different systems, even though in all cases λ 2 is very close to 1. [1,[5][6][7][8][9] Note that we use the definition ∆T = 1 2 (A s + A f − M s − M f ) using estimates of the transition temperatures from differential scanning calorimetry (DSC) data in this plot, expect for the alloy TiNiCuPd for which ∆T is estimated from resistivity measurements. If we define ∆T as the difference in temperatures corresponding to the peak to peak height on heating and cooling, then for TiNiCuPd it is 16 K. [1] Even using data from the experimental group, the range of ∆ T is from 3 K to 10 K for λ 2 = 1. [6] Therefore, λ 2 = 1 is a necessary but not sufficient condition for small hysteresis. Alloys which undergo a B2 to R transformation typically have a ∆T (measured peak to peak) in the range 3-5 K.

Supplementary
Thermodynamics is also an important aspect of thermal hysteresis, so that the metastability associated with first order phase transitions needs to be taken into account. As shown in Supplementary Figure 3 Metastable phases appear between T c and T 1 . For example, at temperatures above T c , but below T 0 , the stable phase is martensite but the parent phase can co-exist as a metastable phase [as shown in Supplementary Figure 3(c)]. The system has to overcome an activation barrier in traversing the transition. Similarly, at temperature above T 0 but below T 1 , the stable phase is the parent phase, but martensite can be metastable [as shown in Supplementary Figure 3(d)]. Upon heating, martensite can exist up to T 1 if thermal fluctuations are not considered. Upon cooling, the parent phase expands to T c , without considering thermal fluctuations. Therefore, the martensitic transformation does not take place at the same temperature and the thermal hysteresis (∆T) during heating and cooling can be considered as the temperature difference between T 1 and T c . T 1 − T c is the thermodynamic contribution to ∆T, and can be dependent upon chemistry (concentration and alloying elements). This explains why there can be large variations in ∆T for the same λ 2 = 1 condition (see Supplementary Figure 2).

SUPPLEMENTARY NOTE 2 Search Space
We constrain our problem to the Ni 50−x−y−z Ti 50 Cu x Fe y Pd z family. The concentration x, y, and z are varied by step of 0.1%, and with constraints of 50% − x − y − z 30%, x 20%, y 5% and z 20%. The size of our search space, training set, and virtual space are list below.
• search space size, N = 797504 • training set size, n = 22 out of N = 797504 • virtual set size, N -n = 797482 Note that to avoid complexity from processing conditions, raw materials, microstructure, all samples both in training set and our newly made samples were synthesized and measured in our group under the exact same conditions. Supplementary Table 1 shows our training set with concentrations, features and properties.

Thermal hysteresis (∆T) from DSC curves
The desired property (∆T) values were measured by using differential scanning calorimetry (DSC) with ∆T = P heating − P cooling , dictated by the need to have a reliable diagnostic. [10] We used the temperatures for the peaks P heating and P cooling on heating and cooling, respectively, from the DSC scans. Typically, ∆T is obtained using the so called tangent method using ∆T = 1 where M s and M f are the start and finish temperatures for the martensitic transformation, and A s and A f are the start and finish temperatures for the reverse transformation. We calculated the ∆T using both definitions for all of our newly made samples in the training set, and the results are shown in Supplementary Figure 4. The ∆T from the tangent method is linearly correlated with that obtained using peak to peak height with an R 2 value of 0.979. A similar dependence is obtained if we were to plot many of the other alloys from the literature. However, the uncertainitities associated with the tangent method and laboratory to laboratory variations gives a smaller R 2 value.
The uncertainities (± 0.5 K ) in using ∆T = 1 2 (A s + A f − M s − M f ) is due to the choice associated with the start and finish temperatures when employing the tangent method. As shown in Supplementary Figure 5 for our best alloy Ti 50.0 Ni 46.7 Cu 0.8 Fe 2.3 Pd 0.2 , ∆T can vary from 1.70 K to 2.25 K. However, the peak to peak ∆T has a much smaller error < 0.001. We thus use P heating and P cooling both in our training set and for all new samples.

SUPPLEMENTARY NOTE 3
Adaptive Design Loop After determining that the best regressor:selector combination for our data was SVR rbf :KG using the procedure in Methods of main text, we trained the SVR rbf on our n = 22 training data. We used it to predict the mean value (µ) and associated standard deviation (σ) for all samples in the virtual set by using bootstrap 1000 samples, the results are shown in Supplementary Figure 6 . We then employed KG as our selector to choose the next new compound by maximizing the expected improvement. Note that since our experimental setup allowed for the synthesis of four compositions at a time, we utilized the Kriging believer approach [11] to choose the best four candidates at a given time. This involved choosing the first alloy as before but the second choice was made after augmenting the training set with the predicted result of the first alloy, treating it as the actual measurement. This procedure was repeated for the third and fourth choices. The four selected compound were synthesized and the property (∆T) measured using DSC. Amongst the four samples in this 1st iteration, one of them had no martensitic transformation whereas another became the alloy with the second best ∆T in the data set. The predicted versus measured ∆T are shown in Supplementary Figure 6 to evaluate the performance of the regressor (until 6 th iteration). We find that with successive iteration, the inference model improves in predicting the ∆T. This is clearly seen in the first and the sixth iteration for the data points whose ∆T∼30 K (i.e. alloys that do not have martensitic transformation) and whose uncertainties are the largest.

SUPPLEMENTARY NOTE 4 Density Functional Theory
In Supplementary Figure 1, we schematically show the austenite [ Supplementary Figure 1(a)] and the two martensite structures [ Supplementary Figure 1(b) and (c)]. The space groups of the optimized structures from DFT were determined using FINDSYM [12], mode decomposition analysis was performed using ISODISTORT [4] and the crystal structures were visualized in VESTA [13].
Addition of alloying elements (e.g. Cu, Pd and Fe) to the binary Ti 50 Ni 50 compound modifies the transformation product and/or transformation route [3]. For example, in Ti 50 Ni 50 , the phase transformation occurs between B2 cubic (austenite) and B19 monoclinic (martensite) phases. Addition of Fe to Ti 50 Ni 50 (where Fe occupies the Ni-site) introduces an intermediate R martensitic phase (rhombohedral) such that the Ti 50 Ni 50−x Fe x alloy shows a two-stage B2-R-B19 transformation [3]. Furthermore, at higher concentrations of Fe (x > 3%) the B19 phase is completely suppressed [14]. Similarly, addition of Cu and Pd elements also affect the transformation product and route. Ti 50 Ni 50−x Cu x alloy with 5% or more Cu shows B2-B19-B19 , where B19 is an intermediate orthorhombic martensite phase. On the other hand, Pd substitution changes the transformation to B2-B19, instead of B2-B19 . As a result, there is a complex interplay between the alloying elements, their relative concentrations, martensite product and the transformation route.
The purpose of our DFT calculation is not to establish one-to-one mapping between composition and thermal dissipation. We use inference methods (see Supplementary Figure 6) to accomplish this objective. Neither do we uncover new DFT-based descriptors that could serve as a proxy for thermal dissipation. Our goal is to glean insights into the low ∆T behavior of Ti 50.0 Ni 46.7 Cu 0.8 Fe 2.3 Pd 0.2 alloy and to do so, we perform two types of DFT calculations: 1. We choose alloys whose ∆T values and martensite phases are known from experiments and have different valence electron number (VEN) values. For these alloys, we compute the total energies in both the austenite (B2) and martensite phases (R or B19, whichever is experimentally observed). We then calculate the total energy difference, ∆E (where ∆E = E martensite −E B2 ). For small thermal dissipation, it is reasonable to assume that an alloy should have negative ∆E that provides an adequate driving force for martensite transformation and yet the magnitude (|∆E|) should be relatively small as this is a measure of the depth of the potential that has to be overcome on cooling and heating.
2. In addition to having a favorable |∆E|, the activation barrier should also be small for low ∆T. This is schematically shown in Supplementary Figure 2. We calculate the activation barrier by choosing a TiNiFe alloy whose Feconcentration is more or less similar to the newly discovered Ti 50.0 Ni 46.7 Cu 0.8 Fe 2.3 Pd 0.2 alloy and has the same VEN. It is strictly not necessary to know the experimental ∆T value for this TiNiFe alloy (unlike the previous case in Bullet 1, where the knowledge of ∆T is critical). We estimate its activation barriers for B2-R and B2-B19 along a path dictated by the lattice strain (order parameter here) and compare them to determine the energetically favorable transformation product.
In Supplementary Figure 7, the total energy differences (∆E) from DFT between the austenite and martensite phases for the Ti 50 Ni 34 Pd 16 , Ti 50 Ni 34 Cu 16 and Ti 50 Ni 46 Fe 4 compositions are shown. As noted in the main manuscript and Bullet 1 above, the Ti 50 Ni 34 Cu 16 and Ti 50 Ni 46 Fe 4 alloys with valence electron number (VEN) values 7.16 and 6.92, respectively, fall in the two minima in Figure 3b (in the main manuscript) and Ti 50 Ni 34 Pd 16 (VEN=7) corresponds to a data point away from the minima. We considered only the B19-phase for Ti 50 Ni 34 Pd 16 and Ti 50 Ni 34 Cu 16 , based on recent experimental findings [1,2]. Similarly, in the Ti 50 Ni 46 Fe 4 only R-phase has been experimentally identified [3]. Experimental ∆T for these alloys in their respective martensite phases can be found in the Supplementary Table 1. It should be noted that our DFT calculations use the virtual crystal approximation (VCA), where we average the pseudopotentials of the two atoms that have partial site occupancy and do not consider the local effects. As a result, the outcome and energy trends from our DFT calculations are at best qualitative.
In addition to the three alloys given in Supplementary Figure 7, we also performed DFT calculations on Ti 50 Ni 48 Fe 2 to accomplish the objective stated in Bullet 2. Note that Ti 50 Ni 48 Fe 2 has Fe-concentration similar to that of our Ti 50.0 Ni 46.7 Cu 0.8 Fe 2.3 Pd 0.2 alloy and same VEN value of 6.96. However, adequate ∆T data for Ti 50 Ni 48 Fe 2 from our DSC measurement is not available for comparison. The goal is to determine the activation barrier (Supplementary Figure 3) for B2-B19 and B2-R transformation in the Ti 50 Ni 48 Fe 2 alloy and gain insights on the low ∆T behavior of Ti 50.0 Ni 46.7 Cu 0.8 Fe 2.3 Pd 0.2 alloy. It is well known that the crystallography of the martensitic phase (whether it is B19 or R) has a key effect on the hysteresis. Generally, the R-phase has a smaller dissipation compared to the B19-phase, due to the smaller transformation strain involved during the austenite to martensite phase transformation [3].
For Ti 50 Ni 48 Fe 2 , we fully relaxed the lattice and atomic positions in both B19 and R structures. In Supplementary  Table 3, the relative amplitudes for the lattice strain modes (order parameters) for the Ti 50 Ni 48 Fe 2 alloy in B19 and R structures are given. Clearly, the tetragonal strain (Γ + 3 ) and shear strain (Γ + 5 ) manifests as the order parameters for B19-and R-phases, respectively. Furthermore, the amplitude of Γ + 5 for B2-R is less than that of the Γ + 3 for B2-B19. We then performed a series of computational simulations to estimate the activation barrier for Ti 50 Ni 48 Fe 2 in the B2-B19 and B2-R phase transformation. The results are given in Supplementary Figure 8. Our calculations show that the activation barrier for B2-R is ∼5 times smaller than that for the B2-B19 transformation. For completeness, the barrier for B2-B19 is also estimated to be 67.01 meV/atom, which is much higher than that for the B2-B19 transformation (data not shown in Supplementary Figure 8). Ideally, a variable-cell nudged elastic band calculation [15] would provide a more accurate estimation of the activation barrier; however this method is not implemented in the Quantum ESPRESSO package [16] used in this work.