Introduction

The martensitic transformation (MT) is a class of first-order displacive phase transformations primarily observed in phase-transforming metals and their alloys when subjected to high temperature and/or pressure or stress. The transformation involves interesting mechanical properties, from strengthening to superelasticity, and shape memory effects.1,2,3,4 Therefore, understanding the atomistic mechanisms underlying the MT plays an important role in achieving the desired material properties. The development of the microstructure arising from these transformations is strongly governed by the crystallographic symmetry or geometry of the phases;5,6 hence, atomistic computer simulations can be especially valuable to gain insight into the details of the phase transformation kinetics.7,8,9,10 However, this is still a challenge for certain allotropic metals due to the lack of reliable interatomic potentials. Despite the availability of a large number of semiempirical potentials for single-component metals,11,12,13 few faithfully reproduce phase equilibrium and transformations, limiting the applicability of the atomistic simulations.

The allotropic metal zirconium, with transition temperatures and pressures that are relatively accessible,14,15,16 makes an ideal candidate for studying phase transformation behavior. At ambient pressure, the equilibrium crystalline structure of Zr is hexagonal close packed (hcp) below 1180 K and becomes body-centered cubic (bcc) above this temperature. This hcp to bcc phase transition occurs, in the context of phonons, due to the anomalies at the N-point phonon of the T1 branch along the [ξ ξ 0] direction.17,18 Under pressure, Zr exhibits the crystal structure sequence hcp → ω → bcc, and first-principles calculations indicate that such a sequence is due to an increasing occupation of the d-states with pressure.19 The pressure-induced hcp to ω transformation is particularly important because the high-pressure ω phase can strengthen the metal while also greatly lowering its toughness and ductility.20,21 This phase transformation can occur in α-Zr under hydrostatic, shock loading, or high-pressure torsion conditions.14,21,22 The room temperature (RT) α to ω phase transition has been observed to occur between 2 and 7 GPa, depending on the experimental technique, the pressure or shock environment, and the sample purity.21 Moreover, the transformation is martensitic, driven by shear and shuffles.23,24

To understand these transformations in Zr at the atomistic level, several embedded-atom (EAM) and modified embedded-atom (MEAM) potentials have been developed.13,25,26,27 Unfortunately, the global phase diagram of pure Zr is still poorly described by existing semiempirical potentials, because parameterization inevitably trades accuracy between the properties of different phases. The overwhelming majority of these potentials were designed to reproduce physical properties of the hcp phase and point defects, and only a few can describe the bcc–hcp transitions at atmospheric pressure.25,28 These potentials do not reproduce, for example, the α(hcp) → ω transformation under pressure, which hinders their use for studying the mechanisms underlying this transformation and its effects on the microstructure and properties under driven conditions. An additional challenge is that current semiempirical potentials tend to predict rather low basal stacking fault energies and therefore fail to capture the deformation behavior of α-Zr (i.e., prismatic slip and twinning).16,28,29

A class of machine-learning (ML)-based simulation methods has recently emerged as a promising means for enabling atomic simulations with quantum-mechanical (QM) accuracy but affordable computational costs. The key idea is to map a set of atomic environments directly onto numerical values for energies and forces. In contrast to semiempirical methods (with analytical functional forms), the energies and forces are “learned” from a set of higher-level (quantum-mechanics-based) reference calculations using the ML algorithm. This essential scheme has been applied to several single-component metals showing better accuracy of atomic forces or energies,30,31,32,33 including for the bulk RT phases for Ti.32 However, such single-phase-based learning strategies are not transferable to situations that have complex MT processes.

Our objective is to construct an interatomic potential that can capture complex MTs in transforming metals. In this paper, we developed a machine-learned interatomic potential for Zr by incorporating domain knowledge of the MT, such as phase transformation pathways and electronic density changes. The interatomic potential reproduces not only the properties of the bcc, hcp, and ω phases, but also the MT between them at a given temperature or pressure, consistent with experimental results. The results of testing this potential suggest that it can be employed in future classical molecular dynamics (MD) simulations of thermal and mechanical processing as well as the related microstructure development. Our approach, based on previous efforts in the literature,30,33 is general and can be applied to other metals, where several competing phases can occur under extreme pressures or temperatures.

Results

Strategy

We adopt a Gaussian process-type ML approach31,34 to approximate the true potential energy surface (PES), with the aim of capturing the different phases of Zr. To make simulations of large systems tractable, the total energy is expressed as a sum of local energy contributions from all the atoms.35 Each atomic energy contribution depends only on the atom's local environment, which is represented by a feature-space vector, or fingerprint, so that the problem becomes more amenable to an ML representation. Figure 1 illustrates the three key steps of the ML interatomic potential construction process: collecting reference data, fingerprinting the atomic environments, and establishing a robust mapping between fingerprints and energies.

Fig. 1

Scheme of the machine-learning interatomic potential development for allotropic metals. Ab initio molecular dynamics runs and the NEB technique are used to accumulate the reference database, in which the atomic configurations are transformed to numerical fingerprint vectors. The training database is then used to establish the mapping between fingerprints and atomic energies, which generates the interatomic potential.

The accuracy of the ML potential depends strongly on the selection of the fingerprint. Such a fingerprint should differentiate dissimilar configurations with adequate accuracy and be invariant under translation, rotation, and the permutation of atoms. While several such prescriptions have been proposed for solids in the past,33,34 a complete description of allotropic metals places a higher demand on the choice of feature vectors, as they need to capture the local atomic environment of not only the stable phases but also the transition states along the transformation pathways. In our scheme, three different types of local environments related to structural phase transformations (Fig. 1) are fingerprinted for Zr atoms, namely, the change in bond length (pairwise terms), shape change (three-body terms), and volume change (many-body terms). The atomic energy can thus be expressed as a sum of these contributions, i.e.,

$$E_i = \varepsilon _i^{2b} + \varepsilon _i^{3b} + \varepsilon _i^{Mb}$$
(1)

where “2b,” “3b,” and “Mb” refer to two-, three-, and many-body terms, respectively. The local energy corresponding to each component d ∈ {2b, 3b, Mb} is given by a linear combination of kernel functions:

$$\varepsilon _i^d = \mathop {\sum}\limits_t {w_t^d} K(V_i^d,V_t^d) + b_0$$
(2)

Here, K is a linear kernel function of the form K(x, y) = x · y, whereas \(w_t^d\) and b0 denote the weighting coefficients and a constant obtained from the fitting procedure, respectively. The index t labels each reference atomic environment and \(V_t^d\) is its corresponding fingerprint vector.
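As a minimal sketch of how Eqs. (1) and (2) could be evaluated once the coefficients are known, the snippet below uses random placeholder numbers rather than fitted values. The function names, the number of reference environments (five), and the constant 0.1 are hypothetical; only the fingerprint lengths (8, 8 × 3, and 16) follow the counts quoted in the text.

```python
import numpy as np

def component_energy(v_i, V_ref, w, b0):
    """Eq. (2) with the linear kernel K(x, y) = x . y.

    v_i   : (D,) fingerprint of atom i for one component d
    V_ref : (T, D) fingerprints of the T reference environments
    w     : (T,) weighting coefficients, b0 : constant offset
    """
    return float(w @ (V_ref @ v_i) + b0)

def atomic_energy(fingerprints, models):
    """Eq. (1): sum of the two-, three-, and many-body contributions."""
    return sum(component_energy(fingerprints[d], *models[d])
               for d in ("2b", "3b", "Mb"))

# Placeholder data: random numbers standing in for fitted coefficients.
rng = np.random.default_rng(0)
dims = {"2b": 8, "3b": 24, "Mb": 16}
models = {d: (rng.normal(size=(5, n)), rng.normal(size=5), 0.1)
          for d, n in dims.items()}
fingerprints = {d: rng.random(n) for d, n in dims.items()}

E_i = atomic_energy(fingerprints, models)

# With a linear kernel the sum over t collapses to one weight vector per
# component, beta = sum_t w_t V_t, so evaluation cost is independent of T.
E_collapsed = sum(float((w @ V_ref) @ fingerprints[d] + b0)
                  for d, (V_ref, w, b0) in models.items())
```

The last lines illustrate a general property of linear kernels, not a statement about how the potential was implemented in this work.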

The fingerprints for pairwise contributions are generated by Gaussian functions suggested by Behler and Parrinello.35,36 Specifically, the two-body component comprises radial functions, that is,

$$v_i^{2b}(\eta _s) = \mathop {\sum}\limits_{j \ne i} {e^{ - (r_{ij}/\eta _s)^2}f_c(r_{ij})}$$
(3)

where rij is the distance between atoms i and j, and ηs is the width of the Gaussian function. The damping function \(f_c(r_{ij}) = 0.5[1 + \cos (\pi r_{ij}/R_c)]\) applies to atoms within the cutoff distance Rc and is zero elsewhere. Here, eight different ηs (Table 1) are adopted to construct the fingerprint vector for the two-body interactions in Zr, that is, \(V_i^{2b} = \{ v_i^{2b}(\eta _1),...,v_i^{2b}(\eta _8)\}\).
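The radial fingerprint of Eq. (3) can be sketched in a few lines of NumPy. This is an illustration for an isolated cluster only: periodic images are not handled, and the widths, cutoff, and coordinates below are hypothetical toy values, not the Table 1 parameters.

```python
import numpy as np

def cutoff(r, r_c):
    """Cosine damping function f_c: smooth inside r_c, zero outside."""
    return np.where(r < r_c, 0.5 * (1.0 + np.cos(np.pi * r / r_c)), 0.0)

def two_body_fingerprint(positions, i, etas, r_c):
    """Eq. (3) for atom i of an isolated cluster (no periodic images)."""
    neighbors = np.delete(positions, i, axis=0)
    r = np.linalg.norm(neighbors - positions[i], axis=1)
    fc = cutoff(r, r_c)
    # One fingerprint element per Gaussian width eta_s.
    return np.array([np.sum(np.exp(-(r / eta) ** 2) * fc) for eta in etas])

# Hypothetical toy inputs (angstrom).
etas = np.linspace(1.0, 8.0, 8)          # eight Gaussian widths
r_c = 6.0                                 # cutoff radius
positions = np.array([[0.0, 0.0, 0.0],
                      [3.0, 0.0, 0.0],
                      [0.0, 3.0, 0.0],
                      [0.0, 0.0, 7.0]])   # last atom lies beyond the cutoff

V_2b = two_body_fingerprint(positions, 0, etas, r_c)
```

Because only interatomic distances enter, the resulting vector is unchanged by translating or rotating the cluster or by relabeling the neighbors.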

Table 1 Adjustable parameters for the machine-learning model of Zr created in this work (see the Strategy section for definitions)

Triple-atom interactions are captured by incorporating the angular dependence into the fingerprint function, which is important for describing the shape change of lattices during the phase transformations in Zr. For each atom i, the function is constructed using

$$v_i^{3b}(\eta _s) = \mathop {\sum}\limits_{k \ne i} {\mathop {\sum}\limits_{j \ne i,j < k} {g^n\left( {\cos \theta _{jk}} \right) \cdot e^{ - \left( {r_{ij}^2 + r_{ik}^2} \right)/4\eta _s^2}} f_c\left( {r_{ij}} \right)f_c\left( {r_{ik}} \right)}$$
(4)

where θjk is the angle between atoms i, j, and k centered on atom i. gn(cos θjk) represents a polynomial function of cos θjk. Similar to \(V_i^{2b}\), the various elements of \(V_i^{3b}\) are calculated with eight different ηs, and three different gn(cos θjk) (provided in Supporting Information).
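A rough sketch of Eq. (4) is given below. The polynomials gn are defined in the Supporting Information and are not reproduced here, so the code assumes the simplest stand-in, gn(x) = x^n with n = 1, 2, 3; that choice, the widths, the cutoff, and the coordinates are all hypothetical. Periodic images are again ignored.

```python
import numpy as np
from itertools import combinations

def cutoff(r, r_c):
    """Cosine damping function f_c: smooth inside r_c, zero outside."""
    return np.where(r < r_c, 0.5 * (1.0 + np.cos(np.pi * r / r_c)), 0.0)

def three_body_fingerprint(positions, i, etas, powers, r_c):
    """Eq. (4) for atom i, with the ASSUMED polynomial g^n(x) = x**n."""
    d = np.delete(positions, i, axis=0) - positions[i]
    r = np.linalg.norm(d, axis=1)
    inside = r < r_c                      # drop atoms beyond the cutoff
    d, r = d[inside], r[inside]
    fc = cutoff(r, r_c)
    out = np.zeros((len(powers), len(etas)))
    # Sum over unordered neighbor pairs (j < k) centered on atom i.
    for j, k in combinations(range(len(r)), 2):
        cos_t = d[j] @ d[k] / (r[j] * r[k])
        radial = np.exp(-(r[j] ** 2 + r[k] ** 2) / (4.0 * etas ** 2)) * fc[j] * fc[k]
        for a, n in enumerate(powers):
            out[a] += cos_t ** n * radial
    return out.ravel()                    # length = len(powers) * len(etas)

# Hypothetical toy inputs (angstrom): two neighbors on opposite sides of atom 0.
etas = np.linspace(1.0, 8.0, 8)
powers = (1, 2, 3)
r_c = 6.0
positions = np.array([[0.0, 0.0, 0.0],
                      [3.0, 0.0, 0.0],
                      [-3.0, 0.0, 0.0]])

V_3b = three_body_fingerprint(positions, 0, etas, powers, r_c)
```

With eight widths and three polynomials the vector has 24 elements, matching the counts stated above; the loop over pairs scales quadratically with the number of neighbors, which is acceptable for a sketch but would normally be vectorized.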

For the many-body contributions (physically related to the electronic density with volume changes and structural phase transformations), we generate the fingerprint in a simple functional form similar to the embedding energy term of the MEAM potential.26 Our approach considers the neighborhood density of a given atom i, defined as

$$\rho _i^m(\mu ,\sigma ) = \mathop {\sum}\limits_{j \ne i} {e^{ - (r_{ij} - \mu )^2/\sigma ^2}f_c(r_{ij})}$$
(5)

where μ and σ are adjustable parameters. Further fingerprints for many-body components are generated using

$$v_i^{Mb}(\mu ,\sigma ) = \ln \left( {\rho _i^m(\mu ,\sigma )} \right)$$
(6)

For the present Zr potential, 16 sets of (μ, σ) (Table 1) are used to build the elements of the many-body term fingerprint vector \(V_i^{Mb}\). A similar fingerprint form of pairwise interaction has been used to describe the RT hcp-Ti phase properties;32 however, our work incorporates the three-body and many-body terms that are crucial for capturing the MT. It is important to note that the scheme also coincides with the generalized pseudopotential theory, which provides a first-principles approach to multi-ion interatomic potentials in d-band transition metals.37,38
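Equations (5) and (6) can be sketched in the same style, taking the logarithm in Eq. (6) to be the natural logarithm. The (μ, σ) pairs, cutoff, and coordinates below are hypothetical toy values rather than the 16 sets of Table 1, and periodic images are not treated.

```python
import numpy as np

def cutoff(r, r_c):
    """Cosine damping function f_c: smooth inside r_c, zero outside."""
    return np.where(r < r_c, 0.5 * (1.0 + np.cos(np.pi * r / r_c)), 0.0)

def many_body_fingerprint(positions, i, mu_sigma, r_c):
    """Eqs. (5)-(6): log of a Gaussian-weighted neighborhood density."""
    neighbors = np.delete(positions, i, axis=0)
    r = np.linalg.norm(neighbors - positions[i], axis=1)
    fc = cutoff(r, r_c)
    rho = np.array([np.sum(np.exp(-((r - mu) ** 2) / sigma ** 2) * fc)
                    for mu, sigma in mu_sigma])
    # rho is zero for an atom with no neighbors inside the cutoff, in which
    # case the logarithm diverges; bulk configurations do not hit this case.
    return np.log(rho)

# Hypothetical toy inputs (angstrom): one neighbor at a distance of 3.
mu_sigma = [(3.0, 1.0), (4.0, 2.0)]
r_c = 6.0
positions = np.array([[0.0, 0.0, 0.0],
                      [3.0, 0.0, 0.0]])

V_Mb = many_body_fingerprint(positions, 0, mu_sigma, r_c)
```

Each (μ, σ) pair probes the density of neighbors in a shell centered at distance μ, so a set of pairs gives a coarse radial profile of the local density.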

Force field performance

To analyze the quality of our ML potential, we test how much the predicted energies deviate from density functional theory (DFT) reference values. Figure 2 compares the predicted energies to the DFT calculations for all configurations of Zr that are used for training and for validation. To check the robustness of our potential for different phases, we test the configurations of the experimentally observed bcc, hcp, and ω phases individually by calculating their mean absolute errors (MAEs), which measure the difference between the first-principles calculations and the predictions of our potential. As shown in Fig. 2, the MAEs of the prediction model for the average energy of atoms in the three phases (in our test database) are 6.7 meV/atom, 5.8 meV/atom, and 4.3 meV/atom, respectively. These errors are of the order of the expected numerical and theoretical accuracy of the reference quantum-mechanics-based calculations. If we further break down the ML model’s performance according to configuration type, the training points with the lowest overall energy show the lowest fitting errors: high-pressure or low-temperature configurations are easier for an ML potential to fit because they are closer to the perfect crystalline structures. Subsequently, we test the performance of the model for a hypothetical fcc-Zr structure. Although the fcc crystal structure does not appear on the usual pressure–temperature phase diagram for Zr, the epitaxial growth of fcc-Zr thin films has been reported.39 Given that such fcc data were never included in the training set, we would expect the energies of such configurations to be difficult to predict. Surprisingly, correlation plots of the energies show an acceptable fitting error with an MAE of 6.3 meV/atom (Fig. 2d). This indicates good transferability of the present ML model to various structural environments.
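For concreteness, the per-atom MAE used as the error metric above could be computed as follows. The function name and the three energies are hypothetical placeholders chosen for easy arithmetic; they are not values from the test database.

```python
import numpy as np

def mae_mev_per_atom(e_ref, e_pred, n_atoms):
    """Mean absolute error of per-atom energies, converted from eV to meV."""
    e_ref = np.asarray(e_ref, dtype=float)
    e_pred = np.asarray(e_pred, dtype=float)
    return 1000.0 * np.mean(np.abs(e_pred - e_ref) / n_atoms)

# Placeholder total energies (eV) for three 4-atom configurations.
e_dft = [-34.000, -33.920, -33.800]
e_ml = [-33.980, -33.944, -33.788]

mae = mae_mev_per_atom(e_dft, e_ml, n_atoms=4)
```

Evaluating the metric separately for each phase, as done for Fig. 2, only requires calling the function on the corresponding subset of configurations.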

Fig. 2

Comparison of the potential energy predicted using the ML potential with AIMD calculations for a β-Zr supercell, b α-Zr supercell, c ω-Zr supercell, and d hypothetical fcc-Zr supercell. Note that the β-Zr, α-Zr, and ω-Zr data are included in the “training” dataset. A perfect correlation with the DFT values would correspond to the black lines. MAE represents mean absolute error.

We next consider the effects of volume change on atomic potential energies, as this determines the most stable phases under pressure. Figure 3a shows the energy at different volumes for the α, β, and ω phases. The DFT calculations and ML predictions follow the same trend: the β phase has much higher enthalpy than α or ω, and the most stable phase changes from α to ω upon compression at 0 K. To estimate the transition pressure for the α → ω transformation in our ML potential, we calculated the enthalpy difference between α and ω as a function of external pressure, as shown in Fig. 3b. At zero temperature, α-Zr transforms to ω at a pressure of 3.5 GPa. This is close to the estimate of the α → ω transformation pressure at RT (3.4 GPa) from the experiment of Zhang et al.15
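The enthalpy-crossing construction behind Fig. 3b can be illustrated with a toy model. The sketch assumes a harmonic energy–volume curve for each phase and uses made-up parameters (E0, V0, B) that are not fitted to Zr, so the resulting number has no physical significance; the point is only the procedure H = E + PV followed by a search for the pressure at which the enthalpy difference changes sign.

```python
import numpy as np

EV_PER_A3_TO_GPA = 160.21766208  # 1 eV/angstrom^3 expressed in GPa

def enthalpy(P, E0, V0, B):
    """H(P) = E + P V for a harmonic E(V) = E0 + B (V - V0)^2 / (2 V0)."""
    V = V0 * (1.0 - P / B)                  # follows from P = -dE/dV
    E = E0 + 0.5 * B * (V - V0) ** 2 / V0
    return E + P * V

def first_zero_crossing(x, y):
    """Linear interpolation of the first sign change of y(x).

    Assumes no grid point lands exactly on zero; returns None if y never
    changes sign on the grid.
    """
    s = np.sign(y)
    idx = np.where(s[:-1] * s[1:] < 0)[0]
    if idx.size == 0:
        return None
    k = idx[0]
    return x[k] - y[k] * (x[k + 1] - x[k]) / (y[k + 1] - y[k])

# Hypothetical per-atom parameters (eV, angstrom^3, eV/angstrom^3).
alpha = dict(E0=0.000, V0=23.0, B=0.60)   # lower energy, larger volume
omega = dict(E0=0.005, V0=22.6, B=0.65)   # higher energy, smaller volume

P = np.linspace(0.0, 0.05, 801)           # pressure grid in eV/angstrom^3
dH = enthalpy(P, **omega) - enthalpy(P, **alpha)
P_t = first_zero_crossing(P, dH)
P_t_GPa = P_t * EV_PER_A3_TO_GPA
```

The denser phase is penalized at zero pressure by its higher E0 but gains from the PV term under compression, which is why the enthalpy difference eventually changes sign.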

Fig. 3

Potential energies of α, β, and ω Zr as a function of volume or pressure for the present ML potential. a The volume-dependent energy of α (black), β (red), and ω (blue) at several volumes. The curves predicted by the ML potential are shown as lines. The DFT data for the α, β, and ω phases appear as open triangles, squares, and circles, respectively. b The enthalpy difference between α and ω as a function of pressure. The enthalpy difference suggests a transition pressure of 3.5 GPa.

Discussion

To evaluate the reliability of the ML potential, we computed several physical properties of Zr related to phase transformations. The elastic constants Cij of α, β, and ω phases were first computed by applying a set of small volume-conserving strains and fitting the energy change with a parabolic function, which allows for relaxation of the atomic positions. Accurate elastic constants are important for the correct description of the long-range strain fields around martensitic variants and defect structures (such as dislocations). As indicated in Table 2, the maximum deviation between ML and DFT elastic constants is 10%, indicating the accuracy of the potential for effects of strain on Zr. Note that elastic constants in the table were not included in the “training” data, but came out to be in good agreement with the DFT data. This agreement indicates adequate model transferability.
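The parabolic-fit step can be sketched as follows for a single strain mode. The energies are generated synthetically from an assumed modulus of 100 GPa so that the fit can be checked against a known answer; the cell volume, the strain range, and the function name are hypothetical, and the mapping from the fitted curvature to a specific Cij (or combination of Cij) depends on the strain tensor applied, which is not specified here.

```python
import numpy as np

EV_PER_A3_TO_GPA = 160.21766208  # 1 eV/angstrom^3 expressed in GPa

def modulus_from_parabola(strains, energies, volume):
    """Fit E(eps) = c0 + c1 eps + c2 eps^2 and return (2 c2 / V) in GPa."""
    c2, c1, c0 = np.polyfit(strains, energies, 2)
    return 2.0 * c2 / volume * EV_PER_A3_TO_GPA

# Synthetic data: E = E_ref + (1/2) C V eps^2 with a hypothetical C.
C_true = 100.0                                # GPa, illustration only
volume = 1000.0                               # angstrom^3
strains = np.linspace(-0.01, 0.01, 9)         # small strains, harmonic regime
energies = -850.0 + 0.5 * (C_true / EV_PER_A3_TO_GPA) * volume * strains ** 2

C_fit = modulus_from_parabola(strains, energies, volume)
```

In a real calculation each energy would come from a relaxed, strained supercell, and the strain range would be chosen small enough that anharmonic terms do not bias the quadratic coefficient.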

Table 2 Calculated elastic constants (units: GPa) of α-Zr, β-Zr, and ω-Zr using the present ML potential in comparison with DFT data

Surface energies are among the least predictable properties for classical atomic potentials as they often involve large changes of coordination number or atomic environments. The most representative ones are stacking fault energies as they control the dislocation slip or deformation twinning processes, thus influencing the development of martensitic variants. Here we consider stacking fault energies corresponding to the basal plane as they are easily underestimated by existing EAM or MEAM potentials.28,29 There are three possible faults on the basal plane. The two intrinsic faults have one of the two stacking sequences ABABCBCB (I1) and ABABCACA (I2), while the extrinsic fault has the stacking sequence ABABCABAB (E). The most important fault is the I2-type intrinsic stacking fault, which determines the dissociation of a dislocation on the basal plane into partial dislocations.28 The ML potential correctly predicts a metastable I2 stacking fault with a stacking fault energy of 205.5 mJ/m2, close to its DFT counterpart (i.e., 201.0 mJ/m2).29

Figure 4 compares the phonon spectra obtained with the present ML potential for α, β, and ω Zr with the available experimental data and results from DFT calculations. Phonon–dispersion curves at 0 K were computed by Fourier transformation of the dynamical matrix in several high-symmetry directions. For all three phases, the potential accurately reproduces the low-frequency regions of the phonon spectrum but overestimates the frequencies of the optical branches. It should be noted that quite similar discrepancies were found in all previous calculations with classical potentials.16 Both acoustic and optical phonons are needed to describe the shuffle and strain degrees of freedom in the martensitic phase transformations.

For the high-temperature β-Zr, the bcc lattice becomes mechanically unstable at lower temperatures (Fig. 4a) and shows an unstable phonon branch along the T-[110] direction. This corresponds to the Burgers mechanism of the hcp–bcc (α−β) transition. The zero-temperature phonon results for the present ML potential and DFT reproduce this instability by showing an imaginary phonon branch at the N point. However, the experimental data40 did not show this due to the fact that the neutron-scattering experiment was performed at high temperatures. Note that the DFT calculation also shows another unstable phonon branch which is responsible for the (111) plane collapse mechanism of the β → ω transformation. This is associated with the ω phase being stable from the DFT calculation, whereas the α phase is most stable for our ML potential.

Fig. 4

The phonon spectra of α, β, and ω Zr. The ML results are compared to DFT and to experimental data38 for the α and β phases (blue circles). For the α and ω phase, the ML potential accurately reproduces the low-energy acoustic branches but overestimates optical and high-energy acoustic branches due to the inaccurate prediction of point-defect energies. The β phase is mechanically unstable at low temperatures and transforms to α or ω. Imaginary frequencies of unstable modes are plotted as negative values

The phonon spectra of the α and ω phases are shown in Fig. 4b, c, respectively. For the hcp (α) Zr phase, although we reproduce the overall trend of the experimental phonon branches, the ML potential overestimates the experimental optical-phonon frequencies by about 20–30%. Since there are no experimental data for the high-pressure ω phase, we compare the phonon spectrum for the ML potential to DFT calculations and find better agreement than that for the α phase. In contrast to the α phase, the optical branches of the ω phase are underestimated by the ML approach and the deviations are much smaller. Previous studies suggest that the deviations of the optical phonons are due to the inaccurate prediction of point-defect energies, which do not adversely affect the energy barrier for the structural phase transformations and the phase diagram.16,41

We have shown that the present ML potential adequately describes a number of basic physical properties of the pure Zr system. In this section, we present several core applications for which our potential is particularly well suited. In particular, classical MD simulations with the present ML interatomic potential are used to study the different phases and martensitic transformations between them. The simulations determine the phase stability and temperature–pressure phase diagram of α, β, and ω Zr, as well as possible phase transformation mechanisms.

To estimate the stability range of the α, β, and ω phases, we perform MD starting from a 12 × 12 × 12 bcc (β) supercell with 3456 Zr atoms that is commensurate with all three phases if properly strained. MD simulations were performed using a time step of 1 fs. Periodic boundary conditions were applied along all three dimensions. The Nosé–Hoover thermostat42 and the Parrinello–Rahman barostat43 were used to control temperature and pressure, respectively. For each pressure and temperature value, we simulate up to 0.5 ns and observe the phase evolution of the system. In experiments, significant mechanical constraints cause additional transformation hysteresis. The Parrinello–Rahman barostat allows changes in the size and shape of the simulation box, easing this mechanical constraint.

Figure 5a shows a cross-sectional view of the β → α phase transformation at low pressures. Upon continued cooling to 300 K at 0 GPa, the product α phase begins to nucleate randomly within the bcc matrix, characterized by the coexistence of bcc (blue atoms) and hcp (orange atoms) lattices, as shown in the middle figure of Fig. 5a. The growth of the hcp nucleus gives rise to the formation of two α variants (top right). This multi-domain martensite structure has been frequently observed in previous simulations for the bcc–hcp phase transformations.44 Moreover, indexing of the initial bcc and the final hcp structures clearly indicates that the phase transformation occurs via the Burgers mechanism,2,45 which has the orientation relationship (110)bcc ∥ (0001)hcp and [111]bcc ∥ [11–20]hcp.

Fig. 5

Typical microstructure evolution of β-Zr during cooling at P = 0 GPa and P = 8.0 GPa using the present ML potential. The initial perfect bcc structure transforms to the hcp lattice including two domain boundaries a while it transforms to ω phase at high pressure b. The blue color represents the ideal bcc structure, green is the distorted bcc structure, orange represents the ideal hcp structure, and the microstructure with orange and blue stacking is the ω phase

The increase of pressure changes the product phase from α to ω. A typical β → ω phase transformation process is shown in Fig. 5b. At 8 GPa, two ω variants are successively formed with their basal axis along different [111]bcc directions. The narrow ω variant is embedded within the matrix of the other, and the dark blue atoms show the corresponding domain boundaries (bottom middle). To minimize the total energy of the system, the large ω variant then grows at the expense of the narrow ω variant (bottom right). In contrast to the transition at low pressures, the final ω structure maintains a perfect single-domain structure. The corresponding structures before and after the phase transformation indicate that the β → ω transformation in Zr indeed follows the (111)bcc plane collapse mechanism.21

Figure 6 shows the predicted Zr equilibrium phase diagram as a function of pressure and temperature for the ML potential. At pressures below 4.0 GPa and temperatures below 900 K, the β phase transforms into the α phase. At pressures above 4.0 GPa and temperatures below about 800 K, the β phase transforms into the ω phase. The transition temperatures of both β → α and β → ω drop gradually with increasing pressure (dP/dT < 0), and the β phase becomes more stable under higher pressures. In contrast, the phase boundary between the α and ω phases shows a positive slope (dP/dT > 0). In addition, the triple point between the α, β, and ω phases is inferred to occur at about 4 GPa and 800 K.

Fig. 6

Predicted phase diagram of pure Zr as a function of pressure and temperature. The ML potential accurately describes all three solid phases α, β, and ω of Zr and captures the martensitic phase transformations between them. The open squares, circles, triangles, and the orange lines are obtained from ML–MD simulations, while black lines46 and solid symbols belong to previous experimental data.46,47,48,49,50

The ML predicted phase diagram agrees reasonably well with experimental observations and previous theoretical calculations.46,47,48,49,50 At zero pressure, the calculated bcc–hcp transformation temperature is about 1000 K, slightly lower than the experimental value (1180 K). Experimental measurements of the RT α–ω transformation pressure are highly dispersed (between 2 and 7 GPa) due to the large hysteresis of the α–ω transformation. The α → ω transformation for the present ML potential occurs at 4 GPa, consistent with the region of experimental measurements.46 Experimental values for the triple point range from 3 to 4 GPa and 800 to 1000 K, similar to the ML potential values of 4 GPa and 800 K. Although not very accurate as yet, the agreement of the ML transition pressure with experimental data provides confidence that we can accurately simulate pressure-induced MT, enabling us to access atomic simulation of large supercells and understand the transformation mechanisms, kinetics, as well as mechanical properties under different boundary conditions.

The structural motifs of the Zr phases are distinct and dominated by local interactions, which allows the interatomic potential to capture the emergent dynamics successfully. Unlike classical potentials, our ML potential is learned directly from high-throughput QM calculations, specifically ab initio molecular dynamics (AIMD) simulations of Zr at finite temperatures. It can therefore capture the QM information (including temperature effects) needed to describe the different phases of Zr during dynamic processes, enabling safe interpolation.

In summary, we have developed an interatomic potential for the phase-transforming metal, Zr, by directly learning from reference quantum MD simulations. The resultant ML approach predicts energies that are mainly of the order of meV/atom and demonstrates good transferability to various structural environments. In contrast to the existing semiempirical potentials, the ML potential reproduces properties of the α, β, and ω phases with reasonable accuracy. Several applications have been presented for which our potential is particularly well suited. In particular, classical MD simulations have been performed to investigate both temperature- and pressure-induced MT. The simulations reasonably reproduce the Zr equilibrium phase diagram as a function of pressure and temperature, as well as the transformation mechanisms between α, β, and ω phases. The strategies outlined here to construct an interatomic potential for Zr that shows complex pressure- or temperature-driven MT behavior can be applied to other phase-transforming systems (such as Fig. S2 for potassium).

Methods

Reference database preparation

Structures for reference atomic environments and benchmarks were accumulated from ab initio MD runs and nudged elastic band (NEB) calculations.51 Both were performed using the Vienna ab initio simulation package (VASP)52 within the Perdew–Burke–Ernzerhof generalized gradient approximation (GGA)53 for the exchange-correlation functional. To ensure the transferability of the potential to a wide variety of atomistic situations, Zr in different geometric arrangements was considered, including modest-sized bulk samples in the bcc, hcp, or ω phase (the lattice constants and sizes of the supercells are listed in Supporting Information). For each AIMD run, the sample was generated at the appropriate density and held at a constant temperature (i.e., 100 K, 800 K, or 1600 K) for 6000 steps. The time step was 1 fs in all AIMD simulations. Snapshots were extracted from the AIMD trajectories and recalculated with a higher cutoff energy of 420 eV and a denser k-point mesh of 3 × 3 × 3 to determine accurate forces and energies for the training process. Once the AIMD training database of the bcc, hcp, and ω structures had been generated, it was further extended by including configurations generated by biasing the atoms collectively from one phase to the other along the well-known transformation pathways of the β → α, β → ω, and α → ω phase transformations23,24 with the NEB technique (provided in Supporting Information).

Interatomic potential parameterization

As shown in Fig. 1, the combined reference database (including numerical fingerprint vectors and the corresponding target values) was machine-learned to estimate the coefficients in Eq. (2). To achieve a better precision, all the fingerprints in the database are initially normalized to [0,1] before the regression process. All coefficients \(w_t^d\) and b0 are then learned by using the kernel ridge regression method.54 Furthermore, atomic forces can be found by differentiating the analytic form of the potential, and because all fingerprints are based on interatomic separations, the forces on atom i can be written as a sum of contributions from its neighbors, j, each acting along the vector rij. Accordingly, the pressure of the system can also be calculated using the method developed by Thompson and Plimpton.55
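The normalization and regression steps can be sketched with plain NumPy. This is a simplified illustration under several assumptions: it regresses one target value per fingerprint vector (how per-atom contributions are tied to total energies in this work is not reproduced here), it fixes b0 to the mean of the targets, and the regularization strength `lam` and the random data are hypothetical.

```python
import numpy as np

def minmax_scale(X):
    """Scale every fingerprint component to [0, 1] over the database."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant columns
    return (X - lo) / span, lo, span

def krr_fit(X, y, lam):
    """Kernel ridge regression with the linear kernel K(x, y) = x . y."""
    b0 = y.mean()                            # constant offset (assumed choice)
    K = X @ X.T                              # Gram matrix of the references
    w = np.linalg.solve(K + lam * np.eye(len(y)), y - b0)
    return w, b0

def krr_predict(X_new, X_train, w, b0):
    """Evaluate sum_t w_t K(x, V_t) + b0 for each row x of X_new."""
    return (X_new @ X_train.T) @ w + b0

# Placeholder database: random fingerprints and targets.
rng = np.random.default_rng(0)
X_raw = rng.normal(loc=5.0, scale=2.0, size=(40, 6))
y = rng.normal(size=40)

X, lo, span = minmax_scale(X_raw)
lam = 1e-3
w, b0 = krr_fit(X, y, lam)
y_hat = krr_predict(X, X, w, b0)
```

New environments must be scaled with the stored `lo` and `span` of the training database before prediction; recomputing the minimum and maximum on new data would change the meaning of the learned coefficients.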