Introduction

Most mechanical responses of structural metals and alloys are governed by defect interactions at the atomistic scale and their evolution at the meso- to macro-scales. Depending on the material system and deformation conditions, point, line and planar defects play distinct roles over a diverse range of length and time scales in determining material yield, hardening, creep and fracture behaviour. Understanding defect interaction and evolution over realistic length and time scales is thus important. Quantitative modelling of defects requires an accurate description of atomic interactions, which in turn are dictated by the underlying electronic structure. While quantum mechanics-based methods (e.g., density-functional theory, DFT1) are general and robust, they tend to be computationally expensive and scale poorly with increasing system size. As a result, they are routinely limited to simulations of systems containing a few hundred valence electrons and to durations of only a few picoseconds. Explicit simulations of extended defects, such as dislocations, grain and interphase boundaries, require much larger atomic systems that are well beyond the reach of current DFT calculations.

Interactions between atoms can be approximated by analytical or numerical functions in the form of empirical/semi-empirical interatomic potentials/force-fields. By omitting the explicit treatment of the complex electronic structure, empirical/semi-empirical interatomic potentials gain efficiency at the expense of transferability and accuracy. While the functional forms and parameters are empirically chosen and fitted to capture essential features relevant to the intended application and material system, they are frequently inadequate to describe properties of interest. For example, one long-standing, unresolved challenge for classical interatomic potentials is to reproduce fundamental properties (e.g., generalised stacking fault energy γ-lines2, which measure the energy variation as a function of slip between two atomic planes) of the competing slip systems of hexagonal close-packed titanium (HCP Ti); resolving such challenges is a prerequisite for accurate simulations of plasticity and fracture in all metals and alloys. The lack of accuracy and transferability, the heavy reliance on empiricism, and the associated uncertainty are primary drawbacks of classical interatomic potentials. In this work, we introduce a general procedure for training accurate, fit-for-purpose neural-network interatomic potentials and demonstrate this approach by training an interatomic potential for accurate simulations of the mechanical response of Ti.

Pure Ti exhibits three distinct crystal structures (HCP-α, BCC-β, and hexagonal-ω) and undergoes allotropic phase transformations between these as a function of temperature and pressure. All three structures are elastically and plastically anisotropic and present a variety of dislocation and twinning behaviours. Different classes and formulations of interatomic potentials for Ti have been proposed and applied to provide insight into the complex properties of Ti (EAM3,4, MEAM5,6,7,8, tight-binding9, bond-order10). However, these potentials (except bond-order, which has other inconsistencies, e.g., no minimum on the γ-surface of the prism plane10,11) yield inaccurate γ-line profiles and/or generally predict low stacking fault energy on basal planes (with respect to DFT, as shown in Supplementary Fig. 1). To overcome the systematic inadequacy of these empirical/semi-empirical interatomic potentials, we focus on machine learning neural network-based interatomic potentials12. In particular, we develop an interatomic potential using the Deep Potential (DP) method13,14,15, which provides a robust and flexible approach to describe atomic environments/interactions to replace some of the empiricism of classical interatomic potentials. We adopt the recently established Deep Potential Generator (DP-GEN) scheme16,17 to train potentials in an efficient and systematic manner to further reduce the demand on empiricism. While many different machine learning approaches to training interatomic potentials18,19,20,21,22,23,24,25 are emerging (some for Ti26,27), we explicitly focus on developing machine learning potentials for the prediction of mechanical properties of α − β Ti.

Here, we propose a specialising step, an extension to the current DP-GEN scheme, to systematically train machine learning interatomic potentials to reproduce the crystal structures, elastic constant tensors, and surface and stacking fault energies of Ti. The training datasets include experimental data from the literature and DFT data calculated in this work. The resulting potential not only closely reproduces the defect properties that control mechanical behaviour and were used in the training, but also captures a wide range of properties not explicitly included in the training dataset, including the vacancy formation energy and the γ-lines of all relevant slip planes in the HCP and BCC phases. Most critically, the ordering of the stacking fault energies between the basal {0001}, prism \(\{10\bar{1}0\}\), pyramidal I \(\{10\bar{1}1\}\), and pyramidal II \(\{11\bar{2}2\}\) planes in HCP (see the schematic in Fig. 3) is correctly captured. Furthermore, this potential reproduces the general features of the γ-lines on the {110}, {112}, and {123} planes of the BCC structure, including the lack of metastable points and the negative stacking fault energies. Calculation of the free energies of the different phases further demonstrates that the potential reproduces the phase stability between the HCP, BCC and liquid phases over a range of pressures and temperatures. The developed interatomic potential thus enables molecular statics and molecular dynamics (MD) simulations of dislocation core structure, dynamics, fracture, and phase transitions. Our approach, applied here to Ti, is general and can be used to train interatomic potentials for accurate reproduction of the mechanical properties of a wide range of materials.

Results and discussion

Strategy and workflow

Figure 1 shows the workflow for training and specialising DP models. The workflow consists of three steps: Initialisation, DP-GEN Loop, and Specialisation. In the Initialisation step, primitive cells of BCC, FCC, and HCP Ti are constructed and equilibrated at zero stress and 0 K using the Vienna Ab initio Simulation Package (VASP28,29, see Methods section for details). Supercells consisting of 2 × 2 × 2 equilibrated primitive cells are constructed and scaled (strained) by ±2, ±4, and ±6% uniformly in all directions. Additional random perturbations are applied to the ion positions and supercell vectors. Ab initio MD (AIMD) simulations are performed for 5 time steps at 100 K for each structure. The atom coordinates, forces, total energies, and virial tensors of each AIMD configuration are recorded to form training sets “0”.
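The construction of the strained and perturbed starting cells can be sketched in a few lines. This is a minimal illustration, not the DP-GEN implementation: it assumes that the ±2, ±4, and ±6% scaling is applied to the lattice-vector lengths, and the perturbation amplitude and the function names (`scale_cell`, `perturb`) are hypothetical choices made for this example only.

```python
import random

# Uniform strains named in the text: ±2, ±4 and ±6%.
STRAINS = [-0.06, -0.04, -0.02, 0.02, 0.04, 0.06]


def scale_cell(cell, strain):
    """Scale all three lattice vectors uniformly by (1 + strain).

    Assumption: the percentage applies to lengths, not to the volume.
    """
    return [[(1.0 + strain) * x for x in vec] for vec in cell]


def perturb(rows, amplitude, rng):
    """Add a uniform random shift in [-amplitude, amplitude] to every component.

    Works for lattice vectors and for atomic positions alike; the amplitude
    is a hypothetical parameter, not a value taken from the paper.
    """
    return [[x + rng.uniform(-amplitude, amplitude) for x in row] for row in rows]


def strained_cells(cell, amplitude, seed=0):
    """Return one scaled-and-perturbed cell per strain in STRAINS."""
    rng = random.Random(seed)
    return [perturb(scale_cell(cell, s), amplitude, rng) for s in STRAINS]
```

Each returned cell would then seed a short AIMD run, whose configurations supply the coordinates, forces, energies, and virials of the initial training sets.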

Fig. 1: Workflow for training the deep potential.

Ri is the atomic coordinate of atom i, E is the total energy of one configuration, V is the virial (stress) tensor of one configuration, fi is the force on atom i, n is the number of atoms in one configuration, DP1 is the first ensemble of trial DPs, α labels the αth DP in the ensemble, σ is the standard deviation, and ϵlo and ϵhi are two thresholds in DP-GEN.

In the DP-GEN Loop step, training sets “0” are input into the DeePMD-kit package13,14,15 to train an ensemble of first trial DPs (DP1{α}) based on different random seeds (see refs. 16,17 for details). MD simulations of different structures (perturbed bulk or structures with free surfaces) are performed at selected temperatures (50–3687.9 K, see Supplementary Methods) using the LAMMPS MD package30 with potential DP1{α}. The simulations explore thousands of different configurations along the MD trajectory. For each MD configuration, the force on atom i is computed with every model in the ensemble DP1{α}, \(\mathbf{f}_{i}^{\alpha}\), together with its standard deviation \(\sigma\{\mathbf{f}_{i}^{\alpha}\}\) over the ensemble of trial DP models. If the maximum of \(\sigma\{\mathbf{f}_{i}^{\alpha}\}\) over all atoms falls within a selected range [ϵlo, ϵhi], the corresponding configuration is chosen as a “candidate configuration”. The total energy, virial tensor, and atomic forces of the candidate configurations are then computed using DFT to form additional training datasets. Another DP-GEN iteration is performed using all current training datasets to generate another ensemble of trial DPs. The DP-GEN Loop iterates and is considered converged when no “candidate configurations” are added. All training data generated in this DP-GEN Loop form the “Classic” training set, which serves as input to the Specialisation step.
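The selection criterion described above can be illustrated with a short sketch. It assumes that the standard deviation of a force vector is taken as the root mean square distance of each model's prediction from the ensemble mean; the labels "accurate" and "failed" for configurations below ϵlo and above ϵhi, and the function names, are illustrative choices rather than names taken from the text.

```python
import math


def max_force_deviation(forces):
    """Largest ensemble standard deviation of the force over all atoms.

    forces[alpha][i] is the (fx, fy, fz) force on atom i from model alpha.
    Assumption: sigma is the RMS distance of the model predictions from
    their ensemble mean, evaluated atom by atom.
    """
    n_models = len(forces)
    n_atoms = len(forces[0])
    worst = 0.0
    for i in range(n_atoms):
        mean = [sum(forces[a][i][k] for a in range(n_models)) / n_models
                for k in range(3)]
        variance = sum(
            sum((forces[a][i][k] - mean[k]) ** 2 for k in range(3))
            for a in range(n_models)
        ) / n_models
        worst = max(worst, math.sqrt(variance))
    return worst


def classify(forces, eps_lo, eps_hi):
    """Label a configuration using the two DP-GEN thresholds."""
    deviation = max_force_deviation(forces)
    if deviation < eps_lo:
        return "accurate"   # models agree; no new DFT data needed
    if deviation <= eps_hi:
        return "candidate"  # sent to DFT and added to the training set
    return "failed"         # models disagree too strongly to be trusted
```

The loop is converged, in the sense used here, when a full round of exploration returns no configuration labelled "candidate".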

In the Specialisation step, “special” structures relevant to the intended applications are created (e.g., sheared configurations along the γ-lines). The atomic forces, total energy, and virial tensor are calculated for each special structure using DFT; these are the “Special” training sets. DeePMD-kit13,14,15 is used again to train a final ensemble of DPs based on the “Classic” and “Special” training sets. The final DPs are further tested and the DP model with the best overall performance is selected. While the Initialisation and DP-GEN Loop step settings are described in the Supplementary Methods, the DFT calculations and the Specialisation step settings are in the Methods section. We refer to the specialised DP approach as “DPspecX”, where X refers to the properties for which it is specialised; here, specialisation is for the mechanical response of Ti, i.e., “Ti-DPspecMech”.

Table 1 summarises the four types of training sets used to train Ti-DPspecMech. The first type is the training sets used in the Initialisation step, including the perturbed crystal structures at different volumes, which give configurations along the classical energy versus volume equation of state (EOS) curve. The second and third types are the DP-GEN bulk datasets and DP-GEN surface datasets from the DP-GEN Loop step. The DP-GEN bulk datasets are crystal structures (HCP, BCC, FCC) at finite temperatures that capture properties associated with atomic vibrations, elastic constants, and thermal expansion. The DP-GEN surface datasets are crystals with free surfaces that provide information relevant to surface energy and atomic relaxation on surfaces. The last type is the γ-line datasets from the Specialisation step; these include selected atomic configurations along portions of the classical γ-line on the basal, prism, pyramidal I narrow (there are narrowly and widely spaced slip planes for pyramidal I; see ref. 31 for a more complete description), and pyramidal II planes of the HCP structure. These datasets help train the model to represent dislocation (stacking fault) properties.

Table 1 Summary of the training sets to train DP for Ti.

Bulk properties and surface energies

Table 2 shows the basic properties of the final DP for Ti as well as an EAM4 and an MEAM6 potential in comparison with the corresponding DFT and experimental values. The DP reproduces the lattice parameters and energies of the HCP, BCC, and FCC structures in excellent agreement with the target DFT values; the differences are smaller than 0.002 Å and 1 meV/atom for the lattice parameter and energy. The EAM and MEAM potentials also have accurate lattice parameters of the three phases; the deviations are around 1% from the DFT and experimental results. The target value for the HCP cohesive energy is chosen to be 4.85 eV/atom from experiments32 (absolute value of DFT cohesive energies are not precise33). The target values for the BCC and FCC cohesive energies are calculated based on their relative energies from DFT and the experimental HCP reference value; this yields calibrated target cohesive energies of 4.74 and 4.79 eV/atom for BCC and FCC structures. The DP is fit to exactly reproduce the target (experimental) cohesive energy of HCP Ti (this corrects the DFT “errors” for the isolated atom). This corrected isolated atom energy was used to determine the cohesive energies of BCC and FCC phases. The MEAM potential has nearly the same cohesive energies, while the EAM potential has cohesive energies close to the DFT values.
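The calibration of the BCC and FCC target cohesive energies amounts to a single subtraction, shown here as a minimal sketch. The energy differences of 0.11 and 0.06 eV/atom used in the example are back-calculated from the rounded targets quoted above (4.85, 4.74, and 4.79 eV/atom); they are not values read from Table 2, and the function name is hypothetical.

```python
def calibrated_cohesive_energy(e_coh_reference, delta_e_dft):
    """Target cohesive energy of a phase, in eV/atom.

    e_coh_reference: experimental cohesive energy of the reference phase (HCP).
    delta_e_dft:     DFT energy of the phase above the reference phase.
    A phase lying higher in energy is bound less strongly, hence the minus sign.
    """
    return e_coh_reference - delta_e_dft


E_COH_HCP_EXPT = 4.85  # eV/atom, experimental target quoted in the text

# Illustrative energy differences inferred from the rounded targets in the text.
bcc_target = calibrated_cohesive_energy(E_COH_HCP_EXPT, 0.11)
fcc_target = calibrated_cohesive_energy(E_COH_HCP_EXPT, 0.06)
```

Because only energy differences between bulk phases enter, the shift removes the uncertainty in the DFT isolated-atom energy without changing the relative stability of the three structures.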

Table 2 Lattice parameters, energies (E), cohesive energies (Ecoh), energy differences (ΔE), and elastic constants of HCP, BCC and FCC Ti, relaxed surface energies (σ), vacancy formation energy (Ev) of HCP Ti and the unstable stacking fault energy (γusf) of BCC Ti from DFT, experiment (Expt), DP, EAM4, and MEAM6.

In addition, we examine the efficacy of the DP model in reproducing the properties of larger (3 × 3 × 3) DFT supercells (Supplementary Fig. 2). For the perturbed BCC and HCP structures, the root mean square errors (RMSEs) of the energies are 0.5 and 1.4 meV/atom, respectively. The RMSEs of the atomic forces are 15.3 and 29.2 meV/Å, respectively. These errors are within typical DFT accuracy. We also examined the effect of adding the larger DFT supercell to the DP training set and refit the DP; adding the larger DFT supercell and enlarging the training set did not improve the DFT/DP agreement. Therefore, we conclude that the original DFT supercells/training sets are sufficiently large to produce reliable DP models. Furthermore, the DP model is fitted to both the energies and their derivatives (energies, forces, virials). This strategy improves the smoothness of the energy function and reduces overfitting to some extent. Nevertheless, the DP model is based on a neural net framework, rather than a physical model. As in all neural network approaches, there is a real risk of overfitting and its transferability is not guaranteed. For applications well beyond those in the training set, it would be prudent to exercise caution. For example, we find that the BCC generalised stacking fault energies (not in the training set) are overestimated by the DP model (Fig. 5).
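For reference, the RMSE quoted for the energies and forces can be computed as in the sketch below. It assumes that energies are compared per atom and forces per Cartesian component; the function name is illustrative.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between two equally long sequences of numbers.

    For energies, pass per-atom energies of each configuration; for forces,
    pass the flattened list of all force components (an assumption about how
    the errors are aggregated).
    """
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    if not reference:
        raise ValueError("sequences must not be empty")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(reference))
```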

The 0-K DP elastic constants are in good agreement with the corresponding DFT values for all three structures and with available experimental results. The elastic constants of the DP and DFT are obtained by (i) applying a set of small strains (−1, −0.5, 0.5, and 1%) for each strain component (εxx, εyy, εzz, εxy, εxz, εyz), (ii) calculating the resultant (global) stress for each strain, and (iii) performing a linear least-squares fit of the obtained stress-strain data. For HCP Ti, the DP elastic constants match well with DFT and experimental values at 4 K34; the deviations are less than 10% from DFT and around 20% from experiment (similar deviations exist between DFT and experiment). The DP elastic constants of the BCC structure are within ±5% of DFT values. As appropriate, both DP and DFT predict that the BCC structure is unstable at 0 K (i.e., C11 < C1235,36). The DP also accurately reproduces the elastic constants of FCC Ti from DFT (not experimentally measured). In comparison, the MEAM potential also reproduces the elastic constants of all three phases well, but the EAM potential shows large deviations in the BCC structure and the FCC structure is unstable in the calculation.
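Step (iii) of this procedure is an ordinary least-squares line fit, which can be sketched as follows. The sketch assumes one stress component is fitted against one strain component at a time, with a free intercept to absorb any residual stress at zero strain; the function name and that choice are assumptions, not details taken from the Methods.

```python
# The four strain amplitudes named in the text (-1, -0.5, 0.5 and 1%).
STRAIN_VALUES = [-0.01, -0.005, 0.005, 0.01]


def fit_elastic_constant(strains, stresses):
    """Least-squares slope of stress versus strain.

    The slope is the elastic constant linking the chosen stress and strain
    components, in the units of the stresses. A free intercept is included.
    """
    n = len(strains)
    mean_strain = sum(strains) / n
    mean_stress = sum(stresses) / n
    s_xx = sum((e - mean_strain) ** 2 for e in strains)
    s_xy = sum((e - mean_strain) * (s - mean_stress)
               for e, s in zip(strains, stresses))
    return s_xy / s_xx
```

Repeating the fit for each of the six strain components and all six stress components fills the elastic constant tensor column by column.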

Turning to defect properties, Table 2 shows that DP predictions of the vacancy and surface energies in HCP Ti are in good agreement with DFT results. The DP HCP surface energies (basal, prism, pyramidal I, and pyramidal II) are within 1% of the DFT values (the largest discrepancy is 0.02 J/m2 on the pyramidal II plane). The EAM potential overestimates surface energies by ~15% and the MEAM potential underestimates them by ~15%. The vacancy formation energy (Ev) of the DP model is ~0.35 eV (17%) higher than the DFT value. The EAM and MEAM potentials have Ev ~ 0.3 eV (15%) lower and 0.13 eV (6%) higher than the DFT value, respectively (see the convergence test of vacancy formation energy in Supplementary Table 1). Note that vacancy configurations are not explicitly included in the training datasets. The DP thus shows transferability and predictive capabilities on basic material properties.
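The two defect energies compared here follow from total energies of supercells. The sketch below uses the standard textbook definitions, which we assume (but the text does not state) are the ones behind Table 2: a slab with two equivalent free surfaces for the surface energy, and a supercell with one atom removed for the vacancy. Function names are illustrative.

```python
# 1 eV/Å^2 expressed in J/m^2 (1 eV = 1.602176634e-19 J, 1 Å^2 = 1e-20 m^2).
EV_PER_A2_TO_J_PER_M2 = 16.02176634


def vacancy_formation_energy(e_defect, e_perfect, n_perfect):
    """Vacancy formation energy in eV.

    e_defect:  energy of the relaxed supercell containing one vacancy
               (n_perfect - 1 atoms).
    e_perfect: energy of the perfect supercell with n_perfect atoms.
    """
    return e_defect - (n_perfect - 1) / n_perfect * e_perfect


def surface_energy(e_slab, n_slab, e_bulk_per_atom, area):
    """Surface energy in J/m^2 for a slab exposing two equivalent surfaces.

    Energies in eV, area of one surface in Å^2.
    """
    excess = e_slab - n_slab * e_bulk_per_atom
    return excess / (2.0 * area) * EV_PER_A2_TO_J_PER_M2
```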

Based on the above results, the MEAM6 potential performs exceptionally well for the elastic constants and cohesive energies compared with experimental values. In the following, we mainly focus on the MEAM potential for comparisons with the DP model. The phonon spectra for DP, MEAM6, and experiment are shown in Supplementary Fig. 3. For both the BCC and HCP phases, the DP model and the MEAM potential reproduce the experimental acoustic mode data better than the optical modes. For the HCP phase, both DP and MEAM are in good agreement with the overall trend in the experimental data. Both the DP and MEAM overestimate the optical L-[001] phonon frequencies at Γ and underestimate the optical phonons at K and M. For the BCC phase, both DP and MEAM show unstable phonon branches, reflecting the instability of the BCC phase at 0 K (C12 > C11).

The energy versus volume (EOS) curve is important for accurate prediction of mechanical response; Fig. 2 shows the 0-K EOS curves for the HCP, BCC, and FCC phases. For each point on the EOS curve, a supercell is pre-strained to the desired volume and equilibrated in DFT (i.e., the supercell volume is fixed but the supercell shape and ion positions are unconstrained). For consistency, the DP energy per atom was calculated using atomic coordinates from the equilibrated compressed/dilated DFT supercell. For all three structures, the energies per atom of the DP agree well with DFT values; the RMSE between DP and DFT is smaller than 1 meV/atom. The DP and DFT EOSs for the three phases are in excellent agreement over the entire volume range examined (14–20 Å3/atom, corresponding to a ±20% volumetric strain).

Fig. 2: The equations of state for three Ti crystal structures.

a HCP. b BCC. c FCC.

γ-Line and γ-Surface

In HCP Ti, plastic deformation is carried by dislocations with 〈a〉 and 〈c + a〉 Burgers vectors primarily on the prism and pyramidal I planes, respectively. Slip on basal and pyramidal II planes and deformation twinning are also observed in Ti at high temperatures and in Ti alloys. The primary slip planes are typically those with the lowest screw dislocation dissociation energy. Dislocation nucleation, dissociation and glide behaviour are strongly influenced by the generalised stacking fault energy, or γ-line on each slip plane. Therefore, accurate γ-lines on all relevant slip planes are essential for modelling dislocation and plasticity behaviour. Figure 3 shows the γ-lines for these slip planes, as determined by DP, MEAM6, and DFT2. In this figure, the sheared (slipped) configurations indicated in the dashed boxes are included in the special training datasets. For the basal plane (Fig. 3a), the γ-line is computed along the \([0\bar{1}10]\) direction (corresponding to the Burgers vectors of the partial dislocation dissociated from the 〈a〉 dislocation). The DP accurately reproduces the DFT γ-line for slip from 0 to 40% of the \([0\bar{1}10]\) translation vector. While the stacking fault energies from 40 to 100% along this path are not well-reproduced, this is not relevant since the energy is very high and is not along the minimum energy path for slip. Most critically, the DP gives accurate unstable and stable stacking fault energies at 20 and 33% of the translation vector, respectively; the former controls dislocation nucleation and the latter governs dislocation dissociation (as discussed below). For the MEAM potential, the stable and unstable stacking fault energies (γsf and γusf) are 44 and 47% lower than the corresponding DFT values, which will make dislocation nucleation and dissociation much easier on the basal plane.

Fig. 3: The generalised stacking fault energy (γ-lines) on several planes in HCP Ti.

Figures show the data for the (a) basal, (b) prism, (c) pyramidal I narrow, (d) pyramidal I wide, and (e) pyramidal II planes calculated using DFT, the DP model, and an MEAM potential6. The stable and unstable stacking fault energies are labeled as γsf and γusf. The configurations in the dashed black box and at zero slip (origin) are included in the training sets. All configurations on the pyramidal I narrow γ-line (c) are included in the training dataset and no configurations on the pyramidal I wide γ-line (d) are included in the training. The black arrows in the schematics show the slip directions and planes of each γ-line.

The DP model also reproduces the general shape of the γ-line in the \([\bar{2}110]/3\) slip direction (i.e., the 〈a〉 dislocation Burgers vector) on the prism and pyramidal I narrow planes (Fig. 3b, c). On the prism plane, γsf and γusf are ~23 and ~9% lower than the DFT values, while on the pyramidal I narrow plane, γsf and γusf are 6 and 4% higher than the DFT values. The MEAM potential has shallow metastable points on the prism and pyramidal I narrow γ-lines; its γsf is ~15% higher and 6% lower than the corresponding DFT values on the two planes, respectively. Figure 3d, e shows the γ-lines along \([\bar{2}113]/3\) (i.e., the 〈c + a〉 direction on the pyramidal I wide and II planes). The DP again reproduces the overall shape (positions and energies of the stable and unstable SFs) of the DFT γ-lines. Note that only a section of the pyramidal II γ-line is included in the training dataset and no pyramidal I wide γ-line information is included; this further indicates the transferability and predictive capability of the current DP model. The MEAM potential shows similar γ-lines on the two pyramidal planes, in good agreement with the DFT results.

The γ-lines in Fig. 3 were determined using the standard method (atom displacements constrained along slip plane normals). However, stacking fault energies may be further reduced by full atomic relaxation especially for the HCP pyramidal planes31,37,38. Table 3 shows the stable stacking fault positions and energies calculated by allowing only out-of-plane and full atomic relaxation using DFT, DP, and MEAM6.
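A γ-line is obtained from the total energies of a sequence of slipped supercells, as sketched below. The sketch assumes that each supercell contains a single fault plane of known area and that the first entry is the unslipped reference; locating γusf and γsf as interior local maxima and minima of the sampled curve is a simplification chosen for this example, and the function names are hypothetical.

```python
# 1 eV/Å^2 expressed in mJ/m^2.
EV_PER_A2_TO_MJ_PER_M2 = 16021.76634


def gamma_line(energies, area):
    """Stacking fault energy in mJ/m^2 at each sampled slip.

    energies: supercell energies in eV, ordered by slip, with energies[0]
              the unslipped reference.
    area:     fault-plane area in Å^2 (assumes one fault per supercell).
    """
    reference = energies[0]
    return [(e - reference) / area * EV_PER_A2_TO_MJ_PER_M2 for e in energies]


def stationary_points(gamma):
    """Indices of interior local maxima (unstable) and minima (stable)."""
    maxima, minima = [], []
    for i in range(1, len(gamma) - 1):
        if gamma[i] > gamma[i - 1] and gamma[i] > gamma[i + 1]:
            maxima.append(i)
        elif gamma[i] < gamma[i - 1] and gamma[i] < gamma[i + 1]:
            minima.append(i)
    return maxima, minima
```

With only out-of-plane relaxation, the slip vector of each configuration is fixed by construction; full relaxation lets the in-plane position of a metastable fault move, which is why the positions as well as the energies are compared in Table 3.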

Table 3 Metastable stacking fault positions and energies before and after in-plane relaxation.

For the basal and prism planes, the stable stacking fault positions and energies remain almost unchanged upon full relaxation in DFT, DP (except for the decrease of prism γsf by 16%), and MEAM. For the pyramidal I narrow plane, the stable stacking fault shifts from the initial position along the 〈a〉 direction to another position towards the e2 direction (Fig. 4) and with substantial reductions in γsf (57, 79, and 53% in DFT, DP, and MEAM).

Fig. 4: The generalised stacking fault energy surface (γ-surface) on different planes of HCP Ti calculated by the DP model.

a Basal. b Prism. c Pyramidal I narrow. d Pyramidal I wide. e Pyramidal II. The red crosses show all the metastable stacking fault positions from DFT in Table 3. The expected dissociations of the 〈a〉 and 〈c + a〉 dislocations are indicated by the dashed arrows in (c), (d), respectively.

The DP model underestimates γsf by similar amounts (−60, −85, and −95 mJ/m2) on the three competing planes (basal, prism, pyramidal I narrow) for the 〈a〉 dislocation. In comparison, the MEAM potential has deviations of −133, +35, and +8 mJ/m2 with respect to DFT. The metastable point governs dislocation core dissociation and DFT calculations show that the 〈a〉 dislocation can dissociate into a pair of partials on both the pyramidal I and prism planes, but not on the basal plane39. The DP model captures the correct ordering of γsf among the three planes. The energy ordering, \({\gamma }_{\,{{\mathrm{sf}}}}^{{{\mathrm{basal}}}}\, > \,{\gamma }_{{{\mathrm{sf}}}}^{{{\mathrm{prism}}}} \,> \,{\gamma }_{{{\mathrm{sf}}}}^{{\mathrm{pyramidal}}\,{\mathrm{I}}\,{\mathrm{narrow}}}\), is different from previous models5,6,7,8 with \({\gamma }_{\,{{\mathrm{sf}}}}^{{{\mathrm{basal}}}}\, < \,{\gamma }_{{{\mathrm{sf}}}}^{{{{\mathrm{pyramidal}}}}\,{\mathrm{I}}\, {\mathrm{narrow}}}\, < \,{\gamma }_{{{\mathrm{sf}}}}^{{{\mathrm{prism}}}}\). The correct ordering helps to reproduce the ground state dislocation core structure on the pyramidal I narrow plane, as discussed below.

For the pyramidal I wide plane, the stable stacking fault position along the 〈c + a〉 direction shifts to another position along the e2 direction (Fig. 4d). The fully relaxed stable stacking fault position is consistent with symmetry requirements and suggests that the pure screw 〈c + a〉 dislocation will dissociate into partials of mixed character on the pyramidal I wide plane. In addition, the similarities in the stacking fault location and energy in DFT and DP indicate that the in-plane relaxations should be similar, even though the stacking fault on the pyramidal I wide plane is not directly included in the training dataset. The MEAM potential overestimates γsf by 27%; its stacking fault position is similar to that of the DP. For the pyramidal II plane, the stable stacking fault shifts to another position along the 〈c + a〉 direction with a decrease in γsf of 0.113 J/m2 upon in-plane relaxation in DFT, 0.122 J/m2 in DP calculations, and 0.126 J/m2 in MEAM calculations. The DP model overestimates the fully relaxed γsf by 7%, while the MEAM potential overestimates γsf by 22%. The fully relaxed γsf difference between the two competing planes, i.e., \(\gamma_{\mathrm{sf}}^{\mathrm{pyr.\,II}}-\gamma_{\mathrm{sf}}^{\mathrm{pyr.\,I}}\), is 0.187, 0.207, and 0.223 J/m2 in DFT, DP, and MEAM, respectively. DFT, DP, and MEAM all suggest that it is much more (energetically) favourable for 〈c + a〉 screw dislocations to dissociate on the pyramidal I planes than on the pyramidal II planes, consistent with experimental observations which show that pyramidal I 〈c + a〉 slip is dominant in Ti40.

The γ-lines in Fig. 3 represent possible minimum energy paths (based upon crystal symmetry) for slip along a crystallographic plane. To ensure that these do represent the true minimum energy paths, we calculate the entire γ-surface for four crystallographic planes in HCP. Figure 4 shows the DP γ-surfaces. The red cross symbols (“X”) indicate the stacking fault positions determined from DFT (see Table 3). The overall γ-surfaces are consistent with crystal symmetry and all stable stacking faults are properly reproduced in accordance with the DFT results. While the smoothness of the γ-surfaces is not guaranteed with neural network-based potentials, the high degree of smoothness here indicates that the DP is suitable for atomistic simulations. The quantitative and qualitative features of all of the γ-surfaces, together with the properties presented earlier, suggest that the current DP is appropriate for modelling the mechanical response of HCP Ti.

BCC Ti has important structural material applications along with intriguing features. Yet, its properties and behaviour are less well understood than the HCP allotrope of Ti. The BCC phase is entropically stabilised above 1155 K. To further test the capabilities of the DP for BCC Ti, we first compare γ-lines on the {110}, {112}, and {123} planes with DFT predictions (obtained using only out-of-plane atomic relaxation).

Figure 5 shows the γ-lines along the close packed 〈111〉 direction on the three slip planes. The DP reproduces the overall profiles and main features of all of the γ-lines calculated by DFT. In DFT, DP, and MEAM, all three lines show negative stacking fault energies at small slip distances ±0.15. This is not surprising since the BCC structure is not the ground state at 0 K. In addition, no metastable stacking fault is seen near half 〈a〉 (i.e., [111]/4), consistent with the broad observation that no metastable stacking fault has been found in any BCC metals41. Furthermore, the γ-lines are symmetric about half the translation vector x = 0.5 on the {110} plane, but slightly asymmetric on {112} and {123}. While the DP overestimates γusf (energy maximum) along the γ-lines as compared with DFT and the exceptionally good MEAM, it importantly captures the same peak-energy/barrier ordering between the three planes, i.e., \({\gamma }_{\,{{\mathrm{usf}}}\,}^{\{110\}}\, < \,{\gamma }_{\,{{\mathrm{usf}}}\,}^{\{123\}}\, < \,{\gamma }_{\,{{\mathrm{usf}}}\,}^{\{112\}}\). The DP thus reproduces the overall shape and key features of the γ-lines, demonstrating its predictive capabilities, despite the fact that no BCC γ-line information is used in the training dataset.

Fig. 5: The generalised stacking fault energy (γ-lines) on three planes in BCC Ti.

a {110}. b {112}. c {123}. The data were obtained based upon DFT, the DP model, and an MEAM potential6.

HCP lattice and elastic constants at finite temperatures

All of the properties presented above were calculated at 0 K to provide baseline material properties. Since titanium alloys are also employed for medium temperature (<600 K) applications, we now investigate several properties at finite temperatures, as shown in Fig. 6. We first compare the temperature dependence of the HCP Ti lattice parameters (a, c, c/a) with experimental data34,42,43. The DP shows that the lattice parameter a increases nearly linearly from 0 to 1000 K, with a thermal expansion coefficient similar to that found in experimental measurements. The DP underestimates a by ~0.01 Å as compared to experiments, since it is trained with DFT results which also underestimate a (similarly for c). The coefficient of linear thermal expansion for a is nearly identical to experiment up to 1000 K. On the other hand, the coefficient of linear thermal expansion for c from the DP is larger than that from experiment. The atomic volume can be calculated from the lattice parameters a and c. While the DP underestimates the volume per atom of the HCP structure (inherited from the DFT training dataset), the volumetric thermal expansion coefficient (β) of the DP is 3.05 × 10−5 K−1 at 300 K, as compared with the experimental value of ~2.70 × 10−5 K−1 at the same temperature44. While the experimental data show a weak, nearly linear reduction of c/a (important for twinning) with increasing temperature, the DP results show a weak increase in c/a with temperature (c/a varies <1% from 0 to 1000 K). Overall, the temperature-dependent lattice parameters and atomic volume from the DP are in good agreement with experiment; the discrepancies are associated with differences between DFT predictions and experiment. The MEAM potential underestimates a by about 1% and overestimates c by about 1% compared to experiment. The MEAM results also show a weak increase in c/a with temperature (c/a varies by about 1% from 0 to 1000 K).
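The step from lattice parameters to the volumetric thermal expansion coefficient can be made explicit with a short sketch. It uses the ideal HCP cell geometry (two atoms per primitive cell of volume (√3/2)a²c) and a finite difference between two temperatures; the finite-difference form and the function names are assumptions made for illustration, since the text does not describe how β was evaluated.

```python
import math


def hcp_volume_per_atom(a, c):
    """Volume per atom of an HCP crystal, in the cube of the length unit.

    The primitive cell has volume (sqrt(3)/2) * a^2 * c and holds two atoms.
    """
    return math.sqrt(3.0) / 4.0 * a * a * c


def volumetric_expansion(t_lo, a_lo, c_lo, t_hi, a_hi, c_hi):
    """Finite-difference estimate of beta = (1/V) dV/dT, in 1/K.

    The volume in the prefactor is taken at the midpoint of the interval.
    """
    v_lo = hcp_volume_per_atom(a_lo, c_lo)
    v_hi = hcp_volume_per_atom(a_hi, c_hi)
    v_mid = 0.5 * (v_lo + v_hi)
    return (v_hi - v_lo) / (v_mid * (t_hi - t_lo))
```

For small, isotropic expansion β is close to the sum of the three linear coefficients (2αa + αc), so an overestimate of the expansion along c feeds directly into β.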

Fig. 6: Finite-temperature properties of HCP Ti.

The temperature-dependence of the lattice parameters a, c, c/a, and the elastic constants Cij of HCP Ti from the DP, MEAM6, and experiment34,42,43. a Lattice parameter a. b Lattice parameter c. c c/a ratio. d C11 and C33. e C12 and C13. f C44.

Figure 6d–f show the elastic constants of DP and MEAM as a function of temperature in comparison with experimental results. The finite temperature elastic constants calculations were performed by applying a ±1% strain individually for each strain component (εxx, εyy, εzz, εxy, εxz, εyz) at each temperature. The elastic constants represent (6000 time step) time averages of the global stress and averages over ± strains.
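One way to combine the time averages and the ± strain averages is a central difference, sketched below. This is an interpretation of the description above rather than the authors' script: it assumes that the stress samples come from equilibrated MD runs at +ε and −ε of a single strain component, and the function names are hypothetical.

```python
def time_average(samples):
    """Arithmetic mean of the stress samples from one MD run."""
    return sum(samples) / len(samples)


def finite_temperature_constant(stress_plus, stress_minus, strain):
    """Elastic constant from MD runs at +strain and -strain.

    stress_plus / stress_minus: per-time-step values of one global stress
    component (e.g. 6000 samples each, as in the text).
    Averaging sigma(+e)/e and sigma(-e)/(-e) gives the central difference
    used here, which cancels any strain-independent residual stress.
    """
    difference = time_average(stress_plus) - time_average(stress_minus)
    return difference / (2.0 * strain)
```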

There exist small discrepancies between the DP elastic constants and the experimental data at 0 K. With increasing temperature, C11 from experiment decreases continuously; the DP predictions initially decrease very rapidly between 0 and 50 K, then decrease more slowly and approach the experimental values at higher temperatures (Fig. 6d). MEAM shows a continuous decrease and is very close to the DP for C11 at high temperatures. The DP C33 is relatively accurate over the entire temperature range. It increases slightly from 0 to 150 K and then decreases continuously from 150 to 1000 K, asymptotically approaching the monotonically decreasing experimental data. The MEAM C33 decreases very rapidly between 0 and 50 K and then slowly at higher temperatures. It is about 15% lower than the experimental data. Similar levels of agreement were obtained between the DP and MEAM predictions and the experimental data for the other Cij. These results are perhaps sufficiently accurate for HCP Ti MD simulations. For both the DP and MEAM potentials, the discrepancies in the elastic constants at finite temperatures are not expected and highlight the challenges in modelling HCP systems.

Phase transition temperatures

Phase stability over a range of temperatures and pressures is important for MD simulations of HCP-BCC Ti. To determine the phase stability of the DP, we compute the free energies of the HCP, BCC and liquid structures using thermodynamic integration45 and determine the phase transition temperatures as a function of pressure. Figure 7 shows the transition temperatures of the DP model as a function of pressure in comparison with experimental data46,47 and MEAM6. At zero pressure, the DP exhibits an HCP to BCC transition at 1140 K and BCC melting at 1886 K, in remarkable agreement with the experimental values of 1155 and 1941 K. The DP melting point increases with increasing pressure, consistent with experimental results and thermodynamic expectation (the molar volume of BCC is smaller than that of the liquid). However, the MEAM melting point first increases and then decreases with increasing pressure. The HCP to BCC structural phase transition temperature shows little dependence on pressure for both the DP and MEAM, while the experimental data suggest that the transition temperature decreases slowly with increasing pressure. The mismatch in the slope of the HCP-BCC phase boundary is likely due to a minor inadequacy of the DP and MEAM in reproducing the atomic volumes of the two phases, which can be improved if necessary. Overall, the DP shows good agreement with experimental results for the HCP-BCC transition temperatures and the BCC melting temperatures. The accurate DP phase transition temperatures are likely a consequence of including high-temperature configurations in the training dataset (DP-GEN bulk datasets). This suggests that high-temperature thermodynamic properties can be reproduced by training the DP with the current framework and that the DP is effective for describing phase transitions and mechanical behaviour over the full solid-state temperature range.
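Once the free energies of two phases are known as functions of temperature at a given pressure, the transition temperature is the point where the two curves cross. The following is a minimal sketch of that last step using bisection; the function name and the two linear free-energy models are hypothetical stand-ins for thermodynamic integration results and carry no values from this work.

```python
def transition_temperature(g_low, g_high, t_min, t_max, tol=1e-6):
    """Find T in [t_min, t_max] where g_high(T) - g_low(T) changes sign.

    g_low and g_high are callables returning the free energy per atom of the
    low- and high-temperature phases at a fixed pressure.
    """
    def diff(t):
        return g_high(t) - g_low(t)

    lo, hi = t_min, t_max
    if diff(lo) * diff(hi) > 0:
        raise ValueError("free energies do not cross inside the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(lo) * diff(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Hypothetical free energies in eV/atom (illustration only): the second phase
# has a higher enthalpy but also a higher entropy, so it wins at high T.
def g_phase_1(t):
    return -1.0e-4 * t


def g_phase_2(t):
    return 0.045 - 1.4e-4 * t


t_c = transition_temperature(g_phase_1, g_phase_2, 500.0, 1500.0)
print(f"crossing at {t_c:.2f} K")
```

Repeating the crossing search at each pressure traces out a phase boundary of the kind plotted in Fig. 7.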

Fig. 7: Ti phase transition temperatures at different pressures.

The phase boundaries between the HCP-BCC and BCC-liquid structures calculated by the DP model and an MEAM potential6 in comparison with experiments46,47.

Dislocation core structures

In this section, we examine dislocation core structures using the DP. We focus on the core structure of the screw 〈a〉 dislocation in HCP, which governs many intriguing features of slip in Ti48. We first created a Volterra 〈a〉 dislocation in a 303 × 163 × 9 Å³ supercell with periodic boundary conditions in the glide (303 Å) and dislocation line (9 Å) directions. At 300 K, the dislocation adopts several structures on the pyramidal I and prism planes, and occasionally adopts a compact core (Supplementary Fig. 4). This suggests that the energies of the competing core configurations are very close and can be influenced by applied strains, solutes and temperature. No dissociation on the basal plane was observed (in 10⁵ MD steps).

To quantitatively analyse the core structure, MD configurations were quenched every 2000 time steps, followed by energy minimisation (Fig. 8). Two distinct core configurations were observed, corresponding to dissociations on the pyramidal I and prism planes; similar core structures were seen in DFT calculations48. In the DP model, the ground state dissociation plane for the screw 〈a〉 dislocation is the pyramidal I plane. The dissociated core on the prism plane exhibits a slightly higher energy (~1.4 meV/Å, Supplementary Fig. 5), while DFT calculations based on quadrupolar configurations48 show a higher energy of 5.7 meV/Å. The energy difference between these configurations is small (the MEAM potential slightly favours dissociation on the prism plane49). While some discrepancies still exist between the ground state core structure of the present DP and DFT (cf. Fig. 8b and Fig. 4c in ref. 48), further tuning of the DP model for core structures increases the risk of overfitting; core structures in better agreement with DFT may be obtained at the expense of a less accurate C11. Nevertheless, the DP should provide an accurate description of dislocation glide behaviour at intermediate and elevated temperatures. We refer the reader to a more extensive discussion of the screw 〈a〉 dislocation core structures and energies in the Supplementary Information. Core structures and dynamics of other dislocations (e.g., the edge 〈a〉 dislocation, edge and screw 〈c + a〉 dislocations, twinning dislocations, and dislocations in the BCC phase) are equally important and are currently under study.

Fig. 8: Differential displacement plot of the screw 〈a〉 dislocation.

The dissociations are shown on the (a) prism and (b) pyramidal I planes. The atom shading and crystallographic orientations are shown in (c).

Computational cost of DP

Finally, we compare the speed of the DP, EAM, and MEAM implementations on both CPUs and GPUs (Supplementary Fig. 6). On CPUs, the DP model is 200–300 times slower than EAM potentials and 30–40 times slower than the MEAM potential. For systems with several dozen atoms, the DP is faster than DFT (using the VASP settings in the Methods section) by a factor of over 10⁶. On GPUs, the DP is 20–30 times slower than the EAM potential (MEAM is currently not ported to GPUs in LAMMPS). The networks at the heart of the DP can be optimised further by tuning individual operators, the computational graph, and execution across multiple hardware devices50. All potentials scale linearly with the number of atoms. Because of this linear scaling and speed, the DP model can be used to perform large-scale MD simulations of many defects, including dislocations, grain boundaries and phase boundaries, with relatively good accuracy at system sizes that are far outside the reach of DFT (except for very simple cases).

In summary, we have reported a procedure for specialising a general-purpose neural network (DP) interatomic potential to reproduce important physical properties for specific applications (X), i.e., DPspecX. In particular, we developed a DP for modelling the mechanical response of Ti: Ti-DPspecMech. The resulting DP accurately reproduces a comprehensive range of properties both within and outside the training datasets for HCP-BCC Ti. The DP thus enables a wide range of molecular statics and dynamics simulations in HCP-BCC Ti, including simulations of dislocations, interphase interfaces, fracture, and solid–solid and solid–liquid phase transition behaviour/properties. Comprehensive benchmarks also show that the MEAM potential6 performs extraordinarily well in many respects, suggesting that classical interatomic potential models will remain relevant for the foreseeable future.

While no empirical interatomic potential accurately reproduces all material properties over all temperatures and stress states, the approach presented here begins with a general-purpose potential and specialises it for classes of properties of interest. The procedure is general and can be applied to the development of interatomic potentials for materials beyond Ti and for applications beyond mechanical response, within the DP framework or other machine-learning-based approaches. The selection of special datasets can be standardised, and the neural network parameters can be optimised by standard algorithms/codes (e.g., TensorFlow). This approach thus represents a shift from empiricism-driven to machine-driven interatomic potentials that are fit-for-purpose.

Methods

DFT calculations

All DFT calculations are performed using VASP28,29 with the Perdew–Burke–Ernzerhof51 generalised gradient approximation exchange-correlation functional. The cutoff energy of the plane-wave basis set is 650 eV, and core electrons are treated with the projector-augmented-wave method52. K-points with a grid spacing of 0.1 Å⁻¹ are sampled in the Brillouin zone using the Monkhorst–Pack mesh method53. The Methfessel–Paxton smearing method54 with order 1 and smearing width σ = 0.22 eV is used for partial electron occupancy. Self-consistent convergence is assumed when the energy variation is below 10⁻³ meV.

Specialisation step details

“Special” training sets were generated along HCP γ-lines and equilibrated in DFT using the standard γ-line calculation approach and parameters above. The DFT-calculated γ-lines are shown in Fig. 3 and are in good agreement with previous calculations31. Among all configurations, we choose those near the unstable and stable stacking fault displacements to form the special training sets. For example, seven structures at slip distances 0, 0.15, 0.20, 0.25, 0.30, 0.35, and 0.40 were chosen from the γ-line on the basal plane, as shown in Fig. 3a. The atomic coordinates, forces, total energy, and virial tensor of these structures are input to the training iteration. These structures represent configurations of importance for plastic deformation. We therefore increase their weight to 100 in the loss function of the neural network calibration (configurations in the general training sets have a default weight of 1). This is equivalent to generating 7 × 100 special training sets from the basal plane γ-line. Special training sets are chosen and weighted similarly for the prism, pyramidal I narrow and pyramidal II plane γ-lines. In total, the “special” training sets include 4600 configurations from four γ-lines on different planes, as seen in Table 1. No pyramidal I wide plane γ-line configurations were included in the training set for cross-validation purposes. The selection and weight of the special training sets are flexible and can be further optimised for desired targets.
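The statement that a weight of 100 is equivalent to 7 × 100 training sets can be checked on a toy sum-of-squares loss: multiplying the squared error of a configuration by w gives the same total as including that configuration w times. This is a minimal sketch with made-up error values and a hypothetical `weighted_sse` helper; how the weight actually enters the training code (as a loss multiplier or as a sampling probability) is not specified here and is treated as an assumption.

```python
def weighted_sse(errors, weights):
    """Weighted sum of squared errors over a list of configurations."""
    return sum(w * e * e for e, w in zip(errors, weights))


# Made-up per-configuration errors (illustration only).
general_errors = [0.10, -0.20, 0.05]                           # default weight 1
special_errors = [0.03, -0.02, 0.04, 0.01, -0.05, 0.02, 0.06]  # 7 gamma-line points
special_weight = 100

# Option 1: keep each special configuration once, with weight 100.
loss_weighted = weighted_sse(
    general_errors + special_errors,
    [1] * len(general_errors) + [special_weight] * len(special_errors),
)

# Option 2: replicate each special configuration 100 times with weight 1.
replicated = special_errors * special_weight
loss_replicated = weighted_sse(
    general_errors + replicated,
    [1] * (len(general_errors) + len(replicated)),
)

print(len(replicated))                  # 7 x 100 effective configurations
print(loss_weighted, loss_replicated)   # equal up to floating-point rounding
```

The equivalence holds for the total loss and its gradient; with mini-batch sampling the two options can still differ in the noise of individual updates.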

The DeePMD-kit package14 is used to train a smooth Ti DP15. The training data consist of “classic” and “special” training sets. The embedding and fitting net sizes are (25, 50, 100) and (240, 240, 240), respectively. The cutoff radius of the DP is 9.0 Å, which includes at least the third-nearest neighbours in the HCP structure. Four models are trained, starting from different random seeds but using the same neural network architecture and training sets. The pyramidal I narrow plane γ-line is not included initially when training the general models. The learning rate starts at 1 × 10⁻³ and ends at 5 × 10⁻⁸ after 8 × 10⁶ training steps. The atomic forces and total energies of all the training sets are included in the training, but only the virial tensors of the training sets from the Initialisation step (1469 datasets in Table 1) are used, in order to obtain a more accurate description of the elastic constants near equilibrium. In addition, the prefactors for the energy, atomic force and virial tensor in the loss function are \(p_{\mathrm{e}}^{\mathrm{start}} = 10\), \(p_{\mathrm{e}}^{\mathrm{limit}} = 100\), \(p_{\mathrm{f}}^{\mathrm{start}} = 1\), \(p_{\mathrm{f}}^{\mathrm{limit}} = 1\), \(p_{\mathrm{v}}^{\mathrm{start}} = 10\), and \(p_{\mathrm{v}}^{\mathrm{limit}} = 10\). These parameters give more weight to the energy and virial tensor than to the atomic forces. Afterwards, the pyramidal I narrow plane γ-line is included to tweak the dislocation core energy difference. One DP model is trained further, starting from the current neural network, with a learning rate that starts at 1 × 10⁻⁴ and ends at 5 × 10⁻⁸ after 1.6 × 10⁷ training steps.
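To make the roles of the start and limit prefactors concrete, the sketch below evaluates an exponentially decaying learning rate and prefactors that interpolate from their start to their limit values as the learning rate falls, which is the scheme described in the DeePMD-kit documentation. It assumes a smooth decay over the whole run (the stepwise decay interval used in practice is omitted), and the function names are illustrative rather than part of any package API.

```python
def learning_rate(step, total_steps, lr_start, lr_stop):
    """Exponential decay from lr_start at step 0 to lr_stop at total_steps."""
    return lr_start * (lr_stop / lr_start) ** (step / total_steps)


def prefactor(step, total_steps, p_start, p_limit, lr_start, lr_stop):
    """Loss prefactor that moves from p_start to p_limit as the rate decays."""
    ratio = learning_rate(step, total_steps, lr_start, lr_stop) / lr_start
    return p_limit * (1.0 - ratio) + p_start * ratio


# Settings quoted in the text for the first training stage.
LR_START, LR_STOP, TOTAL = 1.0e-3, 5.0e-8, 8_000_000
PREFS = {"energy": (10.0, 100.0), "force": (1.0, 1.0), "virial": (10.0, 10.0)}

for step in (0, TOTAL // 2, TOTAL):
    lr = learning_rate(step, TOTAL, LR_START, LR_STOP)
    weights = {
        name: prefactor(step, TOTAL, p0, p1, LR_START, LR_STOP)
        for name, (p0, p1) in PREFS.items()
    }
    print(step, f"{lr:.2e}", {k: round(v, 3) for k, v in weights.items()})
```

With these settings the energy prefactor rises from 10 towards 100 during training, while the force and virial prefactors stay constant at 1 and 10.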