Abstract
Large-scale atomistic simulations provide direct access to important materials phenomena not easily accessible to experiments or quantum mechanics-based calculation approaches. Accurate and efficient interatomic potentials are the key enabler, but their development remains a challenge for complex materials and/or complex phenomena. Machine learning potentials, such as the Deep Potential (DP) approach, provide a robust means to produce general purpose interatomic potentials. Here, we provide a methodology for specialising machine learning potentials for high fidelity simulations of complex phenomena, where general potentials do not suffice. As an example, we specialise a general purpose DP method to describe the mechanical response of two allotropes of titanium (in addition to other defect, thermodynamic and structural properties). The resulting DP correctly captures the structures, energies, elastic constants and γ-lines of Ti in both the HCP and BCC structures, as well as properties such as dislocation core structures, vacancy formation energies, phase transition temperatures, and thermal expansion. The DP thus enables direct atomistic modelling of plastic and fracture behaviour of Ti. The approach to specialising DP interatomic potentials, DPspecX, for accurate reproduction of properties of interest “X”, is general and extensible to other systems and properties.
Introduction
Most mechanical responses of structural metals and alloys are governed by defect interactions at the atomistic scale and their evolution at the meso- to macroscales. Depending on the material system and deformation conditions, point, line and planar defects play distinct roles over a diverse range of length and time scales in determining material yield, hardening, creep and fracture behaviour. Understanding defect interaction and evolution over realistic length and time scales is thus important. Quantitative modelling of defects requires accurate description of atomic interactions, which in turn are dictated by the underlying electronic structure. While quantum mechanics-based methods (e.g., density-functional theory, DFT^{1}) are general and robust, they tend to be computationally expensive and scale poorly with increasing system size. As a result, they are routinely limited to simulations of systems containing a few hundred valence electrons and lasting only a few picoseconds. Explicit simulations of extended defects, such as dislocations, grain and interphase boundaries, require much larger atomic systems that are well beyond the reach of current DFT calculations.
Interactions between atoms can be approximated by analytical or numerical functions in the form of empirical/semi-empirical interatomic potentials/force fields. By omitting the explicit treatment of the complex electronic structure, empirical/semi-empirical interatomic potentials improve efficiency at the expense of transferability and accuracy. While the functional forms and parameters are empirically chosen and fitted to capture essential features relevant to the intended application and material system, they are frequently inadequate to describe properties of interest. For example, one long-standing, unresolved challenge for classical interatomic potentials is to reproduce fundamental properties (e.g., generalised stacking fault energy γ-lines^{2}, which measure energy variation as a function of slip between two atomic planes) of the competing slip systems of hexagonal close-packed titanium (HCP Ti); resolving such challenges is a prerequisite to accurate simulations of plasticity and fracture in all metals and alloys. Lack of accuracy and transferability, heavy reliance on empiricism, and uncertainty are the primary drawbacks of classical interatomic potentials. In this work, we introduce a general procedure for training accurate, fit-for-purpose neural-network interatomic potentials and demonstrate this approach by training an interatomic potential for accurate simulations of the mechanical response of Ti.
Pure Ti exhibits three distinct crystal structures (HCP-α, BCC-β, and hexagonal-ω) and undergoes allotropic phase transformations between these as a function of temperature and pressure. All three structures are elastically and plastically anisotropic and present a variety of dislocation and twinning behaviours. Different classes and formulations of interatomic potentials for Ti have been proposed and applied to provide insight into the complex properties of Ti (EAM^{3,4}, MEAM^{5,6,7,8}, tight-binding^{9}, bond-order^{10}). However, these potentials (except bond-order, which has other inconsistencies, e.g., no minimum on the γ-surface of the prism plane^{10,11}) yield inaccurate γ-line profiles and/or generally predict low stacking fault energy on basal planes (with respect to DFT, as shown in Supplementary Fig. 1). To overcome the systematic inadequacy of these empirical/semi-empirical interatomic potentials, we focus on machine learning neural network-based interatomic potentials^{12}. In particular, we develop an interatomic potential using the Deep Potential (DP) method^{13,14,15}, which provides a robust and flexible approach to describe atomic environments/interactions and replaces some of the empiricism of classical interatomic potentials. We adopt the recently established Deep Potential Generator (DP-GEN) scheme^{16,17} to train potentials in an efficient and systematic manner, which further reduces the reliance on empiricism. While many different machine learning approaches to training interatomic potentials^{18,19,20,21,22,23,24,25} are emerging (some for Ti^{26,27}), we explicitly focus on developing machine learning potentials for the prediction of mechanical properties of α-β Ti.
Here, we propose a specialising step, an extension to the current DP-GEN scheme, to systematically train machine learning interatomic potentials to reproduce crystal structures, elastic constant tensors, surface and stacking fault energies of Ti. The training datasets include experimental data from the literature and DFT data calculated in this work. The resulting potential not only closely reproduces the defect properties that control mechanical behaviour and were used in the training, but also captures a wide range of properties not explicitly included in the training dataset, including the vacancy formation energy and the γ-lines of all relevant slip planes in the HCP and BCC phases. Most critically, the ordering of the stacking fault energies between the basal {0001}, prism \(\{10\bar{1}0\}\), pyramidal I \(\{10\bar{1}1\}\), and pyramidal II \(\{11\bar{2}2\}\) planes in HCP (see the schematic in Fig. 3) is correctly captured. Furthermore, this potential reproduces the general features of the γ-lines on {110}, {112}, and {123} planes of the BCC structure, including the lack of metastable points and negative stacking fault energies. Calculations of the free energies of different phases further demonstrate that the potential reproduces the phase stability between the HCP, BCC and liquid phases over a range of pressures and temperatures. The developed interatomic potential thus enables molecular statics and molecular dynamics (MD) simulations of dislocation core structure, dynamics, fracture, and phase transitions. Our approach, applied here for Ti, is general and applicable to train interatomic potentials for accurate reproduction of the mechanical properties of a wide range of materials.
Results and discussion
Strategy and workflow
Figure 1 shows the workflow for training and specialising DP models. The workflow consists of three steps: Initialisation, DP-GEN Loop, and Specialisation. In the Initialisation step, primitive cells of BCC, FCC, and HCP Ti are constructed and equilibrated at zero stress and 0 K using the Vienna Ab initio Simulation Package (VASP^{28,29}, see Methods section for details). Supercells consisting of 2 × 2 × 2 equilibrated primitive cells are constructed and scaled (strained) by ±2, ±4, and ±6% uniformly in all directions. Additional random perturbations are applied to ion positions and supercell vectors. Ab initio MD (AIMD) simulations are performed for 5 time steps at 100 K for each structure. Atom coordinates, forces, total energies, and virial tensors of each AIMD simulation configuration are recorded to form training sets “0”.
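The structure-generation part of the Initialisation step can be sketched as follows. This is an illustrative sketch only, not the scripts used in this work: the function names and the perturbation amplitudes `cell_noise` and `atom_noise` are hypothetical, and only the ±2, ±4, and ±6% scale factors come from the text.

```python
import random

# Uniform strains quoted in the text, in percent.
SCALE_PERCENTS = (-6, -4, -2, 2, 4, 6)

def scale_and_perturb(cell, frac_coords, percent,
                      cell_noise=0.0, atom_noise=0.0, rng=None):
    """Return a uniformly scaled, randomly perturbed copy of a supercell.

    cell        : 3x3 list of lattice vectors (rows), in angstrom
    frac_coords : list of fractional atomic coordinates
    cell_noise  : hypothetical amplitude (angstrom) added to cell components
    atom_noise  : hypothetical amplitude (fractional) added to coordinates
    """
    if rng is None:
        rng = random.Random(0)
    s = 1.0 + percent / 100.0
    new_cell = [[s * x + rng.uniform(-cell_noise, cell_noise) for x in row]
                for row in cell]
    new_frac = [[f + rng.uniform(-atom_noise, atom_noise) for f in atom]
                for atom in frac_coords]
    return new_cell, new_frac

def initial_structures(cell, frac_coords, cell_noise=0.03, atom_noise=0.01, seed=0):
    """One perturbed structure per scale factor (assumed amplitudes)."""
    rng = random.Random(seed)
    return [scale_and_perturb(cell, frac_coords, p, cell_noise, atom_noise, rng)
            for p in SCALE_PERCENTS]
```

Each returned structure would then be the starting point of a short AIMD run whose energies, forces, and virials enter training sets “0”.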
In the DP-GEN Loop step, training sets “0” are input into the DeePMD-kit package^{13,14,15} to train an ensemble of first trial DPs (DP^{1}_{α}) based on different random seeds (see refs. ^{16,17} for details). MD simulations of different structures (perturbed bulk or structures with free surfaces) are performed at selected temperatures (50–3687.9 K, see Supplementary Methods) using the LAMMPS MD package^{30} with potential DP^{1}_{α}. The simulations explore thousands of different configurations along the MD trajectory. For each MD configuration (generated using DP^{1}_{α}), the atomic force on atom i predicted by each DP^{1}_{α}, \({{{{\bf{f}}}}}_{i}^{\alpha }\), is computed, as well as its standard deviation \(\sigma \{{{{{\bf{f}}}}}_{i}^{\alpha }\}\) over the ensemble of trial DP models. If the maximum of \(\sigma \{{{{{\bf{f}}}}}_{i}^{\alpha }\}\) over all atoms falls within a selected range [ϵ_{lo}, ϵ_{hi}], the corresponding configuration is chosen as a “candidate configuration”. The total energy, virial tensor, and atomic forces for candidate configurations are then computed using DFT to form additional training datasets. Another DP-GEN iteration is performed using all current training datasets to generate another ensemble of trial DPs. The DP-GEN Loop iterates and is considered converged when no “candidate configurations” are added. All training data generated in this DP-GEN Loop form the “Classic” training set, which serves as input to the Specialisation step.
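The candidate-selection criterion can be illustrated with a short sketch. It assumes that the standard deviation of a force vector is taken as the root mean square deviation of the vector from the ensemble mean; the function names and the labels "accurate" and "failed" for configurations outside the window are naming choices made here, not taken from the text.

```python
import math

def max_force_deviation(forces_by_model):
    """Largest per-atom force standard deviation over an ensemble of models.

    forces_by_model[m][i] is the (fx, fy, fz) force on atom i from model m.
    """
    n_models = len(forces_by_model)
    n_atoms = len(forces_by_model[0])
    worst = 0.0
    for i in range(n_atoms):
        # Ensemble-mean force on atom i.
        mean = [sum(forces_by_model[m][i][k] for m in range(n_models)) / n_models
                for k in range(3)]
        # Mean squared distance of each model's force from that mean.
        var = sum(
            sum((forces_by_model[m][i][k] - mean[k]) ** 2 for k in range(3))
            for m in range(n_models)
        ) / n_models
        worst = max(worst, math.sqrt(var))
    return worst

def classify(forces_by_model, eps_lo, eps_hi):
    """Label a configuration; only 'candidate' ones are sent to DFT."""
    dev = max_force_deviation(forces_by_model)
    if dev < eps_lo:
        return "accurate"   # models already agree, nothing new to learn
    if dev <= eps_hi:
        return "candidate"  # inside [eps_lo, eps_hi]
    return "failed"         # models disagree too strongly to trust the MD
```

For example, two models predicting (1, 0, 0) and (0, 0, 0) on a single atom give a deviation of 0.5 in the same force units, which is a candidate for any window containing 0.5.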
In the Specialisation step, “special” structures relevant to the intended applications are created (e.g., sheared configurations along the γ-lines). The atomic forces, total energy, and virial tensor are calculated for each special structure using DFT; these are the “Special” training sets. DeePMD-kit^{13,14,15} is used again to train a final ensemble of DPs based on the “Classic” and “Special” training sets. The final DPs are further tested and the DP model with the best overall performance is selected. The Initialisation and DP-GEN Loop step settings are described in the Supplementary Methods, while the DFT calculations and the Specialisation step settings are in the Methods section. We refer to the specialised DP approach as “DPspecX”, where X refers to the properties for which it is specialised; here, specialisation is for the mechanical response of Ti, i.e., “TiDPspecMech”.
Table 1 summarises the four types of training sets used to train TiDPspecMech. The first type is the training sets used in the Initialisation step, including the perturbed crystal structures at different volumes, which give configurations along the classical energy versus volume equation of state (EOS) curve. The second and third types are the DP-GEN bulk datasets and DP-GEN surface datasets from the DP-GEN Loop step. The DP-GEN bulk datasets are crystal structures (HCP, BCC, FCC) at finite temperatures that capture properties associated with atomic vibrations, elastic constants, and thermal expansion. The DP-GEN surface datasets are crystals with free surfaces that provide information relevant to surface energy and atomic relaxation on surfaces. The last type is the γ-line datasets from the Specialisation step; these include selected atomic configurations along portions of the classical γ-line on the basal, prism, pyramidal I narrow (there are narrowly and widely spaced slip planes for pyramidal I; see ref. ^{31} for a more complete description), and pyramidal II planes of the HCP structure. These datasets help train the model to represent dislocation (stacking fault) properties.
Bulk properties and surface energies
Table 2 shows the basic properties of the final DP for Ti as well as an EAM^{4} and an MEAM^{6} potential in comparison with the corresponding DFT and experimental values. The DP reproduces the lattice parameters and energies of the HCP, BCC, and FCC structures in excellent agreement with the target DFT values; the differences are smaller than 0.002 Å and 1 meV/atom for the lattice parameter and energy, respectively. The EAM and MEAM potentials also give accurate lattice parameters for the three phases; the deviations are around 1% from the DFT and experimental results. The target value for the HCP cohesive energy is chosen to be 4.85 eV/atom from experiments^{32} (absolute values of DFT cohesive energies are not precise^{33}). The target values for the BCC and FCC cohesive energies are calculated based on their relative energies from DFT and the experimental HCP reference value; this yields calibrated target cohesive energies of 4.74 and 4.79 eV/atom for the BCC and FCC structures, respectively. The DP is fit to exactly reproduce the target (experimental) cohesive energy of HCP Ti (this corrects the DFT “errors” for the isolated atom). This corrected isolated atom energy was used to determine the cohesive energies of the BCC and FCC phases. The MEAM potential has nearly the same cohesive energies, while the EAM potential has cohesive energies close to the DFT values.
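The calibration of the BCC and FCC targets amounts to shifting the experimental HCP value by the DFT energy difference between phases. The sketch below illustrates the arithmetic; the function name is hypothetical, and the energy differences of 0.11 and 0.06 eV/atom are those implied by the rounded targets quoted above, not independently reported values.

```python
E_COH_HCP_EXP = 4.85  # eV/atom, experimental HCP target quoted in the text

def calibrated_cohesive_energy(delta_e_dft):
    """Target cohesive energy of a phase.

    delta_e_dft = E_DFT(phase) - E_DFT(HCP) in eV/atom
    (positive means the phase is less stable than HCP at 0 K).
    """
    return E_COH_HCP_EXP - delta_e_dft

# Differences implied by the rounded targets in the text (assumption).
print(round(calibrated_cohesive_energy(0.11), 2))  # BCC target
print(round(calibrated_cohesive_energy(0.06), 2))  # FCC target
```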
In addition, we examine the efficacy of the DP model in reproducing the properties of larger (3 × 3 × 3) DFT supercells (Supplementary Fig. 2). For the perturbed BCC and HCP structures, the root mean square errors (RMSEs) of the energies are 0.5 and 1.4 meV/atom, respectively. The RMSEs of the atomic forces are 15.3 and 29.2 meV/Å, respectively. These errors are within typical DFT accuracy. We also examined the effect of adding the larger DFT supercell to the DP training set and refit the DP; adding the larger DFT supercell and enlarging the training set did not improve the DFT/DP agreement. Therefore, we conclude that the original DFT supercells/training sets are sufficiently large to produce reliable DP models. Furthermore, the DP model is fitted to both the energies and their derivatives (energies, forces, virials). This strategy improves the smoothness of the energy function and reduces overfitting to some extent. Nevertheless, the DP model is based on a neural net framework, rather than a physical model. As in all neural network approaches, there is a real risk of overfitting and its transferability is not guaranteed. For applications well beyond those in the training set, it would be prudent to exercise caution. For example, we find that the BCC generalised stacking fault energies (not in the training set) are overestimated by the DP model (Fig. 5).
The 0 K DP elastic constants are in good agreement with the corresponding DFT values for all three structures and with available experimental results. The elastic constants of the DP and DFT are obtained by (i) applying a set of small strains (−1, −0.5, 0.5, and 1%) for each strain component (ε_{xx}, ε_{yy}, ε_{zz}, ε_{xy}, ε_{xz}, ε_{yz}), (ii) calculating the resultant (global) stress for each strain, and (iii) performing a linear least-squares fit of the obtained stress-strain data. For HCP Ti, the DP elastic constants match well with DFT and experimental values at 4 K^{34}; the deviations are less than 10% from DFT and around 20% from experiment (similar deviations exist between DFT and experiment). The DP elastic constants of the BCC structure are within ±5% of DFT values. As expected, both DP and DFT predict that the BCC structure is unstable at 0 K (i.e., C_{11} < C_{12}^{35,36}). The DP also accurately reproduces the elastic constants of FCC Ti from DFT (not experimentally measured). In comparison, the MEAM potential also reproduces the elastic constants of all three phases well, but the EAM potential shows large deviations in the BCC structure and the FCC structure is unstable in the calculation.
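Step (iii) of this procedure can be illustrated with a minimal least-squares fit. This is a sketch only: the stress values below are synthetic, generated from a hypothetical modulus of 100 GPa plus a small nonlinear term, and the mapping from each strain component to a particular C_{ij} (including any factor of 2 for shear strains in Voigt notation) is left out.

```python
def least_squares_slope(strains, stresses):
    """Slope of the best-fit line stress = slope * strain + intercept."""
    n = len(strains)
    mean_e = sum(strains) / n
    mean_s = sum(stresses) / n
    num = sum((e - mean_e) * (s - mean_s) for e, s in zip(strains, stresses))
    den = sum((e - mean_e) ** 2 for e in strains)
    return num / den

# Strain magnitudes from the text: -1, -0.5, 0.5 and 1 %.
strains = [-0.01, -0.005, 0.005, 0.01]

# Synthetic stresses (GPa) for a hypothetical modulus; NOT data from this work.
c_true = 100.0
stresses = [c_true * e + 500.0 * e * e for e in strains]

c_fit = least_squares_slope(strains, stresses)
```

Because the strain set is symmetric about zero, an even (quadratic) contribution to the stress does not bias the fitted slope, which is one reason for sampling both signs of strain.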
Turning to defect properties, Table 2 shows that DP predictions of the vacancy and surface energies in HCP Ti are in good agreement with DFT results. The DP model has its HCP surface energies (basal, prism, pyramidal I, and pyramidal II) within 1% from DFT values (the largest discrepancy is 0.02 J/m^{2} on the pyramidal II plane). The EAM potential overestimates surface energies by ~15% and the MEAM potential underestimates them by ~15%. The vacancy formation energy (E_{v}) of the DP model is ~0.35 eV (17%) higher than the DFT value. The EAM and MEAM potentials have E_{v} ~ 0.3 eV (15%) lower and 0.13 eV (6%) higher than the DFT value, respectively (see the convergence test of vacancy formation energy in Supplementary Table 1). Note that vacancy configurations are not explicitly included in the training datasets. The DP thus shows transferability and predictive capabilities on basic material properties.
Based on the above results, the MEAM^{6} potential performed exceptionally well in elastic constants and cohesive energies compared to experiment values. In the following, we mainly focus on the MEAM potential for comparisons with the DP model. The phonon spectra for DP, MEAM^{6}, and experiment are shown in Supplementary Fig. 3. For both the BCC and HCP phases, the DP model and the MEAM potential reproduce the experimental acoustic mode data better than the optical modes. For the HCP phase, both DP and MEAM are in good agreement with the overall trend in the experimental data. Both the DP and MEAM overestimate the optical L[001] phonon frequencies at Γ and underestimate the optical phonons at K and M. For the BCC phase, both DP and MEAM show unstable phonon branches reflecting the instability of the BCC phase at zero K (C_{12} > C_{11}).
The energy versus volume (EOS) curve is important for accurate prediction of mechanical response; Fig. 2 shows the 0 K EOS curves for the HCP, BCC, and FCC phases. For each point on the EOS curve, a supercell is pre-strained to the desired volume and equilibrated in DFT (i.e., the supercell volume is fixed but the supercell shape and ion positions are unconstrained). For consistency, the DP energy per atom was calculated using atomic coordinates from the equilibrated compressed/dilated DFT supercell. For all three structures, the energies per atom of the DP agree well with DFT values; the RMSE between DP and DFT is smaller than 1 meV/atom. The DP and DFT EOSs for the three phases are in excellent agreement over the entire volume range examined (14–20 Å^{3}/atom, corresponding to a ±20% volumetric strain).
γ-Line and γ-Surface
In HCP Ti, plastic deformation is carried by dislocations with 〈a〉 and 〈c + a〉 Burgers vectors primarily on the prism and pyramidal I planes, respectively. Slip on basal and pyramidal II planes and deformation twinning are also observed in Ti at high temperatures and in Ti alloys. The primary slip planes are typically those with the lowest screw dislocation dissociation energy. Dislocation nucleation, dissociation and glide behaviour are strongly influenced by the generalised stacking fault energy, or γ-line, on each slip plane. Therefore, accurate γ-lines on all relevant slip planes are essential for modelling dislocation and plasticity behaviour. Figure 3 shows the γ-lines for these slip planes, as determined by DP, MEAM^{6}, and DFT^{2}. In this figure, the sheared (slipped) configurations indicated in the dashed boxes are included in the special training datasets. For the basal plane (Fig. 3a), the γ-line is computed along the \([0\bar{1}10]\) direction (corresponding to the Burgers vectors of the partial dislocation dissociated from the 〈a〉 dislocation). The DP accurately reproduces the DFT γ-line for slip from 0 to 40% of the \([0\bar{1}10]\) translation vector. While the stacking fault energies from 40 to 100% along this path are not well reproduced, this is not relevant since the energy is very high and this segment is not along the minimum energy path for slip. Most critically, the DP gives accurate unstable and stable stacking fault energies at 20 and 33% of the translation vector, respectively; the former controls dislocation nucleation and the latter governs dislocation dissociation (as discussed below). For the MEAM potential, the stable and unstable stacking fault energies (γ_{sf} and γ_{usf}) are 44 and 47% lower than the corresponding DFT values, which will make dislocation nucleation and dissociation much easier on the basal plane.
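The quantities discussed here can be made concrete with a short sketch: the stacking fault energy is the excess energy of the slipped cell per unit fault area, and γ_{usf} and γ_{sf} correspond to a local maximum and a local minimum along the γ-line. The function names are hypothetical, and the sketch assumes a supercell containing a single fault plane (a fully periodic cell with two faults would need an extra factor of 1/2).

```python
# 1 eV/angstrom^2 expressed in mJ/m^2.
EV_PER_A2_TO_MJ_PER_M2 = 16021.766

def stacking_fault_energy(e_slipped, e_perfect, area):
    """Fault energy in mJ/m^2 from energies in eV and area in angstrom^2."""
    return (e_slipped - e_perfect) / area * EV_PER_A2_TO_MJ_PER_M2

def local_extrema(gamma):
    """Indices of interior local maxima (unstable) and minima (stable)."""
    inner = range(1, len(gamma) - 1)
    maxima = [i for i in inner if gamma[i - 1] < gamma[i] > gamma[i + 1]]
    minima = [i for i in inner if gamma[i - 1] > gamma[i] < gamma[i + 1]]
    return maxima, minima
```

Applied to a sampled γ-line, `local_extrema` returns the slip positions at which γ_{usf} and γ_{sf} would be read off; a finer slip grid or interpolation would be needed to locate them precisely.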
The DP model also reproduces the general shape of the γ-line in the \([\bar{2}110]/3\) slip direction (i.e., the 〈a〉 dislocation Burgers vector) on the prism and pyramidal I narrow planes (Fig. 3b, c). On the prism plane, γ_{sf} and γ_{usf} are ~23 and ~9% lower than the DFT values, while on the pyramidal I narrow plane, γ_{sf} and γ_{usf} are 6 and 4% higher than the DFT values. The MEAM potential has shallow metastable points on the prism and pyramidal I narrow γ-lines; its γ_{sf} is ~15% higher and 6% lower than the corresponding DFT values on the two planes. Figure 3d, e show the γ-lines along \([\bar{2}113]/3\) (i.e., the 〈c + a〉 direction) on the pyramidal I wide and II planes. The DP again reproduces the overall shape (positions and energies of the stable and unstable SFs) of the DFT γ-lines. Note that only a section of the pyramidal II γ-line information is included in the training dataset and no pyramidal I wide γ-line information is included; this further indicates the transferability and predictive capability of the current DP model. The MEAM potential also shows similar γ-lines on the two pyramidal planes, in good agreement with the DFT results.
The γ-lines in Fig. 3 were determined using the standard method (atom displacements constrained along slip plane normals). However, stacking fault energies may be further reduced by full atomic relaxation, especially for the HCP pyramidal planes^{31,37,38}. Table 3 shows the stable stacking fault positions and energies calculated with only out-of-plane relaxation and with full atomic relaxation using DFT, DP, and MEAM^{6}.
For the basal and prism planes, the stable stacking fault positions and energies remain almost unchanged upon full relaxation in DFT, DP (except for the decrease of prism γ_{sf} by 16%), and MEAM. For the pyramidal I narrow plane, the stable stacking fault shifts from the initial position along the 〈a〉 direction to another position towards the e_{2} direction (Fig. 4) and with substantial reductions in γ_{sf} (57, 79, and 53% in DFT, DP, and MEAM).
The DP model underestimates γ_{sf} by similar amounts (−60, −85, and −95 mJ/m^{2}) on the three competing planes (basal, prism, pyramidal I narrow) for the 〈a〉 dislocation. In comparison, the MEAM potential has deviations of −133, +35, and +8 mJ/m^{2} with respect to DFT. The metastable point governs dislocation core dissociation and DFT calculations show that the 〈a〉 dislocation can dissociate into a pair of partials on both the pyramidal I and prism planes, but not on the basal plane^{39}. The DP model captures the correct ordering of γ_{sf} among the three planes. The energy ordering, \({\gamma }_{\,{{\mathrm{sf}}}}^{{{\mathrm{basal}}}}\, > \,{\gamma }_{{{\mathrm{sf}}}}^{{{\mathrm{prism}}}} \,> \,{\gamma }_{{{\mathrm{sf}}}}^{{\mathrm{pyramidal}}\,{\mathrm{I}}\,{\mathrm{narrow}}}\), is different from previous models^{5,6,7,8} with \({\gamma }_{\,{{\mathrm{sf}}}}^{{{\mathrm{basal}}}}\, < \,{\gamma }_{{{\mathrm{sf}}}}^{{{{\mathrm{pyramidal}}}}\,{\mathrm{I}}\, {\mathrm{narrow}}}\, < \,{\gamma }_{{{\mathrm{sf}}}}^{{{\mathrm{prism}}}}\). The correct ordering helps to reproduce the ground state dislocation core structure on the pyramidal I narrow plane, as discussed below.
For the pyramidal I wide plane, the stable stacking fault position along the 〈c + a〉 direction shifts to another position along the e_{2} direction (Fig. 4d). The fully relaxed stable stacking fault position is consistent with symmetry requirements and suggests that the pure screw 〈c + a〉 dislocation will dissociate into partials of mixed character on the pyramidal I wide plane. In addition, the similarities in the stacking fault location and energy in DFT and DP indicate that the in-plane relaxations should be similar, even though the stacking fault on the pyramidal I wide plane is not directly included in the training dataset. The MEAM potential overestimates γ_{sf} by 27%; its stacking fault position is similar to that of the DP. For the pyramidal II plane, the stable stacking fault shifts to another position along the 〈c + a〉 direction with a decrease in γ_{sf} of 0.113 J/m^{2} upon in-plane relaxation in DFT, 0.122 J/m^{2} in DP calculations, and 0.126 J/m^{2} in MEAM calculations. The DP model overestimates the fully relaxed γ_{sf} by 7%, while the MEAM potential overestimates γ_{sf} by 22%. The fully relaxed γ_{sf} difference between the two competing planes, i.e., \({\gamma }_{\,{{\mathrm{sf}}}}^{{{\mathrm{pyr. II}}}}-{\gamma }_{{{\mathrm{sf}}}}^{{{\mathrm{pyr. I}}}}\), is 0.187, 0.207, and 0.223 J/m^{2} in DFT, DP, and MEAM, respectively. DFT, DP, and MEAM all suggest that it is much more (energetically) favourable for 〈c + a〉 screw dislocations to dissociate on the pyramidal I planes than on pyramidal II planes, consistent with experimental observations which show that pyramidal I 〈c + a〉 slip is dominant in Ti^{40}.
The γ-lines in Fig. 3 represent possible minimum energy paths (based upon crystal symmetry) for slip along a crystallographic plane. To ensure that these do represent the true minimum energy paths, we calculate the entire γ-surface for four crystallographic planes in HCP. Figure 4 shows the DP γ-surfaces. The red cross symbols ('X') indicate the stacking fault positions determined from DFT (see Table 3). The overall γ-surfaces are consistent with crystal symmetry and all stable stacking faults are properly reproduced in accordance with the DFT results. While the smoothness of the γ-surfaces is not guaranteed with neural network-based potentials, the high degree of smoothness here indicates that the DP potential is suitable for atomistic simulations. The quantitative and qualitative features of all of the γ-surfaces, together with properties presented earlier, suggest that the current DP is appropriate for modelling HCP Ti mechanical response.
BCC Ti has important structural material applications along with intriguing features. Yet, its properties and behaviour are less well understood than those of the HCP allotrope of Ti. The BCC phase is entropically stabilised above 1155 K. To further test the capabilities of the DP for BCC Ti, we first compare γ-lines on the {110}, {112}, and {123} planes with DFT predictions (obtained using only out-of-plane atomic relaxation).
Figure 5 shows the γ-lines along the close-packed 〈111〉 direction on the three slip planes. The DP reproduces the overall profiles and main features of all of the γ-lines calculated by DFT. In DFT, DP, and MEAM, all three lines show negative stacking fault energies at small slip distances ±0.15. This is not surprising since the BCC structure is not the ground state at 0 K. In addition, no metastable stacking fault is seen near half 〈a〉 (i.e., [111]/4), consistent with the broad observation that no metastable stacking fault has been found in any BCC metal^{41}. Furthermore, the γ-lines are symmetric about half the translation vector x = 0.5 on the {110} plane, but slightly asymmetric on {112} and {123}. While the DP overestimates γ_{usf} (energy maximum) along the γ-lines as compared with DFT and the exceptionally good MEAM, it importantly captures the same peak-energy/barrier ordering between the three planes, i.e., \({\gamma }_{\,{{\mathrm{usf}}}\,}^{\{110\}}\, < \,{\gamma }_{\,{{\mathrm{usf}}}\,}^{\{123\}}\, < \,{\gamma }_{\,{{\mathrm{usf}}}\,}^{\{112\}}\). The DP thus reproduces the overall shape and key features of the γ-lines, demonstrating its predictive capabilities, despite the fact that no BCC γ-line information is used in the training dataset.
HCP lattice and elastic constants at finite temperatures
All of the properties presented above were calculated at 0 K to provide baseline material properties. Since titanium alloys are also employed for medium temperature (<600 K) applications, we now investigate several properties at finite temperatures, as shown in Fig. 6. We first compare the temperature dependence of HCP Ti lattice parameters (a, c, c/a) with experimental data^{34,42,43}. The DP shows that the lattice parameter a increases nearly linearly from 0 to 1000 K, with a thermal expansion coefficient similar to experimental measurements. The DP underestimates a by ~0.01 Å as compared to experiments, since it is trained with DFT results which also underestimate a (similarly for c). The coefficient of linear thermal expansion for a is nearly identical to experiment up to 1000 K. On the other hand, the coefficient of linear thermal expansion for c from the DP is larger than that from experiment. The atomic volume can be calculated from the lattice parameters a and c. While the DP underestimates the volume per atom of the HCP structure (inherited from the DFT training dataset), the volumetric thermal expansion coefficient (β) of the DP is 3.05 × 10^{−5} K^{−1} at 300 K, as compared with the experimental value of ~2.70 × 10^{−5} K^{−1} at the same temperature^{44}. While the experimental data show a weak, nearly linear reduction of c/a (important for twinning) with increasing temperature, the DP results show a weak increase in c/a with temperature (c/a varies <1% from 0 to 1000 K). Overall, the temperature-dependent lattice parameters and atomic volume from the DP are in good agreement with experiment; the discrepancies are associated with differences between DFT predictions and experiment. The MEAM potential underestimates a by about 1% and overestimates c by about 1% compared to experiment. The MEAM results also show a weak increase in c/a with temperature (c/a varies about 1% from 0 to 1000 K).
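The relation between the lattice parameters, the atomic volume, and the volumetric thermal expansion coefficient can be sketched as follows. The sketch assumes the standard HCP geometry (two atoms per primitive cell, so V = (√3/4) a² c per atom) and estimates β = (1/V) dV/dT by a central finite difference; the function names are hypothetical and no lattice data from this work are reproduced.

```python
import math

def hcp_volume_per_atom(a, c):
    """Atomic volume of an HCP crystal (2 atoms per primitive cell)."""
    return math.sqrt(3) / 4 * a * a * c

def volumetric_expansion(temps, a_vals, c_vals, i):
    """Central-difference estimate of beta = (1/V) dV/dT at temps[i].

    temps, a_vals, c_vals are equally indexed samples; i must be interior.
    """
    v = [hcp_volume_per_atom(a, c) for a, c in zip(a_vals, c_vals)]
    dv_dt = (v[i + 1] - v[i - 1]) / (temps[i + 1] - temps[i - 1])
    return dv_dt / v[i]
```

For small linear expansion coefficients α_a and α_c, this reduces to β ≈ 2α_a + α_c, which is why an overestimate of the expansion along c alone is enough to raise β above the experimental value.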
Figure 6d–f show the elastic constants of DP and MEAM as a function of temperature in comparison with experimental results. The finite temperature elastic constants calculations were performed by applying a ±1% strain individually for each strain component (ε_{xx}, ε_{yy}, ε_{zz}, ε_{xy}, ε_{xz}, ε_{yz}) at each temperature. The elastic constants represent (6000 time step) time averages of the global stress and averages over ± strains.
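One way to read this averaging procedure is as a central difference of time-averaged stresses at +ε and −ε. The sketch below implements that reading; it is an assumption about how the ± averages are combined, the function names are hypothetical, and the mapping to individual C_{ij} (including shear-strain conventions) is omitted.

```python
def time_average(samples):
    """Mean of a stress time series (e.g., one value per MD time step)."""
    return sum(samples) / len(samples)

def elastic_constant(stress_plus, stress_minus, strain=0.01):
    """Central-difference modulus from stress series at +strain and -strain."""
    return (time_average(stress_plus) - time_average(stress_minus)) / (2 * strain)
```

Averaging over both signs of strain cancels the leading even-order nonlinear contribution and any residual stress of the unstrained cell, at the cost of two MD runs per strain component.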
There exist small discrepancies between the DP elastic constants and the experimental data at 0 K. With increasing temperature, C_{11} from experiment decreases continuously; the DP predictions initially decrease very rapidly between 0 and 50 K, and then slowly and approach experimental values at higher temperatures (Fig. 6d). MEAM shows a continuous decrease and is very close to DP on C_{11} at high temperatures. The DP C_{33} is relatively accurate in the entire temperature range. It increases slightly from 0 to 150 K and then decreases continuously from 150 to 1000 K—asymptotically approaching the monotonically decreasing experimental data. The MEAM C_{33} decreases very rapidly between 0 and 50 K and then slowly at higher temperatures. It is about 15% lower than the experimental data. Similar levels of agreements were obtained between DP, MEAM predictions and experimental data for other C_{ij}. These results are perhaps sufficiently accurate for HCP Ti MD simulations. For both the DP and MEAM potentials, the discrepancies of elastic constants at finite temperatures are not expected and highlight the challenges in modeling HCP systems.
Phase transition temperatures
Phase stability over a temperature and pressure range is important for MD simulations of HCP-BCC Ti. To determine the phase stability of the DP, we compute the free energies of the HCP, BCC and liquid structures using thermodynamic integration^{45} and determine the phase transition temperatures as a function of pressure. Figure 7 shows the transition temperatures of the DP model as a function of pressure in comparison with experimental data^{46,47} and MEAM^{6}. At zero pressure, the DP exhibits an HCP to BCC transition at 1140 K and BCC melting at 1886 K, in remarkable agreement with 1155 and 1941 K from experiments. The DP melting point increases with increasing pressure, consistent with experimental results and thermodynamic expectation (molar volume of BCC < that of the liquid). However, the MEAM melting point first increases and then decreases with increasing pressure. The HCP to BCC structural phase transition temperature shows little dependence on pressure for both DP and MEAM, while the experimental data suggest that the transition temperature decreases slowly with increasing pressure. The mismatch in the slope of the HCP-BCC phase boundary is likely due to a minor inadequacy of the DP and MEAM in reproducing the atomic volume of the two phases, which can be improved if necessary. Overall, the DP shows good agreement with experimental results on the HCP-BCC transition temperatures and the BCC melting temperatures. The accurate DP phase transition temperatures are likely a consequence of including high temperature configurations in the training dataset (DP-GEN bulk datasets). This suggests that high temperature thermodynamic properties can be reproduced by training the DP with the current framework and that the DP is effective for describing phase transitions and mechanical behaviour over the full solid-state temperature range.
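Once the free energy of each phase is known as a function of temperature at a given pressure, the transition temperature is the point at which two free energy curves cross. A minimal sketch of that last step, using bisection, is given below; the function name is hypothetical and the free energy curves must be supplied by the user (the thermodynamic integration itself is not shown).

```python
def crossing_temperature(g_a, g_b, t_lo, t_hi, tol=1e-6):
    """Temperature in [t_lo, t_hi] where g_a(T) == g_b(T), by bisection.

    g_a, g_b : callables returning the free energy per atom at temperature T.
    Assumes g_a - g_b changes sign exactly once inside the bracket.
    """
    def diff(t):
        return g_a(t) - g_b(t)

    if diff(t_lo) * diff(t_hi) > 0:
        raise ValueError("free energy difference does not change sign in bracket")
    while t_hi - t_lo > tol:
        mid = 0.5 * (t_lo + t_hi)
        if diff(t_lo) * diff(mid) <= 0:
            t_hi = mid  # crossing lies in the lower half
        else:
            t_lo = mid  # crossing lies in the upper half
    return 0.5 * (t_lo + t_hi)
```

Repeating this at several pressures traces out a phase boundary. Its slope follows the Clausius-Clapeyron relation, dT/dP = ΔV/ΔS, which is why errors in the atomic volumes of the two phases show up as a mismatch in the slope of the boundary.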
Dislocation core structures
In this section, we examine dislocation core structures using the DP. We focus on the core structure of the screw 〈a〉 dislocation in HCP, which governs many intriguing features of slip in Ti^{48}. We first created a Volterra 〈a〉 dislocation in a 303 × 163 × 9 Å^{3} supercell with periodic boundary conditions in the glide (303 Å) and dislocation line (9 Å) directions. At 300 K, the dislocation adopts several structures on the pyramidal I and prism planes, and occasionally adopts a compact core (Supplementary Fig. 4). This suggests that the energies of the competing core configurations are very close and can be influenced by applied strains, solutes and temperature. No dissociation on the basal plane was observed within 10^{5} MD steps.
To quantitatively analyse the core structure, MD configurations were quenched every 2000 time steps, followed by energy minimisation (Fig. 8). Two distinct core configurations were observed, corresponding to dissociations on the pyramidal I and prism planes; similar core structures were seen in DFT calculations^{48}. In the DP model, the ground state dissociation plane for the screw 〈a〉 dislocation is the pyramidal I plane. The dissociated core on the prism plane has a slightly higher energy (~1.4 meV/Å, Supplementary Fig. 5), while DFT calculations based on quadrupolar configurations^{48} show a higher energy of 5.7 meV/Å. The energy difference between these configurations is small (the MEAM potential slightly favours dissociation on the prism plane^{49}). While some discrepancies remain between the ground state core structures of the present DP and DFT (cf. Fig. 8b and Fig. 4c in ref. ^{48}), further tuning of the DP model for core structures increases the risk of overfitting; core structures in better agreement with DFT may be obtained, but at the expense of a less accurate C_{11}. Nevertheless, the DP should provide an accurate description of dislocation glide behaviour at intermediate and elevated temperatures. We refer the reader to a more extensive discussion of the screw 〈a〉 dislocation core structures and energies in the Supplementary Information. Core structures and dynamics of other dislocations (e.g., the edge 〈a〉 dislocation, edge and screw 〈c + a〉 dislocations, twinning dislocations, and dislocations in the BCC phase) are equally important and are currently under study.
Computational cost of DP
Finally, we compare the speed of DP, EAM, and MEAM implementations on both CPUs and GPUs (Supplementary Fig. 6). On CPUs, the DP model is 200–300 times slower than EAM potentials and 30–40 times slower than the MEAM potential. For systems with several dozen atoms, DP is faster than DFT (using the VASP settings in the Methods section) by a factor of over 10^{6}. On GPUs, DP is 20–30 times slower than the EAM potential (MEAM is currently not ported to GPUs in LAMMPS). The networks at the heart of the DP can be accelerated further by optimising individual operators, the computational graph, and the use of multiple hardware devices^{50}. All potentials scale linearly with the number of atoms. Because of this linear scaling and speed, the DP model can be used to perform large-scale MD simulations of many defects, including dislocations and grain and phase boundaries, with relatively good accuracy and at a speed far outside the reach of DFT (except for very simple cases).
In summary, we reported a procedure for specialising a general purpose neural network (DP) interatomic potential to reproduce important physical properties for specific applications (X); i.e., DPspecX. In particular, we developed a DP for modelling the mechanical response of Ti; TiDPspecMech. The resulting DP accurately reproduces a comprehensive range of properties both within and outside the training datasets for HCP-BCC Ti. The DP thus enables a wide range of molecular statics and dynamics simulations in HCP-BCC Ti, including dislocations, interphase interfaces, fracture, and solid–solid and solid–liquid phase transition behaviour. Comprehensive benchmarks also show that the MEAM potential^{6} performed extraordinarily well in many aspects, suggesting that classical interatomic potential models will remain relevant for the foreseeable future.
While no empirical interatomic potential accurately reproduces all material properties over all temperatures and stress states, the approach provided here begins with a general purpose potential and specialises it for classes of properties of interest. The procedure is general and can be applied to the development of interatomic potentials for materials beyond Ti and applications beyond mechanical response, within the DP framework or other machine-learning-based approaches. The selection of special datasets can be standardised, and neural network parameters can be optimised by standard algorithms/codes (e.g., TensorFlow). This approach thus represents a shift from empiricism-driven to machine-driven, fit-to-purpose interatomic potentials.
Methods
DFT calculations
All DFT calculations are performed using VASP^{28,29} with the Perdew–Burke–Ernzerhof^{51} generalised gradient approximation exchange-correlation functional. The cutoff energy of the plane-wave basis set is 650 eV and core electrons are replaced with the projector-augmented-wave method^{52}. k-points with a grid spacing of 0.1 Å^{−1} are sampled in the Brillouin zone by the Monkhorst–Pack mesh method^{53}. The Methfessel–Paxton smearing method^{54} with order 1 and smearing width σ = 0.22 eV is used for partial electron occupancy. Self-consistent convergence is assumed when the energy variation is below 10^{−3} meV.
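For readers who wish to set up comparable calculations, the settings above might map onto VASP INCAR tags roughly as sketched below. This is an assumption about the mapping and not a copy of the input files used in this work: in particular, whether the k-point spacing was specified through the `KSPACING` tag or an explicit KPOINTS file, and whether the quoted spacing follows the `KSPACING` convention, is not stated in the text. The helper `render_incar` is a hypothetical name introduced for illustration.

```python
# Assumed mapping of the stated DFT settings onto standard VASP INCAR tags.
# The actual input files of this work are not given in the text.
incar_tags = {
    "GGA": "PE",           # Perdew-Burke-Ernzerhof exchange-correlation
    "ENCUT": 650,          # plane-wave cutoff energy (eV)
    "ISMEAR": 1,           # Methfessel-Paxton smearing, order 1
    "SIGMA": 0.22,         # smearing width (eV)
    "EDIFF": 1e-3 * 1e-3,  # 10^-3 meV converted to eV
    "KSPACING": 0.1,       # assumed k-point spacing (1/Angstrom)
}

def render_incar(tags):
    """Format a tag dictionary as INCAR-style 'KEY = value' lines."""
    return "\n".join(f"{key} = {value}" for key, value in tags.items())

incar_text = render_incar(incar_tags)
print(incar_text)
```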
Specialisation step details
“Special” training sets were generated along HCP γ-lines and equilibrated in DFT using the standard γ-line calculation approach and the parameters above. The DFT-calculated γ-lines are shown in Fig. 3 and are in good agreement with previous calculations^{31}. Among all configurations, we choose those near the unstable and stable stacking fault displacements to form the special training sets. For example, seven structures at slip distances 0, 0.15, 0.20, 0.25, 0.30, 0.35, and 0.40 were chosen from the γ-line on the basal plane, as shown in Fig. 3a. The atomic coordinates, forces, total energy, and virial tensor of these structures are input to the training iteration. These structures represent configurations of importance for plastic deformation. We therefore increase their weight to 100 in the loss function of the neural network calibration (configurations in the general training sets have a default weight of 1). This is equivalent to generating 7 × 100 special training sets from the basal plane γ-line. Special training sets are chosen and weighted similarly for the prism, pyramidal I narrow and pyramidal II plane γ-lines. In total, the “special” training sets include 4600 configurations from four γ-lines on different planes, as seen in Table 1. No pyramidal I wide plane γ-line configurations were included in the training set, for cross-validation purposes. The selection and weight of the special training sets are flexible and can be further optimised for desired targets.
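The stated equivalence between a weight of 100 and 7 × 100 training sets can be checked with a short sketch. It assumes, for illustration only, that the weight multiplies each configuration's contribution to a sum-of-squares loss; the per-configuration errors below are random placeholders, not results from this work.

```python
import numpy as np

# Slip distances of the seven basal-plane structures quoted above.
slip = np.array([0.0, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40])
weight = 100  # weight of each special configuration (general sets use 1)

# Hypothetical per-configuration squared errors, for illustration only.
rng = np.random.default_rng(0)
sq_err = rng.random(slip.size)

# Weighted loss over the 7 configurations ...
weighted = float(np.sum(weight * sq_err))
# ... equals the unweighted loss over 100 copies of each configuration.
replicated = float(np.sum(np.repeat(sq_err, weight)))

effective_count = slip.size * weight  # 7 x 100 effective training sets
distinct_special = 4600 // weight     # distinct structures behind 4600 sets

print(effective_count, distinct_special, np.isclose(weighted, replicated))
```

Under the same assumption, the 4600 weighted configurations quoted above correspond to 46 distinct DFT structures across the four γ-lines.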
The DeePMD-kit package^{14} is used for training a smooth Ti DP^{15}. The training data consist of “classic” and “special” training sets. The embedding and fitting net sizes are (25, 50, 100) and (240, 240, 240), respectively. The cutoff radius of the DP is 9.0 Å and includes at least the third-nearest neighbours in the HCP structure. Four models are trained, starting with different random seeds, but using the same neural network architecture and training sets. The pyramidal I narrow plane γ-line is not included initially for training general models. The learning rate starts at 1 × 10^{−3} and ends at 5 × 10^{−8} after 8 × 10^{6} training steps. The atomic forces and total energy of all the training sets are included in the training, but only the virial tensor of the training sets from the Initialisation step (1469 datasets in Table 1) is used to obtain a more accurate description of elastic constants near equilibrium. In addition, the prefactors for the energy, atomic force and virial tensor in the loss functions are \(p_{\mathrm{e}}^{\mathrm{start}} = 10\), \(p_{\mathrm{e}}^{\mathrm{limit}} = 100\), \(p_{\mathrm{f}}^{\mathrm{start}} = 1\), \(p_{\mathrm{f}}^{\mathrm{limit}} = 1\), \(p_{\mathrm{v}}^{\mathrm{start}} = 10\), and \(p_{\mathrm{v}}^{\mathrm{limit}} = 10\). These parameters give more weight to the energy and virial tensor than to the atomic forces. Afterwards, the pyramidal I narrow plane γ-line is included to tweak the dislocation core energy difference. One DP model is further trained starting from the current neural network, with the learning rate starting at 1 × 10^{−4} and ending at 5 × 10^{−8} after 1.6 × 10^{7} training steps.
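The interplay between the learning rate and the loss prefactors can be sketched as follows. The sketch assumes a continuous exponential decay of the learning rate between the stated start and end values, and assumes that each prefactor is interpolated linearly in the ratio of the current to the starting learning rate, which is our reading of the DeePMD-kit loss definition. The function names are hypothetical, and the staircase decay typically used in practice is not reproduced here.

```python
START_LR = 1e-3          # initial learning rate
STOP_LR = 5e-8           # final learning rate
TOTAL_STEPS = 8_000_000  # number of training steps

def learning_rate(step):
    """Continuous exponential decay from START_LR to STOP_LR (assumed form)."""
    return START_LR * (STOP_LR / START_LR) ** (step / TOTAL_STEPS)

def prefactor(step, p_start, p_limit):
    """Move from p_start towards p_limit as the learning rate decays."""
    ratio = learning_rate(step) / START_LR
    return p_limit * (1.0 - ratio) + p_start * ratio

# (start, limit) prefactors for energy, force and virial quoted above.
PREFACTORS = {"energy": (10, 100), "force": (1, 1), "virial": (10, 10)}

for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    values = {name: prefactor(step, *pair) for name, pair in PREFACTORS.items()}
    print(step, f"{learning_rate(step):.2e}", values)
```

With these settings only the energy prefactor changes during training, rising from 10 towards 100, while the force and virial prefactors stay at 1 and 10.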
Data availability
The Ti DP model and the training sets in this work have been uploaded to the online open data repository http://dplibrary.deepmd.net/.
References
Kohn, W. & Sham, L. J. Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133–A1138 (1965).
Vítek, V. Intrinsic stacking faults in body-centred cubic crystals. Philos. Mag. A 18, 773–786 (1968).
Oh, D. & Johnson, R. Simple embedded atom method model for fcc and hcp metals. J. Mater. Res. 3, 471–478 (1988).
Mendelev, M., Underwood, T. & Ackland, G. Development of an interatomic potential for the simulation of defects, plasticity, and phase transformations in titanium. J. Chem. Phys. 145, 154102 (2016).
Kim, Y.-M., Lee, B.-J. & Baskes, M. I. Modified embedded-atom method interatomic potentials for Ti and Zr. Phys. Rev. B 74, 014101 (2006).
Hennig, R., Lenosky, T., Trinkle, D., Rudin, S. & Wilkins, J. W. Classical potential describes martensitic phase transformations between the α, β, and ω titanium phases. Phys. Rev. B 78, 054121 (2008).
Ko, W.-S., Grabowski, B. & Neugebauer, J. Development and application of a Ni-Ti interatomic potential with high predictive accuracy of the martensitic phase transition. Phys. Rev. B 92, 134107 (2015).
Dickel, D., Barrett, C., Carino, R., Baskes, M. & Horstemeyer, M. Mechanical instabilities in the modeling of phase transitions of titanium. Model. Simul. Mater. Sci. Eng. 26, 065002 (2018).
Trinkle, D. et al. Empirical tight-binding model for titanium phase transformations. Phys. Rev. B 73, 094123 (2006).
Girshick, A., Bratkovsky, A. M., Pettifor, D. G. & Vitek, V. Atomistic simulation of titanium. I. A bond-order potential. Philos. Mag. A 77, 981–997 (1998).
Girshick, A., Pettifor, D. G. & Vitek, V. Atomistic simulation of titanium. II. Structure of 1/3 〈1210〉 screw dislocations and slip systems in titanium. Philos. Mag. A 77, 999–1012 (1998).
Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).
Zhang, L., Han, J., Wang, H., Car, R. & E, W. Deep potential molecular dynamics: a scalable model with the accuracy of quantum mechanics. Phys. Rev. Lett. 120, 143001 (2018).
Wang, H., Zhang, L., Han, J. & E, W. DeePMD-kit: A deep learning package for many-body potential energy representation and molecular dynamics. Comput. Phys. Commun. 228, 178 (2018).
Zhang, L. et al. End-to-end symmetry preserving inter-atomic potential energy model for finite and extended systems. In Bengio, S. et al. (eds.) Advances in Neural Information Processing Systems, vol. 31, 4436–4446 (Curran Associates, Inc., 2018).
Zhang, L., Lin, D. Y., Wang, H., Car, R. & E, W. Active learning of uniformly accurate interatomic potentials for materials simulation. Phys. Rev. Mater. 3, 023804 (2019).
Zhang, Y. et al. DP-GEN: A concurrent learning platform for the generation of reliable deep learning based potential energy models. Comput. Phys. Commun. 253, 107206 (2020).
Artrith, N. & Urban, A. An implementation of artificial neural-network potentials for atomistic materials simulations: performance for TiO_{2}. Comput. Mater. Sci. 114, 135–150 (2016).
Schütt, K., Sauceda, H., Kindermans, P.-J., Tkatchenko, A. & Müller, K.-R. SchNet – A deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).
Hy, T., Trivedi, S., Pan, H., Anderson, B. & Kondor, R. Predicting molecular properties with covariant compositional networks. J. Chem. Phys. 148, 241745 (2018).
Bartók, A., Payne, M., Kondor, R. & Csányi, G. Gaussian approximation potentials: the accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett. 104, 136403 (2010).
Deng, Z., Chen, C., Li, X.-G. & Ong, S. An electrostatic spectral neighbor analysis potential for lithium nitride. npj Comput. Mater. 5, 75 (2019).
Shapeev, A. Moment tensor potentials: a class of systematically improvable interatomic potentials. Multiscale Model. Simul. 14, 1153–1173 (2016).
Wen, T. et al. Development of a deep machine learning interatomic potential for metalloid-containing Pd-Si compounds. Phys. Rev. B 100, 174101 (2019).
Zong, H., Pilania, G., Ding, X., Ackland, G. & Lookman, T. Developing an interatomic potential for martensitic phase transformations in zirconium by machine learning. npj Comput. Mater. 4, 48 (2018).
Takahashi, A., Seko, A. & Tanaka, I. Conceptual and practical bases for the high accuracy of machine learning interatomic potentials: application to elemental titanium. Phys. Rev. Mater. 1, 063801 (2017).
Dickel, D., Francis, D. & Barrett, C. Neural network aided development of a semi-empirical interatomic potential for titanium. Comput. Mater. Sci. 171, 109157 (2020).
Kresse, G. & Furthmüller, J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput. Mater. Sci. 6, 15–50 (1996).
Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169 (1996).
Plimpton, S. Fast parallel algorithms for short-range molecular dynamics. J. Comput. Phys. 117, 1–19 (1995).
Yin, B., Wu, Z. & Curtin, W. A. Comprehensive first-principles study of stable stacking faults in hcp metals. Acta Mater. 123, 223–234 (2017).
Kittel, C. Introduction to Solid State Physics, 8th edition (New York, Wiley, 2005).
Schimka, L., Gaudoin, R., Klimeš, J., Marsman, M. & Kresse, G. Lattice constants and cohesive energies of alkali, alkaline-earth, and transition metals: Random phase approximation and density functional theory results. Phys. Rev. B 87, 214102 (2013).
Simmons, G. & Wang, H. Single Crystal Elastic Constants and Calculated Aggregate Properties. A Handbook, 2nd edition (The MIT Press, 1971).
Born, M. & Huang, K. Dynamical Theory of Crystal Lattices. International series of monographs on physics (Clarendon Press, 1988).
Mouhat, F. & Coudert, F.-X. Necessary and sufficient elastic stability conditions in various crystal systems. Phys. Rev. B 90, 224104 (2014).
Kwaśniak, P., Śpiewak, P., Garbacz, H. & Kurzydłowski, K. Plasticity of hexagonal systems: Split slip modes and inverse Peierls relation in α-Ti. Phys. Rev. B 89, 144105 (2014).
Kwaśniak, P., Garbacz, H. & Kurzydłowski, K. Solid solution strengthening of hexagonal titanium alloys: restoring forces and stacking faults calculated from first principles. Acta Mater. 102, 304 (2016).
Kwaśniak, P. & Clouet, E. Basal slip of 〈a〉 screw dislocations in hexagonal titanium. Scr. Mater. 162, 296–299 (2019).
Numakura, H., Minonishi, Y. & Koiwa, M. \({\langle \bar{1}\bar{1}23\rangle}\,{\langle \bar{1}\bar{1}23\rangle}\) slip in titanium polycrystals at room temperature. Scr. Metall. 20, 1581–1586 (1986).
Rodney, D., Ventelon, L., Clouet, E., Pizzagalli, L. & Willaime, F. Ab initio modeling of dislocation core properties in metals and semiconductors. Acta Mater. 124, 633–659 (2017).
Barrett, C. & Massalski, T. Structure of Metals (New York, McGraw-Hill, 1966).
Souvatzis, P., Eriksson, O. & Katsnelson, M. I. Anomalous thermal expansion in α-Titanium. Phys. Rev. Lett. 99, 015901 (2007).
Mal’ko, P., Arensburger, D., Pugin, V., Nemchenko, V. & L’Vov, S. Thermal and electrical properties of porous titanium. Powder Metall. Met. Ceram. 9, 642–644 (1970).
Vega, C., Sanz, E., Abascal, J. & Noya, E. Determination of phase diagrams via computer simulation: methodology and applications to water, electrolytes and proteins. J. Phys.: Condens. Matter 20, 153101 (2008).
Tonkov, E. & Ponyatovsky, E. Phase Transformations of Elements under High Pressure (Boca Raton, CRC Press, 2004).
Stutzmann, V., Dewaele, A., Bouchet, J., Bottin, F. & Mezouar, M. High-pressure melting curve of titanium. Phys. Rev. B 92, 224110 (2015).
Clouet, E., Caillard, D., Chaari, N., Onimus, F. & Rodney, D. Dislocation locking versus easy glide in titanium and zirconium. Nat. Mater. 14, 931–936 (2015).
Ghazisaeidi, M. & Trinkle, D. Core structure of a screw dislocation in Ti from density functional theory and classical potentials. Acta Materialia 60, 1287–1292 (2012).
Lu, D. et al. DP train, then DP compress: Model compression in deep potential molecular dynamics. Preprint at https://arxiv.org/abs/2107.02103 (2021).
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865 (1996).
Blöchl, P. Projector augmented-wave method. Phys. Rev. B 50, 17953 (1994).
Monkhorst, H. & Pack, J. Special points for Brillouin-zone integrations. Phys. Rev. B 13, 5188 (1976).
Methfessel, M. & Paxton, A. High-precision sampling for Brillouin-zone integration in metals. Phys. Rev. B 40, 3616 (1989).
Acknowledgements
This work is supported by the Research Grants Council, Hong Kong SAR through the Collaborative Research Fund under project number 8730054 and the Early Career Scheme Fund under project number 21205019. T.Q.W. acknowledges the support of the Hong Kong Institute for Advanced Study, City University of Hong Kong through a postdoctoral fellowship. The work of H.W. is supported by the National Science Foundation of China under Grant No. 11871110 and the Beijing Academy of Artificial Intelligence (BAAI). L.F.Z. acknowledges the support of the BAAI. We are also grateful to Dr. Wanrun Jiang, Fengbo Yuan, and Denghui Lu for helpful discussions on the training, free energy calculations, and model compression.
Author information
Contributions
T.Q.W. performed the research and analysed the data. R.W. conducted and analysed dislocation simulations. L.Y.Z. provided some DFT data. L.F.Z. and H.W. provided guidance on data generation, training, and data analysis. D.J.S. and Z.W. conceived and directed the project. T.Q.W., D.J.S., and Z.W. wrote the manuscript, and all authors provided input.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wen, T., Wang, R., Zhu, L. et al. Specialising neural network potentials for accurate properties and application to the mechanical response of titanium. npj Comput Mater 7, 206 (2021). https://doi.org/10.1038/s4152402100661y