Introduction

The mesoscopic method of Discrete Dislocation Dynamics (DDD) aims to compute crystal plasticity response from the motion and interaction of dislocations1,2,3,4,5,6,7. DDD is defined on dislocation lines represented as interconnected dislocation segments, thus reducing the number of degrees of freedom by many orders of magnitude compared to an atomistic model of the same crystal. Given its relative computational efficiency, DDD has long been regarded as a promising scale-bridging method capable of directly connecting macroscopic crystal plasticity to the underlying microscopic mechanisms of dislocation motion. It is only relatively recently, owing to the steadily growing HPC capabilities, that fully atomistic Molecular Dynamics (MD) simulations of bulk crystal plasticity have become possible8,9,10,11. Still, owing to their high computational cost, MD simulations of crystal plasticity remain limited to deformation rates many orders of magnitude higher than the rates typical of quasistatic mechanical tests. Furthermore, MD simulations are presently limited to sub-micron length scales that are small compared to the length scales characteristic of dislocation microstructures observed under quasistatic deformation conditions. Thus, when it comes to quasistatic deformation, DDD is likely to remain the method of choice for predicting crystal plasticity from dislocation motion for years to come.

The key material-specific ingredient of a DDD model encapsulating the physics of dislocation motion is a dislocation mobility function used to predict the velocity of a dislocation segment in response to the mechanical force or stress seen by the segment. Typically, dislocation motion is assumed to be over-damped (ignoring inertia) and its velocity is computed as

$${{\boldsymbol{V}}}_{i}=\frac{d{{\boldsymbol{r}}}_{i}}{dt}={\mathcal{M}}\left[{{\boldsymbol{F}}}_{i}\left(\{{{\boldsymbol{r}}}_{j}\},\{{{\boldsymbol{b}}}_{jk}\}\right)\right]$$
(1)

where rj and bjk are positions of nodes and the Burgers vectors of segments defining the set of vertices and edges of a geometric graph \({\mathcal{G}}=({\mathcal{V}},{\mathcal{E}})\) representing the dislocation network.
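
In code, the overdamped evolution of Eq. (1) reduces to an explicit update of the nodal positions at each time step. The sketch below is a minimal illustration assuming a forward-Euler integrator and a generic mobility callable; it is not the integrator of any particular DDD code.

```python
import numpy as np

def ddd_step(r, forces, mobility, dt):
    """One overdamped DDD step, Eq. (1): V_i = M[F_i], r_i <- r_i + V_i dt.

    r, forces : (N, 3) arrays of nodal positions and forces
    mobility  : callable mapping nodal forces to nodal velocities
    Forward-Euler integration is assumed here purely for illustration.
    """
    v = mobility(forces)
    return r + dt * v
```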

Physical fidelity of a DDD simulation is predicated on the availability of an accurate dislocation mobility function \({\mathcal{M}}\). Owing to difficulties in extracting the force-velocity relationship from experiments, dislocation theory has been used to predict dislocation mobility, e.g. by computing the nucleation energy barrier and the critical stress for dislocation motion controlled by kink-pair mechanisms12,13,14 or by extracting a dislocation drag coefficient from the spectrum of thermal fluctuations15. It has become common practice to calibrate dislocation mobility functions against MD simulations of individual dislocations. Within this approach, a parametric functional form of the mobility function is “hand-crafted” based on physical knowledge or intuition, and its parameters are fitted to reproduce dislocation velocities observed in MD calculations of isolated dislocations of various line orientations (e.g. edge, screw, mixed) performed under stress and temperature conditions of interest14,16,17,18,19,20,21,22. Widely adopted, the conventional calibration workflow is based on a rarely articulated assumption of transferability23, namely that the mobility of individual/isolated dislocations is representative of the mobility of dislocation segments incorporated in dislocation networks formed under deformation. Such an assumption leaves aside possible effects of dislocation connectedness on dislocation mobility, such as recently observed in ref. 24.

Here we propose an alternative approach to the calibration of dislocation mobility functions that does not rely on the assumption of transferability and accounts for network effects on dislocation motion. Instead of extracting mobility parameters from MD simulations of individual dislocations, in our proposed DDD+ML approach dislocation mobility is predicted by a graph neural network (GNN) trained on dislocation trajectories extracted from massive MD simulations of crystal plasticity. In training our GNN-based mobility functions we regard MD simulations as the ground truth that subsequent DDD+ML simulations are intended to reproduce. Using body-centered cubic (BCC) tungsten as a testbed material model, we show that a GNN can learn non-trivial physics of dislocation motion and, when subsequently used as a mobility function in DDD simulations, matches the plasticity response observed in the MD simulations used in training as well as in MD simulations performed under previously unseen deformation conditions.

Results

DDD+ML model

Viewed as a network, dislocation microstructure is a natural geometric graph (Fig. 1). Recently explored in a simplified model25, this correspondence is further exploited here by training a GNN to predict mobility of dislocation segments comprising dislocation networks. For training the GNN we use dislocation network configurations extracted using the dislocation extraction algorithm (DXA)26,27 from large-scale (ground-truth) MD simulations.

Fig. 1: The DDD+ML approach.
figure 1

a A schematic showing two consecutive dislocation networks \({{\mathcal{G}}}^{t}\) and \({{\mathcal{G}}}^{t+\Delta t}\) extracted using DXA from two MD trajectory snapshots at times t (solid lines) and t + Δt (dashed lines). As returned by DXA, two network configurations are defined by nodal positions r = {ri} and Burgers vectors of their dislocation segments b = {bij}, but are generally non-isomorphic and contain no information about mechanical forces driving dislocation motion. Not having the same nodes present in both networks leaves it uncertain how to define nodal velocities. For instance, three arrows of different colors pointing out of node 2 show three possible ways to define its velocity V2. b A schematic of inference and training loops of our proposed workflow for developing a GNN dislocation mobility function. c An illustration of the Nye’s tensor field-matching approach circumventing the ill-defined problem of matching nodal velocities. As exemplified here, our field-matching procedure is agnostic to details of line discretization and network topology.

In a nodal version of DDD widely adopted by its practitioners, training a GNN to serve as a dislocation mobility function entails learning a relationship between the input (force on the node) and the output (node velocity) vectors given the instantaneous geometry and topology of the dislocation network (Eq. (1)). Although the task may appear straightforward, two practical complications arise. First, it is difficult to accurately define the mechanical force acting on a dislocation node in an atomistic MD model (more on this later). Second, the DXA algorithm extracts only instantaneous configurations of dislocation networks; thus nodal velocities have to be inferred from nodal displacements over the time interval between two subsequent DXA snapshots. Consider two graphs \({{\mathcal{G}}}^{t}\) and \({{\mathcal{G}}}^{t+\Delta t}\) representing consecutive states of a dislocation network at times t and t + Δt, respectively (Fig. 1a). More often than not, graphs \({{\mathcal{G}}}^{t}\) and \({{\mathcal{G}}}^{t+\Delta t}\) constructed by the DXA algorithm are not topologically isomorphic, meaning in particular that some of the graph nodes present at time t no longer exist at time t + Δt and vice versa. This complication is due to different meshing of dislocation lines in the two DXA networks as well as to topological events (e.g. dislocation collisions, cutting, etc.) taking place over the time interval between t and t + Δt. This lack of isomorphism makes it impossible to establish a one-to-one correspondence between dislocation nodes in two consecutive network configurations, as needed to define nodal velocities. As such, learning on nodal velocity vectors is not a well-defined task for machine learning (ML).

We propose to address both these challenges of unknown ground-truth force and velocity vectors in the following way. First, to address the issue of defining the mechanical force Fi on a dislocation node i, we assume that a substantial part of the force seen by each node in an MD simulation can be estimated from the standard solutions of dislocation theory commonly employed in DDD simulations, e.g.

$${{\boldsymbol{F}}}_{i}^{{\rm{DDD}}}={{\boldsymbol{F}}}_{i}^{{\rm{ext}}}+{{\boldsymbol{F}}}_{i}^{{\rm{int}}}+{{\boldsymbol{F}}}_{i}^{{\rm{core}}}$$
(2)

where \({{\boldsymbol{F}}}_{i}^{{\rm{ext}}}\) is mechanical force due to an externally applied stress, \({{\boldsymbol{F}}}_{i}^{{\rm{int}}}\) is due to elastic interactions among dislocation segments, and \({{\boldsymbol{F}}}_{i}^{{\rm{core}}}\) is associated with the dislocation core. That this DDD force is only an approximation of the true force \({{\boldsymbol{F}}}_{i}^{{\rm{MD}}}\) exerted on node i in an MD simulation can be expressed as

$${{\boldsymbol{F}}}_{i}^{{\rm{MD}}}={{\boldsymbol{F}}}_{i}^{{\rm{DDD}}}+{{\boldsymbol{F}}}_{i}^{{\rm{corr}}}$$
(3)

where \({{\boldsymbol{F}}}_{i}^{{\rm{corr}}}\) is an unknown correction term accounting for the error made in approximating the true force using Eq. (2). Given that the external and the interaction terms in the DDD force are smoothly varying functions that are accurately and efficiently computed in DDD, the correction term in Eq. (3) is expected to be a “shortsighted” function of the local network topology and geometry, which we leave for the GNN model to learn. Thus we define the nodal mobility function \({{\mathcal{M}}}_{{\boldsymbol{\theta }}}^{{\rm{GNN}}}\) as a parametric function with learnable parameters θ,

$${{\boldsymbol{V}}}_{i}={{\mathcal{M}}}_{{\boldsymbol{\theta }}}^{{\rm{GNN}}}\left({{\boldsymbol{F}}}_{i}^{{\rm{DDD}}},\{{N}_{j}\},\{{E}_{jk}\}\right),$$
(4)

and let the GNN model learn the resulting nodal velocity vector Vi as a function of the estimated force \({{\boldsymbol{F}}}_{i}^{{\rm{DDD}}}\) and a local graph neighborhood of node i with nodal and edge attributes Nj and Ejk (see Methods for a detailed definition of the local graph and its attributes). So-defined and properly trained, the mobility function in Eq. (4) is expected to learn the unknown but local correction to the nodal force \({{\boldsymbol{F}}}_{i}^{{\rm{corr}}}\). To the extent that the correction term is indeed learnable from the ground-truth MD data and considering that nodal forces in DDD are always computed using Eq. (2), our formulation should ensure accuracy of subsequent rollout DDD simulations.
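
In practice, evaluating Eq. (4) amounts to passing the DDD force together with the local graph attributes through a learned model. The following sketch illustrates one possible calling convention; the function and argument names are illustrative only and do not correspond to a specific implementation.

```python
import torch

def gnn_mobility(model, F_ddd, node_attrs, edge_index, edge_attrs):
    """Evaluate Eq. (4): V_i = M_theta^GNN(F_i^DDD, {N_j}, {E_jk}).

    F_ddd      : (N, 3) nodal forces computed with Eq. (2)
    node_attrs : (N, d_v) remaining node features (e.g. node-type flags)
    edge_index : (2, E) directed connectivity of the dislocation graph
    edge_attrs : (E, d_e) segment features (Burgers vectors, line vectors, ...)
    """
    # The force enters as a node feature so the model can learn the local correction F^corr.
    x = torch.cat([node_attrs, F_ddd, F_ddd.norm(dim=1, keepdim=True)], dim=1)
    return model(x, edge_index, edge_attrs)   # (N, 3) nodal velocities V_i
```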

To address the uncertainty in defining nodal velocity vectors in the ground-truth MD data, we resort to a recently proposed sweep-tracing algorithm (STA)10 that deals with precisely the same uncertainty. STA employs a dual line/field representation of a dislocation network to “reconnect” any two successive dislocation networks even when they are not topologically isomorphic. In STA, the Nye's tensor fields produced by two dislocation networks are first computed and mapped on a static spatial grid. The task of reconnection is then defined as an optimization problem that seeks to minimize a properly defined distance between the Nye's tensor28 field representations of the two networks10.

Borrowing from the STA approach, training of the GNN-based mobility function \({{\mathcal{M}}}_{{\boldsymbol{\theta }}}^{{\rm{GNN}}}\) in Eq. (4) is achieved by minimizing over parameters θ the following loss function

$${\mathcal{L}}({\boldsymbol{\theta }})=\sum _{s}\sum _{{\boldsymbol{g}}}\sum _{kl}{\left[{\alpha }_{kl}^{{\boldsymbol{g}}}\left({{\boldsymbol{r}}}^{{t}_{s}}+{{\boldsymbol{{\mathcal{M}}}}}_{{\boldsymbol{\theta }}}^{{\rm{GNN}}}\Delta {t}_{s},{{\boldsymbol{b}}}^{{t}_{s}}\right)-{\alpha }_{kl}^{{\boldsymbol{g}}}\left({{\boldsymbol{r}}}^{{t}_{s}+\Delta {t}_{s}},{{\boldsymbol{b}}}^{{t}_{s}+\Delta {t}_{s}}\right)\right]}^{2}$$
(5)

where the sums run over the set of training examples s, grid points g and Cartesian components of the Nye's tensor field computed on the grid points using the method introduced in ref. 29. During training, the GNN learns to predict nodal velocity vectors Vi at time ts by best matching its predictions to the field representation of the network at a later time ts + Δts, Fig. 1c. Defining and computing the training loss on a static grid circumvents the otherwise ill-defined problem of matching the topology of two generally non-isomorphic graphs.
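
A minimal PyTorch sketch of this training objective is given below. It assumes a differentiable helper nye_field(r, b, grid) implementing the discrete-to-continuous mapping of the Nye's tensor (here only a named placeholder), so that gradients flow from the field mismatch back to the predicted velocities.

```python
import torch

def field_matching_loss(model, samples, nye_field):
    """Sketch of the loss in Eq. (5); `nye_field` is a placeholder for a
    differentiable routine returning Nye tensor components on a fixed grid."""
    loss = torch.zeros(())
    for s in samples:
        v = model(s.graph_t)                                  # predicted velocities at time t_s
        r_pred = s.r_t + v * s.dt                             # advance the nodes of G^{t_s}
        alpha_pred = nye_field(r_pred, s.b_t, s.grid)         # field of the predicted network
        with torch.no_grad():
            alpha_true = nye_field(s.r_next, s.b_next, s.grid)  # field of G^{t_s + dt_s}
        loss = loss + ((alpha_pred - alpha_true) ** 2).sum()  # sum over grid points and components
    return loss
```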

Calculations of the Nye's tensor field produced by dislocation networks are only required for computing the loss and back-propagating its gradients to the parameters θ during training. Once trained, only the GNN-based mobility function in Eq. (4) needs to be evaluated at inference time, Fig. 1b. Our GNN model and its training protocol are implemented in PyTorch30.

Validation on DDD trajectories

Although our intent is to train GNN-based mobility functions on dislocation trajectories extracted from MD simulations, it can be insightful to first test the method by observing if and how well a GNN trained on a DDD simulation itself can reproduce its known (user-defined) mobility function. For this purpose here we use a generic linear function previously developed for BCC metals31 and implemented in our DDD code ParaDiS7

$${{\boldsymbol{V}}}_{i}={{\boldsymbol{{\mathcal{M}}}}}^{{\rm{DDD}}}({M}_{e},{M}_{s})\cdot {{\boldsymbol{F}}}_{i}^{{\rm{DDD}}}$$
(6)

where \({{\boldsymbol{{\mathcal{M}}}}}^{{\rm{DDD}}}\) is a mobility tensor parameterized by two scalar coefficients Me = 2600 (Pa·s)⁻¹ and Ms = 20 (Pa·s)⁻¹ defining the mobility of an isolated edge and an isolated screw dislocation, respectively.
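
As an illustration of how the two coefficients enter, a simplified variant of such a linear mobility is sketched below: the screw/edge character is interpolated from the angle between the local line direction and the Burgers vector, and the mobility is applied to the glide component of the force. The actual BCC mobility law of ref. 31 implemented in ParaDiS involves additional projections, so this sketch is not a substitute for it.

```python
import numpy as np

def linear_bcc_mobility(F, t, b, M_e=2600.0, M_s=20.0):
    """Simplified sketch of a linear BCC-like mobility (cf. Eq. (6)).

    F : (3,) nodal force, t : (3,) local line direction, b : (3,) Burgers vector.
    The screw/edge character is interpolated via the angle between t and b;
    this is an illustration only, not the exact ParaDiS BCC mobility law.
    """
    t = t / np.linalg.norm(t)
    bhat = b / np.linalg.norm(b)
    screw_frac = abs(np.dot(t, bhat))              # 1 for screw, 0 for edge
    M = M_s * screw_frac + M_e * (1.0 - screw_frac)
    F_glide = F - np.dot(F, t) * t                 # drop the force component along the line
    return M * F_glide                             # nodal velocity in response to the glide force
```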

To generate training data, we ran five DDD simulations starting from five different initial distributions of 12 prismatic loops seeded at random positions. Identical except for their initial states, the five crystals were subjected to compressive straining along their [001] axes at a constant strain rate of 2 × 10⁸/s, which are also the conditions of the MD simulations intended for training (see next section). The five simulations were run to 0.5 ns, over which time dislocation networks were saved at regular intervals of 0.25 ps, thus producing 10,000 configurations for training. Following the procedure described in “DDD+ML model”, for each training configuration the DDD nodal forces and Nye's tensor fields were computed and the loss (Nye's tensor mismatch, Eq. (5)) was minimized by backpropagation.

Once trained, the resulting GNN-based mobility function can be used to infer nodal velocities in any dislocation configuration, including the two simple cases of infinite edge and screw dislocations moving in response to an applied shear stress τ. Such a simple exercise allows us to peek into the otherwise “black-box” trained mobility function. As shown in Fig. 2a, the linear mobility employed in the DDD simulations is well reproduced by the learned GNN mobility function. Linear regression on the predicted stress-velocity curves yields mobility coefficients of \({\hat{M}}_{e}\approx 2564\,{({\rm{Pa}}\cdot {\rm{s}})}^{-1}\) and \({\hat{M}}_{s}\approx 6\,{({\rm{Pa}}\cdot {\rm{s}})}^{-1}\) for the edge and screw dislocations, respectively, the former in close agreement with the user-defined edge mobility parameter. The notable discrepancy in the screw mobility is likely caused by the relative scarcity in the training data of instances in which screw dislocations actually move. Under the assumed extreme anisotropy of dislocation mobility (ratio Me/Ms = 130) and initiated from 12 prismatic loops of pure edge character, it is mostly edge and mixed dislocations that move over the first 0.1 of strain covered by the five DDD simulations, while the screws are passively drawn out and remain largely motionless. Training on more extended DDD simulations, reaching strains at which screws become more active, is likely to bring the learned \({{\mathcal{M}}}_{{\boldsymbol{\theta }}}^{{\rm{GNN}}}\) into closer agreement with the user-defined mobility parameter for screw dislocations.
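
The quoted coefficients follow from a straight-line fit to the predicted stress-velocity data. A minimal sketch, assuming the driving force per unit length on a straight dislocation under resolved shear stress τ is the Peach-Koehler value τb, so that v = M τ b and M is the fitted slope divided by b:

```python
import numpy as np

def fit_mobility(tau, v, b=0.2743e-9):
    """Fit M in v = M * tau * b; tau in Pa, v in m/s, b in m. Returns M in (Pa*s)^-1."""
    slope = np.polyfit(tau, v, 1)[0]   # slope in (m/s)/Pa
    return slope / b
```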

Fig. 2: Stress-velocity predictions.
figure 2

a Velocities of isolated edge and screw dislocations predicted using a GNN mobility function trained on DDD trajectories. The learned GNN model accurately reproduces linear mobilities employed in DDD simulation. b Velocities of isolated edge and screw dislocations predicted using a GNN mobility function trained on large-scale MD trajectories of BCC tungsten deformed under [001] tension and compression. The trained GNN mobility function predicts a non-linear force-velocity relation for the edge dislocation and a strong asymmetry in the velocities of screw dislocations sheared in the twinning (T) and anti-twinning (AT) direction.

Training on MD trajectories

We now apply our method to learning a dislocation mobility function on data from large-scale MD simulations. To illustrate the approach, here we focus on plasticity in body-centered-cubic (BCC) metals, which is challenging owing to its strong plastic anisotropy32,33,34 manifested in the notorious tension/compression asymmetry10,35,36 and rather uncommon slip crystallography37. We choose BCC tungsten as a testbed material because it is nearly elastically isotropic, making it particularly convenient for comparisons with DDD simulations, which are simpler and much more expedient to perform in an elastically isotropic crystal. For the MD simulations we employ a computationally efficient inter-atomic potential (IAP) of the embedded-atom (EAM) type previously developed for tungsten38, whose Zener anisotropy ratio computed at 300 K, A = 2C44/(C11 − C12) = 1.15, is close to 1.0.

We first perform two MD simulations with ~35 million atoms, much the same way as in the DDD validation simulations described earlier. BCC crystals are initially seeded with prismatic dislocation loops and then deformed at 300 K under uniaxial tension and compression along the [001] direction. Again consistent with the earlier DDD simulations, the crystals are deformed at a true strain rate of 2 × 10⁸/s until reaching a true strain of 1.0. Consistent with our previous results for BCC tantalum10,39, the simulations predict a well-defined tension/compression asymmetry, with flow stress of ~4.2 GPa under compression and ~2.8 GPa under tension. During both runs, DXA is used to extract dislocation configurations at equal time intervals of 1 ps, thus producing a total of 10,000 DXA snapshots subsequently used to train a GNN-based mobility function as previously described.

When initially tested in predicting velocities of isolated edge and screw dislocations, the model yields smooth and well-behaved stress-velocity functions, Fig. 2b. The velocity predicted for an edge dislocation is proportional to stress up to ~300 MPa, followed by a non-linear regime reaching an asymptotic velocity of ~2000 m/s at a stress of 2 GPa. Qualitatively, this behavior is fully consistent with the expected phonon-drag controlled motion of edge dislocations14 confirmed earlier in MD simulations of BCC metals18,40.

Of particular interest are the velocities predicted by our learned mobility function for pure screw dislocations. When shear stress is applied in the twinning (T) direction (corresponding to a MRSSP angle χ = −30°), the screw velocity reveals a well-defined threshold stress behavior characteristic of thermally-activated motion controlled by kink-pair mechanisms, with very low velocity at stresses below an activation threshold (finite-temperature Peierls stress) followed by a transition to a drag-controlled regime at higher stresses. In contrast, when stress is applied in the anti-twinning (AT) direction (corresponding to a MRSSP angle χ = +30°), the same screw dislocation is predicted to move at barely discernible velocities, which is again fully consistent with theoretical models41 and earlier large-scale MD simulations10. Considering that a screw dislocation in a BCC crystal has a choice of three {112} glide planes, at least one of which sees a resolved shear stress no less than \(\sqrt{3}/2\) of the shear stress resolved in the MRSSP, Fig. 2b indicates that, under the conditions of our MD simulations and for the IAP used in this work, glide of screw dislocations on {112} planes in the AT directions is essentially prohibited.
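
The \(\sqrt{3}/2\) bound follows from the geometry of the ⟨111⟩ zone: the three {112} planes are spaced 60° apart in the MRSSP angle χ, so the nearest one lies within 30° of the MRSSP and therefore

$$\mathop{\max }\limits_{\{112\}}{\tau }_{\{112\}}=\mathop{\max }\limits_{\{112\}}\,\tau \cos \left(\chi -{\chi }_{\{112\}}\right)\ge \tau \cos 3{0}^{\circ }=\frac{\sqrt{3}}{2}\,\tau .$$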

DDD+ML simulations with learned mobility

To fully assess the validity of our DDD+ML approach, we implemented our learned GNN dislocation mobility function in the ParaDiS DDD code7 via the C++ PyTorch interface. For a one-to-one comparison, DDD simulations were performed under conditions identical to those used in the ground-truth MD simulations. Cube-shaped simulation boxes with side length of 296b (here b = 0.2743 nm is the magnitude of the Burgers vector) were initially seeded with randomly positioned prismatic loops and then deformed under tension and compression along the [001] axis at a strain rate of 2 × 10⁸/s. For consistency, the shear modulus μ = 149.78 GPa, the Poisson ratio ν = 0.289 and the dislocation core terms42 were all computed at 300 K using the same IAP as in the MD simulations38.

As shown in Fig. 3, DDD predictions for flow stress agree closely with the ground-truth MD simulations. In particular, the asymmetry in flow stress under tension and under compression is fully captured by the learned GNN mobility function. However, in comparison to MD, DDD under-predicts dislocation densities (Fig. 3b) both under tension and under compression, even if the asymmetry between the two densities predicted in DDD is approximately the same as in MD. Potential sources of these discrepancies will be discussed in the next section.

Fig. 3: Large-scale DDD+ML predictions versus ground-truth MD results.
figure 3

a Comparison of stress-strain response predicted in MD and DDD simulations of [001] tension and compression deformation of BCC tungsten: the GNN dislocation mobility function employed in the DDD simulation was trained on the MD trajectories. MD and DDD simulations are performed in simulation volumes of the same size deformed under the same uniaxial straining conditions at rate 2 × 108/s. b Comparison of dislocation density evolution in the same MD and DDD simulations. Lines colors are the same as in the legend in (a).

To further test the accuracy of our learned GNN mobility function, we now compare DDD and MD simulations performed under conditions never seen in training. For this purpose, we ran DDD and MD simulations under compression at a lower rate of 2 × 10⁷/s, an order of magnitude lower than before. To ensure that simulations at the lower rate remain statistically representative of bulk crystal plasticity, the crystals used in the MD simulations were enlarged by a factor of four to ~140 million atoms, with a corresponding increase in the size of the DDD simulation box to 470b. Stress-strain curves extracted from MD and DDD simulations are plotted side by side in Fig. 4. Although our GNN dislocation mobility function was trained only on MD simulations performed under the high deformation rate of 2 × 10⁸/s, the plastic flow response predicted by DDD+ML also agrees quantitatively with the MD simulation performed at the previously unseen lower rate of deformation.

Fig. 4: Comparison of stress-strain response predicted in MD and DDD simulations of [001] compression at rates 2 × 10⁸/s and 2 × 10⁷/s.
figure 4

Although the GNN mobility function was trained only on 2 × 10⁸/s trajectories (red), it correctly predicts the flow stress at the lower straining rate of 2 × 10⁷/s (green). Evolution of dislocation densities predicted at the lower straining rate is shown in the inset.

This important observation is consistent with our earlier simulations based on a simplified DDD system25 in which we similarly observed that a GNN model trained only at one deformation rate performs well in simulations performed at lower strain rates not seen in training. To rationalize this observed transferability of the GNN mobility function to DDD simulations at lower deformation rates, we note that, although the external/average flow stress stays nearly constant in our ground-truth training MD simulations, in the course of the same simulations dislocations experience a much wider range of stress conditions. Indeed, additional analysis of stress variations in simulation volumes of our MD training set shows that the local stress (sum of the average external stress and the internal stress due to elastic interactions among dislocations) varies widely, from well above to well below the average stress. Accordingly, mechanical forces seen at any given time by many dislocations in the network can be low, as manifested in our previously documented observation that a large fraction of dislocation segments move little or not at all from one DXA snapshot to the next10. Instances when the mechanical driving force on a dislocation is low become increasingly common with decreasing deformation rates, but even at the high rates of our training MD simulations they occur sufficiently frequently for the GNN to learn on. We leave it to future work to further examine the accuracy and transferability of our ML-learned dislocation mobility function across a wider range of deformation conditions.

Discussion

Results presented in the preceding sections demonstrate the feasibility and accuracy of our proposed DDD+ML approach. These results are all the more significant in comparison to traditional “hand-crafted” mobility functions, including the ones previously developed in our group, which often fail to satisfactorily capture salient characteristics of crystal plasticity such as the tension/compression asymmetry or the peculiar slip crystallography in BCC crystals. Not commonly noted in the literature, such failures come into full display when comparing DDD predictions to predictions of MD simulations performed under identical conditions, which is precisely the essence of the cross-scale matching approach24 we presently pursue. Here we take advantage of the large number of instances of dislocation motion encountered in large-scale MD simulations of crystal plasticity to train a machine learner to advance dislocations forward in time in a DDD simulation in the same manner they move in MD. While fully accounting for known physics in a data-driven fashion, the GNN learns on local network configurations to predict corrections to dislocation motion that are otherwise difficult if not impossible to evaluate. Once incorporated in a DDD model, our learned mobility function accurately describes the complex motion of dislocations in a variety of local configurations and stress states resulting in the strongly anisotropic plasticity response of BCC metals33,34,43,44. Through observing a large number of examples during training, the GNN also learns to predict the motion of isolated dislocations, for which it predicts smooth stress-velocity relations in close agreement with existing theory, e.g., see Fig. 2b.

Of particular interest is that, once trained, our GNN mobility function encodes information about slip crystallography without any prior assumption about the planes in which dislocations should prefer to glide, a still contentious issue in BCC metal plasticity37. This is in contrast to previous work in which a fixed set of glide planes and a certain functional form of the dislocation mobility function are assumed a priori14,16. At present, when the physics of dislocation motion is not yet fully understood, imposing any such a priori constraints may prove detrimental to the fidelity of DDD simulations.

In our first application of the DDD+ML approach, we nevertheless observe discrepancies in DDD predictions compared to the ground-truth MD simulations. While under compression DDD and MD predictions for flow stress agree within statistical uncertainty, the flow stress under tension is slightly under-predicted in DDD. Yet the most notable discrepancy is found in the evolution of dislocation densities, which are under-predicted in DDD by about 50% both under tension and under compression. Additional work is required to establish what causes these discrepancies, but several sources appear plausible and worth further examination.

One plausible source of discrepancies in dislocation multiplication is that in DDD dislocation networks evolve not only via the motion of dislocation nodes or segments, but also through a sequence of topological transformations, e.g., dislocation collisions, junction zipping and unzipping, and the like. Although such events take place both in MD and DDD, DDD models are never really informed how these topological events take place in MD but rely instead on plausible yet-to-be-validated theoretical arguments such as maximum power dissipation7. Furthermore, in mapping dislocation line networks on a regular grid and comparing them on the basis of their continuous Nye's field representations, as was done in training the GNN mobility function, any topological events taking place between two consecutive DXA snapshots are integrated over and lost. Thus, strictly speaking, the GNN only learns how to advance the Nye's field on the grid while never learning to perform topological transformations in the dislocation line network the way they take place in MD. We suspect that, in not matching MD on topological events, DDD cannot match MD on dislocation multiplication. Exactly where and how dislocations multiply in a massive MD simulation remains an important open question for future studies.

We also note that the dislocation behaviors captured in our trained GNN mobility are naturally tied to the choice of IAP used to generate the ground-truth MD data. For instance, Finnis-Sinclair style potentials such as the one used to model W in this work38 are known to predict degenerate screw core structures44, which may affect predictions of slip crystallography in the ground-truth MD trajectories. Yet, we emphasize that our primary objective here is to develop a generic framework capable of accurately capturing details of dislocation motion as seen in the ground-truth data, agnostic to the quality of the ground-truth data itself. Should a more accurate IAP be employed for generating training data, our DDD+ML workflow should be able to learn the resulting dislocation motion equally well. It is also possible that advanced transfer-learning techniques may be used in the future to fine-tune GNN mobilities initially trained on “cheap” IAPs to learn dislocation behaviors from higher-fidelity IAPs, requiring only minimal additional training data. Furthermore, coupling of the model with a physics-informed neural network (PINN)45 approach may provide a path to improve the accuracy and generalizability of the proposed approach in the future.

Being a coarse-grained model, DDD is typically orders of magnitude faster to run than MD under identical conditions. Although requiring more calculations than traditional hand-crafted mobility functions, our trained GNN mobility function still comes at a negligible computational cost compared to force calculations. Considering the simulations reported in Fig. 3 (at 2 × 10⁸/s), an MD simulation requires ~10,000 CPU-hours (using a cheap EAM IAP), compared to ~10–20 CPU-hours needed to complete a DDD simulation performed under identical conditions. The speedup of DDD over MD is only going to grow larger with more sophisticated IAPs and decreasing deformation rates where, in addition to its considerably longer time steps, DDD remains efficient in much larger simulation volumes (e.g. several μm) that are presently out of reach for MD simulations.

Finally, as a first application we demonstrate our DDD+ML framework on BCC crystals in which dislocation cores are compact. To extend our approach to FCC and HCP crystals, where dislocations dissociate into partials, we envision two possible paths depending on the material. For crystals with high stacking-fault energy (SFE), a possible approach would be to constrain DXA to extract only complete dislocations, learn a mobility function on the resulting DXA snapshots, and use it in subsequent DDD simulations of the same material. For low-SFE crystals in which two partials may move far apart, pairing the partials into complete dislocations may not be physically justifiable or even possible. Here a more suitable approach would be to learn the mobility of the partial dislocations and use it in a DDD model resolving the partials46. Extension of the framework to more complex materials and material microstructures (e.g. irradiated materials, impurities, etc.) should also be possible in the future, e.g., by conditioning the local GNN mobility function on the presence of other types of crystal defects.

In summary, we propose a data-centered approach to constructing DDD mobility functions in which a GNN model is trained on large-scale MD data. Training is achieved by minimizing the DDD prediction loss for the next state of the dislocation network conditioned on mechanical forces computed for the previous state using standard DDD equations. To circumvent uncertainties in comparing generally non-isomorphic graphs, dislocation networks are mapped on a regular grid using a continuous Nye's tensor field representation. To test our new approach on the complex case of crystal plasticity in BCC tungsten, we ran DDD simulations using a GNN dislocation mobility function trained on ground-truth MD simulations. The resulting DDD predictions match the MD simulations used in training as well as MD simulations performed under previously unseen deformation conditions. We believe our data-driven approach to constructing dislocation mobility functions is a promising avenue for improving the prediction fidelity of the DDD method.

Methods

GNN mobility law

We model the mobility law in Eq. (4) with a message-passing GNN47, which has proved very powerful for predicting force fields48,49,50,51, defect-properties linkage52, fracture mechanics53, polycrystal properties54,55, and for simulating complex physics56.

Following our recent work on applying GNN to a simplified model of DDD simulations25, a dislocation configuration is represented by a graph \({\mathcal{G}}=({\mathcal{V}},{\mathcal{E}})\), where \({\mathcal{V}}=\{{N}_{i}\}\) is a collection of dislocation node/vertex features (attributes), and \({\mathcal{E}}=\{{E}_{ij}\}\) is a collection of dislocation segment/edge features. We define the input features for each node i as

$${N}_{i}=\left({n}_{i},{{\boldsymbol{F}}}^{{\rm{DDD}}},\parallel {{\boldsymbol{F}}}^{{\rm{DDD}}}\parallel \right)$$
(7)

where ni is a flag used to specify whether node i is a discretization or physical (junction) node. The input edge features on edge ij are

$${E}_{ij}=({e}_{ij},{{\boldsymbol{b}}}_{ij},\parallel {{\boldsymbol{b}}}_{ij}\parallel ,{{\boldsymbol{r}}}_{j}-{{\boldsymbol{r}}}_{i},\parallel {{\boldsymbol{r}}}_{j}-{{\boldsymbol{r}}}_{i}\parallel )$$
(8)

where eij is a flag used to specify whether segment ij is a glissile or a junction segment, and rj − ri are the local segment line vectors, naturally compatible with the use of periodic boundary conditions. To satisfy Burgers vector conservation, we use a directed graph, i.e. if ij is an edge with Burgers vector bij, then ji is also an edge but with the opposite Burgers vector − bij.
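
A minimal sketch of how these input features can be assembled for one DXA-extracted configuration is shown below; array names and shapes are illustrative, and periodic-boundary wrapping of the segment vectors is assumed to be handled upstream.

```python
import torch

def build_features(r, burgers, edge_index, node_flags, edge_flags, F_ddd):
    """Assemble node features N_i (Eq. (7)) and edge features E_ij (Eq. (8)).

    r          : (N, 3) nodal positions
    burgers    : (E, 3) Burgers vectors b_ij of the directed edges
    edge_index : (2, E) directed edges; each physical segment appears twice (ij and ji)
    node_flags : (N, 1) discretization vs. physical (junction) node flag n_i
    edge_flags : (E, 1) glissile vs. junction segment flag e_ij
    F_ddd      : (N, 3) nodal forces from Eq. (2)
    """
    N = torch.cat([node_flags, F_ddd, F_ddd.norm(dim=1, keepdim=True)], dim=1)
    i, j = edge_index
    dr = r[j] - r[i]                                  # local segment line vectors (PBC-wrapped upstream)
    E = torch.cat([edge_flags, burgers,
                   burgers.norm(dim=1, keepdim=True),
                   dr, dr.norm(dim=1, keepdim=True)], dim=1)
    return N, E
```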

Our GNN architecture follows ref. 57 and is first composed of vertex and edge encoders ENCV, ENCE transforming concatenated input features into a latent space

$${v}_{i}^{(0)}={\text{ENC}}^{V}({N}_{i}),\,{e}_{ij}^{(0)}={\text{ENC}}^{E}({E}_{ij}),$$
(9)

followed by K stacked message-passing layers f E(k), f V(k) (1 ≤ k ≤ K) sequentially updating the latent vertex and edge variables

$${e}_{ij}^{(k)}={f}^{E(k)}({e}_{ij}^{(k-1)},{v}_{i}^{(k-1)},{v}_{j}^{(k-1)}),$$
(10)
$${v}_{i}^{(k)}={f}^{V(k)}\left({v}_{i}^{(k-1)},\sum _{j}{e}_{ij}^{(k)}\right),$$
(11)

and finally a node decoder DEC that translates the latent node variables v(K) into the desired output properties, i.e. nodal velocity vectors:

$${{\boldsymbol{V}}}_{i}=\,\text{DEC}\,\left({v}_{i}^{(K)}\right).$$
(12)

Functions ENCV, ENCE, f V, f E, and DEC are neural network operators built from multi-layer perceptrons with two hidden layers, layer normalization58, skip connections59, and GELU activation functions60.
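
The sketch below illustrates the encode-process-decode structure of Eqs. (9)-(12) in PyTorch. Layer widths, the placement of normalization and skip connections, and other details are simplified relative to the architecture of ref. 57, so this is an illustration rather than the exact model used in this work.

```python
import torch
import torch.nn as nn

def mlp(din, dout, hidden=48, norm=True):
    """Two-hidden-layer MLP with GELU activations, optionally layer-normalized output."""
    layers = [nn.Linear(din, hidden), nn.GELU(),
              nn.Linear(hidden, hidden), nn.GELU(),
              nn.Linear(hidden, dout)]
    if norm:
        layers.append(nn.LayerNorm(dout))
    return nn.Sequential(*layers)

class MobilityGNN(nn.Module):
    """Encode-process-decode GNN of Eqs. (9)-(12); details simplified for illustration."""
    def __init__(self, dn, de, L=48, K=3):
        super().__init__()
        self.enc_v, self.enc_e = mlp(dn, L), mlp(de, L)
        self.f_e = nn.ModuleList([mlp(3 * L, L) for _ in range(K)])
        self.f_v = nn.ModuleList([mlp(2 * L, L) for _ in range(K)])
        self.dec = mlp(L, 3, norm=False)

    def forward(self, N, E, edge_index):
        i, j = edge_index                                   # directed edges i -> j
        v, e = self.enc_v(N), self.enc_e(E)                 # Eq. (9)
        for f_e, f_v in zip(self.f_e, self.f_v):
            e = e + f_e(torch.cat([e, v[i], v[j]], dim=1))  # Eq. (10), with a skip connection
            agg = torch.zeros_like(v).index_add_(0, i, e)   # sum_j e_ij for each node i
            v = v + f_v(torch.cat([v, agg], dim=1))         # Eq. (11), with a skip connection
        return self.dec(v)                                  # Eq. (12): nodal velocities V_i
```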

MD simulations

Large-scale MD simulations of BCC tungsten are performed with LAMMPS61 under the Kokkos GPU implementation62, using the EAM-style IAP developed in ref. 38. Simulations are performed following the protocol introduced in ref. 8. Periodic, orthorhombic BCC perfect crystals are initially seeded with twelve 1/2〈111〉{110} prismatic loops of the vacancy type. The crystals are first equilibrated at a temperature of 300 K, after which they are deformed at a constant true strain rate. Temperature and uniaxial loading conditions are maintained during deformation using a Langevin thermostat and an NPH barostat. For the ~35-million-atom simulations deformed at a rate of 2 × 10⁸/s, DXA27 is executed every 1 ps to save the detailed evolution of the dislocation networks.
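
DXA is available, for example, through the OVITO Python interface. The sketch below assumes post-processing of LAMMPS dump files with that interface; the file pattern and the exact extraction pipeline used in this work may differ.

```python
from ovito.io import import_file
from ovito.modifiers import DislocationAnalysisModifier

# Post-process MD snapshots with DXA (BCC reference lattice); the file pattern is an assumption.
pipeline = import_file("dump.W.*.gz")
pipeline.modifiers.append(DislocationAnalysisModifier(
    input_crystal_structure=DislocationAnalysisModifier.Lattice.BCC))

for frame in range(pipeline.source.num_frames):
    data = pipeline.compute(frame)
    for seg in data.dislocations.segments:
        b = seg.true_burgers_vector   # Burgers vector in lattice coordinates
        pts = seg.points              # polyline sampling the dislocation line
```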

Mobility workflow and training

DXA configurations produced in the MD simulations are first converted to the ParaDiS format7. For consistency, during this operation the line networks are also remeshed with a discretization size of ~10b, corresponding to the average segment size used in our DDD simulations.

The so-converted dislocation configurations \(\{{{\mathcal{G}}}^{{t}_{s}}\}\) are then fed to the ParaDiS code to compute nodal forces, Eq. (2). Applied forces \({{\boldsymbol{F}}}_{i}^{app}\) are computed by integrating the Peach-Koehler force \(({{\boldsymbol{\sigma }}}^{{t}_{s}}\cdot {{\boldsymbol{b}}}_{ij})\times {{\boldsymbol{t}}}_{ij}\) along the dislocation segments ij with unit tangents tij, where \({{\boldsymbol{\sigma }}}^{{t}_{s}}\) is the instantaneous stress applied to network \({{\mathcal{G}}}^{{t}_{s}}\) at time ts as recorded during the MD runs. Long-range interaction forces \({{\boldsymbol{F}}}_{i}^{lr}\) are computed using the DDD-FFT approach introduced in ref. 29, which can easily handle the non-cubic, deforming simulation boxes produced by MD simulations. Short-range interaction forces \({{\boldsymbol{F}}}_{i}^{sr}\) are computed using the non-singular isotropic analytical formulation63. Core forces \({{\boldsymbol{F}}}_{i}^{core}\) are computed from core energies extracted from the ground-truth IAP38 using the framework developed in ref. 42.
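
For illustration, the applied (Peach-Koehler) contribution can be sketched as follows, with each segment's integrated force lumped equally onto its two end nodes; the interaction and core terms are computed by ParaDiS as described above and are not sketched here.

```python
import numpy as np

def applied_nodal_forces(r, burgers, segments, sigma):
    """Applied part of the nodal forces, F_i^app, from the Peach-Koehler formula.

    r        : (N, 3) nodal positions
    burgers  : (S, 3) Burgers vectors, one per physical segment
    segments : (S, 2) node index pairs (i, j); each physical segment listed once
    sigma    : (3, 3) instantaneous applied stress recorded during the MD run
    Equal lumping of the segment force onto its end nodes is assumed for illustration.
    """
    F = np.zeros_like(r)
    for (i, j), b in zip(segments, burgers):
        dl = r[j] - r[i]                          # segment line vector (PBC-wrapped upstream)
        length = np.linalg.norm(dl)
        t = dl / length
        f_seg = np.cross(sigma @ b, t) * length   # integrated (sigma.b) x t over the segment
        F[i] += 0.5 * f_seg
        F[j] += 0.5 * f_seg
    return F
```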

The networks containing nodal forces are then used as inputs to our training procedure implemented within PyTorch30. To facilitate training, input forces are rescaled by a factor of 10⁹ Pa·b² so that their average magnitude is on the order of unity. The loss function, Eq. (5), is computed by evaluating the Nye's tensor on a grid of 32³ voxels using a fully vectorized implementation of the discrete-to-continuous method introduced in ref. 29.
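
To illustrate the quantity entering the loss, a crude (non-vectorized) rasterization of the Nye's tensor onto a voxel grid is sketched below using nearest-voxel binning of sub-segment contributions; the actual discrete-to-continuous method of ref. 29 used in this work is smoother and fully vectorized.

```python
import numpy as np

def nye_field_simple(r, burgers, segments, box, n=32, nsub=8):
    """Crude Nye tensor field alpha on an n^3 grid by nearest-voxel binning.

    box is the (cubic) box length; each physical segment (listed once) is split
    into `nsub` pieces and its b (x) dl contribution is deposited into the
    containing voxel, normalized by the voxel volume.
    """
    h = box / n
    voxel_vol = h ** 3
    alpha = np.zeros((n, n, n, 3, 3))
    for (i, j), b in zip(segments, burgers):
        dl = (r[j] - r[i]) / nsub
        for s in range(nsub):
            x = r[i] + (s + 0.5) * dl
            idx = tuple(np.floor(x / h).astype(int) % n)
            alpha[idx] += np.outer(b, dl) / voxel_vol
    return alpha
```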

We trained different GNN models to explore different sets of hyper-parameters. We tested combinations of models with K = {2, 3} message-passing layers, each with a latent space of size L = {48, 96}, leading to 4 different trained models. The models were trained using the AdamW optimizer64 with a weight decay of 1 × 10⁻⁵. Training was performed for 12 hours with a batch size of 4 on a single NVIDIA V100 GPU. We find that the GNN model with K = 3 and L = 48 offers the best trade-off between accuracy and complexity (78,768 total parameters) while showing no sign of overfitting. We thus selected it as the best model for the results presented in this work.
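
A minimal training loop consistent with the stated hyper-parameters (AdamW, weight decay 1 × 10⁻⁵, batch size 4) is sketched below, reusing the field-matching loss sketched earlier; the learning rate and number of epochs are placeholders not taken from this work.

```python
import torch

def train(model, loader, nye_field, epochs=100, lr=1e-3):
    """Minimal training loop; lr and epochs are illustrative placeholders."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=1e-5)
    for _ in range(epochs):
        for batch in loader:                                     # batches of 4 training examples
            opt.zero_grad()
            loss = field_matching_loss(model, batch, nye_field)  # Eq. (5)
            loss.backward()
            opt.step()
```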