Predicting orientation-dependent plastic susceptibility from static structure in amorphous solids via deep learning

Fan, Zhao; Ma, Evan

doi:10.1038/s41467-021-21806-z

Download PDF

Article
Open access
Published: 08 March 2021

Predicting orientation-dependent plastic susceptibility from static structure in amorphous solids via deep learning

Nature Communications volume 12, Article number: 1506 (2021) Cite this article

6161 Accesses
46 Citations
3 Altmetric
Metrics details

Subjects

Abstract

It has been a long-standing materials science challenge to establish structure-property relations in amorphous solids. Here we introduce a rotationally non-invariant local structure representation that enables different predictions for different loading orientations, which is found essential for high-fidelity prediction of the propensity for stress-driven shear transformations. This novel structure representation, when combined with convolutional neural network (CNN), a powerful deep learning algorithm, leads to unprecedented accuracy for identifying atoms with high propensity for shear transformations (i.e., plastic susceptibility), solely from the static structure in both two- and three-dimensional model glasses. The data-driven models trained on samples at one composition and a given processing history are found transferrable to glass samples with different processing histories or at different compositions in the same alloy system. Our analysis of the new structure representation also provides valuable insight into key atomic packing features that influence the local mechanical response and its anisotropy in glasses.

Learning to Predict Crystal Plasticity at the Nanoscale: Deep Residual Networks and Size Effects in Uniaxial Compression Discrete Dislocation Simulations

Article Open access 19 May 2020

Predicting the failure of two-dimensional silica glasses

Article Open access 20 May 2022

A general and transferable deep learning framework for predicting phase formation in materials

Article Open access 25 January 2021

Introduction

The mechanical response of a crystalline metal to applied stresses can be quantitatively explained by monitoring the evolution of dislocations^1,2 in the lattice. In stark contrast, amorphous metals do not have such well-defined structural defects, owing to the absence of long-range atomic packing order. As a result, in the static structure of a glass, it remains a grand challenge to identify a priori local “defects” that are vulnerable to rearrangements, even when the positions of all atoms are fully known^3,4,5,6,7. Over the years, a number of physical-property-informed parameters have been used to serve as indicators^{8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31} to forecast fertile sites for shear transformations. Also, several macroscopic and experimental parameters^{28,29,30,31,32} have been shown to correlate with the properties/processes in glasses. Data-driven models^33,34,35,36 have been put forward as well for the same goal. However, all these endeavors have only achieved moderate success: the correlations with the plastic susceptibility of particles/atoms, i.e., the propensity of each particle to experience plastic rearrangement, have not been sufficiently strong to establish structure–property relations in glassy solids.

A critical reason for this status quo is that the scalar or rotation-invariant quantities employed so far are inherently inadequate in capturing the anisotropic response of a local configuration. A given local environment of an atom (particle) can respond quite differently when the externally applied global force is imposed along different orientations^37,38,39,40. Recently, additional indicators have been invoked to take into account anisotropy in predicting the propensity for plastic activity in two-dimensional (2D) model glasses^38,39,40. But it remains unclear how well these new indicators fare in three-dimensional (3D) amorphous solids. Furthermore, these indicators are physics-informed quantities that need to be evaluated with known interparticle interactions (potentials), every time the structure changes. In other words, one cannot simply monitor the spatial atomic positions alone, to explain property changes. In terms of physical insight, it is difficult to use these parameters to directly identify the critical structure features (i.e., the atomic packing environment per se) responsible for the anisotropic local mechanical response. This has been a puzzle existing for at least more than two decades since Falk and Langer pointed it out in 1998³⁷.

Recently, deep convolutional neural networks (CNNs) have enabled revolutionary breakthroughs in the field of computer vision and pattern recognition^41,42,43,44. Inspired by such successes, here we explore the potential of this deep learning approach via converting local configurations into images and constructing large-scale training datasets (up to ~20 million images) from extensive atomic simulations. More importantly, we also devise a rotationally non-invariant structure representation, to capture the anisotropic responses of a local configuration and the intrinsic sensitivity to the loading orientation. We will demonstrate that this combination achieves unprecedented accuracy when it is used to separate out particles with high plastic susceptibility, unveiling the power of local static structure for predicting orientation-dependent plastic events in both 2D and 3D glasses. In terms of new physical insight, by identifying a clear difference in the new atomic structure representation for contrasting particles, i.e., those with high versus low plastic susceptibility, we have contributed, from the structural perspective, to fully resolving the puzzle as to why the same local configuration could respond very differently to external loading applied in different orientations.

Results

Atomic structure representation and model architecture

We have developed CNN models in both a 2D binary Lennard-Jones (L-J) glass system⁴⁵ and a 3D binary embedded-atom method (EAM)-based Cu–Zr metallic glass system⁴⁶. To represent the local packing environment of each particle i, we use a new multidimensional structural function, termed a Gaussian-weighted spatial density map (SDM). For the 2D model, the SDM of each particle i is defined as

$${{\Xi }}_i\left( {y,x,\beta } \right) = \mathop {\sum }\nolimits {\mathrm{exp}}\left( { - \frac{{\left( {r_{ij,x} - x} \right)^2 + \left( {r_{ij,y} - y} \right)^2}}{{2\Delta ^2}}} \right)$$

(1)

where the summation is performed over all particles satisfying these conditions: ${\mathrm{species}}\left( j \right) \in \beta ;| {r_{ij,x}} | < r_c$ and $| {r_{ij,y}} | < r_c$, $r_{ij,x}$ and $r_{ij,y}$ are the components along x and y dimensions, respectively, of ${\mathbf{r}}_{i,j}$, which is the vector connecting the central particle i with particle j of species $\beta$ within a cutoff $r_c$ (particle j may be the central particle when the species of the central particle is that under consideration in the channel). Here, $x,y \in \left[ { - r_c + 0.5\Delta ,r_c - 0.5\Delta } \right]$ with a constant increment of $\Delta$ and $\beta \in \left\{ {0,1} \right\}$ where 0 represents small (S) particles, and 1 large (L) ones. We set $r_c = 5\sigma$ and $\Delta = 0.2\sigma$ after trials using many different values (see Supplementary Fig. 1a and Supplementary Notes 1 and 2, and $\sigma$ is the characteristic length scale parameter in the potential, see details in “Methods”).

Similarly, for 3D glass systems, the SDM of each atom i is defined as

$${{\Xi }}_i\left( {z,y,x,\beta } \right) = \mathop {\sum }\nolimits {\mathrm{exp}}\left( { - \frac{{\left( {r_{ij,x} - x} \right)^2 + \left( {r_{ij,y} - y} \right)^2 + \left( {r_{ij,z} - z} \right)^2}}{{2\Delta ^2}}} \right)$$

(2)

here the summation is performed over all particles satisfying these conditions: ${\mathrm{species}}\left( j \right) \in \beta ;| {r_{ij,x}} | < r_c;| {r_{ij,y}} | < r_c$ and $| {r_{ij,z}} | < r_c$, and $\beta$ = 0 or 1, to represent Zr and Cu atoms, respectively. Using an SDM to represent a local configuration can be viewed as projecting the local configuration to 2D (or 3D) grids for different species, which results in a multidimensional numerical array, equivalent to a 2D (or 3D) image containing $\left( {2r_c/\Delta } \right)^2$ or $\left( {2r_c/\Delta } \right)^3$ pixels. And each pixel has channels equal to the number of components in the sample. These images can be used directly as input into the CNN model. For an adequately large $r_c$, a sufficient number of surrounding particles would be included. And, when $\Delta$ is small enough, any tiny variation of particle positions in the local configuration would lead to a corresponding variation in SDM. Therefore, the SDM can faithfully and accurately represent the topological structure feature of a local configuration. Obviously, when a larger $r_c$ is used, an image would contain structural information across a larger range, and a smaller $\Delta$ means a higher resolution of resultant images; the difference between two very similar local configurations would be more pronounced in the images. Although a larger $r_c$ and a smaller $\Delta$ are good for more faithful and accurate representation of the topological structure feature and elevated predictive performance, these also mean larger size of each image, requiring more computational resource and memory to generate, store, and use these images. In this work, we will choose optimized values of both $r_c$ and $\Delta$ for different glass systems after trying many different values. See more discussion about the optimization of both $r_c$ and $\Delta$ in Supplementary Notes 1 and 2. In addition, the contribution to the SDM from different species are separated into different channels, taking into account the influence arising from chemical affinity. Figure 1a shows a typical 2D model glass, with a close-up view (a local configuration) in Fig. 1b. The SDM of S and L particles corresponding to Fig. 1b are shown in Fig. 1c, d, respectively, where each image contains 50 × 50 pixels. The pixels with high intensity (darker color) correspond to the locations of particles. Obviously, the SDM will be different when the local configuration is rotated. Before computing SDMs, we will use different coordinate systems or rotate configurations for different loading orientations such that the shear direction and the normal direction of the shear plane are parallel to X and Y axes, respectively, of the chosen coordinate system (several specific examples are given in Supplementary Fig. 2 for choosing appropriate coordinate systems for different loading protocols). Therefore, the SDM is rotationally non-invariant, which is the key feature we need for predicting orientation-dependent plastic events in amorphous solids. As will be shown later, the rotation-dependent SDM is very sensitive to small changes in sample-loading alignment. The SDM can also be used to predict rotation-invariant properties of crystalline materials via data augmentation⁴¹, i.e., arbitrarily rotating images each time they are fed into the network. The SDM is a substantial extension of the radial Gaussian function pioneered by Behler and Parrinello previously⁴⁷, to multidimensional space. Our extension is important in the following aspects. First, it ensures a complete structure representation. Second, it enables rotationally non-invariant prediction. Third, it results in images that can be interpreted by CNN which promises state-of-the-art learning capability.

**Fig. 1: Atomic structure representation and model architecture for 2D glasses.**

We then applied the CNN method⁴⁴ to each input image to predict the propensity of each particle to experience plastic rearrangement, at different global shear strains when the sample is loaded along a specific orientation. See more details on the loading protocol in “Methods.” Note that the CNN model is taking one image per atom (each image represents the local atomic environment centered on the atom of interest), rather than taking an image of the entire simulation box. Figure 1e illustrates the architecture of the CNN model for the 2D glasses. Specifically, each CNN model contained 25 convolutional layers, where filters with a small receptive field of 3 × 3 were used. Zero padding was involved in all but not in the fifth and last convolutional layers. Batch normalization⁴⁸ was adopted right after each convolution and before activation with the rectification (ReLU) nonlinearity⁴¹. A max-pooling layer is periodically inserted in between the 25 successive convolutional layers. Max-pooling is performed over a 2 × 2 pixel window, with a stride of 2. The number of filters is 20 in the first five convolutional layers and then doubled after each max-pooling layer. And the last convolutional layer is directly followed by the output layer, which is a sigmoid neuron as we are performing binary classification tasks. In the classification tasks, each input image has a true positive label (y_i = 1) when the non-affine squared displacements (D²_min)³⁷ of the atom i represented by the image is greater than a specific threshold after straining to a given degree under a specific loading condition, and a negative label otherwise (y_i = 0). More details about the D²_min threshold and the loading protocol will be presented later.

For 3D glasses, r_c = 9.6 Å and Δ = 0.6 Å were found to produce optimized results (see Supplementary Fig. 1b, Supplementary Notes 1 and 2), and the resultant input images contain 32 $\times$ 32 $\times$ 32 pixels. The corresponding CNN model contains 20 convolutional layers and three max-pooling layers, and both convolution and max-pooling were performed in 3D space, as illustrated in Supplementary Fig. 3. The number of filters in each 3D convolutional layer was 60. To accelerate training and avoid shortage of memory, we did not double the filter number after max-pooling. This is not expected to influence significantly the training results, based on our experience with the 2D glasses for which doubling the filter number increased the validation accuracy by less than 1%. For tutorials about deep learning, the readers are referred to refs. ^41,42,43,44.

Structural insight into the anisotropic local mechanical response

In the literature, free volume^8,49 has been commonly used as a (macroscopic) structural parameter. However, it has been known for a long time that the correlation between local excess volume and dynamics in glasses is rather weak^10,17. We will further confirm later that there is indeed no obvious correlation between the plastic susceptibility and local volume. There have also been recent observations that atoms with fewer neighboring particles in the nearest-neighbor shell, but more in the troughs in between the shells, would be more flexible^33,35. However, this does not explain why the same local configuration responds differently to applied global force along different directions. Now that we have converted local configurations into images, let us first examine, from this perspective, if we could identify a critical structure feature responsible for the anisotropic mechanical response. To this end, we seek valuable insight by contrasting the SDMs of particles with extremely high plastic propensity (i.e., the tail end of the high-susceptibility side) versus those on the other end.

We first sheared 5000 2D glass samples to a shear strain of 3.0% using the athermal quasi static (AQS) method^50,51, such that we can identify all of the particles with extreme (the highest or the lowest 0.5%) D²_min, for each of the four different loading protocols (positive and negative simple shear and positive and negative pure shear). In positive (negative) simple shear, the applied strain increment is $\Delta \gamma _{xy} = \Delta \gamma$ ($\Delta \gamma _{xy} = - \Delta \gamma$) and in positive (negative) pure shear, the applied strain increments are $\Delta \varepsilon _{xx} = - \Delta \varepsilon _{yy} = - \Delta \gamma /2$ ($\Delta \varepsilon _{xx} = - \Delta \varepsilon _{yy} = \Delta \gamma /2$), see details in Supplementary Fig. 2 and “Methods”). In this work, following the pioneering work by Falk and Langer³⁷ as well as recent machine-learning attempts^33,34, we assume that an atom has a high plastic susceptibility if it shows a large D²_min (the position of the atom is then a fertile site for shear transformations). The plastic susceptibility is a scalar quantity, but exhibits a different value each time the sample is globally loaded along a different orientation. We then went back to the original configurations before deformation and calculated the SDMs for these two groups of extreme particles that we want to compare. For each group an average SDM was obtained, representing the averaged environment of a representative particle. In the main text, we show and discuss the difference, in terms of the spatial distribution of surrounding S particles, between central S particles having the highest versus the lowest plastic susceptibility. Other cases, including the distribution of L particles around central S, and the distribution of both S and L particles around central L, are displayed in Supplementary Figs. 4–6 only, as they show features similar to that observed for the distribution of S particles around the central S particle. In 3D Cu–Zr glasses, Cu and Zr species correspond to S and L particles, respectively.

It is interesting to observe from Fig. 2a–h that for a particle of highest 0.5% D²_min (see Fig. 2a–d), hereafter referred to as “fertile site,” its neighboring shells are more diffuse (wider), and the intensity in the same shell is nonuniform. For the other extreme, the lowest D²_min particle (shown in Fig. 2e–h), each shell is sharper with more uniform intensity. In other words, the images do reveal structural differences between particles with high versus low plastic susceptibility. This difference in local environment can be better appreciated by subtracting the maps in Fig. 2a–d with those in Fig. 2e–h. As can be seen from the resultant difference in the SDMs, Fig. 2i–l, for “fertile sites” the surrounding shells are ellipses. The long axis is parallel to the macroscopic elongation direction ${\boldsymbol{\zeta}}$—the white dashed line in Fig. 2i–l, which is the sum of the unit vector $\hat {\boldsymbol{\iota}}$ (effective shear direction) and ${\hat {{{\boldsymbol{\eta}}}}}$ (normal to the effective shear plane), and the short axis is perpendicular to ${\boldsymbol{\zeta}}$. In contrast, for particles on the other end of the plastic susceptibility spectrum, their neighboring shells are circles. A schematic highlighting our observation in Fig. 2i–l is included in Supplementary Fig. 7, to better illustrate the critical structural difference.

**Fig. 2: Structural insight into the anisotropic local mechanical response.**

To further understand the structural difference, we also used an orientational pair correlation function, $g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)$, to characterize the local atomic packing environment. The $g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)$ for a single particle i is defined as

$$g_{{\boldsymbol{\zeta}} ,i}\left( {r,\theta } \right)_\beta \,= \frac{1}{{\kappa \rho f_\beta }}\mathop {\sum }\limits_{j = 1}^{N_\beta } \left[ {\delta \left( {r - r_{ij}} \right)\delta \left( {\theta - \theta _{{\boldsymbol{\zeta}} ,{\mathbf{{r}}}_{ij}}} \right)} \right]$$

(3)

where $\theta _{{\boldsymbol{\zeta}} ,{\mathbf{r}}_{ij}} = {\mathrm{arccos}}\left( {| {\hat {\mathbf{{r}}}_{ij}\cdot \hat {\boldsymbol{\zeta}} } |} \right)$ is the angle between the vectors ${\boldsymbol{\zeta}}$ and ${\mathbf{r}}_{ij}$ and has its values in the range of [0, $\pi /2$], ${r_{ij}} = | {{\mathbf{r}}_{ij}} |$, $\rho$ is the number density of atoms, and $f_\beta$ is the composition fraction of species $\beta$ in a sample. For 2D samples, $\kappa = 4r\Delta r\Delta \theta$, and for 3D samples, $\kappa = 4\pi r^2\Delta r\left[ {\cos \left( {\theta - 0.5\Delta \theta } \right) - \cos \left( {\theta + 0.5\Delta \theta } \right)} \right]$. The $g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)$ measures the line density of particles/atoms in the region with radial distance of $r$ from the central atom i, oriented relative to elongation direction ${\boldsymbol{\zeta}}$ at angle $\theta$ (this can be compared with the usual pair correlation function, which can be regarded as the average of orientational correlations over all $\theta$). Again, our goal is to probe the local environment of particles identified to exhibit extreme D²_min in 3.0%-strained 2D samples. In order to accomplish this, we went back to the original unsheared sample to calculate the $g_{{\boldsymbol{\zeta}} ,i}\left( {r,\theta } \right)$ for these particles of interest. This curve is then averaged over all such S (or L) particles from 5000 different samples for a given protocol (each sample was sheared four times, with the different protocols). The $g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)$ plots are shown in Supplementary Fig. 8 for S particles and Supplementary Fig. 9 for L particles. The $g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)$ curves obtained for particles identified in the various loading protocols show little difference, e.g., (a–d) in the Supplementary Fig. 8 are very similar to one another; they were hence averaged to the one shown in Fig. 2m.

As seen from Fig. 2m, for particles with low plastic susceptibility, the gray curves of $g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)$ for different $\theta$ almost overlap on top of each other, which is consistent with the observation that around each of such particles the neighbors tend to form circularly symmetric shells. In stark contrast, for fertile-site particles, the curves of $g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)$ at different $\theta$ are quite different. Specifically, their neighboring particles with $\theta$ close to 0 have longer bond lengths, whereas those with $\theta$ closer to $\pi /2$ have shorter bond lengths, see Fig. 2m. Note that other peaks at larger radial distance behave similarly, so only the first peaks are compared in the figures. We carried out similar $g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)$ analysis for 3D Cu₅₀Zr₅₀ model glasses. As shown in Fig. 2n, the structural features are similar to those in 2D glasses.

The analysis based on $g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)$ unveils that a local configuration would be vulnerable to plastic rearrangement when its longest axis of the elliptical neighboring shells is parallel to the elongation direction of global loading. This identifies the softest direction of a 2D local configuration. But for a 3D local configuration under this same condition, there is still one more degree of freedom: the local configuration can rotate arbitrarily around the elongation direction. Additional conditions are therefore needed to identify the softest direction of 3D local configurations, to take into account this angle of rotation. To this end, we employ another orientational pair correlation function, $g_{{\boldsymbol{\zeta}} ,i}\left( {r,\varphi } \right)$

$$g_{{\boldsymbol{\zeta}} ,i}\left( {r,\varphi } \right)_\beta = \frac{1}{{\kappa \rho f_\beta }}\mathop {\sum }\limits_{j = 1}^{N_\beta } \left[ {\delta \left( {r - r_{ij}} \right)\delta \left( {\varphi - \varphi _j} \right)} \right]$$

(4)

Here, we shift the elongation vector ${\boldsymbol{\zeta}}$ such that it goes through the central atom i of a local configuration and then define a new vector ${\boldsymbol{\chi}} _j$ for each surrounding atom j of the local configuration. The ${\boldsymbol{\chi}} _j$ is perpendicular to ${\boldsymbol{\zeta}}$ and passes through the surrounding atom j of interest. Then, $\varphi _j$ is the angle between ${\boldsymbol{\chi}} _j$ and the plane (${\boldsymbol{\iota}} \times {\boldsymbol{\eta}}$) defined by the shear direction vector (${\boldsymbol{\iota}}$) and the normal vector of shear plane (${\boldsymbol{\eta}}$), for each surrounding atom j. Thus, the $g_{{\boldsymbol{\zeta}} ,i}\left( {r,\varphi } \right)_\beta$ depicts neighboring shells which pass through the vector ${\boldsymbol{\zeta}}$ but with different $\varphi$ relative to the plane of ${\boldsymbol{\iota}} \times {\boldsymbol{\eta}}$. When $\varphi = 0$, the shells are in the plane of ${\boldsymbol{\iota}} \times {\boldsymbol{\eta}}$, and when $\varphi = \pi /2$, they are perpendicular to the plane of ${\boldsymbol{\iota}} \times {\boldsymbol{\eta}}$. Here, $\kappa = 8r^2\Delta r\Delta \varphi$, different from that for $g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)$ in Eq. (3).

For the 3D glasses, Fig. 2o contrasts the partial $g_{{\boldsymbol{\zeta}} ,i}\left( {r,\varphi } \right)$ of the Cu atoms having the lowest 0.5% D²_min with those having the 0.5% highest for Cu–Cu. These extreme particles were identified in the 1500 Cu₅₀Zr₅₀ glasses deformed to 5% strain. For the lowest D²_min, all partial $g_{{\boldsymbol{\zeta}} ,i}\left( {r,\varphi } \right)$ at different $\varphi$ also overlap with one another, in conjunction with the overlapping partial $g_{{\boldsymbol{\zeta}} ,i}\left( {r,\theta } \right)$ at different $\theta$ shown in Fig. 2n. These unequivocally confirm that the neighboring shells of particles with lowest plastic susceptibility exhibit spherical symmetry. In contrast, for the fertile sites with the highest D²_min, the surrounding particles inside the plane of ${\boldsymbol{\iota}} \times {\boldsymbol{\eta}}$ tend to have shorter bond lengths, as seen in all the $g_{{\boldsymbol{\zeta}} ,i}\left( {r,\varphi } \right)$ at small $\varphi$. Combining this with the insight from Fig. 2n, a 3D local configuration is expected to emerge as a fertile site when (i) the degree of asymmetry of its atomic packing is sufficiently large; (ii) its longest axis is parallel to the elongation direction of the global loading; and (iii) its orientation around the elongation direction axis is in such a way that the neighboring shell inside the plane of ${\boldsymbol{\iota}} \times {\boldsymbol{\eta}}$ coincides with the ellipse having the largest aspect ratio. Putting it another way, in the plane of ${\boldsymbol{\iota}} \times {\boldsymbol{\eta}}$, the larger the ratio of the bond length parallel to the elongation direction to that parallel to the contraction direction (the elongation and contraction directions are perpendicular to each other), the higher the susceptibility to plastic rearrangement.

We emphasize here that these features above provide new insight into the difference in local structure, between particles with high versus low plastic susceptibility. To recapitulate, in both the 2D and 3D glasses, for particles with low plastic susceptibility, their neighbors tend to form circular shells with almost uniform intensity (peak height) at different $\theta$ and $\varphi$. In other words, the bond lengths are very similar for each shell, in which the neighbors uniformly distribute. In contrast, packing tends to be more asymmetric surrounding fertile sites, the neighboring shells are ellipses, each with non-uniform particle distribution. For a fertile site to be activated upon a particular mechanical loading, the longest and shortest axes of neighboring shells should be parallel to the elongation and contraction directions, respectively. In other words, a fertile site is the most prone to be activated when its surrounding packing along the elongation direction is the loosest and packing along the contraction direction is the densest, as illustrated by the schematic in Supplementary Fig. 7. This is the structural origin responsible for the anisotropy of local mechanical response in amorphous solids.

With these insights in mind, we expect that the mechanical response of a local configuration will not show much difference when one mirrors the local configuration around the plane perpendicular to the plane of ${\boldsymbol{\iota}} \times {\boldsymbol{\eta}}$ and containing the vector ${\boldsymbol{\zeta}}$ or the vector perpendicular to ${\boldsymbol{\zeta}}$ or the plane of ${\boldsymbol{\iota}} \times {\boldsymbol{\eta}}$ (see details on these mirror operations in Supplementary Fig. 10), as these mirrors will not change the packing density of local configurations along the elongation or contraction direction. To verify this, we loaded three such mirrored configurations of a 2D L-J sample and seven mirrored Cu₅₀Zr₅₀ configurations along a constant loading orientation. As expected, D²_min of all particles in those mirrored samples are indeed almost identical to those in the corresponding original samples, when the strain is well below global yielding, as shown in Supplementary Fig. 11. In training our CNN model (next section), we will augment our training data with these mirror symmetries. This was found to improve accuracy by ~2%, lending further support to our insight above.

Training procedure and results

The insight from Fig. 2 discussed in the preceding section is important for our understanding of the structural origin of the anisotropy, as it provides a general picture of the main structural features/difference. But our trials found it difficult to translate the new insight into a simple hand-crafted indicator that can predict plastic susceptibility with high accuracy. This underscores the need to take into account all the subtle but nontrivial structural features, when a concrete decision is to be made regarding whether a particular particle has high (or low) susceptibility. As will be demonstrated in the work that follows, deep learning is well suited for this purpose, in terms of providing adequate predictive power. Furthermore, the consistency between the CNN prediction with our expectation based on the insight in Fig. 2 will also be confirmed at the end of this subsection. To construct big datasets for the CNN models, we prepared 2D L-J (or 3D Cu₅₀Zr₅₀) glasses with constant a cooling time of 10⁶ t₀ (or effective quench rate of 10¹⁰ K/s) from the corresponding well-equilibrium liquid. A total of 5000 (1500) samples were made, each containing 10,000 (32,000) particles. Each sample was then deformed under 4 (24) different loading protocols (see more details about sample preparation and deformation in “Methods”). These 2D (3D) glass models were divided into training, validation, and test datasets, each containing 4900 (1480), 50 (10), and 50 (10) samples, respectively. Since we are interested in identifying particles which will experience extreme (large) plastic rearrangement upon loading, the first step is to do a binary classification task. We will explore the relations between initial static structure and plastic responses at different degrees of strain before the yielding point. Thus, we constructed different datasets for the plastic responses at different degrees of strain. For a specific strain, a particle i is labeled y_i = 1 if its D²_min is higher than a specific threshold and y_i = 0 otherwise. Here, we use f_thres to set the D²_min threshold at a given shear strain. For example, f_thres = 0.5% means that we label particles in the top 0.5% D²_min group as y_i = 1 and the remainder y_i = 0. The shear stress–strain curves and cumulative distribution function (CDF) of D²_min are shown in Supplementary Figs. 12 and 13, respectively, for the 2D and 3D glasses. Our main purpose is to reveal how accurate it can be when identifying atoms with large plastic susceptibility a priori from initial static structure in glasses. We therefore choose f_thres = 0.5% to show the predictive performance (to ensure sufficiently large datasets, f_thres cannot be too small), after trying many different values of f_thres. Actually, the advantage of the CNN over other machine-learning methods is more obvious for larger f_thres, see more discussions and results in Supplementary Note 3 and Supplementary Fig. 14. Our training, validation, and test datasets were balanced, as in each dataset we included all the y_i = 1 particles in each sample deformed under each specific deformation protocol, with the same number of y_i = 0 particles selected at random from the rest. In contrast to previous works^33,36, we will train a single model for all species in a glass system to improve the user friendliness of machine-learning models; this does not sacrifice much predictive power compared to training different models for different species, as verified in Supplementary Fig. 15 and related discussion in Supplementary Note 4. Note again that the SDM was constructed for particles in the initial undeformed configurations; it is just that these particles were selected based on their D²_min response upon straining.

The main interest in the glass community is to see how effective it could be when it comes to separating out fertile sites, i.e., atoms prone to large plastic rearrangement upon deformation. Classification suffices for this purpose. In performing binary classification tasks, the most commonly used metric to measure the predictive performance is “accuracy,” defined as the fraction of the number of images (local configurations) predicted correctly divided by the total number of images in a dataset. The image is predicted correctly, when the actual label y_i (1 or 0, determined by the value of D²_min from simulations) of an image is identical to the label predicted by the machine-learning model. Since each of our datasets is balanced, i.e., the number of images with actual y_i = 0 is equal to that of images with actual y_i = 1, the lowest value of accuracy would therefore be 50%. We confirmed that, for each trained model, the validation accuracy and test accuracy are practically the same, suggesting that the trained models did not bias our validation datasets despite the hyperparameters involved, such as CNN architecture, $r_c$ and Δ. We therefore only report the validation accuracy (or simply “accuracy”) in the following.

Figure 3a, b, for 2D and 3D glasses, respectively, shows the predictive accuracy achieved on validation datasets with our CNN method, the graph neural network (GNN)³⁶ method, and the linear support vector machine (SVM)³³ method, when differentiating particles/atoms with y_i = 1 from those with y_i = 0 at different strains with a constant f_thres = 0.5%. As seen from Fig. 3a for the 2D case, CNN exceeds SVM by far in performance, over the entire strain range we considered; the highest accuracy achieved was as high as 96.27%. This accuracy is very close to the ceiling (see “Methods” for details about the upper bound). Several advantages of CNN make it obviously superior to SVM: (i) CNN can capture more complicated relations between static structure and plastic activity compared to linear SVM, as the latter is merely a linear machine-learning algorithm; (ii) SVM loses some subtle but critical structural information during the conversion from atomic positions to the input features; and (iii) most importantly, for a given local configuration in different loading directions, we constructed images with different coordinate systems (in which the X axis is parallel to the shear direction (${\boldsymbol{\iota}}$) and the Y axis is the normal direction of the shear plane (${\boldsymbol{\eta}}$)). This enabled orientation-dependent predictions, whereas the input features to SVM are rotation-invariant, leading always to the same prediction. We observe that GNN produces slightly lower accuracy compared to the accuracy of our CNN when f_thres = 0.5% for 2D glasses, as shown in Fig. 3a. This is not surprising, as GNN used atomic positions directly to construct graphs as input, and in training GNN models we rotated edge vectors for different loading directions to take into account the orientation dependency of local plastic response, in contrast to the original procedure from ref. ³⁶ (see details in “Methods”). The advantage of CNN over GNN is more obvious for larger f_thres, see Supplementary Fig. 14, and for all cases in 3D glasses (see discussion in next paragraph). In addition, the graph input to GNN cannot be used to provide the insights gained from Fig. 2 based on images. Supplementary Fig. 16 shows the particle-level prediction results of a given 2D glass under different loading protocols for the three machine-learning methods (see more discussion in Supplementary Note 5).

**Fig. 3: Predictive power enabled through deep learning.**

When dealing with 3D Cu₅₀Zr₅₀ glasses, the GNN method is also superior to SVM, consistent with the observations in ref. ³⁶. However, we found that our CNN, which has stronger learning capability, is far more powerful than GNN, as shown in Fig. 3b, over the entire strain range. This is because the relation between structure and plastic response in 3D glasses is much more complex than that in 2D glasses. The highest accuracy achieved with CNN is above 90%, in the strain range from 3 to 6%. At small strains, only some of the initial “defects” have been activated and thus D²_min cannot reflect the system plastic response very well; on the other hand, at large strains close to global yielding, the role played by initial structural state becomes less dominant. As a result, the correlation between the initial structure and the plastic response peaks at an intermediate value of shear strain. Even at a strain of 10% where global yielding occurs (see Supplementary Fig. 12b), the accuracy is still >83%. This suggests our CNN model may be useful in predicting the location where shear band initiates. This point is beyond the scope of the current work and will be discussed elsewhere. We believe that the CNN method can achieve higher accuracy, especially for 3D glasses, if we use deeper networks. But it will require more computational resources, see more discussion in Supplementary Note 1. Our main purpose here is to demonstrate that the CNN is indeed a powerful method to meet the challenge we are dealing with. To benchmark the advantages of the CNN, a comparison is made against the reference—the fully connected neural networks, as shown in Supplementary Fig. 17.

One would be naturally curious as to how well our CNN models fare, when compared with the prediction based on some previously published physical indicators, such as the participation ratio in soft modes (SMs)^17,18, flexibility volume (v_flex)¹⁹, and atomic volume ($\Omega _a$). For binary classification tasks, the lowest accuracy of any machine-learning method is always 50%. But if the accuracy is used as the metric to assess the predictive performance of those physical parameters, it would drop below 50%. To avoid such an unfair comparison, we have chosen the following route. For the three machine-learning models, we use the class probability for CNN and GNN, and the distance to the separation boundary predicted by SVM, to denote the predicted plastic susceptibility of each particle in the samples constructed for the test dataset but never involved in the training dataset. Take CNN as an example, the class probability for each input image is a numerical value in the range of (0, 1) provided by the learning algorithm, to represent the probability of having a label of y_i = 1 for the input object. For binary classifications, class probability greater than 0.5 predicts y_i = 1 and y_i = 0 otherwise. For the three physical indicators, the value of each of these indicators is used directly to denote the predicted plastic susceptibility of each particle. Similar to ref. ²², we define a cumulative rank correlation (C_rank) to compare the power of these six routes in predicting the particles which will have top 0.5% D²_min when a sample is sheared to different strains in a given orientation. As the indicator based on each of the six methods are expected to correlate positively with the plastic response, i.e., larger indicator means larger D²_min, here we define

$$C_{{\mathrm{rank}}} = 2\overline {{\mathrm{CDF}}} \left( {\rho _i} \right) - 1$$

(5)

where $\rho _i$ is the plastic susceptibility of particle i predicted via one of the six routes, ${\mathrm{CDF}}\left( {\rho _i} \right)$ is the cumulative distribution function value for $\rho _i$, and the bar on top represents the averaging of ${\mathrm{CDF}}\left( {\rho _i} \right)$ over all particles with top 0.5% D²_min. The highest (lowest) value of $C_{{\mathrm{rank}}}$is around 0.995 (−0.995), which means perfect correlation (anticorrelation), and $C_{{\mathrm{rank}}} = 0$ means no correlation.

As seen in Fig. 3c, d, the $C_{{\mathrm{rank}}}$ achieved with our CNN method is almost equal to its highest possible value over a wide range of strain for both the 2D and 3D glasses. The worst case is $\Omega _a$ for which $C_{{\mathrm{rank}}}$ is always close to zero across the board, for either the 2D or 3D glasses. For the three data-driven methods, their relative predictive power (the magnitude of $C_{{\mathrm{rank}}}$) follows an order similar to that observed in Fig. 3a, b. The CNN prediction stands out to be the best. Note that the predictive power generally decreases with increasing strain for physical indicators. In this regard, our CNN has a major advantage over other methods at larger strains, which is important for monitoring the correlation between structure and mechanical response of amorphous solids. As SM analysis on large 3D samples is too computationally expensive, we used instead Cu₅₀Zr₅₀ glasses containing 10,000 atoms. Such smaller samples were also analyzed via CNN, to allow head-to-head comparison (included in Fig. 3d). The consistency in $C_{{\mathrm{rank}}}$ obtained with our CNN models on smaller and larger samples (red circles almost overlap with white pentagons in Fig. 3d) indicates the robustness of our CNN models with no sample-size dependence. This is not a big surprise, as the r_c we used is sufficiently large, and so are our training samples.

In addition, we also used a different and extremely stringent criterion, an overlap ratio, ${\Re}$, to gauge the fraction of correctly predicted particles. ${\Re}$ is evaluated by dividing the number of particles belonging simultaneously to two groups that are intersecting (partially overlapping), with the number of particles known to have the top 0.5% D²_min. Here, the intersection/overlapping particles are those that fall into not only the group known to have the top 0.5% D²_min, but at the same time also the group having the top 0.5% plastic susceptibility predicted using that particular method/indicator. In the ideal case, the particles predicted to have the top 0.5% plastic susceptibility are expected to be the same particles known to have the top 0.5% D²_min (i.e., all the particles in that pool). As such, the closer to 1.0 the ${\Re}$ ratio, the higher the predictive capability. The corresponding results are displayed in Supplementary Fig. 18, which shows a similar trend as in Fig. 3c, d. This further confirms the advantage of our CNN method as the most robust of the six, disregarding the metric adopted to characterize the strength of correlation. Under this extremely stringent criterion, the advantage of CNN over GNN is evident on the 2D glasses, see Supplementary Fig. 18a.

To demonstrate that the orientation dependence has been clearly captured, we show in Fig. 4 the CNN-predicted plastic susceptibility for a 2D local configuration when it is rotated gradually: the rotationally non-invariant SDM turns out to be highly sensitive to minor orientation change. In its initial orientation ($\theta = 0^\circ$), when its loosest and densest packing directions align with the elongation (red dashed line) and contraction (black dashed line) directions of mechanical loading, respectively, CNN predicts the highest plastic susceptibility (class probability very close to 1.0). With the counter-clockwise rotation, these packing directions gradually misalign with the loading ones, and the predicted plastic susceptibility goes down. At $\theta \cong 90^\circ$, the loosest packing of the configuration lines up with the contraction direction whereas the densest packing with the elongation direction, class probability drops to a value very close to zero. When $\theta$ increases to ~180°, the favorable alignment comes back, such that the plastic susceptibility is predicted to be the highest again. Likewise, the lowest reemerges when $\theta = 270^\circ$. This sensitivity to the coupling between the anisotropic local structure and the loading direction demonstrates that, when we feed the images (configurations) into CNN, the algorithm is very good at recognizing this coupling and churning out predictions consistent with the insight in Fig. 2. The blue dotted line in the polar plot of Fig. 4 represents D²_min of the local configuration at different θ (see “Methods”), which almost matches the CNN prediction. This underscores the benefit and advance enabled by our approach to innovate a new structure representation and combine it with CNN.

**Fig. 4: The orientation dependence of CNN-predicted plastic susceptibility for a 2D local configuration.**

Generalize to different processing history or compositions

Our previous work³⁵ demonstrated that the SVM models, trained on samples having the same processing history at a single composition, can be generalized to samples with different processing histories or at different compositions in the same alloy system. To see if such a generalization is possible with our CNN models, 50 2D samples were quenched over different time periods in the range from 10⁰ to 10⁶ t₀. For 3D glasses, we also quenched 10 Cu₅₀Zr₅₀ samples with effective cooling rates in the range from 10⁹ to 10¹³ K/s, and 10 Cu_xZr_100-x glasses with different compositions (x = 20, 25, …, 80) at the same effective cooling rate of 10¹⁰ K/s. Each of these 2D (3D) samples was sheared using 4 (24) different loading protocols with the AQS method. We then constructed test datasets for each processing history and each composition, by selecting all of the atoms with D²_min above a threshold and a same number of atoms randomly from the rest. The accuracy achieved on these datasets, using the CNN model previously trained at a single composition for a particular processing history (see caption and “Methods”), is presented in Fig. 5a–c. We find a very high accuracy across a wide composition range (Fig. 5c), especially for samples with slower cooling (Fig. 5a, b). For example, for 3D Cu₅₀Zr₅₀ quenched at cooling rate of 10⁹ K/s, the highest accuracy achieved is 92.40%. The decrease in accuracy for faster cooling rate was observed before^22,35,38,52, as the critical structural difference becomes increasingly difficult to distinguish³⁸. The more relaxed glasses, on the other hand, have a much smaller population of fertile sites that exhibit larger shear susceptibility, making the recognition easier and prediction more robust.

**Fig. 5: The ability of CNN models to generalize.**

This transferability of the model offers the desirable applicability to glasses across different compositions and/or different processing histories, to compare plastic susceptibility. An example is shown in Supplementary Fig. 19a (for 2D) and b (for 3D). Here, the glass structure is generically characterized by the fraction of predicted “y_i = 1” particles out of all particles. This fraction, f_py=1 (the subscript means “predicted (labeled) y = 1”), is given by the fraction of particles with class probability > 0.5, and scales almost linearly with the average class probability. These plots reveal a strong correlation between the glass structure and shear modulus (G), found to be robust at any strain as long as it is well below that corresponding to global yielding.

A possible reason for the transferability is that our training datasets, although from one composition and the same processing history, are large enough to include almost all of the possible local environments for samples with different processing histories and/or at different compositions within the same alloy system, as suggested in our previous work³⁵. Achieving transferability across different alloy systems while keeping high predictive capability requires more exploration but it is beyond the scope of the current work.

Discussion

Our results presented in this paper bring about several advances over previous efforts to identify structural “defects” in amorphous solids, i.e., particles that are most prone to rearrangements, in particular shear transformations. First of all, a qualitative but crucial structural difference, in terms of a more non-uniform and asymmetric packing environment, was discovered for atoms with high plastic susceptibility relative to those with low plastic susceptibility. This difference is found to be the origin of the anisotropic local mechanical response in amorphous solids, contributing to the full resolution of this long-standing puzzle. Our expectation based on this insight, starting from Fig. 2, is reflected in the eventual CNN predictions, as highlighted in Fig. 4. This new insight goes beyond that obtained based on the previous concept of geometrically unfavored motifs (GUMs)¹⁸. The GUMs refer to coordination polyhedra, an elementary but incomplete structural representation, especially since the local property is known to be influenced by the surroundings beyond the first and second nearest neighbors. In addition, it is difficult to identify the softest shear direction merely based on the local GUMs. The insight presented in this work may be assessed from experiments if the orientational pair correlation functions can be mapped out in future experiments. Second, we achieved unprecedented accuracy in predicting orientation-dependent local mechanical response over a wide range of shear strain, by designing a rotationally non-invariant structure representation coupled with a powerful deep learning algorithm, as highlighted in Fig. 3. This accomplishes what is left unsolved in recent attempts invoking vectors, symmetry, and orientational order^53,54,55, which only consider some aspects of the topological information in the first nearest-neighbor shell and lack the capacity to handle the coupling with the direction of the imposed shear. Here, we have used the SDM and the orientational pair correlation functions to cover the entire local environment of an atom—the full gene of the glass structure beyond the first neighboring shell. Because our representation is no longer rotation-invariant, our approach unveils the predictive power of the static structure of amorphous solids in forecasting direction-dependent local plastic response. We are aware that factors beyond static structures, including short-term dynamics, are also relevant to relaxation responses at finite temperatures. However, there has been a pressing need to gauge how much predictive power can be garnered, from the initial static structure alone. This is a question that the community wants a clear answer for, as the static structure is usually the information that becomes available first. Third, once the optimized CNN model is in hand, all that is needed to predict plastic response is atomic positions, without relying on other knowledge such as interparticle interactions required by previous approaches using parameters based on physical properties. Fourth, we have demonstrated that the CNN models trained on a single glass (with one set of a particular composition and a specific processing history) can be generalized to samples with different processing histories or at different compositions in the same alloy system. This is important for probing into the effects of processing procedure or chemical composition on properties, enabling the comparison between different glasses. These four merits open new avenues to the understanding of the structure–property relations in amorphous solids. Fifth and finally, we anticipate that our novel structure representation (SDM), in combination with the powerful CNN method, would find use in the studies of other amorphous matter, beyond metallic glasses, as well as in predicting rotation-invariant properties of crystalline materials, via data augmentation⁴¹.

Methods

The data used for training and validating deep learning models are from molecular dynamics simulations⁵⁶, which are implemented in the LAMMPS package⁵⁷.

Preparation of 2D model glasses

2D binary glasses are composed of equal-massed (m) small (S) and large (L) particles which interact via a standard 6–12 L-J potential. The full details of the potential are presented elsewhere⁴⁵. We chose our composition such that the ratio between the species is N_L:N_S = $\left( {1 + \sqrt 5 } \right):4$ to be consistent with previous studies^14,22,37 of this system and each of the sample contains N = 10,000 particles. All units will be expressed in terms of m as well as ${\it{\epsilon }}$ and $\sigma$, the parameters describing the energy and length scale, respectively, of the interparticle interaction. The characteristic time is $t_0 = \sigma \sqrt {m/\varepsilon }$. The glass transition temperature T_g of this system is known to be 0.325$\varepsilon /k$^14,22,37, where $k$ is the Boltzmann constant. Periodic boundary conditions were imposed on square boxes of linear dimension 98.8045$\sigma$. The density of the system ($N/L^2 \approx 1.02\sigma ^{ - 2}$) was kept constant. A total of 5000 2D glass samples were each obtained by continuously decreasing the temperature from a liquid state, well equilibrated at 1.08 T_g and quenched to a low-temperature (0.092 T_g) solid state over a period of 10⁶ t₀ using a Nose–Hoover thermostat. Then, a static relaxation using conjugate gradient algorithm was applied to bring the system to mechanical equilibrium before conducting deformation simulation²². Particle positions in this state were used to construct structure representations. Our training, validation, and test datasets contained 4900, 50, and 50 samples, respectively. To demonstrate that the CNN models are valid for the glasses with different processing histories, 50 additional samples were quenched from 1.08 T_g to 0.092 T_g over different period in the range from 10⁵ t₀ to 10⁰ t₀.

Deforming 2D model glasses

Each of the 2D glasses was deformed under four loading conditions (positive and negative simple shear and positive and negative pure shear) with an AQS method^50,51. In positive (negative) simple shear, the applied strain increment is $\Delta \gamma _{xy} = \Delta \gamma$ ($\Delta \gamma _{xy} = - \Delta \gamma$), and in positive (negative) pure shear, the applied strain increments are $\Delta \varepsilon _{xx} = - \Delta \varepsilon _{yy} = - \Delta \gamma /2$ ($\Delta \varepsilon _{xx} = - \Delta \varepsilon _{yy} = \Delta \gamma /2$). Here, $\Delta \gamma$ = 10^-5. After each deformation increment, the system was relaxed to its mechanical equilibrium.

Mechanical response of a local 2D configuration at arbitrary orientation

To probe the mechanical response of a local 2D configuration at arbitrary orientation, i.e., the blue dotted curve in the Fig. 4, we deleted particles⁵⁸ outside the largest circular region from the original square box. Next, we rotated the circular configuration by various angles counter-clockwise, and then conducted simple shear on it with the AQS method. During deformation, the particles in the outer annulus (wall) of width 5$\sigma$ (two times of potential cutoff) only experience applied affine displacement, i.e., they are not allowed to relax such that the periodic boundary conditions are lost.

Preparation of 3D model glasses

The 3D Cu₅₀Zr₅₀ metallic glasses were simulated with an optimized EAM potential adopted from ref. ⁴⁶. Each of the 1500 samples contained 32,000 atoms in a cubic box. After adequate equilibration at 2000 K, the liquid was first quenched to 1500 K at a cooling rate of 10¹³ K/s, followed by 10¹² K/s to 1000 K, then at the desired rate of 10¹⁰ K/s (effective cooling rate, in the text) to 500 K, and finally at 10¹³ K/s to 50 K in the NPT ensemble using a Nose–Hoover thermostat with zero external pressure. The periodic boundary condition was applied in all three directions. The samples were then brought to mechanical equilibrium through a static relaxation via a conjugate gradient algorithm. Our training, validation, and test datasets for the 3D system contain 1480, 10, and 10 samples, respectively. To demonstrate that the CNN models are valid for glasses with different processing histories or at different compositions, 10 Cu₅₀Zr₅₀ glasses were also quenched using different effective cooling rates in the range from 10⁹ to 10¹³ K/s, and at each different composition (Cu_xZr_100-x, x = 20, 25 …, 80, in step of 5), 10 metallic glasses were quenched with an effective cooling rate of 10¹⁰ K/s.

Deforming 3D model glasses

Before extracting atomic coordinates to construct structure representation and conducting deformation simulation in each of the 24 loading orientations listed in Supplementary Table 1, we rotated each configuration such that the shear direction and the normal direction of the shear plane were parallel to X and Y axes, respectively, of the coordinate system. For some loading orientations, appropriate atom replication was needed after rotation, which is illustrated in Supplementary Fig. 20. Each of the 3D glasses was deformed in simple shear along the 24 different loading orientations via an AQS method^50,51. After each applied strain increment of $\Delta \gamma _{xy}$ = 10⁻⁴, the system was relaxed to its mechanical equilibrium. We calculated D²_min per definition in ref. ³⁷, implemented in the OVITO package⁵⁹.

Upper bound of the prediction accuracy

To assess this ceiling, we make use of the particles from the test dataset, which were already labeled after deformation. Another deformation simulation was then conducted, after shuffling the sequence of all particles, as a new job run on a different computer. The resultant new configuration was used to compute D²_min of all the particles, which were labeled again. The number of particles always having the same label, determined based on both the old and the new D²_min, was divided by the total number of particles in the dataset, to give the upper bound of accuracy that can be reached via any method. Note that numerical error, when the atom sequence in the initial configuration file is changed although all other deformation conditions are kept the same, renders deformation under the AQS not exactly reproducible. This error also grows with increasing strain.

Training procedure of CNN models

For the CNN models, we minimized the binary cross-entropy loss between true labels and predicted labels, and used early stopping and selected the model with the highest accuracy in the validation dataset. The learning rate started from 0.001 and was then divided by $\sqrt {10}$ once the validation accuracy plateaued. We chose RMSprop optimizer and mini-batch size of 512 and 64, for the 2D and 3D glasses, respectively. We augmented the training data by applying randomly one of the 4 (or 8) mirror symmetries to the images of 2D (or 3D) glasses every time we fed the images into the CNN. The training was implemented in the TensorFlow package⁶⁰. And we conducted distributed deep learning with the Horovod package⁶¹ to accelerate the training for 3D glasses.

GNN methods

We followed the procedure in ref. ³⁶ to construct graph with edge threshold of 2.0$\sigma$ (and 4.0 Å) for 2D (and 3D) glasses. We used the same GNN architecture as in ref. ³⁶, except that we activated the output layer with a sigmoid function as we were performing binary classification tasks, and chose n_rec = 7 and 6 for 2D and 3D glasses, respectively, according to the optimization results shown in Supplementary Fig. 21. We also tried more neurons in all multilayer perceptors, but this did not further improve the validation accuracy.

For the 2D (3D) samples in both the test and validation datasets, we loaded each sample, and constructed corresponding graph, 4 (24) times, each graph corresponding to one of the 4 (24) loading protocols. And we updated edge vectors of those graphs using relative atomic positions in the new coordinate system, the X and Y axes of which were parallel to the shear direction and the normal of shear plane, respectively, of each specific loading protocol. Atoms were labeled based on the magnitude of D²_min under the corresponding loading protocol. When evaluating the loss function and accuracy for the GNN models, only the atoms in the test and validation datasets were considered. Therefore, both test and validation datasets for the GNN are exactly the same as those for our CNN. For the 2D (3D) samples in training datasets, we constructed only one graph for each sample in order to relieve memory burden and save computation time. However, we took into account all loading directions for each of the 4900 (1480) training examples by updating at random their target labels with D²_min from one of the 4 (24) loading protocols and by rotating edge vectors correspondingly every time we fed them (both targets and graphs) to the network (note that in our earlier attempts, we tried to directly construct and load multiple graphs for each training sample, but the results obtained were similar). Also, we augmented the training data by applying randomly one of the 4 (or 8) mirror symmetries to the edge vectors of graphs of 2D (or 3D) glasses every time we fed the graphs to the GNN. When optimizing GNN models and evaluating loss function and accuracy, only the atoms in the training datasets were used to enable balanced datasets and a fair comparison with our CNN.

To minimize the binary cross-entropy loss between true labels and predicted labels, similarly to ref. ³⁶, we used early stopping and selected the model with the highest accuracy on the validation dataset after running enough epochs. We trained the GNN models with a learning rate of 10⁻⁴, gradient clipping, Adam optimizer, and a TensorFlow implementation.

SVM methods

Same as in ref. ³³, we used the following radial and angular structural functions to construct input features to SVM models:

$$G_{i,\alpha }\left( r \right) = \mathop {\sum }\limits_{j \in \alpha ,r_{ij} < r_c} {\mathrm{exp}}\left( { - \frac{{\left( {r_{ij} - r} \right)^2}}{{2\Delta ^2}}} \right)$$

$$\Psi _{i,\alpha ,\beta }\left( {\xi ,\lambda ,\varsigma } \right) = \mathop {\sum }\limits_{j \in \alpha } \mathop {\sum }\limits_{k \in \beta } {\mathrm{exp}}\left( { - \frac{{r_{ij}^2 + r_{jk}^2 + r_{ik}^2}}{{\xi ^2}}} \right)\left( {1 + \lambda \cos \theta _{ijk}} \right)^\varsigma$$

where $r_{ij}$ denotes the distance between atoms i and j, and $\alpha$ or $\beta$ denote the species, $\theta _{ijk}$ is the angle between vector ${\mathbf{{r}}}_{ij}$ and ${\mathbf{{r}}}_{ik}$, and $r \in [r_0 + 0.5\Delta ,r_c)$ with an increment of $\Delta$. We use $r_0$ = 0.5$\sigma$, $r_c$ = 5.0$\sigma$, and $\Delta$ = 0.025$\sigma$, and $r_0$ = 2.0 Å, $r_c$ = 8.0 Å, and $\Delta$ = 0.050 Å for 2D and 3D glasses, respectively, based on the findings in Supplementary Fig. 22. For the parameters in the angular structural functions, similar to ref. ³⁶, we used $\lambda \in \left\{ { - 1,1} \right\},\varsigma \in \left\{ {1,2,4,8,16} \right\}$, and 16 values of $\xi$ equally spaced between 0.75$\sigma$ and 5.0$\sigma$ (between 3.0 and 8.0 Å) for 2D (3D) glasses. We tried to use 4, 8, 16, or 32 values for $\xi$ in the given region but found only a weak dependency. We also tried to use exactly the same angular parameters as in the original work³³, but the results came out slightly worse.

We conducted L2-regularized L2-loss support vector classifications with a linear kernel, which were executed in the LIBLINEAR package⁶² by solving the primal optimization problem. A regularization parameter of C = 100 was used, as we tried many different values over the range from 10⁻⁷ to 10⁵ and found only a weak dependency once C was above 10. We also tuned the termination criterion, which did not further improve the accuracy on validation datasets. Note that 500 (100) samples were sufficient to construct training datasets for 2D (3D) glasses, because training the SVM models does not require large datasets, as demonstrated in Supplementary Fig. 23.

Supplementary Tables 2 and 3, for 2D and 3D glasses, respectively, present a summary of the input features and trainable parameters of the three machine-learning models.

Calculating the flexibility volume

The flexibility volume is defined as the product of vibrational mean squared displacement (MSD) and atom spacing¹⁹. We calculated MSD at 0.092 T_g for 2D glasses and at 50 K for 3D glasses, using the procedure in ref. ¹⁹. The MSD was averaged over 100 independent runs, all starting from the same configuration but different initial velocities. And the atomic volume is calculated based on Voronoi analysis.

SM analysis

The normal mode analysis of the glasses was carried out by diagonalizing the dynamical matrix of the inherent structure of glasses. The participation fraction of atom i in eigenmode $e_\omega$ is $p_i = \left| {\vec e_\omega ^i} \right|^2$, where $\vec e_\omega ^i$ is the corresponding polarization vector of atom i⁶³. Here, $p_i$ was summed over a small fraction of 1% (same as that in ref. ¹⁸) of the lowest-frequency normal modes. $p_i$ measures the involvement of atom i in SMs.

Data availability

The data that support the findings of this study are available from the corresponding author on request.

Code availability

We used open source packages LAMMPS and TensorFlow for our molecular dynamics simulations and deep learning, respectively. All custom codes are available from the corresponding author upon request.

References

Cottrell, A. H. The Mechanical Properties of Matter (Wiley, 1964).
Hirth, J. & Lothe, J. Theory of Dislocations (John Wiley & Sons, 1982).
Schuh, C. A., Hufnagel, T. C. & Ramamurty, U. Mechanical behavior of amorphous alloys. Acta Mater. 55, 4067–4109 (2007).
Article ADS CAS Google Scholar
Cheng, Y. Q. & Ma, E. Atomic-level structure and structure–property relationship in metallic glasses. Prog. Mater. Sci. 56, 379–473 (2011).
Article CAS Google Scholar
Greer, A. L., Cheng, Y. Q. & Ma, E. Shear bands in metallic glasses. Mater. Sci. Eng.: R: Rep. 74, 71–132 (2013).
Article Google Scholar
Ma, E. Tuning order in disorder. Nat. Mater. 14, 547–552 (2015). 2015.
Article ADS CAS PubMed Google Scholar
Hufnagel, T. C., Schuh, C. A. & Falk, M. L. Deformation of metallic glasses: Recent developments in theory, simulations, and experiments. Acta Mater. 109, 375–393 (2016).
Article ADS CAS Google Scholar
Spaepen, F. A microscopic mechanism for steady state inhomogeneous flow in metallic glasses. Acta Metall. 25, 407–415 (1977).
Article CAS Google Scholar
Srolovitz, D., Maeda, K., Vitek, V. & Egami, T. Structural defects in amorphous solids statistical analysis of a computer model. Philos. Mag. A 44, 847–866 (1981).
Article ADS CAS Google Scholar
Egami, T. Atomic level stresses. Prog. Mater. Sci. 56, 637–653 (2011).
Article CAS Google Scholar
Johnson, W. L. & Samwer, K. A universal criterion for plastic yielding of metallic glasses with a (T/T_g)^2/3 temperature dependence. Phys. Rev. Lett. 95, 195501 (2005).
Article ADS CAS PubMed Google Scholar
Johnson, W. L., Demetriou, M. D., Harmon, J. S., Lind, M. L. & Samwer, K. Rheology and ultrasonic properties of metallic glass-forming liquids: a potential energy landscape perspective. MRS Bull. 32, 644–650 (2007).
Article CAS Google Scholar
Tsamados, M., Tanguy, A., Goldenberg, C. & Barrat, J.-L. Local elasticity map and plasticity in a model Lennard-Jones glass. Phys. Rev. E 80, 026112 (2009).
Article ADS CAS Google Scholar
Shi, Y. & Falk, M. L. Strain localization and percolation of stable structure in amorphous solids. Phys. Rev. Lett. 95, 095502 (2005).
Article ADS PubMed CAS Google Scholar
Shi, Y. & Falk, M. L. Atomic-scale simulations of strain localization in three-dimensional model amorphous solids. Phys. Rev. B 73, 214201 (2006).
Article ADS CAS Google Scholar
Peng, H. L., Li, M. Z. & Wang, W. H. Structural signature of plastic deformation in metallic glasses. Phys. Rev. Lett. 106, 135503 (2011).
Article ADS CAS PubMed Google Scholar
Manning, M. L. & Liu, A. J. Vibrational modes identify soft spots in a sheared disordered packing. Phys. Rev. Lett. 107, 108302 (2011).
Article ADS CAS PubMed Google Scholar
Ding, J., Patinet, S., Falk, M. L., Cheng, Y. & Ma, E. Soft spots and their structural signature in a metallic glass. PNAS 111, 14052–14056 (2014).
Article ADS CAS PubMed PubMed Central Google Scholar
Ding, J. et al. Universal structural parameter to quantitatively predict metallic glass properties. Nat. Commun. 7, 13733 (2016).
Article ADS CAS PubMed PubMed Central Google Scholar
Fan, Z., Ding, J., Li, Q.-J. & Ma, E. Correlating the properties of amorphous silicon with its flexibility volume. Phys. Rev. B 95, 144211 (2017).
Article ADS Google Scholar
Fan, Z., Ding, J. & Ma, E. Making glassy solids ductile at room temperature by imparting flexibility into their amorphous structure. Mater. Res. Lett. 6, 570–583 (2018).
Article CAS Google Scholar
Patinet, S., Vandembroucq, D. & Falk, M. L. Connecting local yield stresses with plastic activity in amorphous solids. Phys. Rev. Lett. 117, 045501 (2016).
Article ADS PubMed CAS Google Scholar
Zylberg, J., Lerner, E., Bar-Sinai, Y. & Bouchbinder, E. Local thermal energy as a structural indicator in glasses. PNAS 114, 7289–7294 (2017).
Article ADS CAS PubMed PubMed Central Google Scholar
Xu, B., Falk, M. L., Li, J. F. & Kong, L. T. Predicting shear transformation events in metallic glasses. Phys. Rev. Lett. 120, 125503 (2018).
Article ADS CAS PubMed Google Scholar
Liu, C., Guan, P. & Fan, Y. Correlating defects density in metallic glasses with the distribution of inherent structures in potential energy landscape. Acta Mater. 161, 295–301 (2018).
Article ADS CAS Google Scholar
Peng, C.-X. et al. Bond length deviation in CuZr metallic glasses. Phys. Rev. B 96, 174112 (2017).
Article ADS Google Scholar
Steinhardt, P. J., Nelson, D. R. & Ronchetti, M. Bond-orientational order in liquids and glasses. Phys. Rev. B 28, 784–805 (1983).
Article ADS CAS Google Scholar
Lewandowski, J. J., Wang, W. H. & Greer, A. L. Intrinsic plasticity or brittleness of metallic glasses. Philos. Mag. Lett. 85, 77–87 (2005).
Article ADS CAS Google Scholar
Qiao, J. C. et al. Fast secondary relaxation and plasticity initiation in metallic glasses. Natl Sci. Rev. 5, 616–618 (2018).
Article CAS Google Scholar
Yang, Q., Peng, S.-X., Wang, Z. & Yu, H.-B. Shadow glass transition as a thermodynamic signature of β relaxation in hyper-quenched metallic glasses. Natl Sci. Rev. 7, 1896–1905 (2020).
Article PubMed PubMed Central Google Scholar
Kumar, G., Neibecker, P., Liu, Y. H. & Schroers, J. Critical fictive temperature for plasticity in metallic glasses. Nat. Commun. 4, 1536 (2013).
Article ADS PubMed CAS Google Scholar
Zhu, F., Song, S., Reddy, K. M., Hirata, A. & Chen, M. Spatial heterogeneity as the structure feature for structure–property relationship of metallic glasses. Nat. Commun. 9, 3965 (2018).
Article ADS PubMed PubMed Central CAS Google Scholar
Cubuk, E. D. et al. Identifying structural flow defects in disordered solids using machine-learning methods. Phys. Rev. Lett. 114, 108001 (2015).
Article ADS CAS PubMed Google Scholar
Wang, Q. & Jain, A. A transferable machine-learning framework linking interstice distribution and plastic heterogeneity in metallic glasses. Nat. Commun. 10, 5537 (2019).
Article ADS CAS PubMed PubMed Central Google Scholar
Fan, Z., Ding, J. & Ma, E. Machine learning bridges local static structure with multiple properties in metallic glasses. Mater. Today 40, 48–62 (2020).
Article CAS Google Scholar
Bapst, V. et al. Unveiling the predictive power of static structure in glassy systems. Nat. Phys. 16, 448–454 (2020).
Article CAS Google Scholar
Falk, M. L. & Langer, J. S. Dynamics of viscoplastic deformation in amorphous solids. Phys. Rev. E 57, 7192–7205 (1998).
Article ADS CAS Google Scholar
Barbot, A. et al. Local yield stress statistics in model amorphous solids. Phys. Rev. E 97, 033001 (2018).
Article ADS CAS PubMed Google Scholar
Schwartzman-Nowik, Z., Lerner, E. & Bouchbinder, E. Anisotropic structural predictor in glassy materials. Phys. Rev. E 99, 060601 (2019).
Article ADS CAS PubMed Google Scholar
Xu, B., Falk, M. L., Patinet, S. & Guan, P. Atomic Nonaffinity as a Predictor of Plasticity in Amorphous Solids. https://arxiv.org/abs/1905.12035 (2020).
Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. in Advances in Neural Information Processing Systems 25 (eds Pereira, F., Burges, C. J. C., Bottou, L. & Weinberger, K. Q.) 1097–061105 (Curran Associates, Inc., 2012).
Simonyan, K. & Zisserman, A. Very Deep Convolutional Networks for Large-scale Image Recognition. https://arxiv.org/abs/1409.1556 (2015).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition 770–778 (2016).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
Article ADS CAS PubMed Google Scholar
Lançon, F., Billard, L. & Chaudhari, P. Thermodynamical properties of a two-dimensional quasi-crystal from molecular dynamics calculations. EPL 2, 625–629 (1986).
Article ADS Google Scholar
Cheng, Y. Q., Ma, E. & Sheng, H. W. Atomic level structure in multicomponent bulk metallic glass. Phys. Rev. Lett. 102, 245501 (2009).
Article ADS CAS PubMed Google Scholar
Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).
Article ADS PubMed CAS Google Scholar
Ioffe, S. & Szegedy, C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. https://arxiv.org/abs/1502.03167 (2015).
Argon, A. S. Plastic deformation in metallic glasses. Acta Metall. 27, 47–58 (1979).
Article CAS Google Scholar
Malandro, D. L. & Lacks, D. J. Molecular-level mechanical instabilities and enhanced self-diffusion in flowing liquids. Phys. Rev. Lett. 81, 5576–5579 (1998).
Article ADS CAS Google Scholar
Maloney, C. & Lemaître, A. Universal breakdown of elasticity at the onset of material failure. Phys. Rev. Lett. 93, 195501 (2004).
Article ADS PubMed CAS Google Scholar
Richard, D. et al. Predicting Plasticity in Disordered Solids from Structural Indicators. https://arxiv.org/abs/2003.11629 (2020).
Rieser, J. M., Goodrich, C. P., Liu, A. J. & Durian, D. J. Divergence of voronoi cell anisotropy vector: a threshold-free characterization of local structure in amorphous materials. Phys. Rev. Lett. 116, 088001 (2016).
Article ADS PubMed CAS Google Scholar
Milkus, R. & Zaccone, A. Local inversion-symmetry breaking controls the boson peak in glasses and crystals. Phys. Rev. B 93, 094204 (2016).
Article ADS CAS Google Scholar
Yang, J. et al. Structural parameter of orientational order to predict the boson vibrational anomaly in glasses. Phys. Rev. Lett. 122, 015501 (2019).
Article ADS CAS PubMed Google Scholar
Brooks, C. L. Computer simulation of liquids. J. Solut. Chem. 18, 99–99 (1989).
Article Google Scholar
Plimpton, S. Fast parallel algorithms for short-range molecular dynamics. J. Comput. Phys. 117, 1–19 (1995).
Article ADS CAS MATH Google Scholar
Gendelman, O., Jaiswal, P. K., Procaccia, I., Gupta, B. S. & Zylberg, J. Shear transformation zones: state determined or protocol dependent? EPL 109, 16002 (2015).
Article ADS CAS Google Scholar
Stukowski, A. Visualization and analysis of atomistic simulation data with OVITO—the open visualization tool. Model. Simul. Mater. Sci. Eng. 18, 015012 (2010).
Article ADS Google Scholar
Abadi, M. et al. Tensorflow: A system for large-scale machine learning. In 12th {USENIX} symposium on operating systems design and implementation ({OSDI} 16) 265–283 (2016).
Sergeev, A. & Del Balso, M. Horovod: Fast and Easy Distributed Deep Learning in TensorFlow. https://arxiv.org/abs/1802.05799 (2018).
Fan, R.-E., Chang, K.-W., Hsieh, C.-J., Wang, X.-R. & Lin, C.-J. LIBLINEAR: a library for large linear classification. J. Mach. Learn. Res. 9, 1871–1874 (2008).
MATH Google Scholar
Widmer-Cooper, A., Perry, H., Harrowell, P. & Reichman, D. R. Irreversible reorganization in a supercooled liquid originates from localized soft modes. Nat. Phys. 4, 711–715 (2008).
Article CAS Google Scholar

Download references

Acknowledgements

The authors gratefully acknowledge Prof. M. L. Falk for discussions and his insightful comments on the manuscript, and V. Bapst for the useful discussions on training the GNN model. The work is supported at JHU by the U.S. Department of Energy (DOE) DOE-BES-DMSE under grant DE-FG02-19ER46056. This research used the CPU resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231, and the Maryland Advanced Research Computing Center (MARCC). The authors also acknowledge the Texas Advanced Computing Center (TACC) at The University of Texas at Austin for providing GPU resources that have contributed to the research results reported within this paper. The early attempts of deep learning used the GPU resource from MARCC.

Author information

Authors and Affiliations

Department of Materials Science and Engineering, Johns Hopkins University, Baltimore, MD, USA
Zhao Fan & Evan Ma

Authors

Zhao Fan
View author publications
You can also search for this author in PubMed Google Scholar
Evan Ma
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

Both authors conceived the research project. Z.F. designed the structure representation, conducted the MD simulations, and carried out the machine/deep learning analysis. Both authors contributed to the discussions and the writing of the paper. E.M. left the project in July 2020.

Corresponding author

Correspondence to Zhao Fan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Communications thanks Victor Bapst, Hai-Bin Yu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Fan, Z., Ma, E. Predicting orientation-dependent plastic susceptibility from static structure in amorphous solids via deep learning. Nat Commun 12, 1506 (2021). https://doi.org/10.1038/s41467-021-21806-z

Download citation

Received: 28 November 2020
Accepted: 12 February 2021
Published: 08 March 2021
DOI: https://doi.org/10.1038/s41467-021-21806-z

This article is cited by

Microscopic mechanisms of pressure-induced amorphous-amorphous transitions and crystallisation in silicon
- Zhao Fan
- Hajime Tanaka
Nature Communications (2024)
Identifying key features for predicting glass-forming ability of bulk metallic glasses via interpretable machine learning
- Yangchuan Zeng
- Zean Tian
- Quan Xie
Journal of Materials Science (2024)
Topology of vibrational modes predicts plastic events in glasses
- Zhen Wei Wu
- Yixiao Chen
- Limei Xu
Nature Communications (2023)
Developing novel amorphous alloys from the perspectives of entropy and shear bands
- Shidong Feng
- Zhen-Qiang Song
- Ri-Ping Liu
Science China Materials (2023)
Molecular Mechanics of Disordered Solids
- Franz Bamer
- Firaz Ebrahem
- Benjamin Stamm
Archives of Computational Methods in Engineering (2023)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.