A transferable machine-learning framework linking interstice distribution and plastic heterogeneity in metallic glasses

When metallic glasses (MGs) are subjected to mechanical loads, the plastic response of atoms is non-uniform. However, the extent and manner in which atomic environment signatures present in the undeformed structure determine this plastic heterogeneity remain elusive. Here, we demonstrate that novel site environment features that characterize interstice distributions around atoms combined with machine learning (ML) can reliably identify plastic sites in several Cu-Zr compositions. Using only quenched structural information as input, the ML-based plastic probability estimates (“quench-in softness” metric) can identify plastic sites that could activate at high strains, losing predictive power only upon the formation of shear bands. Moreover, we reveal that a quench-in softness model trained on a single composition and quench rate substantially improves upon previous models in generalizing to different compositions and completely different MG systems (Ni62Nb38, Al90Sm10 and Fe80P20). Our work presents a general, data-centric framework that could potentially be used to address the structural origin of any site-specific property in MGs.

The pathway of quenching, uniaxial compression test, plastic heterogeneity indicator calculation and datasets construction in generating atomic-level plastic heterogeneity data of Cu-Zr metallic glasses for machine learning. For Ni62Nb38, Al90Sm10 and Fe80P20, we simulate both tensile and compressive deformation with different strain rates and periodic boundary conditions (Methods).

Supplementary Figure 3 | Stress-strain responses.
Stress-strain curves of Cu50Zr50 (a), Cu65Zr35 (b) and Cu80Zr20 (c) metallic glasses obtained with quenching rates of 5 × 10 10 K s -1 , 5 × 10 11 K s -1 and 5 × 10 12 K s -1 , respectively. The plastic atoms of at the strain from 2% to 8% with an interval of 1% as well as that at the strain of 6.5% are shown as cyan-blue atoms in the inset picture of (b). The Cu-Zr glass samples (each containing 345600 atoms) are compressed along Z axis under a strain rate of 2.5 × 10 7 s -1 in a quasi-static mode. Compression is carried out at a low temperature of 50 K. Periodic boundary conditions are imposed in Y and Z axes and free surfaces are applied along X axis to allow offsets.
In contrast to the dislocation-mediated mechanism in crystals, the deformation in metallic glasses (MGs) is largely heterogeneous and difficult to discern. Effective mechanical indicators are needed to detect plastic events in MGs. In this work, we employ non-affine displacement (D 2 ) as the plastic indicator 4 . To calculate D 2 , the locally affine transformation matrix, J i , that best maps where d ji and d ji 0 are the vector separation (row vectors) between atom j and i and superscript 0 indicates the reference configuration. Here, j is one of atom i's nearest neighbors, and N i 0 is the total number of nearest-neighbors of atom i at the reference configuration.
Then, the D 2 for atom i can be defined as the residual of the least square fit: Here D 2 is calculated based on a slightly modified code from OVITO 5 . The cutoff in calculating D 2 is set to be 4.5 Å, which falls roughly between the 1 st and 2 nd peaks in the pair correlation functions of the Cu-Zr glasses. By setting a D 2 threshold of 5.0 Å 2 , we can indicate whether an atom is plastic or not at a typical strain such as 4.0%.
In addition to non-affine displacement (D 2 ), local von-mises strain invariant (η mises ) is also a commonly used plastic indicator 4 . To calculate η mises of for atom i, we also calculate the locally affine transformation matrix, J i , that best maps !d ji , as shown in Equation 2 of the main text. The local Lagrangian strain matrix can then be derived as The local shear invariant η mises of atom i can then be computed as The compatibility of D 2 and η mises can be seen in the following Supplementary Figure 4. One can see that the regions with prominent D 2 and η mises are overlapped to a large degree, suggesting that these two plastic indicators capture similar plastic rearrangement regions in the MGs.

Supplementary Figure 4 | Compatibility of D 2 with other plastic indicators.
Compatibility of non-affine displacement (D 2 ) with another plastic indicator of local shear strain η mises . Distribution of non-affine displacement D 2 (left) and local shear strain η mises at a typical strain of 0.04 of the Cu65Zr35 quenched under 5 × 10 10 K s -1 .

Supplementary Note 2. Recursive feature elimination in Cu-Zr MGs
Supplementary Figure 5 | Retained AUC-ROC with varying number of features after recursive feature elimination for Cu65Zr35 @ 5 × 10 10 K s -1 . As stated in the Methods, in this work, we use AUC-ROC as the scoring metric, as it best suits our target of this study. In terms of the 3 rd point (AUC-ROC is robust with the imbalanced data), Supplementary Figure 6 illustrates the AUC-ROCs of data with different degrees of imbalance: i) an imbalanced dataset of an unseen glass configuration without any sampling; ii) a balanced test set after equal undersampling and train-test (80:20) split; iii) an imbalanced dataset combining the balanced test set of ii) with all the unsampled majority samples.

Supplementary Note 3. Symmetry functions and other existing features
In previous works, symmetry functions, originally proposed to represent the potential energy landscape of systems 6,7 , have been successfully employed to featurize the local environments in various disordered solids and liquids. 1-3 The symmetry functions are individually less descriptive but can collectively provide a complete and unbiased description of the structure. As presented in the Methods, we follow the settings of r, ξ, λ, and ζ in previous works 1-3 to calculate the symmetry functions for each atom in our studied MGs. Specifically, for an atom i in Cu-Zr MGs, we derive 100 radial functions (50 for i-Cu and 50 for i-Zr) by varying r from 0 to 5.0 × Cu-Cu equilibrium distance (sum of metallic radii) in increments of 0.1 × Cu-Cu equilibrium distance, and 66 angular functions (22 for Cu-i-Cu, Cu-i-Zr and Zr-i-Zr, respectively) by using 22 sets of , , and for each atom. Please refer to the original papers 1-3 for details of parameter settings.
We then take the 166 features as input and train a GBDT model on the same datasets and cross-validation (CV) splits with our interstice features. The hyperparameters of the GBDT model are also optimized based on Supplementary Table 1. After optimizing a model on the data sampled from two glass configurations, the model is generalized to the unseen configuration set-aside for tests (Methods). Supplementary Table 3 shows the results of the ML models on the unseen glass configuration when using the symmetry functions as input. The results of using all the 80 interstice features or the 15 interstice features after feature reduction are also displayed for comparison. Characteristic motifs

Volume metrics
Cluster packing efficiency 12 , atomic packing efficiency 13 , and the ratio of atomic volume to the Voronoi polyhedron volume around each site.
where ni is Voronoi index (i in the range of 3-7), reflecting the strength of i-fold symmetry in local sites 14 . 5 Weighted i-fold symm idx3…7 Weighted i-fold symmetry indices using Voronoi facet areas. 5 BOOP

q4…10-Voro/Dist and w4…10-Voro/Dist
Lowest-order and higher-order rotation-invariant ql (l = 4, 6, 8 and 10) of the lth moment in a multipole expansion of the bond vector distribution on a unit sphere 11  In addition to symmetry functions, there are also some conventional short-range order (SRO) structural signatures in the field of MGs. Supplementary Table 4 lists some typical structural signatures proposed in the past studies [8][9][10][11][12][13][14][15][16] . These signatures can readily be derived from the structure itself, without requiring knowledge of detailed interatomic interactions, and are thus pure structural signatures as well. In Supplementary Table 4, the first 7 feature sets are geometrical features (GSRO) and the last feature set is the chemical (CSRO) feature set.
We then featurize the atoms in Cu-Zr MGs with these existing SRO features, and try augmenting our features with these SRO features. However, the AUCs only have a minimal change, as summarized in Supplementary Table 5. We also tried generating the corresponding medium-range order (MRO) versions of the existing SRO features, following the coarse-graining technique described in the main text (e.g. Equation 1). Supplementary Table 6 lists the MRO feature sets we have generated in this work. Similarly, they can be considered as medium-range geometrical features (GMRO) or chemical features (GMRO).

Supplementary Note 4. Calibration curves of GBDT and Linear SVC models
The "quench-in softness" (QS) proposed in this work is basically the class probability estimate from the GBDT models. The idea is that when performing classification, we want to not only obtain the class label but also have a probability of the respective class that gives us some sense of confidence on the predictions 18 .

Supplementary Figure 7 | Probability density distributions and calibration curves of Linear
SVC models (upper panel) and our GBDT models (lower panel) using the same interstice distribution features and on the same data of the Cu65Zr35 @ 5 × 10 10 K s -1 . The calibration curves are plotted on the resampled balanced test set.
In previous works, researchers have defined the distances to the separating Linear SVC hyperplane as "softness" 2,3 . The distances can also serve as a continuous metric to reflect the propensity to be one class, yet the distances are unbounded and often require calibration to transform the unbounded distances into probability estimates (bounded in [0, 1]). Applying minmax scaling of the SVC distances to [0, 1] often yields bad calibration, and the calibration curve often shows sigmoid curve, which is typical for maximum-margin methods that focus on hard-toclassify samples that are close to the decision boundary (support vectors) 18 . GBDT are also prone to have bad estimates, however, here we verify that our GBDT probability estimates are wellcalibrated, largely attributed to the shallow trees and small number of trees used in our work.
As an example, Supplementary Figure 7 shows the comparison of probability density distributions and calibration curves of Linear SVC models and our GBDT models if fitted on same interstice features and same data of Cu65Zr35. The calibration curves are plotted on the undersampled balanced test set. We see that the calibration curve of raw SVM distances is quite sigmoid-like, deviating much from the perfect calibrated curve (diagonal line), while the calibration curve of our GBDT probability estimates are well-calibrated. Well-calibrated probability estimates can then be directly interpreted as the confidence level of classification. Readers can refer to the website in Ref. 18 for more examples of calibration. If one is interested in transforming the SVM distances into well-calibrated probability estimates, the parametric approach based on Platt's sigmoid model 19 and the non-parametric approach based on isotonic regression 20 could be helpful for this end. Supplementary Note 7. Interpreting the feature sets

Supplementary Note 5. ML results of Ni-Nb/Al-Sm/Fe-P MGs Supplementary
As described in the main text, we benchmarked the individual predictive power of 18 feature sets to provide a quantitative picture on their correlation with plastic heterogeneity (Figure 5a). In this section, we further interpret the ML model obtained with each feature set to show the dependence of deformation probability on these site features (Supplementary Figures 8 -25). For each feature set, two-dimensional (2-D) and one-dimensional (1-D) partial dependence plots (PDP) are shown for the top-4 features with the highest feature importance in each feature set (except for the CNVoro/Dist and Volume metrics feature sets with 2 and 3 features, respectively). Please refer to Methods or Supplementary Tables 4 and 6 for details of the SRO and MRO feature sets.