Abstract
This paper develops a Bayesian inferencebased probabilistic crack nucleation model for the Nibased superalloy René 88DT under fatigue loading. A datadriven, machine learning approach is developed, identifying underlying mechanisms driving crack nucleation. An experimental set of fatigueloaded microstructures is characterized near crack nucleation sites using scanning electron microscopy and electron backscatter diffraction images for correlating the grain morphology and crystallography to the location of crack nucleation sites. A concurrent multiscale model, embedding experimental polycrystalline microstructural representative volume elements (RVEs) in a homogenized material, is developed for fatigue simulations. The RVE domain is modeled by a crystal plasticity finite element model. An anisotropic continuum plasticity model, obtained by homogenization of the crystal plasticity model, is used for the exterior domain. A Bayesian classification method is introduced to optimally select informative state variable predictors of crack nucleation. From this principal set of state variables, a simple scalar crack nucleation indicator is formulated.
Introduction
Nibased superalloys, such as René 88DT, are widely used in hightemperature applications especially in the aerospace propulsion industry, on account of their excellent thermomechanical and chemical properties. Despite significant advantages, the fatigue life of these materials is compromised due to the presence of heterogeneities in the microstructure. While very few metallurgical defects like inclusions or voids are found in their polycrystalline microstructures, typically manufactured by powder metallurgy techniques, heterogeneities manifest as spatial variations in the crystallographic and grain morphology distributions, as well as intragranular compositions^{1}. Additionally, annealing twins are critical heterogeneities that form in the polycrystalline microstructures during processing. These microstructural variations are major contributing factors to their fatigue failure under operating conditions^{2}.
A number of studies, e.g. in^{3,4,5,6,7} have observed nucleating cracks near coherent Σ3 twin boundaries under fatigue loading conditions. For the polycrystalline René 88DT, fatigue cracks are often found to nucleate at these twin boundaries, especially where the adjacent grains also have a high Schmid factor^{1,8}. Crack nucleation is preceded by the formation of slip bands that are parallel to the coherent twin boundaries, yet slightly offset from the boundary itself^{9,10}. In addition to the high Schmid factor, high elastic modulus mismatch across the boundary and long twin boundaries have also been associated with the probable locations of crack nucleation in^{11,12,13}. A hypothesis behind such observations of crack nucleation is that these geometric and crystallographic metrics can be correlated with high values of the resolved shear stress on shared slip systems between twin and parent grains, parallel to the twin boundary^{14,15,16,17,18,19}. The studies conjecture that rather than high Schmid factors, the high resolved shear stresses promote dislocation glide along the shared plane leading to nucleation. Outofplane forest dislocations do not disrupt slip on the shared plane, and hence dislocation glide is unimpeded along the boundary. With the progression of cyclic loading, highly localized slip parallel to the twin boundary trigger the formation of persistent slip bands (PSB), which precede crack nucleation and propagation^{20}. Experiments on Nibased superalloys have shown that PSBs can form at strains below 0.35% within a single cycle of deformation^{9}.
These observations necessitate the development of robust mechanismdriven predictive models that can relate evolving state variables in the underlying polycrystalline microstructure to the nucleation of cracks. The last two decades have witnessed significant advances in the understanding and prediction of the connection between microstructure and local mechanical fields, along with critical failure indicators under fatigue loading. This has been enabled by a combination of advanced experimental characterization techniques like electron backscatter diffraction (EBSD), highenergy diffraction microscopy (HEDM), and Xray tomography/diffraction techniques, and computational models like adaptive finite element methods (FEM) and fast Fourier transform (FFT) methods integrated with crystal plasticity constitutive models. In the computational modeling approaches, virtual 2D and 3D polycrystalline microstructures are generated from materials characterization data, and subsequently imagebased micromechanical simulations are conducted to interrogate the locations of crack evolution^{21,22,23}. Comprehension of material states near the critical cracked regions has guided researchers in constructing metrics of fatigue and failure indicators. Both physicsbased and datadriven approaches have been pursued to create the appropriate fatigue metrics, such as dislocation density and plastic slipbased criteria combined with crack opening stresses for susceptible grain configurations^{24,25}, energetics and stability of PSBs in^{26}, accumulated plastic slip and elastic anisotropy in^{14}, and cyclic stored plastic energy density^{21,23,27,28}. Important fatigue indicator parameters have thereby been derived through the use of correlation metrics, variancebased sensitivity, or physicsderived arguments in combination with machine learning and Bayesian techniques^{21,22,29,30,31}.
Despite significant advances, shortcomings, e.g. in experimental throughput rates and computational efficacy, have in some cases mitigated the effectiveness of these models. Limitations in the collection of experimental data sets linking microstructural configurations to crack nucleation sites have often led to the development of models from very limited microstructural data sets, typically encompassing a single grain configuration. Alternatively, the microstructural geometry has been abandoned entirely in favor of higherscale statistical analysis. The lack of realistic experimental data leads to inherent uncertainty in the model output. The resulting fatigue indicators are not likely to retain fidelity with the introduction of new data. Furthermore, probabilistic methods, required to handle multiple experiments in a highdimensional input data space, suffer from various issues resulting from the problem dimensionality. From a modeling point of view, the computational methods must ensure accurate determination of state variables like stresses, plastic strains, dislocation densities, etc. at potential crack nucleation sites for effective development of mechanicsbased models of crack nucleation. Boundary conditions play a very important role in the prediction of localized variables in the imagebased micromechanical models^{32,33}. Conventionally used periodicity boundary conditions can lead to gross inaccuracies when modeling fatigue behavior.
This paper introduces a datadriven Bayesian model framework for predicting fatigue crack nucleation in polycrystalline microstructures of the Nibased superalloy René 88DT, overcoming some of the limitations mentioned above. The overall approach leverages large data sets obtained from crystal plasticity FEMbased micromechanical simulations with statistics and machine learning algorithms. Within this framework, the probability of observing a crack nucleation event is derived based on computationally generated micromechanical data on state variables at a potential nucleation point. High fidelity, imagebased crystal plasticity finite element (CPFE) models developed in^{34,35,36,37} are adopted for simulating evolving state variables in microstructural representative volume elements (RVE) of polycrystalline Nibased superalloys. The deficiencies with improper boundary conditions on the RVE is alleviated in this work through a concurrent multiscale model, where the microstructural RVE is embedded in a homogenized exterior material domain. The exterior domain is modeled using an upscaled, anisotropic continuum plasticity constitutive model that is obtained from homogenization of crystal plasticity RVE simulations^{38,39,40,41,42,43,44}. This model is discussed in the Supplementary Methods Section.
The paper introduces a robust Bayesian model framework for extracting fatigue nucleation criterion using a combination of simulation and experiments. The primary contributions of this framework are delineated below.

The probabilistic crack nucleation model accounts for multiple experimental microstructures that are simulated using crystal plasticity FEM with onetoone virtual representations of EBSD scans. A major factor for model calibration and validation is the generation of statistically significant results over a set of multiple experimental microstructures, covering a range of microstructural variabilities. By incorporating multiple onetoone experimental data sets directly in the probabilistic modeling framework, the approach mitigates uncertainties arising from a single microstructure dependency and insufficient validation data. An important additional aspect is the accounting for material states of both cracked and uncracked regions in the development, through efficient algorithms for handling large data sets.

The Bayesian crack nucleation model is formulated using a customized Nataf transformation to represent multivariate joint posteriors. The resulting probabilistic framework yields several advantages over other comparable methods, viz. (i) semianalytical expressions for highdimensional joint probability distributions, (ii) easy sampling of arbitrary points in the dataspace with no reliance on specialized sampling techniques, (iii) simple calculation of moments and measures on the joint probability distribution, and (iv) straightforward regularization of discrete data that reduces a generalized model error. These advantages enable efficient use of the final posterior joint distribution in predicting crack nucleation.

The automated simulation framework creates pipelines, which take a set of EBSD scans delineating grain configurations and crack nucleation sites as input, and returns a reduced order statevariablebased Bayesian model for probabilistic nucleation prediction as shown in Fig. 1. The framework integrates multiple computational procedures into a continuous workflow.

The model introduces an algorithm for selecting important state variables and isolating critical metrics that indicate locations of crack nucleation. This algorithm assesses successive multidimensional joint distributions and accounts for highly correlated sets of variables to be incrementally optimal in its predictions.

The automated procedure with minimal intervention uses experimental data to inform the probabilistic model of the most informative parameters. These parameters yield mechanistic insights into the drivers of crack nucleation for this material. The resulting probabilistic model demonstrates robust performance across variations in the microstructure and local grain configuration.

A scalar crack nucleation indicator is derived from the simulation platform.
The process of developing a probabilistic crack nucleation modeling framework integrates experimental testing and characterization with a pipeline of simulation tools for mechanical modeling and datadriven, supervised learning techniques. The sequence of four major modules in this pipeline is schematically depicted in Fig. 1.
Experimental testing and characterization
In this module, specimens of the Nibased superalloy René 88DT are subjected to interrupted fatigue loading at various stages of deformation, to track the precursors of crack nucleation. Scanning electron microscopy (SEM) maps are acquired and utilized for identifying microstructural locations of crack nucleation. Simultaneously, EBSD maps are acquired to determine the crystallographic orientations and grain morphologies at locations of interest in the microstructure.
Concurrent multiscale model development with embedded microstructural RVE
This module builds a twoscale concurrent model, consisting of an embedded polycrystalline microstructural RVE in a homogenized outer domain. Important microstructural deformation variables are extracted from simulations of the concurrent model of the experimental specimen. The twoscale concurrent representation is constructed to avoid direct application of boundary conditions on the RVE, thereby averting boundary conditionrelated inaccuracies in micromechanical fields. The embedded microstructural region is generated by extruding 2D EBSD scans in the outofplane direction. It is modeled using a crystal plasticity constitutive model for individual grains in the polycrystalline microstructure, while the outer domain is modeled by a homogenized, ratedependent elastoplastic constitutive model that is calibrated to the average mechanical response of the embedded RVE.
Creating a database of microstructural state variables
The concurrent multiscale model is simulated with fatigue loading conditions applied on the external boundary, from which the evolving state variables in the embedded microstructural RVE region are extracted. The state variables are then processed and archived in a crystal plasticity simulation database for subsequent stages of the datadriven modeling.
Development of a Bayesian crack nucleation model
The crystal plasticity simulation database provides the input data for the modeling framework to determine the probability of crack nucleation corresponding to a mechanical state in the material volume. The Bayesian model selects the most informative state variables for predicting crack nucleation by navigating the covariance structure of candidate predictor variables. The results are used to create a simple model for identifying potential locations of crack nucleation in the microstructure.
Each module is standardized and automated so that the model building process can be rapidly and systematically repeated for all experimental microstructures. Furthermore, this approach allows for the extension to other critical events as well as the inclusion of new data sets.
Results
Material description
The material investigated in this study is a polycrystalline Nibased superalloy René 88DT. Generally produced by powder metallurgy techniques^{45,46}, René 88DT has a nominal chemical composition (weight percent) of 13% Co, 16% Cr, 4% Mo, 4% W, 3.7% Ti, 0.7% Nb 0.03% C, 0.015% B, and the balance in Ni^{47}. The intragranular microstructure of René 88DT consists of an FCCstructured Ni matrix (γ phase) and an embedded population of L1_{2}structured Ni_{3}Al precipitates (γ' phase) with a bimodal size distribution of large (~100 nm) secondary and small (<10 nm) tertiary \(\gamma ^{\prime}\) precipitates. At the polycrystalline scale, the material has an average grain size of about 26 μm, very weak crystallographic texture, and a significant population of Σ3 twin boundaries. Details on the René 88DT microstructure are given in^{8,9}.
Experiments on fatigue tests and data acquisition
An experimental data set consisting of 12 specimens with independent grain structure configurations is utilized for developing the Bayesian model in the present study. Room temperature, lowcycle fatigue tests are conducted to evaluate the fatigue life of the Nibased superalloy René 88DT, for which the mechanical testing procedures are detailed in^{9}. The tests are performed in air on an electromechanical machine in a uniaxial, symmetric pushpull mode. Fully reversed loading tests with an Rratio equal to −1 at a frequency of 1 Hz, are carried out in a stresscontrolled mode at a maximum stress of 758 MPa. Cylindrical specimens with a gauge diameter of 5 mm and a gauge length of 16 mm are used in these tests. The strains in the specimen are measured by a mechanical extensometer that is positioned on the specimen gauge. The average fatigue lifetime of the specimens is observed to be 65340 cycles with a scatter (standard deviation) of 5530 cycles. A maximum and minimum total strain of 0.35% ± 0.04% and −0.35% ± 0.03% are measured for the first cycle. The material exhibits cyclic evolution of the strain, with the maximum and minimum strain stabilizing at 0.40% ± 0.05% and −0.30% ± 0.04% respectively, after half of the lifetime i.e. around 30–40,000 cycles. The strain range remains constant at 0.70% ± 0.03% during the entire fatigue test. Interrupted tests are performed to enable specimen surface examination using SEM and EBSD measurements at about 45% of the average fatigue lifetime (~30,000 cycles). The SEM scans identify regions exhibiting evidence of microcrack formation, while the EBSD measurements interrogate the configuration of the surface grains surrounding the microcracks. The interrupted tests capture two types of cracks, viz. those confined to one grain^{48} (type I) and those extended between one and threegrain diameters, crossing one or two grain boundaries on the surface (type II). This approach facilitates the identification of the lowcycle crack nucleation sites that are characteristic for each microstructure. A few typical microstructures, showing nucleated cracks under the fatigue loading conditions, are shown in Fig. 2.
Experimentally generated polycrystalline microstructures
Subsequent to EBSD acquisition, an automated data processing pipeline is applied to each of the 12 investigated grain structures in the microstructure generator software DREAM.3D^{49} to remove any measurement defects. Low image quality voxels are identified and automatically filled. Grains are determined by grouping voxels within a 5^{∘} misorientation angle threshold. Grains containing an area <20 μm^{2} are merged with their surrounding grains using a nearestneighbor identification method. These processing stages result in a collection of 2D segmented images, free of artifacts that can cause difficulties with generating robust finite element meshes.
Key experimental observations on fatigue crack nucleation
The primary sources of heterogeneity in René 88DT are derived from variations in microstructural features since minimal metallurgical defects like inclusions and voids are generated with powder metallurgy processing. The most prominent microstructural features are annealing twins, whose boundaries are uniformly dispersed throughout the grain structure during heat treatment. These boundaries have unique crystallographic and morphological properties and thereby affect dislocation motion during material deformation.
Prior research, e.g. in^{1,9,50,51}, suggests a number of experimental correlations between the material grain structure and physical location of crack nucleation sites. A few key observations are delineated below:

1.
Fatigue crack nucleation and the highest degree of strain localization are systematically reported to occur near and parallel to Σ3 twin boundaries in the low, high and very high cycle regimes^{9};

2.
Crack nucleation always occurs between pairs of grains, which are relatively oriented in a parallel slip configuration;

3.
Additional geometric, crystallographic, and mechanical factors, such as high Schmid factor, large difference in elastic modulus between adjacent grains, and long straight twin boundary length, have been shown to be associated with crack nucleation sites;

4.
Slip bands at crack nucleation sites can form as early as the first cycle, suggesting state variables at early stages of deformation are predictive of the crack nucleation locations;

5.
For both high and lowcycle fatigue regimes, the previous studies have reported that all crack nucleation sites are associated with twin boundaries that display parallel slip configurations^{15}.
The above observations are taken into account in this study for predicting the microstructure and grain configurations that generate crack nucleation. Next the pipeline requires the development of a concurrent multiscale model for fatigue analysis, followed by the creation of a microstructural state variable database related to fatigue nucleation. These are respectively discussed in Sections I and II of the Methods Section.
A Bayesian framework for predicting crack nucleation
A stochastic machine learning (ML) method is developed for assessing the probability of crack nucleation in the René 88DT superalloy microstructure. The Bayesian framework, outlined in the Methods Section III, is chosen for generating the probability of observing a crack nucleation event, as a function of the micromechanical state in the microstructure. This is a supervised learning classification approach that is effective with a limited set of highdimensional data. It has model parameters that are physically interpretable functions of the micromechanical state variables and offers a level of uncertainty in its predictions. The Bayesian modeling framework naturally yields a probabilistic outcome as opposed to a binary (0 or 1) response, which is advantageous over other machine learning classification techniques. The finite element simulation data, generated by the concurrent model in the Methods Section I, is aggregated over all microstructures and used as training data for this stochastic model. The results of these simulations are collected into a segmented database containing over 20 potential crack nucleation predictors in Methods Section II.
A Bayesian framework for predicting crack nucleation
Constructing a Bayesian model using all of these parameters would result in overfitting, providing little useful information. To remedy this, a model selection technique is developed to systematically reduce the number of state variables to only those that most influence the posterior distribution p(C ∣ x) as a combined set. The objective is to select a reduced order model with only the most significant state variables from the possible k = 20 state variables in the data set X listed in Table 2 as well as their regularized difference terms.
A recursive buildup approach is executed to add state variables one by one to the model and assess the incremental value of each new addition. In order to compare across multiple models, an objective function is designed to quantify the overall effectiveness of each posterior distribution. Let the objective function ρ_{C} be defined as:
where N_{C} is the total number of cracked material volumes across all microstructures and \({w}_{C}^{(i)}\) are the normalized volumes for each, i.e. \(\mathop{\sum }\nolimits_{i}^{{N}_{C}}{w}_{C}^{(i)}=1\). This objective function ρ_{C} represents the geometric mean of the probability that the model correctly labels each known cracked volume of the observed data set. In practice, \(\log {\rho }_{C}\) is used for numerical computations.
The reduced order posterior model p(C ∣ x_{*}) is determined by choosing a subset of state variables X_{*} ⊂ X that incrementally maximizes ρ_{C} for a fixed number of state variables k_{*} ≤ k. This optimization procedure is described in Algorithm 1 in the Supplementary Methods Section. The fundamental idea is to train the posterior PDF with each of the state variables individually, and assess the effectiveness of each model by comparing their respective ρ_{C}. After determining the most informative state variable with the maximal objective value, the model is trained again with each of the remaining state variables individually, as well as the previously chosen most informative state variable, in order to select the second most informative state variable. This incremental process is repeated until a priori chosen k_{*} ≤ k state variables are established. The incremental selection process algorithm is given in the Supplementary Methods Section under Algorithm 1.
The incremental selection process is executed with the present data set X comprised of k = 20 state variables, using k_{*} = 8. In each iteration, the method optimally adds the most informative state variable. With each additional state variable included in the probabilistic model, the overall performance characterized by ρ_{C} increases. Figure 3a plots the incremental effectiveness of the posterior with additional state variables. The slope of the plot gradually saturates as less new and independent crackdelineating information is provided to the joint posterior PDF from the additional simulation output data.
The above procedure yields a systematic ranking of state variables from the data set in terms of effectiveness, into a set X_{*}. It provides a clear path to reduce the degrees of freedom in the final model. The top three state variables determined by this process are:

1.
The von Mises stress σ_{VM}.

2.
The maximum plastic sliprate \(\dot{\gamma }\).

3.
The plastic defect energy ψ_{p}.
These three quantities contain the most combined information that is capable of distinguishing between cracked and uncracked material volumes within the regions undergoing parallel slip. The marginal distributions for these state variables are shown in Fig. 3a–c, illustrating the overall trends for the crack data separation. Generally, low σ_{VM}, high \(\dot{\gamma }\), and high ψ_{p} all correlate with an increased likelihood of cracked nucleation at the material volumes in a parallel slip configuration.
A reducedorder posterior is generated from these three state variables by defining a set \({{{{\bf{x}}}}}_{{{\mbox{r}}}}=[{\sigma }_{{{\mbox{VM}}}},\dot{\gamma },{\psi }_{{{\mbox{p}}}}]\). The threedimensional likelihood PDFs p(x_{r} ∣ C, ξ) and \(p(\,{{{{\bf{x}}}}}_{{{\mbox{r}}}}\, \,\bar{C},\xi )\) are constructed, and subsequently the resultant posterior PDF p(C ∣ x_{r}) is calculated. Twodimensional projections of the likelihoods are visualized in Fig. 4a, which shows the tendency of the cracked data to separate according to the values of σ_{VM} and \(\dot{\gamma }\). Incorporating the priors yields the final projected posterior as shown in Fig. 4b, where the highest probability of crack nucleation commences for low σ_{VM} and high \(\dot{\gamma }\) values.
The posterior PDF, when compared with other existing crack nucleation models of varying complexity, demonstrates excellent improvement. The original prior \({{{{\mathcal{M}}}}}_{0}=p(C)\), the prior after conditioning on parallel slip configuration regions \({{{{\mathcal{M}}}}}_{\xi }=p(C\, \,\xi )\), and additional posteriors including an increasing number of state variables, are ranked according to their predictive capacity on the training data set. In Fig. 4c, the logodds of each PDF’s geometric mean probability ρ_{C} (in bits) is used as the measure of comparison and is calculated as \({{{\mbox{log}}}}_{2}\left(\frac{{\rho }_{C}}{1{\rho }_{C}}\right)\). The logodds gives a natural interpretation of the strength of evidence gained by adding model complexity to a probabilistic model^{52}. A unit increase in logodds corresponds to an additional bit of evidence. This is roughly interpreted as observing 2:1 evidence in support of the model’s hypothesis, i.e. that it correctly predicts crack nucleation sites. In this context, p(C ∣ ξ) adds about 6.5 bits of evidence over the naive p(C), by requiring that crack nucleation occurs at material volumes in a parallel slip configuration. Furthermore, \({{{{\mathcal{M}}}}}_{{{\mbox{r}}}}=p(C\, \,\xi ,{{{{\bf{x}}}}}_{{{\mbox{r}}}})\) adds about 1.5 bits of evidence on top of p(C ∣ ξ), by incorporating the top three state variables in the posterior PDF. The model therefore greatly improves the probability of locating a rare crack nucleation event, by about 8 bits relative to the original prior from the data.
From a mechanistic point of view, an interpretation is offered for the model’s emphasis on σ_{VM}, \(\dot{\gamma }\), and ψ_{p} in terms of the experimental evidence supporting long, thin cracks in PSB regions at parallel slip configurations. The von Mises stress, which is a norm of the deviatoric stress, typically represents a driving force for the volumepreserving shear deformation such as plasticity and dislocation motion. However, since the material volume is already in a parallel slip configuration, a high directionally independent measure of shear stress, such as σ_{VM}, indicates that dislocation slip is likely to occur on multiple slip systems, rather than focusing on only the parallel slip plane. This importance of directionality is further evidenced by a large \(\dot{\gamma }\) measure, also acting as a crackdelineating state variable. The combination of the low σ_{VM} and high \(\dot{\gamma }\) is an environment that promotes dislocation motion along the primary slip plane, which is the parallel slip plane in a parallel slip configuration. It represses slip along with other secondary slip systems that interferes with the main deformation mode. The plastic defect energy as the third major indicative variable of crack nucleation is linked to the buildup of dislocation substructures that are typical of persistent slip band (PSB) regions. Regions of large ψ_{p} store more external work into complex and organized arrays of tangled dislocations, rather than dissipating it entirely as plastic work.
After constructing the full posterior distribution p(C ∣ x_{r}), probability fields are overlaid on the microstructures to determine the most likely crack nucleation sites. A sample polycrystalline microstructure is shown in Fig. 5 with the corresponding orientation map from the EBSD scan. The probability contours by the Bayesian model concur with the experimental crack nucleation sites, as probable locations for observing a crack.
Crossvalidation
The predictive capacity of the probabilistic model is assessed in this section. Delineation of the training and test data allows for a comparison of the performance of the methodology in terms of both training error and generalization error. Model overfitting is a concern when the only efficacy measure of a model is on the data it is trained on. Therefore, a “leavetwoout” crossvalidation approach is used to ensure the crack nucleation model generalizes onto unseen data. The leavetwoout method is a standard, exhaustive crossvalidation approach, in which the model is trained on every available set of data except for two microstructures. The model is then tested/validated on the two microstructures that are left out from the training set. This process is repeated, leaving two different data sets out each time, and retraining the model on the remaining ten microstructures. This approach is one of the best for small data sets and for training a computationally expensive model, as it increases the amount of training data available for each validation evaluation.
For the case of M = 12 microstructures, there are M(M − 1)/2 = 66 ways to leave two unique data sets out and train on the remaining ten. Therefore, the model is trained and tested 66 times for all combinations. The geometric mean of correct crack nucleation prediction ρ_{C} is used as a measure of the effectiveness of the model in both training and validation. An additional related measure is defined as:
where \({N}_{\bar{C}}\) is the total number of uncracked material volumes across all microstructures of interest, and \({w}_{\bar{C}}^{(i)}\) are the normalized volumes for each. This measure is the geometric mean of the probability that the model correctly labels every known uncracked material volume of the observed data set.
The arithmetic mean and geometric mean of ρ_{C} and \({\rho }_{\bar{C}}\) over all 66 cases are given in Table 1 for both the training and testing data sets, by employing the leavetwoout method. The resulting evaluation measures indicate that the probabilistic model yields similar abilities for both the training and testing sets. Only the geometric mean for the validation cracked cases is slightly lower than that of the test case. This deviation is due to the geometric mean’s sensitivity to variance and bias towards low values, which are higher in the testing examples. Overall, these results demonstrate that the model generalizes well, and is predictive on new data.
The predictive behavior of the model is further emphasized by plotting the posterior distribution over a grain structure for one of the 66 leavetwoout cases. Fig. 1 in the Supplementary Methods Section shows the probability field and the corresponding EBSD map over one of the two cases that are not included in the training. The probability field illustrates likely locations for crack nucleation, and in particular, highlights the locations near the experimental crack nucleation site. These regions are comprised of material volumes in a parallel slip configuration with high propensity for primary slip and have already undergone substantial defect buildup.
Discussion
The probabilistic crack nucleation model consisting of a validated threedimensional posterior PDF can be used to create a simple, heuristic scalar indicator model of crack nucleation probability. A onedimensional functional form is constructed to become a separator of the cracked and uncracked regions. The dimensionless crack nucleation indicator is defined at material volumes in a parallel slip configuration (Ξ = ξ) as:
where \(\dot{\gamma }\) is the sliprate, \(\dot{{\gamma }_{0}}=5\times 1{0}^{7}\) is the reference sliprate, ψ_{p} is the plastic defect energy, and κ_{1}, κ_{2} are parameters to be determined. The crack nucleation indicator incorporates the basic mechanistic ideas discussed in the previous section. The first term of Ψ_{c} corresponds to a state of material which heavily favors parallel slip along a boundary rather than homogeneous plastic deformation across multiple slip systems. The numerator \(\log \left(\frac{\dot{\gamma }}{\dot{{\gamma }_{0}}}\right)\) is approximately proportional to the effective RSS driving slip through thermal activation, and the denominator \({\sigma }_{\,{{\mbox{VM}}}\,}^{2}\) is proportional to the elastic distortion energy. The second term of Ψ, often used in the Helmholtz energy representation of phasefield crack models^{53}, indicates that the material has a tendency to build up dislocation substructures to promote the onset of persistent slip band formation. The proposed crack nucleation indicator encompasses and combines features of fatigue indicator parameters proposed in the literature, such as the emphasis on the plane of maximum plastic slip in^{24} and the significance of the stored plastic defect energy in^{23}.
The two parameters κ_{1} and κ_{2} of Ψ_{c} are linear coefficients with respect to the inputs of the function. Correspondingly they can be determined using linear discriminant analysis (LDA). LDA is a classification method that finds the linear combination of inputs that best separates two distinct distributions, to find a reduced order model. The application of LDA to the crack nucleation data yields explicit values for the linear coefficients of Ψ_{c}, i.e. κ_{1} = − 0.28 ⋅ 10^{−18} Pa^{−2} and \({\kappa }_{2}=0.26\cdot 1{0}^{6}{\left(\frac{{{\mbox{J}}}}{{{{\mbox{m}}}}^{3}}\right)}^{1}\). This process results in a single scalar value for crack nucleation prediction and is used as a sole indicator in the probabilistic model p(C ∣ x), where x = [Ψ_{c}].
Visual analysis of the likelihood distributions of Ψ_{c} in Fig. 6a establishes it as a useful separator of the cracked and uncracked experimental data. Furthermore, the model in Eq. 3 is assessed similar to the previous posteriors by calculating ρ_{C}. Applying the cracked data to the model p(C ∣ Ψ_{c}) yields ρ_{C} = 11.8%, which is superior to building a model with any of the other previously evaluated state variables individually. Consequently, the crack nucleation indicator, which embodies the mechanistic motivation of parallel slip within a persistent slip band, is a consistent measure for identifying cracked regions of the microstructures.
The contour plots of the crack nucleation indicator field are shown for four polycrystalline microstructures in Fig. 2 of the Supplementary Methods Section. This enables a visual comparison to assess its performance with respect to predicting crack nucleation sites. Qualitatively, large values of Ψ_{c} highlight areas associated with the experimentally identified nucleation sites for each case.
A final study is performed to quantify the sensitivity of the scalarvalued probabilistic model development to the inclusion of new data. The 66 crossvalidation cases are revisited for this analysis. For each case, the posterior model is constructed by training with data from ten microstructures, using only the data from the crack nucleation indicator Ψ_{c}. The posterior p(C∣Ψ_{c}) is generated and plotted as a function of Ψ_{c} for each case. In Fig. 6b, the 66 models are aggregated into an average model shown with the solid dark line, and standard deviation bounds depicted with dotted lines. This result demonstrates that most models lie within about 5–10% of the average model, and therefore are robust and relatively insensitive to the choice of training data set.
In summary, a robust Bayesian inferencebased probabilistic predictive model for crack nucleation in the Nibased superalloy René 88DT under fatigue loading is developed in this paper. The process fuses several statistical and datadriven methods operating on micromechanical and experimental data with important mechanistic observations. The model systematically reduces a large set of experimentally acquired microstructural data and corresponding crystal plasticitybased micromechanical simulation results to a 3D posterior probability density function (PDF) for crack nucleation. Three hierarchically important state variables in the microstructural material volume, relevant to the crack nucleation process, are identified by the framework. They are: (i) the von Mises stress, (ii) the maximum plastic slip rate, and (ii) the plastic defect energy. This datadriven selection process is performed in an automated fashion without the need for external intervention. It follows a statistically unbiased approach that makes an informed decision while accounting for various experimental microstructures and mechanical simulations.
Given a set of EBSD scans along with the associated preidentified crack nucleation sites as input, the procedure ranging from the processing of experimental EBSD and SEM data, multiscale simulations for creating a reference database, to the development of the probabilistic crack nucleation model is implemented through an automated computational pipeline. It comprises four sequential modules given below.

1.
Collection of postfatigue loading local microstructural data, and subsequent cleaning and processing of EBSD data;

2.
Development of a concurrent multiscale model with the imagebased 3D microstructure embedded in a homogenized material modeled by an upscaled, anisotropic continuum plasticity constitutive model;

3.
Multiscale simulations with different experimental microstructures to create a database of micromechanical state variables relevant to the crack nucleation process;

4.
Formulation of a Bayesian crack nucleation model that identifies crack nucleation sites from the database of local state variable fields.
The probabilistic models for indicating crack nucleation are based on the core concept of parallel slip at susceptible coherent twin boundaries. The material near these boundaries possesses particular crystallographic configurations that promote an environment for recurring dislocation slip parallel to the boundary. State variables, such as the von Mises stress, the maximum plastic slip rate, and the plastic defect energy in this region, have a strong effect on this mechanism. Slip is driven on the primary plane and builds up dislocation substructures like PSBs that further promote future slip along the same path. Simultaneously, little driving force exists for forest dislocations to impede obstacles on the primary plane. The combination of these events creates a probable setting for the incidence of crack nucleation.
The multidimensional Bayesian inference model is constructed to capture this sequence of mechanistic events. It is validated for various microstructures and probabilistic measures, using the “leavetwoout” crossvalidation process. Finally, a simple crack nucleation indicator is formulated to integrate the parallel slip concepts into a single scalar model. This composite indicator outperforms other individual state variables in differentiating between cracked and uncracked material volumes. In conclusion, this paper provides a robust framework for employing sophisticated data analysis techniques in conjunction with multiscale simulations to investigate the mechanisms underlying complex physical phenomena, and produces interpretable results. The Bayesian crack nucleation model should be valid for different microstructural compositions and loading conditions, as long as the main mechanisms of deformation such as plastic slip, dislocation glide, etc., remain the same.
The overall modeling framework developed in this paper is not limited to only superalloys considered in this study. Most aspects of the framework are transferable to other polycrystalline alloys for which, state variable evolution corresponding to parallel slip and PSB formation govern their fatigue behavior. A majority of steps in the automated framework, including the embedding methodology, finite element meshing, state variable calculation, probabilistic model procedure, and the reduced model selection technique, are material agnostic within the realm of metallic materials. Transferring the microstructure to deformation behavior and fatigue platform to other metallic materials will require the corresponding experimental data, viz. EBSD scans and crack nucleation sites, a calibrated crystal plasticity constitutive model, and the associated upscaled anisotropic continuum plasticity model. Additionally, the probabilistic classification framework has many other potential applications in datarich and knowledgepoor circumstances.
For componentscale analysis, it is necessary to introduce a detailed parametrically upscaled constitutive model (PUCM) and associated parametrically upscaled crack nucleation model (PUCNM) developed in^{39,40,41,42,43}. The thermodynamically consistent PUCMs, incorporate a parametric representation of lowerscale microstructural descriptors in higherscale constitutive coefficients. The PUCM coefficients are expressed as functions of representative aggregated microstructural parameters (RAMPs), representing lowerscale descriptors of microstructural morphology and crystallography. While the upscaled anisotropic continuum plasticity model developed for the concurrent model is a step in this direction, it does not have the parametric forms in terms of the RAMPs. The PUCM, in conjunction with the PUCNM, readily performs componentscale analysis with direct connection to the microstructure. The models developed in this study can be used for SERVE analysis to develop the PUCM and PUCNM for the superalloy considered for this study.
Methods
Concurrent multiscale model for fatigue analysis
The concurrent multiscale model in Fig. 7 consists of a polycrystalline microstructural RVE that is embedded in an exterior computational domain. The RVE and the exterior domain are modeled by a crystal plasticity constitutive model^{34,35,36} and a ratedependent homogenized elastoplasticity constitutive model respectively. Cyclic loads representing fatigue loading are applied on the outer surface of the homogenized exterior domain. The constitutive relations used in the concurrent multiscale model are summarized in the Supplementary Methods Section of this paper.
Concurrent multiscale model with embedded microstructural RVE
The application of periodicity or other boundary conditions directly on the microstructural RVE domain can result in significant inaccuracies due to improper representation of the domain and consequent boundary effects^{32,33,54}. The concept of concurrent multiscale domains, in which the RVE is embedded in a homogenized exterior domain, has been successfully introduced in^{43,55,56} to alleviate this discrepancy. Boundary conditions are applied on the exterior boundary of the concurrent domain, while the RVE shares a common interface with the homogenized domain. Similar concurrent approaches have also been proposed for coupled continuumpolycrystalline domains in^{29,57}.
The embedded RVE in Fig. 7 is generated by directly mapping the 2D EBSD micrograph of a polycrystalline microstructure shown in Fig. 2, and extruding it in the orthogonal direction to a suitable depth. Each section of the RVE contains about 50–200 grains. Previous studies on Nibased superalloys in^{34,54,58} have demonstrated the effectiveness of microstructural RVEs of this size. Some of the EBSD micrographs containing cracked regions are not fully supported by first and second nearest neighbor grains. While this may lead to some error in the simulation results due to limited grain representation in RVE models, the relatively lower strains levels may not significantly alter the response fields as noted in^{59}. The homogenized exterior medium in the concurrent model may also mitigate the surrounding boundary effect due to influence from neighboring grains. Furthermore, the final posterior model is constructed using multiple experimental microstructures and testing. With increasing data, the Bayesian model improves at distinguishing between all the sources of model noise and the meaningful relation existing between crack nucleation sites and state variable fields.
For fatigue crack modeling, it is necessary that the microstructural RVEs be constructed directly from EBSD scans of the region containing cracks for onetoone correspondence with experimental observations. Statistically equivalent RVE (SERVEs), with matching morphological and crystallographic distributions, are not ideal for this onetoone correspondence. 3D microstructures may consequently be created by focused ion beam (FIB)–based serialsectioning, or synchrotronbased CT methods. However, it is challenging to obtain interrupted 3D experimental microstructures in the context of fatigue experiments. The decision to create 3D microstructural RVEs by extruding 2D EBSD scans was made since such types of data were not available for this study. In^{60}, the authors have inferred that subsurface crystallography and grain morphology, are needed to accurately predict surface strain distributions. In the present study, it is assumed the error due to the extrusion will not be significant since the thickness of the RVE is smaller than the average grain size. Additionally, the use of multiple microstructural instantiations may have a mitigating effect on the crack nucleation model.
The concurrent multiscale model can mitigate other spurious boundary effects on the evolving RVE variables. A sample microstructural RVE, shown in Fig. 7, has dimensions of 168 μm × 69 μm × 7.5 μm, while the homogenized exterior rectangular domain has dimensions of 336 μm × 138 μm × 42 μm. This corresponds to twice the length and width of the RVE and a depth of 7.5 μm plus half of the RVE width. Five of the six faces of the embedded RVE share an interface with the exterior homogenized domain, enforcing displacement compatibility across these interfaces. The remaining RVE face represents the free surface of the fatigue experiment and is unconstrained during the cyclic simulations.
The concurrent FE model is discretized into TET4 elements, generated using an automated meshing pipeline built within the Simmetrix code^{61}. This minimizes the computational and memory requirements and provides a robust discretization in terms of numerical accuracy and geometric conformity. Since the state variables near twin boundaries are key to the crack model, a graded mesh with highly refined elements near the grain/twin boundaries, and coarser elements away from these critical regions, is developed as seen in Module II of Fig. 1. The exterior domain mesh matches the RVE mesh at the interface, but coarsens away from the interface.
Creating a microstructural state variable database related to fatigue nucleation
The following steps are executed for creating the database.
Experimentally obtained training data sets for supervised learning
The Bayesian modeling approach to fatigue crack nucleation requires a data set for supervised learning, consisting of experimentally observed points for which the binary cracked/uncracked states are known. To generate this data set, the locations of crack nucleation sites are recorded for the experimental micrographs, e.g. in Fig. 2. Each TET4 element in the concurrent model, for which three nodes are on a cracked interface is marked as belonging to the set of cracked elements. All other elements in the ensemble are designated as uncracked. This labeling method classifies the data set into two distinct states for training the probabilistic model.
Extracting core microstructural state variables for the crack nucleation model
The concurrent multiscale model in Methods Section I is now developed for each of the 12 experimental microstructures and subsequently loaded under fatigue loading conditions. Analogous to the experiments in^{51}, a stresscontrolled fully reversed (R = −1) cyclic loading with a triangular waveform and 1 Hz frequency is applied on the computational model. The peak stress in this loading cycle corresponds to a 1% total strain. All state variables are recorded at the peak tensile load of each cycle. This process generates a complete database of mechanical fields for all the 12 experimental microstructures, which is used for training the Bayesian crack nucleation model.
Timehistories of state variables are calculated and recorded as quantities of interest in the development of the fatigue crack nucleation model. From a comprehensive set of mechanistic variables, including stress and strain measures, dislocation evolution indicators, fatigue indicator parameters, etc., a set of ten core state variables are deduced. Most of the other indicators are functions of these core parameters. The core variables are tabulated in Table 2.
The set of variables in Table 2 is designated as X. Additionally, a regularized difference operator (similar to unsharp masking) acting on each state variable is appended to the set X. Correspondingly the dimension of the set is k = dim(X) = 20. The values of X are assembled into a state variable database for use in the training and testing of the probabilistic model.
Observations on parallel slip configuration from experiments
Observations from fatigue experiments in^{9} suggest that dislocation slip parallel to the coherent twin boundaries results in persistent slip band (PSB) and thin shear band formation due to strain localization. Adjacent to the twin boundary, plasticity occurs due to dislocation glide on slip planes parallel to the twin boundary. The repeated passing of these dislocations and migration of dislocations to this region, shears the \(\gamma ^{\prime}\) precipitates, promoting additional plasticity by reducing the barrier to subsequent dislocation motion. It has been observed that these highly localized regions of plasticity nucleate cracks, several nanometers offset from the actual twin boundary.
Some twin boundaries are identified by grains that are oriented such that their slip systems favor dislocation motion along their shared boundary plane. This special grainpair configuration is called a parallel slip configuration. It is characterized by a grain pair separated by a coherent twin boundary, where the resolved shear stress on one of the shared slip systems is significantly higher than that of the other slip systems. This configuration results in primary slip along that single shared plane and no new dislocation migration occurs from other planes to the slipped region. The twin boundary lengths are also relatively long and straight compared to other grain boundaries. Consequently, this geometric effect and the lack of outofplane dislocation motion leaves the shared slip system free of obstacles, which allows dislocations to glide unimpeded for long distances. In cyclic deformation, uninterrupted localized slip promotes the early stages of PSB formation.
Indicator of dominant parallel slip
Since parallel slip configurations are identified as precursors to crack nucleation, it is important to develop an indicator for dominant parallel slip. This indicator should identify coherent twin boundaries and subsequently search for grainpairs, oriented such that the resolved shear stress favors slip along the shared boundary plane. The following procedure is adopted to account for these conditions and lead up to a definition for the parallel slip indicator.
The criteria used to determine if two grains are in a Σ3 twinned configuration is based on two conditions. The first condition is that the misorientation angle must be 60^{∘} ± 2^{∘}, and the second requires that the absolute value of the inner product between the shared slip plane normals is >0.985. For each grain pair, the shared slip plane normal is defined by the unit vector average of the two slip plane normals, whose inner product has the largest absolute value. Specifically, these vectors are determined by the following process.
For two adjacent grains g_{1} and g_{2} in the microstructure, the two respective slip planes that most closely share the same normal are selected according to the criterion:
where l and k enumerate the four possible slip planes in the 〈111〉 family. Correspondingly, the shared slip plane normal for the grains g_{1} and g_{2} are defined as:
where \({{{{\bf{n}}}}}_{l* }^{{g}_{1}}\) is the normal to the l^{*} slip plane of grain g_{1} and \({{{{\bf{n}}}}}_{l* }^{{g}_{2}}\) is the normal to the k^{*} slip plane of grain g_{2}.
The existence of a shared plane allows for a partition of the standard octahedral slip systems of FCC materials. Let \({{{\mathcal{A}}}}\) denote the set of all its slip systems for a grain that shares a slip system with an adjacent grain. Also, let \({{{{\mathcal{A}}}}}_{{{\mbox{p}}}}\) be the set of slip systems that contain the shared slip plane characterized by \({{{{\bf{n}}}}}_{\,{{\mbox{p}}}\,}^{{g}_{1},{g}_{2}}\) in Eq. 5. Finally, let \({{{{\mathcal{A}}}}}_{{{\mbox{o}}}}\) be the complement of \({{{{\mathcal{A}}}}}_{{{\mbox{p}}}}\) in \({{{\mathcal{A}}}}\), i.e. the complementary set of slip systems that do not contain the shared slip plane.
Two special resolved shear stresses (RSS) are defined to determine when parallel slip is dominant for a given twin boundary. The maximum RSS is given as \({\tau }_{* }=\mathop{\max }\limits_{\alpha \in {{{\mathcal{A}}}}}\left( {{{\bf{M}}}}:{{{{\bf{s}}}}}_{0}^{\alpha } \right)\), where \({{{{\bf{s}}}}}_{0}^{\alpha }={{{{\bf{m}}}}}_{0}^{\alpha }\otimes {{{{\bf{n}}}}}_{0}^{\alpha }\) is the Schmid tensor for the α slip system, \({{{{\bf{m}}}}}_{0}^{\alpha }\) is the slip direction, and \({{{{\bf{n}}}}}_{0}^{\alpha }\) is the slip plane normal in the reference configuration. Correspondingly, the parallel slip RSS and the outofplane RSS are respectively defined as:
The parallel slip RSS is the maximum RSS on the shared slip plane, while the outofplane RSS is the maximum over the remaining three slip planes. When τ_{p} > τ_{o}, primary slip occurs along the twin boundary, and outofplane dislocations do not disrupt this plastic flow. Therefore, parallel slip dominates the deformation process. This condition is combined with the coherent twin and a plastic flow requirement to define a binary parallel slip indicator Ξ as \(\xi \,or\,\bar{\xi }\) for a given material volume. The parallel slip indicator is activated with Ξ = ξ, if all of the following three requirements are met simultaneously:

1.
The material volume is located on a coherent twin boundary;

2.
The parallel slip plane is dominant, i.e. τ_{p} > τ_{o};

3.
The maximum plastic sliprate exceeds a minimum threshold, i.e. \(\dot{\gamma } > {\dot{\gamma }}_{{{\mbox{m}}}}\), where \({\dot{\gamma }}_{{{\mbox{m}}}}\) is a minimum threshold sliprate.
The threshold sliprate is selected as \({\dot{\gamma }}_{{{\mbox{m}}}}=1{0}^{5}{s}^{1}\) to indicate that the material is undergoing plasticity. If any one of the three conditions are not met, then \({{\Xi }}=\bar{\xi }\). The parallel slip indicator is designed to signal timeevolving, pointwise occurrence of parallel slip throughout the microstructure. It is analogous to the timeindependent, grainwisehomogeneous parallel slip configuration definition in^{1,8,9,15}. The importance of Ξ is that it acts as a sufficiency condition for crack nucleation within the experimental data set of microstructures. This empirical data is exploited in the probabilistic model framework.
Formulation of the Bayesian model for classifying crack nucleation
Let a discrete random variable Θ represent the possible observation of a crack event in a fixed, reference material volume V_{*}, and let a continuous kdimensional random vector X denote the set of core state variables pertaining to that particular region. The entire microstructural field is classified by one of two distinct states, viz. (i) with a crack event Θ = C, and (ii) without a crack event \({{\Theta }}=\bar{C}\). Additionally, let the previously defined parallel slip indicator Ξ in the microstructure be a discrete random variable with two possible states, viz. (i) Ξ = ξ indicating a parallel slip configuration, and (ii) \({{\Xi }}=\bar{\xi }\) otherwise. In accordance with experimental observations, it is assumed that all cracked states C require that a material volume is in a parallel slip configuration Ξ = ξ. Therefore, the probability of observing a crack nucleation event C in the microstructure with given state variables x, homogeneous over some small volume V_{*}, is given by Bayes’ theorem. For the case of Ξ = ξ, this is stated as:
where p_{Θ∣X}(C ∣ x) and p_{Θ ∣ Ξ,X}(C ∣ ξ, x) are both posterior probability density functions (PDF’s), p_{X ∣ Θ,Ξ}(x ∣ C, ξ) is a likelihood function, p_{Θ ∣ Ξ}(C ∣ ξ) is the prior belief of observing a crack nucleation event, and p_{X ∣ Ξ}(x ∣ ξ) is the marginal state variable PDF, which normalizes the expression. For the complementary event \({{\Xi }}=\bar{\xi }\), the posterior PDF p_{Θ∣X}(C ∣ x) = 0. Hereafter, the subscripts on the PDF’s will be omitted to eliminate notational clutter, but may be used occasionally for providing beneficial clarity. Applying the law of total probability to Eq. 7 yields:
The posterior PDF p(C ∣ x) is the primary quantity of interest to be determined for the crack nucleation model. From Eq. 8, it is seen that four PDF’s must be calculated to fully characterize this posterior, viz. p(x ∣ C, ξ), \(p({{{\bf{x}}}}\, \,\bar{C},\xi )\), p(C ∣ ξ), and \(p(\bar{C}\, \,\xi )\). Each of these distributions is systematically constructed from the experimental data set in the following sections.
Determination of the prior distributions \(p({C}\, \,\xi )\) and \(p(\bar{C}\, \,\xi )\)
The prior belief that a crack event occurs in a material volume is constructed by calculating the ratio of cracked volume to total volume over all experimental microstructures. For the available experiments, let M = 12 denote the number of independent microstructures in the data set, and let \({V}_{C}^{(i)}\) and \({V}_{\,{{\mbox{tot}}}\,}^{(i)}\) be respectively the total cracked volume and total overall volume of the ith microstructure, where Ξ = ξ. The crack nucleation prior is correspondingly defined as:
The uncracked prior is constrained by the complementary event, and is therefore determined as \(P(\bar{C}\, \,\xi )=1P(C\, \,\xi )\). Additionally, let the reference volume for the model be defined as:
which corresponds to the average crack volume over all microstructures.
Determination of the likelihood distributions \(p({{{\bf{x}}}}\, \,{C},\xi )\) and \(p({{{\bf{x}}}}\, \,\bar{C},\xi )\)
In general, the likelihood PDF’s, p(x ∣ C, ξ) and \(p({{{\bf{x}}}}\, \,\bar{C},\xi )\), are highdimensional, highly covariant joint distributions. The Nataf transformation method^{62,63,64} is used in this work to construct explicit functional representations of both p(x ∣ C, ξ) and \(p({{{\bf{x}}}}\, \,\bar{C},\xi )\). It has been shown through Sklar’s theorem that a multivariate joint distribution can be decomposed into a composition of componentwise univariate marginal distributions and a multivariate copula, which retains the covariance structure of the random vector. For the Nataf transformation, the joint distribution is approximated using a Gaussian copula to permit a straightforward analytical computation. The following construction is formulated in the context of p(x ∣ C, ξ). However for \(p({{{\bf{x}}}}\, \,\bar{C},\xi )\), analogous steps are performed by substituting \(\bar{C}\) for C.
The Nataf transformation is comprised of three sequential transformations. First, an observed data point x = [x_{1}, x_{2}, . . . ,x_{k}] of the random set X of state variables is transformed componentwise into a vector of uniformly distributed random variables u = [u_{1}, u_{2}, . . , u_{k}], where
\({F}_{{X}_{i} {{\Theta }},{{\Xi }}}\) is the marginal cumulative distribution function (CDF) for the ith component of X. Secondly, each component of u is standardized by applying the inverse standard normal CDF, specified by a mean of zero and a variance of one. Let the standardized vector z = [z_{1}, z_{2}, . . . , z_{k}] be characterized by this transformation as:
Finally, a kdimensional normal distribution is fit to the observations in the standardized space. This sequential application of the ingredients of the Nataf transformation is summarized as:
where the joint CDF F(x ∣ C, ξ) corresponds to the likelihood p(x ∣ C, ξ) and Φ_{k} is the kdimensional multivariate normal CDF. Differentiating Eq. 13 yields:
where \({p}_{{X}_{i} {{\Theta }}}({x}_{i}\, \,C)\) is the marginal PDF for the ith component, ϕ is the standard normal distribution, and p(z ∣ C) is a multivariate normal distribution ϕ_{k} given as:
with the covariance matrix Σ. The computation of p(x ∣ C, ξ) for the state variable data set only requires the determination of \({F}_{{X}_{i} {{\Theta }},{{\Xi }}}\), \({p}_{{X}_{i} {{\Theta }},{{\Xi }}}\), and Σ. \({F}_{{X}_{i} {{\Theta }},{{\Xi }}}\) and \({p}_{{X}_{i} {{\Theta }},{{\Xi }}}\) are represented by a regularized, weighted empirical CDF and a weighted kernel density estimation from the observed x data, respectively. The covariance matrix Σ, defined as: \({{{\mathbf{\Sigma }}}}={\mathbb{E}}[{{{\mathbf{z}}}}\otimes {{{\mathbf{z}}}}]\), is approximated by a weighted unbiased estimator of the observed data as:
where N_{e} is the total number of discretized volumes (finite elements) over all M microstructures in a state of Ξ = ξ, w^{(i)} are observation weights corresponding to the normalized material volume of each point such that \(\mathop{\sum }\nolimits_{i}^{{N}_{{{\mbox{e}}}}}{w}^{(i)}=1\), and \(\omega =\mathop{\sum }\nolimits_{i}^{{N}_{{{\mbox{e}}}}}{w}^{(i)2}\) is the sum of squares of the observation weights.
An example of the action of the Nataf transformation for the uncracked likelihood PDF \(p({{{\bf{x}}}}\, \,\bar{C},\xi )\) is shown in Fig. 8a for two representative state variables. The von Mises stress and maximum plastic sliprate data points \({{{\bf{x}}}}=[{\sigma }_{{{\mbox{VM}}}},\dot{\gamma }]\), are depicted in the original X space, transformed into the uniform U space, and then transformed again into the standardized Z space. Additionally, grid lines are added and transformed accordingly to illustrate the componentwise warping of each observation space.
Figure 8b demonstrates the application of the model described in Eq. 15 to determine a bivariate normal distribution as an approximation for the standardized observations z. This approximation is achieved through the computation of Σ, using Eq. 16. It is seen that the preservation of the isoprobabilistic property of the transformation and of the covariance structure are determined by how well the standard bivariate normal fits the data in the standardized Z space, shown in Fig. 8b. The original joint PDF, \(p({{{\bf{x}}}}\, \,\bar{C},\xi )\), is recovered in the X space by applying Eq. 14. The outcome, shown in Fig. 8c, is an excellent model for a joint distribution, which would typically be complicated to analytically represent and infer from the data.
Data availability
The input data sets for the training and testing of the Bayesian model will be available from the corresponding author upon reasonable request.
Code availability
The codes to train and test the Bayesian model will be available from the corresponding author in a suitable format upon reasonable request.
References
Miao, J., Pollock, T. & Jones, W. Crystallographic fatigue crack initiation in nickelbased superalloy René 88DT at elevated temperature. Acta Mater. 57, 5964–5974 (2009).
McDowell, D. Viscoplasticity of heterogeneous metallic materials. Mat. Sci. Eng. R. Rep. 62, 67–123 (2008).
Stein, C., Lee, S. & Rollett, A. An analysis of fatigue crack initiation using 2D orientation mapping and fullfield simulation of elastic stress response. Superalloys 2012, 439–444 (2012).
Mineur, M., Villechaise, P. & Mendez, J. Influence of the crystalline texture on the fatigue behavior of a 316L austenitic stainless steel. Mat. Sci. Eng. A 286, 257–268 (2000).
Chen, Q. et al. Small crack behavior and fracture of Nickelbased superalloy under ultrasonic fatigue. Int J. Fatigue 27, 1227–1232 (2005).
Thompson, N., Wadsworth, N. & Louat, N. The origin of fatigue fracture in copper. Philos. Mag. 1, 113–126 (1956).
Boettner, R., McEvily Jr, A. & Liu, Y. On the formation of fatigue cracks at twin boundaries. Philos. Mag. 10, 95–106 (1964).
Miao, J., Pollock, T. & Jones, W. Microstructural extremes and the transition from fatigue crack initiation to small crack growth in a polycrystalline nickelbase superalloy. Acta Mater. 60, 2840–2854 (2012).
Stinville, J., Lenthe, W., Miao, J. & Pollock, T. A combined grain scale elastic–plastic criterion for identification of fatigue crack initiation sites in a twin containing polycrystalline nickelbase superalloy. Acta Mater. 103, 461–473 (2016).
Alam, Z., Eastman, D., Weber, G., Ghosh, S. & Hemker, K. Microstructural aspects of fatigue crack initiation and short crack growth in René 88DT. Superalloys 2016, 561–568 (2016).
Heinz, A. & Neumann, P. Crack initiation during high cycle fatigue of an austenitic steel. Acta Met. Mater. 38, 1933–1940 (1990).
Wong, S. & Dawson, P. Influence of directional strengthtostiffness on the elastic–plastic transition of fcc polycrystals under uniaxial tensile loading. Acta Mater. 58, 1658–1678 (2010).
Stinville, J. et al. Subgrain scale digital image correlation by electron microscopy for polycrystalline materials during elastic and plastic deformation. Exp. Mech. 56, 197–216 (2016).
Stein, C. et al. Fatigue crack initiation, slip localization and twin boundaries in a nickelbased superalloy. Curr. Opin. Solid State Mater Sci.18, 244–252 (2014).
Stinville, J., Vanderesse, N., Bridier, F., Bocher, P. & Pollock, T. High resolution mapping of strain localization near twin boundaries in a nickelbased superalloy. Acta Mater. 98, 29–42 (2015).
Shyam, A. et al. Development of ultrasonic fatigue for rapid, high temperature fatigue studies in turbine engine materials. Superalloys 2004, 259–268 (2004).
Healy, J., Grabowski, L. & Beevers, C. Shortfatiguecrack growth in a nickelbase superalloy at room and elevated temperature. Int J. Fatigue 13, 133–138 (1991).
Li, K., Ashbaugh, N. & Rosenberger, A. Crystallographic initiation of nickelbase superalloy IN100 at RT and 538 ^{∘}C under low cycle fatigue conditions. Superalloys 2004, 251 (2004).
Davidson, D., Tryon, R., Oja, M., Matthews, R. & Chandran, K. Fatigue crack initiation in Waspaloy at 20^{∘}C. Met. Mater. Trans. A 38, 2214–2225 (2007).
Lukáš, P. & Kunz, L. Role of persistent slip bands in fatigue. Philos. Mag. 84, 317–330 (2004).
Rovinelli, A. et al. Predicting the 3D fatigue crack growth rate of small cracks using multimodal data via Bayesian networks: insitu experiments and crystal plasticity simulations. J. Mech. Phys. Solids 115, 208–229 (2018).
Hochhalter, J. et al. A geometric approach to modeling microstructurally small fatigue crack formation: II. Physically based modeling of microstructuredependent slip localization and actuation of the crack nucleation mechanism in AA 7075T651. Model Simul. Mater. Sci. 18, 045004 (2010).
Chen, B., Jiang, J. & Dunne, F. Is stored energy density the primary mesoscale mechanistic driver for fatigue crack nucleation? Int J. Plast. 101, 213–229 (2018).
Shenoy, M., Zhang, J. & McDowell, D. Estimating fatigue sensitivity to polycrystalline Nibase superalloy microstructures using a computational approach. Fatigue Fract. Eng. Mater Struct. 30, 889–904 (2007).
Ozturk, D., Shahba, A. & Ghosh, S. Crystal plasticity FE study of the effect of thermomechanical loading on fatigue crack nucleation in titanium alloys. Fatigue Fract. Eng. Mater Struct. 39, 752–769 (2016).
Yeratapally, S., Glavicic, M., Hardy, M. & Sangid, M. Microstructure based fatigue life prediction framework for polycrystalline nickelbase superalloys with emphasis on the role played by twin boundaries in crack initiation. Acta Mater. 107, 152–167 (2016).
Korsunsky, A., Dini, D., Dunne, F. & Walsh, M. Comparative assessment of dissipated energy and other fatigue criteria. Int J. Fatigue 29, 1990–1995 (2007).
Bandyopadhyay, R., Prithivirajan, V., Peralta, A. & Sangid, M. Microstructuresensitive critical plastic strain energy density criterion for fatigue life prediction across various loading regimes. Proc. Math. Phys. 476, 20190766 (2020).
Pierson, K., Hochhalter, J. & Spear, A. Datadriven correlation analysis between observed 3d fatiguecrack path and computed fields from highfidelity, crystalplasticity, finiteelement simulations. JOM 70, 1159–1167 (2018).
Sankararaman, A., Ling, Y. & Mahadevan, S. Uncertainty quantification and model validation of fatigue crack growth prediction. Eng. Fract. Mech. 78, 1487–1504 (2011).
Yeratapally, S., Glavicic, M., Argyrakis, C. & Sangid, M. Bayesian uncertainty quantification and propagation for validation of a microstructure sensitive model for prediction of fatigue crack initiation. Reliab Eng. Syst. Saf. 164, 110–123 (2017).
Ghosh, S. & Kubair, D. Exterior statistics based boundary conditions for representative volume elements of elastic composites. J. Mech. Phys. Solids 95, 1–24 (2016).
Kubair, D., Pinz, M., Kollins, K., Przybyla, C. & Ghosh, S. Role of exterior statisticsbased boundary conditions for propertybased statistically equivalent RVEs of polydispersed elastic composites. J. Compos Mater. 52, 2919–2928 (2018).
Bagri, A. et al. Microstructure and propertybased statistically equivalent representative volume elements for polycrystalline Nibased superalloys containing annealing twins. Met. Mater. Trans. A 49, 5727–5744 (2018).
Weber, G. & Ghosh, S. Thermomechanical deformation evolution in polycrystalline Nibased superalloys by a hierarchical crystal plasticity model. Mater. High. Temp. 33, 401–411 (2016).
Keshavarz, S., Ghosh, S., Reid, A. & Langer, S. A nonSchmid crystal plasticity finite element approach to multiscale modeling of nickelbased superalloys. Acta Mater. 114, 106–115 (2016).
Keshavarz, S. & Ghosh, S. Multiscale crystal plasticity finite element model approach to modeling nickelbased superalloys. Acta Mater. 61, 6549–6561 (2013).
Kotha, S., Ozturk, D. & Ghosh, S. Parametrically homogenized constitutive models (PHCMs) from micromechanical crystal plasticity FE simulations. Part I: sensitivity analysis and parameter identification for Titanium alloys. Int J. Plasticity 120, 296–319 (2019).
Shen, J., Kotha, S., Noraas, R., Venkatesh, V. & Ghosh, S. Developing parametrically upscaled constitutive and crack nucleation models for the alpha/beta Ti64 alloy. Int J. Plasticity 151, 103182 (2022).
Ozturk, D., Kotha, S. & Ghosh, S. An uncertainty quantification framework for multiscale parametrically homogenized constitutive models (PHCMs) of polycrystalline Ti alloys. J. Mech. Phys. Solids 148, 104294 (2021).
Kotha, S., Ozturk, D. & Ghosh, S. Uncertaintyquantified parametrically homogenized constitutive models (UQPHCMS) for dualphase α/β titanium alloys. Npj Comp. Mater. 6, 117 (2020).
Ozturk, D., Kotha, S., Pilchak, A. & Ghosh, S. Twoway multiscaling for predicting fatigue crack nucleation in Titanium alloys using parametrically homogenized constitutive models. J. Mech. Phys. Solids 128, 181–207 (2019).
Weber, G., Pinz, M. & Ghosh, S. Machine learningaided parametrically homogenized crystal plasticity model (PHCPM) for single crystal Nibased superalloys. JOM 72, 4404–4419 (2020).
Keshavarz, S. & Ghosh, S. Hierarchical crystal plasticity FE model for nickelbased superalloys: subgrain microstructures to polycrystalline aggregates. Int J. Solids Struct. 55, 17–31 (2015).
Krueger, D., Kissinger, R. & Menzies, R. Development and introduction of a damage tolerant high temperature nickelbase disk alloy, Rene’88DT. Superalloys 1992, 277–286 (1992).
Chang, D., Krueger, D. & Sprague, R. Superalloy powder processing, properties and turbine disk applications. Superalloys 1984, 245–273 (1984).
Wlodek, S., Kelly, M. & Alden, D. The structure of Rene 88 DT. Superalloys 1996, 129–136 (1996).
Bataille, A. & Magnin, T. Surface damage accumulation in lowcycle fatigue: physical analysis and numerical modelling. Acta Met. Mater. 42, 3817–3825 (1994).
Groeber, M. & Jackson, M. DREAM.3D: a digital representation environment for the analysis of microstructure in 3D. Integr. Mater. Manuf. Innov. 3, 56–72 (2014).
Stinville, J. et al. Fatigue deformation in a polycrystalline nickel base superalloy at intermediate and high temperature: Competing failure modes. Acta Mater. 152, 16–33 (2018).
Stinville, J. et al. Microstructural statistics for fatigue crack initiation in polycrystalline nickelbase superalloys. Int J. Fract. 208, 221–240 (2017).
Jaynes, E. Probability Theory: The Logic of Science (Cambridge University Press, 2003).
Clayton, J. Nonlinear Mechanics of Crystals, Vol. 177 (Springer Science & Business Media, 2010).
Pinz, M. et al. Microstructure and property based statistically equivalent RVEs for intragranular γ γ’ microstructures of Nibased superalloys. Acta Mater. 157, 245–258 (2018).
Paquet, D., Dondeti, P. & Ghosh, S. Dualstage nested homogenization for ratedependent anisotropic elastoplasticity model of dendritic cast aluminum alloys. Int J. Plasticity 27, 1677–1701 (2011).
Ghosh, S., Bai, J. & Paquet, D. Homogenizationbased continuum plasticitydamage model for ductile failure of materials containing heterogeneities. J. Mech. Phys. Solids 57, 1017–1044 (2009).
Spear, A. et al. A method to generate conformal finiteelement meshes from 3D measurements of microstructurally small fatiguecrack propagation. Fatigue Fract. Eng. Mater Struct. 39, 737–751 (2016).
Prithivirajan, V. & Sangid, M. Examining metrics for fatigue life predictions of additively manufactured IN718 via crystal plasticity modeling including the role of simulation volume and microstructural constraints. Mat. Sci. Eng. A 783, 139312 (2020).
Turner, T. J. & Semiatin, S. L. Modeling largestrain deformation behavior and neighborhood effects during hot working of a coarsegrain nickelbase superalloy. Model Simul. Mater. Sci. 19, 065010 (2011).
Turner, T. J., Shade, P. A., Schuren, J. C. & Groeber, M. A. The influence of microstructure on surface strain distributions in a nickel microtension specimen. Model Simul. Mater. Sci. 21, 015002 (2013).
Simmetrix Inc. Simulation Modeling Suite. www.simmetrix.com (2021).
Lebrun, R. & Dutfoy, A. An innovating analysis of the Nataf transformation from the copula viewpoint. Probabilist Eng. Mech. 24, 312–320 (2009).
Lebrun, R. & Dutfoy, A. A generalization of the Nataf transformation to distributions with elliptical copula. Probabilist Eng. Mech. 24, 172–178 (2009).
Kim, H. & Shields, M. Modeling strongly nonGaussian nonstationary stochastic processes using the iterative translation approximation method and Karhunen–Loève expansion. Comput Struct. 161, 31–42 (2015).
Acknowledgements
This work has been supported through Grant No. CMMI1825115 from the National Science Foundation awarded by the Mechanics of Materials and Structures (MOMS) Program (Program Manager: Dr. Siddiq Qidwai). Early stages of this work was supported through a grant No. FA95501210445 to the Center of Excellence on Integrated Materials Modeling (CEIMM) at Johns Hopkins University awarded by the AFOSR/RSL Computational Mathematics Program (Managers Dr. Fariba Fahroo and Dr. A. Sayir). Computing support by the Maryland Advanced Research Computing Center (MARCC) is gratefully acknowledged.
Author information
Authors and Affiliations
Contributions
M.P. and G.W. have developed the Bayesian framework, homogenized model, and have completed all of the simulation results of the paper, as well as writing a substantial portion. M.P. and G.W. wish to be represented as CoFirst authors. J.C.S. and T.P. have contributed the experimental results, and have written the corresponding sections. S.G. has directed the study and has written a majority of the paper.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Pinz, M., Weber, G., Stinville, J.C. et al. Datadriven Bayesian modelbased prediction of fatigue crack nucleation in Nibased superalloys. npj Comput Mater 8, 39 (2022). https://doi.org/10.1038/s41524022007275
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41524022007275