Introduction

In material science and engineering, material modeling is of central importance for gaining insight into the interplay between material properties, microstructure, and behavior essential for material optimization and design. In the case of polycrystalline structural materials (e.g., steels), for example, a common approach to the modeling of their mechanical behavior is the numerical solution of initial-boundary-value problems (IBVPs) for mechanical equilibrium, for example, via spectral or finite-element methods (e.g., refs. 1,2,3,4,5,6,7). Unfortunately, such methods are computationally quite expensive, especially at high resolution or “fidelity” (e.g., refs. 8,9,10). As such, alternative approaches to the modeling of complex polycrystalline and polyphase materials and their behavior under mechanical loads are the focus of current research.

Among existing alternatives to the currently used solvers, perhaps the most prominent ones are those based on artificial neural networks (ANNs) and machine learning (ML) (e.g., ref. 11). Development of corresponding models is based in particular on the fitting of ANN parameters to data via constrained optimization, i.e., ANN training, yielding a trained ANN (tANN). Except for a few early works (e.g., refs. 12,13), most such “surrogate” models based on tANNs have been introduced in the last 4 years (e.g., refs. 14,15,16,17,18). The data employed for training and testing can be experimental, empirical, or synthetic in nature. Examples of the latter are results from the numerical solution of IBVPs based on physical models. In the “data-driven” case, training is based on such data alone. Going beyond this, one can employ physical relations (e.g., constitutive relations) on which the data are based as additional training constraints. In the case of physics-informed neural networks (PINNs) (e.g., ref. 19), the data are the initial and boundary conditions of an IBVP, and the ANN is used to approximate a least-squares-based (numerical) solution of the IBVP (e.g., ref. 20). Recent reviews of applications in the field of continuum mechanics and material modeling include, for example, ref. 21. In the current work, attention is focused on the data-driven approach. The data are obtained from the numerical solution of a boundary-value problem (BVP) for quasi-static mechanical equilibrium based on viscoplastic material modeling of grains in a heterogeneous polycrystalline ensemble.

A number of data-driven approaches have been proposed for applications in solid mechanics. For example, Yang et al.22 trained a conditional generative adversarial network (cGAN) to reproduce stress and strain fields in strained isotropic elastic two-phase composites. Mianroodi et al.23 trained a U-Net-based convolutional neural network (CNN) using results for the von Mises stress for grain microstructures consisting of isotropic elastic and ideal elastoplastic grains subject to uniaxial extension. More recently, Rashid et al.24 introduced a neural-operator-based approach, the Fourier Neural Operator (FNO)25,26, in particular for the surrogate modeling of stress and strain in heterogeneous composites. The U-Net-based CNN and FNO-based ANN have been compared recently by Kapoor et al.27 in the context of surrogate modeling of stress fields in heterogeneous elastoplastic solids.

In the current work, a U-Net-based CNN is trained to output the von Mises stress field σvM in heterogeneous periodic microstructures consisting of inelastic grains subject to uniaxial extension. More specifically, the constitutive behavior of each grain is modeled via J2 elasto-viscoplasticity with linear isotropic hardening. Results from the numerical solution of BVPs for quasi-static mechanical equilibrium in periodic unit cells based on this grain behavior and spectral numerical solution methods (e.g., refs. 1,2,4,7) are employed to train the CNN. For brevity, this is referred to as the “reference model” (RM) in this work. In the “Results” section, results for σvM and its average \(\overline{{\sigma }_{{{{\rm{vM}}}}}}\) over the unit cell U of the grain microstructure obtained from the trained CNN (tCNN) are presented and compared with corresponding results from the RM. In particular, these include the dependence of the tCNN on details of the training dataset and training such as (i) the number of grains in the microstructure, and (ii) the range of material properties chosen for each grain. In addition, the performance of the tCNN for (i) microstructure morphologies (e.g., matrix-inclusion), and (ii) extension levels, not included in the training dataset, is also investigated and discussed. The conclusions and outlook are presented in the “Discussion” section. Methods employed in this work are presented in the “Methods” section. In particular, this includes a brief summary of the model for isotropic J2 viscoplasticity with linear isotropic constitutive response assumed for each grain in the polycrystalline ensemble. Data generation, the U-Net-based network architecture employed, and network training are also explained and discussed.

Results

The purpose of this section is to compare the output of the tCNN for the von Mises stress field σvM with corresponding results from the RM for selected test cases. As explained in more detail in the “Methods” section below, the current CNN is trained with results for σvM on unit cells U with 10-grain microstructures and a resolution of 64 × 64 pixels subject to uniaxial extension. Training (80%) and validation (20%) datasets are based on data from 1000 such microstructures generated randomly. For simplicity, data used for training and testing are based on a single (quasi-static) extension rate of 1 × 10−3 s−1. Test cases here include microstructures with (i) different numbers of grains, (ii) material property contrasts in neighboring grains, and (iii) grain morphologies, which differ from those in the training and validation datasets. The tCNN is also tested for extensions larger than those in these datasets. Lastly, the computational efficiency of the tCNN in comparison to the RM is also discussed. In what follows, σRM and σtCNN represent σvM from the RM and tCNN, respectively.

Test microstructures with a different number of grains

As stated above, the training dataset is based on 10-grain microstructures. For test cases, 64 × 64 pixel unit cells U with 5 and 20 grains are considered here. Each corresponding test dataset consists of 50 microstructures. The mean absolute error (MAE) for these as based on σtCNN − σRM is compared with the MAE of the training dataset in Table 1. Note that the MAE increases with decreasing grain size (i.e., increasing number of grains in a unit cell of constant size) and the concomitant increase in the number of grain boundaries. As discussed in more detail below, these interfaces are regions of maximum error in σtCNN.
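As a minimal sketch of the error measure used here, assume that the MAE is the pixel-wise mean of |σtCNN − σRM| over a stress field. The function name and the toy values below are illustrative and are not taken from the paper.

```python
def mean_absolute_error(sigma_tcnn, sigma_rm):
    """Pixel-wise MAE between two 2D stress fields given as nested lists (MPa)."""
    if len(sigma_tcnn) != len(sigma_rm):
        raise ValueError("fields must have the same number of rows")
    total, count = 0.0, 0
    for row_pred, row_ref in zip(sigma_tcnn, sigma_rm):
        if len(row_pred) != len(row_ref):
            raise ValueError("rows must have the same length")
        for pred, ref in zip(row_pred, row_ref):
            total += abs(pred - ref)
            count += 1
    return total / count


# Toy 2 x 2 fields (illustrative values only, not results from the paper).
sigma_rm = [[100.0, 200.0], [300.0, 400.0]]
sigma_tcnn = [[101.0, 198.0], [300.0, 404.0]]
mae = mean_absolute_error(sigma_tcnn, sigma_rm)
```

For a dataset, the same average would be taken over all pixels of all microstructures and extension levels.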

Table 1 MAE for test datasets based on 5- and 20-grain microstructures and for the training dataset based on 10-grain microstructures.

Example results for 5- and 20-grain test microstructures, and for a 10-grain training/validation microstructure, are shown in Figs. 1–3.

Fig. 1: Five-grain test microstructure subject to uniaxial extension along the horizontal axis.

a Young’s modulus E, Poisson’s ratio ν, initial flow resistance ξ0, linear isotropic hardening modulus h0. b Results for σRM (above), σtCNN (middle), σtCNN − σRM (below) (all MPa), at different \({\bar{F}}_{11}\). c \(\overline{{\sigma }_{{{{\rm{RM}}}}}}\) (blue curve), \(\overline{{\sigma }_{{{{\rm{tCNN}}}}}}\) (green points), and relative error (red triangles) given by \(100\,(\overline{{\sigma }_{{{{\rm{tCNN}}}}}}-\overline{{\sigma }_{{{{\rm{RM}}}}}})/\overline{{\sigma }_{{{{\rm{RM}}}}}}\) versus \({\bar{F}}_{11}\).

Fig. 2: Ten-grain test microstructure subject to uniaxial extension along the horizontal axis.

a Young’s modulus E, Poisson’s ratio ν, initial flow resistance ξ0, linear isotropic hardening modulus h0. b Results for σRM (above), σtCNN (middle), σtCNN − σRM (below) (all MPa), at different \({\bar{F}}_{11}\). c \(\overline{{\sigma }_{{{{\rm{RM}}}}}}\) (blue curve), \(\overline{{\sigma }_{{{{\rm{tCNN}}}}}}\) (green points), and relative error (red triangles) given by \(100\,(\overline{{\sigma }_{{{{\rm{tCNN}}}}}}-\overline{{\sigma }_{{{{\rm{RM}}}}}})/\overline{{\sigma }_{{{{\rm{RM}}}}}}\) versus \({\bar{F}}_{11}\).

Fig. 3: Twenty-grain test microstructure subject to uniaxial extension along the horizontal axis.

a Young’s modulus E, Poisson’s ratio ν, initial flow resistance ξ0, linear isotropic hardening modulus h0. b Results for σRM (above), σtCNN (middle), σtCNN − σRM (below) (all MPa), at different \({\bar{F}}_{11}\). c \(\overline{{\sigma }_{{{{\rm{RM}}}}}}\) (blue curve), \(\overline{{\sigma }_{{{{\rm{tCNN}}}}}}\) (green points), and relative error (red triangles) given by \(100\,(\overline{{\sigma }_{{{{\rm{tCNN}}}}}}-\overline{{\sigma }_{{{{\rm{RM}}}}}})/\overline{{\sigma }_{{{{\rm{RM}}}}}}\) versus \({\bar{F}}_{11}\).

As mentioned above and shown in Figs. 1–3 (b, bottom), σtCNN − σRM is maximal at grain boundaries and triple junctions, where spatial variations in the stress field are largest and pixel resolution is lowest. In particular, the former is due to contrasts among material properties across grain boundaries. Maximum error at grain boundaries has also been observed in previous work (e.g., ref. 23). As can be seen in Figs. 1–3 (b, bottom), the spatial extent of the maximum value of the measure σtCNN − σRM increases with increasing \({\bar{F}}_{11}\) and decreasing average grain size (i.e., increased number of grains) in U.

Test microstructures with different material property distributions

Training and validation data are based on 10-grain microstructures in which material property values for the grains are selected randomly from the ranges in Table 3. To obtain test data, random distributions of E, ν, ξ0, h0 have been chosen from subsets of the ranges in Table 3, i.e., E ∈ [50, 300] GPa, ν ∈ [0.2, 0.4], ξ0 ∈ [50, 300] MPa, and h0 ∈ [0, 50] GPa. In particular, the four cases

$$\begin{array}{rcl}&{{{\rm{Case}}}}\,\,1&\begin{array}{lll}E&\in &[50,75]\cup [275,300],\\ {\xi }_{0}&\in &[50,75]\cup [275,300],\end{array}\quad \begin{array}{lll}\nu &\in &[0.2,0.22]\cup [0.38,0.4],\\ {h}_{0}&\in &[0,5]\cup [45,50],\end{array}\\ &{{{\rm{Case}}}}\,\,2&\begin{array}{lll}E&\in &[75,100]\cup [250,275],\\ {\xi }_{0}&\in &[75,100]\cup [250,275],\end{array}\quad \begin{array}{lll}\nu &\in &[0.22,0.24]\cup [0.36,0.38],\\ {h}_{0}&\in &[5,10]\cup [40,45],\end{array}\\ &{{{\rm{Case}}}}\,\,3&\begin{array}{lll}E&\in &[100,125]\cup [225,250],\\ {\xi }_{0}&\in &[100,125]\cup [225,250],\end{array}\quad \begin{array}{lll}\nu &\in &[0.24,0.26]\cup [0.34,0.36],\\ {h}_{0}&\in &[10,15]\cup [35,40],\end{array}\\ &{{{\rm{Case}}}}\,\,4&\begin{array}{lll}E&\in &[125,150]\cup [200,225],\\ {\xi }_{0}&\in &[125,150]\cup [200,225],\end{array}\quad \begin{array}{lll}\nu &\in &[0.26,0.28]\cup [0.32,0.34],\\ {h}_{0}&\in &[15,20]\cup [30,35],\end{array}\end{array}$$
(1)

are considered. Note the decrease in property contrast in going from Case 1 to Case 4. This is reflected in the MAEs of the corresponding test datasets shown in Table 2. Note the decrease in MAE with decreasing contrast in material properties. For example, the contrast in E is at least 215 GPa in Case 1, and 145 GPa in Case 4.
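The sub-ranges in Eq. (1) can be sampled as in the following sketch for Case 1. It assumes that each of the two sub-intervals is picked with equal probability and that values are uniform within a sub-interval; the text states only that values are chosen randomly, so both choices, as well as all names, are illustrative.

```python
import random

# Case 1 sub-ranges from Eq. (1); E and h0 in GPa, xi0 in MPa.
CASE_1 = {
    "E": [(50.0, 75.0), (275.0, 300.0)],
    "nu": [(0.20, 0.22), (0.38, 0.40)],
    "xi0": [(50.0, 75.0), (275.0, 300.0)],
    "h0": [(0.0, 5.0), (45.0, 50.0)],
}


def sample_from_union(intervals, rng):
    """Pick one sub-interval with equal probability, then sample uniformly in it."""
    low, high = rng.choice(intervals)
    return rng.uniform(low, high)


def sample_grain_properties(case, n_grains, rng):
    """Return one dictionary of property values per grain."""
    return [
        {name: sample_from_union(intervals, rng) for name, intervals in case.items()}
        for _ in range(n_grains)
    ]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
grains = sample_grain_properties(CASE_1, n_grains=10, rng=rng)
```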

Table 2 MAE for test datasets based on the material property distributions in Eq. (1).

Example results for Case 1 and Case 3 are shown in Figs. 4 and 5, respectively.

Fig. 4: Case 1 (high-contrast) test microstructure subject to uniaxial extension along the horizontal axis.

a Young’s modulus E, Poisson’s ratio ν, initial flow resistance ξ0, linear isotropic hardening modulus h0. b Results for σRM (above), σtCNN (middle), σtCNN − σRM (below) (all MPa), at different \({\bar{F}}_{11}\). c \(\overline{{\sigma }_{{{{\rm{RM}}}}}}\) (blue curve), \(\overline{{\sigma }_{{{{\rm{tCNN}}}}}}\) (green points), and relative error (red triangles) given by \(100\,(\overline{{\sigma }_{{{{\rm{tCNN}}}}}}-\overline{{\sigma }_{{{{\rm{RM}}}}}})/\overline{{\sigma }_{{{{\rm{RM}}}}}}\) versus \({\bar{F}}_{11}\).

Note the slight difference in the grain shape distributions in Figs. 4 and 5.

Fig. 5: Case 3 (low-contrast) test microstructure subject to uniaxial extension along the horizontal axis.

a Young’s modulus E, Poisson’s ratio ν, initial flow resistance ξ0, linear isotropic hardening modulus h0. b Results for σRM (above), σtCNN (middle), σtCNN − σRM (below) (all MPa), at different \({\bar{F}}_{11}\). c \(\overline{{\sigma }_{{{{\rm{RM}}}}}}\) (blue curve), \(\overline{{\sigma }_{{{{\rm{tCNN}}}}}}\) (green points), and relative error (red triangles) given by \(100\,(\overline{{\sigma }_{{{{\rm{tCNN}}}}}}-\overline{{\sigma }_{{{{\rm{RM}}}}}})/\overline{{\sigma }_{{{{\rm{RM}}}}}}\) versus \({\bar{F}}_{11}\).

As noted above, since the material properties in each grain are homogeneous, the contrast in the property distributions is also related to the contrast in material properties at grain boundaries and triple junctions. In comparison to the results in Figs. 1–3 (b, bottom) for σtCNN − σRM in the case that material property values are chosen from the full ranges in Table 3, note the increase in σtCNN − σRM in Fig. 4b (bottom: up to 100 MPa) and Fig. 5b (bottom: up to 80 MPa) due to the larger contrast, especially in the former case. This is also true for the relative error shown in Figs. 4c and 5c. These figures indicate that the results from the tCNN in Case 3 (with a maximum relative error of 2%) are more accurate than those in Case 1, in which the maximum relative error is around 5% (Fig. 4c). Training with a larger dataset in this case would be expected to reduce such errors and improve agreement with the RM.
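The relative error plotted in panel c of the figures follows the expression given in the captions, 100 (mean σtCNN − mean σRM) / mean σRM. A minimal sketch, assuming the unit-cell average is the arithmetic mean over equally sized pixels (names and toy values are illustrative):

```python
def field_mean(field):
    """Arithmetic mean over all pixels of a 2D field given as nested lists."""
    values = [value for row in field for value in row]
    return sum(values) / len(values)


def relative_error_percent(sigma_tcnn, sigma_rm):
    """Signed relative error (in %) of the unit-cell average stress."""
    mean_rm = field_mean(sigma_rm)
    return 100.0 * (field_mean(sigma_tcnn) - mean_rm) / mean_rm


# Toy 1 x 2 fields in MPa (illustrative values only).
error = relative_error_percent([[110.0, 300.0]], [[100.0, 300.0]])
```

Because this measure is signed and based on averages, local deviations of opposite sign can partly cancel, so it is not directly comparable to the pixel-wise differences in panel b.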

Table 3 Material property ranges used for generating the dataset.

Test microstructures with different morphologies

In this subsection, the tCNN is applied to grain and microstructure morphologies not included in the training dataset. These take the form of a single inclusion embedded in a matrix material. Inclusions of circular and square form are investigated. For each, 50 test results are generated based on a random choice of material properties from the ranges in Table 3. Examples of these are shown in Figs. 6 and 7.

Fig. 6: Round inclusion test microstructure subject to uniaxial extension along the horizontal axis.

a Young’s modulus E, Poisson’s ratio ν, initial flow resistance ξ0, linear isotropic hardening modulus h0. b Results for σRM (above), σtCNN (middle), σtCNN − σRM (below) (all MPa), at different \({\bar{F}}_{11}\). c \(\overline{{\sigma }_{{{{\rm{RM}}}}}}\) (blue curve), \(\overline{{\sigma }_{{{{\rm{tCNN}}}}}}\) (green points), and relative error (red triangles) given by \(100\,(\overline{{\sigma }_{{{{\rm{tCNN}}}}}}-\overline{{\sigma }_{{{{\rm{RM}}}}}})/\overline{{\sigma }_{{{{\rm{RM}}}}}}\) versus \({\bar{F}}_{11}\).

Analogous to the case of grain boundaries and triple junctions in the polycrystalline cases in Figs. 1–3 (b, bottom), σtCNN − σRM is largest at the sharp matrix–inclusion (MI) interface, where the contrast in material properties is greatest and pixel resolution is lowest.

Note the maximum in σtCNN − σRM at the MI interface in the circular case (Fig. 6b (bottom)) at \({\bar{F}}_{11}=1.001\) not present at the MI interface in the square case (Fig. 7b (bottom)). This is due to the fact that (in contrast to the latter system) the former system is still elastic at \({\bar{F}}_{11}=1.001\), as implied by the unit cell stress-deformation results in Fig. 6c and the larger initial resistance stress ξ0 of the circular inclusion and corresponding contrast at the MI interface (Fig. 6a). For comparable contrasts in material properties at the (in particular sharp) interfaces of the circular and square inclusions with the matrix, one expects the largest stress concentration at the corners of the latter, and so the largest values of σtCNN − σRM over the extension history. This is also reflected by the fact that the MAE of the test dataset for the circular inclusion case (1.42 MPa) is slightly lower than in the square inclusion case (1.58 MPa).

Fig. 7: Square inclusion test microstructure subject to uniaxial extension along the horizontal axis.

a Young’s modulus E, Poisson’s ratio ν, initial flow resistance ξ0, linear isotropic hardening modulus h0. b Results for σRM (above), σtCNN (middle), σtCNN − σRM (below) (all MPa), at different \({\bar{F}}_{11}\). c \(\overline{{\sigma }_{{{{\rm{RM}}}}}}\) (blue curve), \(\overline{{\sigma }_{{{{\rm{tCNN}}}}}}\) (green points), and relative error (red triangles) given by \(100\,(\overline{{\sigma }_{{{{\rm{tCNN}}}}}}-\overline{{\sigma }_{{{{\rm{RM}}}}}})/\overline{{\sigma }_{{{{\rm{RM}}}}}}\) versus \({\bar{F}}_{11}\).

Extension histories not in the training dataset

The training data are based on results for σRM corresponding to uniaxial extension of the unit cell up to \({\bar{F}}_{11}=1.004\), for which the maximum of σvM is typically just above 500 MPa (see, for example, Fig. 11, bottom right). Since σvM is an input to the network, the tCNN can also be applied to values of σvM larger than those in the training data, i.e., to uniaxial extension of the unit cell beyond that represented by the training data. Results for this case are shown in Fig. 8.
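Since the network maps the stress field at one step to the field at the next, extension beyond the training range amounts to applying it repeatedly. The following sketch illustrates this rollout with a hypothetical stand-in `toy_step` in place of the trained network; the function names and the toy update are assumptions for illustration only.

```python
def rollout(step, properties, sigma_initial, n_steps):
    """Apply a one-step surrogate repeatedly, feeding each output back as input."""
    history = [sigma_initial]
    sigma = sigma_initial
    for _ in range(n_steps):
        sigma = step(properties, sigma)
        history.append(sigma)
    return history


def toy_step(properties, sigma):
    """Hypothetical stand-in for the trained network: adds 10 MPa to every pixel."""
    return [[value + 10.0 for value in row] for row in sigma]


history = rollout(toy_step, properties=None, sigma_initial=[[0.0, 5.0]], n_steps=3)
```

In such a scheme each predicted field becomes the next input, so one would expect prediction errors to carry over from step to step.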

Fig. 8: Extension histories test microstructure subject to uniaxial extension along the horizontal axis.

a Young’s modulus E, Poisson’s ratio ν, initial flow resistance ξ0, linear isotropic hardening modulus h0. b Results for σRM (above), σtCNN (middle), σtCNN − σRM (below) (all MPa), at different \({\bar{F}}_{11}\). c \(\overline{{\sigma }_{{{{\rm{RM}}}}}}\) (blue curve), \(\overline{{\sigma }_{{{{\rm{tCNN}}}}}}\) (green points), and relative error (red triangles) given by \(100\,(\overline{{\sigma }_{{{{\rm{tCNN}}}}}}-\overline{{\sigma }_{{{{\rm{RM}}}}}})/\overline{{\sigma }_{{{{\rm{RM}}}}}}\) versus \({\bar{F}}_{11}\).

Note the significant increase in σtCNN − σRM in Fig. 8b (bottom) and in the corresponding relative error in Fig. 8c (red triangles) above \({\bar{F}}_{11}=1.0065\).

Figure 8c shows the corresponding stress–strain curve for further loading. According to this curve, the error increases notably for values of \(\overline{{\sigma }_{{{{\rm{vM}}}}}}\) greater than 500 MPa. One reason for this is that the training data contain very few pixels with such values. This can be seen, for example, in Fig. 11 (bottom right), which displays the distribution of pixel values for σvM at \({\bar{F}}_{11}=1.004\). As shown, only 0.084% of the pixels in the whole training dataset have a von Mises stress level above 500 MPa.
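A share of this kind can be obtained by counting pixels above a threshold over all fields in a dataset. A minimal sketch with illustrative names and toy data (the toy values do not reproduce the 0.084% quoted above):

```python
def percent_above(fields, threshold):
    """Percentage of pixels, over all fields, whose value exceeds the threshold."""
    n_above = 0
    n_total = 0
    for field in fields:
        for row in field:
            for value in row:
                n_total += 1
                if value > threshold:
                    n_above += 1
    return 100.0 * n_above / n_total


# Two toy 2 x 2 fields in MPa (illustrative values only).
toy_fields = [
    [[100.0, 520.0], [300.0, 400.0]],
    [[250.0, 480.0], [510.0, 90.0]],
]
share = percent_above(toy_fields, threshold=500.0)
```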

Computational efficiency

We consider next the time required to calculate the von Mises stress field based on the RM and the tCNN. Recall that the former is based on the numerical solution of BVPs for quasi-static mechanical equilibrium in periodic unit cells employing spectral numerical solution methods (e.g., refs. 1,2,4,7). The corresponding results for the RM are obtained using the DAMASK simulation package5 (see next section). The calculations are carried out on a single core of an Intel® Core™ i9-9900K clocked at 3.60 GHz. On this basis, the run time of DAMASK for a single training simulation, averaged over 50 training simulations, is approximately 75 s, and the corresponding run time for the tCNN is about 0.15 s. As such, the tCNN is about 500 times faster than the RM for the training case. Of course, this difference depends on the details of the training data, e.g., on the chosen resolution of 64 × 64 pixels. For finer resolutions, the difference increases significantly.
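For reference, the quoted factor follows directly from the two average run times; the symbols \(t_{\rm RM}\) and \(t_{\rm tCNN}\) are introduced here only for illustration:

$$\frac{t_{\rm RM}}{t_{\rm tCNN}}\approx \frac{75\ {\rm s}}{0.15\ {\rm s}}=500.$$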

Discussion

In this work, a U-Net-based CNN has been trained to calculate the von Mises stress field in metallic polycrystals or composites in which the mechanical behavior of the grains is modeled by J2 elasto-viscoplasticity with linear isotropic hardening. Data for training, validation, and testing were generated via numerical solution of the corresponding periodic boundary-value problems for quasi-static mechanical equilibrium on grain microstructure unit cells via spectral methods. Data sets for testing of the resulting trained CNN (tCNN) are based on the (i) number of grains, (ii) distribution of material properties, (iii) grain morphologies, and (iv) applied unit cell extension.

For the 64 × 64 pixel resolution of the microstructure employed, calculation of the von Mises stress field σvM with the resulting tCNN is up to 500 times faster than with the RM. Increasing the resolution would result in an even larger difference in computational time. On the other hand, in contrast to the RM, the accuracy of the tCNN is limited to (i) the range of the training data as well as (ii) uniaxial extension.

The current approach can be extended and further developed in a number of directions. These include for example (i) output of all components of the stress field, (ii) training for different deformation and loading conditions, or (iii) training for multiple rates of deformation and/or loading. As well, training can be based on more sophisticated “physics-informed” loss functions (accounting, for example, for mechanical equilibrium), or on more sophisticated “physics-encoded” network architectures. This represents work in progress to be reported on in the future.

Methods

RM: J2 viscoplasticity with isotropic hardening

As discussed in the introduction, data for the training and testing of the ANN are obtained from the numerical solution of a boundary-value problem for quasi-static mechanical equilibrium based on viscoplastic material modeling of grain behavior in a grain microstructure. In the current context of isothermal and quasi-static conditions, the governing relations include in particular mechanical equilibrium \({{{\rm{div}}}}{{{\bf{P}}}}={{{\boldsymbol{0}}}}\) in terms of the first Piola–Kirchhoff stress P. In the context of the multiplicative decomposition F = FeFp of the deformation gradient F into elastic and viscoplastic parts, the linear elastic relation \({{{{\bf{S}}}}}_{{{{\rm{e}}}}}=\lambda \,({{{\bf{I}}}}\cdot {{{{\bf{E}}}}}_{{{{\rm{e}}}}})\,{{{\bf{I}}}}+2\mu \,{{{{\bf{E}}}}}_{{{{\rm{e}}}}}\) is assumed between the elastic second Piola–Kirchhoff stress Se and the elastic Green strain \({{{{\bf{E}}}}}_{{{{\rm{e}}}}}=\frac{1}{2}({{{{\bf{F}}}}}_{{{{\rm{e}}}}}^{{{{\rm{T}}}}}{{{{\bf{F}}}}}_{{{{\rm{e}}}}}-{{{\bf{I}}}})\); note that \({{{\bf{P}}}}={{{{\bf{F}}}}}_{{{{\rm{e}}}}}{{{{\bf{S}}}}}_{{{{\rm{e}}}}}{{{{\bf{F}}}}}_{{{{\rm{p}}}}}^{-{{{\rm{T}}}}}\). Here, the Lamé constants λ = Eν/((1 + ν)(1 − 2ν)) and μ = E/(2(1 + ν)) are determined by Young’s modulus E and Poisson’s ratio ν. The evolution of Fp is determined by the J2 flow rule \({\dot{{{{\bf{F}}}}}}_{{{{\rm{p}}}}}{{{{\bf{F}}}}}_{{{{\rm{p}}}}}^{-1}={\dot{\gamma }}_{{{{\rm{p}}}}}\,{{{{\bf{S}}}}}_{{{{\rm{e}}}}}^{{{{\rm{dev}}}}}/|{{{{\bf{S}}}}}_{{{{\rm{e}}}}}^{{{{\rm{dev}}}}}|\), where \({\dot{\gamma }}_{{{{\rm{p}}}}}\) is the rate of equivalent plastic shear, and \({\bf{S}}_{\rm{e}}^{\rm{dev}}\) is the deviatoric part of Se. The viscoplastic (i.e., rate-dependent28) form \({\dot{\gamma }}_{{{{\rm{p}}}}}={\dot{\gamma }}_{0}{(| {{{{\bf{S}}}}}_{{{{\rm{e}}}}}^{{{{\rm{dev}}}}}| /\xi )}^{{n}_{0}}\) for the evolution of γp is determined by the typical material inelastic shear rate \({\dot{\gamma }}_{0}\), the power-law exponent n0, and the flow resistance ξ(γp) = ξ0 + h0γp for linear isotropic hardening, with ξ0 the initial flow resistance, and h0 the isotropic hardening modulus.
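The scalar relations in this paragraph (Lamé constants, linear hardening, and the power-law flow rate) can be evaluated as in the following sketch. It is not the DAMASK implementation; the function names are illustrative, and all stress-like quantities are assumed to be given in one consistent unit (here MPa).

```python
def lame_constants(youngs_modulus, poisson_ratio):
    """Lame constants (lambda, mu) from Young's modulus E and Poisson's ratio nu."""
    lam = youngs_modulus * poisson_ratio / (
        (1.0 + poisson_ratio) * (1.0 - 2.0 * poisson_ratio)
    )
    mu = youngs_modulus / (2.0 * (1.0 + poisson_ratio))
    return lam, mu


def flow_resistance(gamma_p, xi0, h0):
    """Linear isotropic hardening: xi(gamma_p) = xi0 + h0 * gamma_p."""
    return xi0 + h0 * gamma_p


def plastic_shear_rate(s_dev_norm, gamma_p, xi0, h0, gamma_dot_0=1.0e-3, n0=20):
    """Power-law rate gamma_dot_0 * (|S_e^dev| / xi)^n0 with the values used in the text."""
    xi = flow_resistance(gamma_p, xi0, h0)
    return gamma_dot_0 * (s_dev_norm / xi) ** n0


# Illustrative values in MPa: E = 210 GPa, xi0 = 200 MPa, h0 = 10 GPa.
lam, mu = lame_constants(210000.0, 0.3)
rate_at_resistance = plastic_shear_rate(300.0, 0.01, xi0=200.0, h0=10000.0)
rate_below_resistance = plastic_shear_rate(270.0, 0.01, xi0=200.0, h0=10000.0)
```

With n0 = 20, the rate drops by roughly an order of magnitude when the stress norm is 10% below the flow resistance, which is why the response is close to rate-independent.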

The viscoplastic model is implemented in the simulation software toolkit DAMASK5. This toolkit is also used for numerical solution of the corresponding quasi-static mechanical boundary-value problem on periodic grain microstructures based on spectral (i.e., Fourier) methods.

Data generation

Material parameters for each grain in the microstructure include E, ν, ξ0, h0, \({\dot{\gamma }}_{0}\), and n0. For simplicity, \({\dot{\gamma }}_{0}=1{0}^{-3}\,\) s−1 and n0 = 20 are assumed the same for all grains. Values for the remaining material properties of each grain are chosen randomly from a range of values shown in Table 3. The training dataset consists of 1000 grain microstructures based on 10 grains and random material property distributions. Grain morphologies and microstructures are generated randomly via Voronoi tessellation. An example is shown in Fig. 9.
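A microstructure of this kind can be sketched with a nearest-seed (Voronoi) labeling of the pixel grid. The sketch below assumes a periodic distance measure, uniformly distributed seed points, and uniform sampling of the property values from the ranges quoted in the text for Table 3; these choices and all names are assumptions, not the generator used for the paper.

```python
import random

# Property ranges quoted in the text for Table 3 (E, h0 in GPa; xi0 in MPa).
RANGES = {"E": (50.0, 300.0), "nu": (0.2, 0.4), "xi0": (50.0, 300.0), "h0": (0.0, 50.0)}


def periodic_voronoi(n_pixels, n_grains, rng):
    """Label each pixel of a periodic n_pixels x n_pixels cell with its nearest seed."""
    seeds = [(rng.uniform(0, n_pixels), rng.uniform(0, n_pixels)) for _ in range(n_grains)]
    labels = []
    for i in range(n_pixels):
        row = []
        for j in range(n_pixels):
            best_grain, best_dist = 0, float("inf")
            for grain, (sx, sy) in enumerate(seeds):
                dx = abs(i + 0.5 - sx)  # distance from the pixel center
                dy = abs(j + 0.5 - sy)
                dx = min(dx, n_pixels - dx)  # shortest distance across the periodic cell
                dy = min(dy, n_pixels - dy)
                dist = dx * dx + dy * dy
                if dist < best_dist:
                    best_grain, best_dist = grain, dist
            row.append(best_grain)
        labels.append(row)
    return labels


def property_maps(labels, n_grains, rng):
    """One homogeneous value per grain and property, expanded to pixel images."""
    per_grain = {
        name: [rng.uniform(low, high) for _ in range(n_grains)]
        for name, (low, high) in RANGES.items()
    }
    return {
        name: [[values[grain] for grain in row] for row in labels]
        for name, values in per_grain.items()
    }


rng = random.Random(1)  # fixed seed so the sketch is reproducible
labels = periodic_voronoi(n_pixels=64, n_grains=10, rng=rng)
maps = property_maps(labels, n_grains=10, rng=rng)
```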

Fig. 9: Example of a randomly generated grain microstructure and corresponding material property distribution.

The latter include Young’s modulus E, Poisson’s ratio ν, the initial flow resistance ξ0, and the linear isotropic hardening modulus h0.

As done, for example, in ref. 23, the ANN input consists of the distribution of these material properties in the unit cell/microstructure as well as results for the scalar von Mises stress field \({\sigma }_{{{{\rm{vM}}}}}=\sqrt{3{{{{\bf{T}}}}}_{{{{\rm{dev}}}}}\cdot {{{{\bf{T}}}}}_{{{{\rm{dev}}}}}/2}\) (Cauchy stress \({{{\bf{T}}}}={{{\bf{P}}}}{{{{\bf{F}}}}}^{{{{\rm{T}}}}}/\det {{{\bf{F}}}}\)) during extension. Given the material property distribution (E, ν, h0, ξ0) and σvM at time step t (i.e. \({\sigma }_{{{{\rm{vM}}}}}^{t}\)), then, the trained ANN (tANN) outputs σvM at time step t + Δt (i.e., \({\sigma }_{{{{\rm{vM}}}}}^{t+{{\Delta }}t}\)). This is depicted in Fig. 10.
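The von Mises stress defined above can be evaluated at a single material point as in the following sketch, which takes a 3 × 3 Cauchy stress as nested lists; the function name and the test states are illustrative.

```python
import math


def von_mises(cauchy):
    """Von Mises stress sqrt(3/2 T_dev : T_dev) of a 3 x 3 Cauchy stress tensor."""
    mean_stress = (cauchy[0][0] + cauchy[1][1] + cauchy[2][2]) / 3.0
    dev_squared = 0.0
    for i in range(3):
        for j in range(3):
            # Deviator: subtract the mean stress from the diagonal entries only.
            dev = cauchy[i][j] - (mean_stress if i == j else 0.0)
            dev_squared += dev * dev
    return math.sqrt(1.5 * dev_squared)


uniaxial = [[300.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
pure_shear = [[0.0, 100.0, 0.0], [100.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
hydrostatic = [[50.0, 0.0, 0.0], [0.0, 50.0, 0.0], [0.0, 0.0, 50.0]]
```

For uniaxial stress the measure reduces to the axial stress itself, for pure shear τ it gives √3 τ, and it vanishes for a purely hydrostatic state.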

Fig. 10: Schematic illustration of the tANN for calculation of the local von Mises stress based on \(J_{2}\) viscoplasticity with linear isotropic hardening.

a Young’s modulus E, Poisson’s ratio ν, initial flow resistance ξ0, and linear isotropic hardening modulus h0 are material property distributions as input of the tANN. \(\sigma_{\mathrm{vM}}^{t}\) denotes the von Mises stress distribution at time step t.

In the current case of purely bulk behavior of grain microstructures on the unit cell U, fields \(f=\bar{f}+\tilde{f}\) on the unit cell U are additively split into mean \(\bar{f}\) and fluctuation \(\tilde{f}\) parts, with \(\bar{f}:= v{(U)}^{-1}{\int}_{U}f\,dv\) and \(v(U):= {\int}_{U}dv\). In this context, deformation “boundary conditions” on U take the form of prescribed values for the mean deformation gradient \(\bar{{{{\bf{F}}}}}(t)\). For the current case of uniaxial extension, the Cartesian/matrix form

$$\bar{{{{\bf{F}}}}}(t)\equiv \left[\begin{array}{ccc}{\bar{F}}_{11}(t)&0&0\\ 0&1&0\\ 0&0&1\end{array}\right],\quad {\bar{F}}_{11}(t)=1+{\dot{\bar{F}}}_{11}(0)\,t,$$
(2)

of \(\bar{{{{\bf{F}}}}}(t)\) applies. For simplicity, data for training and testing are limited to a single extension rate \({\dot{\bar{F}}}_{11}(0)=1\times 1{0}^{-3}\) s−1, i.e., to quasi-static extension. Based on this extension rate, results obtained up to \({\bar{F}}_{11}(4)=1.004\) are employed as training data, and those between this value and \({\bar{F}}_{11}(8)=1.008\) are used for testing the tCNN.
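The loading history in Eq. (2) and the mean/fluctuation split can be written out as in the following sketch. It assumes a uniform pixel grid, for which the volume average reduces to the arithmetic mean over pixels; the function names are illustrative.

```python
def mean_deformation_gradient(t, rate=1.0e-3):
    """Mean deformation gradient of Eq. (2) for uniaxial extension at time t (s)."""
    f11 = 1.0 + rate * t
    return [[f11, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def split_mean_fluctuation(field):
    """Split a pixel field into its unit-cell mean and its fluctuation part."""
    values = [value for row in field for value in row]
    mean = sum(values) / len(values)
    fluctuation = [[value - mean for value in row] for row in field]
    return mean, fluctuation


f_train_end = mean_deformation_gradient(4.0)  # end of the training range
f_test_end = mean_deformation_gradient(8.0)   # end of the test range
mean, fluctuation = split_mean_fluctuation([[1.0, 2.0], [3.0, 6.0]])
```

By construction, the fluctuation part averages to zero over the unit cell.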

The pixel distributions of values for Young’s modulus E and σvM for different values of \({\bar{F}}_{11}\) in the training data are shown in Fig. 11.

Fig. 11: Distributions of values and corresponding mean value (in red) for selected input quantities in the training data.

Top left: Young’s modulus E. Top right: σRM for \({\bar{F}}_{11}=1.001\). Bottom left: σRM for \({\bar{F}}_{11}=1.002\). Bottom right: σRM for \({\bar{F}}_{11}=1.004\).

The pixel distributions of the other input material properties are similar. As evident, the pixel distribution of σvM at each value of \({\bar{F}}_{11}\) is quasi-normal in character, as expected for randomly distributed material property values.

Neural network type, architecture, and training

The current ANN is based on the U-Net convolutional type and architecture introduced by Ronneberger et al.29. As shown by Mianroodi et al.23, this architecture is suitable for surrogate modeling of the stress field in solid mechanics problems, and in particular of the von Mises stress σvM. Fig. 12 schematically depicts the network architecture (referred to in ref. 29 as U-Net due to its shape).

Fig. 12: U-Net-based ANN architecture adopted in this work.

Depicted are the number of channels as well as the size of the images in each layer. Employed are two-dimensional (2D) separable convolution with 9 × 9 kernel, ReLU activation, batch normalization, 2D max pooling, and upsampling with bilinear interpolation, all as implemented in TensorFlow30. All weights and biases are initialized via the Glorot algorithm31. See text for discussion.

Network input consists of 64 × 64 pixel images for E, ν, ξ0, h0, and σvM, and the network outputs one 64 × 64 pixel image for σvM. As usual, both input and output are normalized to 1. In addition, 32 filters capture the main features from the input images. As shown in the figure, four types of operation are performed in the U-Net, namely, separable two-dimensional (2D) convolution with a kernel size of 9, batch normalization, 2D max pooling, and 2D upsampling with bilinear interpolation.
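To illustrate what the use of separable convolution implies for a 9 × 9 kernel, the following sketch counts trainable parameters for a single layer. It assumes a depth multiplier of 1, one bias per filter, and a first layer that maps the five input channels to 32 filters; the function names are illustrative and the counts are not taken from the paper.

```python
def standard_conv_params(kernel, in_channels, filters):
    """Weights plus biases of a standard 2D convolution layer."""
    return kernel * kernel * in_channels * filters + filters


def separable_conv_params(kernel, in_channels, filters):
    """Depthwise (one kernel per input channel) plus 1 x 1 pointwise weights and biases."""
    depthwise = kernel * kernel * in_channels
    pointwise = in_channels * filters
    return depthwise + pointwise + filters


# First layer under the stated assumptions: 5 input images, 32 filters, 9 x 9 kernel.
n_standard = standard_conv_params(kernel=9, in_channels=5, filters=32)
n_separable = separable_conv_params(kernel=9, in_channels=5, filters=32)
```

Under these assumptions the separable layer has 597 parameters compared with 12,992 for a standard convolution, which suggests one plausible reason for combining a large kernel with separable convolution.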

For the training process, Adam optimization is employed with a learning rate of 0.001 and a momentum of 0.9. As discussed above, the loss function for training, validation, and testing is given by the mean absolute error (MAE) based on the difference σtCNN − σRM. TensorFlow30 is used to set up and train the network. As usual, the dataset is divided into training (80%) and validation (20%) subsets. Training is carried out for 500 epochs and yields an MAE of 1.733 MPa on the training dataset; the corresponding MAE for the validation dataset is 1.743 MPa. No sign of overfitting was observed.
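For orientation, a single Adam update for one scalar parameter is sketched below. It assumes that the quoted momentum of 0.9 corresponds to the first-moment decay rate β1; the second-moment decay rate β2 = 0.999 and ε = 1e-7 are assumed values not stated in the text, and the function name is illustrative.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1.0e-3, beta1=0.9, beta2=0.999, eps=1.0e-7):
    """One Adam update of a scalar parameter; t is the step count starting at 1."""
    m = beta1 * m + (1.0 - beta1) * grad         # first-moment (momentum) estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad  # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)               # bias corrections
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


theta, m, v = adam_step(theta=1.0, grad=4.0, m=0.0, v=0.0, t=1)
```

In the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by approximately the learning rate regardless of the gradient magnitude.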