Introduction

Many deformation and phase transformation processes in solids exhibit nonlinear pathways in the 6-dimensional strain space. For instance, while dislocation glide is generally considered a pure shear process without volume change, there is in fact a significant transient dilatation associated with the activated state of dislocation motion, confirmed in both atomistic calculations1 and experiments.2, 3 The α→ω martensitic transformation (MT) in titanium also follows a complex nonlinear pathway,4, 5 as do the martensitic transformations in lithium6 and iron.7 Figure 1 presents the results from atomistics-based nudged elastic band (NEB) calculations of the Bain path for the γ(FCC)→α(BCC) transformation in iron (Li, J. Private communication, 2001). During this structural transformation, it is assumed that the two \({\left\langle {110} \right\rangle _\gamma }\) axes concurrently expand equally while the \({\left\langle {001} \right\rangle _\gamma }\) axis shrinks to form a BCC structure. The calculated minimum energy path (MEP) and lattice transformation pathway are shown, respectively, in Fig. 1a and b. Following the spirit of the phase-field (PF) approach, we can introduce an order parameter (OP) in the reaction coordinate along the MEP to characterize this phase transformation, with its value coupled with the transformation strain (calculated based on the lattice correspondence in Fig. 1b) shown in Fig. 1c. Although it seems that the transformation strain pathway is mainly linearly coupled with the OP, the corresponding volume change (i.e., the trace of the transformation strain tensor), shown in Fig. 1d, indicates that the system initially undergoes increasing hydrostatic compression but eventually ends up in the martensitic state subject to a hydrostatic tension. Such a complex variation is completely missed when linear coupling between the transformation strain and the OP is assumed, as shown in Fig. 1d; in contrast, even a simple quadratic function can capture the essential nonlinear pathway, as shown in Fig. 1c and d. (More accuracy can be obtained when higher-order terms are included.)

Fig. 1

Atomistic calculation of a free energy and b lattice parameters, and the corresponding c transformation strain components and d volume change along the Bain path of the γ→α transformation in iron

The nonlinearity of these atomistically/experimentally revealed transformation pathways is, however, missing in the current formulation of the phase-field microelasticity (PFM) theory developed by Khachaturyan,8 which has been the foundation for studying microstructure evolution during deformation and phase transformations in solids.9,10,11 In PF modeling, each distinctive phase is represented by a unique set of OPs that usually take the value of 0 before transformation (i.e., the parent phase) and 1 after transformation (i.e., the product phase). The intermediate values of the OP (i.e., between 0 and 1), which represent the phase boundary region under the assumption of diffuse interfaces, are usually assumed to be linearly coupled with physical quantities such as density/volume, concentration, transformation strain, atomic shuffle, degree of order, and magnetic/electric polarization. The total free energy can then be formulated as a functional of the OP fields, and a variational approach is employed to study the dynamics towards equilibrium. The linear coupling between the OPs and the physical quantities can be considered a good approximation when the final microstructure is of interest and the interface contribution is relatively small. However, if the focus is on nucleation and/or the subsequent growth that is dominated by the creation of interfaces, how the OPs and physical quantities are coupled could make a considerable difference, especially in solids. For instance, the size, shape, and activation energy of a critical nucleus could be significantly different along a nonlinear strain path as compared to those along a linear path.
It also needs to be pointed out that although one may think of lumping the nonlinearity into the “chemical” (bulk) free energy instead of an explicit consideration, terms such as the elastic strain energy, related to coupling between the OP and the transformation strains, are inherently non-local and cannot be described by a local chemical free energy term.

Clearly, an explicit PF formulation that can directly account for the nonlinear coupling between OPs and physical quantities is highly desirable, so that results from atomistic calculations (e.g., Fig. 1) can be integrated directly with PF to provide multiscale modeling in a quantitative manner. Nonlinear couplings between OPs and physical quantities have been previously used in PF modeling of ferroelectric transformations,12 atomic ordering,13 dislocations,14 etc. In the area of PF modeling of MT, the work in ref. 15 considered a linear coupling between the square of the OPs and the eigen-strain tensor; the mean-field Landau theory of MT in refs. 16, 17 employed a more complicated polynomial function of OPs that are coupled with the eigen-strain tensor. However, both methods assumed a collinear coupling in the six-dimensional strain space, i.e., the OP is coupled equally to every component of the eigen-strain tensor. Recently, Vattré and Denoual18 proposed a PF MT model with the strain pathway explicitly described in the six-dimensional strain space, which, however, still assumed a linear (and collinear) transformation path from the parent to the product phase. In some other PF models of MT (e.g., refs. 19,20,21) the structural OPs are formulated directly using a linear combination of strain components (whether small-strain19 or finite-strain20), which leads to a framework significantly different from the PFM-based models. Nevertheless, the direct effect of nonlinear and noncollinear coupling between OPs and transformation strain pathways on MT dynamics has not been considered in any of these studies.

In this paper, we consider both nonlinear and noncollinear coupling between OPs and transformation strains during MTs in the PFM framework. In previous PF modelings of MTs, the resulting strain energy is calculated using the PFM theory based on linear coupling between the OPs and the stress-free-transformation-strain (SFTS). The PFM theory can be traced back to Eshelby’s classical work22 on transformation-induced elasticity and has been employed in PF simulations of solid-solid phase transformations and even the modeling of deformation in amorphous alloys.23, 24 It is worth pointing out that Eshelby’s pioneering work is also the basis for fast-Fourier-transform (FFT) based schemes for computing the micromechanical fields of periodic heterogeneous materials directly from an image of the microstructure (i.e., image-based approaches),25, 26 as well as the recently emerged FFT-based crystal plasticity models.27,28,29 Here we extend the PFM theory by taking into account a general nonlinear coupling between OPs and SFTS tensors. Using the generic cubic→tetragonal MT, previously studied by Shen et al. using the original PFM theory,30 as an example, we quantify the differences in the fundamental properties of a critical nucleus and growth kinetics. It will be shown that while the characteristic features of the final microstructure remain the same for the new PFM theory, hereafter called generalized PFM (GPFM) theory, the configuration and activation energy of a critical nucleus differ significantly when nonlinear coupling is considered. In addition, the far-field hydrostatic stress associated with the critical nucleus of a pure-shear martensite is actually nonzero if nonlinear transformation pathways are considered. This case study indicates the significance of nonlinear transformation pathways when considering solid-solid phase transformations. 
The GPFM formulated in this study provides a general framework to incorporate directly atomistic pathways into mesoscale microstructure modeling during solid state phase transformations.

Results

The results of this work are presented in two major sections. First, we will discuss the development of our methodology and formulation. Then, we will demonstrate the significance by applying the new theory to a cubic→tetragonal martensite transformation.

GPFM incorporating nonlinear transformation strain pathways

Starting with the original PFM theory, the total elastic energy \(E^{\rm el}\) for a given microstructure, considered as a configuration with distributed Eshelby inclusions22 (product phases) coherently embedded in the original elastic medium (parent phase), is given as a functional8

$$\begin{aligned} E^{\rm el} = {} & \frac{1}{2}\int d{\bf x}\, C_{ijkl}\varepsilon_{ij}({\bf x})\varepsilon_{kl}({\bf x}) + \frac{V}{2}C_{ijkl}\bar\varepsilon_{ij}\bar\varepsilon_{kl} - \bar\varepsilon_{ij}\int d{\bf x}\, C_{ijkl}\varepsilon_{kl}({\bf x}) \\ & - \frac{1}{2}\fint \frac{d{\bf g}}{(2\pi)^3}\, n_i\tilde\sigma_{ij}^T({\bf g})\Omega_{jk}({\bf n})\tilde\sigma_{kl}^{T*}({\bf g})n_l \end{aligned}$$
(1)

where \(\varepsilon_{ij}({\bf x})\) is the transformation strain tensor field at position \({\bf x}\), \(\tilde \sigma _{ij}^T\left( {\bf g} \right)\) is the Fourier transform of \(\sigma _{ij}^T\left( {\bf{x}} \right) \equiv {C_{ijkl}}{\varepsilon _{kl}}\left( {\bf{x}} \right)\), \({\bf g}\) is the reciprocal vector, and \(C_{ijkl}\) is the elastic stiffness tensor. It is assumed here that the elastic modulus is homogeneous and the same for both the parent and product phases. \({\bar \varepsilon _{ij}}\) is the overall homogeneous strain of the material, V is the volume, and \(\left[ {\Omega \left( {\bf{n}} \right)} \right]_{jk}^{ - 1} \equiv {C_{ijkl}}{n_i}{n_l}\) where \({\bf{n}} = {\bf{g}}/\left| {\bf{g}} \right|\). The symbol \(\fint\) indicates that the integration excludes the \({\bf g} = 0\) point in the reciprocal space.

For a given microstructure, defined as a collection of a hierarchy of structural and chemical non-uniformities (imperfections or defects),11 a unique distribution of transformation strain can be identified. Within a non-uniformity such as a second phase or slipped region, \(\varepsilon_{ij}({\bf x})\) is equal to the pre-defined SFTS, the so-called “eigen transformation strain”, while at interfaces, unless the transient states (or the transformation pathway) are also pre-determined, \(\varepsilon_{ij}({\bf x})\) is generally unknown. Since the philosophy of PF is to formulate a total energy functional in terms of a set of OPs \(\{\eta_i({\bf x})\}\) that completely describe the microstructure, a coupling between \(\varepsilon_{ij}({\bf x})\) and \(\{\eta_i({\bf x})\}\) is required so that Eq. (1) can be written in terms of OPs and thus incorporated into the total energy functional of PF models.

Considering \(p = 1, \ldots, N_\nu\) with \(N_\nu\) being the total number of product phases, we have a set of pre-defined SFTS tensors \(\left\{ {{{\rm{\epsilon }}_{ij}^p}} \right\}_{p = 1}^{{N_\nu }}\) assigned to the \(N_\nu\) product phases, and correspondingly a set of \(N_\nu\) OPs, \(\left\{ {{\eta _p}\left( {\bf{x}} \right)} \right\}_{p = 1}^{{N_\nu }}\), to describe the microstructural evolution. In the original PFM, the OPs are linearly coupled with the SFTS tensors and the transformation strain \(\varepsilon_{ij}({\bf x})\) is obtained as

$${\varepsilon _{ij}}\left( {\bf{x}} \right)\; = \;\mathop {\sum}\limits_p^{{N_\nu }} {\epsilon _{ij}^p{\eta _p}\left( {\bf{x}} \right).} $$
(2)

This has been considered a sound approximation, given that the transformation strain is overall small, in the same sense as Vegard’s law is used for the concentration-strain relationship. Substituting Eq. (2) into Eq. (1) gives rise to the energy functional in terms of OPs.

What if the OPs are not linearly coupled with the SFTS tensors, as is the case in Fig. 1? In such cases, nonlinear couplings between \(\eta_p({\bf x})\) and \( \epsilon _{ij}^p\) are needed to give the transformation strain field \(\varepsilon_{ij}({\bf x})\) in place of Eq. (2). In Fig. 1c we show the fitting results using a second-order polynomial. Not only are the transformation strain components fitted well by introducing only a quadratic term, but the nonlinear variation of the volume change along the transformation pathway (Fig. 1d) is also faithfully captured. If the linear coupling of Eq. (2) is assumed, the trace of the transformation strain is \({\rm{tr}}\left( {\varepsilon_{ij} \left( {\bf{x}} \right)} \right) \equiv {\varepsilon _{kk}}\left( {\bf{x}} \right) = \mathop {\sum}\nolimits_p^{{N_\nu }} {\epsilon _{kk}^p{\eta _p}\left( {\bf{x}} \right)} \), where summation over repeated indices is assumed. As a result, if the SFTS tensors are traceless, i.e., \(\epsilon _{kk}^p = 0\), the transition states will always be volume-conserving, excluding physics such as the transient dilatation of dislocations and MT. (Even for cases where \(\epsilon _{kk}^p \, \ne \, 0\), a nonlinear volumetric change such as that shown in Fig. 1d cannot be captured.)

To account for complex nonlinear couplings, we formally write

$${\varepsilon _{ij}}\left( {\bf{x}} \right)\; = \;\mathop {\sum}\limits_p^{{N_\nu }} {\epsilon _{ij}^{ \circ p}\Lambda _p^{ij}\left( {{\eta _p}\left( {\bf{x}} \right)} \right)} $$
(3)

where \(\Lambda _p^{ij}\left( {{\eta _p}\left( {\bf{x}} \right)} \right)\) is a function with the coefficient of the leading (linear) term being 1, e.g., \(\Lambda _p^{ij}\left( {{\eta _p}} \right) = {\eta _p} + \alpha _p^{ij}\eta _p^2\) for a quadratic form. In general, MTs can be classified into two types: proper and improper.15, 31 In the former the OPs are directly the components of SFTS (or their combinations), while in the latter the OPs represent the primary transformation mode of non-affine atomic shuffling that induces the SFTS as a secondary mode. Nevertheless, in both cases the SFTS tensors can be formally written as a function of the OP as in Eq. (3). Note that in the previous modeling of improper MT,15 \(\eta _p^2\) has been used to account for the degeneracy of two antiphase states (distinguished by \(\pm \eta_p\)). In this case, the linearity refers to the coupling between \(\eta _p^2\) and the SFTS tensors, and the current discussion and the formulation to be developed remain valid upon a simple substitution of \({\eta _p} \to \eta _p^2\). Since nonlinear term coefficients, e.g., \(\alpha _p^{ij}\), are usually small and \(\epsilon _{ij}^{ \circ p}\) is thus the first-order approximation of the SFTS, \(\epsilon _{ij}^p\), hereafter we will not differentiate the symbol \(\epsilon _{ij}^{ \circ p}\) from \(\epsilon _{ij}^p\) for convenience as long as the context is clear. The indices i and j are written as superscripts rather than subscripts to indicate an entrywise (Hadamard) product, i.e., no summation over the repeated indices i and j. In other words, we take into account the fact that the coupling of OPs in the six-dimensional strain space can in principle be noncollinear. It needs to be pointed out that the current form of Eq. (3) does not transform like a tensor under a general coordinate change because of the Hadamard product. A general reference-invariant vector/tensor equation in place of Eq. (3) requires adequate physical understanding of the usually complex transformation strain pathways, and its existence may not be guaranteed; our current treatment relies on approximating the pathways in the coordinate system used by the atomistic calculation and maintaining the same coordinate system for the subsequent PF simulations.
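The entrywise coupling of Eq. (3) can be illustrated with a minimal NumPy sketch for the quadratic form \(\Lambda _p^{ij} = \eta_p + \alpha_p^{ij}\eta_p^2\). The function name `transformation_strain` and all numerical values (the strain magnitude and the coefficients \(\alpha_p^{ij}\)) are illustrative assumptions, not values from this work. The sketch only shows that a traceless SFTS stays traceless along the path when all components share the same coefficient, whereas a component-dependent coefficient produces a nonzero trace.

```python
import numpy as np

def transformation_strain(eta, sfts, alpha):
    """Eq. (3) with Lambda_p^{ij}(eta) = eta + alpha_p^{ij} * eta**2.

    eta   : sequence of N order-parameter values at one material point
    sfts  : array (N, 3, 3), SFTS tensors eps_ij^p
    alpha : array (N, 3, 3), per-component coefficients alpha_p^{ij}
    """
    eta = np.asarray(eta, dtype=float)[:, None, None]
    lam = eta + np.asarray(alpha) * eta**2          # one Lambda value per strain component
    return np.sum(np.asarray(sfts) * lam, axis=0)   # entrywise product, summed over p only

# Illustrative numbers (assumptions): a single variant with a traceless SFTS.
eps0 = 0.01
sfts = eps0 * np.array([np.diag([2.0, -1.0, -1.0])])
alpha_collinear = np.full((1, 3, 3), 0.2)           # same coefficient for every component
alpha_noncollinear = np.zeros((1, 3, 3))
alpha_noncollinear[0, 0, 0] = 0.2                   # only the 11 component is nonlinear

for eta in (0.0, 0.5, 1.0):
    tr_c = np.trace(transformation_strain([eta], sfts, alpha_collinear))
    tr_n = np.trace(transformation_strain([eta], sfts, alpha_noncollinear))
    print(f"eta={eta:.1f}  trace(collinear)={tr_c:+.5f}  trace(noncollinear)={tr_n:+.5f}")
```

Because Eq. (3) is not rescaled here, the end state at \(\eta = 1\) differs from the pre-defined SFTS by the factor \(1 + \alpha_p^{ij}\) in each component.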

The physical significance of allowing noncollinear coupling between OPs and transformation strains rests on the fact that in certain solid-solid transformations, one strain component (e.g., an in-plane shear) may be strongly coupled with another transformation mode such as atomic shuffle, which is, however, decoupled from the other strain components. In such cases, the form of \(\Lambda _p^{ij}\left( {{\eta _p}\left( {\bf{x}} \right)} \right)\) must differ significantly (i.e., in its nonlinear terms) for different strain components.

We now derive the GPFM theory by adopting the general form specified in Eq. (3). For a strain-controlled boundary condition, since the applied strain is fixed during the phase transformation, one may just assume \({\bar \varepsilon _{ij}} \equiv 0\). In addition, because of the decomposition \({\varepsilon _{ij}}\left( {\bf{x}} \right) = {\bar \varepsilon _{ij}} + \delta {\varepsilon _{ij}}\left( {\bf{x}} \right)\), we have \(\int \varepsilon_{ij}({\bf x})\,d{\bf x} = \int \delta\varepsilon_{ij}({\bf x})\,d{\bf x} = 0\), and thus \({\tilde \varepsilon _{ij}}\left( {{\bf{g}} = 0} \right) = 0\). As a result, the 2nd and 3rd terms in Eq. (1) vanish. Substituting Eq. (3) into Eq. (1), we arrive at

$$E^{\rm el} = \frac{1}{2}\sum_{p,q}\int \frac{d{\bf g}}{(2\pi)^3}\, D_{tsmn}({\bf n})\epsilon_{ts}^p\epsilon_{mn}^q\tilde\Lambda_p^{ts}({\bf g})\tilde\Lambda_q^{mn*}({\bf g})$$
(4)

where \(D_{tsmn}({\bf n})\) is a 4th-rank tensor field defined in the Fourier space:

$$D_{tsmn} = \left\{ \begin{array}{ll} C_{tsmn}, & {\rm within\ a\ reciprocal\ volume\ of}\ (2\pi)^3/V\ {\rm at}\ {\bf g} = 0, \\ C_{tsmn} - n_i C_{ijts}\Omega_{jk}({\bf n})C_{klmn}n_l, & {\rm elsewhere.} \end{array} \right.$$
(5)
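As a rough numerical illustration of Eq. (5), the sketch below evaluates \(D_{tsmn}\) for a single reciprocal vector. It assumes an isotropic stiffness tensor with arbitrary Lamé constants purely for convenience; the helper names `isotropic_stiffness` and `d_tensor` are hypothetical. One consequence of the definition that is easy to check is \(n_t D_{tsmn} = 0\) for \({\bf g} \ne 0\), which follows from \(\left[\Omega({\bf n})\right]^{-1}_{jk} = C_{ijkl}n_i n_l\) and the symmetries of \(C_{ijkl}\).

```python
import numpy as np

def isotropic_stiffness(lam, mu):
    """C_ijkl = lam*d_ij*d_kl + mu*(d_ik*d_jl + d_il*d_jk); isotropy is an assumption."""
    d = np.eye(3)
    return (lam * np.einsum("ij,kl->ijkl", d, d)
            + mu * (np.einsum("ik,jl->ijkl", d, d) + np.einsum("il,jk->ijkl", d, d)))

def d_tensor(C, g):
    """D_tsmn of Eq. (5) for one reciprocal vector g (strain-controlled boundary)."""
    g = np.asarray(g, dtype=float)
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return C.copy()                                  # g = 0 branch of Eq. (5)
    n = g / norm
    omega_inv = np.einsum("ijkl,i,l->jk", C, n, n)       # [Omega(n)]^{-1}_jk = C_ijkl n_i n_l
    omega = np.linalg.inv(omega_inv)
    return C - np.einsum("i,ijts,jk,klmn,l->tsmn", n, C, omega, C, n)

C = isotropic_stiffness(lam=1.0, mu=1.0)                 # arbitrary illustrative moduli
g = np.array([1.0, 2.0, 3.0])
D = d_tensor(C, g)
n = g / np.linalg.norm(g)
print("max |n_t D_tsmn| =", np.abs(np.einsum("t,tsmn->smn", n, D)).max())
```

For an anisotropic crystal, only `isotropic_stiffness` would need to be replaced by the appropriate \(C_{ijkl}\).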

Obviously the \(B_{pq}({\bf n})\) tensor8, 32 in the original PFM is simply retrieved by recognizing

$${B_{pq}}\left( {\bf{n}} \right)\; = \;{D_{tsmn}}\left( {\bf{n}} \right)\epsilon _{ts}^p\epsilon _{mn}^q$$
(6)

where summation over the four repeated indices is taken. What prevents us from carrying out this summation in Eq. (4) is the fact that the coupling between OPs and SFTS tensors differs from one component to another, i.e., the coupling is noncollinear. If all strain components are coupled collinearly with the OPs, the same \(B_{pq}\) is retrieved by dropping the superscript in \({\Lambda _p}\) in Eq. (4) and summing over repeated indices as in Eq. (6). The functional derivative that is used in integrating the PF dynamics equations is then obtained as

$$\frac{{\delta {E^{{\rm{el}}}}}}{{\delta {\eta _p}\left( {\bf{x}} \right)}}\; = \;{\mathop {\sum}\limits_{t,s} {\left\{ {\mathop {\sum}\limits_q {{D_{tsmn}}\left( {\bf{n}} \right)\epsilon _{mn}^q\tilde \Lambda _q^{mn}\left( {\bf{g}} \right)} } \right\}} _r}{\left. {\epsilon _{ts}^p\frac{{\partial \Lambda _p^{ts}}}{{\partial {\eta _p}}}} \right|_x}$$
(7)

where \(\{\ldots\}_r\) represents the inverse Fourier transformation to real space. If \(\alpha _p^{ts} = 0\) in the previous quadratic form, the above result reduces to the original PFM result, as expected.
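A discrete version of Eqs. (4) and (7) can be sketched with NumPy FFTs as follows. This is only a sketch under stated assumptions: a periodic grid with unit cell volume, an isotropic stiffness tensor, the quadratic form \(\Lambda_p^{ij} = \eta_p + \alpha_p^{ij}\eta_p^2\), random illustrative fields, and hypothetical helper names (`d_field`, `elastic_energy_and_force`). An odd grid size is used so that every nonzero grid frequency has a distinct partner at \(-{\bf g}\).

```python
import numpy as np

def isotropic_stiffness(lam, mu):
    d = np.eye(3)
    return (lam * np.einsum("ij,kl->ijkl", d, d)
            + mu * (np.einsum("ik,jl->ijkl", d, d) + np.einsum("il,jk->ijkl", d, d)))

def d_field(C, shape):
    """D_tsmn(n) of Eq. (5) at every FFT grid point; result has shape + (3, 3, 3, 3)."""
    freqs = np.meshgrid(*[np.fft.fftfreq(m) for m in shape], indexing="ij")
    g = np.stack(freqs, axis=-1)
    norm = np.linalg.norm(g, axis=-1)
    zero = norm == 0.0
    n = g / np.where(zero, 1.0, norm)[..., None]
    n[zero] = [1.0, 0.0, 0.0]                 # placeholder direction; D is overwritten at g = 0
    omega = np.linalg.inv(np.einsum("ijkl,...i,...l->...jk", C, n, n))
    D = C - np.einsum("...i,ijts,...jk,klmn,...l->...tsmn", n, C, omega, C, n)
    D[zero] = C                               # g = 0 branch, strain-controlled boundary
    return D

def elastic_energy_and_force(eta, sfts, alpha, D):
    """Discrete Eq. (4) and Eq. (7) for Lambda = eta + alpha * eta**2.

    eta: (P, nx, ny, nz); sfts, alpha: (P, 3, 3); D: (nx, ny, nz, 3, 3, 3, 3).
    """
    e = eta[..., None, None]
    a = alpha[:, None, None, None, :, :]
    s = sfts[:, None, None, None, :, :]
    eps = np.sum(s * (e + a * e**2), axis=0)                 # Eq. (3), entrywise coupling
    eps_k = np.fft.fftn(eps, axes=(0, 1, 2))
    sig_k = np.einsum("...tsmn,...mn->...ts", D, eps_k)
    n_pts = eta[0].size
    energy = 0.5 * np.real(np.sum(np.conj(eps_k) * sig_k)) / n_pts
    sig = np.real(np.fft.ifftn(sig_k, axes=(0, 1, 2)))       # {...}_r in Eq. (7)
    force = np.sum(sig[None] * s * (1.0 + 2.0 * a * e), axis=(-2, -1))
    return energy, force

# Illustrative setup (all values are assumptions).
rng = np.random.default_rng(0)
shape = (5, 5, 5)
C = isotropic_stiffness(lam=1.0, mu=1.0)
D = d_field(C, shape)
sfts = 0.1 * np.array([np.diag(v) for v in
                       ([2.0, -1.0, -1.0], [-1.0, 2.0, -1.0], [-1.0, -1.0, 2.0])])
alpha = 0.2 * rng.standard_normal((3, 3, 3))
eta = rng.random((3,) + shape)
energy, force = elastic_energy_and_force(eta, sfts, alpha, D)
print("E_el =", energy, " force array shape =", force.shape)
```

A central finite difference of the energy with respect to a single grid value of \(\eta_p\) is a convenient way to check the returned derivative.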

If a given applied traction \(\sigma _{ij}^{\rm appl}\), which is taken to be zero here as for a “relaxed boundary”, is prescribed, the total elastic energy then becomes

$$\begin{aligned} E^{\rm el} = {} & \frac{1}{2}\int d{\bf x}\, C_{ijkl}\varepsilon_{ij}({\bf x})\varepsilon_{kl}({\bf x}) - \frac{1}{2V}C_{ijkl}\int d{\bf x}\,\varepsilon_{ij}({\bf x})\int d{\bf x}'\,\varepsilon_{kl}({\bf x}') \\ & - \frac{1}{2}\fint \frac{d{\bf g}}{(2\pi)^3}\, n_i\tilde\sigma_{ij}^T({\bf g})\Omega_{jk}({\bf n})\tilde\sigma_{kl}^{T*}({\bf g})n_l \\ = {} & \frac{1}{2}\sum_{p,q}\int \frac{d{\bf g}}{(2\pi)^3}\, D_{tsmn}({\bf n})\epsilon_{ts}^p\epsilon_{mn}^q\tilde\Lambda_p^{ts}({\bf g})\tilde\Lambda_q^{mn*}({\bf g}) \end{aligned}$$
(8)

where \(D_{tsmn}({\bf n})\) is now defined as

$$D_{tsmn} = \left\{ \begin{array}{ll} 0, & {\rm within\ a\ reciprocal\ volume\ of}\ (2\pi)^3/V\ {\rm at}\ {\bf g} = 0, \\ C_{tsmn} - n_i C_{ijts}\Omega_{jk}({\bf n})C_{klmn}n_l, & {\rm elsewhere.} \end{array} \right.$$
(9)

Again one can check that the above results reduce to those of the original PFM when Eq. (3) is reduced to Eq. (2). Finally, the functional derivative for this relaxed boundary condition remains the same as in Eq. (7).

Application to a cubic→tetragonal martensite transformation

Irrespective of the types of transformations, as long as ab initio calculations are available for constructing (e.g., spline fitting using the ab initio sampling points) the free energy surface, our GPFM formulation presented in the previous section can then be utilized to perform PF simulations of the phase transformations. This is similar to incorporating the generalized stacking fault energy surface into PF dislocation dynamics simulations, which has been shown recently to be able to predict exactly the same defect structure and energy obtained from atomistic calculations.33, 34 In this study, however, in order to focus on illustrating the effect of the new GPFM, we choose a cubic to tetragonal MT that has been studied previously using the original microelasticity theory30 and compare simulation results obtained from the two approaches. It also needs to be pointed out that the FCC→BCC data shown in the Introduction section is only one-dimensional showing the free energy data along one transformation path and, thus, cannot be directly used in our later simulations that involve multiple variants. Once atomistic data are available to characterize the free energy landscape in the multidimensional transformation strain space, carrying out multiscale PF simulations based on GPFM should be straightforward, as has been demonstrated in refs. 33, 34.

The kinetics of the cubic to tetragonal MT consists of nucleation and growth stages that occur at very different time and length scales and thus require different methods to study. For the growth, we carry out PF simulations with both the PFM and GPFM theories to study the influence of different transformation pathways. For the nucleation, it is essential to determine the critical nucleus. This is done by following the previous PF-based NEB method.30 Since the transformation pathways determine the transition states and hence the elastic properties and strain energy of austenite/martensite interfaces, which usually exert a friction force on the MT and are believed to control the martensite nucleation,35 it is critical for any model not to impose any prior constraint on the transformation pathway.11 This is exactly the goal of introducing GPFM to formulate the strain energy in PF.

The cubic to tetragonal MT has three variants with the following three SFTS tensors

$$\epsilon_{ij}^{(1)} = \epsilon_0 \begin{pmatrix} 2 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix},\quad \epsilon_{ij}^{(2)} = \epsilon_0 \begin{pmatrix} -1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -1 \end{pmatrix},\quad \epsilon_{ij}^{(3)} = \epsilon_0 \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{pmatrix}.$$
(10)

To investigate the influence of the incorporated nonlinear path, we use the quadratic nonlinear form in terms of \(\eta _p^2\) for modeling the improper MT,15 i.e., \(\Lambda _p^{ij}\left( {{\eta _p}} \right) = \eta _p^2 + \alpha _p^{ij}\eta _p^4\) in the equations developed previously. In particular, the different nonlinear term coefficients listed in Table 1 are considered for the current generic MT. The coefficients \(\alpha _p^{ij}\) in Table 1 are of the same order of magnitude as the values fitted to the atomistic transformation strain pathway of the γ→α MT in iron shown in Fig. 1c. Thus, the following results should bear certain practical significance.

Table 1 Different transformation pathways in the form \({\Lambda_{p}^{ij}}\left( {{\eta _p}} \right) = {\eta _p^2} + {\alpha_{p}^{ij}}\eta _p^4\)

To study the cubic to tetragonal martensite transformation, we use the same chemical free energy density as in ref. 30, i.e.,

$$f\left( \{\eta_p({\bf x})\} \right) = \Delta f_0\left[ \frac{A_1}{2}\sum_p^\nu \eta_p^2({\bf x}) - \frac{A_2}{4}\sum_p^\nu \eta_p^4({\bf x}) + \frac{A_3}{6}\left( \sum_p^\nu \eta_p^2({\bf x}) \right)^3 \right]$$
(11)

with coefficients \(A_1 = 0.2\), \(A_2 = 12.8\), and \(A_3 = 12.6\), which has a minimum at η = 0 representing the cubic phase and two minima at η = ±1 representing the tetragonal phase (“±” corresponding to two possible antiphase states), and the energy difference between the cubic and tetragonal phases equals \(\Delta f_0\). Using the coefficients of exactly the same Landau free energy (Eq. (11)) that are obtained by first-principles calculations in ref. 36, together with the equation \(\Delta {f_0} = {Q_{{\rm{lat}}}}\left[ {T - {T_0}} \right]/T\) where \(Q_{\rm lat}\) is the experimentally measured latent heat of MT,31 it can be shown that the current values used for the coefficients in Eq. (11) are consistent with the first-principles calculations. The total free energy of the system is formulated as a functional of the OPs

$$\begin{aligned} G = G\left( \{\eta_p({\bf x})\} \right) = {} & \int \left[ f\left( \{\eta_p({\bf x})\} \right) + \sum_{p,q} \kappa_{pq}\nabla\eta_p({\bf x})\nabla\eta_q({\bf x}) \right] d{\bf x} \\ & + \frac{1}{2}\sum_{p,q}\int \frac{d{\bf g}}{(2\pi)^3}\, D_{tsmn}({\bf n})\epsilon_{ts}^p\epsilon_{mn}^q\tilde\Lambda_p^{ts}({\bf g})\tilde\Lambda_q^{mn*}({\bf g}) \\ & - \int \sigma_{ij}^{\rm ext}({\bf x})\varepsilon_{ij}\left( \{\eta_p({\bf x})\} \right) d{\bf x}, \end{aligned}$$
(12)

which contains the contributions of the local free energy density and the spatial gradient of the OP fields (first integral), the coherency elastic strain energy (second integral, i.e., Eq. (4)), and the interaction with an external stress field \(\sigma_{ij}^{\rm ext}({\bf x})\). The gradient coefficients \(\kappa_{pq}\) can in principle be formulated to reflect the interfacial energy anisotropy; in our current study, for the sake of simplicity but without loss of generality, we assume isotropic interfacial energy by reducing \(\kappa_{pq}\) to a scalar constant.
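The stated properties of the chemical free energy density in Eq. (11) can be checked with a few lines of Python. The sketch below uses the quoted coefficients with \(\Delta f_0\) set to 1 (a normalization assumed here for convenience); the function names are hypothetical. For a single variant, the stationary points satisfy \(\eta\left(A_1 - A_2\eta^2 + A_3\eta^4\right) = 0\), so besides η = 0 and η = ±1 there is a small-η root that locates the barrier between the cubic and tetragonal wells.

```python
import numpy as np

A1, A2, A3 = 0.2, 12.8, 12.6          # coefficients quoted for Eq. (11)

def f_chem(eta, delta_f0=1.0):
    """Chemical free energy density of Eq. (11) for a set of OP values at one point."""
    eta = np.atleast_1d(np.asarray(eta, dtype=float))
    s2 = np.sum(eta**2)
    s4 = np.sum(eta**4)
    return delta_f0 * (A1 / 2 * s2 - A2 / 4 * s4 + A3 / 6 * s2**3)

def df_single(eta, delta_f0=1.0):
    """Derivative of f with respect to eta for a single nonzero variant."""
    return delta_f0 * (A1 * eta - A2 * eta**3 + A3 * eta**5)

# Smaller root of A3*u**2 - A2*u + A1 = 0 with u = eta**2 gives the barrier position.
u_barrier = (A2 - np.sqrt(A2**2 - 4.0 * A3 * A1)) / (2.0 * A3)
eta_barrier = np.sqrt(u_barrier)

print("f(0) =", f_chem(0.0), " f(+1) =", f_chem(1.0), " f(-1) =", f_chem(-1.0))
print("df/deta at eta = 1:", df_single(1.0))
print("barrier at eta =", eta_barrier, " with f =", f_chem(eta_barrier))
```

With these coefficients, \(f(\pm 1) = -\Delta f_0\) and \(f(0) = 0\), consistent with the energy difference stated in the text.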

In the first subsection of the following part, we use the stochastic Langevin equation based on the time-dependent Ginzburg-Landau kinetic equation15 to simulate the MT qualitatively. In the second subsection, we employ a PF-based NEB method, which uses the well-developed MT microstructure obtained from the first subsection as the end image of the NEB calculation, to determine quantitatively the properties of the critical nucleus, as well as the MEP (see more details in ref. 30). In these simulations, the physical length scale of the computational cell can be determined by evaluating the dimensionless interfacial energy of a well-relaxed polytwin martensite microstructure obtained in the simulation, \(\gamma^*\), which is related to the dimensional physical quantities of a given system by \({\gamma ^*} = \frac{\gamma }{{\Delta {f_0}{l_0}}}\) with γ being the twin boundary energy and \(l_0\) the physical length of one computational gridpoint.31 Using the typical values of \(\Delta f_0 \sim 1 \times 10^8\) J/m\(^3\) and γ ~ 0.01 J/m\(^2\) for MT,15, 31 it is found that in our current simulations \(l_0 \sim 0.2\) Å, which is the appropriate length scale for studying the nucleation and the early stage of growth, as presented in the following.
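The length-scale relation \(\gamma^* = \gamma/(\Delta f_0 l_0)\) is simple arithmetic, sketched below with hypothetical function names. The dimensionless interfacial energy measured in the simulations is not quoted in the text, so the sketch inverts the relation with the order-of-magnitude values given above; the resulting \(\gamma^*\) is therefore only an implied estimate, not a reported value.

```python
def dimensionless_interfacial_energy(gamma, delta_f0, l0):
    """gamma* = gamma / (delta_f0 * l0), all inputs in SI units."""
    return gamma / (delta_f0 * l0)

def grid_spacing(gamma, delta_f0, gamma_star):
    """Physical length of one grid point, l0 = gamma / (delta_f0 * gamma*)."""
    return gamma / (delta_f0 * gamma_star)

gamma = 0.01        # twin boundary energy, J/m^2 (typical value quoted in the text)
delta_f0 = 1.0e8    # chemical driving force, J/m^3 (typical value quoted in the text)
l0 = 0.2e-10        # 0.2 Angstrom, in metres

gamma_star = dimensionless_interfacial_energy(gamma, delta_f0, l0)
print(f"implied gamma* ~ {gamma_star:.2f}")
print(f"l0 recovered  = {grid_spacing(gamma, delta_f0, gamma_star) * 1e10:.2f} Angstrom")
```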

Growth kinetics

With the complete PF free energy functional, a set of PF simulations using the same initial configuration but the different coefficients \(\alpha _p^{ij}\) listed in Table 1 is carried out. The ratio of elastic energy to chemical energy, defined as \(\xi \equiv \mu \epsilon _0^2/\Delta {f_0}\), is used to characterize the undercooling or the “strength” of MT15, 37 and is set to ξ = 0.5. In the framework of PFM, the derivation of ξ is readily seen30 by writing the transformation strain \({{\it{ \epsilon }}_{ij}} = {{\it{ \epsilon }}_0}\mathop {\sum}\nolimits_p^\nu {\widetilde {\it{ \epsilon }}_{ij}^p{\eta _p}} \) where \(\widetilde {\it{ \epsilon }}_{ij}^p\) is a normalized strain tensor that can be identified in Eq. (10). In the current GPFM, it can be shown that the transformation strain can be written in a similar form, i.e., \({{\it{ \epsilon }}_{ij}} = {{\it{ \epsilon }}_0}\mathop {\sum}\nolimits_p^\nu {\widetilde {\it{ \epsilon }}_{ij}^{p,n}\Lambda _p^{ij}} \) where the new normalized strain tensor \(\widetilde {\it{ \epsilon }}_{ij}^{p,n} \equiv \widetilde {\it{ \epsilon }}_{ij}^p/ \left( {1 + \alpha _p^{ij}} \right) \). The relationship between the two normalized strain tensors follows from the constraint that when \(\eta_p = 1\) the two corresponding transformation strain tensors must be equal. Substituting into the strain energy formulation in Eq. (12), it is easy to see that the definition and the meaning of ξ remain the same when nonlinear coupling is considered.

In order to separate the growth from the nucleation stage, Langevin fluctuations (to model thermal fluctuation and the resulting nucleation) are applied to the initially homogeneous austenite phase during PF simulations, which are run for the same number of PF steps using the transformation pathways listed in Table 1 to compare the resulting growth kinetics. The final microstructures are shown in Fig. 2, where polytwin structures consisting of alternating layers of two tetragonal variants are observed. Two domains in antiphase relationship (corresponding to plus and minus sign in OP) are also observed, which are represented by different colors in one martensite layer as shown in Fig. 2. The corresponding growth kinetics of different pathways can be compared by plotting the fraction of total transformed volume against the simulation time, as shown in Fig. 3a. It suggests that the growth kinetics can differ significantly, depending on the exact transformation pathway. Taking LP as the reference, the growth rate of martensite can either increase or decrease (significantly) when a nonlinear transformation pathway is adopted. In order to have a quantitative comparison between different pathways, which in the current GPFM theory becomes more complicated due to the noncollinear coupling, we plot the von Mises strain of the transformation strain tensor along the pathways, as shown in Fig. 3b. The overall ranking of the four pathways (in Table 1), in terms of the magnitude of von Mises strain, is NP1 < NP3 < LP < NP2, even though the difference among the latter three is significantly smaller, as indicated by Fig. 3b. This ranking of von Mises strain is the same as that of the growth kinetics shown in Fig. 3a, which is not surprising because von Mises strain is an equivalent scalar measure of the original full strain tensor and directly indicates the amount of the resulting strain energy density. While the chemical energy density difference between the austenite and martensite remains the same for all the pathways, the interface-related strain energy, as well as the stress field in the vicinity of the diffuse interface, can vary significantly, and a change of the growth rate is thus expected. A previous model38 has also shown that the mobility of martensite interfaces is largely determined by the corresponding elastic properties.
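The scalar comparison of pathways can be sketched as follows. The definition of the von Mises equivalent strain used here, \(\sqrt{\tfrac{2}{3}e_{ij}e_{ij}}\) with \(e_{ij}\) the deviatoric strain, is a common convention and is an assumption, as the text does not state which definition was used. The coefficients in `alpha_nonlinear` are hypothetical placeholders and are not the Table 1 values. The strain follows the normalized form \(\epsilon_0\,\widetilde\epsilon_{ij}^{p}\,\Lambda_p^{ij}/(1+\alpha_p^{ij})\) with \(\Lambda_p^{ij} = \eta_p^2 + \alpha_p^{ij}\eta_p^4\), so all pathways coincide at \(\eta_p = 1\).

```python
import numpy as np

def von_mises_strain(eps):
    """Equivalent strain sqrt(2/3 * e_ij e_ij), with e the deviatoric part of eps."""
    dev = eps - np.trace(eps) / 3.0 * np.eye(3)
    return np.sqrt(2.0 / 3.0 * np.sum(dev * dev))

def pathway_strain(eta, eps_tilde, alpha, eps0=1.0):
    """Single-variant transformation strain, normalized so that eta = 1 gives eps0 * eps_tilde."""
    lam = eta**2 + alpha * eta**4                  # Lambda_p^{ij}, entrywise
    return eps0 * eps_tilde / (1.0 + alpha) * lam

eps_tilde = np.diag([2.0, -1.0, -1.0])             # normalized variant 1 of Eq. (10)
alpha_linear = np.zeros((3, 3))                    # linear coupling in eta**2
alpha_nonlinear = np.zeros((3, 3))                 # hypothetical noncollinear coefficients
alpha_nonlinear[0, 0] = 0.3
alpha_nonlinear[1, 1] = alpha_nonlinear[2, 2] = -0.2

for eta in np.linspace(0.0, 1.0, 6):
    vm_lin = von_mises_strain(pathway_strain(eta, eps_tilde, alpha_linear))
    vm_non = von_mises_strain(pathway_strain(eta, eps_tilde, alpha_nonlinear))
    print(f"eta={eta:.1f}  linear={vm_lin:.4f}  nonlinear={vm_non:.4f}")
```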

Fig. 2

Polytwin structures consisting of two orientational variants of martensite are obtained from phase field dynamics using transformation pathway of a LP, b NP1, c NP2, and d NP3 as defined in Table 1. Note that in each martensite layer, two domains (represented by different colors) in antiphase relationship (corresponding to plus and minus sign in OP) are observed

Fig. 3

a Volume fraction of martensite during PF simulations as in cases shown in Fig. 2. b Von Mises strain of transformation strain tensor along different transformation pathways, with inset showing an enlarged portion to indicate the ranking

Critical nucleus and MEP

To investigate the effect of nonlinear transformation pathways on the properties of a critical nucleus (e.g., size, shape, and activation energy), we determine the critical nucleus of the MT using the PF functional-based NEB method.30 (For other computational tools for determining the nucleus in phase transformations, see a recent review in ref. 39.) Owing to the so-called “free-end” treatment,40 which is also used in ref. 30, we can take well-relaxed configurations such as those in Fig. 2 as the end-node images in our NEB calculation, and use a linear interpolation between the start-node image (the homogeneous austenite phase) and the end-node image to set up the remaining initial nodes. NEB relaxation is then performed to drive the node images to converge to the MEP. Figure 4 shows the MEPs obtained for different elastic-to-chemical energy ratios ξ. In all cases, the transformation pathway NP1 exhibits a nucleation barrier that is significantly lower than those obtained for the other pathways, which is consistent with its fastest growth rate shown in Fig. 3a. The nucleation barrier along NP3 is the second lowest, although it is much closer to those along LP and NP2, which are almost the same.
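
The two steps named above, linear interpolation of the initial nodes followed by NEB relaxation, can be illustrated on a toy problem. The sketch below is not the PF functional-based, free-end scheme of refs. 30 and 40: it is a plain fixed-end NEB with a simple bisecting tangent and steepest-descent updates, applied to a hypothetical two-dimensional energy surface whose saddle point lies at (0, 0.5) with a barrier of 1. All function names and parameter values are assumptions made for illustration.

```python
import numpy as np

def energy(p):
    """Hypothetical 2-D surface: minima at (-1, 0) and (1, 0), saddle at (0, 0.5)."""
    x, y = p[..., 0], p[..., 1]
    return (x**2 - 1.0)**2 + 5.0 * (y - 0.5 * (1.0 - x**2))**2

def gradient(p):
    """Analytical gradient of energy()."""
    x, y = p[..., 0], p[..., 1]
    d = y - 0.5 * (1.0 - x**2)
    return np.stack([4.0 * x * (x**2 - 1.0) + 10.0 * d * x, 10.0 * d], axis=-1)

def linear_path(start, end, n_images):
    """Initial node images by linear interpolation between the two end nodes."""
    t = np.linspace(0.0, 1.0, n_images)[:, None]
    return (1.0 - t) * start + t * end

def neb_relax(path, k_spring=1.0, step=0.01, n_steps=2000):
    """Relax interior images with the standard NEB force projection."""
    path = path.copy()
    for _ in range(n_steps):
        # Unit tangent at each interior image (simple bisecting estimate).
        tau = path[2:] - path[:-2]
        tau /= np.linalg.norm(tau, axis=1, keepdims=True)
        # True force: keep only the component perpendicular to the path.
        g = gradient(path[1:-1])
        f_perp = -(g - np.sum(g * tau, axis=1, keepdims=True) * tau)
        # Spring force: acts only along the tangent, keeps images spaced.
        d_next = np.linalg.norm(path[2:] - path[1:-1], axis=1, keepdims=True)
        d_prev = np.linalg.norm(path[1:-1] - path[:-2], axis=1, keepdims=True)
        f_spring = k_spring * (d_next - d_prev) * tau
        path[1:-1] += step * (f_perp + f_spring)
    return path

initial = linear_path(np.array([-1.0, 0.0]), np.array([1.0, 0.0]), 11)
relaxed = neb_relax(initial)
barrier = energy(relaxed).max() - energy(relaxed[0])
```

In this toy case the straight initial path passes through (0, 0), well away from the saddle point; relaxation moves the middle image to (0, 0.5), and `barrier` approaches the saddle-point energy of 1.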

Fig. 4

MEPs calculated for elastic to chemical energy ratio of a ξ = 0.5, b ξ = 0.8, and c ξ = 1.0

Apart from the energy barriers, the underlying critical nucleus (or saddle-point) configurations also differ significantly, as can be seen in Fig. 5. When ξ = 0.5, representing a relatively larger undercooling compared with the cases that will be studied later, the saddle-point configurations (Fig. 5a–d) for all pathways considered are made of a single variant. The OPs in these nuclei take mostly intermediate values between 0 and 1, suggesting a typical nonclassical nucleation scenario, same as in ref. 30.

Fig. 5

Critical nucleus configuration for different transformation pathways (from left to right column: LP, NP1, NP2, and NP3) when the elastic to chemical energy ratio is (a–d) ξ = 0.5, (e–h) ξ = 0.8, and (i–l) ξ = 1.0. Two colors represent two different orientational variants

When ξ increases from 0.5 to 0.8, the critical nucleus configuration changes from single-variant to two-variant for transformation pathways LP (same as in ref. 30), NP2, and NP3, but remains single-variant for NP1, as shown in Fig. 5e–h. This suggests that although an increased ξ promotes the dominant role played by the elastic energy in determining the critical nucleus configuration in LP, NP2, and NP3, it is still not high enough for NP1 to switch to an MEP that has a two-variant saddle point, owing to its specific coupling between the OPs and eigenstrains. More interestingly, the symmetrical two-variant critical nucleus along LP loses its symmetry when the transformation pathway is changed to either NP2 or NP3, even though the difference in the nucleation barriers is small compared with that for NP1, as shown in Fig. 4b. The two variants now appear as two separate plates that form an acute angle between them. In addition, the two variants have visibly different sizes.

Finally, when ξ = 1.0, the morphology of the nucleus changes to that shown in Fig. 5i–l. As expected, nucleation requires a configuration with a much larger volume (and consequently a larger energy) to overcome the much higher strain energy penalty compared with the cases of ξ = 0.5 and ξ = 0.8. For LP, an internally twinned two-variant critical nucleus is obtained, same as in ref. 30. For NP1, the critical nucleus is still single-variant, while for NP2, the critical nucleus configuration is similar to the corresponding case at ξ = 0.8 (Fig. 5g), except that the two variant plates become much larger and thinner, with nearly equal size. The critical nucleus configuration along NP3 is similar to the internally twinned two-variant configuration along LP, with the symmetry slightly changed as well. Thus, depending on the actual transformation pathway, the configuration of a critical nucleus and the nucleation barrier may change significantly, which is expected to have a profound influence on the subsequent transformation kinetics and the final microstructure.

Discussion

Nucleation and growth are vital phenomena for understanding solid-state phase transformations and deformations. The emergence of nanostructured materials,41 wherein the time and length scales of the involved nucleation and growth may be significantly confined, prompts a further need for a more detailed understanding of the transformation pathways. Our simulations above, though still parametric in nature, have revealed the significant role played by the transformation pathways in determining the properties of critical nuclei during structural transformations. To further assess the significance of these results, we analyze the stress state, in particular the hydrostatic stress, σ Hyd ≡ (σ 11 + σ 22 + σ 33)/3, associated with the saddle-point configurations in Fig. 5i and k; the result is shown in Fig. 6. In the vicinity of the critical nucleus, there are regions under either hydrostatic tension or hydrostatic compression, which is true for both transformation pathways. However, the symmetry of σ Hyd is different: center-symmetry for LP but mirror-symmetry (about the twin boundary) for NP3. In both cases, hydrostatic tension tends to build up at the tip where the two variants are closing up, while hydrostatic compression tends to build up at the tip where the two variants are “branching”. These results are consistent with the morphology of the critical nuclei. We further compare the resulting far-field σ Hyd (calculated by volume-averaging σ Hyd over the entire computational cell) of the different pathways in the case of ξ = 1.0 and show the result in Table 2. For a quantitative comparison, we calculate the ratio of the far-field σ Hyd to the far-field von Mises stress σ VM, which is expected for all shear-dominant MTs subject to clamped boundary conditions. Table 2 suggests that there is clearly a nonzero far-field hydrostatic stress for pathways NP2 and NP3, but zero for LP and NP1. 
In other words, the embryos of a pure-shear MT can actually be hydrostatically “charged”, depending on the transformation strain pathways.
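
A ratio of this kind can be evaluated from a stress field as in the following sketch. It follows the definition of σ Hyd given above and the standard von Mises stress, sqrt(3/2 s:s) with s the deviatoric stress. Computing both quantities from the volume-averaged stress tensor is an assumption made here for illustration; the function names and the array layout `(..., 3, 3)` are likewise hypothetical.

```python
import numpy as np

def hydrostatic_stress(sigma):
    """sigma_Hyd = (sigma_11 + sigma_22 + sigma_33) / 3 for a 3x3 tensor."""
    return np.trace(sigma) / 3.0

def von_mises_stress(sigma):
    """sigma_VM = sqrt(3/2 * s:s), with s the deviatoric part of sigma."""
    s = sigma - hydrostatic_stress(sigma) * np.eye(3)
    return np.sqrt(1.5 * np.sum(s * s))

def far_field_ratio(stress_field):
    """Ratio sigma_Hyd / sigma_VM of the volume-averaged stress tensor.

    stress_field: array of shape (..., 3, 3) holding one tensor per grid point.
    Assumption: the far-field values are taken from the cell-averaged tensor.
    """
    mean_sigma = np.asarray(stress_field, dtype=float).reshape(-1, 3, 3).mean(axis=0)
    return hydrostatic_stress(mean_sigma) / von_mises_stress(mean_sigma)
```

For a uniform uniaxial stress the ratio is 1/3, and for a uniform pure shear stress it is 0, which gives two simple reference points when reading such a ratio.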

Table 2 Ratio of the far-field hydrostatic stress σ Hyd to von Mises stress σ VM associated with the critical nucleus of different transformation pathways.
Fig. 6

Hydrostatic stress distribution in a two-dimensional cross-section through the saddle-point configuration for a LP (Fig. 5i) and b NP2 (Fig. 5k). Note that the boundaries of the two variants are represented by closed green and red lines, respectively, and the colorbar is in reduced units


In practice, it is the far-field σ Hyd that drives other crystalline defects such as point defects, whose stress fields also have a nonzero hydrostatic component, to interact with the critical nucleus; this interaction may fundamentally change the subsequent kinetics and lead to very distinctive morphologies and properties. For instance, the diffusion potential42 due to the vacancy-exchange mechanism has been shown to be \({\mu _v}\; = \;{k_B}T\ln \left( {\frac{{{X_v}}}{{X_v^e}}} \right) - {\Omega _r}{\sigma _{{\rm{Hyd}}}}\), where k B is the Boltzmann constant, T is the temperature, \(X_v^e\) is the vacancy concentration in equilibrium with a stress-free flat surface, X v is the local vacancy concentration, and Ω r is the vacancy relaxation volume. As a result, the interaction between vacancies and martensite/austenite interfaces can be drastically changed by varying the transformation pathway (and hence σ Hyd, according to Table 2). In fact, it has been proposed on the basis of experiments that excess quenched-in vacancies may migrate to martensite/austenite interfaces, which act as sinks, and reduce the interface mobility, resulting in a significant change of the M s and M f temperatures.43 Another example is solute segregation at coherent precipitate/matrix interfaces (or twin/grain boundaries), which has been confirmed by many experiments.44,45,46,47 The exact driving forces for solute migration to coherent interfaces/boundaries (which have relatively low interfacial energy compared with semi-coherent and incoherent ones) are still unclear and deserve more systematic study. Static first-principles and/or continuum calculations have suggested that either electronic (chemical) effects45, 46 or elastic interaction,47 similar to our finding here, can lead to the observed segregation. 
All these examples indicate that an accurate treatment of the elastic properties of an interface (in particular, the hydrostatic component that is not expected on the basis of a linear transformation pathway) is essential for studying interface-dominated or interface-related microstructure evolution at the nanoscale.
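
To make the role of σ Hyd in the diffusion potential concrete, the expression for μ v quoted above can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, and any numerical inputs supplied by a user (temperature, relaxation volume, stress level) are assumptions rather than values from this work.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def vacancy_diffusion_potential(x_v, x_v_eq, sigma_hyd, omega_r, temperature):
    """mu_v = k_B T ln(X_v / X_v^e) - Omega_r * sigma_Hyd (SI units, per vacancy)."""
    return K_B * temperature * math.log(x_v / x_v_eq) - omega_r * sigma_hyd

def zero_potential_fraction(x_v_eq, sigma_hyd, omega_r, temperature):
    """Local vacancy fraction at which mu_v vanishes for a given sigma_Hyd.

    Obtained by setting mu_v = 0 in the expression above:
    X_v = X_v^e * exp(Omega_r * sigma_Hyd / (k_B T)).
    """
    return x_v_eq * math.exp(omega_r * sigma_hyd / (K_B * temperature))
```

Because σ Hyd enters the exponent, a change of sign or magnitude of the local hydrostatic stress shifts the vacancy fraction at which μ v vanishes, which is why a pathway-dependent σ Hyd can alter the vacancy-interface interaction.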

Regarding the change of the critical nucleus configuration, i.e., from a single-variant to a multiple-variant configuration, qualitative analysis and understanding may be achieved by employing the following dimensionless ratio

$$\zeta \; = \;\frac{{{{\bar E}_s}}}{{{{\bar E}_e}}} = \frac{{{\gamma _s}A}}{{\mu \epsilon _0^2V}} = \frac{{{r_0}}}{L}$$
(13)

where \({\bar E_s}\) and \({\bar E_e}\) represent, respectively, the interfacial and elastic energy of a critical nucleus. The former is estimated as the product of the specific interfacial energy, γ s , and the interfacial area, A, and the latter as the product of the average elastic energy density, \(\mu \epsilon _0^2\), and the critical nucleus volume, V. Based on dimensional analysis, we can further define two characteristic length scales. One is \({r_0} = {\gamma _s}/\mu \epsilon _0^2\), which characterizes a material property, and the other is L = V/A, which characterizes the size of the critical nucleus. The ratio ζ has been used to analyze the equilibrium shape of a coherent particle.31, 48, 49 It can be seen from the simulation results (Fig. 5) that as the elastic energy contribution increases (increasing ξ and decreasing ζ ), the volume of the critical nucleus increases for all four pathways considered; more interestingly, for LP, NP2, and NP3, this leads to a change of the critical nucleus from a single-variant configuration (when the interfacial energy contribution dominates) to a self-accommodated multiple-variant one (when the elastic energy contribution dominates). These results are consistent with the physical argument behind Eq. (13), i.e., when the elastic energy contribution becomes dominant over the interfacial energy contribution, the critical nucleus adopts a strain-energy-accommodating multiple-variant structure, and when the situation is reversed, the critical nucleus changes to a single-variant structure to eliminate the energy associated with the variant-variant boundary. A special case is NP1, where the critical nucleus remains in a single-variant configuration even at ξ = 1.0. This can be understood by recalling that the intermediate von Mises strain along NP1 is significantly lower than those of the other pathways, as shown in Fig. 3b. 
Considering that the elastic energy is proportional to the square of the strain and that the OPs of the critical nuclei in Fig. 5 lie in the intermediate range (~ 0.5), it can be expected that for NP1 the interfacial energy still dominates over the elastic energy and the critical nucleus remains in a single-variant configuration. Since the critical nucleus configurations predicted in the simulations are highly non-classical,50 it is difficult to quantify the ratio of Eq. (13) for the critical nucleus.
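
Although Eq. (13) is hard to evaluate for the non-classical nuclei found here, its size dependence can be seen in a purely illustrative idealized case. Assuming, only for this example, a spherical nucleus of radius R (a symbol not used elsewhere in this work), the definitions L = V/A and \({r_0} = {\gamma _s}/\mu \epsilon _0^2\) give

$$L = \frac{V}{A} = \frac{{\frac{4}{3}\pi {R^3}}}{{4\pi {R^2}}} = \frac{R}{3},\qquad \zeta = \frac{{{r_0}}}{L} = \frac{{3{\gamma _s}}}{{\mu \epsilon _0^2R}}$$

so that ζ decreases as 1/R: the larger the nucleus, the more the elastic energy outweighs the interfacial energy, in line with the trend discussed above.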

Finally, it is worth pointing out some limitations of the current work that can be addressed in the future. First, the nonlinear and noncollinear coupling is confined to the interface region, and the thickness of the interface region is controlled by the Landau free energy landscape and the gradient energy coefficients in PF modeling; to obtain quantitative results, atomistically informed transformation strain pathways, free energies, and gradient energy coefficients should be incorporated directly. Second, it is assumed that austenite and martensite have the same elastic modulus, which can be improved by using the inhomogeneous elasticity solver for PFM.51 Finally, a small-strain framework is used in the microelasticity theory, which prevents the proposed nonlinear and noncollinear transformation strain pathways from being investigated at finite strains. A GPFM framework that utilizes finite strain theory while retaining the spectral method28 is currently under development.

Conclusions

The PFM theory is re-examined, in particular, in terms of the assumption of linear mapping between phase fields and SFTS tensor components. Motivated by the fact that many experiments and direct ab initio calculations on solid-state deformation and phase transformations such as dislocation motion, MT, and the deviation from Vegard’s law in solid solution model have revealed complex nonlinear transformation pathways, we have developed a GPFM theory to account for general nonlinear (and noncollinear) couplings between OPs and transformation strains. PF simulations incorporating this newly-developed GPFM have been carried out in a cubic to tetragonal martensite transformation and show that the nonlinear transformation pathways can significantly change the martensite growth rate. The GPFM-based PF energy functional is then incorporated in NEB calculations to determine the corresponding critical nucleus of martensite. It is shown that the incorporation of nonlinear transformation pathways can significantly change the morphology and activation energy of the critical nuclei. In particular, the critical nuclei can possess nonzero far-field hydrostatic stress, even though the final product phase is a pure-shear martensite with no volume change, which could never be expected by the original PFM theory. The corresponding physical consequence is that martensite embryos can be hydrostatically “charged” and interact with other embryos and crystalline defects with nonzero hydrostatic stress components, e.g., point defects, leading to possible interesting phenomena such as vacancy clustering at austenite/martensite interfaces and/or solute segregation at coherent precipitate/matrix interfaces. 
The newly developed GPFM theory provides a framework for incorporating general transformation pathways, determined either from atomistic calculations or experimental characterization, into PF simulations, which leads to a multiscale quantitative modeling scheme for systematically studying solid-solid phase transformations.

Methods

The PF governing differential equations for the time evolution of the martensitic transformation are solved numerically using the (first-order) forward Euler method. Our GPFM formulation is solved using the Fourier spectral method; the numerical solution is implemented using the open-source FFTW MPI library (http://www.fftw.org/fftw2_doc/) for the FFT algorithm on distributed-memory machines supporting MPI. The determination of the critical nucleus of martensite employs the NEB method52 combined with the PF energy functional,30 and the numerical implementation follows the “free-end” treatment.40
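
The combination of forward Euler time stepping, a Fourier spectral evaluation of spatial derivatives, and a Langevin noise term can be sketched in a few lines. The example below is a generic single-OP relaxation equation with a hypothetical double-well energy f = η²(1 − η)² on a periodic two-dimensional grid, written with NumPy rather than FFTW MPI. It contains no elastic energy and is not the GPFM formulation; the function names, parameter values, and noise amplitude are assumptions for illustration.

```python
import numpy as np

def make_k2(n, dx=1.0):
    """Squared wave-vector magnitude on an n x n periodic grid."""
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)
    return k[:, None]**2 + k[None, :]**2

def euler_step(eta, k2, dt, mobility=1.0, kappa=1.0, noise_amp=0.0, rng=None):
    """One explicit (forward Euler) step of d(eta)/dt = -M * dF/d(eta) + noise."""
    # Bulk driving force from the hypothetical double well f = eta^2 (1 - eta)^2.
    dfdeta = 2.0 * eta * (1.0 - eta) * (1.0 - 2.0 * eta)
    # Laplacian evaluated in Fourier space: FT[lap(eta)] = -k^2 * FT[eta].
    laplacian = np.fft.ifft2(-k2 * np.fft.fft2(eta)).real
    drift = -mobility * (dfdeta - kappa * laplacian)
    eta_new = eta + dt * drift
    if noise_amp > 0.0:
        # Langevin term; amplitude is a free parameter here, scaled by sqrt(dt).
        rng = np.random.default_rng() if rng is None else rng
        eta_new = eta_new + np.sqrt(dt) * noise_amp * rng.standard_normal(eta.shape)
    return eta_new

# Noise-free check: a uniform field at eta = 0.9 relaxes to the well at eta = 1.
k2 = make_k2(32)
eta = np.full((32, 32), 0.9)
for _ in range(400):
    eta = euler_step(eta, k2, dt=0.05)
```

Because the gradient term is treated explicitly, the time step has to stay below a stability limit set by the largest wave vector on the grid; the value dt = 0.05 used here satisfies that limit for unit grid spacing and unit coefficients.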

Regarding the calculation procedure used in obtaining the atomistic data for the FCC-to-BCC martensite transformation in iron presented in Fig. 1, the empirical potential of ref. 53 is used to compute the total energy and stress. The calculations are started with an FCC lattice and an incremental compression strain ε zz is applied at each step, allowing the other five strains (ε xx , ε yy , …) to relax fully so that only stress component σ zz is non-zero and all the other five stress components are zero. This is continued until ε zz reaches the Bain strain, i.e., when σ zz  = 0 again and the system finds itself in another locally stable state (i.e., the BCC state).
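
For orientation, the magnitude of the Bain strain can be estimated from lattice geometry alone. Writing a γ and a α for the FCC and BCC lattice parameters (symbols introduced only for this example), the Bain correspondence maps the FCC face diagonal half-length \({a_\gamma }/\sqrt 2 \) and the cube edge a γ onto BCC cube edges, so the principal stretches are

$${\lambda _x} = {\lambda _y} = \frac{{\sqrt 2 {a_\alpha }}}{{{a_\gamma }}},\qquad {\lambda _z} = \frac{{{a_\alpha }}}{{{a_\gamma }}}$$

Under the additional simplifying assumption of equal atomic volume in both phases, \(a_\gamma ^3/4 = a_\alpha ^3/2\), which the relaxed calculation described above does not impose, these reduce to \({\lambda _x} = {\lambda _y} = {2^{1/6}} \approx 1.12\) and \({\lambda _z} = {2^{ - 1/3}} \approx 0.79\), i.e., roughly a 12% expansion along the two in-plane axes and a 21% contraction along z.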