A universal Bayesian inference framework for complicated creep constitutive equations

Evaluating the creep deformation process of heat-resistant steels is important for improving the energy efficiency of power plants by increasing the operating temperature. There is an analysis framework that estimates the rupture time of this process by regressing the strain–time relationship of the creep process using a regression model called the creep constitutive equation. Because many creep constitutive equations have been proposed, it is important to construct a framework to determine which one is best for the creep processes of different steel types at various temperatures and stresses. A Bayesian model selection framework is one of the best frameworks for evaluating the constitutive equations. In previous studies, approximate-expression methods such as the Laplace approximation were used to develop the Bayesian model selection frameworks for creep. Such frameworks are not applicable to creep constitutive equations or data that violate the assumption of the approximation. In this study, we propose a universal Bayesian model selection framework for creep that is applicable to the evaluation of various types of creep constitutive equations. Using the replica exchange Monte Carlo method, we develop a Bayesian model selection framework for creep without an approximate-expression method. To assess the effectiveness of the proposed framework, we applied it to the evaluation of a creep constitutive equation called the Kimura model, which is difficult to evaluate by existing frameworks. Through a model evaluation using the creep measurement data of Grade 91 steel, we confirmed that our proposed framework gives a more reasonable evaluation of the Kimura model than existing frameworks. Investigating the posterior distribution obtained by the proposed framework, we also found a model candidate that could improve the Kimura model.

www.nature.com/scientificreports www.nature.com/scientificreports/ the range of a parameter, into the model. The posterior distributions are the estimation results of parameters based on measurement data. From the posterior distributions, we can obtain the estimation reliability of the parameters or obtain the properties of the model, such as the correlations among the parameters. These cannot be obtained by a maximum-likelihood method. Such knowledge helps to design a measurement plan or comprehend the model 11,12 .
In a previous study 13 , we applied a Bayesian model selection framework for the evaluation of creep constitutive equations constructed from linear sums of basis functions. We computed two simple creep constitutive equations with and without a steady-state term, and found that the equation with a steady-state term was selected over a wide temperature and stress range for Grade 91 (Gr.91) steel 14 16 . Our previous framework is not applicable to such sophisticated or complicated creep constitutive equations. This is because the parameters of the exponential part of each basis function are not treated as probabilistic variables in the previous framework: the parameters are optimized by a grid search and point estimation based on the empirical Bayes method 10 . If the number of grids per axis is G and the number of parameters is d, the computational cost to execute Bayesian model selection is G d . Therefore, it is not realistic to apply the previous framework to a model with many basis functions, such as the Kimura model. Treating the parameters of a basis function as deterministic parameters is an approximation-expression method in which the posterior distribution of the parameters is a delta function, which is called as the empirical Bayes approximation. If the parameters of the basis function are not well-determined, the empirical Bayes approximation generates a bias in the model selection result. The exchange symmetry of parameter pairs, such as a 1a 2 , b 1b 2 , c 1c 2 , and d 1d 2 in the Kimura model, is an example of parameters that are not well-determined. Keitel et al. also applied a Bayesian model selection framework for the model selection of creep constitutive equations of concrete 17 , where they approximated the posterior distribution of parameters as a Gaussian. Such the approximate-expression method is called the Laplace approximation. The Laplace approximation also leads to a false conclusion if the model parameters are not well-determined 10 . To evaluate the complex creep constitutive equations, it is necessary to perform Bayesian creep model selection without an approximate-expression method. There is another difficulty in Bayesian creep model selection. In the measurement of the creep deformation process in steel, it is difficult to estimate the measurement noise intensity correctly; instead, a noise range is given. In creep model selection, it is important to set the noise intensity correctly, because the noise intensity is related to a criterion that determines whether the signal is to be regressed or considered as noise. To set the noise intensity as a range, it is necessary to set the noise intensity as a probabilistic variable and set its range as a prior distribution. However, in the existing creep model selection frameworks, it is difficult to treat the noise intensity as a probabilistic value. Thus, there is no Bayesian creep model selection framework that can be used to evaluate all types of creep constitutive equations correctly.
In this study, using the replica exchange Monte Carlo method 18 , we propose a universal Bayesian creep model selection framework that can be applied to various types of creep constitutive equations without the approximate-expression method or the ability to set the range of the measurement noise intensity as a prior distribution. By applying the proposed framework to the evaluation of the Kimura model using the creep measurement data of Gr.91 steel 14 , we confirmed that our proposed framework gives a more reasonable evaluation of the Kimura model than existing frameworks. From the accurate posterior distribution obtained by the proposed framework, we also found a way to simplify the Kimura model without losing the model likelihood.

Methods
In the proposed framework, the criterion to achieve a Bayesian creep model selection is obtained by numerical integration. The numerical integration consists of a sequential run of numerical integration using the REMC method and the Riemann sum. The flowchart of the framework is shown in Fig. 1. Bayesian creep model selection. We evaluate K creep constitutive equations M k in terms of their ability to represent the creep deformation data , where ε is the strain and t is the time. The likelihood of a given creep constitutive equation where D P( ) is a normalization constant. In this study, we assume that there is no prior knowledge about the likelihood of the model. Thus, we set the prior probability M P( ) k as a uniform distribution; in this study, it is equal to K 1 . We also assume that t in the dataset ε = D t { , } is given deterministically, that is, non-probabilistically. Then, on the basis of Bayes' theorem, the likelihood of the model is transformed as P( , , ) k k of Eq. (4) is a stochastic generative model of the creep constitutive equation M k . When the measurement noise of the creep deformation data is given as an independent and identically distributed Gaussian with average 0 and standard deviation σ, the conditional probability can be expressed as  www.nature.com/scientificreports www.nature.com/scientificreports/ The probability ε|M P( ) k is often referred to as the marginal likelihood and is proportional to the likelihood of the recognition model M k . The negative log-likelihood is often referred to as the Bayesian free energy. In this way, F M ( ) k is proportional to the negative log-likelihood of the recognition model M k . Therefore, the creep constitutive equation M k with the smallest F M ( ) k value represents the best model.

Replica exchange Monte Carlo sampling method.
To obtain the value of F M ( ) k , we need to execute the integration in Eq. (4). However, it is difficult to analytically execute the integration owing to the complicated relationship between t ( ; ) k i k ε θ and θ k . We overcame this difficulty by numerical integration. The numerical integration was performed in two steps. The first step is integration with respect to the model parameter set k θ to obtain the value of f M ( , ) k σ with σ [Eq. (12)], and the second step is, by using the obtained f M ( , ) k σ value, integration with respect to the noise intensity σ [Eq. (10)] to obtain F M ( ) k . Since step 2 is a one-variable integral with respect to σ [Eq. (10)], the numerical integration can be performed as a Riemann sum. On the other hand, since the k θ integral is high-dimensional, we carried out the integration by the sampling method. In this section, we explain how to calculate σ f M ( , ) k with a given noise intensity σ by the sampling method. Markov chain Monte Carlo (MCMC) methods 19 are efficient sampling methods for estimating the expectation value of a probability distribution in a high-dimensional space. σ f M ( , ) k is given using an auxiliary variable β; obtained by dividing the region from 0 β = to 1 β = into L pieces in some manner, and each NE( , ) is obtained by performing MCMC sampling independently at each inverse temperature l β . However, MCMC sampling often results in trapping at local minima.
The replica exchange Monte Carlo (REMC) method is an algorithm of an MCMC method used to avoid trapping at local minima. The REMC method takes samples from the joint density (18). The REMC method performs sampling from the joint density on the basis of the following updates.
1 Sampling from each density Sampling from a distribution with a smaller β corresponds to sampling from a distribution with a larger intensity of noise; thus, the distribution tends not to have a local minimum. Hence, sampling from the joint density overcomes the local minima in distriubtions with large β and enables the rapid convergence of sampling.
Using the sampling result of the β = 1 state, we can obtain the posterior distribution of the parameter (18)] for the noise intensity σ. From the posterior distribution of θ k , we can estimate the model parameters θ k of M k and the related information, such as the estimation accuracy.
From the sampling result of Eq. (20), Here, we describe how to k s σ can be rewritten by using σ as k www.nature.com/scientificreports www.nature.com/scientificreports/ s L k s s s 1

Application example
Here, as an application example to verify the effectiveness of the proposed framework, we evaluate the Kimura model, which is one of the most successful creep constitutive equations, and the modified Kimura model, which was created as a model for comparison with the Kimura model. These models are difficult to evaluate using the existing framework.
Creep constitution models and material. The modified Kimura model was set on the basis of the following assumptions. There are a number of creep constitutive models, which are roughly classified into two types according to the way the steady state, where the deformation rate is constant, is modeled (Fig. 2). One is a steady-state creep model, in which a linear region with a constant deformation rate is represented as an independent linear term. The other is the unsteady creep model, in which the steady state is generated by the balance between the deceleration of the primary creep and the acceleration of the tertiary creep. The Kimura model is formulated as Eq. represents the tertiary creep, and the linear region is represented by its balance. By regression analysis using the creep data set of Gr.91 steel, it was previously found that b 2 takes a value close to 1 1 . This implies that the Kimura model models the steady state as a linear term rather than a balance. In our previous study 13 , it was also found that, using the measurement data 14 of Gr.91 steel, the likelihood of another type of creep constitutive model, the theta projection model 4 , was improved by adding a linear term. On the other hand, it was also reported by Kimura et al. that b 2 can take a value larger than 1 by changing the fitting method 2 . To determine whether a steady state is required, we designed the following steady-state creep model by replacing b 2 in the Kimura model with 1. Figure 2. Creep curves and three time domains. Primary creep represents the zone of creep rate deceleration, steady-state creep represents the creep rate zone of constant velocity, and tertiary creep represents the final acceleration zone.  14 . In this verification, on the basis of the regression results shown in refs. 1 and 2 , the prior distributions of the Kimura model and the modified Kimura model were set as shown in Table 2. In Bayesian inference, priors are an important part of the model. To examine the effect of the prior distribution in Bayesian model selection, several different prior distributions of the noise intensity σ were prepared. The relationship between the prior and the model selection result was verified. Concretely, the prior distribution of the noise intensity σ was set as a uniform distribution with a certain value as the lower limit and 10 times the lower limit as the upper limit. Then, 100 different prior distributions were prepared with values of the lower limit from 3 0 10 5 . × − to . × − 3 3 10 3 at equal intervals in logarithmic space. We applied the proposed framework under these conditions.
When we set the prior distributions of a 1 , a 2 , b 1 , b 2 , c 1 , c 2 , d 1 , and d 2 as described in Table 2, the exchange symmetry of parameter pairs in the Kimura model disappears. The exchange symmetry is one of the reasons why the Kimura model is an undetermined model. On the other hand, the Kimura model continues to have an undetermined structure around the parameters estimated as the maximum-likelihood solution for the following reasons. To model the transition of a creep curve under a change in stress and temperature conditions, Kimura et al. estimated the stress and temperature dependences of parameters. They reported that the distribution of c 2 in stress and temperature space has a significantly wider dispersion than the other parameters 1,2 . This behavior can be understood from the fact that the estimation of parameter c 2 is unstable, and, therefore, the parameters of the Kimura model are not well-determined around the maximum-likelihood solution.
In the REMC sampling, we adopted the Metropolis-Hastings algorithm 20 to sample each state of the inverse temperature. The states of the inverse temperature were determined using the following exponential function 21 : where = L 311 and 1 05 γ = . . The approximate error Er l of the numerical integration, i.e., the Riemann sum, between l β and l 1 β + is given as Therefore, it is necessary to make the division l β ∆ sufficiently smaller than E l ∆ . In the Kimura model, which is a linear sum of exponential functions and power functions, slight parameter variations cause large functional changes. In particular, at l 1 = , where sampling is performed over the entire range of the prior distribution, the l 1 = state takes a very large average energy E 1 because of such the nature of this model. On the other hand, in the region of l 1 > , the parameter region with a small squared error is sampled. Therefore, the energy in the = l 2 state becomes  E E 2 1 . As a result, E 1 ∆ becomes very large and the approximation error Er 1 becomes non-negligible. Therefore, in this study, to reduce the approximation error Er 1 , the following temperature states are inserted between 1 β and β 13 so that β ∆ < ∆ E  In the sampling, we abandoned the first 100,000 steps and sampled the next 100,000 steps.

Results
In this study, model parameters were estimated from the posterior distributions using the maximum a posteriori (MAP) method. The MAP method estimates parameters on the basis of the following equations: www.nature.com/scientificreports www.nature.com/scientificreports/ We examined the fitting results by the MAP solution with two types of noise prior. One was a small-noise-intensity prior, which was set as a uniform distribution from . × − 4 0 10 5 to . × − 4 0 10 4 , and the other was a large-noiseintensity prior, which was set as a uniform distribution from 4 0 10 4 . × − to . × − 4 0 10 3 . From the regression results (Fig. 3), it was confirmed that both the Kimura model and the modified Kimura model provided a good fitting regardless of the prior distribution of noise. To examine the regression results more precisely, we compared the mean square error (MSE) for the fitting results: For the small-noise-intensity prior distribution, the MSE of the Kimura model was . × − 1 65 10 9 and that of the modified Kimura model was . × − 8 31 10 9 . For the large-noise-intensity prior distribution, the MSE of the Kimura model was . × − 2 69 10 9 and that of the modified Kimura model was . × − 8 44 10 9 . Thus, the Kimura model always has a smaller MSE regardless of the noise intensity of the prior distribution. Moreover, the MSE did not change significantly with the noise intensity of the prior distribution. It was also confirmed that the squared error between the experimental data and the regression curve at each time (gray area of Fig. 3) has the same error distribution in each model regardless of the noise intensity. The creep rate curve was calculated by differentiating this regression curve. The estimated creep rate curves closely fit the creep rate data points. The creep rate data points was estimated as the difference in the adjacent points of strain data (black dots in Fig. 4). The value of the coefficient a 2 of the modified Kimura model [Eq. (27)], representing stationary creep, was similar to the minimum creep rate estimated by the experimenter as the creep rate in the steady area 13 . This result is consistent with the assumption that the second term of the modified Kimura model models the steady-state creep.
Using the proposed framework, it was confirmed that the selected model switched from the Kimura model to the modified Kimura model at the point where the prior distribution of the noise intensity σ is set as a uniform distribution from 3 88 10 4 . × − to . × − 3 88 10 3 (Fig. 5(a)). On the other hand, it was also confirmed that the MSE of the Kimura model was always smaller than that of the modified Kimura model regardless of the noise prior distribution (Fig. 5(b)). For comparison with the existing framework 17 , model evaluation based on the existing framework was performed. In the existing framework 17 , the Bayesian free energy is calculated on the basis of the Laplace approximation (see Appendix A for a more general theory of the Laplace approximation including σ). Using the MAP solution θ k MAP , the Laplace approximation is given by     Model selection was performed using this approximated free energy. In the model selection results using this approximated free energy, selected models are switched at a higher noise intensity than the switched noise intensity of the proposed framework ( Fig. 5(a)). This means that the existing framework more often evaluates the Kimura model as the more likely model than the proposed framework.
To visualize a posterior distribution with three or more dimensions, we calculated the following marginal posterior distribution, which marginalizes the posterior distribution of the parameters θ ¬ This integration can be approximated from the sum of the sampling result of the REMC method. The density function of the marginal posterior can be estimated by the kernel density estimation method using Gaussian kernels. We determined the bandwidth of the Gaussian kernels using Scott's rule 22 . We compared the marginalized posterior distribution focusing on one parameter for the small-noise-intensity prior, where the Kimura model was selected (Figs. 6 and 7), and the large-noise-intensity prior, where the modified Kimura model was selected (Figs. 8 and 9). From the visualization results of the marginalized posterior distribution for the small-noise-intensity prior (Figs. 6 and 7), it was confirmed that the parameters' posterior distribution of the Kimura model has a multimodal and complex structure, which is different from the posterior distribution of the modified Kimura model. The posterior distribution of c 2 with a small-noise-intensity prior has a particularly multimodal and broadened distribution (Fig. 6(a4)). Whereas, in the large-noise-intensity prior, it was observed www.nature.com/scientificreports www.nature.com/scientificreports/ that the posterior distribution of the Kimura model also became unimodal distribution (Figs. 8 and 9). Next, we examined the relationship between two parameters from the marginalized posterior distribution. Many marginalized posterior distributions were unstructured isotropic Gaussian distributions such as the marginalized posterior distributions of c 1 and 0 ε (Figs. 10(a1,b1)). On the other hand, the marginalized posterior distributions between parameters of the exponential part and the coefficient part of the basis function have structured distributions, such as c 2 and d log 10 2 (Figs. 10(a2,b2)).

Summary and Discussion
The results of the application of the proposed framework to the Kimura model show that the model selection results vary depending on the noise intensity range. Furthermore, it is clarified for the first time that the posterior distribution of the Kimura model is multimodal in the small-noise-intensity region. In this section, after discussing the validity of the model selection results and their properties, we compare the Kimura model and the modified Kimura model in terms of multimodality of them posterior distributions. Similarly to their work 1,2 , our proposed framework achieved data fitting with high accuracy. This suggests that our framework performed under the same conditions as the Kimura model. In addition, the closeness of the minimum creep rate and the slope a 2 of the second term of the modified Kimura model (Eq. (27)) suggests that the steady state was modeled by the linear term a t 2 , which was assumed when we set the modified Kimura model. Thus, it was confirmed that the regression results using the MAP solution of the proposed framework were reasonable.
The MSE of the Kimura model was smaller than that of the modified Kimura model, even in the large-noise-intensity prior distribution, in which the modified Kimura model was selected. This result indicates that the proposed framework gives model selection results that do not depend solely on the magnitude of the error. The results are explained using the following approximated Bayesian free energy formula. By extracting terms higher than order log N ( ) from the result of the Laplace approximation, the following equation is obtained (see Appendix A): where d is the number of model parameters k θ . The first term in Eq. (38) represents the regression error of the model and the second term represents the complexity of the model. In general, the more complex the model, the smaller the value of the first term; however, overfitting occurs in more complex models. The presence of the second term in the Bayesian free energy makes it possible to evaluate the model while preventing overfitting.
Thus, the Laplace approximation is useful for a rough clarification of the property of Bayesian free energy. However, the Laplace approximation is based on the hypothesis that the posterior distribution of the parameters is a unimodal Gaussian distribution. On the other hand, the proposed method selects the model by assuming that the posterior distributions of the Kimura model with small-noise-intensity prior are multimodal. The difference in the model selection results between the frameworks (Fig. 5(a)) is due to this difference in the property of the Laplace approximation and the proposed framework. This presumption is validated from the behavior of mod kimura in the Laplace approximation and the proposed framework ( Fig. 5(a)), which tends to be small in a large-noise-intensity prior situation where the posterior distribution of the Kimura model is estimated as unimodal in the proposed framework (Figs. 8(a1-4) and 9(a1-4)). This might be the mechanism by which the switching noise intensity of the selected model differs between the proposed framework and the Laplace approximation (Fig. 5(a)).
Using the proposed framework, it is possible to set the prior knowledge of the measurement noise range as the prior distribution. In this research, we developed an efficient marginalization method of the noise intensity σ www.nature.com/scientificreports www.nature.com/scientificreports/ using the sampling result of the REMC method. Such a method is novel not only in creep model selection but also in general Bayesian model selection. As a result of the model selection with this prior noise distribution, it was confirmed that the model selection results of the creep constitutive equation markedly change with the prior distribution of the noise. The measurement noise intensity of the creep data in this study was estimated to be more than . × − 1 4 10 4 from the resolution of the linear gauge (see Appendix B). Since the influences of temperature and humidity fluctuations are added to this noise, the lower bound of the noise-intensity estimate is exist around the region where the model selection results are switched ( Fig. 5(a)). This result shows the importance of specifying the measurement noise intensity in advance, even if the measurement noise intensity is only estimated as a range.
Our analysis revealed that for the Kimura model the posterior distributions of all parameters are multimodal with a small-noise-intensity prior (Figs. 6 and 7). The multimodality should result in unstable optimization when the parameters are estimated by the least-squares method, in which the parameters are optimized at a zero-noise limit. The posterior distribution of c 2 with a small-noise-intensity prior has a particularly multimodal and broadened distribution (Fig. 6(a4)). In fact, according to the work of Kimura et al. 1,2 , the estimation of c 2 is particularly unstable, as we explained in the previous section. Considering the present result, it would be very difficult to stably estimate not only c 2 but also the other parameters by the least-squares method. By contrast, for the modified Kimura model, the posterior distributions of all the parameters were unimodal regardless of the noise-intensity prior. The modified Kimura model was simplified by omitting only one parameter b 2 . The posterior distribution of b 2 with small-noise-intensity prior (Fig. 7(a2)) shows that b 2 has a value greater than 1. This suggests that the secondary term of the Kimura model contributes to the tertiary creep as well as the secondary creep. Inevitably, the Kimura model represents the tertiary creep by three basis functions. Perhaps this is the reason for the multi modality of the posterior distribution of the Kimura model with the small-noise-intensity prior. This demonstrates that the present simplification is efficient for modifying the Kimura model toward a well-determined one. The obtained unimodality of the posterior distributions should allow us to stably estimate the parameters even by the least-squares method. Furthermore, the stability of the parameter estimation would yield a better regression of the parameters in terms of the test temperature and applied stress. Consequently, we consider that the modified Kimura model has the potential to improve the estimation of creep curves under a wide range of test conditions. These results are due to the ability of the proposed framework to capture the complex relationships among the parameters. The previous frameworks for creep, such as the framework using the Laplace approximation 13,17 , could not capture such complex relationships among the parameters.
From the posterior distributions of two parameters, we can also obtain other information about the model properties. The posterior distributions of the parameter of the exponent part and the coefficient parameter are correlated distributions (Figs. 10(a2,b2)). Mathematically, if the parameter of the exponent part increases or decreases, the coefficient parameter must decrease or increase, respectively, to maintain the same trend of the basis function. Such an analysis is useful for quantitative analysis to understand the property of the creep model.
Lastly, we discuss the applicability of the proposed framework to other types of creep constituent equation. The framework is computationally expensive, especially for sampling by the REMC method. In the proposed framework, the samples should be sufficiently representative such that the approximation using Eq. (26), i.e., reusing the samples for estimating the different σ f M ( , ) k s , would be accurate. The REMC method, which is one of the generalized-ensemble algorithms, allows us to efficiently perform the marginalization of the multi modal probabilistic distribution compared with the other sampling methods 23 . We have now conservatively sampled more than 200,000 steps, but we can probably reduce it to 50,000, as shown in Fig. 11 of Appendix C. Nevertheless, applying the proposed framework to a more complex model than the Kimura model will be difficult in terms of sampling convergence. Fortunately, as we mentioned in Introduction, the Kimura model is one of the most complex creep constitutive equations 15, 16 . Thus, one can apply the proposed framework to a wide range of creep constitutive equations that have been proposed so far.
Another issue is how this method applies to the case where the constituent equations are given by simultaneous differential equations. Such types of constitutive equation have been proposed on the basis of damage mechanics, such as the Hayhurst model 24 , K-R model 25 , L-M model 26 , and W-T model 27 . In principle, the www.nature.com/scientificreports www.nature.com/scientificreports/ proposed method is applicable to these models because the strain ε can be computed for a given time t. However, it is necessary to run a numerical simulation in each sampling step to calculate the squared error σ θ E( , ) k (Eq. (11)), which takes a long computational time.
In this paper, we proposed a universal Bayesian creep model selection framework that can evaluate any type of creep constitutive equation. We also showed the effectiveness of our proposed framework using the measurement data of Gr.91 steel, a high-Cr, ferritic, heat-resistant steel. Through this evaluation, we found that the modified Kimura model could be a candidate model to improve the estimation stability of creep curves compared with the Kimura model. This was achieved by accurate estimation of the posterior distribution obtained only using the proposed framework. By applying the proposed method to different steel types at various temperatures and stresses, in-depth knowledge could be obtained about the creep deformation process and how to improve its prediction.