A modification in Weibull parameters to achieve a more accurate probability distribution function in fatigue applications

Risk evaluation for fatigue failure of the engineering components is an important aspect of the engineering design. Weibull distributions are often used in preference to the log-normal distribution to analyze probability aspects of fatigue results. This study presents a probabilistic model for calculating Weibull distribution parameters to reduce the effect of percentage discretization error of experimental fatigue life and R–S–N curves for three reliability levels. By considering any result of standard fatigue test as an equivalent Weibull distribution, artificial data are generated and the accuracy of common Weibull distribution model can be improved. The results show error reduction in the Kolmogorov–Smirnov test and R-square values. Also, the Basquin model is used for different reliability levels with the same error order for risk evaluation of fatigue failure. The coefficient of variation for fatigue life increases at higher stress levels and has a linear relation with stress level for a high-cycle fatigue regime.

Probability distributions are described by statistical parameters such as the mean and standard deviation, as shown in Table 1. The normal distribution, also called the Gaussian distribution, is the most frequently used distribution function in statistical analysis. With its bell-shaped curve, it has been used for independent random variables in survey reports, technical stock market analysis, and the scientific study of many observable natural phenomena such as human height or IQ. Sinclair and Dolan 8 conducted a comprehensive statistical fatigue investigation on 174 nominally identical, highly polished, smooth 7075-T6 aluminum alloy specimens. They tested at 6 alternating stress levels under fully reversed conditions. A normal distribution on the logarithmic scale appeared reasonable for the experimental results at every applied stress amplitude. Based on this and other statistical fatigue test results, a log-normal distribution of fatigue life is usually assumed in fatigue analysis. At every applied stress range, a family of S-N curves at different failure probabilities is established from the probability distribution functions.
The Pareto distribution is one of the power-law probability distributions in probability theory and mathematics. It is commonly used to describe data scatter in many types of scientific studies. In particular, it is applied to describe the distribution of wealth in a society, fitting the trend that a large portion of wealth is held by a small fraction of the population 9.
The Gumbel distribution is usually applied to model the distribution of the extremum (maximum or minimum) of a number of samples drawn from different statistical populations. It is valuable for forecasting earthquakes, floods, and other natural disasters. For instance, if a record of a river's water level over the past 10 years is available, the Gumbel distribution can be applied to predict the distribution of the extreme water level of the river over a specific period. The particular importance of the Gumbel distribution lies in predicting the extremum distribution according to extreme value theory 10.
John Tukey presented a continuous symmetric probability distribution model, the Tukey lambda distribution, which is specified in terms of its quantile function. The Tukey lambda distribution is used mainly to identify a suitable distribution for a data set and therefore usually has no direct application in statistical models. It has a single shape parameter that can be rearranged and defined in terms of the standard distribution.
The exponential distribution is a continuous distribution for modeling events that occur at a constant rate in time. Two important applications of the exponential distribution are the modeling of radioactive decay in physics and the modeling of the posterior default probability for a set of financial assets in finance 11. The exponential distribution can also be applied to analyze the relationship between unobservable actual values and measured values 12. In some cases, the lifetime of a manufactured item may follow a mixed distribution model such as the half-normal distribution and the half-exponential distribution 13.
Among the probability distributions used to analyze fatigue problems 14,15, the Weibull distribution is one of the most common models on the logarithmic scale. Weibull developed a new approach and used it to study actual fatigue data 16,17. Weibull distributions have 2-parameter and 3-parameter forms. The 2-parameter model is widely used in fatigue problems and design. In this approach, the expected fatigue life range starts from zero cycles. Indeed, the 2-parameter Weibull distribution is a simplified 3-parameter Weibull distribution with a minimum expected life of zero, whereas the 3-parameter distribution is characterized by a finite minimum life greater than zero. For 3.3 ≤ b ≤ 3.5, the Weibull distribution function is approximately normal (Gaussian), while it is exponential for b = 1. The coefficient of variation (standard deviation/mean) is approximately C ≈ 1/b for the two-parameter Weibull distribution. For b values between 3 and 6 (typical of fatigue), the error from this approximation is about 10 to 15% 6. In modern life testing analysis, newer experimental procedures are used to obtain information about the fatigue life of a component, in which products are tested under higher stress than normal to obtain their failure information. For example, methods such as adaptive type-I progressive hybrid censoring are designed to evaluate the failure parameters assuming that the failure causes are independent Weibull variables 18.

Table 1. Common models representing a probability distribution (columns: model, equation).

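The 1/b approximation for the coefficient of variation mentioned above can be checked numerically. The following is a minimal sketch, assuming the standard closed-form moments of the two-parameter Weibull distribution (mean θΓ(1 + 1/b), second moment θ²Γ(1 + 2/b)); the function names `weibull_cv` and `approx_error` are illustrative and do not come from the paper.

```python
import math


def weibull_cv(b):
    """Exact coefficient of variation of a two-parameter Weibull
    distribution with shape parameter b (independent of the scale)."""
    g1 = math.gamma(1.0 + 1.0 / b)
    g2 = math.gamma(1.0 + 2.0 / b)
    return math.sqrt(g2 / g1 ** 2 - 1.0)


def approx_error(b):
    """Relative error of the approximation C = 1/b with respect to
    the exact coefficient of variation."""
    exact = weibull_cv(b)
    return (1.0 / b - exact) / exact


if __name__ == "__main__":
    # Shape values typical of fatigue data, plus b = 1 (exponential case).
    for b in (1.0, 3.0, 4.0, 5.0, 6.0):
        print(f"b = {b:.1f}  exact C = {weibull_cv(b):.4f}  "
              f"1/b = {1.0 / b:.4f}  error = {100 * approx_error(b):+.1f}%")
```

For b = 1 the exact value equals 1, as expected for the exponential distribution; for shape values between 3 and 6 the approximation underestimates the exact coefficient of variation by an amount on the order of 10%.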
In a study of statistical fatigue analysis, Zhao and Liu 4 proposed a Weibull approach to the probabilistic study of stress-life data for rolling contact fatigue. The study shows that both the 2-parameter and 3-parameter Weibull equations give reasonable results. However, the 3-parameter Weibull model has a lower standard deviation for fatigue life, and this standard deviation decreases at higher applied stress levels. These results are consistent with those of classic fatigue studies.
Xionga et al. 19 investigated multiaxial fatigue results of a magnesium alloy using the modified Smith-Watson-Topper (SWT) theories and the multiaxial Jiang criterion. The results of both theories were acceptable. Jiang et al. 20 used the Markov chain Monte Carlo method to estimate the parameters of a modified Weibull distribution. They suggested using Markov chain Monte Carlo estimation instead of maximum-likelihood estimation for point estimation when the sample size is less than 100. Canteli et al. 21 studied 3 types of fatigue models, namely LCF, HCF, and VHCF, which are commonly used in the design of mechanical parts. The study presented the actual results of the stress-based and strain-based approaches in a single methodology. Strzelecki 22 used 2-parameter and 3-parameter Weibull distributions and presented features of the S-N curve for fatigue limit investigation. Rotating bending fatigue test results for S355J2+C and C45+C steels were used, and the S-N curves were specified. Acosta et al. 23 used measurement techniques based on temperature and magnetics to describe the fatigue behavior of metallic materials. Furthermore, they reduced the effort required to generate S-N curves by providing valuable input parameters for short-time fatigue life calculation methods.

S-N Relation
In the S-N approach, many models have been developed for evaluating the S-N relation, some of which are shown in Table 2. Basquin 24 suggested a linear relation on the logarithmic scale between the applied stress (S) and the fatigue life (N). Basquin's equation is widely adopted in standards such as ASTM, ASME, and UNE and in guidelines such as FKM, DNV, and GL. Vidovic 25 performed an analytical study of the maximum-likelihood estimates for the parameters of a modified Weibull distribution model and indicated that their implementation in practice follows a rather simple pattern. Usabiaga et al. 26 implemented a model in the NCode2020 software to demonstrate its possible implementation in general commercial codes used for fatigue design.
Crack propagation has always been a source of concern in determining inspection routines in different industries. Crack propagation at higher applied stress amplitudes can cause great uncertainty in the fatigue life estimation. It has been studied widely to characterize different types of cracks, including edge, surface, and subsurface cracks 27–29. These studies suggest that the uncertainty and wide data scatter in fatigue life, compared to other mechanical properties, may be due to the diversity of crack initiation locations and crack types.
Stromeyer 30 published an empirical relation for the mathematical description of fatigue. Basquin, in his fatigue relation, had not considered the idea of the fatigue limit. Stromeyer 30 studied Wöhler's fatigue test data and, to verify the existence of a definite fatigue limit, conducted advanced rotating-beam fatigue tests on several materials. The Stromeyer law represents the stress-life curve by truncating the Basquin relation at the fatigue limit when the load is plotted against fatigue life. In addition, Stromeyer presented a relation between the temperature increase of fatigue samples and the fatigue limit. However, the knee point (N knee) 31 was not specified explicitly.
Palmgren 32 presented a new theory resembling the Basquin and Stromeyer methods. The equation in Table 2 represents the results of fatigue tests of rolling bearings. The fundamental relation of this method involves the stressed volume of material in the subsurface of the rolling bearing raceway as the main parameter. "This volume of material is simplistically determined to have a nearly rectangular subsurface cross-sectional area bounded by the length of the maximum contact area ellipse and the depth at which the maximum failure-causing stress occurs".
In fatigue design, an adequate quantification of inherent variation is essential for evaluating the fatigue properties of the mechanical parts of systems and components. It is also essential when comparing the fatigue properties of materials, including their variation, in engineering design. In this respect, statistical methods have been used widely to compare material properties. The International Standard ISO 12107 includes a full methodology for the application of the Bastenaire model as well as other more sophisticated relationships. It also addresses the analysis of runout (censored) data 33.
Ling and Pan 34 presented a new method to determine R-S-N curves that minimizes the cost and the number of samples needed for laboratory testing. The stress-life curves were considered in a 3-parameter form. Kohout and Vechet 35 presented a different method to define S-N curves over the whole cyclic load domain in fatigue problems. This method incorporates all the fatigue-affected regions from the ultimate strength of the material to the fatigue endurance limit, which is generally expressed by the Palmgren function. In each region, the method reduces to one of the earlier theories: when the applied load is relatively large, the Kohout and Vechet model reduces to the Basquin model, whereas when the applied load is smaller than the fatigue endurance limit, it reduces to the Stromeyer function for the nearly infinite-life and high-cycle fatigue region. Compared to the models specified above, the Kohout and Vechet model has some advantages. It fits fatigue test results better, and its coefficients have an unambiguous technical and geometrical meaning and can be calculated with higher accuracy. In addition, this model is more appropriate for extrapolating and interpolating fitted curves in the low-cycle and very-high-cycle regions.

Table 2. Common methods to estimate the S-N curves 24 (columns: model, S-N curve relation).
The present work tries to develop the probabilistic S-N relationships for existing fatigue data in the following three steps: (1) collecting the fatigue test data; (2) estimating the probabilistic curves for every specified test condition (the key task in the present work is to determine the Weibull equation coefficient for scattered test data); and (3) evaluating the Basquin equation's coefficient from the previous step's data (to this end, a regression analysis will be done on the estimated fatigue life from the previous step).

Life distribution
Tolerance limits
Fatigue data are subject to considerable scatter. In statistical analysis, a random sample is drawn, because obtaining data from the entire population is usually impossible or prohibitively expensive. Because of the limited sample size, sample statistics such as the mean, median, or variance differ from those of the source population. Designating a confidence level assigns a quantitative value to this uncertainty. Lower and upper tolerance limits in a Weibull distributed model can be calculated using Eqs. (1a) and (1b) 6, where k is a function of the sample size.
Replacing the source population properties with the sample statistical properties involves some degree of uncertainty. This uncertainty is quantified by the percent error. If the sample average is $X_1$, the percent error is given by Eq. (2) 4, where $X_1$ is the sample logarithmic average life, n is the sample size, ϑ is the number of degrees of freedom, α sets the degree of confidence, with confidence equal to (1 − α), and $S_1$ is the logarithmic standard deviation of the sample life. Note that the values of the t statistic are available in standard tables.
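The percent error equation itself is not reproduced in the text above. A commonly used form that is consistent with the symbols defined there is sketched below; it is an assumption, not a quotation of the paper's Eq. (2). In particular, the two-sided quantile $t_{\alpha/2}$ and the usual choice $\vartheta = n - 1$ are assumed.

```latex
\delta \;=\; \frac{t_{\alpha/2}(\vartheta)\, S_1}{X_1 \sqrt{n}} \times 100\%,
\qquad \vartheta = n - 1 \ \text{(assumed)}
```

Under this form, the percent error shrinks in proportion to $1/\sqrt{n}$ as the sample size grows and increases with the relative scatter $S_1 / X_1$ of the sample.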
The present study reviewed the effect of stress level on the data scatter and coefficient of variation.S 1 and X 1 values were evaluated in the linear scale.

Probability distributions of samples
To evaluate the distribution of each sample data set, two-parameter and three-parameter Weibull distribution functions can be established as in Eq. (3) 6, where $F(N_f)$ is the failure fraction at fatigue life $N_f$ in the test data set, $N_{f0}$ is the minimum expected fatigue life, θ = ∅ − $N_{f0}$ is the characteristic fatigue life (cycles when 63.2% have failed), and b is the Weibull slope or shape parameter. The terms $N_{f0}$, θ, and b are the parameters of the 3-parameter Weibull model; for the 2-parameter Weibull model, the minimum expected life is zero, $N_{f0} = 0$.
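For reference, the standard form of the Weibull cumulative distribution that matches these definitions is given below. The paper's own Eq. (3) is not reproduced in the text, so this is a sketch under the assumption that θ acts as the scale parameter measured from $N_{f0}$.

```latex
F(N_f) \;=\; 1 - \exp\!\left[-\left(\frac{N_f - N_{f0}}{\theta}\right)^{b}\right],
\qquad N_f \ge N_{f0}
```

Setting $N_f = N_{f0} + \theta$ gives $F = 1 - e^{-1} \approx 0.632$, which agrees with the stated 63.2% failure fraction at the characteristic life, and setting $N_{f0} = 0$ recovers the 2-parameter model.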
To determine the Weibull equation coefficients for scattered test data (which is the key task in the present work), a step function, i.e., (i − 0.3)/(n + 0.4), is commonly used as the percent of failure. In the present work, every failure life $N_{fi}$ in the test results is considered as a Weibull distribution with a median value of $N_{fi}$ and a cumulative probability of (i − 0.3)/(n + 0.4). The failure fraction for the median value of $N_{fi}$ in Eq. (3) is 50%, so the characteristic fatigue life, $\theta_i$, the minimum expected fatigue life, γ, and the Weibull slope, b, for any failure life, $N_i$, follow from Eqs. (4a) and (4b). The Weibull equation for every test result then follows from Eq. (4b) as Eq. (4c), where $P_i(N@S_{ca})$ is the expected failure fraction for fatigue life N at an applied stress level of $S_{ca}$. By applying Eq. (4c) and a weight factor, the modified Weibull function of Eq. (5) is derived.
Weibull equation coefficients are determined based on the flowchart in Fig. 1. First, the Weibull parameters are calculated by the common Weibull model (Stages 1 to 3 in the flowchart). Then, for every single test datum, an equivalent Weibull distribution with the mean value of the same test fatigue life and the cumulative probability of the common Weibull method is constructed, and the modified Weibull parameters are calculated using Eqs. (4a), (4b), (4c), and (5) (Stages 4 and 5 in the flowchart). Finally, the test data are compared with the calculated parameters and the presented probability distribution model using the Kolmogorov-Smirnov test (K-S test) of Eq. (6) (Stages 6 and 7). This process is repeated until the error value becomes acceptable.
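The first stages of this procedure (the common Weibull model and its error measures) can be sketched in a few lines. The code below is an illustration only, not the authors' Matlab implementation. It assumes a 2-parameter Weibull fitted by least squares on the linearized distribution, and it takes the K-S statistic as the largest gap between the fitted distribution and the plotting positions (i − 0.3)/(n + 0.4), since Eq. (6) is not reproduced in the text. All function names and the sample lives are hypothetical.

```python
import math


def median_ranks(n):
    """Plotting positions (i - 0.3)/(n + 0.4) for i = 1..n."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]


def weibull_cdf(life, b, theta, n0=0.0):
    """Weibull failure fraction; n0 = 0 gives the 2-parameter model."""
    if life <= n0:
        return 0.0
    return 1.0 - math.exp(-(((life - n0) / theta) ** b))


def fit_weibull_2p(lives):
    """Least-squares fit of ln(-ln(1 - F)) = b*ln(N) - b*ln(theta)."""
    lives = sorted(lives)
    ranks = median_ranks(len(lives))
    x = [math.log(n) for n in lives]
    y = [math.log(-math.log(1.0 - f)) for f in ranks]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    theta = math.exp(mx - my / b)  # from intercept = -b * ln(theta)
    return b, theta


def ks_and_r2(lives, b, theta):
    """Largest gap between model and plotting positions, and R-square."""
    lives = sorted(lives)
    ranks = median_ranks(len(lives))
    model = [weibull_cdf(n, b, theta) for n in lives]
    ks = max(abs(m - f) for m, f in zip(model, ranks))
    mean_f = sum(ranks) / len(ranks)
    ss_res = sum((f - m) ** 2 for f, m in zip(ranks, model))
    ss_tot = sum((f - mean_f) ** 2 for f in ranks)
    return ks, 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical fatigue lives (cycles) at one stress level.
    sample = [1.2e5, 1.9e5, 2.3e5, 2.8e5, 3.6e5, 4.1e5]
    b, theta = fit_weibull_2p(sample)
    ks, r2 = ks_and_r2(sample, b, theta)
    print(f"b = {b:.3f}, theta = {theta:.3e}, K-S = {ks:.4f}, R2 = {r2:.4f}")
```

The modified model described in the text would replace each test point by its own equivalent Weibull distribution before refitting; that step depends on Eqs. (4a) to (5) and is deliberately left out of this sketch.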

R-S-N relation
In the Basquin power law, a straight-line log-log relationship is assumed between the applied stress and the number of cycles to failure. R-S-N curves are obtained by determining the fatigue life from the Weibull equation of every data set for each reliability at every specified stress level. Then, regression analysis is done to fit a straight line to the S-N data.
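The step of reading a fatigue life at a given reliability from a fitted Weibull distribution can be sketched by inverting the distribution function. This assumes the standard 3-parameter form F = 1 − exp{−[(N − N_f0)/θ]^b} with reliability R = 1 − F; the function name and the parameter values are hypothetical and not taken from the paper's tables.

```python
import math


def life_at_reliability(reliability, b, theta, n0=0.0):
    """Fatigue life N_R with survival probability `reliability`,
    obtained by inverting R = exp(-((N - n0)/theta)**b)."""
    if not 0.0 < reliability < 1.0:
        raise ValueError("reliability must lie strictly between 0 and 1")
    return n0 + theta * (-math.log(reliability)) ** (1.0 / b)


if __name__ == "__main__":
    # Hypothetical Weibull parameters (b, theta, n0) per stress level in MPa.
    fitted = {
        400.0: (2.8, 1.5e5, 2.0e4),
        350.0: (2.5, 4.0e5, 5.0e4),
        300.0: (2.2, 1.1e6, 1.0e5),
    }
    for stress, (b, theta, n0) in fitted.items():
        lives = [life_at_reliability(r, b, theta, n0) for r in (0.5, 0.9, 0.99)]
        print(stress, [f"{n:.3e}" for n in lives])
```

Each reliability level then yields one (stress, life) pair per stress level, and these pairs are the input to the regression step. A higher reliability always maps to a shorter life for the same parameters.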
In statistical theory, regression analysis is a process of identifying data trends. It is commonly used to evaluate relationships between a dependent factor and one or more independent factors (variables). This process shows which variables are essential and which can be ignored. It also shows how these variables affect each other. Linear regression is usually applied to estimate data trends. Depending on the problem, this method determines one or more lines that fit the data with minimum error according to a specific mathematical criterion such as ordinary least squares. Ordinary least squares yields the unique line that minimizes the sum of squared differences between the observed data and the line. In the statistical analysis of fatigue problems, selecting the best curve fitting method sometimes becomes very complicated. To avoid uncertainty about a divergent solution, we applied the K-S test to the presented sequential method at the final step.
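The regression step for the Basquin relation can be illustrated with an ordinary least-squares fit on the logarithmic scale. This is a minimal sketch, assuming the power-law form S = A·N^B, so that log S is linear in log N; the paper's own equation is not reproduced in the text, and the function name and data are hypothetical.

```python
import math


def fit_basquin(stresses, lives):
    """Ordinary least squares on log10(S) = log10(A) + B*log10(N).
    Returns (A, B, r_square) for the assumed form S = A * N**B."""
    x = [math.log10(n) for n in lives]
    y = [math.log10(s) for s in stresses]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 10.0 ** intercept, slope, 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical (stress in MPa, life in cycles) pairs at one reliability.
    stresses = [400.0, 350.0, 300.0, 260.0]
    lives = [6.0e4, 1.7e5, 5.5e5, 1.8e6]
    a, b, r2 = fit_basquin(stresses, lives)
    print(f"A = {a:.1f}, B = {b:.4f}, R2 = {r2:.4f}")
```

Repeating the fit for each reliability level gives one (A, B) pair per R-S-N curve, and the R-square value measures the deviation from a straight line on the log-log plot.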

Results and discussion
The test data used to evaluate the modified Weibull probability distribution parameters are extracted from the literature 4,36–39 and presented as six grouped sets of fatigue life data in Table 3. Every data set is a set of fatigue lives measured experimentally under the same conditions, and every data group consists of data sets at different stress levels. The C and F data groups were tested at only a single stress level and thus are used in this study only to modify the Weibull parameters. The other data groups in Table 3 describe the fatigue life scatter at different applied stress levels and are used both to modify the Weibull parameters and to evaluate the stress-life relation for various reliability levels. For the numerical solution of the procedure described above, a simple M-file code was written in Matlab, and the Matlab curve fitting tool was applied for the regression analysis.

Survival probabilities of test data
The statistical parameters can be determined by repeating the steps of the flowchart in Fig. 1. Also, the P-$N_f$ (survival probability-fatigue life) relation and the Weibull parameters can be evaluated using Eqs. (3), (4c), and (5). By applying regression analysis and curve fitting, the Weibull parameters b, θ, and $N_{f0}$ can be determined by the least squares method (Fig. 2). The difference between the test results and the distribution model shows good agreement for the modified Weibull model. The effect of sample size and of deviation from the Weibull distribution was reduced at the test points.
Further, by applying Eqs. (3), (4c), and (5), curve fitting, and determining the Weibull parameters, the error between the curves and the test data sets can be obtained using Eq. (6) for the K-S test and from the R-square value, respectively. The results of the present modeling are given in Table 4. Here, the value of θ in the Weibull parameters is almost constant. All the K-S test and R-square values show less error for the modified Weibull model, which agrees with the curves and test data in Fig. 2.
As shown in Table 4, with the modified method the error values of the K-S test declined and the R-square value approached 1. In all cases, the Weibull slope or shape parameter (b) decreased. The characteristic life θ changed by 1.5 to 14%, but the minimum time or cycles to failure $N_{f0}$ changed considerably, by up to 10 times. The values of the minimum expected cycles to failure obtained from the Weibull distribution in this approach should not be used directly for design.
For the set of rolling contact fatigue lives (i.e., data set 1), a smaller fatigue life has a larger survival probability. Survival probability data can be evaluated using Eqs. (4a) and (5). Figure 2 shows the results of regression analysis and curve fitting. For the test data, the reliability of fatigue life is calculated based on the modified and common Weibull distributions. As indicated in Fig. 2, the modified Weibull parameters show better agreement. Here, the effect of the step function for the percent of failure and of the deviation from the Weibull probability distribution function has been eliminated. Figure 3 shows the R-square values of the common Weibull model and the presented model with respect to sample size. It can be seen that as the sample size increases, the error of the common Weibull model becomes negligible. Small sample sizes produce a larger jump in the percent failure function, leading to higher error values.
The effect of normalized stress level (with respect to the maximum stress in each data set) on the fatigue life scatter is presented in Fig. 4. As can be seen from Fig. 4, the lower applied stress amplitudes have less data scattering.

R-S-N relation
By applying Eq. (7) with the Weibull parameters of data sets 5-7 from Table 3 and using regression analysis, the Basquin parameters A and B can be determined by the least squares method. The S-N curves in Fig. 5 show that for a higher survival probability, the expected fatigue life decreases at every stress level. Engineering designs usually operate in the 0.01 percent probability of failure range 6. Therefore, extrapolation is required. When these curves are extrapolated to the probability of failure range that is useful for design, the curves intersect, which is unreasonable. Also, Table 4 shows that the minimum fatigue life does not behave reasonably as the stress level increases.
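The intersection of extrapolated curves can be made explicit with a short derivation. The sketch below assumes the Basquin form $S = A N^{B}$ fitted independently at two reliability levels, with parameters $(A_1, B_1)$ and $(A_2, B_2)$; this form and the subscripts are assumptions for illustration.

```latex
A_1 N^{B_1} = A_2 N^{B_2}
\;\;\Longrightarrow\;\;
N^{*} = \left(\frac{A_2}{A_1}\right)^{\frac{1}{B_1 - B_2}},
\qquad B_1 \neq B_2
```

Whenever the two fitted slopes differ, the straight lines on the log-log plot meet at the single life $N^{*}$, and beyond that point the ordering of the reliability curves reverses. This is why independently fitted curves can give a physically unreasonable result once they are extrapolated far enough.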
The deviation from the linear relation on the logarithmic scale is calculated considering the Basquin S-N relation and using regression analysis. Figure 6 shows the curve fitting error at the 3 reliability levels. As can be seen, the error increased at a higher reliability level.

https://doi.org/10.1038/s41598-023-44907-9

Figure 1. Flowchart of parameter calculation for the modified Weibull distribution.

Figure 2. Calculated reliability for (a) common Weibull distribution, (b) modified Weibull distribution, and (c) modified Weibull distribution with artificially increased test data.

Figure 3. Effect of sample size on the R-square value for the Weibull distribution (data sets 5, 6, 7, and 8).

Figure 4. Effect of stress level on the coefficient of variation of fatigue life.

Table 3. Test data used to evaluate the modified Weibull parameters.

Table 4. Results of modified Weibull parameters.